Abstract
Which properties of metabolic networks can be derived solely from stoichiometry? Predictive results have been obtained by flux balance analysis (FBA), by postulating that cells set metabolic fluxes to maximize growth rate. Here we consider a generalization of FBA to the single-cell level using maximum entropy modeling, which we extend and test experimentally. Specifically, we define for Escherichia coli metabolism a flux distribution that yields the experimental growth rate: the model, containing FBA as a limit, provides a better match to measured fluxes and makes a wide range of predictions: on flux variability, regulation, and correlations; on the relative importance of stoichiometry vs. optimization; and on scaling relations for growth rate distributions. We validate the latter here with single-cell data at different subinhibitory antibiotic concentrations. The model quantifies growth optimization as emerging from the interplay of competitive dynamics in the population and regulation of metabolism at the level of single cells.
Introduction
After the significant developments in molecular biology and biochemistry in the last century, many aspects of cellular physiology could be understood as a result of interactions between identified molecular components. Perhaps the best-characterized example is intermediate metabolism, the set of reactions that enable cell growth by converting organic compounds and transducing free energy. Today it is possible to some extent to infer the topology of metabolic networks from data at genomic scale, but the dynamics and parameter dependence of such networks remain difficult to analyze. Alternatively, one can assume that known reactions only provide physicochemical constraints within which some adaptive dynamics has maximized the growth rate, e.g., by adjusting enzyme levels and controlling reaction rates^{1}. An influential implementation of this idea for batch cultures under steady-state conditions has been the flux balance analysis (FBA)^{2}, which has been tested experimentally^{3,4}, also in mutant strains, strains used for industrial production^{5,6,7}, as well as phenotypes implicated in disease (e.g., Warburg effect^{8}). Using maximum entropy ideas from statistical physics, we extend the application of FBA from batch to single-cell level and show that our extension makes a wide range of predictions, some of which we test experimentally.
Recent measurements at the single-cell level demonstrated the existence of substantial cell-to-cell growth rate fluctuations even in well-controlled steady-state conditions^{9}. These fluctuations exhibit universal scaling properties^{10,11,12,13}, relate to cell size control mechanisms^{14,15}, act as a global collective mode for heterogeneity in gene expression^{16,17,18}, and are ultimately believed to affect fitness^{19}. To link these observations to metabolism, however, we need to set up a mathematical description not only of the optimal metabolic fluxes and maximal growth rate in batch culture (as in FBA, which permits no heterogeneity across cells), but of the complete joint distribution over metabolic fluxes. Metabolic phenotypes of individual cells growing in steady-state conditions can then be understood as samples from this joint distribution, which would automatically contain information about flux correlations, and, in particular, could directly predict cell-to-cell growth rate fluctuations.
The simplest construction of a joint distribution over metabolic fluxes can be derived in the maximum entropy framework^{20}. The key intuition is to look for the most unbiased (or random) distribution over fluxes through individual metabolic reactions that is consistent with the given stoichiometric constraints, while matching the experimentally measured average growth rate. The maximum entropy model that we specify below will turn out to be a one-parameter family of distributions, where the single parameter can be fit to match experimental data; all subsequent predictions follow directly, without further fitting. A similar approach has recently been used in diverse biological settings, ranging from neural networks^{21,22} and genetic regulatory networks^{23} to antibody diversity^{24} and collective motion of starling flocks^{25}.
In addition to accounting for cell-to-cell variability, the maximum entropy construction provides a principled interpolation between two extremal regimes of metabolic network function. In the uniform (no-optimization) limit, no control is exerted over metabolic fluxes: they are selected at random as long as they are permitted by stoichiometry, resulting in broad yet nontrivial flux distributions that support a small, nonzero growth rate. In the FBA limit, fluxes are controlled precisely to maximize the growth rate, with zero fluctuations. The existence of these two limits defines a fundamental, and still unanswered, question about metabolic networks: Is there empirical evidence that real metabolic networks are located in an intermediate regime between the two limits where fluctuations are nonnegligible^{26}, and if so, what are the properties of this intermediate regime (see Fig. 1)? Here, we address this question using metabolic flux and single-cell physiology data for Escherichia coli.
In the Methods section, we provide a review of the maximum entropy formalism for metabolic networks as it has been established in previous work^{26,27,28,29,31} and we stress its implementation in our work. In the Results section, we use this formalism to set up and test quantitative predictions for E. coli as well as to discuss possible theoretical extensions. We provide here compelling experimental evidence that the observed growth rate fluctuations reflect metabolic flux variability and suboptimality of growth, both of which are captured quantitatively by the maximum entropy model of the metabolic network.
Results
Experimental test of flux predictions for E. coli
We constructed a maximum entropy model for the catabolic core of the E. coli metabolism (see Methods section). The model has a single parameter β that constrains the average growth rate in the flux space, interpolating from a uniform sampling (β = 0) to the optimal FBA solution (β → ∞). In particular, we consider the specific value of this parameter β^{*} inferred by constraining the average growth rate in the model to match the population experimental growth rate (\(\overline \lambda = 0.2\,{\mathrm{h}}^{-1}\) for a set of 12 experiments shown in Fig. 2a, and \(\overline \lambda = 0.1\,{\mathrm{h}}^{-1}\) for a set of seven experiments, not shown).
To evaluate the quality of model predictions, we compared N_{f} = 20 measured metabolic fluxes in E. coli from previously published data to our predictions, as shown in Fig. 2a. We defined the mean-squared error (MSE) as
\[\mathrm{MSE} = \frac{1}{N_f}\sum_{i = 1}^{N_f} \left( V_i - \langle v_i\rangle \right)^2, \tag{1}\]
where V_{i} is the measured flux (relative to glucose uptake) and 〈v_{i}〉 is the mean of the corresponding flux computed in the maximum entropy model of Eq. (7). Figure 2b examines the behavior of MSE as a function of the parameter β. First, we note that the best flux predictions occur at or close to the value \(\beta ^ \ast \lambda _{{\mathrm{max}}} \simeq 120\), identified by the maximum entropy fit, at both average growth rates. This is a nontrivial prediction, because the value of β^{*} was not fitted to minimize MSE, but rather, as demanded by the maxent formalism, to match the population growth rate. Second (and unsurprisingly), we find that flux predictions are better at β^{*} than with uniform sampling, at β = 0. Perhaps most surprising is our third finding: flux predictions at an intermediate value of β (\(\beta ^ \ast \lambda _{{\mathrm{max}}} \simeq 120\)) significantly outperform the limit of β → ∞, i.e., the FBA solution.
In addition to a better quantitative match overall, the maximum entropy model correctly predicted nonzero flux through the glyoxylate shunt, i.e., for the isocitrate lyase (ICL) as well as ME1 reactions, which FBA misses qualitatively by setting them to zero. As a consequence, this also leads to a better match of our model with data for reactions isocitrate dehydrogenase (ICDH) and alpha ketoglutarate dehydrogenase (AKGDH) that channel pyruvate through the Krebs cycle.
Lastly, we point out that Eq. (7) can also be viewed as a phenomenological equation for average fluxes with a single fitting parameter β that is set not to match the measured population growth rate, as in the maxent formalism, but simply to minimize some error measure (say MSE) with respect to experimentally measured fluxes. In Supplementary Note 4, we show that this also leads to predictions that outperform FBA for measured fluxes both in wild-type and mutant strains.
It is instructive to examine the evolution of the joint distribution over fluxes, P_{β}(v), as a function of the optimization parameter, β. Figure 3a shows how the growth rate approaches the maximal rate achievable, λ_{max}, with the inferred values of β^{*} from Fig. 2a, c suggesting an optimization level in the range of ~80% of the maximum. These levels are reached by adjusting flux values away from what they would have been under uniform sampling from the polytope of the allowed metabolic phenotypes, \({\cal P}\). Figure 3b traces the relative changes in all fluxes as a function of β. Interestingly, in the FBA limit, almost half of the fluxes (38 out of 86 fluxes, the upper half of the plot) are forced to zero, whereas at the inferred value (β^{*}λ_{max} ≈ 120), these fluxes only decrease by about 1/3 relative to their average value in the uniform sampling limit. Furthermore, the glyoxylate shunt remains active, in agreement with experimental observations. Surprisingly, only for a few reactions the fluxes are predicted to increase with growth rate optimization relative to the uniform sampling (lowest ~5 fluxes in Fig. 3b). These are mainly nitrogen and phosphate transport reactions, and to a lesser extent, malate dehydrogenase (MDH) and phosphoglucose isomerase (PGI) reactions. The latter two reactions are classified as reversible, whereas regulation in metabolic networks is thought to take place at irreversible reactions^{32}, so the predicted increase may be a consequence of increased substrate levels.
We separately illustrate three flux behaviors in Fig. 3c, for ICL, ICDH, and glutamate dehydrogenase (GLUDy). ICL and ICDH track the relative channeling of carbon sources in the Krebs cycle vs. glyoxylate shunt; ICL flux is switched off in the β → ∞ limit, whereas ICDH flux remains nearly constant with β. In contrast, GLUDy reaction is reversible, switching sign at intermediate values of β, while at high β the reaction ultimately gets frozen in the backward direction, implying high levels of ammonia in the cell, given the low affinity of this enzyme for ammonia^{33}.
We also evaluated the predicted variability in metabolic fluxes from the maximum entropy model, Eq. (7), at β^{*}λ_{max} ~ 10^{2}, and found a clear division between reactions with high and low coefficients of variation (CV). Among fluxes with lower variability were all glycolytic reactions (CV < 0.3) with the exception of PGI, as well as all transport reactions related to biomass formation (i.e., for glucose, oxygen, ammonia, carbon dioxide, phosphate ions; CV < 0.11), the first part of the Krebs cycle, and the irreversible reactions of oxidative phosphorylation.
We next wondered how flux variances scale with the optimization level. In the uniform sampling limit (β = 0), the variances should be large, characteristic of the shape and extent of the permitted polytope, \({\cal P}\). While in the FBA limit (β → ∞) the flux variability should vanish, we expect a well-defined scaling regime at high β where the variances shrink toward the FBA solution in a manner that is independent of the global polytope properties. This regime is indeed reached for all fluxes at \(\bar \lambda /\lambda _{{\mathrm{max}}} \gtrsim 0.90\) and for some fluxes much earlier, as shown in Fig. 3d: flux variability subsequently decreases with β as \(\sigma _i(\beta )/\sigma _i(\beta = 0) \propto \left( {1 - \bar \lambda (\beta )/\lambda _{{\mathrm{max}}}} \right)\).
What kind of correlation structure between fluxes does the maximum entropy model predict? While the growth rate λ is linear in constituent fluxes in Eq. (7), suggesting that the joint distribution could factorize, correlations between fluxes develop because of the stoichiometric constraints that define the polytope \({\cal P}\). A subset of fluxes that we focus on in Fig. 3 exhibits a clear structure of strong (anti)correlation both under uniform sampling (Fig. 3e) and in the FBA limit (Fig. 3f). The FBA pattern of correlations, in particular, can easily be partitioned into four groups using a clustering algorithm^{34} so that the groups are strongly enriched for reactions characteristic of glycolysis, glyoxylate shunt, pentose phosphate pathway, and citric acid cycle, respectively. Fluxes in the glycolysis cluster tend to correlate strongly with fluxes in the citric acid cycle cluster, but anticorrelate with the glyoxylate shunt and pentose phosphate pathway clusters. Comparison of the FBA correlations (Fig. 3f) with the uniform sampling (Fig. 3e) reveals that stoichiometric constraints alone shape much of the correlation structure, with the exception of anticorrelation between glycolysis and glyoxylate shunt clusters, which is a distinct consequence of the growth rate optimization. More generally, it is intriguing to apply maximum entropy to recover the correlation structure of metabolic fluxes in the FBA limit and use that to identify, automatically via clustering, separate metabolic pathways.
Lower limit of regulatory information needed for fast growth
As the growth rate optimization parameter β is increased, flux variances shrink (Fig. 3d), correlations strengthen (Fig. 3f), and the distribution over fluxes within the polytope \({\cal P}\) localizes closer to the FBA solution, v_{max}. Could this localization emerge due to competitive growth dynamics in batch culture^{28}? To test this hypothesis, we checked the relationship predicted by Eq. (13), which can be solved analytically (see Supplementary Methods). For the typical values of the carrying capacity vs. inoculation size ratio (\(N_{\rm C}/N_0 \simeq 10^6\)), the corresponding prediction from Eq. (13) is β^{*}λ_{max} ~ 50, considerably underestimating \(\beta ^ \ast \lambda _{{\mathrm{max}}} \simeq 120\), recovered by the maximum entropy model as reproducing the population growth rate and showing good match with measured fluxes. In other words, metabolic fluxes are more localized and growth is closer to optimal than would be expected in a scenario where metabolic reactions at the individual cell level are not actively regulated.
Motivated by these findings, we explored an alternative scenario where the localization of the distribution over fluxes around the optimal growth rate is achieved by active regulation of metabolic reactions. First, we quantified the degree of localization by the decrease in the entropy of the distribution over fluxes, Eq. (10). This is plotted for our E. coli network in Fig. 3g, where we show the average growth rate, \(\bar \lambda\), as a function of information, I (expressed in bits), parametrically in β. The resulting curve divides the \((I,\bar \lambda )\) plane into two halves: while it is possible to achieve metabolic phenotypes below the \(I(\bar \lambda )\) curve, the dashed region above the curve is forbidden. This is because no distribution exists that achieves high growth rates \(\bar \lambda\) without also deviating from the uniform distribution by at least the required number of bits.
Figure 3g suggests that at least ~40 bits of information are required to control the fluxes and reach growth rates amounting to ~80% of the maximal rate, λ_{max}, as reported in data for E. coli; higher growth rates call for increasing amounts of information, which formally diverges in the FBA limit as β → ∞. Interestingly, the number of out-of-equilibrium reactions in the model, 39, is in good agreement with the inferred amount of minimal information, given a simplistic estimate of 1 bit per reaction (sufficient to distinguish, e.g., high from low expression of the metabolic enzyme). This is consistent with the hypothesis that regulatory control is exerted for enzymes that catalyze irreversible reactions^{32}.
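The shape of such an information versus growth rate curve can be illustrated on a toy model. The sketch below is not the E. coli network: it assumes a hypothetical one-dimensional density of states, in which the volume of phenotypes at growth rate λ scales as (λ_max − λ)^(D−1) with D = 3, and it takes the information to be the entropy decrease relative to uniform sampling, S(0) − S(β), expressed in bits. The function name and all parameter values are illustrative choices.

```python
import math


def toy_growth_and_information(beta, lam_max=1.0, dim=3, n_grid=4000):
    """Return (mean growth rate, information in bits) for a toy model.

    Toy assumption: the volume of phenotypes with growth rate lam scales
    as (lam_max - lam) ** (dim - 1), as it would near a single optimal vertex.
    """
    h = lam_max / n_grid
    z0 = z = z_lam = 0.0
    for k in range(n_grid):
        lam = (k + 0.5) * h                        # midpoint rule
        g = (lam_max - lam) ** (dim - 1)           # toy density of states
        w = g * math.exp(beta * (lam - lam_max))   # shifted Boltzmann weight
        z0 += g
        z += w
        z_lam += lam * w
    mean = z_lam / z
    # ln(Z_beta / Z_0), undoing the shift used for numerical stability.
    log_ratio = math.log(z / z0) + beta * lam_max
    # S(0) - S(beta) = beta * mean - ln(Z_beta / Z_0), converted to bits.
    info_bits = (beta * mean - log_ratio) / math.log(2.0)
    return mean, info_bits


# Parametric curve (mean growth rate, information) for increasing beta.
curve = [toy_growth_and_information(b) for b in (0.0, 5.0, 15.0, 50.0)]
```

In this toy, the information is zero at β = 0 and grows without bound as β increases, which mirrors the qualitative behavior described for Fig. 3g; the numerical values have no bearing on the ~40 bits quoted for the real network.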
Cells control metabolic fluxes through regulatory networks, either indirectly, by regulating the expression of metabolic enzymes, or directly, by modulating the enzymatic activity through various feedback loops; either way, metabolic resources are required to exert this control. This leads to a tradeoff: flux control is necessary to support a high growth rate, but itself carries a growth rate penalty. We created a simple toy model to capture this intuition (see Supplementary Note 2). Here, K regulatory pathways control the fluxes and each pathway is modeled as a Gaussian information channel, so that together, these channels provide I(β) bits of necessary information as shown in Fig. 3g. The signal-to-noise ratio of each regulatory channel is determined by the number of regulatory molecules: higher molecular counts enable precise control and thus higher information, but impose higher cost. In this model, the cost-free growth rate at given β is reduced by the cost to support K channels which control the fluxes, so that the resulting effective growth rate is:
where α determines the metabolic cost of regulatory molecules, and we estimated the number of regulatory pathways, K, to be approximately the number of degrees of freedom of the flux polytope, K ≈ D. The cost of regulation clearly limits the achievable growth rate, as shown in Fig. 3h, where the \(\bar \lambda _{{\mathrm{eff}}}(\beta )\) curves now develop a maximum rather than increasing monotonically as in the cost-free case of Fig. 3a. While our toy model is very simplistic, it does capture properly the scaling of information with the growth rate, as well as the exponential metabolic cost of achieving high information transmission in molecular networks, reported previously^{35,36}. Thus, among many possible constraints acting on a cell, the cost of regulating metabolism itself^{1} can impose nonnegligible limits to growth.
Experimental test of growth rate fluctuation scaling
Can we test the novel predictions of our theory that extend beyond the domain of validity of the FBA? While it is currently experimentally unfeasible to measure metabolic fluxes and their variability at the single-cell level, one can tractably measure division times and growth rates for single E. coli cells growing in stable conditions for long periods of time. In our model, such growth measurements directly connect to the biomass-producing reaction with its associated growth flux λ(v). Figure 3d suggests that flux variability should scale \(\propto (\lambda _{{\mathrm{max}}} - \bar \lambda )\), and since the growth flux is a linear combination of metabolic fluxes, its variability, too, should follow the same scaling. To verify this explicitly, we computed the fluctuations in growth rate, σ/λ_{max}, as a function of the optimization parameter β, in Fig. 4a.
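In the range of \(\beta \lambda _{{\mathrm{max}}} \gtrsim 40\), characteristic of wild-type E. coli experiments, the predicted growth fluctuations indeed obey
\[\sigma /\lambda _{{\mathrm{max}}} \propto 1 - \bar \lambda /\lambda _{{\mathrm{max}}}; \tag{3}\]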
we refer to this range as the scaling regime. Beyond variance, the complete distribution of growth rates, Q(λ), can be sampled by marginalizing the maximum entropy model, Eq. (7).
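A heuristic argument shows where a linear scaling of this kind can come from. It rests on a simplifying assumption that is ours rather than the text's: that the optimum v_{max} is a nondegenerate vertex of a D-dimensional polytope, so that the volume of phenotypes with growth rate λ vanishes as (λ_{max} − λ)^{D−1} close to λ_{max}, and that β is large enough for the far side of the polytope to be irrelevant. Under these assumptions the growth rate deficit is Gamma distributed:

```latex
\[
Q(\lambda) \;\propto\; (\lambda_{\max} - \lambda)^{D-1}\, e^{\beta \lambda}
\quad\Longrightarrow\quad
\lambda_{\max} - \lambda \;\sim\; \mathrm{Gamma}(\text{shape } D,\ \text{rate } \beta),
\]
\[
\lambda_{\max} - \bar\lambda \simeq \frac{D}{\beta},
\qquad
\sigma \simeq \frac{\sqrt{D}}{\beta}
\quad\Longrightarrow\quad
\sigma \simeq \frac{\lambda_{\max} - \bar\lambda}{\sqrt{D}} .
\]
```

In this idealized picture the proportionality constant depends only on the effective dimension D and not on the global shape of the polytope; whether the real network follows this constant is not something the sketch can establish.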
Measurements of single-cell growth rates allow us to estimate growth rate distributions and compare them to the predicted Q(λ), as well as to empirically extracted \(\bar \lambda\), λ_{max}, and the fluctuations σ, to verify the predicted relation of Eq. (3). We used previously published data^{37} where E. coli cells were stably grown in a mother machine microfluidic device while multiple subinhibitory steps of concentration of the antibiotic tetracycline were delivered as shown in Fig. 4b. Low concentrations of antibiotic allowed us to probe different average growth rates in the same setup, and to construct empirical distributions of growth rates for every antibiotic concentration by pooling data from technical replicates of the multistep experiments (Supplementary Note 3). We find an excellent match between measured and predicted growth rate distributions in Fig. 4c for all five concentrations of the antibiotic used. Looking at many individual lineages in separate microfluidic channels, we can also extract λ_{max}, \(\bar \lambda\), and σ per lineage empirically and confirm the predicted scaling of growth rate fluctuations, as shown in Fig. 4d.
To conclude this section on fluctuations, we briefly mention that, under mild conditions, it is possible to study the dynamical response of the network in the linear regime under small perturbations. For example, ref.^{26} introduced a simple biologically motivated dynamics of diffusion-replication inside the metabolic space, described by the one-parameter equation for the growth rate distribution Q(λ), showing that the following fluctuation-dissipation scaling laws should hold for the typical response times as well as the growth rate fluctuation autocorrelation time τ:^{29}
This relation, which further extends Eq. (3), is experimentally testable and predicts a divergent slowing down of the response time with growth rate maximization. As a consequence, growth rate fluctuations could take on a functional role in speeding up the response to environmental perturbations, e.g., to nutritional upshifts or externally applied stresses. Even if the experimental test of such dynamic predictions is beyond the scope of this paper, our model connects to a wide range of currently ongoing metabolism and growth-related investigations.
Discussion
In this work, we considered maximum entropy distributions at fixed average growth rate in the space of metabolic phenotypes, a straightforward and statistically rigorous extension of the FBA, which is recovered in the asymptotic limit. Experimental estimates of enzymatic fluxes of the central carbon core metabolism in bulk cultures of E. coli, as well as empirical growth rate distributions of E. coli collected from single-cell measurements, are consistent with an intermediate level of growth optimization (βλ_{max} ~ 10^{2} and \(\bar \lambda /\lambda _{{\mathrm{max}}}\sim 0.8\)). We find that variability can be captured by a simple maximum entropy model, and that the zero-fluctuation FBA limit qualitatively misses important experimental facts, e.g., the observed nonzero fluxes through the glyoxylate shunt.
The improved ability of our model to match the flux measurements is a consequence of a single extra parameter, β, which can easily be determined from existing experimental data. Beyond a better fit, however, our model also makes a wide range of predictions, extending the domain of metabolic network analysis to the single-cell level. While it is difficult to measure the single-cell metabolic fluxes and their variability in isogenic populations in steady state, such measurements for the growth rate are increasingly available. This connection enables the new predictions of our theory to be tested, and opens up the theory for verifiable extensions. Validating the predicted scaling of growth rate fluctuations in Fig. 4 is only the first step, with two broad lines of investigation within reach.
First, our approach is not limited to the core catabolism analyzed here or to bacterial metabolism, but can in principle be extended to other genome-scale networks. In practice, however, we often lack suitable large-scale experimental flux measurements. It is also likely that physicochemical constraints alone are insufficient to yield quantitatively accurate predictions. Similar issues arise also in the core catabolism for high growth rates that exceed the threshold of the acetate switch, for which a tradeoff between growth yield and rate emerges and additional constraints have to be added in FBA-based approaches^{38}. Our method can be extended to accommodate such cases, or systems where strict growth maximization is likely not a suitable objective. Extra objectives or constraints in the maximum entropy model would appear as additional terms in the exponent of Eq. (7), where their corresponding parameters would control various tradeoffs between the objectives. This flexibility may be required to model metabolic dependencies, cell type heterogeneity, or interactions between cells.
Such extensions of maximum entropy modeling will benefit from the recent flourishing of statistical-physics-inspired algorithms, including belief propagation^{39,40,41}, relaxational learning^{42}, and Gaussian analytical approximation^{43}, used to solve the sampling problem, which is computationally at the heart of our approach (whereas FBA relies on simpler linear programming). While the employed Monte Carlo hit-and-run Markov chain is sufficient for the analyzed network, faster methods (in particular ref.^{43}) will pave the way to large-scale applications and inverse modeling settings.
Second, we considered two possible mechanisms for the emergence of maximum entropy distributions over metabolic fluxes. On the one hand, analysis of population dynamics of competitive growth under resource constraints reveals that maximum entropy distributions at fixed average growth rate are the steady states of logistic growth, in principle giving further testable predictions on the dependence of the growth optimization on inoculum size and medium carrying capacity. In essence, it is the exponential character of the growth laws that leads naturally to Boltzmann distributions. Despite the same functional form, this is in contrast to the standard case of statistical mechanics at equilibrium, where the link between molecular dynamics and the equilibrium distribution is nontrivial. On the other hand, localization of growth distributions toward the optimum can also arise due to the active regulation of metabolic enzymes, most likely those involved in catalyzing irreversible reactions. These two mechanisms are not mutually exclusive and can actually operate concurrently; in our estimate, the purely population-dynamics scenario substantially underpredicts growth rate optimization (i.e., βλ_{max}) for the bulk culture, likely because it completely disregards active regulation of metabolism, which is known to be important. Note that the two mechanisms are, at least in principle, distinguishable experimentally: in the mother-machine device, there is no competition across the independent microfluidics channels, putting us in the regime with a very small N_{C}/N_{0}, which predicts smaller β^{*}λ_{max} than in the bulk, qualitatively in line with the observations. In other words, in bulk, both single-cell regulation and competitive growth may be active simultaneously, leading to higher growth rate optimization than in single-cell microfluidics measurements, where competitive growth is nearly absent.
Further investigations are required to tease apart these two contributions quantitatively, in particular allowing for growth state transitions in modeling.
The connection between maximum entropy models and fluctuation-dissipation relations, Eq. (4), requires further assumptions that need to be tested separately, but makes a very strong prediction about the relationship between the autocorrelation time of growth fluctuations and the typical response time to, e.g., nutrient shifts. This relationship appears fundamental, since the response time is a central biological quantity measurable in bulk, while the fluctuation autocorrelations are microscopic, single-cell properties, which can be measured with recent experimental setups. Interestingly, the predicted response times lengthen with the degree of growth rate optimization, suggesting a tradeoff between responsiveness to changes and efficiency in steady state; as a consequence, it is unclear whether the evolutionarily optimal outcome should be equated to complete growth rate optimization with no fluctuations, i.e., the FBA limit. Quantitatively, in stable environments where E. coli grows well and possibly achieves a high degree of growth rate optimization, one could experimentally look for signatures of long-timescale fluctuations, either directly in the growth signal, or by proxy through constitutive gene expression. Curiously, the parameter β of our model has the dimension of time, and its best-fit value inferred from E. coli data is of the order of 1 day.
Beyond extensions to dynamics, our analysis made two further theoretical contributions. First, it clarified the relative roles of stoichiometric constraints and the growth optimization assumption in FBA. The maximum entropy model is an explicit construction of a smooth interpolation between the uniform regime (where only stoichiometric constraints are active) and the FBA (where growth is maximized in addition). The uniform limit is a natural baseline—where no control is exerted by the cell—against which to compare the observed fluxes, their fluctuations, and correlations, as we have done in Fig. 3. Without this baseline comparison, it is hard to assess how surprising the observations of metabolic optimality should really be^{3}. Our second theoretical contribution is the observation that a certain minimal information is needed to achieve a desired growth rate (Fig. 3g, h). This information is expressed in the same currency (bits) in which we measure the performance of regulatory networks, enabling us to suggest a tradeoff that sets the optimal degree of metabolic control. Contrary to other cellular networks where estimation of information only has been done for single network components or simple pathways^{36}, the metabolic network is the sole case where we could estimate the lower bound on the required number of regulatory bits. Our statistical mechanics approach thus opens a connection between metabolic networks and their regulatory counterparts, which is both of theoretical interest and could also be probed in comparative genomic studies.
Methods
General background
We consider the set of reactions in the well-mixed, continuum limit. Let S_{iμ} be the stoichiometric coefficient of the metabolite μ (whose concentration is c_{μ}) in reaction i, whose flux is v_{i}. The metabolic network dynamics is then given by mass balance equations:
\[\dot c_\mu = \mathop {\sum}\nolimits_i S_{i\mu }v_i. \tag{5}\]
Assuming steady state, \(\dot c_\mu = 0\), and including further constraints from thermodynamics, nutrient availability, and kinetic limits in the form of lower (LB) and upper (UB) bounds on fluxes, we obtain a convex polytope \({\cal P}\) of feasible steady states (metabolic phenotypes) in the space of fluxes:
\[{\cal P} = \left\{ {\mathbf{v}}:\ \mathop {\sum}\nolimits_i S_{i\mu }v_i = 0\ \ \forall \mu ,\quad v_i \in [{\mathrm{LB}}_i,{\mathrm{UB}}_i]\ \ \forall i \right\}. \tag{6}\]
In addition to bona fide, well-balanced chemical reactions, constraint-based models often include a phenomenological biomass reaction in the form of a linear combination of metabolite fluxes, \(\lambda ({\mathbf{v}}) = \mathop {\sum}\nolimits_i \xi _iv_i\), where the proportions ξ_{i} are set to mimic cell growth, i.e., the metabolite fluxes necessary to reconstitute the biomass of a new cell in a typical division time.
The network
The network employed in the study is the catabolic core of the genome-scale reconstruction iAF1260 (see Supplementary Methods), in a glucose-limited minimal medium in aerobic conditions^{44}. The network comprises N = 86 reactions among M = 68 metabolites and includes glycolysis, pentose phosphate pathway, TCA cycle, oxidative phosphorylation, and nitrogen catabolism. The dimension of the resulting polytope \({\cal P}\) of allowed steady states is D = 23, from which we can efficiently draw flux configurations using hit-and-run Markov chain Monte Carlo sampling after suitable preprocessing^{27} (see Supplementary Methods).
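To make the sampling step concrete, the following is a minimal hit-and-run sketch for a density proportional to exp(β ξ·v) inside a polytope given by inequalities a·v ≤ b. It is an illustration under stated assumptions, not the implementation used in the study: the preprocessing mentioned above is omitted, the polytope is a hypothetical three-dimensional simplex rather than the E. coli network, and the names `hit_and_run`, `simplex`, and `xi` are our own.

```python
import math
import random


def hit_and_run(constraints, xi, beta, start, n_samples, seed=0):
    """Sample points with density proportional to exp(beta * xi.v)
    inside the bounded polytope {v : a.v <= b for (a, b) in constraints}."""
    rng = random.Random(seed)
    x = list(start)
    samples = []
    for _ in range(n_samples):
        # Draw a direction uniformly on the unit sphere.
        d = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(dk * dk for dk in d))
        d = [dk / norm for dk in d]
        # Feasible segment x + t * d with t in [t_lo, t_hi].
        t_lo, t_hi = -math.inf, math.inf
        for a, b in constraints:
            a_dot_d = sum(ak * dk for ak, dk in zip(a, d))
            slack = b - sum(ak * xk for ak, xk in zip(a, x))
            if a_dot_d > 1e-12:
                t_hi = min(t_hi, slack / a_dot_d)
            elif a_dot_d < -1e-12:
                t_lo = max(t_lo, slack / a_dot_d)
        # Along the segment the target density is proportional to exp(c * t).
        c = beta * sum(k * dk for k, dk in zip(xi, d))
        length = t_hi - t_lo
        u = rng.random()
        if abs(c) * length < 1e-9:
            t = t_lo + u * length              # effectively uniform
        else:
            # Inverse-CDF draw of the distance s from the favored endpoint.
            s = -math.log(1.0 - u * (1.0 - math.exp(-abs(c) * length))) / abs(c)
            t = t_hi - s if c > 0 else t_lo + s
        x = [xk + t * dk for xk, dk in zip(x, d)]
        samples.append(list(x))
    return samples


# Toy polytope: v_i >= 0 and v_0 + v_1 + v_2 <= 1; toy "growth rate" is v_0.
simplex = [([-1.0, 0.0, 0.0], 0.0), ([0.0, -1.0, 0.0], 0.0),
           ([0.0, 0.0, -1.0], 0.0), ([1.0, 1.0, 1.0], 1.0)]
xi = [1.0, 0.0, 0.0]
uniform_samples = hit_and_run(simplex, xi, 0.0, [0.2, 0.2, 0.2], 4000)
biased_samples = hit_and_run(simplex, xi, 20.0, [0.2, 0.2, 0.2], 4000)
```

Because the target density is exponential along any line, each step can be drawn exactly by inverting the one-dimensional cumulative distribution, so no rejection step is needed. For a network of realistic size a vectorized implementation and a rounding step for ill-conditioned polytopes would likely be required.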
Maximum entropy modeling
FBA looks for the flux configuration v_{max} that maximizes growth λ_{max} = λ(v_{max}) subject to constraints given by Eq. (6), which can be easily found by linear programming. In contrast, our maximum entropy approach starts with a distribution over fluxes with a Boltzmann form, which assumes that the fluxes are as random as possible while achieving a desired average growth rate^{26}:
\[P_\beta ({\mathbf{v}}) \propto e^{\beta \lambda ({\mathbf{v}})}\ \ {\mathrm{for}}\ {\mathbf{v}} \in {\cal P},\qquad P_\beta ({\mathbf{v}}) = 0\ \ {\mathrm{otherwise}}. \tag{7}\]
The parameter, β, of the distribution P can then be set to match the predicted average growth rate to the measured growth rate, λ_{data}:
Once β is fixed, the joint distribution of Eq. (7) can be queried for average fluxes, flux correlations, or other quantities of interest that we discuss later.
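One way to picture the fitting step is the sketch below, which assumes a Boltzmann weight proportional to e^{βλ} and a set of growth rates evaluated on uniformly sampled flux configurations. Because the average growth rate increases monotonically with β (its derivative is the variance of λ), β can be found by bisection. The function names and the one-dimensional grid standing in for uniform samples are hypothetical, and reweighting uniform samples is only reliable at moderate β; it is not the procedure used in the study.

```python
import math

def mean_growth(lams, beta):
    """Average of lambda under weights exp(beta * lambda) over the given samples."""
    m = max(lams)  # subtract the maximum so the exponentials cannot overflow
    w = [math.exp(beta * (l - m)) for l in lams]
    return sum(wi * li for wi, li in zip(w, lams)) / sum(w)

def fit_beta(lams, target, tol=1e-10):
    """Bisection for beta >= 0 such that mean_growth(lams, beta) == target."""
    if not (mean_growth(lams, 0.0) <= target < max(lams)):
        raise ValueError("target must lie between the uniform mean and the maximum")
    hi = 1.0
    while mean_growth(lams, hi) < target:  # expand the bracket until it contains the root
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if mean_growth(lams, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical stand-in for uniform samples: an even grid of growth rates on [0, 1].
LAMS = [k / 1000.0 for k in range(1001)]
beta_fit = fit_beta(LAMS, 0.8)
print(beta_fit, mean_growth(LAMS, beta_fit))
```

With real flux samples, `lams` would hold the biomass flux of each sampled configuration; the same weights can then be reused to estimate average fluxes and correlations.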
The maximum entropy distribution with a constrained average growth rate has two interesting limits, as illustrated in Fig. 1. The growth rate, \(\bar \lambda\), increases with β (which we will refer to as an optimization parameter) until, in the limit β → ∞, the distribution P_{∞}(v) collapses into a delta function at v_{max}, lying at the boundary of the polytope \({\cal P}\): this is the FBA solution that supports the maximal growth rate λ_{max}. Conversely, as β → 0, Eq. (7) yields a uniform sampling of fluxes over the permitted polytope \({\cal P}\): this uniform solution is an interesting baseline case for comparison because it incorporates all stoichiometric constraints but postulates no growth rate optimization. In statistical physics, the high-β regime (limiting toward the FBA solution) corresponds to the energy-dominated regime, while the low-β regime (limiting toward the uniform sampling) corresponds to the entropy-dominated regime; the optimization parameter β corresponds to the inverse temperature.
Apart from generic information-theoretic arguments put forward by Jaynes in support of the maximum entropy approach^{20}, are there further justifications for using the Boltzmann form in Eq. (7) that would be specific to the case of metabolic networks? Below, we consider two non-exclusive possibilities: active regulation at the single-cell level and competitive growth dynamics in a population.
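Both limits can be made explicit in a toy case. Assume, purely for illustration, a one-dimensional "polytope" in which the growth rate itself is uniformly distributed on \([0,\lambda _{{\mathrm{max}}}]\) at β = 0 (in the real network the density of states is far from uniform, so only the qualitative behavior carries over). For a weight proportional to \(e^{\beta \lambda }\), the average growth rate is then
\[\bar \lambda (\beta ) = \frac{\int_0^{\lambda _{{\mathrm{max}}}} \lambda \,e^{\beta \lambda }\,d\lambda }{\int_0^{\lambda _{{\mathrm{max}}}} e^{\beta \lambda }\,d\lambda } = \frac{\lambda _{{\mathrm{max}}}}{1 - e^{ - \beta \lambda _{{\mathrm{max}}}}} - \frac{1}{\beta },\]
\[\bar \lambda (\beta ) \to \frac{\lambda _{{\mathrm{max}}}}{2}\;\;(\beta \to 0),\qquad \bar \lambda (\beta ) \simeq \lambda _{{\mathrm{max}}} - \frac{1}{\beta }\;\;(\beta \to \infty ).\]
The first limit is the mean of the uniform distribution, and the second approaches the maximal growth rate, with a gap that shrinks as the inverse of the optimization parameter.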
Information costs of regulation
The first possibility is to interpret mechanistically the deviation of the flux distribution away from uniform sampling of the polytope \({\cal P}\), and its localization around the optimal solution, as a consequence of active regulation in the metabolic network. Such regulation could be achieved by, e.g., control over gene expression of key metabolic enzymes, or by allosteric feedback regulation mediated by metabolite concentrations or fluxes. The degree of localization of the flux distribution can be quantified by its entropy:
\[S[P_\beta ] = - \int_{\cal P} P_\beta ({\mathbf{v}})\log P_\beta ({\mathbf{v}})\,d{\mathbf{v}}. \tag{9}\]
Because P_{β}(v) is, by construction, a maximum entropy distribution with average growth rate \(\bar \lambda (\beta )\), the decrease in entropy relative to uniform sampling^{26},
\[I(\beta ) = S[P_0] - S[P_\beta ], \tag{10}\]
is a measure of the minimal amount of information necessary to control the fluxes and achieve a given average growth rate. Equivalently, if we were to construct a regulation system that needs to realize the Boltzmann distribution of Eq. (7), I(β) would provide a lower bound on its information demand. In the Results section, we estimate this information demand for the E. coli network and propose a toy regulatory model that can meet it.
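For a Boltzmann weight of the form \(e^{\beta \lambda ({\mathbf{v}})}\) on \({\cal P}\), this information cost has a compact closed form. The short derivation below is a sketch under two assumptions: entropies are measured in nats, and Z(β) (our notation) denotes the normalization \(\int_{\cal P} e^{\beta \lambda ({\mathbf{v}})}d{\mathbf{v}}\), so that Z(0) is the volume of the polytope.
\[S[P_\beta ] = - \int_{\cal P} P_\beta \ln P_\beta \,d{\mathbf{v}} = \ln Z(\beta ) - \beta \bar \lambda (\beta ),\qquad S[P_0] = \ln Z(0),\]
\[I(\beta ) = S[P_0] - S[P_\beta ] = \beta \bar \lambda (\beta ) - \ln \frac{Z(\beta )}{Z(0)},\qquad \frac{dI}{d\beta } = \beta \,{\mathrm{Var}}_\beta (\lambda ) \ge 0\;\;(\beta \ge 0).\]
Written this way, I(β) vanishes at β = 0 and grows monotonically with the optimization parameter; dividing by ln 2 converts the result to bits.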
Competitive growth dynamics
The second possibility is that the Boltzmann distribution emerges from competitive growth dynamics. Since the origins of statistical physics, much research has been devoted to uncovering the dynamical roots of Boltzmann distributions; this work has highlighted important concepts and applications, ranging from ergodicity to fluctuation-response relations. The same questions naturally arise when Boltzmann distributions are applied to metabolism. It has been shown that the maximum entropy distribution at a fixed average growth rate is recovered independently, and justified dynamically, as the steady state of logistic growth^{28}. Since logistic growth is the standard model used to fit experimental optical density curves^{45}, this link also provides a possible interpretation of the maximum entropy parameter β, as we discuss below.
Consider a population of initial size N_{0} in a medium with carrying capacity N_{C} and assume that the intrinsic growth rates of individuals, λ_{i}, are sampled independently from a distribution q(λ), defined over the feasible polytope \({\cal P}\). In the simplest setting, upon neglecting growth state transitions, the number n_{i} of cells with growth rate λ_{i} will evolve in time according to
\[\dot n_i = \lambda _in_i\left( {1 - \frac{N}{{N_C}}} \right),\qquad N = \sum\nolimits_j n_j. \tag{11}\]
Then, \(n_i(t) = e^{\beta (t)\lambda _i}\), with
\[\beta (t) = \int_0^t \left( {1 - \frac{{N(t')}}{{N_C}}} \right)dt'. \tag{12}\]
Under a mean field approximation^{28}, the steady states of these dynamics are distributions with maximum entropy form at a fixed average growth rate, where the asymptotic optimization parameter, \(\beta ^ \star\), is given implicitly by the equation
\[\frac{{N_C}}{{N_0}} = \int q(\lambda )\,e^{\beta ^ \star \lambda }\,d\lambda . \tag{13}\]
Equation (13) can be viewed as a relationship between quantities that can be independently estimated for a specific experimental setup: the inoculum size (N_{0}) and carrying capacity (N_{C}) on the one hand, as well as the typical value of β, via Eq. (8) or direct fitting of measured metabolic fluxes, on the other.
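The link between logistic competition and the optimization parameter can be explored numerically with the sketch below. It assumes logistic dynamics of the form dn_i/dt = λ_i n_i (1 − N/N_C) with n_i(0) = 1, integrates them with a simple Euler scheme, and compares the accumulated β(t) = ∫(1 − N/N_C)dt with the value β* at which the population reaches the carrying capacity, Σ_i e^{β*λ_i} = N_C (a finite-sample version of the mean-field condition). The growth-rate distribution, population sizes, step size, and function names are hypothetical.

```python
import math
import random

def simulate(lams, n_c, dt=1e-3, t_end=30.0):
    """Euler integration of dn_i/dt = lam_i * n_i * (1 - N/n_c), starting from n_i = 1.

    Returns the final abundances and the accumulated beta = integral of (1 - N/n_c) dt.
    """
    n = [1.0] * len(lams)
    beta = 0.0
    for _ in range(int(round(t_end / dt))):
        brake = 1.0 - sum(n) / n_c  # logistic factor shared by all lineages
        n = [ni + dt * li * ni * brake for ni, li in zip(n, lams)]
        beta += dt * brake
    return n, beta

def solve_beta_star(lams, n_c, tol=1e-12):
    """Bisection for beta* such that sum_i exp(beta* * lam_i) == n_c."""
    total = lambda b: sum(math.exp(b * l) for l in lams)
    lo, hi = 0.0, 1.0
    while total(hi) < n_c:  # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if total(mid) < n_c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical setup: 50 founder cells with growth rates uniform on [0.5, 1.5].
_rng = random.Random(0)
LAMS = [_rng.uniform(0.5, 1.5) for _ in range(50)]
N_C = 5000.0

if __name__ == "__main__":
    n_final, beta_sim = simulate(LAMS, N_C)
    print(sum(n_final), beta_sim, solve_beta_star(LAMS, N_C))
```

In this setting the simulated β(t) saturates close to β*, up to the discretization error of the Euler scheme, and the final abundances follow e^{β*λ_i}: faster-growing lineages are exponentially enriched, with the enrichment set by the ratio N_C/N_0.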
Taken together, the two mechanisms, active regulation and competitive growth dynamics, need not be exclusive, and can operate concurrently. A simple diagnostic that could provide insight into the relative importance of both mechanisms is to examine whether the relationship of Eq. (13) is satisfied. If it were, it would suggest that the Boltzmann distribution is dynamical in origin. If, on the other hand, the values of β inferred from fitting the maximum entropy model were higher than those derived from the N_{C}/N_{0} ratio and Eq. (13), additional active regulation may be at work. In the Results section and Supplementary Methods, we provide estimates of these quantities for the experiments under consideration.
Code availability
We have provided, at doi:10.15479/AT:ISTA:62, C++ code implementing the Lovász preprocessing as well as the Hit-and-Run algorithm and the polytope representation of the metabolic network used in this study. Please refer to the README.txt file for further information.
Data availability
The metabolic network employed in this study is the catabolic core of the genome-scale reconstruction iAF1260; it is available in the Supplementary materials of the published reconstruction work^{44}. The experimental estimates of the metabolic fluxes can be retrieved from the database^{46} (doi: 10.1093/nar/gku1137; see also the Supplementary Methods and Supplementary Data 1). Single-cell growth rate data are available from the Supplementary materials of ref. ^{37}.
Change history
04 September 2018
This Article was originally published without the accompanying Peer Review File. This file is now available in the HTML version of the Article; the PDF was correct from the time of publication.
References
1. Kacser, H. & Burns, J. A. The control of flux. Biochem. Soc. Trans. 23, 341–366 (1995).
2. Orth, J., Thiele, I. & Palsson, B. O. What is flux balance analysis? Nat. Biotechnol. 28, 245–248 (2010).
3. Ibarra, R. U., Edwards, J. S. & Palsson, B. O. Escherichia coli K-12 undergoes adaptive evolution to achieve in silico predicted optimal growth. Nature 420, 186–189 (2002).
4. Edwards, J. S., Ibarra, R. U. & Palsson, B. O. In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nat. Biotechnol. 19, 125–130 (2001).
5. Majewski, R. A. & Domach, M. M. Simple constrained-optimization view of acetate overflow in E. coli. Biotechnol. Bioeng. 35, 732–738 (1990).
6. Varma, A. & Palsson, B. O. Stoichiometric flux balance models quantitatively predict growth and metabolic by-product secretion in wild-type Escherichia coli W3110. Appl. Environ. Microbiol. 60, 3724–3731 (1994).
7. Edwards, J. & Palsson, B. Metabolic flux balance analysis and the in silico analysis of Escherichia coli K-12 gene deletions. BMC Bioinformatics 1, 1–1 (2000).
8. Vazquez, A., Liu, J., Zhou, Y. & Oltvai, Z. N. Catabolic efficiency of aerobic glycolysis: the Warburg effect revisited. BMC Syst. Biol. 4, 58 (2010).
9. Wang, P. et al. Robust growth of Escherichia coli. Curr. Biol. 20, 1099–1103 (2010).
10. Iyer-Biswas, S., Crooks, G. E., Scherer, N. F. & Dinner, A. R. Universality in stochastic exponential growth. Phys. Rev. Lett. 113, 028101 (2014).
11. Iyer-Biswas, S. Scaling laws governing stochastic growth and division of single bacterial cells. Proc. Natl Acad. Sci. USA 111, 15912–15917 (2014).
12. Kennard, A. S. et al. Individuality and universality in the growth-division laws of single E. coli cells. Phys. Rev. E 93, 012408 (2016).
13. Naama, B. et al. Universal protein distributions in a model of cell growth and division. Phys. Rev. E 92, 042713 (2015).
14. Taheri-Araghi, S. et al. Cell-size control and homeostasis in bacteria. Curr. Biol. 25, 385–391 (2015).
15. Amir, A. Cell size regulation in bacteria. Phys. Rev. Lett. 112, 208102 (2014).
16. Kiviet, D. J. et al. Stochasticity of metabolism and growth at the single-cell level. Nature 514, 376–379 (2014).
17. Shahrezaei, V. & Marguerat, S. Connecting growth with gene expression: of noise and numbers. Curr. Opin. Microbiol. 25, 127–135 (2015).
18. Keren, L. et al. Noise in gene expression is coupled to growth rate. Genome Res. 25, 1893–1902 (2015).
19. Cerulus, B., New, A. M., Pougach, K. & Verstrepen, K. J. Noise and epigenetic inheritance of single-cell division times influence population fitness. Curr. Biol. 26, 1138–1147 (2016).
20. Jaynes, E. T. Information theory and statistical mechanics. Phys. Rev. 106, 620 (1957).
21. Schneidman, E., Berry, M. J., Segev, R. & Bialek, W. Weak pairwise correlations imply strongly correlated network states in a neural population. Nature 440, 1007–1012 (2006).
22. Tkačik, G. et al. Searching for collective behavior in a large network of sensory neurons. PLoS Comput. Biol. 10, e1003408 (2014).
23. Lezon, T. R., Banavar, J. R., Cieplak, M., Maritan, A. & Fedoroff, N. V. Using the principle of entropy maximization to infer genetic interaction networks from gene expression patterns. Proc. Natl Acad. Sci. USA 103, 19033–19038 (2006).
24. Mora, T., Walczak, A. M., Bialek, W. & Callan, C. G. Maximum entropy models for antibody diversity. Proc. Natl Acad. Sci. USA 107, 5405–5410 (2010).
25. Bialek, W. et al. Statistical mechanics for natural flocks of birds. Proc. Natl Acad. Sci. USA 109, 4786–4791 (2012).
26. De Martino, D., Capuani, F. & De Martino, A. Growth against entropy in bacterial metabolism: the phenotypic trade-off behind empirical growth rate distributions in E. coli. Phys. Biol. 13, 036005 (2016).
27. De Martino, D., Mori, M. & Parisi, V. Uniform sampling of steady states in metabolic networks: heterogeneous scales and rounding. PLoS ONE 10, e0122670 (2015).
28. De Martino, D., Capuani, F. & De Martino, A. Quantifying the entropic cost of cellular growth control. Phys. Rev. E 96, 010401 (2017).
29. De Martino, D. & Masoero, D. Asymptotic analysis of noisy fitness maximization, applied to metabolism and growth. J. Stat. Mech. Theory Exp. 2016, 123502 (2016).
30. De Martino, D. Maximum entropy modeling of metabolic networks by constraining growth-rate moments predicts coexistence of phenotypes. Phys. Rev. E 96, 060401 (2017).
31. De Martino, D. Scales and multimodal flux distributions in stationary metabolic network models via thermodynamics. Phys. Rev. E 95, 062419 (2017).
32. Kümmel, A., Panke, S. & Heinemann, M. Putative regulatory sites unraveled by network-embedded thermodynamic analysis of metabolome data. Mol. Syst. Biol. 2, 1 (2006).
33. Sakamoto, N., Kotre, A. M. & Savageau, M. A. Glutamate dehydrogenase from Escherichia coli: purification and properties. J. Bacteriol. 124, 775–783 (1975).
34. Slonim, N., Atwal, G. S., Tkačik, G. & Bialek, W. Information-based clustering. Proc. Natl Acad. Sci. USA 102, 18297–18302 (2005).
35. Tkačik, G. & Walczak, A. M. Information transmission in genetic regulatory networks: a review. J. Phys. Condens. Matter 23, 153102 (2011).
36. Tkačik, G. & Bialek, W. Information processing in living systems. Annu. Rev. Condens. Matter Phys. 7, 89–117 (2016).
37. Bergmiller, T. et al. Biased partitioning of the multidrug efflux pump AcrAB-TolC underlies long-lived phenotypic heterogeneity. Science 356, 311–315 (2017).
38. Mori, M., Hwa, T., Martin, O. C., De Martino, A. & Marinari, E. Constrained allocation flux balance analysis. PLoS Comput. Biol. 12, e1004913 (2016).
39. Fernandez-de-Cossio-Diaz, J. & Mulet, R. Fast inference of ill-posed problems within a convex space. J. Stat. Mech. Theory Exp. 7, 073207 (2016).
40. Alessandro Massucci, F., Font-Clos, F., De Martino, A. & Pérez Castillo, I. A novel methodology to estimate metabolic flux distributions in constraint-based models. Metabolites 3, 838–852 (2013).
41. Font-Clos, F., Massucci, F. A. & Castillo, I. P. A weighted belief-propagation algorithm to estimate volume-related properties of random polytopes. J. Stat. Mech. Theory Exp. 2012, P11003 (2012).
42. Martelli, C., De Martino, A., Marinari, E., Marsili, M. & Castillo, I. P. Identifying essential genes in Escherichia coli from a metabolic optimization principle. Proc. Natl Acad. Sci. USA 106, 2607–2611 (2008).
43. Braunstein, A., Muntoni, A. P. & Pagnani, A. An analytic approximation of the feasible space of metabolic networks. Nat. Commun. 8, 14915 (2017).
44. Orth, J. D. et al. A comprehensive genome-scale reconstruction of Escherichia coli metabolism—2011. Mol. Syst. Biol. 7, 535 (2011).
45. Baranyi, J. & Roberts, T. A. A dynamic approach to predicting bacterial growth in food. Int. J. Food Microbiol. 23, 277–294 (1994).
46. Zhang, Z. et al. CeCaFDB: a curated database for the documentation, visualization and comparative analysis of central carbon metabolic flux distributions explored by 13C-fluxomics. Nucleic Acids Res. 43, D549–D557 (2014).
Acknowledgements
We acknowledge the support of the Austrian Science Fund grant FWF P28844 (G.T.) and of the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme (FP7/2007-2013) under REA grant agreement no. [291734] (D.D.M).
Author information
Contributions
D.D.M. and G.T. conceived the study and developed the theory. D.D.M. performed the simulations. A.A. and D.D.M. carried out the data analysis. C.G. and T.B. carried out the experiments. All authors contributed to writing the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
De Martino, D., MC Andersson, A., Bergmiller, T. et al. Statistical mechanics for metabolic networks during steady state growth. Nat Commun 9, 2988 (2018). https://doi.org/10.1038/s41467-018-05417-9