Main

The small molecules of cellular metabolism are generally thought to rapidly diffuse throughout the cytoplasm. A notable exception to this occurs when the products of an enzyme active site are locally processed by a subsequent active site. This phenomenon, known as direct channeling, relies on the formation of protein tunnels that connect consecutive active sites, preventing metabolic intermediates from diffusing away (Fig. 1a). Indeed, numerous occurrences of direct channeling have been observed. Substrates can be channeled between active sites of two separate enzymes1, or between two active sites within a single polyfunctional enzyme2,3.

Figure 1: Different types of intermediate channeling in a two-step metabolic pathway, where a substrate is processed by enzyme E1 and turned into intermediate, which is then processed by enzyme E2 and turned into product.

(a) Direct channeling. The intermediate is funneled from enzyme E1 to enzyme E2 by means of a protein tunnel that connects the active sites of E1 and E2, thus preventing the intermediate from diffusing away. (b) Proximity channeling. Top: E1 and E2 are positioned near enough to each other such that the intermediate produced by E1 is processed by E2 before it can escape by diffusion, even in the absence of an actual channel. Bottom: if E1 and E2 are not near enough to each other, an intermediate molecule produced by E1 escapes by diffusion, and it cannot be processed by E2. (c) Enzyme clustering. Once E1 produces an intermediate molecule, even though the probability of the intermediate being processed by any individual E2 enzyme is low, the probability that the intermediate will be processed by one of the many E2 enzymes in the agglomerate can be high.

An alternative channeling mechanism, proximity channeling, is the subject of much speculation4,5,6,7. Proximity channeling involves two enzymes positioned near enough to each other such that the intermediate produced by the first enzyme is processed by the second enzyme before it can escape by diffusion, even in the absence of an actual channel (Fig. 1b). Proximity channeling has been invoked to explain the large improvements in product titers that occur when enzymes are colocalized on protein scaffolds8,9.

However, there is a fundamental problem with proximity channeling in its simple form. For a diffusing substrate in the vicinity of an enzyme, processing only becomes likely if the substrate approaches the catalytic site of the enzyme within the radius of the site (0.1−1 nm). Hence, an intermediate produced by the first member of an enzyme pair is unlikely to be processed by the second member even if the active sites of the two enzymes are only 10 nm apart7. Thus, simply fusing two enzymes together will not cause productive channeling.
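The estimate above can be illustrated with a classical diffusion result, used here as an idealization that the text does not spell out: a molecule released at distance d from the center of a perfectly absorbing sphere of radius a in an unbounded medium is eventually captured with probability a/d. Treating the second active site as such a sphere, and ignoring the bulk of the enzymes themselves, gives a rough upper bound on pairwise channeling. The function name and the choice of sample radii are illustrative only.

```python
def capture_probability(site_radius_nm: float, distance_nm: float) -> float:
    """Probability that a diffusing molecule released `distance_nm` from the
    centre of a perfectly absorbing sphere of radius `site_radius_nm`
    ever reaches it (classical 3D result: a / d)."""
    if distance_nm <= site_radius_nm:
        return 1.0
    return site_radius_nm / distance_nm


# Active-site radii spanning the 0.1-1 nm range quoted in the text,
# with the two sites 10 nm apart.
for radius_nm in (0.1, 1.0):
    p = capture_probability(radius_nm, 10.0)
    print(f"site radius {radius_nm} nm, 10 nm away: capture probability {p:.2f}")
```

Under these assumptions the capture probability is between 1% and 10%, so most intermediate molecules diffuse away from a single nearby partner.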

So how can we resolve the paradox that scaffolds empirically improve metabolic yields, yet cannot bring two enzymes close enough to mediate channeling? To address this question, we theoretically and experimentally explored the feasibility of another form of channeling, achieved by assembling multiple copies of both upstream and downstream enzymes into a functional cocluster we refer to as an 'agglomerate'. The central idea is that once an upstream enzyme produces an intermediate, even though the probability of the intermediate being processed by any individual downstream enzyme is low, the probability that the intermediate will be processed by one of the many downstream enzymes in the agglomerate can be high (Fig. 1c). Indeed, it has already been suggested that the spontaneous formation of multiscaffold agglomerates might account for the surprising effectiveness of engineered enzyme scaffolds9.

Notably, evidence for enzyme clustering has been found in several organisms10,11,12,13,14,15,16,17,18,19,20. For example, the six enzymes of the human de novo purine biosynthetic pathway were observed to reversibly form clusters, named purinosomes, in HeLa cells in response to purine availability17,21, though the role of expression levels and fluorescent fusions remains open22. Furthermore, in a recent study18 of Saccharomyces cerevisiae it was observed that out of 800 GFP-tagged cytosolic proteins, 180 clearly displayed evidence of dynamic clustering, with many of these proteins involved in intermediary metabolism and stress response.

Although cluster-mediated channeling has been previously hypothesized9, it has lacked both a quantitative model of its limits and benefits and direct experimental support. Here we show quantitatively that compact agglomerates in which enzymes are coclustered offer many of the same advantages as direct channeling, in particular the acceleration of intermediate processing. Importantly, our theoretical approach allows us to find the optimal distribution of enzymes. For a representative linear pathway we are able to predict the optimal separation and size of agglomerates, as well as their detailed internal structure including enzyme ratios. Remarkably, the advantages of enzyme clustering are due solely to an enzyme-concentration effect: two consecutive enzymes in a pathway are present at high concentration in the same region of space. This mechanism induces efficient intermediate channeling without requiring any of the microscopic features of direct or proximity channeling scenarios, such as protein tunnels or extreme proximity between catalytic sites.

In support of our theoretical conclusions, we experimentally confirmed metabolic channeling by constructing an enzyme agglomerate. A robust steady-state way to detect channeling in vivo is to assess flux division at a metabolic branch point. Specifically, we focused on a fundamental branch point in E. coli, where carbamoyl phosphate synthetase (CarB) synthesizes carbamoyl phosphate, which can then be committed toward pyrimidine biosynthesis by aspartate carbamoyltransferase (PyrB) or toward arginine biosynthesis by ornithine carbamoyltransferase. We found that if CarB and PyrB are coclustered into an agglomerate, the flux is shunted toward the PyrB branch, leading to ample production of pyrimidines but low levels of arginine. We have extended our model for a linear pathway to the above branch point in E. coli, and the results confirm the shunting of metabolic flux owing to clustering observed in the experiments. To our knowledge, no other group has demonstrated the acceleration of intermediate processing by intentionally engineered enzyme agglomerates.

Results

Model

We first describe a mathematical model of a two-step metabolic pathway. We write the reaction-diffusion equations for the pathway, discuss how to determine the efficiency of the pathway given a particular enzyme distribution, and show how to find the enzyme distribution yielding the maximal efficiency (Supplementary Figs. 1–5).

Reaction-diffusion equations. We modeled a two-step metabolic pathway (Fig. 2a) where substrate S0 is processed by enzyme E1 into the intermediate S1, which in turn can be processed by enzyme E2 into product.

Figure 2: Two-step metabolic pathway with an unstable intermediate.

(a) The two-step metabolic pathway. Substrate S0 is processed by enzyme E1 and turned into intermediate S1, which is then processed by enzyme E2 and turned into product P. (b) Enzyme configurations in the two-step metabolic pathway. Left: the cell cytoplasm is divided into multiple identical basins; each basin is represented by a dashed circle. Within each basin, enzymes are clustered into a central spherical agglomerate (shown in green). Right: blow-up showing the dynamics of the metabolic pathway within an agglomerate with radius r*. Substrate S0, which is produced throughout the cytoplasm, is processed by E1 and turned into S1, which is then processed by E2. Both S0 and S1 may escape from the agglomerate by diffusion. (c) Metabolic pathway efficiency ɛ and efficiencies of the first and second step ɛ1, ɛ2 as functions of basin radius R. For each R, efficiency is optimized over enzyme densities n1(r), n2(r), which are assumed to be spherically symmetric. Local enzyme density is constrained by n1(r) + n2(r) ≤ nmax, and the total catalytic activity is fixed to κcat. The efficiencies of the first and second step ɛ1, ɛ2 are a decreasing and an increasing function of R, respectively. Hence, the optimal efficiency is obtained as a tradeoff between ɛ1 and ɛ2, and is equal to ɛopt = 0.53 at Ropt = 6.5 μm. Except where noted, k1 = k2 and parameter values are the same for all figures. The optimal efficiency ɛopt = 0.53 is about 5.9 times larger than the efficiency ɛdelocalized = 0.09 of a delocalized configuration where enzymes are uniformly distributed in space. (d) Optimal distributions of enzymes E1, E2 and corresponding concentrations of substrate S0 and intermediate S1 as functions of r/R, where the optimal basin radius (i.e., half the optimal spacing between clusters) is R = Ropt = 6.5 μm from c. The local enzyme density n1(r) + n2(r) and its maximal value nmax are also shown. The optimal enzyme distribution is a compact cluster composed of a shell of E1 and E2 surrounded by a halo of E2. Inset: concentrations of substrate S0 and intermediate S1 as functions of r/R in the entire basin.

In what follows, we define a minimal model describing the above metabolic pathway. Given the large number of enzyme and substrate molecules, we represent their spatial distribution by continuous densities in the cytoplasm. The steady-state densities of enzymes E1, E2 are denoted by n1(r), n2(r), where r is the position vector. The concentrations of S0 and S1 are functions of position and of time t. We assume that the substrate S0 is the product of upstream metabolic processes performed by enzymes that are distributed uniformly in the cytoplasm. These metabolic processes act to relax the local concentration of S0 toward a homeostatic level, and we denote by α0 the rate of this relaxation. To model intermediate decay23 and/or incorrect processing24, we assume that S1 decays at a rate β. For β = 0, intermediates can accumulate indefinitely, curtailing the advantage of clustering, although clustering can still reduce overall metabolite concentrations (Supplementary Fig. 6).

Hence, the reaction-diffusion equations governing the two-step pathway (equation (1)) involve k1 and k2, the kcat/KM values of enzymes E1 and E2 from Michaelis-Menten kinetics, and the diffusion coefficient D.
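For reference, one form of equation (1) that is consistent with the ingredients listed in the text (diffusion, processing at rates proportional to kcat/KM and to local enzyme density, relaxation of S0 toward its homeostatic level at rate α0, and decay of S1 at rate β) is sketched below. The symbols c_0, c_1 for the concentrations of S0, S1 and \bar{c}_0 for the homeostatic level are notation assumed here, not taken from the original.

```latex
\begin{aligned}
\frac{\partial c_0(\mathbf{r},t)}{\partial t}
  &= D\,\nabla^2 c_0(\mathbf{r},t)
     - k_1\, n_1(\mathbf{r})\, c_0(\mathbf{r},t)
     + \alpha_0\left[\bar{c}_0 - c_0(\mathbf{r},t)\right], \\
\frac{\partial c_1(\mathbf{r},t)}{\partial t}
  &= D\,\nabla^2 c_1(\mathbf{r},t)
     + k_1\, n_1(\mathbf{r})\, c_0(\mathbf{r},t)
     - k_2\, n_2(\mathbf{r})\, c_1(\mathbf{r},t)
     - \beta\, c_1(\mathbf{r},t).
\end{aligned}
```

The term k_1 n_1 c_0 appears with opposite signs in the two lines because every molecule of S0 processed by E1 becomes one molecule of S1.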

Because the time scale of enzyme clustering (>1 h)18 is much longer than the time scale of intermediate processing (1 s), we will focus on the steady state of equation (1), where the concentrations of S0 and S1 no longer change in time, and thus set their time derivatives to zero.

Efficiency of the pathway. An important quantity is the efficiency ɛ (equation (2)). The efficiency 0 ≤ ɛ ≤ 1 is the total rate of production of product in the volume V of the system divided by the maximal possible rate of production of the substrate of E1. Given the enzyme distributions n1, n2, one can solve equation (1), determine the concentrations of S0 and S1, and thus compute the efficiency ɛ[n1, n2] of the enzyme configuration by means of equation (2).
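Written out, the verbal definition of the efficiency corresponds to an expression of the following kind. This is a sketch under assumed notation: c_1 is the steady-state concentration of S1 and \bar{c}_0 the homeostatic level of S0, so that α0 \bar{c}_0 V is the largest rate at which S0 can be supplied (reached when S0 is fully depleted).

```latex
\varepsilon[n_1, n_2]
  = \frac{\displaystyle \int_V \mathrm{d}^3 r \; k_2\, n_2(\mathbf{r})\, c_1(\mathbf{r})}
         {\alpha_0\, \bar{c}_0\, V}
```

The numerator is the total rate at which E2 turns S1 into product. Loss of S1 to decay or incomplete uptake of S0 by E1 lowers ɛ below 1.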

Before seeking the enzyme distributions that optimize the efficiency, we observe that for enzyme clustering in yeast25, mammalian cells17, and some bacteria26, the overall enzyme distribution consists of one or more well-separated dense clusters. Guided by this observation, in our model we assumed that the optimal distribution is given by the identical repetition of a simple pattern: an enzyme cluster and its surrounding volume (Fig. 2b). We approximated this surrounding volume as a sphere of radius R, which we call the 'basin' of the cluster. Furthermore, we expect and assume spherical symmetry within each basin, that is, enzyme densities and substrate concentrations depend only on the distance r from the center of the basin, and we fixed no-flux boundary conditions at the edge of the basin.

To compute the efficiency of an enzyme configuration and determine the enzyme configuration with maximal efficiency, we do not have to solve equation (1) in the whole volume V. We only need to identify the size R of the basin, and solve the reaction-diffusion equations within one basin. Because all basins are identical, the efficiency of the configuration is then given by the efficiency ɛ[n1, n2] computed within a single basin (equation (3)).

Determining the optimal enzyme distribution. We computed the basin radius R and the enzyme configuration n1(r), n2(r) that maximize the efficiency in equation (3). As there are natural constraints both on enzyme catalytic constants and on the cell's enzyme production, we chose to optimize the efficiency while holding constant the total catalytic activity κcat (equation (4)), which depends on the catalytic constants and on the total enzyme numbers N1, N2 in the basin. Another physical constraint that must be taken into account is the finite size of enzymes. This implies a maximal local enzyme density nmax that cannot be exceeded27, which we modeled by the constraint n1(r) + n2(r) ≤ nmax (equation (5)).

The optimization of the efficiency was performed under the constraints (4,5) (Supplementary Fig. 1 and Supplementary Software 1), the values of the model parameters (diffusion coefficients, catalytic rates, etc.) being fixed from experimental data (Online Methods: Model parameters).
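Explicit forms of the single-basin efficiency and of the two constraints, consistent with the descriptions in the text, can be sketched as follows. The notation c_1, \bar{c}_0 is assumed here, and writing the total catalytic activity as the sum k_1 N_1 + k_2 N_2 is one natural reading of "total catalytic activity", not a form confirmed by the text.

```latex
\begin{aligned}
\varepsilon[n_1,n_2]
  &= \frac{3}{\alpha_0\,\bar{c}_0\,R^3}
     \int_0^R \mathrm{d}r\; r^2\, k_2\, n_2(r)\, c_1(r)
  && \text{(single basin, spherical symmetry)} \\
N_i &= 4\pi \int_0^R \mathrm{d}r\; r^2\, n_i(r), \qquad i = 1, 2
  && \text{(total enzyme numbers)} \\
\kappa_{\mathrm{cat}} &= k_1 N_1 + k_2 N_2
  && \text{(assumed form of the fixed total activity)} \\
n_1(r) + n_2(r) &\le n_{\max}
  && \text{(crowding constraint)}
\end{aligned}
```

The prefactor in the first line follows from dividing the volume integral 4\pi\int r^2\,\mathrm{d}r by the basin volume (4/3)\pi R^3.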

In the following section we present and discuss the enzyme distribution yielding the maximal efficiency.

Optimal enzyme distribution

For our chosen parameters the optimal size of the basin is R = 6.5 μm (Fig. 2c). Thus, if the cell radius is large enough, the optimal enzyme distribution consists of multiple enzyme clusters spaced by 6.5 μm. Notably, this distance is similar to the spacing between purine biosynthesis enzyme clusters in human cells17.

For the optimal basin radius, Figure 2d shows the distribution of enzymes that optimizes metabolic efficiency. E2 is colocalized with E1 in a shell to be able to process the intermediate where S1 is produced, and the extra shell and halo of E2 help to process molecules of S1 that might otherwise diffuse away. The outer radius of the spherical shell of E1 and E2 is only about 4% of the basin radius R. Within the shell, the total enzyme density is the maximum value allowed by crowding nmax, and both of the enzyme distributions are nearly uniform. The fraction of enzyme E1 in this optimal configuration with k1 = k2 is N1/(N1+ N2) = 0.33.

How advantageous is clustering? The optimal efficiency, ɛopt = 0.53 (Fig. 2) for our choice of intermediate decay rate β = 10/s, should be compared with the efficiency in the delocalized case, that is, the case where enzymes uniformly fill the whole cytoplasm. The latter was found to be ɛdelocalized = 0.09, showing that enzyme clustering improved the efficiency of the two-step pathway by almost sixfold.

Notably, the detailed internal structure of the enzyme distribution (Fig. 2d) was not essential to achieve a high metabolic efficiency. A much simpler enzyme configuration where both E1 and E2 were uniformly distributed in a maximally dense enzyme sphere of radius r* (Fig. 3) yielded an efficiency within 6% of the optimal efficiency ɛopt = 0.53 of the enzyme configuration shown in Figure 2d (Supplementary Software 2).
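To make the uniform-sphere comparison concrete, the following is a minimal numerical sketch, not the authors' Supplementary Software. It solves steady-state reaction-diffusion equations of the kind described in the Model section on radial shells, with a no-flux outer boundary, and compares a dense uniform sphere with a delocalized distribution holding the same number of enzymes. The values D = 100 μm²/s, β = 10/s, nmax = 25 mM and R = 6.5 μm appear in the text; the catalytic constant (10⁶ per M per s), α0 = 1/s, the cluster radius of 4% of R, the equal split between E1 and E2, and the factorization ɛ = ɛ1·ɛ2 are assumptions made here for illustration.

```python
import numpy as np


def basin_efficiency(n1, n2, edges, D, k1, k2, alpha0, beta, c0_bar=1.0):
    """Return (eps, eps1, eps2) at steady state for enzyme densities n1, n2
    given on radial shells bounded by `edges` (finite volumes, no-flux edge)."""
    vol = 4.0 / 3.0 * np.pi * np.diff(edges**3)            # shell volumes
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Diffusive coupling across each interior face: D * area / centre spacing.
    g = D * 4.0 * np.pi * edges[1:-1] ** 2 / np.diff(centers)
    m = len(vol)
    lap = np.zeros((m, m))
    i = np.arange(m - 1)
    lap[i, i + 1] = g
    lap[i + 1, i] = g
    lap[i, i] -= g
    lap[i + 1, i + 1] -= g
    # Nothing is added at r = 0 (zero area) or r = R (no-flux boundary).
    # S0: diffusion, processing by E1, relaxation toward c0_bar.
    a0 = lap - np.diag(vol * (k1 * n1 + alpha0))
    c0 = np.linalg.solve(a0, -vol * alpha0 * c0_bar)
    # S1: diffusion, production by E1, processing by E2, decay.
    a1 = lap - np.diag(vol * (k2 * n2 + beta))
    c1 = np.linalg.solve(a1, -vol * k1 * n1 * c0)
    made = np.sum(vol * k1 * n1 * c0)        # total rate of S1 production
    out = np.sum(vol * k2 * n2 * c1)         # total rate of product formation
    eps1 = made / (alpha0 * c0_bar * vol.sum())
    eps2 = out / made
    return eps1 * eps2, eps1, eps2


def uniform_sphere(edges, r_star, density):
    """Density profile equal to `density` inside radius r_star, zero outside."""
    centers = 0.5 * (edges[:-1] + edges[1:])
    return np.where(centers < r_star, density, 0.0)


MOLAR = 6.022e8              # molecules per cubic micrometre in a 1 M solution
D = 100.0                    # um^2/s (value quoted in the text)
k = 1.0e6 / MOLAR            # um^3/s; hypothetical kcat/KM of 1e6 per M per s
n_max = 25e-3 * MOLAR        # dense-packing limit, 25 mM (from the text)
alpha0, beta = 1.0, 10.0     # 1/s; alpha0 is hypothetical, beta is from the text
R = 6.5                      # um, basin radius (from the text)
r_star = 0.04 * R            # um, hypothetical cluster radius
edges = np.linspace(0.0, R, 801)

clustered = uniform_sphere(edges, r_star, n_max / 2)
# Same enzyme number spread evenly over the whole basin.
spread = np.full(len(edges) - 1, (n_max / 2) * (r_star / R) ** 3)

for label, n in (("clustered", clustered), ("delocalized", spread)):
    eps, eps1, eps2 = basin_efficiency(n, n, edges, D, k, k, alpha0, beta)
    print(f"{label}: eps = {eps:.3f} (eps1 = {eps1:.3f}, eps2 = {eps2:.3f})")
```

Because the parameter set is only partly taken from the text, the printed numbers should not be expected to reproduce ɛopt = 0.53 or ɛdelocalized = 0.09. The sketch is meant to show how an efficiency is obtained from a given enzyme profile, and the finite-volume form was chosen because it conserves molecules exactly, which keeps ɛ between 0 and 1.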

Figure 3: Two-step metabolic pathway with an unstable intermediate for uniform enzyme spheres with equal density.

(a) Metabolic pathway efficiency for uniform enzyme spheres of enzymes E1 and E2 with N1 = N2 compared to the optimal case from Figure 2c, and efficiencies of the first and second step of the pathway for uniform enzyme spheres as functions of basin radius R. For each R, efficiency is optimized over enzyme densities n1(r), n2(r), which are assumed to be spherically symmetric. Local enzyme density is constrained by n1(r) + n2(r) ≤ nmax, and the total catalytic activity is fixed to κcat. The optimal efficiency and radius for uniform enzyme spheres are close to the values ɛopt = 0.53 and Ropt = 6.5 μm obtained for the fully optimized case. (b) Optimal distributions of enzymes E1, E2 and corresponding concentrations of substrate S0 and intermediate S1 for uniform enzyme spheres of enzymes E1 and E2 as functions of r/R, where the basin radius is the optimal one from a. The local enzyme density n1(r) + n2(r) and its maximal value nmax are also shown. The optimal enzyme distribution is a compact cluster composed of a sphere uniformly filled with enzymes E1 and E2. Inset: concentrations of substrate S0 and intermediate S1 as functions of r/R in the entire basin region.

It can be shown that the optimal basin radius Ropt is the best compromise between the efficiency of the first and second steps of the pathway (Fig. 2c). For large R, diffusion of S0 to the enzyme cluster is limiting and the maximum possible rate of production of S0 greatly exceeds the rate of processing by E1 leading to small ɛ1. By contrast, for small R the enzyme agglomerate is so small that much of the intermediate S1 escapes by diffusion, resulting in small ɛ2. Note that for a cell whose radius is much smaller than the optimal basin radius Ropt = 6.5 μm, the highest achievable efficiency was substantially reduced compared to the optimal value ɛopt = 0.53 for the pathway parameters used in Figure 2.

Finally, we performed the optimization in Figure 2c,d for a range of values of each relevant parameter (Supplementary Fig. 4), and showed that these results were robust with respect to the assumptions used to build the model, that is, spherical symmetry, instability of intermediates and absence of enzyme saturation (Supplementary Figs. 5–7).

Branch-point regulation by enzyme clustering in E. coli

An important question is whether enzyme clustering can do more inside cells than increase metabolic efficiency. An intriguing possibility is that clustering might prove effective in regulating flux division, specifically at metabolic branch points. To experimentally test this possibility, we focused on a fundamental branch point in E. coli, where carbamoyl phosphate synthetase (CarB) synthesizes carbamoyl phosphate, an intermediate that can then be committed toward pyrimidine biosynthesis by aspartate carbamoyltransferase (PyrB) or toward arginine biosynthesis by ornithine carbamoyltransferase (ArgI). Analysis of the catalytic parameters specific to this branch point suggests that a large shunting of the metabolic flux can be obtained by coclustering the upstream enzyme with the downstream enzyme for one branch. In this regard, suppose that we colocalize a typical number28,29 of CarB and PyrB molecules in a cluster whose radius is such that the combined density of CarB and PyrB is equal to the dense-packing limit nmax = 25 mM. Given the catalytic constant of PyrB, kPyrB = 4.8 × 10⁷ liter/s/mol (ref. 30), and an intermediate diffusion coefficient D = 100 μm²/s (ref. 31), we can compare the rate at which an intermediate molecule produced within the cluster is processed by PyrB with the rate at which the intermediate molecule escapes from the cluster by diffusion. As the processing rate is much larger than the escape rate, we expect that by colocalizing CarB and PyrB in a compact cluster we can efficiently channel the intermediate (carbamoyl phosphate) toward the PyrB branch, and thus obtain substantial shunting of flux. Such flux shunting toward the PyrB branch should manifest in high pyrimidine production and low arginine production. The growth of E. coli cells would be consequently stimulated by the presence of additional arginine and insensitive to the presence of additional uracil (a pyrimidine precursor). We refer to such a growth dependence on arginine as arginine pseudoauxotrophy.
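The comparison between processing and escape can be sketched numerically. The processing rate follows directly from the quoted catalytic constant and density, assuming PyrB makes up half of the dense-packed cluster. The escape rate is estimated here as D divided by the squared cluster radius, an order-of-magnitude assumption, and the molecule count used to set the cluster radius is a hypothetical placeholder rather than the value used in the original estimate.

```python
import math

AVOGADRO = 6.02214076e23      # molecules per mole
LITERS_PER_UM3 = 1e-15        # one cubic micrometre expressed in liters


def processing_rate_per_s(k_per_molar_s: float, enzyme_molar: float) -> float:
    """First-order rate at which one intermediate molecule is processed."""
    return k_per_molar_s * enzyme_molar


def cluster_radius_um(n_molecules: float, density_molar: float) -> float:
    """Radius of a sphere holding n_molecules at the given molar density."""
    molecules_per_um3 = density_molar * AVOGADRO * LITERS_PER_UM3
    volume_um3 = n_molecules / molecules_per_um3
    return (3.0 * volume_um3 / (4.0 * math.pi)) ** (1.0 / 3.0)


def escape_rate_per_s(diffusion_um2_s: float, radius_um: float) -> float:
    """Order-of-magnitude escape rate: inverse time to diffuse one radius."""
    return diffusion_um2_s / radius_um**2


K_PYRB = 4.8e7                # liter/s/mol (from the text)
N_MAX = 25e-3                 # mol/liter, dense-packing limit (from the text)
D = 100.0                     # um^2/s (from the text)
N_TOTAL = 2000                # hypothetical: 1,000 CarB plus 1,000 PyrB

k_process = processing_rate_per_s(K_PYRB, N_MAX / 2)   # PyrB is half the cluster
radius = cluster_radius_um(N_TOTAL, N_MAX)
k_escape = escape_rate_per_s(D, radius)

print(f"processing rate: {k_process:.2e} per s")
print(f"cluster radius:  {radius * 1000:.0f} nm")
print(f"escape rate:     {k_escape:.2e} per s")
```

With the quoted constants the processing rate is 6.0 × 10⁵ per second regardless of cluster size. Under the D/r² assumption the escape rate falls as the cluster grows, in proportion to the number of molecules raised to the power −2/3, so larger agglomerates retain the intermediate more effectively.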

Simply fusing CarB to PyrB does not induce channeling. We translationally fused CarB to PyrB (hereafter called CarB-PyrB, Fig. 4) and confirmed that each of the two enzymes in the resulting fusion is functional by demonstrating that the fusion complements the growth of a ΔcarB ΔpyrB double-deletion mutant. We replaced the native carB gene with the carB-pyrB fusion in an NCM3722 ΔpyrB background and assayed growth in minimal media with or without additional arginine or uracil. In all conditions, CarB-PyrB supported wild-type growth with no signs of auxotrophy for either pyrimidines or arginine (Fig. 4b). Thus, there was no evidence that simply fusing CarB and PyrB increased flux of carbamoyl phosphate toward PyrB at the expense of ArgI.

Figure 4: Metabolic pathway with a branch point, in E. coli.

(a) The arginine/pyrimidine branch point in E. coli. CarB (with CarA, not shown) synthesizes carbamoyl phosphate, which can be committed toward arginine biosynthesis by ArgI or toward pyrimidine (e.g., uracil, UMP) synthesis by PyrB. For our experiments we fused CarB to PyrB (CarB-PyrB). Ornithine and aspartate are, respectively, arginine-specific and pyrimidine-specific biosynthetic reactants upstream of the branch point. (b) (left) Expression of CarB-PyrB at the low level characteristic of the endogenous carB gene does not produce phase-bright foci and (right) does not generate arginine pseudoauxotrophy. Cell density is plotted versus time in minimal conditions, in the presence of additional arginine, additional uracil, and both additional arginine and additional uracil (scale bar, 5 μm). Optical densities plotted are the mean ± s.e.m. (c) (left) High-level expression of CarB-PyrB induces the formation of phase-bright foci, indicated by arrows, and (right) causes arginine pseudoauxotrophy. The same quantities as in b are plotted (scale bar, 5 μm). (d) The arginine pseudoauxotrophy results from metabolite shunting. Metabolomic analysis reveals that high-level CarB-PyrB expression causes the pyrimidine pathway pools to increase whereas the arginine pathway pools decrease downstream of ArgI but increase upstream of ArgI. Relative metabolite levels are the mean ± s.e.m. (e) Two-step metabolic pathway with a branch point. Substrate S0 is processed by enzyme E1 and turned into substrate S1. Substrate S1 is then processed by either enzyme EA or EB and turned into product PA or PB, respectively. (f) Schematics of the spatial distributions of enzymes E1, EA, EB in the colocalized case. E1 and EB are uniformly distributed in a compact sphere, whereas EA is uniformly distributed in a larger sphere with the typical radius of an E. coli cell, rA = R = 0.79 μm. We set N1 = NB, and the combined density of E1 and EB is set at the dense-packing limit nmax = 25 mM. (g) Efficiency fractions of the two branches of the pathway as functions of the fraction of enzyme EB, for the colocalized case in f and for the delocalized case where E1, EA and EB are uniformly distributed in a sphere of radius rA = R = 0.79 μm. In both cases the number of EA molecules is fixed to NA = 2,000 enzymes, and the catalytic constants for E1, EA, EB, that is, the values of kcat/KM for CarB, ArgI and PyrB, are taken from ref. 35, ref. 36 and refs. 30,37, respectively. Other parameters are as given in Online Methods: Model parameters.

Channeling occurs when CarB-PyrB forms large clusters. We expected that flux shunting at the CarB-PyrB-ArgI metabolic branch point could be achieved when enzymes were coclustered into functional agglomerates. To test this prediction, we engineered many-enzyme clusters by overexpressing the CarB-PyrB fusion protein, and examined the resulting effect on flux shunting. We conditionally expressed the carB-pyrB gene fusion from the tetracycline-inducible PLtetO-1 promoter and assayed CarB-PyrB agglomeration and function in a ΔcarB ΔpyrB background. At low induction levels (0–0.1 nM anhydrotetracycline (aTc)) cells exhibited generally homogeneous cytoplasm and wild-type growth patterns (Fig. 5a,b). At higher induction levels, however (above 0.5 nM aTc), cells began to exhibit phase-bright cytoplasmic structures typical of protein-dense clusters (Fig. 5b,c), and became strikingly pseudoauxotrophic for arginine: cell growth was insensitive to the presence of additional uracil, but was strongly stimulated by the addition of arginine (Fig. 5a). The growth dependence on arginine became increasingly exaggerated upon the addition of higher concentrations of aTc inducer (Fig. 5a), and was most pronounced in an unrepressed (maximally expressed) strain lacking the tetR repressor gene altogether (Fig. 4c). For this unrepressed strain, the exponential phase doubling time in minimal media supplemented with arginine was faster by 26.6 ± 0.9 min, or about 30%, compared to the doubling time in minimal media alone (Supplementary Table 1). If we assume that the fusion strain's reduction in growth rate in comparison with wild type reflects the fraction of carbamoyl phosphate flux being channeled away from arginine biosynthesis, then we infer that the ratio of arginine biosynthetic fluxes between the clustered and unclustered strains is about 1:2 (from doubling times of 89.4 ± 0.6 min and 47.9 ± 0.6 min; Supplementary Table 1).
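The arithmetic behind the flux ratio can be checked in a few lines. The sketch assumes exponential growth, so that growth rate is ln 2 divided by the doubling time, and reads the 89.4 min doubling time as the clustered strain in minimal media and 47.9 min as the unclustered reference, which is an interpretation of the sentence above rather than something stated explicitly.

```python
import math


def growth_rate_per_min(doubling_time_min: float) -> float:
    """Exponential growth rate corresponding to a doubling time."""
    return math.log(2.0) / doubling_time_min


T_CLUSTERED_MIN = 89.4     # doubling time, clustered strain (from the text)
T_UNCLUSTERED_MIN = 47.9   # doubling time, unclustered strain (from the text)
ARGININE_GAIN_MIN = 26.6   # reduction in doubling time with added arginine

# If growth rate tracks arginine biosynthetic flux, this is the flux ratio.
flux_ratio = growth_rate_per_min(T_CLUSTERED_MIN) / growth_rate_per_min(T_UNCLUSTERED_MIN)

# Fractional shortening of the doubling time when arginine is supplied.
arginine_gain_fraction = ARGININE_GAIN_MIN / T_CLUSTERED_MIN

print(f"clustered : unclustered flux ratio = {flux_ratio:.2f}")
print(f"doubling time shortened by {arginine_gain_fraction:.0%} with arginine")
```

The ratio comes out near 0.54, consistent with the quoted "about 1:2", and the shortening is close to the quoted 30%.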

Figure 5: Phase-bright clusters and arginine pseudoauxotrophy increase with increasing CarB-PyrB overexpression.

(a) Cell density with inducible CarB-PyrB expression as a function of time for different levels of anhydrotetracycline (aTc) inducer concentration and in different media: in minimal conditions, in the presence of additional arginine, additional uracil, and both additional arginine and additional uracil. Mean of Nreplicates = 4 plotted. (b,c) Phase images (b) (scale bars, 5 μm) and quantification (c) of phase-bright foci as functions of aTc inducer concentration. Number of phase-bright spots per cell followed an exponential distribution. Mean and 95% confidence intervals are shown as the s.e.m. (N0 nM = 760, N0.5 nM = 632, N5 nM = 317).

We have also shown that the arginine pseudoauxotrophy was not simply a consequence of PyrB overexpression, but rather required CarB and PyrB to be coclustered and PyrB to be functional (Supplementary Figs. 8–9, Online Methods: Arginine pseudoauxotrophy results from coclustering of CarB and PyrB, and Supplementary Video 1).

The hallmark of branch-point shunting was observed downstream of CarB-PyrB with large increases in pyrimidine pathway metabolite pools (as large as 50×) and a concomitant drop in arginine pathway metabolite pools (Fig. 4d and Supplementary Fig. 10). Furthermore, the arginine-specific biosynthetic intermediates upstream of the branch point, such as ornithine, accumulated significantly (P = 4.0 × 10⁻⁴) (Fig. 4d). This result indicates that ArgI was unable to access its carbamoyl phosphate substrate, which it requires to produce citrulline by carbamoylating ornithine (for evidence that clusters contained active enzyme see Supplementary Fig. 11).

Model for branch-point regulation by clustering in E. coli

Our model predicts that enzyme clustering yields high metabolic efficiency only for cells with radius larger than 6.5 μm (Fig. 2c), with the effectiveness of enzyme clustering progressively reduced for smaller cells. Nevertheless, the above experiments for a branch point in E. coli show that enzyme clustering may also be advantageous in smaller cells. Importantly, compared to the 'typical' catalytic constants used in Figure 2c, the catalytic constant of PyrB is much larger. As a consequence, the rate at which an intermediate molecule is processed within the cluster substantially exceeds the rate at which the molecule escapes from the cluster, and so enzyme clustering is able to channel intermediates to the PyrB branch of the pathway.

Next, we compared the predictions of our model directly with the experimental results for the CarB, PyrB, ArgI branch point in E. coli. Specifically, we applied our modeling formalism to the metabolic branch point shown in Figure 4e (Supplementary Software 3 and 4). A substrate S0 is processed by enzyme E1 into intermediate S1, which can subsequently be processed by enzyme EA into product PA or by enzyme EB into product PB, or S1 can decay. We modeled this pathway using reaction-diffusion equations similar to equation (1). The efficiencies, ɛA,ɛB, of processing by enzymes EA,EB, respectively, are the total rates of production of products PA,PB divided by the maximal possible rate of production of the substrate S0.

A natural control parameter in our experiments in E. coli is the expression level of the CarB-PyrB fusion protein. With this in mind, we fixed NA and varied N1 = NB together, as if these two enzymes were fused (Fig. 4f,g). We computed the efficiency fractions of the two branches of the pathway as functions of the fraction of enzyme EB. We took EA to be uniformly distributed in a sphere with the typical radius of an E. coli cell, rA = R = 0.79 μm, with a number of molecules NA = 2,000, and we assumed no-flux boundary conditions at the boundary of the basin r = R = rA. We then considered two cases: the colocalized case and the delocalized case (Fig. 4f,g and Supplementary Software 3 and 4).

Moderate expression levels of E1 and EB produced considerable flux shunting toward branch B if E1 and EB were colocalized, whereas little to no flux shunting occurred if E1 and EB were delocalized (Fig. 4g). This result agrees with the experimental observation that significant flux shunting occurred within an agglomerate at enzyme concentrations corresponding to moderate induction levels (Fig. 4b–d). In particular, for moderate expression levels the efficiency ɛA in the colocalized case was about half of ɛA in the delocalized case (Fig. 6). This prediction is in fairly good agreement with the experimental observation that the ratio of arginine biosynthetic flux between the clustered and unclustered strains is about 1:2 (see also Supplementary Fig. 12 for details).

Figure 6

Ratio between the efficiency ɛA in the colocalized case and the efficiency ɛA in the delocalized case as a function of the fraction of enzyme EB for the two-step metabolic pathway with a branch point for the same geometry and parameters as in Figure 4.

Discussion

Motivated in part by the recent observation that enzyme scaffolds, presumably bound together into compact agglomerates, can improve metabolic efficiency8,9,32,33, and in part by in vivo observations of enzyme clusters17,18, we developed a quantitative model to assess the benefits of enzyme clustering. We found that enzyme clustering can provide benefits by colocalizing many copies of the enzyme that processes an intermediate with many copies of the enzyme that produces it. By thus increasing the rate of intermediate processing, enzyme clustering can increase the metabolic efficiency of pathways with unstable intermediates17.

Our model achieves computational speed by treating both enzymes and substrates as continuous densities with spherical symmetry, allowing us to find globally optimal enzyme distributions. For a simple two-step metabolic pathway with a short-lived intermediate, we found that the optimal distribution of enzymes in a large cell is given by multiple enzyme clusters, and is almost 6 times more efficient than a delocalized (uniform) distribution (Fig. 2), and 110 times more efficient than a delocalized distribution for a three-step pathway (Supplementary Fig. 13). Moreover, our method allows us to find the optimal spacing between clusters. For our choice of parameters (such as diffusion coefficient, enzyme catalytic rates) in the simple two-step pathway, we found a spacing of 6.5 μm, comparable to the range observed for purinosome spacings in human cells17. Notably, a cluster spacing of 6.5 μm implies that multiple clusters would be optimal only in large cells, such as human cells17. Still, our analysis shows that for enzymes with high catalytic constants, clustering can also be effective in smaller cells, such as bacteria.

In addition, we provide a simple analytical expression for the metabolic efficiency of a clustered enzyme configuration (Supplementary Fig. 4). This expression can be used to obtain the predictions of our model (such as the efficiency of a clustered enzyme configuration, the optimal cluster radius and intercluster spacing) for other metabolic pathways by simply substituting the values of the relevant parameters (such as diffusion coefficient, catalytic constants).

The central prediction of our model is that enzyme clustering within cells can achieve rapid processing of intermediates. From an experimental point of view, measuring the rate of intermediate processing in a linear pathway such as the one shown in Figure 2a is not an easy task, generally requiring kinetic measurements that are hard to carry out in vivo. In contrast, if we add a second branch to the pathway—for example, if we consider a metabolic branch point as shown in Figure 4e—the relative rate of intermediate processing by the two branches can be determined by steady-state measurement of the flux branching ratio. Therefore, an ideal place to experimentally verify our prediction of rapid intermediate processing by an enzyme agglomerate is at a metabolic branch point. We tested this prediction for a fundamental branch point in E. coli. At this branch point, carbamoyl phosphate synthetase (CarB) synthesizes carbamoyl phosphate, which can then be driven into pyrimidine biosynthesis by aspartate carbamoyltransferase (PyrB) or toward arginine biosynthesis by ornithine carbamoyltransferase. A simple estimate based on the catalytic constants of the downstream enzymes suggested that enzyme clustering could yield considerable shunting of flux at this metabolic branch point. Indeed, we showed that if CarB and PyrB are coclustered into an agglomerate, flux is strongly shunted toward the PyrB branch.
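The "simple estimate based on the catalytic constants" is not spelled out in this section. One minimal way to sketch such an estimate, assuming a well-mixed compartment, unsaturated first-order kinetics, and no decay or diffusive loss of the intermediate, is to compare the pseudo-first-order consumption rates of the two branches. The function name and all numerical values below are illustrative assumptions, not values from this study, and the sketch is not the reaction-diffusion model itself.

```python
def branch_fraction(k_a, e_a, k_b, e_b):
    """Fraction of intermediate flux taken by branch A.

    k_a, k_b: kcat/KM of the two downstream enzymes (liter/s/mol).
    e_a, e_b: enzyme concentrations seen by the intermediate (mol/liter).
    Assumes well-mixed, unsaturated (first-order) kinetics.
    """
    rate_a = k_a * e_a  # pseudo-first-order consumption rate, 1/s
    rate_b = k_b * e_b
    return rate_a / (rate_a + rate_b)


# Hypothetical numbers chosen only for illustration.
k = 1e6  # liter/s/mol, same order as the kcat/KM used in the model

# Both enzymes delocalized at the same concentration: flux splits evenly.
delocalized = branch_fraction(k, 1e-6, k, 1e-6)

# Enzyme A is 1,000-fold more concentrated where the intermediate is made.
clustered = branch_fraction(k, 1e-3, k, 1e-6)

print(f"delocalized fraction to A: {delocalized:.3f}")
print(f"clustered fraction to A:   {clustered:.3f}")
```

In this crude picture, a large local excess of one downstream enzyme at the site of intermediate production is enough to take nearly all of the flux, which is the qualitative behavior the branch-point experiment is designed to detect.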

To compare the predictions of our model with the above experimental results, we applied the model to the specific metabolic branch point studied in the experiments. The model confirms that if many copies of the upstream enzyme CarB and the downstream enzyme PyrB are suitably colocalized into a cluster, the carbamoyl phosphate produced by CarB is processed by PyrB before decaying or diffusing out of the cluster, resulting in substantial shunting of flux to the pyrimidine pathway.

Although previous studies8 on enzyme scaffolds demonstrated an increase in product titer, leading to speculation that enzyme agglomerates were formed9, ours is the first demonstration of acceleration of intermediate processing as a direct consequence of enzyme-agglomerate formation, without any specific microscopic arrangement of the enzyme molecules.

Overall, our results provide general guidance for cluster engineering efforts, for example by highlighting the importance of achieving maximum density and the relative unimportance of internal cluster organization. Moreover, our approach provides a computationally tractable means of targeting de novo engineered clusters to the right sizes, stoichiometries and intercluster spacings. These tools will provide for designable control of cluster-regulated metabolic networks to produce economically viable product titers and meet demands for several biotechnology applications, such as therapeutics, drugs and biofuels. An interesting direction for the future will be to determine the mechanisms by which natural enzyme clusters are formed34.

Methods

Bacterial strain and plasmid construction.

The bacterial strains, plasmids, and oligonucleotide primers used in this study are listed in Supplementary Table 2. All gene deletions were generated by P1vir-transducing the kanamycin cassette insertions from the Keio Collection25 into wild-type NCM3722 or another strain of interest and selecting for kanamycin resistance. Kanamycin insertions were then removed by transformation with the heat-inducible flippase-expressing plasmid, pCP20 (ref. 38). Transformed strains were incubated overnight and patched onto LB, LB + Kan, and LB + Carb to confirm loss of plasmid and gene insertion. Synthetic linear DNA fragments were used to carry out chromosomal integrations via the lambda-red recombineering method, as described previously39. Deletions and integrations were verified for their resulting genomic scar and modification, respectively, by PCR, followed by sequencing when deemed necessary. All plasmids and synthetic linear DNA fragments were constructed by incubating purified PCR products, flanked by at least 20 bp overlapping ends, together with 1× Gibson Assembly Master Mix (NEB Labs) at 50 °C for 1 h40. Isolated constructs were then verified by sequencing.

Bacterial growth conditions.

E. coli strain NCM3722 and its derivatives were grown at 37 °C in Luria Broth, Gutnick Minimal Media (0.4% glucose (w/v))41, or Gutnick Minimal Media supplemented with either 0.2 mM uracil (Sigma) or 0.5 mM arginine (Sigma). Antibiotics were added at the following concentrations (to liquid media/to agarose plates): carbenicillin (Omega Scientific Inc.) (50 μg/ml/100 μg/ml), kanamycin (Gibco by Life Technologies) (30 μg/ml/50 μg/ml), chloramphenicol (Fisher) (50 μg/ml/100 μg/ml).

Microscopy and image analysis.

Images were taken with a Nikon 90i upright microscope equipped with a Nikon Plan Apo 100X/1.4 phase-contrast objective. Images were collected with a Rolera XR cooled CCD camera and initially processed by NIS-Elements Advanced Research software. Images were further analyzed for inclusion-body content with either custom Matlab code or ImageJ. Samples were spotted onto 1% agarose (Invitrogen) pads, resting on glass slides, made with the appropriate medium. Coverslips were sealed with valap (1:1:1, lanolin:paraffin:petroleum jelly).

Growth assays.

Cell growth was assayed in flat-bottom Costar 96-well polystyrene plates using a Biotek Synergy HT. Wells were filled with 150 μl of the appropriate liquid media inoculated from overnight cultures back-diluted 1:145, and covered with 50 μl mineral oil (Sigma). Plates were incubated at 37 °C with continuous shaking and the optical density (OD) at 600 nm was read every 20 min.

Single-cell inclusion body lysis assay.

Overnight cultures of inclusion-body-containing strains were diluted 1:50 into Gutnick minimal media and grown at 37 °C for 2 h. Cells were then spotted onto 1% agarose pads containing minimal media, 0.1% Triton X-100 (Sigma), 10 mM EDTA (Sigma), and 1 mg/ml chicken egg lysozyme (Sigma).

Inclusion-body isolation.

Cell-free inclusion bodies were collected by following a modified isolation protocol detailed elsewhere42. Briefly, mid-log phase cells were pelleted and resuspended in ice-cold lysis buffer made from 50 mM Tris-HCl (pH 8.8) (Fisher), 100 mM NaCl (Sigma), 1.5 mM EDTA (Sigma) and distilled water. This suspension was flash frozen to −80 °C, thawed at room temperature, and combined with chicken egg lysozyme (to 1 mg/ml) and PMSF (to 200 μM) (Sigma). After incubating at 37 °C for 1 h, Triton X-100 (to 1 μl/ml), NP-40 (to 0.1 μl/ml) (as IGEPAL from Sigma), DNAse (to 0.3 μg/ml) (Sigma), and MgSO4 (to 0.15 mM) (Fisher) were added; this suspension was incubated for 1 h at 37 °C. Finally, the lysate was pelleted at 15,000g for 15 min and washed twice in lysis buffer before being resuspended in distilled water + 5% glycerol and stored in aliquots at −80 °C.

Inclusion-body proteomics.

Cell-free inclusion bodies were reduced in 1× NuPAGE Sample Buffer (Invitrogen), incubated at 70 °C for 10 min, then alkylated with iodoacetamide (100 mM) at room temperature for 30 min before being heated to 95 °C for 2 min. Soluble protein was resolved by one-dimensional gel electrophoresis (4–12% Bis-Tris NuPAGE gel) and digested in-gel with trypsin, as previously described43. Digested peptides were concentrated by vacuum centrifugation and desalted using StageTips44 constructed from Empore C18 extraction discs (3M Analytical Biotechnologies). Desalted peptides were analyzed by nanoliquid chromatography–tandem mass spectrometry using a Dionex Ultimate 3000 nRSLC coupled to an LTQ-Orbitrap XL mass spectrometer (ThermoFisher Scientific, San Jose, CA), as previously described9. MS/MS spectra were extracted by Proteome Discoverer and analyzed using SEQUEST by searching E. coli and contaminant protein databases. Probabilistic calculation of false-positive rates (<1% FDR) was performed by Scaffold/X! Tandem (Proteome Software) using the PeptideProphet and ProteinProphet algorithms45.

Metabolite measurement.

The metabolome of batch culture E. coli was quantitated by liquid chromatography–mass spectrometry. Briefly, saturated overnight cultures were diluted 1:50 and grown in liquid media in a shaking flask to OD600 of 0.3. A portion of the cells (3 ml) was filtered onto a 50-mm nylon membrane filter, which was immediately transferred into –20 °C extraction solvent (40:40:20 acetonitrile/methanol/water). Cell extracts were analyzed by reversed phase ion-pairing liquid chromatography (LC) coupled by electrospray ionization (ESI) (negative mode) to a high-resolution, high-accuracy mass spectrometer (Exactive; Thermo Fisher Scientific) operated in full scan mode at 1 s scan time, 10⁵ resolution, with compound identities verified by mass and retention time match to authenticated standards46. Quantitation of low abundance metabolites such as arginine and citrulline was also confirmed by carbobenzyloxy (CBZ) derivatization followed by LC-MS analysis. Briefly, 200 μl of cell extract was mixed with 5 μl of triethylamine (Sigma) and 1 μl benzylchloroformate (Sigma). Resulting samples were analyzed by reversed phase ion-pairing liquid chromatography (LC) coupled to a Thermo TSQ Quantum triple quadrupole mass spectrometer operating in multiple reaction monitoring mode with compound identities verified by mass spectrometry and retention time match to authenticated standards. Day-to-day variation of absolute spectral counts prevents useful biological replicate comparison; therefore all data presented were from the same sequence with blanks run between each set of four samples. Three technical replicates were taken to measure reproducibility of the extraction procedures and quantifications via the mass spectrometer. Sample replicates typically varied by 13% and maximally by 45%, as calculated by the coefficient of variation. Replicates were averaged before fold changes were calculated.
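The replicate statistics described above can be sketched as follows. This is a minimal illustration, assuming the coefficient of variation is the sample standard deviation divided by the mean (the text does not say whether the sample or population form was used) and that a fold change is the ratio of replicate means. The function names and the ion counts are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return statistics.stdev(values) / statistics.mean(values)


def fold_change(sample_reps, reference_reps):
    """Average the technical replicates first, then take the ratio of means."""
    return statistics.mean(sample_reps) / statistics.mean(reference_reps)


# Hypothetical ion counts for one metabolite, three technical replicates each.
wild_type = [100.0, 110.0, 90.0]
fusion = [20.0, 25.0, 15.0]

cv_wild_type = coefficient_of_variation(wild_type)
fc = fold_change(fusion, wild_type)

print(f"CV (wild type): {cv_wild_type:.0%}")
print(f"fold change (fusion / wild type): {fc:.2f}")
```

Averaging before taking the ratio, as the text specifies, avoids having to pair individual replicates across strains.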

Colorimetric assay of aspartate transcarbamylase activity.

Standard aspartate transcarbamylase assay conditions have been detailed previously47. All reactions were carried out at pH 7 at 37 °C for 1 h in 1 ml reaction volumes. One activity unit, defined as 10 μl of isolated inclusion body material, was assayed per reaction. Briefly, Tris-HCl (100 mM pH 7) (Fisher), L-aspartate (100 mM pH 7) (Sigma), ATP (2 mM) (Sigma) and lithium carbamoyl phosphate (10 mM, prepared fresh) (Sigma) were added to distilled water and equilibrated to 37 °C. To begin the reaction, 100 μl of water containing 1 enzyme activity unit was added to the reaction volume. The reaction was halted by the addition of 2 ml of 5% (w/v) trichloroacetic acid solution (Sigma). Color development was carried out as detailed by Prescott and Jones48. Developed samples were assayed for the production of carbamoyl-aspartate by measuring the absorbance at 466 nm.

Arginine pseudoauxotrophy results from coclustering of CarB and PyrB.

To establish that the arginine pseudoauxotrophy was not simply a consequence of PyrB overexpression, but rather required CarB and PyrB being coclustered, we constructed a synthetic operon where CarB and PyrB were similarly co-overexpressed as two separate proteins. Without fusing CarB and PyrB, their co-overexpression resulted in wild-type growth with no detectable auxotrophy at maximal induction (Supplementary Fig. 8a), in agreement with the model prediction that no significant flux shunting occurs if CarB and PyrB are moderately overexpressed and delocalized (Fig. 4g).

To establish that the arginine pseudoauxotrophy did not result from a specific effect caused by the linker region, we sampled serine-glycine repeat linkers from 3 to 25 amino acids in length as well as the TEV protease linker region and found that the phenotype did not depend on the composition of the chain used to tether CarB to PyrB.

To establish that the arginine pseudoauxotrophy required functional PyrB, we constructed a CarB fusion to an enzymatically dead form of PyrB (CarB-PyrB(R54A)) and found that it did not display arginine pseudoauxotrophy (Supplementary Fig. 8b). The arginine pseudoauxotrophy thus requires not only colocalization of CarB with PyrB, but also functional PyrB, and cannot be attributed to a dominant-negative effect of the CarB-PyrB fusion on the activity of ArgI.

One final concern might be that the CarB-PyrB fusion somehow “hyperactivates” PyrB, reducing flux through ArgI by processing an increased fraction of the cellular pool of carbamoyl phosphate. To eliminate this possibility, we introduced the CarB-PyrB fusion into a strain expressing functional CarB-msfGFP from the native carB locus. CarB-msfGFP did not become incorporated into the phase-bright foci (Supplementary Fig. 9a), indicating that it synthesized a delocalized, cellular pool of carbamoyl phosphate. This unclustered CarB-msfGFP eliminated the arginine auxotrophy of CarB-PyrB overexpression (Supplementary Fig. 9b). Thus, auxotrophy must involve a local flux shunting within a CarB-PyrB cluster, rather than an overall effect of the fusion on cellular metabolite pools. Our modeling results taken together with these experiments led us to hypothesize that the observed pseudoauxotrophy for arginine results from flux shunting to PyrB away from ArgI due to coclustering of CarB and PyrB.

To support the coclustering-mediated shunting hypothesis, we performed a metabolomic analysis of the ΔcarB ΔpyrB strain expressing the unrepressed CarB-PyrB fusion protein. Inspection of the relative metabolite concentrations in the unrepressed fusion-containing cells revealed large alterations of the metabolite pools only within the metabolic network of the carbamoyl phosphate branch point, suggesting that the pseudoauxotrophy resulted from local modifications to the network (Supplementary Fig. 10). In this regard, it is important to point out that although CTP is an allosteric inhibitor of PyrB49, we find that the CTP pool size in the fusion strain does not change substantially relative to wild type (Supplementary Fig. 10). This observation is in agreement with the findings that CTP levels are tightly regulated by multiple means, including a recently described directed overflow mechanism50.

Our hypothesis of flux shunting via coclustering requires that CarB-PyrB form large enzyme clusters. Indeed, we observed a correlation between the onset of metabolic shunting and the accumulation of phase-bright structures in cells expressing CarB-PyrB (Fig. 5). To establish that the agglomerates observed above a certain critical concentration are composed of CarB-PyrB we used two approaches. First, we made a three-part CarB-PyrB-msfGFP fusion and overexpressed it as was done for CarB-PyrB. We again observed the concentration-dependent development of phase-bright structures (Supplementary Fig. 11a). These structures also displayed extremely high levels of msfGFP fluorescence (Supplementary Fig. 11a), indicating that the compound fusion protein was present in these structures. As a second assay, we used real-time imaging to determine that the phase-bright structures remained intact upon cell lysis (Supplementary Video 1), which enabled us to purify these structures from cell lysates by ultracentrifugation. Analysis of the protein content of these structures by mass spectrometry revealed that they were overwhelmingly composed of the CarB-PyrB fusion (Supplementary Fig. 11b).

Our hypothesis that the CarB-PyrB agglomerates are responsible for metabolic shunting also depends on the agglomerates being enzymatically active. Although insoluble protein clusters have generally been assumed to be dominated by misfolded protein51, there are several reports of enzymatically active clusters52. Consistent with these reports, we note that our cell-free msfGFP-containing inclusion bodies are highly fluorescent (containing 91 ± 7% of the total cellular fluorescence). Because msfGFP fluorescence requires its proper folding, the CarB-PyrB structures must contain a substantial population of properly folded protein (Supplementary Fig. 11a). Furthermore, we subjected purified inclusion bodies composed of the CarB-PyrB fusion to a well-established in vitro aspartate carbamoyltransferase assay48, and we found that they readily produced carbamoyl aspartate when supplied with carbamoyl phosphate and aspartate (Supplementary Fig. 11c). This result confirmed that our purified inclusion bodies indeed contained active enzyme and supports our clustering-mediated metabolic-channeling model.

Although we were unable to achieve metabolic shunting of carbamoyl phosphate toward the arginine side of the branch point using our simple translational fusion method, we note that the lack of such a phenotype could be caused by partial inactivation of the fused form of ornithine transcarbamylase or a relative decrease in the density of active sites within a cluster. This negative result does not alter the fact that our experiments on the pyrimidine side of the carbamoyl phosphate branch in E. coli qualitatively confirm a central prediction of our model.

Model parameters.

Except where otherwise noted, the parameters in the reaction-diffusion equations are chosen as follows. The diffusion coefficient is taken to be D = 10² μm²/s (ref. 53) for all metabolites. The rate α₀ = 10⁻¹/s at which c₀(r,t) relaxes to c*₀ is estimated from the typical timescale of substrate turnover in bacteria. The enzyme kcat/KM values k₁ = k₂ = 10⁶ liter/s/mol (ref. 54) are chosen to be within the measured range for metabolic enzymes. We assume the single-enzyme size to be typical of a globular protein55, that is, radius ≈ 2 nm. Taking into account that active enzymes are solvated by water with a roughly 50% volume fraction, we obtain the maximum enzyme density. The intermediate decay rate was chosen to be β = 10/s (see Supplementary Fig. 14 for details regarding the optimization for different values of β). The total catalytic activity corresponds to 1,000 enzymes per μm³ for our choice of k₁, k₂. Finally, the homeostatic value c*₀ of the concentration of the substrate S₀ is arbitrary because rescaling c*₀ by a constant factor does not change the optimal enzyme densities, the basin radius or the efficiency.
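The density figures in this paragraph follow from simple arithmetic on the stated assumptions, which can be sketched as below. This is an illustrative calculation only: the sphere-packing estimate, the variable names, and the diffusion length sqrt(D/β) are additions made here for orientation, and the resulting numbers are derived estimates that may differ from the values the authors report.

```python
import math

AVOGADRO = 6.02214076e23   # 1/mol
LITERS_PER_UM3 = 1e-15     # 1 cubic micrometer = 1e-15 liter


def per_um3_to_molar(n_per_um3):
    """Convert a number density (molecules per cubic micrometer) to mol/liter."""
    return n_per_um3 / (AVOGADRO * LITERS_PER_UM3)


# Values stated in the text.
radius_nm = 2.0            # single-enzyme radius
protein_fraction = 0.5     # remaining ~50% of the volume is water
D = 1e2                    # diffusion coefficient, um^2/s
beta = 10.0                # intermediate decay rate, 1/s
k = 1e6                    # kcat/KM, liter/s/mol

# Assumption: treat each enzyme as a sphere and fill half the volume with them.
enzyme_volume_um3 = (4.0 / 3.0) * math.pi * (radius_nm * 1e-3) ** 3
max_density_per_um3 = protein_fraction / enzyme_volume_um3
max_density_molar = per_um3_to_molar(max_density_per_um3)

# Mean density quoted in the text: 1,000 enzymes per cubic micrometer.
mean_density_molar = per_um3_to_molar(1000.0)
mean_rate_per_s = k * mean_density_molar  # pseudo-first-order processing rate

# Typical distance an intermediate diffuses before it decays.
diffusion_length_um = math.sqrt(D / beta)

print(f"max density:  {max_density_per_um3:.2e} per um^3 "
      f"({max_density_molar * 1e3:.1f} mM)")
print(f"mean density: {mean_density_molar * 1e6:.2f} uM, "
      f"rate {mean_rate_per_s:.2f} 1/s")
print(f"diffusion length before decay: {diffusion_length_um:.2f} um")
```

Under these assumptions the packing limit comes out at roughly 1.5 × 10⁷ enzymes per μm³, about four orders of magnitude above the mean density of 1,000 enzymes per μm³, which gives a sense of how strongly enzymes can be concentrated within a cluster.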