Ranking network mechanisms by how they fit diverse experiments and deciding on E. coli's ammonium transport and assimilation network.

The complex ammonium transport and assimilation network of E. coli involves the ammonium transporter AmtB, the regulatory proteins GlnK and GlnB, and the central N-assimilating enzymes together with their highly complex interactions. The engineering and modelling of such a complex network seem impossible because functioning depends critically on a gamut of data known at patchy accuracy. We developed a way out of this predicament, which employs: (i) a constrained optimization-based technology for the simultaneous fitting of models to heterogeneous experimental data sets gathered through diverse experimental set-ups, (ii) a 'rubber band method' to deal with different degrees of uncertainty, both in experimentally determined or estimated parameter values and in measured transient or steady-state variables (training data sets), (iii) integration of human expertise to decide on accuracies of both parameters and variables, (iv) massive computation employing a fast algorithm and a supercomputer, (v) an objective way of quantifying the plausibility of models, which makes it possible to decide which model is the best and how much better that model is than the others. We applied the new technology to the ammonium transport and assimilation network, integrating recent and older data of various accuracies, from different expert laboratories. The kinetic model objectively ranked best, has E. coli's AmtB as an active transporter of ammonia to be assimilated with GlnK minimizing the futile cycling that is an inevitable consequence of intracellular ammonium accumulation. It is 130 times better than a model with facilitated passive transport of ammonia.


Contents
UNDX/MGG firstly tries to decrease the constraint violation to zero. Therefore, the objective function sometimes increases in return for decreases in the constraint violation. IS-SR-REX star /JGG found solutions (parameter sets with  = 0) within 30 min. However, UNDX/MGG was not able to find solutions within 12 hours.

Figure S3 Simulation of NHx diffusion and consumption in Yuan's experiments
We simulated NHx (i.e. NH4 + + NH3) diffusion and consumption in Yuan's experiments (triangles in Figure 2A of [1]). (a) Geometry of plates. The plate consists of agarose, filter, and E. coli cells. We assumed that Yuan used 20 ml of agarose per 10-cm plate [2][3][4], resulting in an agarose layer of 2.5 mm thick. We assumed the filter was 0. NHx and grow. We assumed homogenous distribution of the cells. We assumed that the diffusion coefficient for NHx in agarose is equal to the diffusion coefficient in water: 1.12 x 10 -7 m 2 /min [5]. Based on the triangles of Figure 1b of [2], we estimated the diffusion coefficient for NHx in filter: 2.23 x 10 -8 m 2 /min. The specific Since E. coli can grow at the maximum specific growth rate at 4 M external NH4 + [6], K must be less than 4 M. Thus, we used max = 0.72 h -1 and K = 1 M. A single cell consumes NHx from the top of the filter at a rate of vsingle =  N0 Vcell, where N0 = 3 mol-N/L-cyt and Vcell = 2.15 x 10 -18 m 3 [7]. The total NHx consumption rate is vtotal = vsingle X, where X is the number of cells. The number of cells follows the ordinary differential equation: dX/dt =  X with the initial condition X(t) = X0. We assumed that Yuan used 5 ml of culture with ~0.085 OD650 to inoculate cells on the filter [1]. Assuming 1.11 x 10 9 cells/(ml OD600) [8] and 1.2 OD600/OD650, the initial number of cells is 5.66 x 10 8 . In our simulation (b-e), the specific growth rate was predicted to be 0.72 h -1 for 0 -180 min and 0.16 h -1 for 180 -300 min. These specific growth rates are close to Yuan's observation (0.72 h -1 and 0.19 h -1 , respectively , see triangles in Figure 2A of [1]), validating our simulation.  Table   S10.

Figure S5 Modeling workflow
First, we develop the ordinary differential equation (ODE) model for the E. coli ammonium transport and assimilation network. Second, we formulate the parameter estimation problem as a constrained optimization problem. The objective function (f) quantifies the deviation of parameters from their reference values, and constraint functions (g) quantify the difference between model behavior and training experimental data. The lower bound (p L ) and the upper bound (p U ) of the parameter search space are determined based on the reference values. We obtain reference parameter values and training experimental data from literature. Next, we run the genetic algorithm (GA) to solve the constrained optimization problem. Finally, we obtain the ODE model with the estimated parameter values. The modeling workflow is the same for the active, the passive, and the refined active transporter models. We perform 5 independent GA runs for each model. Thus, we obtain 5 parameter sets for each model. 8

Figure S6 Constrained optimization-based parameter estimation
To illustrate how the constrained optimization-based parameter estimation works, we consider the following simple problem here: where p1 * = 1, p2 * = 1, 1, = 1, 2 = 1. The objective function f(p1,p2) is the two-variable version of Eq. (S4.1-1) and quantifies deviations of p1 and p2 from p1 * and p2 * . The simple constraints g1(p1,p2) ≤ 0 and g2(p1,p2) ≤ 0 enable us to illustrate the feasible region (the subspace in which all the constraints are satisfied). The aim is to find the pair of p1 and p2 which minimizes f(p1,p2) and satisfies g1(p1,p2) ≤ 0, g2(p1,p2) ≤ 0, 10 -2 < p1 < 10 2 , and 10 -2 < p2 < 10 2 . The constraint violation (p1,p2) is provided as (p1,p2) = [max(0,g1(p1,p2))] 2 + [max(0,g2(p1,p2))] 2 . Only if g1 ≤ 0 and g2 ≤ 0,  takes the minimum:  = 0. In (a), the  value is visualized by a change of color: it increases as the color changes from blue to yellow. The shaded area shows the feasible region in which the  value takes the minimum:  = 0. In the modeling context, the feasible region is a parameter space in which the model fits training experimental data (with certain allowable errors). In (b)-(e), the f value is visualized by a change of color: it increases as the color changes from blue to yellow. In (b), the x symbol shows the position at which f takes the minimum: f = 0. At the position indicated by the star symbol in (b), the 9 f value is minimized with  = 0. Thus, the aim of Eqs. (Figure S6-1) is to find p1 and p2 at the star symbol. It should be noted that p1 and p2 cannot be uniquely estimated based only on the constraints: any combinations of p1 and p2 in the feasible region (the shaded area) can satisfy  = 0. However, if the constraints are combined with the minimization of f, p1 and p2 can be uniquely estimated. In (c)-(e), the parameter estimation procedure by genetic algorithms (GAs) is illustrated. In (c)-(e), each white circle shows an "individual" (a pair of p1 and p2). The GA initiates the search by generating random individuals within the upper and lower bounds (c).
Individuals which provide smaller f and  values are considered better than those who provide larger f and  values. The GA gives a slightly higher priority to reducing  (which is done by stochastic ranking [9] Tables   Tables S1 -S4   Tables S1 -S4 are shown in a separate XLSX file.

Table S9 Training data for the parameter estimation
In the parameter estimation, we used experimental data from [1,2,6,10]   Ratios of the glutamate production flux via GOGAT to total glutamate production flux [2] g15 -g16 Specific growth rates for glucose as the C-source (Open and solid triangles of Figure 3A of [6]) g17 -g21 N-assimilation rate for glucose as the C-source [6] g22 Uridylylation states of GlnKs of WT ( Figure 3A of [10]) g23 -g26 GlnK-AmtB complex of WT ( Figure 3B of [10]) g27 Ratios of glutamate and glutamine consumption rates [1] g28 -g33 Internal NH4 + concentrations for glucose as the C-source (Open and solid triangles of Figure 3C of [6]) AmtB-mediated ammonium transport rates for glucose as the C-source [6] g56 -g58

Supplementary Notes 1 Model Description
The schematic diagram of the E. coli ammonium transport and assimilation network is shown in Figure 1, where CADLIVE notation [11][12][13] is used for simplicity. The mathematical model is described in Tables S1-S4. The equations relevant to GDH, GS, GOGAT, GlnB, UTase, and ATase are based on the Bruggeman model [14]. The rate equations for glutamate and glutamine consumption, and the cell growth function are taken from the Yuan model [1]. The models in this paper newly include unmediated diffusion of NH3, AmtB-mediated ammonium transport through the cytoplasmic membrane, and regulation of AmtB by GlnK. For Yuan's experiments, we modelled an NHx diffusion barrier between the medium and the cell surface so as to explain an apparent contradiction between two recent experimental data [1,6] (see Section 5; In this paper, we use the term NHx when we do not wish to discriminate between NH4 + and NH3). In order to simulate multiple experiments conducted by different research groups, the values of some parameters have to be modified since parameters such as enzyme expression levels and external pH differ among experiments (see Section 2).

Regulation of GS
The activity of GS is feedback-controlled based on intracellular NH4 + availability: As the intracellular NH4 + increases, 2-oxoglutarate decreases and glutamine increases, leading to (partial) inactivation of GS. The enzyme level of GS is controlled by gene expression and GS activity by covalent modification. The models in this paper include only the covalent modification. The concentration of total GS is constant for Yuan's and Radchenko's experiments. In Kim's experiments, it is modified based on measured promoter activity.
The covalent modification of GS is controlled via ATase, UTase and GlnB. ATase is a bifunctional, ambiguous enzyme, which inactivates GS through a progressive adenylylation of its 12 subunits, and is also able to deadenylylate the adenylylated form of GS (GSAMP) [15,16]. The adenylylation reaction is stimulated by the nitrogen signaling protein GlnB and glutamine [17,18]. The deadenylylation reaction is stimulated by GlnBUMP and 2-oxoglutarate. The uridylylation and deuridylylation of GlnB is catalyzed by UTase, which is also an ambiguous enzyme [19]. Uridylylation and deuridylylation activities of UTase are respectively inhibited and stimulated by glutamine. Taken together, a decrease (increase) in nitrogen availability causes activation (inactivation) of GS.

Regulation of AmtB
The activity of AmtB is also feedback-controlled based on intracellular ammonium availability. The level of AmtB is controlled by gene expression, and AmtB activity is regulated via binding of the protein GlnK. The models in this paper include only the regulation via binding of GlnK. The concentration of total AmtB is taken to be constant for Yuan's and Radchenko's experiments. In Kim's experiments, it is modified based on measured promoter activity.
GlnK, a paralogue of GlnB, is the main regulator of AmtB-mediated ammonium transport. GlnK binds to AmtB and thereby blocks ammonium transport [10,20,21]. At low NH4 + concentration, GlnK is uridylylated by UTase in the same manner as GlnB and then loses its ability to bind to AmtB. An increase in NH4 + concentration leads to deuridylylation of GlnK, enabling GlnK to block the ammonium transport. It has been reported that the complex formation between AmtB and GlnK is affected by metabolites, such as 2oxoglutarate, ATP, and ADP. 2-oxoglutarate is an indicator of the cellular nitrogen status. Recently, Radchenko et al. revealed the function of ATP/ADP binding [22]: GlnK has an ATPase activity that is inhibited by 2oxoglutarate. Hydrolysis of ATP to ADP enables GlnK to change its confirmation to bind to AmtB. Taken together, a decrease (increase) in nitrogen availability causes activation (inactivation) of the AmtB-mediated ammonium transport.
For the passive transporter model, we assume AmtB facilitates the passive transport of NH3 [23,24] (In this paper, we use "facilitated" passive transport to distinguish it clearly from unmediated passive NH3 diffusion through the cytoplasmic membrane

Regulation of AmtB
In this paper, "GlnK" depicts the GlnK trimer which harbours three uridylylation sites. We modelled the (de)uridylylation rate equations as Bruggeman et al. did for GlnB [14] (with correction). The rate equation of the first uridylylation reaction is given by: where j = 2, 3. The rate equation of the deuridylylation reaction is given by: where j = 1, 2, 3.
We do not explicitly include ATP hydrolysis by and subsequent conformational change of GlnK in the models because the kinetic details remain unclear. Instead, we model only binding between GlnK and 2-oxoglutarate, implicitly assuming that ATP/ADP binding to GlnK, ATP hydrolysis, and the conformation change are very fast. Let GlnKUMP0OG0 be completely deuridylylated and 2-oxoglutarate-unbound GlnK. Since a GlnK trimer harbors three 2-oxoglutarate-binding sites, the concentration of GlnKUMP0OG0 is given by: ,················· (S1.  where [GlnKUMP0] represents the sum of all deuridylylated forms of GlnK (GlnKUMP0OG0, GlnKUMP0OG1, GlnKUMP0OG2, and GlnKUMP0OG3; see also Section 1.6). GlnKUMP0OG0 is the only GlnK form that can bind to AmtB.

Ammonium/ammonia (NHx) transfer
The net NHx transfer rate vnet (mM/min) is given by: where vamtb is the rate of the AmtB-mediated ammonium transport given by Eq. (S1.3-1), and vdiff is the rate of unmediated NH3 diffusion across the cytoplasmic membrane. Based on Fick's law of diffusion, vdiff is given by: , ······························································· (S1.  where Pcm is the permeability coefficient of the cytoplasmic membrane (m/min), Acell is the surface area of a cell (m 2 ), Vcell is the volume of a cell (m 3 ). Based on the experimental data from Kim [6], we calculated the NH3 permeability coefficient of the cytoplasmic membrane (Pcm) and obtained Pcm = 0.077 m/min. With this value of Pcm, Yuan's experimental data can be explained only on the assumption that the NHx concentration is lower at the cell surface than in the 20 bulk medium (see Section 5). This observation convinced us that the models require an additional diffusion barrier adjacent to the cell surface.
Generally, if stirring is slow or absent or if solution viscosity is high, the movement of dissolved molecules is limited [26], resulting in a decrease in the concentration at the cell surface compared to that in the bulk medium if cells consume the solutes faster than can be replenished from the medium. This probably applies to the cells in Yuan's experiments whenever they were growing on filters on top of solid agarose medium under N-limiting condition: NHx has to move from the solid agarose medium to the backside of the filter and from there through the filter to the cells growing on top. Thus, the NHx transfer from the medium to the intracellular space is determined not only by the permeability of the cytoplasmic membrane and by AmtB-mediated transport, but also by permeability of the diffusion barrier. Let vdb be the NHx flux through the diffusion barrier. Assuming that the permeability coefficients for NH3 and NH4 + are equal and that Fick's law of diffusion is applicable, we obtain: where kdb is the mass transfer capacity coefficient for NHx diffusion through the diffusion barrier (min -1 ).
[NHx,ext] and [NHx,surf] are the NHx concentrations (mM) in the bulk medium and at the cell surface, respectively.
See also Section 6.

Details on GlnK-, GlnB-, AmtB-and GS-related variables
Since GlnK, GlnB, AmtB, and GS form complexes and/or are covalently modified, there are many possible molecular species related to these proteins. In this section, we explain how the concentrations of the "child" variables add up to the total concentrations.
[GlnKtotal] denotes the total concentration of GlnK and consists of four child variables.

Simulation Settings
In this section, we explain in detail how we simulated Yuan's Some parameters needed to be adjusted because they depend on the specific experimental culture conditions. 22 We assumed that the expression levels of GlnK, AmtB, GS, GDH, and GOGAT are different between Yuan's, The extracellular NH4 + and the intracellular 2-oxoglutarate, ATP, and ADP are sometimes used as dynamic model inputs, i.e. their concentrations are changed within an experiment.

Yuan's experiments
E. coli K12 strain NCM3722 was used as wild type, all mutants strain were all isogenic with NCM3722 [1].
Cells grown under ammonium-limited condition before ammonium-upshift are assumed to have expressed GS, AmtB, and GlnK and that the expression levels have remained unchanged after N-upshift because of the short simulation duration (up to 30 min after N-upshift). In Yuan's experiments, we assumed that NHx concentration at the cell surface can be different from that in the bulk medium as described in Section 1.5. As AT P, ADP, NADPH, and NADP + concentrations did not change upon the ammonium-upshift ( Fig. 3 and Supp. Fig. 4 of [1]), these cofactors are modeled as constants. For ATP, ADP and NADPH we used the concentrations as measured for unrestricted growth on filters with glucose and ammonium [28]; however for NADP, we used a higher value to obtain an NADPH/NADP ratio that lies within the commonly observed range (0.8-3.0) [29][30][31][32].
Parameter modifications needed to simulate various mutants are shown in Table S5. In order to simulate mutant strains, we changed Vmax or protein concentrations of corresponding enzymes. Note that GOGAT requires a change not only in Vgog but also in Vgdh and [GStotal] because the enzyme expression level of GDH is tripled in this mutant strain and that of GS is reduced by half compared to wild type [1]. Qualitatively comparable changes were observed by Kumar and Shimizu in gltB and gltD mutants of strain BM25113 [33].
It was shown that a Salmonella typhimurium GOGAT mutant (gltB) did not express AmtB [34]. Thus, we set [GlnKtotal] = 0 and [AmtBtotal] = 0. It should be noted though that Yuan's GOGAT is gltD, and we assume gltB and gltD behave similarly. Since the reported specific growth rates vary for different strains (Supp.  (Table S4).

Extracellular NHx concentration
The extracellular NHx concentrations used for simulations are summarized in Table S6. We explain below how we obtained these concentrations.
The extracellular NHx concentrations of wild type and GOGAT in 13x N-upshift experiments are given in  of [1]. We utilized these data with linear interpolation. Likewise, to simulate the G6P + gluconate culture, we multiplied 0 by 0.89 (= 0.8 / 0.9).

AmtB, GlnK, and GS concentrations
The amtB and glnA promoter activities with respect to the extracellular NH4 + concentration were reported in Supp. Table 6-11 of [6], where the activities were presented as fluorescence intensities of GFP and mCherry, respectively. In order to develop continuous functions that yield promoter activities with respect to the extracellular NH4 + concentration, we took an approach similar to that shown in [6]. We fitted the following 24 Hill function to the amtB promoter activity data: where PAmtB,std is the promoter activity at a relatively high extracellular NH4 + level (e.g. the value at 10 mM extracellular NH4 + , which is the highest concentration provided in Supp. Table 6-11 of [6]. If the promoter activity at 10 mM is zero, the activity at 1 mM is used). HAmtB, KAmtB, and fAmtB are Hill coefficient, halfinhibition concentration, maximal fold change, respectively. These parameters differ for strains and carbon sources. The fitted parameters are provided in Table S8.
Using the above promoter activity function, we obtain the AmtB concentration by assuming that the AmtB concentration is proportional to the promoter activity and that the promoter activity of wild type cells grown [GlnKtotal] * is estimated in our study (Table S4). Similarly to AmtB, we fitted the following Hill function to the GS promoter activity data of [6]: We assumed that the promoter activity of wild type cells grown on medium containing 4 M of NH4 + as  (Table S4).

Extracellular NH4 + concentration
The extracellular NH4 + is given as indicated on the abscissas of Figure 4 and 6.

2-Oxoglutarate concentration
The 2-oxoglutarate concentration was not reported in [6]. Therefore, we had to develop a function of the 2oxoglutarate concentration with respect to the NH4 + concentration. Kim predicted a change in 2-oxoglutarate level with respect to the extracellular NH4 + ( Figure 5C of [6]

Radchenko's experiments
E. coli strain GT1000 (glnKamtB) harbouring plasmid pAD2 (glnK His6amtB) or pADY51A (glnK_Y51A His6amtB) were used as the wild type and the GlnK Y51A mutant, respectively; the expression of WT GlnK or Y51A GlnK and his-tagged AmtB were under the control of the native promoter of the glnKamtB operon [10]. Cells grown under N-limited condition before the ammonium-upshift are assumed to have expressed GS, AmtB, and GlnK and that the expression levels have remained unchanged after N-upshift during the short duration of the experiment (20 min) after the shift. NH4 + concentration at the cell surface was taken to be equal to that in the bulk medium. The cells are expected to be severely N-limited before the N-upshift. Therefore, we assumed that the cells did not grow in Radchenko's experiments after the N-upshift ( = 0, vgludemn = 0, vgludemf = 0, vglndemn = 0, and vglndemf = 0, but were still able to metabolise ammonium. In order to simulate GlnK Y51A mutant whose GlnK cannot be uridylylated, we set kcat of the uridylyl transfer reaction of UTase to zero 26 (kcatutglnk = 0).

2-Oxoglutarate, ATP, and ADP concentrations
Radchenko did not measure 2-oxoglutarate, ATP, and ADP in [10]. However, they conducted a similar experiment where they did measure these metabolites (Figure 2A and Figure 3A of [36] We assume that the probability density of qi follows a normal distribution with the mean of zero. The probability density function (PDF) for pi is: where i is the standard deviation for qi. Here, we define parameter plausibility (PP) for the estimated value of the ith parameter: PPi indicates how realistic an estimated parameter pi is. If pi = pi * , then ln(pi/pi * ) = 0 and PPi = 1. As pi deviates from pi * , PPi decreases to zero. Next, we define model plausibility MP as the product of parameter plausibilities:

Parameter Estimation
As described in Methods in the main text, we formulated our variable fitting and parameter estimation problem as a constrained optimization problem, and employed a genetic algorithm (GA) to solve it with the aid of a supercomputer. First, we elaborate on the objective function and the constraint functions used to fit experimental data. Next, we provide a brief description of the GA we employed.

Objective function
We define the objective function f as the natural logarithm of the inverse of model plausibility: where j (j = I, II, III) is the class-related penalty weight for a parameter change (I > II > III ≥ 0). We use I . That is, PDFi for class I parameters gives probability of 68% for 1/2 < pi/pi * < 2, and that for class II gives 68% probability for 1/5 pi/pi * < 5 [see Eq. (S3-2)]. We use III = 0, that is, we allow class III parameters to change freely because class III parameters have not been measured, and it is difficult to guess their values. In parameter estimation, GA tries to minimize f, i.e. maximize MP. The concept of our parameter estimation is analogous to maximum likelihood estimation. Actually, model plausibility is the ratio of the likelihood for p to that for p * . Therefore, maximization of MP is equal to maximization of the likelihood.
Two of the authors of this paper used an objective function similar to Eq. (S4.1-3) in the previous work [37].
The difference is that the objective function in the present work employs "squared" and "natural logarithm" [Eq. (S4.1-3)] while "absolute" and "common logarithm" were used in the previous work. The difference is subtle but important. Eq. (S4.1-3) is derived based on model plausibility while the objective function in the previous work was just an empirical formula, i.e. without theoretical background.

Constraint functions
In a GA, we sequentially simulated Yuan's, Kim where x sim , x exp , and  are the simulated variable, measured variable, and allowable error, respectively. Type 3 constraint function imposes a penalty if the simulated value (x sim ) is smaller than a certain lower bound (x lb ). 2 . 4)" of msb200960-s4.xls of [1]. In contrast to Fig. 4  According to Fig. 2A of [1], the wild type grows at  = 0.19 h -1 before the N-upshift. Thus, the simulated value of the specific growth rate ( Yuan,WT,N-limit ) should be close to that value. If the deviation of  Yuan,WT,N-limit from 0.19 h -1 (= 0.0032 min -1 ) is less than or equal to 10%, then no penalty is imposed. Otherwise, the squared relative deviation is used as the penalty, i.e. gfun2 with II. After the N-upshift, the wild type grows at  = 0.36 h -1 (= 0.0060 min -1 ) and the constraint is similarly given by: According to the right of Supp. Fig. 2  Most of glutamate production is achieved by GOGAT, both before and after N-upshift [2,38]. According to [2], it is more than 85%. We use gfun3 for g15 and g16:

Constraint functions for Radchenko's experiments
We semi-quantified blackness of the protein bands visible in Fig. 3A of [10] using ImageJ [39] and calculated the relative abundance of GlnKUMP0-3 in the cytoplasm (the fractions with an asterisk below). The constraints below are a variant of gfun1 (the total concentration is used in the denominators).

Other constraint functions
The two central amino acids glutamate and glutamine are used for anabolic purposes; i) both molecules are constituents of proteins; ii) glutamate (glutamine) also serves as amino group (amide) donor for generation of all other amino acids; iii) both molecules serve as N donors for the N-atoms present in the RNA and DNA nucleotides; iv) glutamate provides the C skeletons of arginine and proline [40,41]. We allocated processes iiv to four demand functions and pose constraints to the respective flux ratios as imposed by the cellular composition. The sum of the demand fluxes is determined by the specific growth rate (See Section 8.5). In addition, the model should reach steady states in case of Kim'

s experiments and in case of Yuan's and
Radchenko's experiments before the N-perturbation is applied. Nine auxiliary constraint functions (g43 up to g51) are used to detect the failure of the model to reach steady state, if so, g43-51 returns 10 10 in each case. We also used the steady state constraint (g52) for wild type of Yuan's experiment after the 13x N-upshift in order to make sure the models reach a steady state with reasonable glutamate and glutamine levels.

Additional constraint functions for refining the active transporter model
The following constraints (g53-58) are based on Kim's calculation of the intracellular NH4 + concentrations and the rate of the AmtB-mediated ammonium transport [6]. In their calculation, Kim assumed that AmtB is an active transporter. For this reason, we did not use these constraints for the first parameter estimation in which the active and passive transporter models are compared (the section "The active transporter model is 130 times more likely than the passive transporter model" in the main text). We used these constraints only for refining the active transporter model (the section "Refining the active transporter model" in the main text).
According to Figure 3C of [6], the internal NH4 + is 35 M at 60 M external NH4 + . The constraint is given as: The AmtB-mediated ammonium influx at 60 µM external NH4 + is nearly zero. Thus, we impose a penalty if the AmtB-mediated ammonium influx is more than 10 % of the ammonium assimilation flux (40 mM/min). According to [6], the rate of AmtB-mediated ammonium transport rate for glucose at 20 M external NH4 + is 243 mM/min (this is a corrected value based on our Acell, Vcell, and pKa).

IS-SR-REX star /JGG
Parameter estimation problems are highly nonlinear and have multiple local optima [42,43]. For such problems, deterministic optimization methods hardly find solutions that provide sufficient fitting to training data.
Population-based stochastic optimization algorithms such as genetic algorithms, evolution strategies, and scatter searches are more promising alternative approaches [43][44][45][46][47]. We developed the novel real-coded genetic algorithm (GA), named IS-SR-REX star /JGG (Iterative Start and Stochastic Ranking-REX star /JGG), in order to estimate kinetic parameters in E. coli ammonium transport and assimilation models. The working of this GA can be briefly explained as follows: IS-SR-REX star /JGG employs REX star (Real-coded Ensemble Crossover star) and JGG (Just Generation Gap) as a crossover method and a generation alternation method, respectively [48,49]. REX star /JGG find solutions much faster than UNDX (Unimodal Normal Distribution Crossover)/MGG (Minimal Generation Gap) which two of the authors of this paper employed in [37]. However, REX star /JGG has a drawback: the search sometimes becomes trapped in local optima. To compensate for this weakness, we incorporate an iterative start strategy into REX star /JGG: If REX star /JGG does not find any solutions (i.e. parameter sets that provide  = 0) by a predefined generation (say at generation 2,000), it discards the current search and restarts the search using a newly generated random initial population. In addition, if REX star /JGG does not improve f by 1 % for 2,000 generations, it restarts the search. Also, REX star /JGG (and most of GAs) has originally been designed for unconstrained optimization problem. To effectively handle constrained optimization problems, we employed the stochastic ranking method [9,50]. In this study, the size of the initial population, the number of parents, and the number of children are 500, 300, and 500, respectively.
As shown in Figure S2, IS-SR-REX star /JGG is superior to UNDX/MGG.
Population-based stochastic optimization algorithms are computationally demanding. IS-SR-REX star /JGG is no exception in this regard. To reduce the computational time, we used the development version of libRCGA [51] in which REX star /JGG and UNDX/MGG are implemented in C language and paralleled by MPI. We employed CVODE [52] for numerical integration of ordinary differential equations (ODEs). We executed the IS-SR-REX star /JGG on the supercomputer Shirokane3, provided by the Human Genome Center, Institute of Medical Science, the University of Tokyo. A single run for the parameter estimation took 12 hours using 21 cores of Intel Xeon E5-2670 v3 (2.3 GHz).

Apparent Discrepancy between Kim's and Yuan's Experimental Data
In this section, we elaborate on an apparent discrepancy (for glucose as carbon source) we observed between Kim's [6] and Yuan's [1] experimental data. We will do this step by step.

The net NHx transfer rate vnet (mM/min) is given by:
where vamtb is the AmtB-mediated ammonium transport rate [see Eq. (S1.3-1)], and vdiff is the rate of unmediated NH3 diffusion across the cytoplasmic membrane. Based on Fick's law of diffusion, vdiff is given by: where Pcm is the permeability coefficient of the cytoplasmic membrane (m/min), Acell is the surface area of a   [53,54], and the intracellular pH (pHint) is 7.6 [55,56]. When the NH4 + concentration in the medium was set to 4 M, Kim reported that the intracellular ammonium [NH4 + int] is 24 M, the ammonium flux by AmtB (vamb) is 356 mM/min, the specific growth rate () is 0.8 h -1 , and the pH of their medium is 7.4 (pHext; see data for glucose shown in Figure 3 of [6]). Here, we assume that the NHx concentration at the cell surface is equal to that in the bulk medium. The same goes for the pH at the cell surface and in the bulk medium. These assumptions are considered valid since Kim cultivated cells in microfluidic chambers that were continuously supplied with liquid medium. Based on these assumptions, we  [57]. Thus, to explain the increase in glutamine production upon the ammonium upshift, the intracellular NH4 + concentration before the upshift should be much less than 0.1 mM. Solving Eq (S5-8) for the intracellular NH4 + concentration, we obtain: Yuan reported that the NH4 + concentration at the surface of the filter was 0.75 mM, the specific growth rate was 0.19 h -1 before the ammonium upshift, and the medium pH was 7.0 throughout the experiment. We again assume that the NHx concentrations at the cell surface and at the surface of the filter are equal, and that the same is valid for the pH (but see the next paragraph). Thus, [NH4 + surf] = 0.75 mM and [H + surf] = 100 nM. Since Yuan reported that E. coli is N-limited in this situation, AmtB was probably operative (vamtb > 0 mM/min).
Using those values and Eq. (S5-9), we obtain [NH4 + int] > 0.19 mM, which is twice the Km of GS for NH4 + . Consequently, the possible increase in glutamine production rate upon the ammonium upshift is predicted to be at most ~1.5-fold, which cannot explain the observed sharp glutamine increase upon the ammonium upshift  [58], 0.078 m/min [59], 0.12 m/min [60].
Even if the lowest of these values is used, the sharp increase in glutamine upon the ammonium upshift observed in Yuan's experiments still cannot be explained: [NH4 + int] > 0.14 mM and the increase in glutamine production rate can only be maximally 1.7-fold. The sharp increase in glutamine can be explained only if [NH4 + surf] would be (much) smaller than 0.75 mM. 38 6 Discussion on the Mass Transfer Capacity Coefficient (kdb) As we described in Section 1.5, we modeled the ammonium/ammonia (NHx) transfer from the bulk medium to the cell surface through the diffusion barrier in order to explain Yuan's experimental data: where kdb is the mass transfer capacity coefficient for NHx diffusion through the diffusion barrier (min -1 ), and the estimated value is 13.9 min -1 . [NHx,ext] and [NHx,surf] are the NHx concentrations (mM) in the bulk medium and at the cell surface, respectively. In this section, we first show that Yuan implicitly assumed the presence of a diffusion barrier in their model. Then, we discuss the validity of the estimated value of kdb.

The Yuan model implicitly includes a diffusion boundary for NHx
Yuan did not explicitly model the ammonium/ammonia diffusion barrier or the AmtB-mediated ammonium transport. In the Yuan model, the rate of NH3 diffusion from the medium to cytoplasm is given by: where [NH3ext] and [NH3int] are NH3 concentrations in medium and cytoplasm, respectively. The term kdiff is the "ammonia membrane diffusion constant" (See Supp. Table 3 of [1]). Here, we assume "membrane" means cytoplasmic membrane. Thus, in standard terminology, kdiff is the mass transfer capacity coefficient for NH3 in cytoplasmic membrane. Yuan estimated kdiff to amount to 24.6 min -1 to fit their model to Fig. 4 of [1]. In general, the mass transfer capacity coefficient consists of three parameters: where, Pcm is the permeability coefficient for cytoplasmic membrane, and Acell and Vcell are the area and volume of a single cell. If we use Acell = 9.18 x 10 -12 m 2 and Vcell = 2.15 x 10 -18 m 3 , we obtain Pcm = 5.76 x 10 -6 m/min, which is too low to be realistic (our Pcm is ~0.077 m/min, as derived from [6]). Therefore, kdiff cannot be considered a mass transfer capacity coefficient for NH3 in cytoplasmic membrane.

Thickness of the diffusion barrier
In our parameter estimation, we categorized kdb as a Class III parameter and estimated its value [see Eq. To support the specific growth rate of 0.19 h -1 before the N-upshift, NHx consumption rate must be around 9.5 mM/min, thus, vdb ≈ 9.5 mM/min. We assumed that the diffusion coefficient for NHx in agarose medium and filter is equal to that in water. Since most NHx molecules exist in the NH4 + form around pH = 7.0, the diffusion coefficient of NH4 + in water can be used as Ddb: Ddb = 1.12 x 10 -7 m 2 min -1 [5]. Asurf and Vsurf are the area and volume of the surface compartment. Since the surface compartment is a conceptual compartment, the determination of Asurf and Vsurf is somewhat arbitrary. Here, we assume that the surface compartment is a space at distance of up to 10 m from the cell surface; then, Asurf = 1. is likely that the cells formed multi-layered colonies. (iv) Another possible factor to reduce Asurf is air bubbles.
If tiny air bubbles were present, they would hamper NHx transfer, and Asurf would then decrease. (v) [NHx,ext] might be smaller than Yuan reported (0.75 mM). Yuan claimed that they measured the ammonium concentration at the surface of the medium. However, it was not possible to obtain the necessary details of their measurement. Therefore, we simulated NHx diffusion and consumption by wild type cells in Yuan's filter culture. As shown in Figure S3, the simulation with reasonable physical and biochemical parameters suggests that the NHx concentration at the surface of the medium could have been lower than 0.75 mM just before the N-upshift was administered at 180 min. Our simulation predicts that, at 180 min, the NHx concentration at the top of the medium is 0.29 mM, and that at the top of filter is 7.4 M.

Derivation of the Rate Equation of AmtB-mediated Ammonium Transport
To test the active transporter and the passive transporter hypotheses of AmtB, we derived the rate equations based on these hypotheses. Although the transport hypotheses are different, the derived rate equations can be converted into the same equation [Eq. (S1.3-1)] with different theoretical accumulation factors. In this section, we elaborate on the derivation of the rate equations for AmtB-mediated ammonium transport and how these can be transformed into Eq. (S1.3-1).

Active transport of NH3
For active transport of NH3, we start off with the reaction scheme shown in Figure S1a. Please note that NH4 + is transported in Figure S1a, but it is also valid if NH3 and H + are separately transported, provided that their transport is coupled. As shown in Figure S1b, Thus, the ratio is determined by kf, kr, Kext, and Kint. These parameters are not quite independent however. At 42 carrier equilibrium, the electrochemical potential difference across the cytoplasmic membrane for NH4 + is given by: where  is the transmembrane electrical potential (usually in the order of 150 mV for E. coli), F is the Faraday constant, R is the gas constant, and T is the absolute temperature. According to Eq. (S7.1-7), the theoretical accumulation factor of NH4 + () is given by:  where AmtBGlnKfree depicts GlnK-unbound AmtB, i.e. the operative form of AmtB.

Facilitated passive transport of NH3
According to the facilitated passive transport mechanism proposed by Khademi et al. [61], NH4 + binds to the periplasmic cage of AmtB, then becomes deprotonated and moves to the middle membrane channel, whereafter NH3 is released into the cytoplasm (Figure S1c). Based on this mechanism, we derive a rate equation for the facilitated passive transport of NH3. As shown in Figure S1d, AmtB is distributed over three states: the empty carrier (E), the carrier with NH4 + at the periplasmic cage (EN4ext), and the carrier with NH3 at the midmembrane path (EN3mm). We assume NH4 + binding to the periplasmic cage and deprotonation to be at equilibrium, and NH3 transport from the mid-membrane to the cytoplasm to be the rate-limiting step. This value is the same as the value Kim used [6], but their calculation is based on different experimental data (see Supp. Table 3 of [6]).

Cell volume (Vcell) and surface area (Acell)
We use Vcell = 2.15 m 3 and Acell = 9.18 m 2 . Radzikowski et al. estimated that the cell volume is 2.15 m 3 for cells growing on glucose in the exponential growth phase [7]. It should be noted that the same research group had reported a larger cell volume (3.2 m 3 ) in [8], but recently asserted that this was an overestimation [64].
Since cell surface area is not provided in [7], we estimated it as follows. (i) Vcell = 2.15m 3 [7]. (ii) The cell length was 2.5-fold larger than the cell width [7]. (iii) We assumed the cells to have the shape of a cylinder capped by two half-spheres [8]: L/mgDW (= 400 gDW/L) [63] and Vcell = 2.15 m 3 , we obtain 860 fgDW/cell which is within the physiological range [65]. Protein constitutes 50 -60 % of cell dry weight, so we get 430 -516 fg protein/cell, which is close to the value reported in [64] (In [64], 280 fg protein/cell was obtained using a cell volume of 3.2 m 3 . If we replace this with 2.15 m 3 , the protein weight will be 417 fg protein/cell).

Vmax of GS, GOGAT, and GDH (Vgs, Vgog, and Vgdh)
Most data available in the literature for the Vmax of GS are based on the glutamyl transferase assay, which measures the reverse reaction of GS with hydroxylamine and glutamine as substrates. However, we prefer using data obtained for the biosynthetic GS assay that employs ammonium and glutamate as natural substrates.
Also, we prefer using data obtained for ammonium-limited chemostat cultures, where GS will be not (or hardly) adenylylated. Kumar and Shimizu observed a Vmax value for GS of ~ 80 mM/min for cells growing in an ammonium-limited chemostat at 0.2 h -1 [33]. A higher Vgs was obtained in Bruggeman et al. [4] though and we will use this value, as well as their values for Vgog and Vgdh: Vgs = 600 mM/min, Vgog = 85 mM/min, and Vgdh = 360 mM/min. However, these Vmax's were measured under enzyme-specific in vitro conditions. We corrected these values considering the fact that Vmax in vitro may be different from that in vivo [66]. According We did not correct Vgdh because the in vivo-in vitro difference in Vgdh stems primarily from K + (see Fig. 5 of [66]) and sufficient K + (>20 mM) was present in the assay medium used in [14]. Kim [6]. According to Fig. 2C of [1], the blackness of the GS band of Western blot moderately increased upon transition from ammonium excess to limitation (with glucose). According to Supp. Table 8 of [6], mCherry (glnA) promoter activity increased by 3-fold upon transition from ammonium excess to limitation (with glucose). In conclusion, we took GStotal = 7 M as the reference value, and thus kcatgs = 38571 min -1 to obtain Vgs of 270 mM/min.
Data on cellular quantities of AmtB and GlnK are scarce. According to van Heeswijk et al. [68], GlnK is 1.7fold higher than GlnB under N-limited condition. In our study, GlnBtotal is 0.65 M and GlnKtotal to 2 M, so the difference is 3-fold. Moreover, for GlnK to bind AmtB in 1:1 ratio and effectively block ammonium transport, AmtB should be somewhat lower than GlnK. Also, Blauwkamp and Ninfa found that coexpression of AmtB and GlnK at proportional levels are required [74]. It should be noted though that Radchenko et al.
At any rate, it seems likely that the AmtB level is less than the GlnK level. Therefore, we took 1.5 M for AmtBtotal. Assuming Vcell = 2.15 m 3 and Acell = 9.18 m 2 , we obtain 634 AmtB monomers/m 2 , which is within the range assumed in [71].
Since GlnK is a paralogue of GlnB, we assumed that GlnK-related kinetic parameters are similar to the corresponding counterparts of GlnB. However, we make an exception for the parameters Kglnbog1-3 (dissociation constants for GlnK and 2-oxoglutarate). Atkinson and Ninfa determined a Kactivation of 2-oxoglutarate for uridylylation of GlnK (Fig. 7 of [75]), which would be ~ 10 µM. In analogy with GlnB, the Kglnkog1-3 might have similar values. This was however measured in a reconstituted system with purified UTase (10 nM) and GlnK (15 µM should be in the mM range. We have chosen 5 mM for the reference values of Kglnkog1-3.

GLUdemn, GLUdemf, GLNdemn, and GLNdemf
As shown in where the unit is mM/min. As shown in Table S1, the consumption rates are: of GS (glutamate and ATP) would be saturating, and the products (glutamine, ADP, and Pi) would be zero.

Parameters that are different between the active and passive transporter models
As we showed in the main text, model plausibility for the active transporter model is 2.2 x 10 -4 , and that for the passive transporter model is 1.7 x 10 -6 . Therefore, we concluded that the active transporter model is 130 times more likely than the passive transporter model. To investigate what causes the 130-fold difference in model plausibility, we plotted the deviation of the averages of estimated values (n = 5; ±SD) from their reference values ( Figure S4). Overall, the active and passive transporter models have similar parameter values.
However, some parameters are different between the two models, which results in the difference in model plausibility. The difference in parameter values comes from the fact that the passive transporter model cannot accumulate NH4 + inside and thus needs GS-related parameters to be changed from the reference values in order to achieve an N-assimilation rate of 40 mM/min at a low internal NH4 + concentration.

Low value for model plausibility is to be expected
As mentioned above, the model plausibility of the active transporter model is 2.2 x 10 -4 , and that of the passive transporter model is 1.7 x 10 -6 . One might argue that the plausibility of the active transporter model in absolute sense is already just too low, and that therefore the conclusion should be that not only the passive transporter model but also the active transporter model is unrealistic. However, we argue that the low value just implies that even the active transporter model cannot fit the training data without changing parameter values. As shown in Figure S4, most values are close to the reference values, but some are a little off and a few do substantially deviate; all together, this results in a low absolute value for the plausibility. Also, theoretically, a value of 2.2 x 10 -4 can be obtained by assuming that all model parameters deviate by 44.7 % from their reference values, which means that they are all just ~1.45-fold higher or lower. Thus, for most realistic models, the fact of the matter is that a low value for the plausibility is to be expected. 2.4 x 10 -5 for active and passive, respectively]. Therefore, our conclusion that the active transporter model is more likely than the passive transporter model holds at least qualitatively even if we change values of i.

Active transporter model is more likely even if Km of GS for NH4 + is allowed to be changed
In parameter estimation above and in the main text, we considered Km of GS for NH4 + (Kgsnh) as a constant (i.e. an unsearched parameter) because there is broad consensus in the literature on its value (100 µM). Is the active transporter model more likely than the passive transporter model even if we allow Kgsnh to be changed? To answer this question, we conducted an additional parameter search in which we made Kgsnh a class I parameter so that Kgsnh would be searched. We used I = 1.0407, II = 0.1930, and III = 0. Model plausibility of the active transporter model was significantly higher than that of the passive transporter model (p = 0.008, Wilcoxon rank-sum test): 3.4 x 10 -4 ± 4.7 x 10 -5 vs. 1.9 x 10 -5 ± 1.5 x 10 -6 (n = 5; ± SD). Thus, even if we allow Kgsnh to be changed, the active transporter model is 18 times more likely than the passive transporer model. For the active transporter model, Kgsnh, GStotal (Kim), and kcatgs were estimated to be 65 ± 3 M, 10 ± 0.6 M

Model improvements
The following is a list of the improvements that were implemented in the models we made for this paper; most corrections relate to errors made by ourselves and by others in the past.
1. The AmtB-mediated ammonium transport kinetics is consistent with the detailed balance principle, i.e.
[NH4 + int]/[NH4 + ext] =  at vamtb = 0 (see Section 7). This was not explicitly considered in our previous study an error; the set of reactions constituting ammonium transport did not comply with the detailed balance principle [89]. Although the Bruggeman and Ma models captured qualitative or semi-quantitative behaviors known to exist at the time, they have not been challenged with more recent quantitative experimental data.
Yuan developed a kinetic model of E. coli ammonium assimilation, which includes GDH, GOGAT, GS, and the PII-UTase-GS-ATase bicycle, but not AmtB-mediated ammonium transport or gene expression regulations [1]. Yuan reported that their model successfully reproduced the transient dynamic of glutamate, glutamine, and aspartate upon N-perturbation. We used their transient metabolome data upon N-perturbation as training data.
The Yuan model is a great step toward the holistic understanding of the ammonium assimilation. However, we think their model is incomplete: (i) NH4 + and NH3 were not discriminated in their model. (ii) The intracellular NH4 + was calculated to be as low as ~1 µM for a specific growth rate of ~0. Their model did not contain AmtB, and thus the internal NH4 + can be as low as 10 M, but still they assumed that E. coli at such a low internal NH4 + grows as fast as under N-rich condition.
Recently, Gosztolai et al. developed a concise model of E. coli's ammonium assimilation and investigated the role of GlnK in detail [25]. Their model consists of three variables: GlnBUMP, GlnKUMP, and GSAMP. Their model requires 2-oxoglutarate, glutamine, total GlnB, total GlnK, and total GS concentrations as model inputs and predicts posttranslational modification states of GlnB, GlnK, and GS. Their model does not include ammonium/ammonia, glutamate, GDH, GOGAT, AmtB, or metabolic reactions. Since the models in our study focus on short-term transient responses [1,10] or steady states [6], we did not use their experimental data (long-term transient responses with changes in gene expressions) as training data in our modeling.

Constrained Optimization-based Approach vs. Conventional Approach
Conventionally, a parameter estimation problem is formulated as an (unconstrained) optimization problem where p = (p1, p2, ...) is the search parameter vector. fconv is the (conventional) objective function that indicates badness-of-fit to training data (typically, the sum of squared residuals between experimental values and simulated values). p L and p U are the lower bound and upper bound vectors, respectively. The aim of this (unconstrained) optimization problem is to minimize the badness-of-fit (i.e. to maximize the goodness-of-fit).
In a few studies (e.g. [51,90]), parameter estimation problems have been formulated as a constrained optimization problem; however, in those studies, the objective function was used as a badness-of-fit indicator as in Eqs. (S12-1), and constraint functions were used to incorporate the relationships among parameters (e.g. a certain parameter is larger than another). In contrast, in this study, we use the objective function to quantify parameter deviation from the reference values and the constraint functions to quantify the badness-of-fit.
The conventional approach based on Eqs. (S12-1) suffers from the following three problems. First, it cannot incorporate prior knowledge about parameter values efficiently: what one can do is to believe it without any doubt or not to believe it at all. If a certain parameter has been measured in vitro, one might fix it to the measured value during parameter estimation. However, if the parameter value in vitro is very different from that in vivo, the fixation prevents a model from fitting to training data. Indeed, parameter values in vivo can be very different from those in vitro [66]. On the other hand, if the measured parameter is searched during parameter estimation, the parameter can take any value within the lower and upper bounds without any penalties. To achieve good fitting with reasonable parameter values, the best one can do is to choose appropriate lower and upper bounds by trial and error. Our approach [Eqs. Third, the conventional approach [Eqs. (S12-1)] often suffers from the parameter non-identifiability problem.
In parameter estimation, it is important to uniquely determine a parameter set p. If multiple parameter sets provide similar fittings (i.e. similar fconv values), then the prediction based on a single parameter set will not be reliable. p can be uniquely determined only if the model is structurally and practically identifiable [91,92]. If parameters compensate for each other's effect on model behavior, they cannot be uniquely determined (structurally non-identifiable). If there are fewer experimental data points than search parameters, these parameters cannot be uniquely determined (practically non-identifiable). Generally, it is difficult for the conventional approach to avoid parameter non-identifiability in realistic kinetic modeling. The conventional approach pursues good fitting only. In contrast, our constrained optimization-based approach [Eqs. (3)] not only looks for good fitting but also minimizes parameter deviation from the reference values. The latter additional requirement can help to identify a particular parameter set (see Section 13 and Figure S6). This technique is called "regularization" [92]. The regularization is analogous to the maximization of the objective function (e.g. biomass yield and ATP production) in flux balance analysis (FBA): Maximizing the objective function, one can determine a particular flux distribution (i.e. is identifiable). Without the objective function, the flux distribution cannot be uniquely determined (i.e. is non-identifiable). The difference between FBA and