Genome-scale metabolic rewiring to achieve predictable titers rates and yield of a non-native product at scale

Achieving high titer rates and yields (TRY) remains a bottleneck in the production of heterologous products through microbial systems, requiring elaborate engineering and many iterations. Reliable scaling of engineered strains is also rarely addressed in the first designs of the engineered strains. Both high TRY and scale are challenging metrics to achieve due to the inherent trade-off between cellular use of carbon towards growth vs. target metabolite production. We hypothesized that being able to strongly couple product formation with growth may lead to improvements across both metrics. In this study, we use elementary mode analysis to predict metabolic reactions that could be targeted to couple the production of indigoidine, a sustainable pigment, with the growth of the chosen host, Pseudomonas putida KT2440. We then filtered the set of 16 predicted reactions using -omics data. We implemented a total of 14 gene knockdowns using a CRISPRi method optimized for P. putida and show that the resulting engineered P. putida strain could achieve high TRY. The engineered pairing of product formation with carbon use also shifted production from stationary to exponential phase and the high TRY phenotype was maintained across scale. In one design cycle, we constructed an engineered P. putida strain that demonstrates close to 50% maximum theoretical yield (0.33 g indigoidine/g glucose consumed), reaching 25.6 g/L indigoidine and a rate of 0.22g/l/h in exponential phase. These desirable phenotypes were maintained from batch to fed-batch cultivation mode, and from 100ml shake flasks to 250 mL ambr® and 2 L bioreactors.

Heterologous production of bioproducts has been demonstrated for a very large number of 58 compounds and in a wide variety of microbial hosts 1,2 . Yet, even the most well-designed 59 heterologous pathway requires considerable additional work to reach titers, rates and yields 60 (TRY) necessary for the adoption of these systems by industry 3,4 . In addition, the production 61 parameters of a strain at lab-scale is often not predictive of its performance and robustness 62 when cultivated in different modes or at larger scales. As a result, only a small fraction of 63 bioproduction strains have been successfully scaled and deployed 2 . 64 65 Here we explore if it is possible to rewire the metabolism of the host strain such that production 66 of a final product or a key intermediate is coupled with the carbon source, and used to maximize 67 and maintain productivity at scale. Native microbial processes that take such growth coupled 68 routes include the generation of ethanol and organic acids during fermentation. Production of 69 these metabolites are required for carbon utilization during fermentative growth, and 70 correspondingly these compounds represent the most prominent examples of successful high-71 volume bioproduction 5,6 . We hypothesized that coupling production to growth is implementable 72 for a heterologous product, and that such a dependence could provide high TRY and the ability 73 to maintain production parameters across different growth modes and scales. 74

75
The availability of comprehensive metabolic models and genome editing tools in a wide variety 76 of microbes suitable for industrial use provides the foundation for our approach. We use the 77 production of indigoidine, a bipyridyl compound derived from glutamine, as the target 78 heterologous product. Both as a sustainable replacement for blue pigments 7 in a wide array of 79 applications as well as a model non-ribosomal peptide 8 , this compound provides a valuable 80 target to explore. We used Pseudomonas putida KT2440 as our production host, leveraging the 81 availability of the iJN1462 genome scale model for P. putida KT2440 9 . We adapted elementary mode analysis (EMA) 10 to determine the constrained minimal cut set (cMCS) required to 83 minimize metabolic flux towards undesired products and link indigoidine formation to cell 84 viability 11 . These analyses, combined with publicly available omics data 12,13 , provided the set of 85 gene loci that represented the reactions necessary for removal. The corresponding set of gene 86 loci were repressed using multiplex CRISPR interference (CRISPRi) that we optimized for use 87 in P. putida KT2440. Our implementation resulted in a highly edited strain that, in a single 88 iteration of strain engineering, achieved close to 50% max theoretical yield of indigoidine in P. To develop the product coupling approach (Figure 1a), we first identified the number of 97 metabolites represented in P. putida iJN1462 9 model that can be made essential for growth. 98 For this we used the cMCS algorithm 11 that can identify minimal sets of reactions, the 99 elimination of which would cause production of a given metabolite to become essential for 100 growth. Aerobic conditions with glucose as the sole carbon source were used to model growth 101 parameters. We searched for gene knockdown sets to satisfy three potential constraints in 102 which the theoretical product yield was at a minimum of 10%, 50%, or 80% of the maximum 103 theoretical yield (MTY) for all producible metabolites in the model coupled to a minimum 10% 104 biomass yield. This analysis, completed for all 2145 metabolites in the genome scale model, 105 indicated that 979 organic metabolites could potentially be made essential for growth. In the first 106 pass, 98.6% of these 979 metabolites had the potential to satisfy this biomass-formation 107 constraint, with a minimum production threshold of 10% MTY. When the threshold for minimum production was set to 50% MTY, 903 metabolites could be essential for growth; for an 80% 109 threshold MTY, only 444 metabolites could be made essential for growth, representing only 45% 110 of the total producible metabolites. This potential growth coupling for all metabolites is 111 consistent for similar calculation for other hosts 11 (see Supplementary Table 1). Setting a 112 higher demand for minimum product yield results in fewer metabolites that can be used to 113 implement a production obligatory regime. 114 115 As the framework proposed by von Kamp and Klamt required an extensive rewiring of microbial 116 physiology, a priori it was not obvious how to account for a heterologous end product, a 117 challenge to implement in practice. We began by adding an in silico reaction for the 118 heterologous product, indigoidine, to the genome scale metabolic model iJN1462 9 . This reaction 119 represents the biosynthesis of indigoidine from glutamine and accounts for all necessary 120 cofactors. The MTY for glutamine and indigoidine was calculated to be 1.141 mol/mol and 0.537 121 mol/mol respectively from glucose as the carbon source ( Table 1). The MTY for glutamine in P. 122 putida was high relative to other hosts screened by us (Supplementary Table 2). As this 123 method accounts for the other physiological processes competing for resources, a MTY derived 124 from a genome scale model provided a more accurate assessment compared to simpler 125 methods, as is commonly done in the field 14,15 . 126

127
In order to predict reactions that would be required to improve indigoidine production, we used 128 glutamine, its precursor, to conduct the analysis. Our process for determining the list of required 129 gene targets is diagrammed in Figure 1. The minimum theoretical product yield of glutamine 130 was set at 10%, 50% and 80% MTY to derive the reactions that would require knockout or 131 knockdown for product substrate paired growth. We eliminated potential target sets that needed 132 the removal of genes coding for multi-functional proteins, as we sought to limit additional 133 metabolic perturbations that could confound our analysis. Of the 1956 reactions in iJN1462 that 134 are associated with genes, only 60% have a single gene associated with them. If a metabolic 135 reaction was catalyzed by more than one gene product (genes coding for isozymes or multi-136 subunit enzymes), we included both genes for inactivation. After implementing these filters, we 137 found that a threshold of 80% MTY could be achieved using the elimination of 14 cellular 138 reactions. These 14 metabolic reactions when mapped to their corresponding genes and gene 139 products represented 16 gene loci (Figure 1b and Supplementary Dataset 1). 140 141 Next, we used Flux Balance Analysis (FBA) and Flux Variability Analysis (FVA), to confirm the 142 16-gene cMCS strategy to be obligatory for glutamine production. Using our constructed cMCS 143 platform, we set the parameters to explore potential product-obligatory strategies to enhance 144 the production of indigoidine in P. putida when glucose is fed as the sole carbon source. This 16 145 gene set provided for glutamine was then extended to assess production paired growth for 146 indigoidine. FBA analyses confirmed that growth using glucose could be paired with indigoidine 147 production at 90% theoretical yield (0.48 mol/mol or 0.66 g/g of glucose). This analysis also 148 confirmed that we can adapt the work from von Kamp and Klamt 11 for non-native final products 149 and target specific genes rather than enzymatic reactions for intervention. 150 151 Since EMA requires the delineation of specific growth conditions, such as starting carbon 152 source, we examined if the gene cut set with glucose as a substrate, could maintain product 153 pairing with other known native carbon substrates for P. putida, such as para-coumarate and 154 lysine 12,16 . FBA with these alternate carbon sources (i.e. lysine, para-coumarate) indicated that a 155 strain engineered using the 16-gene cMCS strategy for the glucose would fail to produce 156 glutamine (Supplementary Table 3). In contrast, this gene targeting set (Supplementary 157 Dataset 1) results in the desired production obligatory growth using galactose as a carbon 158 source because it shares the same downstream catabolism as glucose (Figure 1b). Prior to construction of the multi-gene edited production strain, we assessed if our gene set 179 contained essential genes. The iJN1462 model has an incomplete list of essential genes; in 180 addition we manually annotated genes as essential or dispensable using gene essentiality data 181 generated from a barcoded fitness library (RB-TnSeq) 13 (Supplementary Dataset 2). Out of the 182 16 genes identified for knockdown, two genes were excluded because they are essential for 183 viability. By eliminating essential genes from the targeted gene set, we hypothesized that the 184 predicted metabolic rewiring is more consistent with product substrate pairing rather than growth 185 coupling.

187
To efficiently overcome technical limitations required to make 14 gene edits, we implemented a 188 multiplex CRISPRi/dCpf1 targeting strategy. We drew on our understanding of repetitive 189 element instability 23,24 to minimize use of repeated DNA sequences to limit gRNA array loss.

213
A consequence of pairing production to the catabolism of specific carbon sources, results in 214 predictions that other carbon sources can no longer be metabolized (Supplementary Table 3). 215 We tested this prediction experimentally and observed that engineered strains for product 216 substrate pairing showed reduced growth when using either lysine or para-coumarate as the 217 sole carbon source, in agreement with the modeling (Figure 2d). 218

219
Characterizing the multi-gene engineered production strain 220 221 Redirecting metabolic flux to improve glucose paired glutamine/indigoidine formation is 222 quantifiable across several other metrics. We should observe high TRY for our desired product 223 since higher glutamine yields, to support growth, should result in more indigoidine yields. The 224 production of indigoidine would shift from stationary phase to exponential phase, as the 225 metabolism of glucose catabolism and glutamine production are paired. Finally, these 226 phenotypes should maintain fidelity across a range of growth modes and scales. 227

228
We tested to ensure that indigoidine production was improved in the engineered strain relative 229 to the controls in several laboratory cultivation formats. We tested production using both the 230 native glucose and engineered galactose pathways as carbon sources. Both strains were 231 cultivated with either 10 g/L glucose or galactose, as the same targeted reaction set would 232 function on either carbon source. In a deep well plate format, we observed that the engineered 233 strain produced nearly three-fold more indigoidine than the control strain when fed glucose 234 (Figure 3a). In a shake flask format, the engineered strain produced 30% more than the control 235 strain. Finally, when cells were cultivated with galactose in the deep well format, the same 236 engineered strain was able to produce indigoidine in contrast to the galactose utilization control 237 strain which only formed biomass (Figure 3b).

239
In a 2L bioreactor, cultivated in a batch-mode with glucose as the carbon source, we observed 240 improved titers at 12.5 g/L indigoidine from 60 g/L glucose. The control production strain, in 241 contrast, produced 5 g/L, and production of the final molecule was realized after glucose was 242 exhausted from the media. When galactose was fed, the engineered strain also exhibited 243 improved titers of 25.6 g/L of indigoidine from 60 g/L galactose as opposed to the control strain 244 that generated only around 900 mg/L of indigoidine; a 28-fold improvement in production was 245 observed in the engineered strain. Moving to an industrially relevant cultivation format did not 246 impact the final product titer, allowing us to further develop cultivation methods by switching to 247 fed-batch mode. 248

249
We realized greater improvements in final product titer as well as improvements in production 250 kinetics in the fed-batch mode using the ambr® 250 system. After administering an initial high 251 nutrient feed to increase biomass in the reactor, we reduced the feed rate to study indigoidine 252 product formation during exponential phase growth (Figure 3a, right hand panel, and 253 Supplementary Figure 3). During this phase, the engineered strain produced at a rate of 0.22 254 g/l/h, while the control strain accumulated no additional product. This observation is consistent 255 with our hypothesis that indigoidine formation would occur during exponential phase due to 256 pairing with glucose. In terms of yield, the engineered strain generated consistently higher 257 production than the control strain when cultivated with glucose (0.2 g/g compared to 0.1 g/g), 258 but was not as consistent when cultivated on galactose (Figure 3c). Together all aspects of the 259 phenotypes that were desirable for the engineered strain were found to be true. involving such computational methods to improve 1,3-BDO production to 18 g/L 33 , but their 278 method cannot be generalized for other molecules. Others have described growth coupling as 279 the creation of a "driving force" such as ATP production or cofactor imbalance, and link the 280 driving force to the desired production pathway 29,34-36 . Driving force coupling is also pathway 281 specific and requires additional strain engineering. Examples include 1-butanol production in E. 282 coli using NADH as the driving force 34 or media supplementation for butanone production in E. 283 coli linked to acetate assimilation 29 . 284 285 In contrast to the examples described above, the multi-gene engineered product substrate 286 pairing we report here is an implementation of "strong" growth coupling. It relies on EMA based 287 methods that provide targets at the genome-scale level 11,37,38 but predicts a large number of 288 enzymatic reactions for elimination. We used FBA to corroborate our optimal cMCS and removed essential genes from targeted gene sets using -omics data to determine the genes that 290 should be targeted for CRISPRi. This genome scale approach also represents a valuable 291 paradigm for the evaluation of microbial hosts for their production capacity and could 292 significantly reduce the time taken to optimize carbon source conversion to the final product. 293 The appeal of this strategy is that the gene knockdown solution is scale-agnostic; the predicted 294 metabolic rewiring should apply even in the largest bioreactor formats. 295

296
In the context of TRY improvement alone, indigoidine itself is an example of a heterologous 297 product that has been demonstrated at high titers 39-41 . The production of indigoidine was high in 298 the oleaginous yeast Rhodosporidium toruloides but remained low in the model yeast S. 299 cerevisiae, despite similar optimization of cultivation parameters. This comparison represents an 300 empirical example of the innate metabolic potential of a given host, and is consistent with our 301 calculated max theoretical yields for indigoidine (Supplementary Table 2). Genome scale 302 metabolic models can accurately predict how microbial hosts could be advantageous for the 303 production of a given metabolite. For indigoidine, the MTY from glucose in P. putida is 0.54 304 mol/mol and is comparable to that for R. toruloides (0.5 mol/mol), while E. coli (0.4 mol/mol) and 305 S. cerevisiae (0.079 mol/mol) are much lower. It is likely that every molecule will be different. 306 Thus, selecting the best host/ final product pair is a crucial aspect of developing the ideal 307 production platform. 308 309 While our engineered strains showed many desirable phenotypes, several aspects merit 310 additional discussion. The predicted EMA based cut set (cMCS) demands zero flux through 311 these reactions for strong growth coupling. We excluded two genes from the predicted gene set 312 due to their essentiality. Of the remaining gene targets, our RNAseq and proteomic data 313 indicates a partial gene knockdown, implying that a non-zero flux could occur through the 314 predicted reactions. The resulting yield space for indigoidine production is now different from what was predicted by EMA (Figure 3d). This suggests that partial implementation of the EMA 316 predictions, still resulted in a shift of production from stationary to exponential phase while 317 maintaining desirable indigoidine TRY. 318

319
Our approach allowed us to achieve, in one cycle of strain engineering, a high and consistent 320 TRY for indigoidine across cultivation scales. With improvements in genetic tools and metabolic 321 models it may be possible to obtain the 90% MTY predicted by the EMA based cMCS. A better 322 understanding of the terminator sequence efficiency (as observed in this study and elsewhere in 323 E. coli 25 ) would enable more efficient CRISPR mediated gene knockdown. Similar fold 324 repression of targeted proteins by CRISPRi/dCas9 were recently reported 42 , suggestive of a 325 limitation for existing CRISPR systems in P. putida. Additionally, delineation of gene targets for 326 this approach relies on the availability of high-quality genome scale metabolic models, and also 327 calculated using a single carbon source. Future mechanistic studies of these strains will lead to 328 more refined genome scale models, enabling more accurate metabolic flux modeling when the 329 engineered strains are grown with these carbon sources. This approach cannot be used for 330 certain mixed carbon streams, such as glucose and xylose, as our calculations for glucose 331 pairing inactivates the pentose phosphate pathway. Similarly, there are metabolites that cannot 332 be made obligatory for growth 11 . For final products derived from this class of metabolites, 333 alternative strategies or hosts would need to be explored. We also do not take into 334 consideration products or intermediates that may be toxic. As industrial processes also use 335 renewable carbon sources that may contain inhibitory byproducts, microbial hosts will require 336 some degree of tolerance engineering 43 to unlock its potential. Addressing these aspects will 337 further boost the usefulness of this product substrate pairing approach. 338 339

Computation of constrained minimal cut sets (cMCS)
Pseudomonas putida KT2440 genome scale metabolic model (GSM) iJN1462 9 was used. The 342 ATP maintenance demand and glucose uptake were 0.97 mmol ATP/gDW/h and 6.3 mmol 343 glucose/gDW/h, respectively. Constrained minimal cut sets (cMCS) were calculated according 344 to the algorithm as previously described 11 . Excretion of byproducts was initially set to zero, 345 except for the reported overflow metabolites for secreted products specific to P. putida 346 (gluconate, 2-ketogluconate, 3-oxoadipate, catechol, lactate, methanol, CO 2 , and acetate).

Indigoidine Extraction and Quantification 412
Indigoidine was purified from P. putida with slight modifications as previously described 50 . Cells 413 were lysed in 1% SDS and 100 mM NaCl and then centrifuged at 14,000 xg for 3 minutes. The 414 supernatant was discarded and the pellet was washed with three rounds of methanol, 415 isopropanol, water, ethanol, and hexane to remove contaminating proteins and metabolites. The 416 pellet was allowed to dry overnight and then resuspended in DMSO at a final concentration of 1mg/mL. Indigoidine purity was characterized by NMR. A standard curve correlating indigoidine 418 concentration to OD 612 was generated using this reagent (Supplementary Figure 5). The purity 419 of extracted indigoidine (Supplementary Figure 6) from both E. coli and P. putida were cross-420 validated by NMR as previously described 41  were filled with 150 mL M9 minimal salt media containing 10 g/L glucose as carbon source. 501 Temperature was maintained at 30 • C throughout the fermentation process and agitation was 502 set constant to 1300 RPM. Airflow was set constant to 0.5 VVM based on the initial working 503 volume and pH was maintained at 7.0 using 4 N NaOH. Reactors were inoculated manually with 504 5 mL of pre-culture cell suspension. After an initial batch phase of 12 hours, cultures were fed 505 with a concentrated glucose feed solution (600 g/L glucose, 120 g/L ammonium sulfate, 50 506 µg/mL kanamycin, 3 g/L arabinose and 500 µM IPTG) by administering feed boluses every two 507 hours restoring glucose concentrations to 10 g/L (feed parameters: 3.1 min @ 50 mL/h). After 508 observing glucose accumulation, feed addition was paused and resumed at reduced feed rates 509 when glucose levels dropped below 10 g/L (1 min @ 50 mL/h). Experiments with a continuous 510 feeding regime were initially fed at 1.3 mL/h (0.3 mL/h after seeing glucose accumulation). 511 Samples were taken 1-2 times every day (2 mL) and stored at -20 Modeling and engineering workflow diagram. This approach can potentially be extended to any 551 carbon source, host and/or metabolite. Input specific to this specific host/final product work is 552 marked in green font.  Proteomics.  format assessed is indicated above each panel. A fed-batch mode of cultivation was 588 implemented in the ambr® 250 cultivation format. Glucose feeding is indicated by the gray 589 shaded area. Control samples indicated with black outlined bars or black dots. The multi-gene 590 engineered strains are indicated with blue bars or blue dots. c, Analysis of indigoidine yield 591 across cultivation formats for both glucose-fed and galactose-fed strains. Yield from the control 592 strain is indicated with black bars, and the multi-gene engineered strain is indicated with green 593 bars. d, Predicted production envelope using genome scale model and constraint-based 594 methods represented as theoretical yields of indigoidine as a function of biomass yields. 595 Possible yield space for control strain is represented in grey. The possible yield space for 16 596 gene cMCS predicted by EMA is represented in orange. The range of observed experimental 597 yield space for either the control or engineered strain across different production formats is 598 represented with black and teal fill. A red dot indicates the realized production yield vs biomass 599 yield in exponential phase under optimized conditions. The phase shift in production from 600 stationary phase to exponential is not depicted. 601 602