Rational modular design of metabolic network for efficient production of plant polyphenol pinosylvin

Efficient biosynthesis of the plant polyphenol pinosylvin, which has numerous applications in nutraceuticals and pharmaceuticals, is necessary to make biological production economically viable. To this end, an efficient Escherichia coli platform for pinosylvin production was developed via a rational modular design approach. Initially, different candidate pathway enzymes were screened to construct a de novo pinosylvin pathway directly from D-glucose. A comparative analysis of pathway intermediate pools showed that this initial construct accumulated the intermediate cinnamic acid. The pinosylvin synthetic pathway was then divided into two new modules separated at cinnamic acid. Combinatorial optimization of the transcriptional and translational levels of these two modules resulted in a 16-fold increase in pinosylvin titer. To further increase the concentration of the limiting precursor malonyl-CoA, a malonyl-CoA synthesis module based on clustered regularly interspaced short palindromic repeats interference (CRISPRi) was assembled and optimized together with the other two modules. The final pinosylvin titer was improved to 281 mg/L, the highest pinosylvin titer reported to date, obtained directly from D-glucose without any additional precursor supplementation. The rational modular design approach described here could bolster our capabilities in synthetic biology for value-added chemical production.

The polyphenol pinosylvin (trans-3,5-dihydroxystilbene) is a plant secondary metabolite that fulfils several functions, including protection from attack by microbes or insects 1 . Pinosylvin has emerged as a promising nutraceutical or pharmaceutical because of its antioxidative, anti-inflammatory, anticancer, and chemopreventive activities 2 . Pinosylvin occurs primarily in the heartwood of the genus Pinus. However, the concentration of this compound in plants ranges only from 1 to 40 mg/g 3 , which makes it difficult to obtain this medicinally important product through plant extraction. Additionally, isolation of a single compound from plants is limited by seasonal and regional variations and is often difficult because of the complexity of the secondary metabolites in plant extracts 4 . Alternatively, microbial production of pinosylvin may accelerate its large-scale production and is more environmentally friendly 5 .
The conversion of the aromatic amino acid L-phenylalanine to pinosylvin requires three steps (Fig. 1). Firstly, phenylalanine ammonia lyase (PAL) deaminates L-phenylalanine to cinnamic acid. 4-Coumarate:CoA ligase (4CL) subsequently converts cinnamic acid into its corresponding coenzyme A ester cinnamoyl-CoA. The resulting cinnamoyl-CoA is then condensed with three units of malonyl-CoA via stilbene synthase (STS) to form the stilbene pinosylvin 4 .
To date, several studies have made significant progress in the microbial production of stilbenes 4, 5 , particularly in synthesizing resveratrol 6,7 . For example, one study reported that supplementation with 15 mM p-coumaric acid led to a high product titer of 2.3 g/L resveratrol in Escherichia coli 6 . However, these strategies rely heavily on supplementation with phenylpropanoids as stilbene precursors, which is costly and is used in few industrial processes.
In contrast, de novo microbial pinosylvin production could make biological production economically viable and would accelerate the application of pinosylvin as both a nutraceutical and a pharmaceutical. Previous studies have demonstrated the feasibility of microbial production of pinosylvin from D-glucose or glycerol. In one study, researchers constructed different configurations of the three-step pinosylvin biosynthetic pathway and achieved a titer of up to 3 mg/L. Further addition of the fatty acid synthesis inhibitor cerulenin increased the product titer to 70 mg/L 4 . As cerulenin is expensive and cost-prohibitive for large-scale fermentation, another study used a clustered regularly interspaced short palindromic repeats interference (CRISPRi) system to repress the fadD gene and obtained a product titer of 47.5 mg/L from glycerol 5 . Additionally, Corynebacterium glutamicum has also been engineered, and 121 mg/L of pinosylvin and 158 mg/L of resveratrol were achieved in the presence of 25 μM cerulenin 8 .
Despite the exciting achievements of these methods, there is a pressing need to develop a more economically viable process with high productivity and yield. However, the most difficult part lies in finding and applying effective solutions to overcome metabolic flux imbalances when implementing a heterologous pathway that uses non-natural substrates 9,10 . De novo pinosylvin synthesis from D-glucose involves manipulating multi-gene pathways that are subject to tight cellular regulation. Previous studies engineered only part of the overall pathway, such as from L-phenylalanine to pinosylvin or even from cinnamic acid 4,5 , ignoring the balance of the overall pathway. It was hypothesized that, when only part of the pinosylvin pathway is examined, eliminating one pathway bottleneck might introduce another bottleneck elsewhere along the pathway 11 . Furthermore, endogenous central metabolism still strongly competes for, and predominates in the use of, energy and carbon sources during the synthesis of malonyl-CoA, leaving only small amounts for producing recombinant products 12,13 .
More recently, we introduced a modular metabolic engineering strategy to balance the resveratrol synthetic pathway and achieved 35 mg/L resveratrol from 3 mM L-tyrosine 14 . Despite the potential of modular metabolic engineering to significantly bolster capabilities in synthetic biology 15 , this area still lacks a standard principle for module grouping and further optimization. Here, a rational modular design approach was developed for grouping and optimizing modules. Compared to our previous studies 14,16,17 , this rational modular design approach demonstrated that choosing the separating node at cinnamic acid, rather than at cinnamoyl-CoA as previously, led to a dramatic increase in the final production titer. The final pinosylvin titer was improved to 281 mg/L, which represents the highest titer reported to date. This rational modular design approach provides a framework for module grouping and optimization and would expedite the development of robust and efficient microbial cell factories for value-added chemical production.

Results
Design of de novo pinosylvin synthetic pathway. For de novo microbial production of pinosylvin, strains with an enhanced ability to synthesize L-phenylalanine are required. In E. coli, there are two rate-limiting steps in the synthesis of L-phenylalanine. The first is the condensation of erythrose 4-phosphate (E4P) and phosphoenolpyruvate (PEP) by the 3-deoxy-D-arabinoheptulosonate 7-phosphate (DAHP) synthase isozymes encoded by aroG, aroF and aroH. The second is the conversion of chorismate (CHO) to phenylpyruvate (PPY) by chorismate mutase/prephenate dehydratase (CM/PDT). Previously, a β-2-thienylalanine-resistant E. coli K12 mutant exhibiting high titers of L-phenylalanine was obtained 18 . This strain carried a wild-type DAHP synthase (DAHPS: aroF wt ) and a mutant CM/PDT (CM/PDT: pheA fbr ). As such, aroF wt and pheA fbr were overexpressed to enhance L-phenylalanine synthesis 19 .
For the first step of the phenylpropanoid pathway, two candidate PAL enzymes were chosen. One was from the red yeast Rhodotorula glutinis (RgPAL), which was used successfully in our previous studies 16,19 . The other was from Trichosporon cutaneum (TcPAL), a novel phenylalanine/tyrosine ammonia-lyase exhibiting high activity toward both L-phenylalanine and L-tyrosine 20 . 4CL from Petroselinum crispum and STS from Vitis vinifera served as the second and third enzymes because a previous study demonstrated that these two enzymes achieved the highest production of the stilbene resveratrol in E. coli 6 .
Furthermore, a comparative analysis of pathway intermediate concentrations (Table 1) showed that both combinations (pCOLA-aroF wt -pheA fbr , pCDFD-Trc-RgPAL-Trc-4CL or pCDFD-Trc-TcPAL-Trc-4CL, pETD-STS) led to high accumulation of cinnamic acid. This metabolite analysis therefore indicated that inefficient conversion of cinnamic acid was the pathway bottleneck, suggesting that subsequent engineering efforts should focus on addressing this obstacle.
At the outset, module two was overexpressed relative to module one to alleviate cinnamic acid accumulation (Fig. 2). In the first round (S1-S8), module one was expressed at the lowest value (COLA × Trc, 5 a.u.), while module two expression was increased starting from a higher value (p15A × Trc, 10 a.u.). The pinosylvin titer increased from 0.6 mg/L to 3.2 mg/L as module two expression rose to an intermediate value (p15A × T7, 50 a.u.), while the titer of the intermediate cinnamic acid followed the opposite trend, decreasing from 431 mg/L to 386 mg/L. This result suggested that the low expression of module one was not well matched to the high expression of module two. Additionally, use of pRSFDuet-1 in any combination (S5 and S8) led to low titers of pinosylvin and high titers of the intermediate cinnamic acid. Furthermore, the final OD600 values of these two strains were 2.6 and 2.2, respectively. Hence, it was supposed that the high copy number of pRSFDuet-1 increased the metabolic burden and in turn had a negative effect on cell behavior.

Table 1. Analysis of intracellular pools of pathway intermediates.
a RgPAL was directly cloned into NcoI/AvrII sites of pCDFDuet-1. Engineered strains contained pCOLA-T7-aroF wt -T7-pheA fbr and pCDFD-T7-RgPAL.
b TcPAL was directly cloned into NcoI/AvrII sites of pCDFDuet-1. Engineered strains contained pCOLA-T7-aroF wt -T7-pheA fbr and pCDFD-T7-TcPAL.
c Engineered strains contained pCOLA-T7-aroF wt -T7-pheA fbr , pCDFD-Trc-RgPAL-Trc-4CL, pETD-T7-STS.
d Engineered strains contained pCOLA-T7-aroF wt -T7-pheA fbr , pCDFD-Trc-TcPAL-Trc-4CL, pETD-T7-STS.
Hence, in the second round, module one expression was raised to a higher level (p15A × Trc, 10 a.u.) and module two expression was increased starting from 20 a.u. (CDF × Trc). Similar trends were observed, and a higher pinosylvin titer (15.8 mg/L) was obtained. In the subsequent rounds of modular engineering, module one expression was set to 20 a.u. (CDF × Trc), 40 a.u. (pBR322 × Trc), 50 a.u. (p15A × T7) and 100 a.u. (CDF × T7), respectively, while module two expression was increased starting from a value higher than that of module one. When module one expression was raised to 50 a.u. (p15A × T7), the pinosylvin titer increased with increasing module two expression, and the highest pinosylvin titer (61 mg/L) was obtained at the highest module two expression (200 a.u., pBR322 × T7).
In this study, the relative heterologous gene expression strength was calculated from the promoter strength and the plasmid copy number. To support this calculation, the transcript levels of TcPAL, 4CL and STS were measured by qPCR in strains S1, S9, S13, S17 and S20, as these strains exhibited different gene expression levels. As seen in Fig. 3A, the mRNA levels supported this calculation method: increasing the plasmid copy number and the promoter strength increased heterologous gene expression.
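The calculation of relative expression strength can be illustrated with a minimal sketch. The copy-number and promoter-strength values below are assumptions back-calculated from the a.u. values quoted in the text (for example COLA × Trc = 5 a.u. and p15A × T7 = 50 a.u.); they are not taken from a table in this article, and the function name is hypothetical.

```python
# Relative expression strength (a.u.) = plasmid copy number x promoter strength.
# ASSUMPTION: these relative values are inferred from the a.u. figures quoted
# in the text; they are illustrative, not measured constants.
COPY_NUMBER = {"COLA": 5, "p15A": 10, "CDF": 20, "pBR322": 40}
PROMOTER_STRENGTH = {"Trc": 1, "T7": 5}


def relative_expression(origin: str, promoter: str) -> int:
    """Return the relative expression strength in arbitrary units (a.u.)."""
    return COPY_NUMBER[origin] * PROMOTER_STRENGTH[promoter]


if __name__ == "__main__":
    # Origin/promoter pairs mentioned in the modular optimization rounds.
    pairs = [
        ("COLA", "Trc"), ("p15A", "Trc"), ("CDF", "Trc"), ("pBR322", "Trc"),
        ("p15A", "T7"), ("CDF", "T7"), ("pBR322", "T7"),
    ]
    for origin, promoter in pairs:
        print(f"{origin} x {promoter}: {relative_expression(origin, promoter)} a.u.")
```

Under these assumed values the sketch reproduces the a.u. levels used for modules one and two in the text; pRSFDuet-1 is omitted because no a.u. value is stated for it.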
Scientific Reports | 7: 1459 | DOI:10.1038/s41598-017-01700-9

Optimization of module two by changing 5′ region of mRNA secondary structure. As seen in Fig. 2, in the final round of modular engineering, the pinosylvin titer increased with increasing module two expression, and the highest pinosylvin production was obtained at the highest module two expression. This means that increasing module two expression would favor pinosylvin production. However, use of the high copy number plasmid pRSFDuet-1 imposed a high metabolic burden and had a negative effect on cell behavior. This genetic recalcitrance restricted the ability to balance the overall pathway. Previously, we demonstrated that reducing the secondary structure of the 5′ region of the open reading frame of a target gene mRNA could improve protein expression with no additional metabolic burden 16 . Hence, in this study, this strategy was developed further to optimize and balance the expression of the heterologous genes encoding the enzymes of the pinosylvin production pathway.
Firstly, the expression of 4CL was optimized by reducing the secondary structure of the 5′ region of the open reading frame of its mRNA. Following our previous study 16 , the minimum free energy of folding (ΔG) for the 5′ region of the mRNA transcript (+1 to +42) was calculated with NUPACK software 22 (Fig. 4).
As seen in Fig. 5, a total of seven variants spanning a range of free energies from −12.8 to −4.1 kcal/mol were chosen. These 4CL variants were used to replace the original 4CL in module two to further balance the overall pathway. The ΔG of the original 4CL was −12.8 kcal/mol. Variants with reduced mRNA secondary structure in the 5′ region (less negative ΔG) increased pinosylvin production, up to 97 mg/L at a ΔG of −5.0 kcal/mol, while the concentration of the intermediate cinnamic acid decreased continuously from 168 to 114 mg/L. Thus, progressively reducing the 4CL mRNA secondary structure steadily decreased cinnamic acid accumulation, whereas pinosylvin production increased only up to an intermediate value of ΔG. This result suggested that STS may be another pathway bottleneck. Hence, six different STS variants spanning a range of free energies from −9.4 to −4.4 kcal/mol were also chosen to replace the original STS (Fig. 6). Variants with reduced mRNA secondary structure in the 5′ region increased pinosylvin production from 96 mg/L to 160 mg/L (Fig. 7). Notably, modifying the expression of STS resulted in a dramatic change in pinosylvin production, indicating that the low activity of STS was another pathway bottleneck.

Construction of CRISPRi system to enhance malonyl-CoA concentration.
To further improve the intracellular availability of the limiting precursor malonyl-CoA, a malonyl-CoA synthesis module (module three) based on CRISPRi was assembled and optimized together with the other two modules. Based on our previous studies 12,17 , low repressing efficacy toward eno, adhE and fabB (anti-eno, anti-adhE, anti-fabB sgRNA), medium repressing efficacy toward sucC and fumC (anti-sucC, anti-fumC sgRNA) and high repressing efficacy toward fabF (anti-fabF sgRNA) were applied in engineered BL21(DE3) strains, because repression of these genes would enhance the intracellular malonyl-CoA level without significantly altering final biomass.

[Figure, panels A and B. Axis labels recovered: Control, anti-eno, anti-adhE, anti-fabB, anti-sucC, anti-fumC, anti-fabF; and Control, anti-adhE, anti-eno, anti-fabB, anti-fabF, anti-fumC, anti-sucC, anti-fabF/adhE, anti-fabF/eno, anti-fabF/fabB, anti-fabF/fumC, anti-fabF/sucC.]

Discussion
Efficient biosynthesis of pinosylvin from the cheap and renewable substrate D-glucose would accelerate the application of pinosylvin as both a nutraceutical and a pharmaceutical. Previous studies on pinosylvin production have mainly focused on engineering the three-step pathway from L-phenylalanine to pinosylvin 4,5 . Despite these significant achievements, further improvement in pinosylvin production requires investigating metabolic constraints and removing limitations throughout the overall pathway rather than in only part of it. In this study, using a rational modular design approach, the overall pinosylvin biosynthetic pathway, comprising a total of 11 genes, was constructed and arranged into three modules (Fig. 1). These modules were systematically optimized for efficient microbial production, and the final engineered strain produced 281 mg/L pinosylvin. The present work demonstrates that this rational modular design approach would expedite the development of robust and efficient microbial cell factories for value-added chemical production. The ability to use biological systems more broadly for chemical production is often limited by the inefficiency of complex biosynthetic pathways in heterologous hosts 11,23,24 . Despite the potential of modular metabolic engineering to revolutionize synthetic biology 15 , the field still lacks a standard principle for strain optimization. The rational modular design approach described here provides a framework for fine-tuning synthetic pathways, identifying potential pathway bottlenecks and alleviating them. Initially, different candidate pathway enzymes were screened to construct the overall pathway. Secondly, metabolite profiling indicated that cinnamic acid accumulated in the initial strain (Table 1), and the pinosylvin synthetic pathway was therefore divided at a new node, cinnamic acid.
These new modules were combinatorially optimized at the transcriptional and translational levels, which gave a 16-fold increase in pinosylvin titer compared to the initial construct (Fig. 2). Notably, choosing the separating node at cinnamic acid, rather than at cinnamoyl-CoA as previously 16,17 , led to a dramatic increase in the final production titer. This further highlights the importance of selecting a suitable separating node in modular pathway engineering. The present work reports a rational modular design approach for systematically identifying and removing metabolic bottlenecks.
Whereas the initial modular pathway engineering strategies employed only different plasmids and promoters to fine-tune pathway efficiency 15,19,21 , in this work we further explored reducing 5′ mRNA secondary structure and a synthetic CRISPRi system to modulate the pinosylvin biosynthetic pathway and endogenous central metabolism. Based on these combined strategies, our analyses revealed that accumulation of cinnamic acid, low stilbene synthase activity and limited malonyl-CoA availability were the main bottlenecks of the overall pathway. The resulting engineering efforts significantly increased the pinosylvin titer to 281 mg/L, which represents the highest titer reported to date in microbial production strains. This demonstrates that emerging synthetic devices and strategies in synthetic biology would greatly complement modular pathway engineering in exploiting the full potential of cell metabolism 23,25-27 .
Natural genes often exhibit codon usage bias, which can strongly influence the expression of heterologous genes 28 . To improve the expression of heterologous genes, researchers commonly replace non-optimal codons with optimal ones. However, previous studies found that many organisms are enriched for rare codons in the 5′ region of genes and that local RNA structure in the 5′ region of genes accounts for most of the change in expression 28,29 . Furthermore, we previously showed that weakening the mRNA secondary structure of the 5′ region of a target gene dramatically enhanced gene expression levels 16 . Here, it was found that increasing module two expression would favor pinosylvin production. However, use of the high copy number plasmid pRSFDuet-1 had a negative effect on cell behavior because of the large metabolic burden. Hence, in this study, the strategy was developed further to optimize module two expression without using a high copy number plasmid. The overall pathway was finally balanced, and the production titer of pinosylvin increased by 1.6-fold (160 vs 62 mg/L). In addition, this strategy revealed that the low activity of STS was one of the pathway bottlenecks, suggesting that it can also serve to identify pathway bottlenecks.
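Because "fold" figures can be read either as a ratio of titers or as a relative increase, a short arithmetic sketch may help. It uses only the two titers quoted above (160 and 62 mg/L); the function names are hypothetical and chosen for illustration.

```python
def fold_ratio(final_titer: float, initial_titer: float) -> float:
    """Ratio of the final titer to the initial titer (final / initial)."""
    return final_titer / initial_titer


def fold_increase(final_titer: float, initial_titer: float) -> float:
    """Relative increase over the initial titer ((final - initial) / initial)."""
    return (final_titer - initial_titer) / initial_titer


if __name__ == "__main__":
    final_mg_per_l, initial_mg_per_l = 160.0, 62.0  # titers quoted in the text
    print(f"ratio:    {fold_ratio(final_mg_per_l, initial_mg_per_l):.2f}")
    print(f"increase: {fold_increase(final_mg_per_l, initial_mg_per_l):.2f}")
```

With these inputs the relative increase is about 1.6, which matches the "increased by 1.6-fold" wording, while the ratio of the two titers is about 2.6.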
Endogenous central metabolism strongly competes for, and predominates in the use of, energy and carbon sources when synthesizing malonyl-CoA 12, 13 , leaving only small amounts for pinosylvin production. In one study, researchers demonstrated that addition of the fatty acid synthesis inhibitor cerulenin increased the production titer significantly 4 . Because cerulenin is cost-prohibitive for large-scale fermentation, another study used a CRISPRi system to repress the fadD gene and obtained a 1.9-fold increase in production titer 5 . However, there remains an urgent need to examine other metabolic engineering targets and to evaluate the combinatorial effect of different genetic interventions.
Here, it was found that repression of eno, adhE, fabB, sucC, fumC and fabF could improve the intracellular malonyl-CoA concentration. It was presumed that repression of eno would channel carbon flow toward pyruvate (the acetyl-CoA precursor), that repression of adhE, sucC and fumC would decrease the consumption of acetyl-CoA in the TCA cycle or glycolysis, and that repression of fabB and fabF would prevent the diversion of malonyl-CoA to fatty acid synthesis 12 . Furthermore, the combinatorial effect of various genetic interventions in central metabolic pathways was explored and the best combination was obtained (Fig. 8). Our study demonstrated that the CRISPRi system coupled with a modular pathway engineering strategy is a powerful tool for expanding the methods and strategies available for systematic engineering of industrially important microorganisms.