A highly selective biosynthetic pathway to non-natural C50 carotenoids assembled from moderately selective enzymes

Synthetic biology aspires to construct natural and non-natural pathways to useful compounds. However, pathways that rely on multiple promiscuous enzymes may branch, which might preclude selective production of the target compound. Here, we describe the assembly of a six-enzyme pathway in Escherichia coli for the synthesis of C50-astaxanthin, a non-natural purple carotenoid. We show that by judicious matching of engineered size-selectivity variants of the first two enzymes in the pathway, farnesyl diphosphate synthase (FDS) and carotenoid synthase (CrtM), branching and the production of non-target compounds can be suppressed, enriching the proportion of C50 backbones produced. We then further extend the C50 pathway using evolved or wild-type downstream enzymes. Despite not containing any substrate- or product-specific enzymes, the resulting pathway detectably produces only C50 carotenoids, including ∼90% C50-astaxanthin. Using this approach, highly selective pathways can be engineered without developing absolutely specific enzymes.

We hypothesized that we could shift the specificity range of CrtM toward larger (C >40 ) carotenoids by accumulating mutations neutral to C 40 function but deleterious to C 30 function. (b) C 30 and C 40 synthase functions of isolated CrtM variants. Three mutants and their parent, CrtM W38A (on a pUC vector), were co-expressed with either pAC-crtN-crtNb-idi or pAC-crtE-crtI. (c) Mutations at F233 and F26 mapped onto the structure of CrtM (2ZCCP). (d,e) The level of diaponeurosporene (C 30 ) or lycopene (C 40 ) production by E. coli cells harboring pUC-crtM variants, together with pAC-crtN-idi or pAC-crtE-crtI-idi, respectively. Carotenoid pigments were extracted from the cell pellets, and absorbance at 470 nm (d) or 475 nm (e) was used to calculate the carotenoid amounts. Bars represent the average of four replicates; error bars represent ± 1 standard deviation. (f,g) Carotenoid backbone production by E. coli cells harboring pUC-crtM variants (f) or pUC-crtM variants and pAC-crtE (g). Carotenoids were extracted and analyzed by HPLC (see Methods). Asterisks indicate that carotenoid production was non-detectable.
Figure 6. The production of C 55 backbone. Shown are HPLC traces at 287 nm of carotenoid extracts of E. coli harboring the indicated plasmids. The identity of the novel large C 55 backbone was confirmed by mass spectrometry, absorption spectrum (287 nm), and retention time.

Primer table (columns: Primer name, Sequence). Notation: restriction sites in bold; overhang sequences for type II restriction enzymes boxed; annealing sequences in lowercase; homology sequences used for SLIC italicized.
Table 10. Plasmid combinations used in each experiment

Description of the enzymatic steps
Escherichia coli synthesizes C 15 PP (farnesyl diphosphate) by the consecutive condensation of two molecules of isopentenyl diphosphate (IPP) with dimethylallyl diphosphate (DMAPP), both of which are provided by the 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. From this C 15 PP, our pathway produces C 50 -astaxanthin via the following steps: i) A two-step prenyl transfer reaction to create C 25 PP: Naturally occurring C 25 PP synthases exist in some bacteria and archaea 1-3 . However, they synthesize C 20 PP in approximately equal proportion with C 25 PP. Moreover, expression of the Aeropyrum pernix C 25 PP synthase gene in Escherichia coli leads to accumulation of only C 20 PP 4 , probably because of a suboptimal working temperature. Instead, we chose the specificity-shifting mutant of farnesyl diphosphate synthase (FDS Y81A ) from Geobacillus stearothermophilus 5 . Expression of FDS Y81A in E. coli resulted in the accumulation of C 25 PP, but only as a mixture with C 15 PP and C 20 PP 6 . Further engineering was required to convert this variant into a more specific and efficient C 25 PP synthase (Fig. 3a).
ii) A one-step head-to-head condensation of the precursors to make C 50 -phytoene (7): There are two types of bacterial carotenoid backbone synthases in nature: CrtM for C 30 backbone and CrtB for C 40 backbone (1). Previously, we discovered that the F26A/W38A mutant of Staphylococcus aureus CrtM produced small but detectable amounts of C 50 backbone 6 (7). In this paper, we further evolved this variant for improved preference for C 50 backbone synthesis (Fig. 3c).
iii) A six-step desaturation for chromophore formation to make C 50 -lycopene (8): Natural two-, three-, four-, and five-step carotenoid desaturases are known, but no six-step desaturases have been discovered in nature. Because the C 40 backbone desaturase (CrtI) from Pantoea ananatis showed a detectable level of C 50 backbone desaturation 7 , we decided to evolve it into an efficient six-step desaturase (Fig. 6a).
iv) A two-step cyclization to make C 50 -β-carotene (9): Cyclization is widespread in C 40 carotenoid pathways but not observed in the C 30 pathway at all. However, lycopene cyclases are known to be 'locally-specific' enzymes that recognize only a particular part (locus) of their substrates 8 . Therefore, they can accept a variety of non-cognate substrates, including C 35 carotenoids. Because C 50 carotenoids possess the same loci for cyclization as C 40 carotenoids 9 (ψ-and 7,8-dihydro-ψ ends), we reasoned that the natural C 40 carotenoid cyclase (CrtY from P. ananatis) had an excellent probability of acting on C 50 -lycopene (8) to synthesize C 50 -β-carotene (9). We found this was indeed the case (Fig. 7a).
v) A two-step hydroxylation to produce C 50 -zeaxanthin (10): In natural C 40 carotenoid pathways, zeaxanthin (4) is formed by enzymatic hydroxylation of two specific positions (3-and 3'-moieties) of β-carotene (3). As was often the case with other modification enzymes, we later found these steps in the C 50 pathway could be fulfilled simply by recruiting the β-carotene 3-hydroxylase (CrtZ) from an astaxanthin-producing microbe (Fig. 7b). The most promising CrtZ seemed to be the one from Brevundimonas sp. SD212: literature indicated that this had the least substrate specificity among the known homologues 10 . Due to its high GC-content, we decided to codon-optimize the Brevundimonas CrtZ for expression in E. coli.
vii) The routes for C 50 -astaxanthin (12): The last four steps catalyzed by CrtZ and CrtW are known to constitute a representative example of a so-called 'matrix pathway', where eight different intermediates can be created (See Fig. 1 and Supplementary Fig. 1). It is not known whether the path from C 50 -β-carotene (9) to C 50 -astaxanthin (12) [or the path from β-carotene (3) to astaxanthin (6)] proceeds in a defined or random (matrix) sequence.

Nomenclature and issues on trivial names of C 50 carotenoids
As shown in Fig. 1, the C 50 -astaxanthin (12) pathway parallels the astaxanthin (6) pathway. Considering the biochemical steps to acyclic, cyclic and oxo-cyclic C 50 carotenoids toward C 50 -astaxanthin (12) and the hundreds of other C 50 carotenoids we would in principle be able to biosynthesize in the future, we propose that the most convenient nomenclature rule for C 50 carotenoids is to name them simply by adding the prefix "C 50 -" to the trivial names of their respective natural C 40 counterparts. We have adopted this rule for the six C 50 carotenoids (7)(8)(9)(10)(11)(12) reported in this paper, as shown in Fig. 1 and Supplementary Fig. 1.
On the skeletal structure of C 50 carotenoids. Historically, there have been various ways to name C 50 carotenoids, and we anticipate some confusion in the trivial naming of carotenoids with unique skeletons. Years ago, synthetic chemists synthesized the "C 50 -versions" of β-carotene (9), zeaxanthin (10), and astaxanthin (12). Based on their sub-structural elements, these carotenoids were named decapreno-β-carotene 12 , decaprenozeaxanthin 13 , and decaprenoastaxanthin 13 , respectively. On the other hand, some bacteria are known to biosynthesize a different type of C 50 -carotenoids 14 formed by the attachment of an isopentenyl (C 5 ) unit to each end of lycopene (C 40 ), yielding cyclic and acyclic C 50 carotenoids. Although they are structurally different from the carotenoids the synthetic chemists and we have created, some of them are also called 'decapreno'-carotenoids. In our previous work, we referred to the C 50 backbone (7) as 1,1'-diisopentenylphytoene 6 . This nomenclature rule is applicable to acyclic C 50 carotenoids with zero-to six-desaturation step numbers, but not to those with seven-and eight-step numbers or to cyclized ones. This drove us to refer to C 50 carotenoids based on their structural similarity and parallel biosynthesis to their C 40 analogues.

Confusion in desaturase step numbers.
In the rule above, C 50 -lycopene (8) is the six-step desaturation product of C 50 backbone (C 50 -phytoene, 7). Although it shares the same terminal structures (ψ-ends) with its natural C 40 counterpart, lycopene (2), its chromophore is longer than that of lycopene: C 50 -lycopene has 15 conjugated double bonds in its chromophore, while (C 40 ) lycopene is a four-step desaturation product of phytoene (1), and it has 11 conjugated double bonds in its chromophore. This simple rule is applicable to carotenoids with other unnatural backbone sizes. For instance, the hypothetical 8-step desaturation product of C 60 backbone would be called C 60 -lycopene, and it would have 19 conjugated double bonds in its chromophore.
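The step-counting rule above can be checked with a few lines of Python. This is a minimal sketch: the function name and constants are illustrative, and the starting value of three conjugated double bonds for an undesaturated (phytoene-type) backbone is an assumption inferred from the lycopene figures quoted above (11 conjugated double bonds after four steps, two added per step).

```python
# Assumption: an undesaturated, phytoene-type backbone starts with three
# conjugated double bonds; each desaturation step extends the system by two.
BASE_CONJUGATED_BONDS = 3
BONDS_PER_STEP = 2


def conjugated_double_bonds(desaturation_steps: int) -> int:
    """Chromophore length after a given number of desaturation steps."""
    if desaturation_steps < 0:
        raise ValueError("desaturation_steps must be non-negative")
    return BASE_CONJUGATED_BONDS + BONDS_PER_STEP * desaturation_steps


for name, steps in [("lycopene (C40)", 4), ("C50-lycopene", 6), ("C60-lycopene", 8)]:
    print(f"{name}: {steps} steps -> {conjugated_double_bonds(steps)} conjugated double bonds")
```

With these assumptions the function returns 11, 15 and 19 for four, six and eight steps, matching the values given in the text.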

Supplementary Note 2 The number of biochemically possible carotenoids that can be produced by the combinatorial expression of six promiscuous enzymes along the C 50 -astaxanthin pathway
According to the literature, each of the six enzymes along the path from C 15 PP to C 50 -astaxanthin (12) possesses considerable tolerance to alternative substrates:
1. FDS Y81A synthesizes C 15 PP, C 20 PP and C 25 PP 5 . The I78G mutation further increases the number of consecutive condensation steps, allowing the enzyme to synthesize C 30 PP 15 .
2. CrtM F26A,W38A synthesizes C 30 , C 35 , C 40 , C 45 , and C 50 backbones by the conjugation of two molecules of C 15 PP, C 20 PP and C 25 PP 6 .
3. CrtI desaturates single bonds in a step-wise fashion, elongating the system of conjugated double bonds by two for each step 7,16 . It is known that in each step, CrtI only acts on the saturated sites adjacent to the developing chromophore 16 . In other words, CrtI desaturates only positions that would increase the size of the conjugated system comprising the chromophore.
4. CrtY cyclizes not only the ψ-end group but also the 7,8-dihydro-ψ end group 9 . In addition, we know that P. ananatis CrtY cyclizes carotenoids with different backbones [such as C 35 17 , C 30 18 , and C 50 carotenoids (this work)]. That is, CrtY acts on C 15 -, C 20 -, and C 25 -"halves" of carotenoids.
5. CrtW oxidizes position 4 of β-end groups with or without 3-keto groups 11 . We confirmed herein that CrtW acts on both sides of C 50 carotenoids, indicating β-cyclized halves are good substrates for CrtW, irrespective of their size, from C 15 to C 25 . The same applies to CrtZ, the hydroxylase that acts on position 3 of β-end groups 10 .
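The head-to-head condensation described in item 2 can be illustrated by enumerating every unordered pair of precursors. The following Python sketch is illustrative only; the function name `backbones` is hypothetical, and the enumeration is purely combinatorial, so it says nothing about which pairs a given CrtM variant actually accepts.

```python
from itertools import combinations_with_replacement


def backbones(precursor_sizes):
    """Map each backbone size to the unordered precursor pairs that form it."""
    result = {}
    for a, b in combinations_with_replacement(sorted(precursor_sizes), 2):
        result.setdefault(a + b, []).append((a, b))
    return result


# Precursor pool of C15PP, C20PP and C25PP
three = backbones([15, 20, 25])
for size in sorted(three):
    print(f"C{size}: {three[size]}")

# Adding C30PP to the pool extends the range of possible backbone sizes
four = backbones([15, 20, 25, 30])
print(sorted(four))
```

For the three-precursor pool this gives six pairs covering C 30 to C 50 , with C 40 arising from two different pairs (symmetric C 20 PP+C 20 PP and asymmetric C 15 PP+C 25 PP). With C 30 PP included, the enumeration also lists pairs that are combinatorially possible but not necessarily observed.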
From the information above, we calculated the number of possible carotenoids that could be created by combinatorial expression of the six enzymes above, for each carotenoid backbone (Supplementary Table 1 and Supplementary Fig. 2; see below for the explanation of the calculation). A total of 642 possible carotenoids can be biosynthesized starting from a pool of the three precursors, C 15 PP, C 20 PP, and C 25 PP. If C 30 PP is additionally considered, the number of possible carotenoids reaches 929. Note that even the natural (C 40 ) astaxanthin pathway can harbor 78 different compounds. Heterologous expression of the natural biosynthetic genes required for formation of astaxanthin (6) indeed results in the frequent accumulation of multiple compounds other than astaxanthin 10,11,19 .
Below we explain the calculation in Supplementary Table 1. The letters (A-E) in Supplementary Table 1 correspond to the subsections below.

A. The number of acyclic carotenoids for a given backbone size:
For symmetrical backbones (C 30 , C 40 and C 50 ), diversity at the level of desaturation can be calculated using the formula for combinations with repetition, in which order is not considered ("x-Choose-y"):

$$ {}_{n+r-1}C_{r} = \binom{n+r-1}{r} = \frac{(n+r-1)!}{r!\,(n-1)!} $$

where
n: number of possible desaturation step counts for the half backbone, counting zero (see Supplementary Fig. 2a).
r: number of objects in the combination. We use 2, for the two sides of the backbone.
For example, for the C 30 backbone, each half can undergo 0, 1 or 2 steps (see Supplementary Fig. 2a), so n is 3. There are two "independent" sides of the backbone (two half backbones) to consider, so r is 2. This gives 4!/(2! 2!) = 6 acyclic C 30 carotenoids.
The total number of carotenoids that can arise from desaturation of asymmetric carotenoid backbones (such as C 35 and C 45 ) can be calculated by multiplying together the step numbers (counting zero) for each half.
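The two counting rules in section A can be written as a short Python sketch. The function names are illustrative, and the step counts passed to the asymmetric example are hypothetical placeholders rather than values taken from Supplementary Fig. 2a.

```python
from math import comb


def acyclic_symmetric(n_levels: int, r: int = 2) -> int:
    """Combinations with repetition: choose r halves from n desaturation levels."""
    return comb(n_levels + r - 1, r)


def acyclic_asymmetric(n_levels_left: int, n_levels_right: int) -> int:
    """Distinct halves vary independently, so their level counts multiply."""
    return n_levels_left * n_levels_right


# C30 example from the text: each half can undergo 0, 1 or 2 steps (n = 3)
print(acyclic_symmetric(3))

# Hypothetical asymmetric backbone whose halves have 3 and 4 levels
print(acyclic_asymmetric(3, 4))
```

The symmetric case uses combinations with repetition because swapping the two identical halves does not give a new compound, whereas the two halves of an asymmetric backbone are distinguishable.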

B. The number of monocyclic carotenoids (for a given backbone size):
There are two types of end groups that CrtY can act upon: the ψ and 7,8-dihydro-ψ ends (see Supplementary Fig. 2b). Note that the conjugated system must extend to either end for cyclization to occur.
For carotenoids with symmetric backbones, the number of monocyclic carotenoids can be calculated as follows:
[the number of desaturation steps for the half backbone] × 2
Explanation: the factor of 2 accounts for the "choice" of ψ or dihydro-ψ end groups as substrates for cyclization. The number of steps for the half backbone refers to the desaturation level of the non-cyclized side.

For asymmetric backbones:
[the sum of the maximum number of desaturation steps for each half] × 2

C. The number of bicyclic carotenoids (for given backbone size):
For symmetric backbones, there are 3 kinds of bicyclic carotenoids, while there are 4 for asymmetric bicyclic carotenoids (see Supplementary Fig. 2c)

D. The number of monocyclic xanthophylls (for given backbone size):
Starting from monocyclic precursors, there exist three types of oxidized products by the action of ketolase and hydroxylase: 3-hydroxylated, 4-ketolated, and 3-hydroxylated/4-ketolated. Therefore, the total number of possible monocyclic xanthophylls for a given backbone size can be obtained by multiplying the number of monocyclic carotenoids by 3 (see Supplementary Fig. 2d).

E. The number of bicyclic xanthophylls (for given backbone size):
As was discussed in Supplementary Note 1 and Supplementary Fig. 1, there are 9 different oxidation patterns for each symmetric bicyclic carotenoid (see Supplementary Fig. 2e). On the other hand, there exist 15 different oxidation patterns for each asymmetric bicyclic carotenoid (see Supplementary Fig. 2f).
For carotenoids with symmetric backbones (C 30 , C 40 and C 50 ), there exist 3 bicyclic products, 2 of which are symmetric and the other asymmetric (see Supplementary Fig. 2c). Therefore, the number of possible oxidized products for each symmetric backbone is:
2 × 9 + 1 × 15 = 33
For the asymmetric backbones (C 35 , C 45 ), there exist 4 bicyclic products, all of which are asymmetric (see Supplementary Fig. 2c). Therefore, the number of possible oxidized products for each asymmetric backbone is:
4 × 15 = 60
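The rules in sections A to E can be combined into a single tally. The Python sketch below is one possible reading, not the authors' exact bookkeeping: it assumes that the "number of desaturation steps for the half backbone" in section B counts zero (the same n as in section A), and the choice of n = 4 for a C 20 half is an assumption not stated in this section. All function names are illustrative.

```python
from math import comb

SYMMETRIC_PATTERNS = 9    # oxidation patterns per symmetric bicyclic carotenoid
ASYMMETRIC_PATTERNS = 15  # oxidation patterns per asymmetric bicyclic carotenoid


def bicyclic_xanthophylls(symmetric_backbone: bool) -> int:
    """Section E: oxidized products summed over the bicyclic carotenoids."""
    if symmetric_backbone:
        # 3 bicyclic products: 2 symmetric and 1 asymmetric
        return 2 * SYMMETRIC_PATTERNS + 1 * ASYMMETRIC_PATTERNS
    # 4 bicyclic products, all asymmetric
    return 4 * ASYMMETRIC_PATTERNS


def tally_symmetric(n_levels: int) -> dict[str, int]:
    """Sections A-E for a symmetric backbone with n_levels per half (assumed)."""
    monocyclic = 2 * n_levels
    counts = {
        "A acyclic": comb(n_levels + 1, 2),
        "B monocyclic": monocyclic,
        "C bicyclic": 3,
        "D monocyclic xanthophylls": 3 * monocyclic,
        "E bicyclic xanthophylls": bicyclic_xanthophylls(True),
    }
    counts["total"] = sum(counts.values())
    return counts


print(bicyclic_xanthophylls(True), bicyclic_xanthophylls(False))
print(tally_symmetric(4))
```

Under these assumptions the total for n = 4 is 78, which coincides with the number quoted for the natural C 40 astaxanthin pathway in Supplementary Note 2. The agreement is suggestive only; Supplementary Table 1 remains the authoritative count.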

Supplementary Note 3 Directed evolution of farnesyl diphosphate synthase (FDS) for improved C 25 PP precursor supply
Directed evolution of FDS Y81A . A library of random point mutants of FDS Y81A was created by error-prone PCR and cloned into a pUC-based vector. The resultant plasmid library was transformed into E. coli cells harboring pAC-crtE-crtB-crtI-idi (see Fig. 3a), which were then plated on LB-agar to form colonies. The rationale was that FDS variants with improved activity for converting C 20 PP into C 25 PP would more fully deplete the lycopene precursor C 20 PP, resulting in paler colonies (Fig. 3a). We visually screened approximately 600 colonies and found three colonies (named FDS m1 , FDS m2 and FDS m3 ) that appeared much paler than those expressing the parent FDS Y81A .
C 20 PP consumption assay. We scored the in vivo C 40 carotenoid accumulation to compare the ability of the three FDS variants to consume C 20 PP (Supplementary Fig. 3a). E. coli harboring plasmid pAC-crtE-crtB-crtI-idi accumulates lycopene to about 300 µg gDCW -1 . Additional expression of FDS Y81A on a pUC vector resulted in decreased lycopene production (~100 µg gDCW -1 ) by diversion of the intermediate C 20 PP from the lycopene pathway. As anticipated, additional expression of each of the three isolated FDS variants resulted in even lower (0-30 µg gDCW -1 ) levels of lycopene accumulation. This indicates that the three selected variants possess improved in vivo C 20 PP consumption activity.
C 15 PP consumption assay. We also tested the effect of FDS expression on the level of C 30 carotenoid accumulation to compare the variants' abilities to consume C 15 PP (Supplementary Fig. 3a). Here, transformation of pAC-crtM-crtN causes E. coli to accumulate ~200 µg gDCW -1 of 4,4'-diaponeurosporene by way of C 15 PP as precursor. Co-expression of FDS Y81A completely abolished pigment accumulation, but co-expression of the 3 isolated mutants did not (Supplementary Fig. 3a). Thus, the 3 new FDS mutants appear to possess slightly compromised in vivo C 15 PP consumption activity compared with FDS Y81A .
Sequence analysis. The sequences of these clones are summarized in Supplementary Table 2. FDS m1 and FDS m3 contain only one amino acid substitution each: T121S and V157A, respectively. FDS m2 contains three additional amino acid substitutions, in addition to a different substitution at T121 (T121A) than found in FDS m1 . Because the C 20 PP consumption activity of FDS m1 was not higher than that of FDS m2 (Supplementary Fig. 3a), we conclude that the three other mutations (H215R, P239T, F266L) probably do not contribute to improvement of C 20 PP consumption activity.
Mutation mapped on crystal structure. When mapped onto the crystal structure of S. aureus FDS (Supplementary Fig. 3b), residues T121 and V157 were located on the "wall" of the substrate pocket.
We speculated that these mutations (T121A/S and V157A) enlarge the reaction pocket because they substitute smaller amino acids, thereby better accommodating larger substrates and products. Therefore, we chose to move forward with Ala instead of Ser for the substitution at residue 121. Altogether, we verified that Y81A, T121A and V157A were size-shifting substitutions in FDS.
In vitro product analysis of FDS variants. Each FDS variant was purified using a His-tag column, and product distributions were evaluated in vitro (see Methods). [1-14 C]IPP and DMAPP were provided to the purified FDS variants, and the hydrolyzed products were analyzed using TLC autoradiography. Throughout this experiment, reaction times were limited so that <25% of the DMAPP was consumed, in order to obtain data representative of specificity under in vivo conditions of constant substrate supply. The molar ratio of IPP:DMAPP was set at 10:1, 5:1 or 1:1. Results are shown in Fig. 3b and Supplementary Fig. 4. The molar ratio of IPP:DMAPP affected the product distribution: larger fractions of IPP resulted in larger fractions of longer products. Wild-type FDS produced only C 15 PP (100%) under all three IPP:DMAPP ratios, showing its stringent product specificity. On the other hand, most FDS variants showed relaxed specificity towards larger products, and none of them yielded a single product. Four FDS variants with the Y81A mutation produced C 30 PP as the terminal product. In particular, FDS Y81A,T121A and FDS Y81A,T121A,V157A produced a relatively larger amount of C 30 PP under the condition of IPP:DMAPP = 10:1 (Supplementary Fig. 4c). FDS Y81A,V157A appeared to be the most selective C 25 PP producer in this in vitro experiment. However, the conditions of this experiment may differ in important ways from those in E. coli cytoplasm, and the in vitro product specificities of these purified FDS variants do not necessarily mirror the proportions of available carotenoid precursors in E. coli; see below. It should be noted:
Although over-expressed, FDS variants are not the only providers of carotenoid precursors: an endogenous C 15 PP synthase, IspA, is also present. The product spectrum of FDS variants could be further shifted toward larger products when C 15 PP was fed as a substrate. Also the C 15
For instance, in vitro, FDS Y81A,V157A (FDS m3 ) did not produce C 15 PP (Supplementary Fig. 4). However, this same variant apparently produced slightly more C 15 PP compared to FDS Y81A in E. coli (Supplementary Fig. 3a bottom panel). Also, when FDS Y81A,V157A was co-expressed with various carotenoid synthase variants, asymmetric C 40 carotenoids (condensation product of C 15 PP and C 25 PP) accumulated (Fig. 4a). This means that the E. coli cells expressing FDS Y81A clearly feed C 15 PP to the carotenoid pathway, in addition to C 25 PP.

Supplementary Note 4. Directed evolution of diapophytoene synthase (CrtM) for improved C 50 synthase activity
Strategy for screening C 50 -synthase activity. Previously, we had discovered some mutants of CrtM [the C 30 backbone (diapophytoene) synthase from S. aureus] with detectable C 50 backbone (7) synthase activity 6 . However, these mutants also synthesize C 35 , C 40 (symmetric C 20 PP+C 20 PP, and asymmetric C 25 PP+C 15 PP), and C 45 backbones. To create a more specific C 50 backbone synthase, we decided to conduct additional rounds of directed evolution on our CrtM variants. Because there is no simple medium- or high-throughput assay for C 50 backbone synthesis, we decided instead to use our established colony color-based screens for C 30 and C 40 activities. We hypothesized that mutations that further shift the size specificity of a C 50 -capable CrtM variant toward larger substrates might be obtained by searching for additional mutations that diminish C 30 backbone synthase function but maintain C 40 synthase function (Supplementary Fig. 5a). We hoped that these additional mutations could then be combined with previously discovered ones to further shift the specificity of CrtM toward larger substrates and products.
CrtM W38A as a parent for directed evolution. As the parent for mutagenesis and directed evolution, we chose CrtM W38A , a variant with both C 30 and C 40 backbone synthase activities 6 instead of CrtM F26A,W38A , the variant with the best C 50 synthase activity at the time. The latter enzyme exhibited very low C 30 synthase activity, so it would have been difficult to screen for mutations that further decreased its C 30 activity. In contrast, CrtM W38A is a 'generalist' mutant that retains wild-type C 30 synthase activity.
Directed evolution of CrtM W38A . Using error-prone PCR, we created a library of genes encoding variants of CrtM W38A , which we cloned into a pUC-based vector. The plasmid library was co-transformed with pAC-crtE-crtI-idi (see left panel in Fig. 3c) into E. coli cells, which were plated on LB-agar. About 70% of the ~200 variant colonies screened had red pigmentation, indicating the desired retention of substantial C 40 synthase activity. From among them, approximately 50 colonies were picked, pooled, and subjected to plasmid purification. This mixture of plasmids was then used as the template for the next round of PCR mutagenesis, which was followed by further screening for maintenance of C 40 activity. Five successive rounds of this process were conducted to accumulate mutations apparently neutral to C 40 function. Next, the resultant plasmid mixture was co-transformed with pAC-crtN-idi (see right panel in Fig. 3c) in a search for white colonies (indicating diminished C 30 synthase activity). Three mutants conferring the desired white phenotype were isolated and named CrtM g5L-1 , CrtM g5H-1 , CrtM g5H-2 .

C 40 -and C 30 -activity of isolated CrtM variants.
To test the activity of these mutants as C 40 and C 30 synthases, the CrtM mutants were re-transformed into XL1-Blue cells harboring pAC-crtE-crtI-idi or pAC-crtN-crtNb-idi. Here, CrtNb, a C 30 carotenoid oxidase 20 that converts 4,4'-diaponeurosporene to 4,4'-diaponeurosporenal (an orange pigment), was additionally expressed to facilitate the visual screening. We reconfirmed that all three CrtM variants demonstrated the desired combination of reduced C 30 synthase activity and undiminished C 40 synthase activity (Supplementary Fig. 5b).

Sequence analysis. Sequencing of the three isolated CrtM variants (Supplementary Table 3) revealed that they all possess a mutation at F233 (F233S or F233L), and two of the three also have a previously reported size-shifting mutation (F26L) 21 . Previously, Umeno et al. performed site-saturation mutagenesis at position 26 of CrtM, and concluded that F26A was the best substitution for C 40 synthase activity 6 . As a result, in addition to the W38A mutation in the parent, we decided to move forward with combinatorial testing of the F26A and F233S mutations.

Structural mapping of F233S mutation. Although F233S alone does not appear to change the specificity of CrtM, this mutation significantly shifts the product specificity of CrtM to C 50 synthesis when combined with F26A and/or W38A ( Fig. 4 and Supplementary Fig. 5d-g). Located at the end of the active site cleft of CrtM, F233S is expected to further shift the size specificity of CrtM by enlarging the pocket (Supplementary Fig. 5c). This shifted specificity of F233S-containing variants is further emphasized when C 30 PP is supplied to it as a potential precursor for carotenoid synthesis. Upon co-expression with FDS I78G,Y81A 15 , CrtM F26A,W38A,F233S produced C 55 -phytoene, the largest carotenoid backbone ever reported to have been biosynthesized ( Fig. 4a and Supplementary Fig. 6).

Supplementary Note 5. Regression analysis of carotenoid backbone titers vs mutations from the combinatorial expression of FDS and CrtM mutants
Because the true product distribution of FDS variants is extremely challenging to measure in living cells, we performed multiple linear regression on the carotenoid backbone titer data behind the bars in Fig. 4a (an 8×8 full-factorial experiment) in an effort to ascertain the contributions of the FDS and CrtM mutations and their interactions to the measured titers. Using the General Linear Model function in Minitab ® v.16.2, we performed 5 separate regressions, one for each carotenoid backbone from C 30 to C 50 (see Supplementary Table 4), of the backbone titer vs. the set of 6 total amino acid substitutions in the FDS and CrtM variants (3 each) plus the 15 two-body (epistatic) interaction terms of the 6 substitutions.
We were most interested in the resulting sets of p-values (from two-tailed F-tests) for each term in each model. For each carotenoid backbone model, we note there is at least one FDS-CrtM substitution interaction term (indicated by the bottom 9 rows containing both blue-and green-highlighted cells) whose p<0.1. The interpretation is that at least one FDS-CrtM interaction term is significant (meaning that its coefficient is statistically different from zero) at α=0.1 for each model, implying that the matching of FDS and CrtM variants is important for determining the resulting distribution of carotenoid backbone titers.
Notes for Supplementary
• T is the 64×1 matrix of titers of a carotenoid backbone for each FDS-CrtM combination in the experiment.
• X constitutes the 64×7 design matrix for the experiment, containing a first column of ones for the constant and 6 columns of predictor variables whose values are either 0 or 1 depending on whether that amino acid substitution (in the FDS or the CrtM) is absent or present in the combination.
• β is the 7×1 matrix of first-order term coefficients (plus the constant term) to be solved for by the regression.
• W is the 64×15 design matrix of 2-body (epistatic) interactions whose values are either 0, if both substitutions specified by the entry are not present in the FDS-CrtM combination, or 1 if both are present in the combination.
• γ is the 15×1 matrix of 2-body interaction term coefficients to be solved for by the regression.
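The dimensions given in these notes can be reproduced by constructing the design matrices explicitly. The sketch below is illustrative: it assumes that the 8×8 full-factorial experiment corresponds to every presence/absence combination of three substitutions in each enzyme, and the substitution labels (taken from Supplementary Notes 3 and 4) are assigned to columns here as an assumption. The notes suggest a model of the form T = Xβ + Wγ plus a residual, but the exact equation is not reproduced in this section. Fitting would be done with any least-squares routine; only the matrix construction is shown.

```python
from itertools import combinations, product

# Assumed column labels; the order is arbitrary and chosen for illustration.
FDS_SUBS = ("Y81A", "T121A", "V157A")
CRTM_SUBS = ("F26A", "W38A", "F233S")
SUBS = FDS_SUBS + CRTM_SUBS


def design_matrices():
    """Build X (constant + main effects) and W (two-body interactions)."""
    pairs = list(combinations(range(len(SUBS)), 2))
    X, W = [], []
    for bits in product((0, 1), repeat=len(SUBS)):
        X.append([1, *bits])                            # leading 1 is the constant term
        W.append([bits[i] * bits[j] for i, j in pairs])  # 1 only if both are present
    return X, W, pairs


X, W, pairs = design_matrices()

# Interaction terms that pair one FDS substitution with one CrtM substitution
cross = [(SUBS[i], SUBS[j]) for i, j in pairs if i < len(FDS_SUBS) <= j]

print(len(X), len(X[0]), len(W[0]), len(cross))
```

The construction yields 64 rows, 7 columns in X, 15 columns in W, and 9 FDS-CrtM cross terms, consistent with the dimensions and the "bottom 9 rows" mentioned in the text.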

Supplementary Note 6. How metabolic filtering works -An Illustrative Model
To date, we have experimentally biosynthesized eight different carotenoid backbone structures, C 30 (C 15 PP+C 15 PP), C 35 (C 20 PP+C 15 PP), C 40 (C 20 PP+C 20 PP), asymmetric C 40 (C 25 PP+C 15 PP), C 45 (C 25 PP+C 20 PP), C 50 (C 25 PP+C 25 PP), C 55 (C 30 PP+C 25 PP), and C 60 (C 30 PP+C 30 PP). Co-expression of an FDS variant and a CrtM variant yields some distribution of these eight compounds. The specificity of this two-member pathway (i.e., its product distribution) is, at first approximation, determined by the relative production rate of each of the eight carotenoids C j . These rates are proportional both to the concentrations of isoprenyl diphosphates generated by the FDS (its product specificity) and the kinetic preference (substrate specificity) of the carotenoid synthase. Because of this relationship, improvements to both factors would multiplicatively alter the distribution of carotenoid backbones.
The following model, intended for illustrative and explanatory purposes (not for fitting to experimental measurements), shows quantitatively how modest improvements in the specificities of successive pathway enzymes can be combined to give substantial focusing of pathway flux to a desired product by "filtering out" undesired precursors from being incorporated into carotenoids.
Assumptions:
1. Each carotenoid C j is the condensation product of two isoprenyl diphosphates C i1 PP and C i2 PP, which may be identical or different from each other.
2. The product distribution of carotenoid backbones (for a given pairing of FDS and CrtM variants) is determined by the production rates (p j ) of each possible backbone.
3. The production rate (p j ) of each carotenoid C j is determined by the concentrations of the two prenyl diphosphate substrates (C i1 PP and C i2 PP) multiplied by the rate constant for their condensation by a carotenoid synthase variant (k j ). The concentration of synthase enzyme is assumed to be constant across all cases and "included" in the values of k j . Thus, differences in k j only reflect differences in the specificity for the various substrates. Therefore,

$$ p_j = k_j\,[\mathrm{C}_{i1}\mathrm{PP}]\,[\mathrm{C}_{i2}\mathrm{PP}] = k_j\,(y_{i1} I_{Tot})\,(y_{i2} I_{Tot}) = k_j\, y_{i1}\, y_{i2}\, I_{Tot}^{2} $$

where y i1 and y i2 are the molar fractions of isoprenyl diphosphates used to make carotenoid C j and I Tot is the total concentration of isoprenyl diphosphates. I Tot is assumed to be constant (and equal to 1) for the purposes of this exercise.
We illustrate metabolic filtering in Supplementary Fig. 7. In the example shown there, C 30 PP is a poor substrate for the CrtM variant, which is reflected by the small amounts of its condensation products (0.5% and 0.2% for C 55 and C 60 carotenoids, respectively). Thus, C 30 PP is largely "filtered out" of being incorporated into carotenoids. The rate constants for C 45 and C 50 synthesis are identical for this CrtM variant, but production of the former backbone is limited by the reduced supply of C 20 PP. Thus, as a result of combining precursor enrichment with condensation specificity, the molar fraction of the target C 50 carotenoid becomes quite high (~90%) because other precursors are "filtered out" at the FDS or CrtM stage.
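The filtering effect can be reproduced numerically from assumption 3, with each production rate taken as the rate constant multiplied by the two precursor molar fractions (I Tot = 1). The following Python sketch uses hypothetical molar fractions and rate constants chosen only for illustration; they are not the values behind Supplementary Fig. 7, and the function name is an assumption.

```python
from itertools import combinations_with_replacement


def backbone_distribution(y, k):
    """Normalized production rates p_j = k_j * y_i1 * y_i2, with I_Tot = 1."""
    rates = {}
    for i1, i2 in combinations_with_replacement(sorted(y), 2):
        # Pairs without a listed rate constant are treated as not condensed
        rates[(i1, i2)] = k.get((i1, i2), 0.0) * y[i1] * y[i2]
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no backbone is produced with these parameters")
    return {pair: rate / total for pair, rate in rates.items()}


# Hypothetical precursor molar fractions (keys are chain lengths of CnPP)
y = {15: 0.05, 20: 0.10, 25: 0.80, 30: 0.05}

# Hypothetical relative rate constants; C45 and C50 are given equal values,
# mirroring the statement in the text that these two constants are identical
k = {(20, 20): 0.1, (15, 25): 0.1, (20, 25): 1.0, (25, 25): 1.0,
     (25, 30): 0.05, (30, 30): 0.05}

dist = backbone_distribution(y, k)
c50_share = dist[(25, 25)]
for pair, share in sorted(dist.items()):
    print(f"C{sum(pair)} {pair}: {share:.3f}")
```

In this toy example the C 50 share of the backbone pool exceeds the C 25 PP share of the precursor pool, because a moderately selective precursor supply and a moderately selective synthase act multiplicatively.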

Supplementary Note 7. Directed evolution of phytoene desaturase for C 50 desaturase activity
Directed evolution of CrtI. Prior establishment of a specific C 50 backbone pathway (Fig. 4) was necessary to enable simple visual screening for C 50 desaturase activity without erroneously evolving the desaturase for improved or altered activity on C 40 backbone (phytoene, 1), the native substrate of phytoene desaturase, CrtI (Supplementary Fig. 8a). Using error-prone PCR, we created a library of genes encoding point-mutants of CrtI. These variants were cloned into a pUC-based plasmid with an arabinose promoter. The resultant CrtI plasmid library (size ~10 5 ) was transformed into E. coli cells harboring one of the FDS-CrtM variant pairs that selectively produces C 50 backbone (pAC-fds Y81A,V157A -crtM F26A,W38A,F233S ). After plating the library onto LB-agar and subsequent colony formation (Fig. 6a), nitrocellulose membranes were used to transfer the colonies onto fresh LB-agar plates containing 0.2% (w/v) arabinose. Out of the 2000 colonies surveyed, we isolated 8 colonies with an intense red hue. After a second screening of these hits, we chose six CrtI variants for detailed analysis.

Sequence/product analysis of CrtI variants. Sequence analysis (Supplementary Table 5) revealed three mutations, N304S, I338V, and F339S/L, that significantly increase C 50 desaturase activity. The CrtI variants were co-transformed with pAC-fds Y81A,V157A -crtM F26A,W38A,F233S in E. coli and the distribution of C 50 desaturation products was analyzed. CrtI mut2 and CrtI mut8 , which both possess the N304S mutation, exhibited the highest in vivo desaturation levels of all the variants (Supplementary Fig. 8b).
Site-saturation mutagenesis of N304 residue and analysis. We then performed a site-saturation mutagenesis experiment on position 304 of CrtI using NNK degenerate oligonucleotides (N: equimolar mixture of dA, dG, dC, dT; K: equimolar mixture of dG and dT), and screened for mutants with elevated C 50 desaturation activity (pink colonies). The N304P mutant appeared to have the highest in vivo desaturation activity (Supplementary Fig. 8d). Although the accumulation of C 50 -lycopene (8) is not significantly different between CrtI N304S and CrtI N304P , the amount of undesaturated C 50 -phytoene (7) was lower in the cells expressing CrtI N304P . All of the CrtI variants showed wild-type activity in the C 40 pathway (Supplementary Fig. 8c). Thus, they acquired C 50 desaturase activity without compromising their original C 40 activity.
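To make the NNK design concrete, the following Python sketch expands the degenerate codon and translates each member with the standard genetic code. It is an illustration of what an NNK library can encode in principle, not a description of the screening itself; the helper name `expand` is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate base symbols used in the text: N = A/C/G/T, K = G/T
DEGENERATE = {"N": "ACGT", "K": "GT"}


def expand(degenerate_codon: str) -> list[str]:
    """List every concrete codon represented by a degenerate codon."""
    pools = [DEGENERATE.get(base, base) for base in degenerate_codon]
    return ["".join(bases) for bases in product(*pools)]


nnk_codons = expand("NNK")
encoded = {CODON_TABLE[codon] for codon in nnk_codons}
amino_acids = sorted(encoded - {"*"})
stop_codons = [codon for codon in nnk_codons if CODON_TABLE[codon] == "*"]

print(len(nnk_codons), len(amino_acids), stop_codons)
```

The expansion gives 32 codons that cover all 20 amino acids with a single stop codon (TAG), which is why NNK is a common choice for saturating a single position such as 304.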
The plasmids for the downstream enzymes (crtI, crtY, crtW, and crtZ) and their derivatives are based on pUCara vectors; these vectors were made by replacing the lac promoter of pUC18m (from the O 3 operator site to the O 1 operator site) with the araC to araBAD promoter region from the pBADHisA vector (Invitrogen).