Genetically engineered biosynthetic pathways for nonnatural C60 carotenoids using C5-elongases and C50-cyclases in Escherichia coli

While the majority of natural carotenoid pigments are based on a 40-carbon (C40) skeleton, some bacterial carotenoids have a larger C50 skeleton, biosynthesized by attaching an isoprene unit (C5) to each end of the C40 carotenoid pigment lycopene. Subsequent cyclization reactions result in the production of C50 carotenoids with diverse and unique skeletal structures. To produce even larger nonnatural novel carotenoids with C50 + C5 + C5 = C60 skeletons, we systematically coexpressed natural C50 carotenoid biosynthetic enzymes (lycopene C5-elongases and C50-cyclases) from various bacterial sources together with the laboratory-engineered nonnatural C50-lycopene pathway in Escherichia coli. Among the tested enzymes, the elongases and cyclases from Micrococcus luteus exhibited significant activity toward C50-lycopene and yielded the novel carotenoids C60-flavuxanthin and C60-sarcinaxanthin. Moreover, coexpression of M. luteus elongase with Corynebacterium cyclase resulted in the production of C60-sarcinaxanthin, C60-sarprenoxanthin, and C60-decaprenoxanthin.

In a previous study, we constructed a nonnatural C50 backbone carotenoid pathway in which two geranylfarnesyl diphosphates (C25PP) are condensed to produce C50-phytoene (Fig. 1b). This compound is subsequently subjected to a six-step desaturation reaction resulting in the synthesis of C50-lycopene. Using a metabolic filtering approach 17, we extended this C50-lycopene pathway to specifically produce the nonnatural C50 backbone carotenoids C50-β-carotene, C50-zeaxanthin, C50-canthaxanthin, and C50-astaxanthin. In this study, we systematically expressed elongase and cyclase in engineered E. coli strains harboring enzymes of the C50-lycopene synthetic pathway and established pathways for the synthesis of the novel carotenoids C60-flavuxanthin, C60-sarcinaxanthin, C60-sarprenoxanthin, and C60-decaprenoxanthin (Fig. 1b).
(C50) and nonnatural (C60) carotenoids. (a) Natural C40-to-C50 carotenoid pathway. Using lycopene (C40) as a substrate, the lycopene elongases CrtEb from Corynebacterium and CrtE2 from M. luteus attach two isoprene units to C40. The resulting flavuxanthin (C50) is then cyclized by the γ- and/or ε-cyclases CrtYe/Yf (Corynebacterium) or CrtYg/Yh (M. luteus) to produce sarcinaxanthin, sarprenoxanthin, or decaprenoxanthin (all C50). LbtABC genes from Dietzia sp. CQ4 produce the β,β-cyclic C50 carotenoid C.p.450 via the independent intermediate C.p.496. (b) Nonnatural C50-to-C60 carotenoid pathway construction. Combined expression of Corynebacterium and M. luteus elongases and cyclases resulted in the conversion of laboratory-generated C50-lycopene to C60 carotenoids with γ- and/or ε-cyclic ends. We could not obtain C60 counterparts of β-end C50 carotenoids (indicated by arrows with dashed lines).

Results
Activities of elongase and cyclase in the natural C40 pathway. We selected Corynebacterium glutamicum, M. luteus, and Dietzia sp. CQ4 as sources of lycopene elongases and C50 cyclases, because the C50 carotenoid pathways from these organisms have been previously reconstructed in E. coli [10][11][12]. The carotenoid cluster (shown in Supplementary Fig. 1) of C. glutamicum 11 comprises crtEb (lycopene elongase), crtYeYf (heterodimeric C50 ε/γ-cyclase), and lycopene-producing genes (crtEBI). The expression of this gene cluster in E. coli produces sarprenoxanthin, decaprenoxanthin, and sarcinaxanthin via the acyclic intermediate flavuxanthin 10,11 (Fig. 1a). The corresponding gene cluster in M. luteus 10 encodes crtE2 (lycopene elongase) and crtYgYh (C50 γ-cyclase), and coexpression of these genes with crtEBI results in specific accumulation of sarcinaxanthin 10. Uniquely, the carotenogenic gene cluster of Dietzia sp. CQ4 12 encodes a gene for elongase (lbtC) that is fused in frame with a single subunit of cyclase (lbtB), which forms a heterodimer with another unit of cyclase (lbtA). Expression of lbtB with an lbtA gene product results in the production of a functional C50 β-cyclase 12.
Although different in organization, all of the elongase and cyclase genes are clustered in small <1.7-kb regions (Supplementary Fig. 1), allowing one-step PCR amplification from genomic DNA. We cloned these DNA fragments into a plasmid that encodes crtI N304P 17, a Pantoea ananatis phytoene desaturase variant that can desaturate both (C40-) phytoene and C50-phytoene (see Fig. 2a for plasmid construct). In addition to these three operons (from C. glutamicum, M. luteus, and Dietzia sp.), we cloned the elongase and cyclase genes from Corynebacterium efficiens. The carotenoid gene cluster of this organism has exactly the same organization as that of C. glutamicum, and in our homology analyses, the DNA sequence identity of the two clusters (from crtE to crtEb, see Supplementary Fig. 1) was 64.3%. Yet, the functions of the C. efficiens carotenoid pathway and its products remain uncharacterized.
To characterize the functions of elongase and cyclase in the context of the natural (C40) pathway (Fig. 1a) in E. coli, we used an engineered E. coli strain that constitutively expresses fds Y81M and crtM F26A,W38A genes (crtEB equivalent 17) and produces phytoene (C40). We introduced plasmids encoding phytoene desaturase (a crtI variant), elongases, and cyclases into this strain, and after culturing for 48 h, we extracted carotenoids using acetone and analyzed them using high performance liquid chromatography (HPLC) (Fig. 2a). These analyses showed that cells expressing C. glutamicum crtYeYfEb produced a mixture of (C50-) sarcinaxanthin (2), (C50-) sarprenoxanthin (3), and (C50-) decaprenoxanthin (4), with significant quantities of their substrate (C40-) lycopene (1). Cells expressing C. efficiens crtYeYfEb produced a carotenoid composition indistinguishable from that of cells expressing the C. glutamicum gene cluster. Cells expressing M. luteus crtE2YgYh produced a single peak of (C50-) sarcinaxanthin (2), whereas expression of Dietzia sp. CQ4 lbtABC resulted in the specific production of C.p.450 (C50). Despite the differences in the source of lycopene biosynthetic genes, promoters, RBS sequences, E. coli strain, and culture conditions, the product distributions of the pathways we constructed were very similar to those in previous reports 10.
To determine whether these enzymes can metabolize C50-lycopene, we introduced the plasmids encoding the elongase and cyclase enzymes into another E. coli strain that constitutively expresses fds Y81A,V157A and crtM F26A,W38A,F233S and produces C50-phytoene 17. In this strain, the phytoene desaturase mutant (CrtI N304P) can desaturate C50-phytoene to produce C50-lycopene selectively 17, providing a sole substrate for the elongases and cyclases in this strain. However, we observed no novel peaks other than C50-lycopene in carotenoid fractions from this strain.
Ribosome binding site (RBS)-optimized cyclase/elongase results in higher production of cyclic C50 carotenoids. In the previous section, we observed that (C40-) lycopene accumulated in cells expressing elongase and cyclases (Fig. 2a), suggesting inefficiencies in this pathway and room for improvement. The present four carotenoid operon constructs share a similar organization, in which the open reading frames (ORFs) for cyclases and elongases partially overlap. Specifically, the start codon of M. luteus crtYg is 28 base pairs upstream of the stop codon of the crtE2 gene, and the start codon of crtYh overlaps with the stop codon of the crtYg gene. In C. glutamicum and C. efficiens, crtYe and crtYf overlap by 4 bases, and in Dietzia, lbtA and lbtBC genes also overlap by 4 bases, as is commonly observed in various gene clusters containing carotenoid operons 10. We investigated translation initiation rates using the RBS calculator [18][19][20] and found RBS scores as low as 0.01 (Table 1), suggesting very low translation initiation rates and likely formation of stable secondary mRNA structures. In the native operons, where the stop and start codons of adjacent ORFs overlap, translational re-initiation likely ensures efficient translation in vivo.
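The 4-base overlaps described above can be illustrated with a minimal sketch. It assumes a toy sequence invented for illustration (not taken from any of the operons in this study) in which the downstream start codon ATG shares the motif ATGA with the upstream stop codon TGA. The helper names `orf_end` and `overlap_bases` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_end(seq: str, start: int) -> int:
    """Return the 0-based exclusive end of the ORF that begins at `start`.

    The ORF is read codon by codon until the first in-frame stop codon,
    which is counted as part of the ORF.
    """
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i + 3
    raise ValueError("no in-frame stop codon found")


def overlap_bases(seq: str, upstream_start: int, downstream_start: int) -> int:
    """Number of bases shared by the upstream ORF and the downstream ORF."""
    return max(0, orf_end(seq, upstream_start) - downstream_start)


# Hypothetical toy operon: upstream ORF ATG GCT AAA TGA, downstream ORF
# ATG ACC GGT TAA, with the downstream ATG starting inside "AAATGA".
toy = "ATGGCTAAATGACCGGTTAA"
upstream_start, downstream_start = 0, 8

print(overlap_bases(toy, upstream_start, downstream_start))
```

In this toy case the two reading frames share the ATGA motif, so separating them into tandem, non-overlapping ORFs would require duplicating those bases and supplying the downstream gene with its own RBS.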
To improve expression levels of elongases and cyclases in E. coli, we redesigned our artificial operons (Fig. 2b) by (1) separating the ORFs to yield tandem and distinct non-overlapping reading frames, and (2) providing stronger RBSs, designed using the RBS calculator to have RBS strengths between 1,000 and 5,000 (Table 1). This RBS strength range was chosen because it was previously confirmed to be optimal for expressing various carotenoid biosynthetic genes in the plasmid backbones we use in this study 17,21 (i.e., p15A, pUC). When expressed with (C40-) lycopene pathways, all of these new constructs with engineered RBSs led to improved production of natural C50 carotenoids and significant reductions in unconverted lycopene contents (Fig. 2b).
Production of nonnatural cyclic C60 carotenoids. Following the RBS engineering of elongases and cyclases, we cotransformed the new constructs into the strains that selectively synthesize C50-phytoene (Fig. 3). From the cells expressing M. luteus crtE2YgYh, we observed novel peaks 7 and 9 in HPLC chromatograms, which eluted earlier than C50-lycopene (6) (Fig. 3a,b). Peak 9 exhibited an absorbance maximum at 496 nm, shorter than that of C50-lycopene (6, 512 nm). The m/z value (837) of peak 9 matched that of C60-sarcinaxanthin (C60H84O2, Fig. 1), and the carotenoid from peak 9 had absorption maxima at 460, 494, and 527 nm (Fig. 3h). Furthermore, analyses using positive ion electrospray ionization time-of-flight mass spectrometry (ESI TOF MS) at m/z 837.6542 MH+ indicated a molecular formula for this carotenoid of C60H84O2 (calcd. for C60H85O2, 837.6550) (Supplementary Fig. 2). The structure of this carotenoid was determined as C60-sarcinaxanthin using nuclear magnetic resonance (1H-NMR) with correlation spectroscopy (COSY) and rotating frame Overhauser effect spectroscopy (ROESY) (Table 2).
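The agreement between the observed and calculated masses for C60H85O2 can be checked with a short sketch. It assumes standard monoisotopic masses for C, H, and O and sums them without subtracting the electron mass, a choice made here only because it reproduces the calculated value of 837.6550 quoted above. The helper names `formula_mass` and `ppm_error` are hypothetical.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes; standard values.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}


def formula_mass(formula: str) -> float:
    """Sum monoisotopic masses for a simple formula such as 'C60H85O2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ppm_error(observed: float, calculated: float) -> float:
    """Relative mass error in parts per million."""
    return (observed - calculated) / calculated * 1e6


calculated = formula_mass("C60H85O2")   # protonated C60H84O2
observed = 837.6542                     # value reported in the text

print(f"calculated: {calculated:.4f}")
print(f"error: {ppm_error(observed, calculated):.2f} ppm")
```

Under these assumptions the observed and calculated values differ by less than 1 ppm, which is consistent with the assigned formula.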
In addition, we identified peak 7 as the C60 pathway intermediate
Expression of M. luteus crtE2 alone (without crtYgYh) resulted in a new peak 8 with an m/z value of 838 and an absorbance maximum of 513 nm (Fig. 3g). The carotenoid from peak 8 showed absorption maxima at 460, 494, and 527 nm (Fig. 3h) Fig. 3) for this carotenoid, and structural analyses using 1H-NMR with COSY and ROESY revealed the carotenoid C60-flavuxanthin (Table 2).
We did not observe any novel carotenoid peaks in experiments using elongase and cyclase genes from C. glutamicum, C. efficiens, or Dietzia sp. CQ4 (Fig. 3c-e). Hence, the lycopene elongases (CrtEb in Corynebacterium and LbtC in Dietzia sp. CQ4) at least fail to convert C50-lycopene in E. coli. However, the activities of the cyclases (CrtYeYf from Corynebacterium and LbtAB from Dietzia sp. CQ4) were still unknown, since the first elongase step failed to provide the substrate (C60-flavuxanthin or C60-C.p.496) for these cyclase enzymes.
Because the Corynebacterium and M. luteus pathways share the intermediate flavuxanthin, we generated a chimeric operon containing M. luteus crtE2 and C. efficiens crtYeYf. Following expression in cells, we detected three carotenoid peaks (9, 10, and 11; Fig. 3f) with identical absorption spectra (Fig. 3h), indicating the presence of the same chromophore. Although we could not obtain NMR spectra of the compounds from these chromatographic peaks, their absorption spectra, m/z values (838), and retention times strongly indicate that peaks 9, 10, and 11 were C60-sarcinaxanthin (as shown in Fig. 3b), C60-sarprenoxanthin, and C60-decaprenoxanthin, respectively. These results also indicate that the cyclases (crtYeYf) from Corynebacterium are functional in the C50-to-C60 pathway.

Discussion
In this study, we demonstrated that carotenoid elongation and cyclization enzymes from natural C50 carotenoid pathways metabolize C50-lycopene to novel nonnatural C60 carotenoids. These carotenoids (C60-flavuxanthin, C60-sarcinaxanthin, C60-sarprenoxanthin, and C60-decaprenoxanthin) are larger than any known natural carotenoids, and their absorbance spectra are red-shifted by as much as 58 nm compared with the natural counterpart (440 nm vs. 496 nm). With rare γ-ring structures, these C60 carotenoids provide a unique set of accessible carotenoid structures with as yet unknown functions. Natural C50 carotenoids were previously discovered in the thicker membranes of halophilic bacteria, which survive in extreme hypersaline and low-temperature environments 13,14,16. The present nonnatural C60 carotenoids are interesting candidates for functional characterization in extremophiles. Such future studies may improve understanding of the roles of long-chain carotenoids in nature.
Through the coexpression of natural carotenoid enzymes, we have increased the number of laboratory-generated C50 carotenoid pathways. Many carotenoid enzymes from C40 and C30 pathways exhibit significant substrate promiscuity and accept a wide range of substrates with recognizable locally specific 22 structures. Consequently, these enzymes are active in nonnatural pathway contexts 22-25 without any mutations. However, there are increasing reports of carotenoid-modifying enzymes that exhibit unexpected selectivity against non-cognate but very similar substrates. For example, lycopene ε-cyclases from plants (LCYe) cyclize only one end of the acyclic substrate lycopene and leave the other end uncyclized 26. Hence, ε-cyclases likely possess mechanisms for avoiding cyclization of non-cognate substrates. Similarly, the β-carotene 15,15′ cleavage enzyme BCMO1 only accepts β-carotene (β,β-end) as a substrate and does not act on ε-carotene (ε,ε-end). However, after removal of the ε-end via 9′,10′ cleavage by BCMO2, BCMO1 precisely cleaves the 15,15′ bond and liberates retinal 27.
Figure 3. Lycopene elongases and C50 cyclases function in the C50-to-C60 pathway. (a-g) HPLC chromatograms of carotenoid extracts from E. coli cells expressing genes for C50-phytoene production with the indicated genes. The indicated peak numbers correspond with those in Fig. 1. Peaks labelled with asterisks correspond to unidentified non-carotenoid compounds. (h) Absorbance spectra of the indicated peaks.
While the elongases from Corynebacterium and Dietzia sp. CQ4 exhibited significant activity towards their natural substrate (C40-lycopene), they did not show any activity towards C50-lycopene. Considering that enzymes from both strains have significant activity in the C40-to-C50 context, it is unlikely that their inability to act on C50-lycopene reflects a lack of activity in E. coli. The more likely alternative is that some of these enzymes have higher substrate/size specificity and reject C50-lycopene as a substrate. Kim et al. showed that the Corynebacterium elongase CrtEb is functional in another non-cognate context and acts on the ψ-end of the C30 carotenoid 4,4′-diaponeurosporene 25. Given this unpredictability of promiscuous functions toward non-cognate substrates, studies of several accessible gene candidates are required to identify genes that perform intended nonnatural tasks. It would also be interesting to learn whether, and how, the Dietzia elongase and cyclase and the Corynebacterium elongase, which were found to be non-functional in the C50-to-C60 context in the present work, can acquire these new activities through mutation.

Methods
Strains and reagents. E. coli XL10-Gold cells were used for cloning, and XL1-Blue cells were used for carotenoid production. All enzymes were purchased from New England Biolabs. Lennox-LB Broth Base was purchased from Life Technologies; Bacto™ Yeast Extract and Bacto™ Tryptone were purchased from BD Biosciences; and all other chemicals and reagents were obtained from Nacalai Tesque (Kyoto, Japan). The antibiotics carbenicillin and chloramphenicol were used at 30 and 50 µg/mL, respectively.
Plasmid construction. pUCara-crtI N304P, a plasmid encoding a crtI N304P gene downstream of an arabinose-inducible araBAD promoter, was derived from a previous study 17. The plasmids pUCara-crtI N304P-crtYeYfEb Cg, pUCara-crtI N304P-crtYeYfEb Ce, pUCara-crtI N304P-crtE2YgYh Ml, and pUCara-crtI N304P-lbtABC CQ4 were constructed by amplifying genes for elongase and cyclase from various genomes using the primers listed in Supplementary Table 1 and cloning these into the ApaI/SpeI restriction site of pUCara-crtI N304P. The plasmids pAC-fds Y81A,V157A-crtM F26A,W38A,F233S and pAC-fds Y81M-crtM F26A,W38A were derived from the previous study 17. The fds and crtM variant genes on these plasmids were expressed constitutively under the lac promoter. Supplementary Table 2 shows the DNA sequences of the designed RBS constructs. These sequences were concatenated and were then inserted into the ApaI/SpeI restriction site of pUCara-crtI N304P without spacer sequences.
Culture conditions. Single colonies were inoculated into 2 mL of LB medium with antibiotics in culture tubes and were shaken at 37 °C for 16 h. Overnight cultures of 2 mL were diluted 100-fold into 40 mL of fresh Terrific Broth medium in 200 mL flasks and were then shaken at 200 rpm in an incubator at 30 °C. After 8 h, 0.2% (w/v) arabinose inducer was added and cells were cultured for an additional 40 h.
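The quantities implied by these culture conditions can be worked out with a small sketch. It assumes that "diluted 100-fold into 40 mL" means the final culture volume is approximately 40 mL, and that the arabinose percentage refers to the same 40 mL; both readings are assumptions, and the function names are hypothetical.

```python
def inoculum_volume_ml(final_volume_ml: float, fold_dilution: float) -> float:
    """Volume of overnight culture needed for a given fold dilution."""
    return final_volume_ml / fold_dilution


def solute_mass_g(percent_w_v: float, volume_ml: float) -> float:
    """Mass of solute for a % (w/v) concentration (grams per 100 mL)."""
    return percent_w_v / 100.0 * volume_ml


culture_ml = 40.0
inoculum = inoculum_volume_ml(culture_ml, 100)   # overnight culture to add
arabinose = solute_mass_g(0.2, culture_ml)       # inducer at 0.2% (w/v)
total_hours = 8 + 40                             # pre-induction + post-induction

print(f"inoculum: {inoculum} mL, arabinose: {arabinose} g, total: {total_hours} h")
```

Under these assumptions, each flask receives about 0.4 mL of overnight culture and 0.08 g of arabinose, and the total culture time of 48 h matches the culturing period stated in the Results.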
Product extraction and purification. Cell cultures were centrifuged at 3,270 × g for 15 min at 4 °C. Cell pellets were washed with 10 mL of 0.9% (w/v) NaCl aq and were then repelleted by centrifugation. Products were extracted by vigorously vortexing for 5 min in 10 mL of acetone containing 30 mg/L butylated hydroxytoluene. One-mL aliquots of hexane and 35-mL aliquots of 1% (w/v) NaCl aq were then added, and the samples were centrifuged at 3,250 × g for 15 min. After collecting the product-containing hexane phase, the solvent was evaporated in a vacuum concentrator. Extracts were finally dissolved in 15-50-μL aliquots of tetrahydrofuran:methanol (6:4) for further analysis.
ESI TOF MS spectra were acquired using a Waters Xevo G2S Q TOF mass spectrometer (Waters Corporation, Milford, CT, USA) equipped with an Acquity UPLC system, with scanning from m/z 100 to 1,500, a capillary voltage of 3.2 kV, a cone voltage of 40 eV, and a source temperature of 120 °C. Nitrogen was used as the nebulizing gas at a flow rate of 30 L/h. MS/MS spectra were measured using a quadrupole-TOF MS/MS instrument with argon as a collision gas at a collision energy of 30 V. 1H NMR (500 MHz) spectra, including COSY and ROESY spectra, were generated using a Varian UNITY INOVA 500 spectrometer in CDCl3 with tetramethylsilane as an internal standard.

Data Availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.