Genes encoding norcoclaurine synthase occur as tandem fusions in the Papaveraceae

Norcoclaurine synthase (NCS) catalyzes the enantioselective Pictet-Spengler condensation of dopamine and 4-hydroxyphenylacetaldehyde as the first step in benzylisoquinoline alkaloid (BIA) biosynthesis. NCS orthologs in available transcriptome databases were screened for variants that might improve the low yield of BIAs in engineered microorganisms. Databases for 21 BIA-producing species from four plant families yielded 33 assembled contigs with homology to characterized NCS genes. Predicted translation products generated from nine contigs consisted of two to five sequential repeats, each containing most of the sequence found in single-domain enzymes. Assembled contigs containing tandem domain repeats were detected only in members of the Papaveraceae family, including opium poppy (Papaver somniferum). Fourteen cDNAs were generated from 10 species, five of which encoded NCS orthologs with repeated domains. Functional analysis of corresponding recombinant proteins yielded six active NCS enzymes, including four containing either two, three or four repeated catalytic domains. Truncation of the first 25 N-terminal amino acids from the remaining polypeptides revealed two additional enzymes. Multiple catalytic domains correlated with a proportional increase in catalytic efficiency. Expression of NCS genes in Saccharomyces cereviseae also produced active enzymes. The metabolic conversion capacity of engineered yeast positively correlated with the number of repeated domains.


Results
Isolation and phylogeny of NCS candidates. BLASTx analysis of PhytoMetaSyn Project databases revealed 33 candidate genes encoding polypeptides displaying > 30% amino acid identity with NCS from either P. somniferum (PSONCS) or T. flavum (TFLNCS) (Fig. 2). Phylogenetic analysis showed a strong relationship among NCS candidates from the same family. Other previously characterized NCS enzymes from Papaver bracteatum 29 and Coptis japonica 30 shared considerable identity with those from P. somniferum (PSONCS) and T.flavum (TFLNCS), respectively. Amino acid sequences from P. bracteatum and C. japonica represented the most distant branches on the phylogenetic tree, with the 33 NCS candidates showing intermediate levels of amino acid sequence identity between these two extremes. Ten of 33 candidates, all within the Papaveraceae family, were represented in the assembled Illumina and 454 sequence databases as genes encoding fusion proteins containing two to five tandem repeats of complete catalytic domains, or partial domains with lengths between 70 and 170 amino acids (Fig. 3). The crystallographic structure for TFLNCS showed that Y 108 , E 110 , K 122 and D 141 dominate catalysis in the active site 14 . Amino acid sequence alignments (Supplemental Fig. 1) showed that the repeated domains in most fusion proteins each contained a full set of these four proposed catalytic residues. However, two fusion proteins (CMANCS1 and ECANCS1), both containing two partial domains, each possessed only one complete set of catalytic residues (Fig. 3). Similarly, the predicted ECANCS2 polypeptide contained five tandem repeats, but displayed only three sets of NCS catalytic residues (Fig. 3). Moreover, catalytic residues in CMANCS1, ECANCS1 and ECANCS2 were not always located in repeated regions of conserved domains. Expression of NCS candidates in Escherichia coli. Fourteen genes encoding NCS candidates were amplified from corresponding cDNA libraries. Eight of the genes encoded proteins with single NCS domains ( Fig. 2), whereas six genes encoded fusion proteins containing tandem repeats of two to four NCS domains (Figs 2 and 3). Amplification of the remaining three putative NCS fusion proteins (i.e. PBRNCS2, CMANCS2 and ECANCS2) was unsuccessful, potentially as the result of DNA sequence or assembly errors affecting annealing sites for the selected PCR primers (Table S1). Immunoblot analysis confirmed the production of soluble polypeptides in E. coli corresponding to 14 recombinant proteins, all of which showed empirical molecular weights in agreement with the theoretical size of predicted translation products (Supplemental Fig. 2A).
Using crude E. coli protein extracts, six of the fourteen soluble recombinant enzymes converted [ 14 C]-dopamine and 4-HPAA to [ 14 C]-norcoclaurine (Fig. 4A). One additional candidate, TFLNCS2, also showed a possible low level of activity. Negative control reactions containing only buffer and substrates, or crude soluble protein from E. coli harboring the empty pET29b vector did not yield detectable [ 14 C]-norcoclaurine (Fig. 4A). Truncation of the first 25 amino acids from eight NCS candidates that did not efficiently convert [ 14 C]-dopamine and 4-HPAA to [ 14 C]-norcoclaurine as nascent proteins (i.e. retaining a putative N-terminal signal peptide) yielded two additional enzymes with substantial activity, TFLNCS2 and XSINCS1 (Fig. 4B). Remarkably, the amino acid sequence identity among several of the active enzymes was less than 35%, but was as high as 90% between variants from closely related species (Supplemental Fig. 1). In contrast, some catalytically inactive proteins showed amino acid sequence identities of 94-96% compared with active enzymes.
Alignment of single-domain polypeptides, and the duplicated regions of multiple-domain proteins, revealed an absolute conservation of all four catalytic residues 14 , and several other amino acids, in all active NCS enzymes (Supplemental Fig. 3). Some precisely conserved residues in active enzymes were substituted in inactive proteins, notably CCHNCS5 and PBRNCS3. A schematic representation of the amino acid sequence alignment containing all tested single-domain proteins, together with the predicted translation products of additional NCS candidates, revealed three general architectural groups (Fig. 5). Several proteins (e.g. TFLNCS) included an N-terminal regional corresponding to a characterized signal peptide 29,31 , whereas this domain was lacking in an approximately equal number of other candidates (e.g. SCANCS1). A smaller number of proteins displayed an N-terminal extension beyond the putative signal peptide (e.g. CMANCS1). Among functionally tested proteins containing a full complement of conserved residues, polypeptides lacking the signal peptide domain were consistently active (Fig. 4). Proteins containing a putative signal peptide were generally inactive, but two of these could be substantially activated by cleavage of the first 25 amino acids of the nascent polypeptide. Notable exceptions were NDONCS3 and PSONCS1 8 , which were active without cleavage of the signal peptide. The retention of a signal peptide has been associated with thermodynamic destabilization, reducing enzyme activity and increasing protein aggregation 32 . Finally, proteins with an N-terminal extension beyond the signal peptide domain were consistently inactive.
Two active single-domain enzymes (i.e. NDONCS3 and TFLNCS2Δ 25) along with the four functional multi-domain proteins (i.e. CCHNCS2, SDINCS1, PBRNCS5 and PSONCS3) were selected for further characterization. Despite considerable effort to obtain soluble recombinant proteins, purified SDINCS1, PBRNCS5 and PSONCS3 could not be obtained in sufficient quantities to perform kinetic analyses. However, NDONCS3, TFLNCS2Δ 25 and CCHNCS2 were purified to near homogeneity (Supplemental Fig. 4), and were shown to display Michelis-Menten kinetics for dopamine at saturating concentration of 4-HPAA (Supplemental Fig. 5). In contrast, sigmoidal kinetics were observed for 4-HPAA at saturating concentrations of dopamine with a Hill coefficient of > 1 (Supplemental Fig. 5; Table 1). The K cat and K m values for dopamine and 4-HPAA of the two single-domain enzymes were similar, whereas the K cat values for both substrates of the double-domain fusion (i.e. CCHNCS2) were more than two-fold higher compared with those for NDONCS3 and TFLNCS2Δ 25 ( Table 1). The K m value for dopamine of CCHNCS2 was only marginally higher than those of the single-domain enzymes. However K m value for 4-HPAA was four-fold higher compared with the K m values for NDONCS3 and TFLNCS2Δ 25. The similarities and differences in kinetic parameters are notable in the context of the amino acid sequence identity among the three tested enzymes. NDONCS3 and TFLNCS2Δ 25 share 54% amino acid sequence identity, whereas each enzyme shares only 33-35% identity with the individual domains of CCHNCS2 (Supplemental Fig. 1).

Expression of NCS candidates in Saccharomyces cerevisiae.
Immunoblot analysis showed the production of soluble polypeptides in S. cerevisiae corresponding to all eight NCS variants and two previously reported enzymes (i.e. TFLNCSΔ 19 and PSONCS2) 7,8 (Supplemental Fig. 2B). All recombinant proteins displayed molecular weights in agreement with the size of predicted translation products, including two with double, one with triple, and one with quadruple NCS domains. All ten enzymes assayed in crude S. cerevisiae protein extract converted [ 14 C]-dopamine and 4-HPAA to [ 14 C]-norcoclaurine (Fig. 6A). Negative control reactions containing crude soluble proteins from S. cerevisiae harboring the empty expression vector did not yield detectable  14 . Active NCS enzymes are named in red, whereas tested, but functionally inactive NCS homologs are named in black. NCS homologs named in grey were not successfully isolated by RT-PCR and, thus, were not tested.

Performance of NCS variants in engineered Saccharomyces cerevisiae. Co-expression of MAO
and individual NCS variants in engineering yeast containing chromosomally intergrated DODC, 6OMT, CNMT and 4′ OMT2 genes established strains capable of converting exogenous L-DOPA to (S)-reticuline (Supplemental Fig. 6A). Twenty-four hours post-induction, the yield of (S)-reticuline was approximately 4 to 11 mg L −1 of culture yeast, with the highest and lowest levels produced by strains containing the gene encoding PBRNCS5 and CCHNCS2, respectively. Corresponding western blot analysis showed substantial differences in the abundance of each recombinant NCS variant, with the highest and lowest levels in strains expressing genes for NDONCS3 and PSONCS3, respectively (Supplemental Fig. 6B). Recombinant protein degradation was associated with all four repeated-domain variants, especially PBRNCS5 and PSONCS3. Normalization of (S)-reticuline yield with respect to the relative abundance of non-degraded NCS protein, with NDONCS3 set arbitrarily at a value of 1.0, showed that the catalytic performance of each variant positively correlated with the number of repeated domains (Fig. 7). When normalized to the relative abundance of NDONCS3, the four-domain PSONCS3 variant produced the highest level of (S)-reticuline. The three-domain PBRNCS5 and the two-domain proteins, SDINCS1 and CCHNCS2, showed progressively lower normalized production of (S)-reticuline, but all were higher than levels obtained with the single-domain enzymes TFLNCSΔ 19 and NDONCS3.

Discussion
A survey of NCS orthologs in available transcriptome databases of diverse BIA-producing plant species 28 was pursued on the basis that novel NCS variants might potentially enhance the relatively low yield of (S)-norcoclaurinederived alkaloids in engineered microorganisms [20][21][22] . An unexpected outcome of this survey was the detection of several genes encoding fusion proteins with tandem repeats consisting of 70-170 residues. In some fusions, the repeated domains contained highly conserved amino acids found in all active single-domain proteins, including four identified catalytic residues 14 . However, other fusions consisted of repeated domains that did not include complete sets of catalytic residues and were inactive. The natural occurrence of NCS fusions was confirmed by the amplification of cDNAs encoding proteins containing two, three or four catalytic domains. The occurrence of tandem repeats is not uncommon with approximate 14% of all proteins reported to feature duplicated domains 33 . and PSONCS2 were previously reported as active NCS enzymes 8 . The unpublished sequences for AmNCS1 and AmNCS2 were deposited in NCBI database (GenBank accession numbers ACJ76785.1 and ACJ76787, respectively). PBRNCS and PBRNCS1 were previously reported as putative NCS enzymes 42  The exclusive detection of contigs representing tandem domain repeats in the assembled 454 and Illumina transcriptome databases of all species in the Papaveraceae, and the absence of detected multiple-domain proteins in members of the related Berberidaceae, Menispermaceae or Ranunculaceae families, suggests that the occurrence of NCS fusions is taxonomically restricted. The current absence of genomic sequence data for plants in Ranunculales order precludes an unequivocal conclusion regarding the distribution or origin of NCS fusions. Moreover, the number of tandem repeats detected in each amplified cDNA was dependent on the selection of RT-PCR primers, the design of which was based on available assembled RNA-seq data. Nevertheless, several important conclusions can still be drawn. First, NCS unequivocally occurs as fusions of two or more tandem domains in most, if not all, members of the Papaveraceae. Second, the differential arrangement of repeated  -O-methyltransferase 2 (4'OMT2) were stably integrated into the yeast genome. Genes encoding a monoamine oxidase (MAO) and the NCS variants were transiently expressed from a pESC vector. The level of (S)-reticuline produced by each strain was measured in the culture medium. The relative abundance of each NCS variant was determined by performing immunoblot analysis on total protein extracted from a standard volume of each culture adjusted to consistent cellular density. (S)-Reticuline production was normalized to the level of NCS in TFLNCSΔ 19, which was arbitrarily set at a value of 1. Values represent the mean ± standard deviation of 4 independent replicates.
Scientific RepoRts | 6:39256 | DOI: 10.1038/srep39256 domains and the variable length of intervening regions between the conserved catalytic units suggest an independent evolutionary origin for each fusion. Third, fusions consisting of two, three and four repeated domains were confirmed to possess NCS activity.
NCS is the second known example of an enzyme fusion involved in BIA metabolism. Recently, fusion between a cytochrome P450 (CYP) monooxygenase and an NADPH-dependent aldo-keto reductase (AKR) was shown to catalyze the S to R stereochemical inversion of reticuline as the gateway reaction to morphinan alkaloid biosynthesis in opium poppy 34,35 . The CYP-AKR fusion was also detected in Persian poppy (Papaver bracteatum), which accumulates the morphinan alkaloid thebaine. Mechanistically, the unstable iminium ion intermediate formed during the epimerization of reticuline would be prone to tautomerization, and furthermore could be oxidized to undesirable byproducts. The fusion of CYP and AKR could ensure protection of the cellular environment against reactive intermediates. However, the CYP-AKR fusion is inherently different compared with NCS fusions in that it combines two enzymes catalyzing independent conversions, whereas repeated NCS domains all catalyze the same reaction.
NCS fusions might have arisen as a mechanism to enhance the efficiency of the enzyme, as suggested from kinetic analysis of variants with single and double catalytic domains (Table 1; Supplemental Figure 4). Despite playing a pivotal role in BIA biosynthesis, NCS is a relatively inefficient catalyst, which has been suggested as a salient feature considering its role as a 'gatekeeper' enzyme responsible for regulating entry into alkaloid metabolism 26 . However, NCS must still generate enough (S)-norcoclaurine to support the production of abundant alkaloids, such as morphine, noscapine, and papaverine in opium poppy. One potential mechanism to increase enzyme function is the fusion of multiple catalytic domains, each of which exhibits 'gatekeeper' qualities such as low substrate affinity (i.e. high K m ), but collectively operate as a single polypeptide with increased catalytic efficiency (i.e. high K cat ) proportional to the number of sequentially repeated domains. Kinetic data were measured using CD spectroscopy to detect the chirality of the norcoclaurine reaction product. A racemic mixture of norcoclaurine enantiomers produced non-enzymatically 30 does not display a CD signal; thus, any spontaneous background condensation of dopamine and 4-HPAA will not affect kinetic values determined by CD spectroscopy. The measured K m values for dopamine and HPAA are consistent with our previous results 7,10 , but are substantially divergent from those in a recent report highlighting the inefficiency of NCS and potential solutions for metabolic engineering applications 26 . Comparison of the 4-HPAA and dopamine K cat values for CCHNCS2, containing two fused NCS domains, with those of NDONCS3 and TFLNCSΔ 25, each possessing only single domains, is also in agreement with the proposed proportional increase in the catalytic efficiency of NCS fusions.
The crystallographic structure of TFLNCS showed that the entrance to the catalytic cavity is formed by Y 108 , Y 131 , Y 139 and E 103 14 . Sequence alignment of single-domain and individual repeated regions of multiple-domain proteins (Supplemental Fig. 3) showed the absolute conservation of Y 108 and Y139 in all fourteen NCS polypeptides. However, Y 131 was replaced by F 136 in XSINCS1, whereas E 103 was substituted with Q 98 in XSINCS1 and V 95 in NDONCS3, and by alanine in five other active enzymes, indicating some flexibility with respect to the residues that form the opening to the catalytic tunnel. The four proposed catalytic residues in TFLNCS (i.e. Y 108 , E 110 , K 122 and D 141 ) were absolutely conserved in all active NCS peptides reported in this work. Conservation of additional residues in active NCS variants might provide an extension to the salient features in PR10/Bet v1 proteins associated with NCS activity. For example, despite relaxed substrate specificity for the native substrate 4-HPAA, TFLNCS in unable to convert various α -substituted aldehyde analogs owing to the distal proximity of I 143 to the α carbon of 4-HPAA 16 . Interestingly, I 143 is conserved in all active single-and multiple-domain variants of NCS (Supplemental Fig. 3), suggesting that I 143 could be a target to engineer broader substrate specificity for aldehyde substrates.
Genes encoding NCS appear to have been the target for substantial and potentially independent duplication in the genomes of BIA-producing plants. In the Papaveraceae, gene duplication has apparently resulted in the emergence of active fusion proteins with higher enzyme efficiency proportional with the number of repeated catalytic domains. However, multiple-domain fusions of NCS could represent an evolutionary mechanism to enhance the catalytic efficiency of an enzyme that is potentially recalcitrant to improved catalytic performance via protein engineering. The low amino acid sequence identity among active enzymes from different plant families suggests that natural genetic variation has not resulted in a substantial improvement in the kinetic parameters of NCS. A marked improvement in catalytic efficiency among three enzymes (i.e. NDONCS3, TFLNCS2Δ 25, and CCHNCS2) appears to correlate only with the number of fused catalytic domains ( Table 1). The common occurrence of NCS orthologs encoding inactive variants, including fusions lacking key catalytic domains, suggests an error-prone gene duplication process or low selection pressure to maintain the functional integrity of gene family members. A wide range in ploidy levels among members of the Papaveraceae might also account for the large variation in NCS sequences.
One initial objective of this study was the isolation of superior NCS variants as alternatives to the enzymes (e.g. TFLNCSΔ 19 and PSONCS2) previously used in synthetic biology applications [20][21][22] . Interestingly, TFLNCSΔ 19 showed low activity in yeast whereas all other NCS variants yielded consistently higher levels of [ 14 C]-norcoclaurine in soluble yeast protein extracts (Fig. 6). Variations in NCS activity could result from differences in transcript or protein abundance, or from one of several forms of post-translational modification in yeast. The discovery of enzyme variants with improved functional characteristics in heterologous hosts is generally an empirical process. However, the availability of correlative information linking amino acid sequence, protein architecture and functional performance could substantially streamline the selection of candidates.
A positive correlation between the number of repeated domains and the apparent catalytic performance of NCS has important implications for the engineering of BIA metabolism in microorganisms. With a relatively high K m for both dopamine and 4-HPAA, and potential limitations in cellular substrate concentrations owing to the chemical reactivity and possible toxicity of these compounds, strategies to increase BIA production in yeast should benefit from an increase in the catalytic efficiency of NCS. The natural fusion of multiple catalytic domains as an apparent evolutionary adaptation to enhance (S)-norcoclaurine formation in plants could be deployed for a similar purpose in microorganisms. The ability of NCS variants with repeated catalytic domains to facilitate the biosynthesis of (S)-reticuline in yeast cultures with yields similar to those obtained using single-domain proteins, but in some cases with substantially reduced cellular levels of NCS, is in agreement with the progressively higher catalytic efficiencies associated with the number of fused enzymatic subunits. Increasing the steady-state abundance of multiple-domain NCS variants in engineered yeast could increase the reported low yield of (S)-reticuline in yeast 20,21 and, subsequently, facilitate the improved production of derived metabolites such as morphine 23  Selection of NCS candidates. Full-length NCS candidate genes were identified through BLASTx analysis of PhytoMetaSyn Project databases using Papaver somniferum NCS1 (PSONCS1) and Thalictrum flavum NCS (TFLNCS) as query sequences. Repetitive patterns were calculated using the RADAR (Rapid Automatic Detection and Alignment of Repeats, www.ebi.ac.uk/Tools/pfa/radar) algorithm 37 . First-strand cDNA synthesis was performed on total RNA isolated from each plant species using Moloney murine leukemia virus reverse transcriptase and an oligo-dT primer. Double-strand cDNAs encoding full-length NCS candidates were amplified by PCR (30 cycles each consisting of 94 °C for 30 sec, 52 °C for 30 sec, and 72 °C for 2 min) using specific forward and reverse primers (Table S1). PCR reactions contained dNTPs at a concentration of 0.3 mM each, 0.3 mM of each primer, 50 ng of template DNA, 5 x KAPA Hifi reaction buffer, and KAPA Hifi DNA polymerase (Kapa Biosystems). Amplicons were cloned in the pGEM-T easy vector and used as a template for subsequent PCR under the same conditions. To clone the full-length coding region of native NCS candidate genes into the pET29b expression vector, primers were designed to include flanking 5′ HindIII and 3′ BamHI or XhoI restriction sites (Table S2). A synthetic gene encoding full-length SDINCS1 was codon-optimized for expression in Escherichia coli. Truncated genes encoding NCS candidates lacking the first 25 amino acids of the nascent protein were generated using forward primers annealing to sequences 75 nucleotides from the native transcription start sites (Table S3). Amplicons were inserted into between HindIII and BamHI or XhoI restriction sites of pET29b using T4 DNA ligase (Invitrogen). Ligations were used to transform E. coli strains BL21 pLysS or ER2566 pLysS.

Expression of NCS candidates in Escherichia coli.
Fresh overnight E. coli cultures harboring pET29b expression constructs encoding full-length or truncated NCS candidate enzymes were used to inoculate 250 mL of lysogeny broth (LB) medium containing kanamycin (50 μ g/mL) and chloroamphenicol (35 μ g/mL). Cultures were grown at 37 °C to an A 600 of 0.5 with shaking. Subsequently, isopropyl β -D-thiogalactoside (IPTG) was added to a final concentration of 0.5 mM. Culture growth was continued for 4.5 h at 37 °C. For the production of soluble PSONCS3, cultures were induced with 0.5 mM IPTG overnight at 4 °C. Bacteria were harvested by centrifugation at 8000 × g for 10 min and re-suspended in cold 20 mM Tris, pH 7.5, 100 mM KCl, 10% (v/v) glycerol. Cells were disrupted by sonication, and proteins were separated into soluble and insoluble fractions by centrifugation.
Expression of NCS candidates in Saccharomyces cerevisiae. The synthetic SDINCS1 gene included a C-terminal His 6 -tag and was flanked by NotI and SacI restriction sites for direct insertion into the pESC-leu2d yeast expression vector (Agilent). C-terminal His 6 -tags were fused to other NCS candidates by re-amplifying NCS gene candidates by PCR using reverse primers that included sequences encoding the His 6 -tag (Table S4). Amplicons were ligated into pESC-leu2d using NotI and BglII, NotI and SpeI, SpeI and PacI, or NotI and SacI restriction enzyme sites, and the resulting yeast expression vectors were used to transform Saccharomyces cerevisiae strain YPH 499 38 . A single transformed yeast colony was used to inoculate 2 mL of synthetic complete (SC) medium lacking leucine, but containing 2% (w/v) glucose, and grown overnight at 30 °C and 200 rpm. A flask containing 50 mL of SC medium lacking leucine, but containing 1.8% (w/v) galactose, 0.2% (w/v) glucose and 0.1% (w/v) raffinose, was inoculated with 1 mL of the overnight culture and grown at 30 °C and 200 rpm for approximately 55 h. Yeast cells were collected by centrifugation and suspended in 3 mL of 50 mM phosphate buffer, pH 7.3. Cells were lysed by sonication, cell debris was removed by centrifugation at 20,000 × g at 4 °C for 30 min, and the supernatant was used for enzyme assays.

SDS-PAGE and immunoblot analysis.
Recombinant proteins were fractionated by SDS-PAGE using a 12% (w/v) polyacrylamide gel. Immunoblot analysis was performed using an anti-His 6 antibody and chemoluminescent detection with SuperSignal West Pico substrate (Thermo Scientific). Blots were developed on Kodak XAR film. NCS enzyme assays. NCS enzyme assays were performed as described previously 8 . Briefly, reaction mixtures containing recombinant protein, 1 nmol [8-14 C]-dopamine and 10 nmol 4-HPAA were incubated for 1.5 h at 37 °C. Reactions were subsequently spotted onto silica gel 60 F 254 TLC plates and developed in n-butanol:acetic acid:water (4:1:5, v/v/v). Autoradiogram visualization and analysis was performed using a Molecular Imager PharosFX system and Quantity One software (Bio-Rad).
Purification of NCS candidate proteins from Escherichia coli. NDONCS3, TFLNCS2Δ 25, CCHNCS1 ligated into the NdeI and XhoI sites of pET29b vector were transformed into the E. coli strain Rosetta (DE3). A fresh overnight culture was used to inoculate 250 mL of LB medium containing kanamycin (50 μ g/mL) and chloroamphenicol (35 μ g/mL). The culture was grown at 37 °C to an A 600 of 0.5 with shaking. IPTG was subsequently added to a final concentration of 0.5 mM and cultures were grown for an additional 15-17 h at room temperature. Cells were harvested by centrifugation at 11,300 × g for 20 min at 4 °C. All purification steps were performed at 0-4 °C. Approximately 2-3 g of the cellular pellet was suspended in 40 mL of buffer A (50 mM potassium phosphate, pH 7.5, 50 mM NaCl, 5 mM imidazole) containing lysozyme (1 mg/mL) and DNase I (5 μ g/mL). After incubation for 15 min at room temperature and an additional 20 min on ice, the cell suspension was lysed by sonication. Cellular debris was removed by centrifugation at 20,000 × g for 30 min at 4 °C. The supernatant was mixed with approximately 1 mL of TALON metal affinity resin (Clontech), shaken gently for 20 min and loaded onto a fritted column (1 × 15 cm). The flow through was removed and 30 mL of buffer A was used to wash out weakly bound contaminant proteins. The target protein was recovered by consecutive elution using 10 mL of buffer A containing 10-100 mM imidazole at a gravity-controlled flow rate. Fractions containing pure protein, as determined by SDS-PAGE, were collected, dialyzed against phosphate buffer (50 mM sodium phosphate, pH 7.5) and stored at − 20 °C. Protein concentration was determined by absorbance at 280 nm.
Kinetic analysis of purified NCS enzymes. Enzyme kinetic data was obtained by monitoring the formation of (S)-norcoclaurine using circular dichroism (CD) spectroscopy 10 . The data was processed in the program GraphPad prism. Briefly, the experiments were performed on a Jacsco J-715 spectropolarimeter (JASCO Corporation), by monitoring the increase of ellipticity at 285 nm, which corresponds to the formation of (S)-norcoclaurine. All the reactions were monitoring in a 0.1 cm quartz cell. Molar ellipticity of 12,541 mdeg cm −1 M −1 was used to calculate the concentration of (S)-norcoclaurine 10  Phylogenetic analysis. Sequence editing, sequence alignments and a phylogenetic tree obtained from PHYML analyses were performed with the Geneious Pro 5.6 software package (Biomatters). ClustalW2 was used to perform multiple sequence alignments of the amino acid sequences associated with NCS candidates and query sequences 39 . A consensus tree was generated from 100 samples to obtain bootstrap values.
Engineered Saccharamyces cerevisiae culture and substrate feeding. Four independent colonies of each yeast strain transformed with pEV-MAO constructs containing NCS variants were used to inoculate individual wells of a 96-well plate, with each well containing 600 μ L of SD-His-Ura-Leu medium supplemented with 2% (w/v) glucose. Yeast cultures were grown overnight at 30 °C on a gyratory microplate shaker at 900 rpm. Subsequently, 30 μ L of the original culture was transferred to 600 μ L of SD-His-Ura-Leu containing 2% (w/v) galactose and 1 mM L-DOPA and cultures were grown at 30 °C for 24 h on a gyratory microplate shaker at 900 rpm. For analysis, 5 μ L of the supernatant from each culture were subjected to high-resolution mass spectrometry analysis. Yeast cultures were adjusted to a consistent cellular density and volume, using fresh culture medium when required, and cells were collected directly in the 96-well culture plate by centrifugation. Pelleted cells were re-suspended in SDS-PAGE loading buffer and boiled for 5 min and 5 μ L of lysate were subjected to immunoblot analysis using an anti-His 6 antibody and chemoluminescent detection with SuperSignal West Pico substrate (Thermo Scientific). Blots were imaged using a GE Amersham Imager 600 (GE Healthcare Life