Biosynthesis of strychnine

Strychnine is a natural product that, through isolation, structural elucidation and synthetic efforts, shaped the field of organic chemistry. Currently, strychnine is used as a pesticide to control rodents 1 because of its potent neurotoxicity 2,3. The polycyclic architecture of strychnine has inspired chemists to develop new synthetic transformations and strategies to access this molecular scaffold 4, yet it is still unknown how plants create this complex structure. Here we report the biosynthetic pathway of strychnine, along with the related molecules brucine and diaboline. Moreover, we successfully recapitulate strychnine, brucine and diaboline biosynthesis in Nicotiana benthamiana from an upstream intermediate, thus demonstrating that this complex, pharmacologically active class of compounds can now be harnessed through metabolic engineering approaches.

Strychnine, a complex monoterpene indole alkaloid, was isolated in 1818 from the seeds of Strychnos nux-vomica (poison nuts) 5, which were used in traditional medicine in China and South Asia. Currently, strychnine is used as a pesticide 1 because of its neurotoxicity, which is mediated by high-affinity binding to the glycine receptor 2,3. Approximately 130 years after its isolation, the structure of strychnine was independently elucidated by Robinson in 1946 (refs. 6,7) and Woodward in 1947 (ref. 8). Robinson noted that 'for its molecular size, it is the most complex substance known' 9. For two centuries, strychnine has had a large role in the field of chemistry through its isolation, structural elucidation and synthesis (Supplementary Fig. 1). Its polycyclic architecture inspired chemists to develop new synthetic transformations and strategies, and ultimately led to a number of total syntheses 4 since the first seminal total synthesis in 1954 (ref. 10). Surprisingly, it is still unknown how plants create this complex structure. Here we report the biosynthetic pathways of strychnine, brucine and diaboline.
A partial biosynthetic hypothesis of strychnine was proposed in 1948 (ref. 11), which was substantiated by feeding studies of radioisotope-labelled substrates in S. nux-vomica 12-15. These labelling studies demonstrated that, like all monoterpene indole alkaloids, strychnine 10 originates from tryptophan and geranyl pyrophosphate 13. These starting materials are converted first to geissoschizine 1 and then, through a series of unknown steps, to Wieland-Gumlich aldehyde 6, the two central intermediates (refs. 14,15 and Fig. 1; see Supplementary Fig. 2 for full biosynthetic hypothesis). Wieland-Gumlich aldehyde 6 has been proposed to be converted to strychnine 10 through the incorporation of acetate to form the piperidone moiety, although the mechanism of acetate incorporation and ring cyclization has remained unclear 12,13 (ring G in Fig. 1; see Supplementary Fig. 3 for carbon and ring annotations). Subsequent hydroxylations and methylations of strychnine 10 would yield brucine 15 (ref. 16 and Fig. 1).
To identify strychnine biosynthetic genes, we selected two members of the Strychnos genus (family: Loganiaceae): S. nux-vomica 17, a known producer of strychnine 10, and Strychnos sp. 18, a non-producer. Metabolic analysis of S. nux-vomica revealed the presence of several strychnos alkaloids, including strychnine 10, isostrychnine 11, β-colubrine 13 and brucine 15, all of which accumulate in the roots (Supplementary Fig. 4). These alkaloids were absent in the non-producer, although a biosynthetically related compound, the strychnos alkaloid diaboline 8, was detected in its roots and stems (Supplementary Fig. 5). We generated tissue-specific RNA-sequencing data from these two plants to enable gene discovery.
The biosynthetic pathway of geissoschizine 1 from tryptophan and geranyl pyrophosphate has been completely elucidated in the phylogenetically related plant Catharanthus roseus (family: Apocynaceae) (see Supplementary Fig. 6 for the phylogenetic relationship of C. roseus and S. nux-vomica). C. roseus produces monoterpene indole alkaloids unrelated to strychnine 19. A homologue for each biosynthetic gene in the geissoschizine 1 pathway was readily identified in the S. nux-vomica transcriptome, suggesting that the biosynthetic pathway of geissoschizine 1 is conserved in C. roseus and S. nux-vomica. These genes are all expressed preferentially in S. nux-vomica roots (Supplementary Fig. 7), consistent with previous feeding studies that suggest strychnine 10 biosynthesis occurs primarily in the roots 12,13. Candidate genes for subsequent steps were selected according to three criteria: (1) high expression in the roots of S. nux-vomica (fragments per kilobase of transcript per million mapped reads (FPKM) ≥ 20); (2) co-expression with putative upstream genes; and (3) predicted catalytic functions of the encoded proteins that are consistent with the chemical logic of our hypothesized biosynthetic pathway (Fig. 2).
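The first two selection criteria can be illustrated with a minimal sketch. It assumes each transcript has an expression profile (FPKM per tissue) and that co-expression is scored as a Pearson correlation against a bait gene, as in the SnvGO analyses described later (r ≥ 0.95). The gene names and expression values in `demo_expression` are hypothetical, the function names are illustrative, and criterion (3), functional annotation, is not modelled; this is not the authors' actual pipeline.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def select_candidates(expression, bait, root_index, min_fpkm=20.0, min_r=0.95):
    """Keep genes with root FPKM >= min_fpkm and Pearson r >= min_r to the bait."""
    bait_profile = expression[bait]
    hits = []
    for gene, profile in expression.items():
        if gene == bait:
            continue
        if profile[root_index] < min_fpkm:
            continue  # criterion (1): high expression in roots
        r = pearson(profile, bait_profile)
        if r >= min_r:  # criterion (2): co-expression with the bait
            hits.append((gene, r))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical FPKM values; tissue order is [root, stem, leaf].
demo_expression = {
    "bait": [120.0, 10.0, 5.0],
    "geneA": [240.0, 20.0, 10.0],  # root-enriched, tracks the bait
    "geneB": [5.0, 50.0, 100.0],   # low in roots
    "geneC": [30.0, 30.0, 30.0],   # expressed in roots but flat
    "geneD": [25.0, 100.0, 5.0],   # stem-enriched
}

demo_hits = [gene for gene, _ in select_candidates(demo_expression, "bait", root_index=0)]
```

With the toy profiles above, only `geneA` passes both filters.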
The chemical steps for transformation of geissoschizine 1 to Wieland-Gumlich aldehyde 6 are not known. However, given the structural similarity between the Wieland-Gumlich aldehyde 6 and the known early alkaloid intermediate dehydropreakuammicine 2 (ref. 19 and Fig. 3a), chemical logic suggests that Wieland-Gumlich aldehyde 6 could form from dehydropreakuammicine 2 through ester hydrolysis, decarboxylation, oxidation and reduction (Supplementary Fig. 2). If this hypothesis is correct, S. nux-vomica should contain a homologue of geissoschizine oxidase, which has also been isolated from C. roseus (CrGO). In vitro, CrGO converts geissoschizine 1 to akuammicine 3, presumably through the spontaneous deformylation of dehydropreakuammicine 2 (ref. 19, Fig. 3a and Supplementary Fig. 2). A BLAST search using CrGO as the query against the S. nux-vomica transcriptome identified one hit (transcript cluster 4032.29856; CYP71AY6) with 46% amino-acid sequence identity (Supplementary Fig. 8) that showed an expression profile similar to those of upstream biosynthetic gene candidates (Fig. 2a). We expressed this gene in N. benthamiana leaves through Agrobacterium tumefaciens-mediated transient expression followed by infiltration of geissoschizine 1. Liquid chromatography-mass spectrometry analysis of leaf extracts revealed the deformylation product of dehydropreakuammicine, akuammicine 3 (Fig. 3b and Extended Data Fig. 1). Therefore, cluster 4032.29856 was named SnvGO.
Because it is known that decarboxylation of a methyl ester can be triggered by ester hydrolysis 20, we speculated that an α/β hydrolase 20,21 would hydrolyse the ester moiety of dehydropreakuammicine 2 and therefore lead to decarboxylation before spontaneous deformylation to akuammicine 3 occurs. This would result in the formation of the strychnos alkaloid norfluorocurarine 4 (Fig. 3a and Supplementary Fig. 2). On the basis of a co-expression analysis using SnvGO as bait, we initially selected five α/β hydrolases (r ≥ 0.95, Pearson correlation coefficient) for functional characterization (Fig. 2b). Each was tested in N. benthamiana along with SnvGO and geissoschizine 1 as substrate. Two of these candidates (clusters 4032.2064 and 4032.2781) led to the production of norfluorocurarine 4, along with substantially decreased levels of the deformylation product akuammicine 3 (Fig. 3b and Extended Data Fig. 1). Therefore, we named these two α/β hydrolases norfluorocurarine synthase 1 and 2 (SnvNS1 and SnvNS2). SnvNS1 and SnvNS2 share 74% identity at the protein level and showed the same reactivity in the N. benthamiana transient-expression system. We used SnvNS1 in all subsequent experiments.
To convert norfluorocurarine 4 to Wieland-Gumlich aldehyde 6, a hydroxylase and a reductase are required to install the C18 hydroxyl group and reduce the 2,16 double bond, respectively (Fig. 3a). A total of five cytochrome P450 proteins 22 and four medium-chain dehydrogenase/reductases (MDRs) 23 that were co-expressed (r ≥ 0.95) (Fig. 2b) with SnvGO were initially considered, because these two protein families are often involved in alkaloid biosynthesis. Because the order of hydroxylation and reduction is unknown, combinatorial transient-expression experiments in N. benthamiana 24-26 were adopted. Simultaneous expression of all candidate cytochrome P450 proteins and MDRs in N. benthamiana leaves combined with SnvGO and SnvNS1 indeed resulted in the consumption of norfluorocurarine 4 and production of Wieland-Gumlich aldehyde 6 (Fig. 3a). Co-infiltration of one cytochrome P450 (cluster 4032.6332; CYP71A144) along with SnvGO, SnvNS1 and geissoschizine 1 in N. benthamiana leaves produced a hydroxylated product, 18-OH norfluorocurarine 5, that co-eluted with the synthetic standard (Fig. 3b and Extended Data Fig. 2). Intermediate 5 was consumed when one candidate MDR (cluster 4032.5004) was added to the co-infiltration experiments, and accumulation of Wieland-Gumlich aldehyde 6 was observed (Fig. 3b and Extended Data Fig. 3). Therefore, we named this cytochrome P450 norfluorocurarine oxidase (SnvNO) and the MDR Wieland-Gumlich aldehyde synthase (SnvWS). Notably, in planta and in vitro assays showed that SnvWS could reduce the 2,16 double bond in both norfluorocurarine 4 and 18-OH norfluorocurarine 5 (Fig. 3a, Extended Data Fig. 3 and Supplementary Fig. 9). Stereoselective reduction by SnvWS is probably initiated by the tautomerization of the enamine moiety in 4 and 5 through protonation at the α face, followed by NADPH reduction at the β face.
The subsequent spontaneous cyclization between the C18-OH and C16 aldehyde, possibly facilitated by the conformational flexibility of the reduced substrate, forms the hemiacetal in 6 (Supplementary Fig. 10). In vitro steady-state kinetics indicated that SnvWS had a higher catalytic efficiency with 5 than with 4 (kcat/Km = 0.297 min⁻¹ μM⁻¹ for 5 compared with 0.068 min⁻¹ μM⁻¹ for 4) (Supplementary Fig. 11). A model of SnvWS docked with 18-OH norfluorocurarine 5 suggests that Thr95 and Ser309 in SnvWS may hydrogen bond with the C18 hydroxyl group in 18-OH norfluorocurarine 5, providing an explanation for the differences in catalytic efficiency between norfluorocurarine 4 and 18-OH norfluorocurarine 5 (Supplementary Fig. 10). No candidate cytochrome P450, including SnvNO, could hydroxylate desoxy Wieland-Gumlich aldehyde 7, suggesting that the order of the reactions is first oxidation to form 18-OH norfluorocurarine 5, followed by reduction.
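As a worked illustration of the kinetic comparison above, the short sketch below computes the fold difference between the two reported catalytic efficiencies. The only inputs taken from the text are the two kcat/Km values. The function names are illustrative, and the low-substrate rate expression v ≈ (kcat/Km)[E][S] is the standard Michaelis-Menten approximation for [S] well below Km, not something stated in the paper.

```python
def fold_difference(efficiency_a, efficiency_b):
    """Ratio of two catalytic efficiencies (kcat/Km) given in the same units."""
    if efficiency_b <= 0:
        raise ValueError("reference efficiency must be positive")
    return efficiency_a / efficiency_b


def low_substrate_rate(kcat_over_km, enzyme_um, substrate_um):
    """Approximate rate (uM per min) when [S] << Km: v = (kcat/Km) * [E] * [S]."""
    return kcat_over_km * enzyme_um * substrate_um


# Values reported in the text, in min^-1 uM^-1.
KCAT_KM_18OH_NORFLUOROCURARINE = 0.297  # substrate 5
KCAT_KM_NORFLUOROCURARINE = 0.068       # substrate 4

fold = fold_difference(KCAT_KM_18OH_NORFLUOROCURARINE, KCAT_KM_NORFLUOROCURARINE)
print(f"SnvWS is about {fold:.1f}-fold more efficient with 5 than with 4")
```

The ratio comes out at roughly 4.4, which is the size of the preference that the proposed hydrogen bonds to the C18 hydroxyl group would need to account for.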
To complete the biosynthesis of strychnine 10 from Wieland-Gumlich aldehyde 6, a new piperidone ring containing two additional carbon atoms must be installed (ring G in Fig. 1). However, the intermediates or the reaction steps for this ring construction are not known; the only clue is that the additional two-carbon unit (C22 and C23) originates from [14C]acetate 12,13. To facilitate the discovery of these cryptic late biosynthetic steps, we compared the strychnine-producing and non-producing Strychnos plants. Metabolic analysis showed that the major alkaloid in the non-strychnine producer Strychnos sp. is diaboline 8 (Supplementary Fig. 5), a compound that is most likely derived from N-acetylation of Wieland-Gumlich aldehyde 6 (Fig. 3a). Therefore, we hypothesized that S. nux-vomica and Strychnos sp. should share the same biosynthetic pathway from geissoschizine 1 to Wieland-Gumlich aldehyde 6 (Fig. 2c). Indeed, a BLAST search against the non-producer transcriptome identified orthologues SpGO (CYP71AY7, 92% amino-acid identity to SnvGO), SpNS1 (92% amino-acid identity to SnvNS1), SpNS2 (88% amino-acid identity to SnvNS2), SpNO (CYP71A145, 91% amino-acid identity to SnvNO) and SpWS (93% amino-acid identity to SnvWS). To validate the function of these genes, we expressed them in two combinations (SpGO, SpNS1, SpNO and SpWS; and SpGO, SpNS2, SpNO and SpWS) in N. benthamiana leaves with co-infiltration of geissoschizine 1. Both combinations led to the formation of Wieland-Gumlich aldehyde 6 (Fig. 3b and Extended Data Fig. 4). The only remaining step for the biosynthesis of diaboline 8 is the acetylation of the indole amine (Fig. 3a), which in alkaloid biosynthesis is often catalysed by a BAHD acyltransferase using acetyl-CoA as an acyl donor 27. Four BAHD acyltransferase candidates were co-expressed with all five genes (r > 0.6) (Fig. 2d). Transient expression of one candidate (SpAT) with upstream genes generated diaboline 8 in N. benthamiana (Fig. 3b and Extended Data Fig. 4).
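The amino-acid identities quoted for the orthologues are percentages of matching residues in a pairwise alignment. The sketch below shows one common convention, identical positions divided by the number of aligned columns (gap columns included). The text does not state which convention or alignment tool was used beyond BLAST, so this is an assumption, and the two short aligned sequences in the usage note are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns of two pre-aligned sequences.

    Columns where both sequences carry a gap are skipped; a column with a gap
    in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

For example, the hypothetical aligned pair `MKT-LLVA` and `MKSALLIA` shares 5 identical residues over 8 columns, so `percent_identity("MKT-LLVA", "MKSALLIA")` gives 62.5.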
S. nux-vomica contains an orthologue (cluster 4032.2753; SnvAT) of SpAT (85% amino-acid identity to SpAT) that is highly expressed in the roots and showed high expression correlation with previously identified genes (r ≥ 0.99 with each gene) (Fig. 2a,b). However, S. nux-vomica does not produce diaboline 8, and previous feeding studies demonstrated that diaboline 8 is not a biosynthetic precursor of strychnine 10 (ref. 14). We surmised that SnvAT and SpAT may have distinct enzymatic activities, and indeed, simultaneous expression of SnvAT with SnvGO, SnvNS1, SnvNO and SnvWS, together with infiltration of geissoschizine 1, in N. benthamiana led to only trace levels of diaboline 8. However, a new compound with a mass corresponding to a malonylated product was detected in the leaf extracts, which suggested that SnvAT is a BAHD acyltransferase with predominantly malonyltransferase activity (Fig. 3b and Extended Data Fig. 5). Because the expression of this enzyme in N. benthamiana resulted in only the partial consumption of Wieland-Gumlich aldehyde 6, we hypothesized that the conversion might be limited by the low concentration of malonyl-CoA in N. benthamiana leaves. Therefore, we expressed these enzymes along with AAE13 (Arabidopsis thaliana), a cytosolic enzyme that produces malonyl-CoA accessible to cytosolic SnvAT 28 (Supplementary Fig. 12). The addition of AAE13 and co-infiltration of the co-substrate disodium malonate to the transient-expression system resulted in a tenfold increase in the production of malonylated product (Fig. 3b and Extended Data Fig. 5). During purification, this product rapidly decomposed, so we treated the crude methanolic extracts of N. benthamiana leaves with trimethylsilyldiazomethane to methylate the carboxylic acid, followed by aldehyde reduction with sodium borohydride. The derivatized products were confirmed by comparison to synthetic standards (Supplementary Fig. 13), indicating that the SnvAT product was N-malonyl Wieland-Gumlich aldehyde 9 (Fig. 3a).
Therefore, although SnvAT and SpAT share 85% amino-acid identity, they have distinct catalytic activities. Phylogenetic analysis showed that SnvAT clusters with SpAT in an acetyltransferase clade, which is evolutionarily distinct from the canonical malonyltransferase clade (Supplementary Fig. 14). Homology models of SnvAT and SpAT 29 (Supplementary Fig. 15) were used to identify one amino acid (SnvAT(R424F) and SpAT(F421R)) that controls the selectivity between acetyl and malonyl transferase activity (Supplementary Figs. 16 and 17). These models suggest that the arginine residue is responsible for the malonyl-CoA selectivity by forming a bidentate salt bridge with the carboxylate of malonyl-CoA 30,31 (Supplementary Fig. 18), providing a straightforward mechanistic explanation for the difference in alkaloid accumulation in these two plants. Notably, the 17-O-acylation product was predominant in in vitro assays at physiological pH (Supplementary Fig. 19), which may be because of changes in the protein activity in a non-cellular environment or differences in the equilibration of the open and closed forms of the Wieland-Gumlich aldehyde substrate.
Notably, trace amounts of strychnine 10 and isostrychnine 11 could be detected in the methanolic extracts of N. benthamiana leaves that produce malonylated Wieland-Gumlich aldehyde 9 (Fig. 3b and Extended Data Fig. 6). These two alkaloids accumulated and 9 decreased over time when the extracts were stored at room temperature (Supplementary Fig. 22). Indeed, most of 9 was converted to strychnine 10 and isostrychnine 11 in N. benthamiana leaves that were harvested 4 weeks after infiltrating the substrates (Fig. 3b and Extended Data Fig. 6). Incubating 9 with recombinant SnvAT or N. benthamiana crude protein extracts did not accelerate the conversion of 9 to 10 (Supplementary Fig. 23). These experiments suggest that conversion of 9 to strychnine 10 and isostrychnine 11 could occur spontaneously both in vitro and under physiological conditions. In addition, heating N. benthamiana leaves at 60 °C for 2 h substantially accelerated the conversion (Fig. 3b and Extended Data Fig. 6). We propose that 10 and 11 are formed through the decarboxylation of the β-keto acid moiety in 9 to form an α,β-unsaturated amide. Subsequent oxa-Michael addition by the C18 hydroxyl group would generate strychnine 10. The α,β-unsaturated amide can also tautomerize to the β,γ-unsaturated amide to form isostrychnine 11 (Supplementary Fig. 24).
Previous radioisotopic labelling studies indicated that a structurally uncharacterized biosynthetic intermediate could be converted to strychnine by warming the acid extracts from S. nux-vomica roots 14,15. The reported chemical properties of this intermediate 14,15, which was called prestrychnine (see Supplementary Fig. 2 for the previously proposed structure), are similar to those of 9. Therefore, we suggest that the proposed structure of prestrychnine be revised to 9. Notably, in this feeding study the levels of radioisotope-labelled prestrychnine were 9 times higher than those of strychnine 10 after 3 days of feeding of S. nux-vomica with [14C]tryptophan 14, suggesting that the conversion of prestrychnine to strychnine 10 is a slow process in S. nux-vomica. We screened numerous α/β hydrolases 21,32 and polyketide synthases 33, as well as members of these two families that are known to catalyse decarboxylation of β-keto acid functionalities, and we also screened numerous transporters that could transfer prestrychnine to the vacuole, where the acidic environment might accelerate the decarboxylation. However, none of these gene candidates accelerated the formation of strychnine 10 and isostrychnine 11. To establish whether conversion of prestrychnine to strychnine is a slow, non-enzymatic process in S. nux-vomica, we performed hydroponic feeding of deuterium-labelled Wieland-Gumlich aldehyde 6 to the roots of S. nux-vomica. Labelled prestrychnine 9 could be detected after 3 days, but trace amounts of strychnine 10 and isostrychnine 11 appeared only after 7 days (Extended Data Fig. 7). Collectively, these data are consistent with the previously published experiments 14,15 and with the rate of strychnine formation in our heterologous expression system. The fact that prestrychnine 9 is converted to strychnine 10 slowly in S. nux-vomica is consistent with a non-enzymatic process, although the involvement of an enzyme with only modest rate acceleration cannot be definitively ruled out.
Brucine 15, which is a dimethoxylated derivative of strychnine 10, is also highly accumulated in the roots of S. nux-vomica (Fig. 3a and Supplementary Fig. 4). To identify the hydroxylase, 12 full-length cytochrome P450 proteins that shared a relatively high co-expression correlation with SnvGO (Pearson's r > 0.7) were selected for subsequent tests (Supplementary Table 1). When one cytochrome P450 (cluster 4032.17050; CYP82D367) was expressed in the presence of strychnine 10 in N. benthamiana, 10-OH strychnine 12 was formed; this enzyme was named strychnine-10-hydroxylase (Snv10H) (Fig. 3c and Extended Data Fig. 8). The presence of β-colubrine 13 in S. nux-vomica suggests that the two methoxy groups are installed sequentially (Fig. 3a), so we next identified five methyltransferases 34 that were highly expressed in the roots of S. nux-vomica (Supplementary Table 2). Expression of one of the methyltransferases (cluster 4032.16453; SnvOMT) with Snv10H in N. benthamiana resulted in the formation of a compound corresponding to synthetic β-colubrine 13 (Fig. 3c and Extended Data Fig. 8). None of the aforementioned 12 co-expressed cytochrome P450 proteins catalysed the hydroxylation of β-colubrine 13, but the high accumulation of the final product brucine 15 in roots led us to identify all 13 other cytochrome P450 proteins that were strongly expressed (FPKM ≥ 20) in roots (Supplementary Table 1). Of these 13 proteins, we initially targeted the 3 within the CYP71 clade (Supplementary Fig. 25). One of these cytochrome P450 proteins (cluster 4032.16581; CYP71AH44, Snv11H), assayed in combination with strychnine, Snv10H and SnvOMT, produced brucine 15 as a major product along with trace amounts of the hydroxylated product 11-deMe brucine 14 (Fig. 3c and Extended Data Fig. 9). When we infiltrated synthetic β-colubrine 13 into tobacco leaves that expressed Snv11H alone, only 11-deMe brucine 14 was formed; brucine 15 was formed only in the presence of SnvOMT (Extended Data Fig. 9).
In vitro and in planta assays showed that SnvOMT could also methylate 11-OH strychnine 16 to α-colubrine 17 (Supplementary Fig. 26), and 10-deMe brucine 18 to brucine 15 (Supplementary Fig. 27), although with lower efficiency. Overall, these results highlight the promise for production of strychnos-type alkaloids using synthetic biology approaches, although substantial optimization of the heterologous host production system is required.
Having completed the pathway of brucine 15, we then reconstituted the pathway in N. benthamiana from geissoschizine 1. We transiently expressed all of the enzymes (SnvGO, SnvNS1, SnvNO, SnvWS, SnvAT, AAE13, Snv10H, SnvOMT and Snv11H) in tobacco leaves followed by infiltrating geissoschizine 1 and disodium malonate. When the tobacco leaves were harvested 1 week after infiltrating the substrates, the accumulation of strychnine 10, isostrychnine 11, β-colubrine 13 and brucine 15 was observed (Fig. 3d and Supplementary Fig. 28). Additionally, all of the intermediates in the pathway except for 11-deMe brucine 14 could be detected in the roots of S. nux-vomica (Fig. 3a and Extended Data Fig. 10), suggesting that the heterologously reconstituted pathway in N. benthamiana matches the physiologically relevant pathway in Strychnos plants.
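The reconstituted pathway can be summarized as a small directed graph of (substrate, catalyst, product) steps. The sketch below encodes the steps described in the text and traces the ordered list of catalysts between two compounds with a breadth-first search. It is an illustrative data structure only: the `PATHWAY` and `find_route` names are hypothetical, the step from geissoschizine 1 to dehydropreakuammicine 2 is assigned to SnvGO by inference from the text, the side product akuammicine 3 is omitted, and the diaboline branch is drawn from the S. nux-vomica intermediates for simplicity although SpAT acts in Strychnos sp.

```python
from collections import deque

# (substrate, catalyst, product); "spontaneous" marks the non-enzymatic step.
PATHWAY = [
    ("geissoschizine 1", "SnvGO", "dehydropreakuammicine 2"),
    ("dehydropreakuammicine 2", "SnvNS1", "norfluorocurarine 4"),
    ("norfluorocurarine 4", "SnvNO", "18-OH norfluorocurarine 5"),
    ("18-OH norfluorocurarine 5", "SnvWS", "Wieland-Gumlich aldehyde 6"),
    ("Wieland-Gumlich aldehyde 6", "SpAT", "diaboline 8"),
    ("Wieland-Gumlich aldehyde 6", "SnvAT", "prestrychnine 9"),
    ("prestrychnine 9", "spontaneous", "strychnine 10"),
    ("prestrychnine 9", "spontaneous", "isostrychnine 11"),
    ("strychnine 10", "Snv10H", "10-OH strychnine 12"),
    ("10-OH strychnine 12", "SnvOMT", "beta-colubrine 13"),
    ("beta-colubrine 13", "Snv11H", "11-deMe brucine 14"),
    ("11-deMe brucine 14", "SnvOMT", "brucine 15"),
]


def find_route(start, target, steps=PATHWAY):
    """Return the ordered catalysts from start to target, or None if unreachable."""
    graph = {}
    for substrate, catalyst, product in steps:
        graph.setdefault(substrate, []).append((catalyst, product))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        compound, route = queue.popleft()
        if compound == target:
            return route
        for catalyst, product in graph.get(compound, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, route + [catalyst]))
    return None
```

Calling `find_route("geissoschizine 1", "brucine 15")` lists the catalysts in pathway order, with SnvOMT appearing twice because it performs both O-methylations.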
Here we report the discovery of nine enzymes that convert geissoschizine 1 to diaboline 8, strychnine 10 and brucine 15, using a combination of chemical logic, -omics datasets and enzymatic characterization. Pioneering studies of the structure and synthesis of strychnine provided the foundation for discovery of the enzymes of strychnine biosynthesis as it occurs in nature. These discoveries not only shed light on how plants produce these diverse alkaloids, but also provide a genetic basis for heterologous production of strychnos alkaloid derivatives to discover potent lead compounds through metabolic engineering approaches, providing a new challenge for synthetic biology.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-022-04950-4.