Reaction mechanism of the farnesyl pyrophosphate C-methyltransferase towards the biosynthesis of pre-sodorifen pyrophosphate by Serratia plymuthica 4Rx13

Classical terpenoid biosynthesis involves the cyclization of the linear prenyl pyrophosphate precursors geranyl-, farnesyl-, or geranylgeranyl pyrophosphate (GPP, FPP, GGPP) and their isomers to produce a huge number of natural compounds. Recently, it was shown for the first time that the biosynthesis of the unique homo-sesquiterpene sodorifen by Serratia plymuthica 4Rx13 involves a methylated and cyclized intermediate as the substrate of the sodorifen synthase. To further support the proposed biosynthetic pathway, we have now identified the cyclic prenyl pyrophosphate intermediate pre-sodorifen pyrophosphate (PSPP). Its absolute configuration (6R,7S,9S) was determined by comparison of calculated and experimental CD spectra of its hydrolysis product and matches that predicted by semi-empirical quantum calculations of the reaction mechanism. In silico modeling of the reaction mechanism of the FPP C-methyltransferase (FPPMT) revealed an SN2 mechanism for the methyl transfer followed by a cyclization cascade. The cyclization of FPP to PSPP is guided by a catalytic dyad of H191 and Y39 and involves an unprecedented cyclopropyl intermediate. W46, W306, F56, and L239 form the hydrophobic binding pocket, and E42 and H45 complex a magnesium cation that interacts with the diphosphate moiety of FPP. Six additional amino acids turned out to be essential for product formation, and the importance of these amino acids was subsequently confirmed by site-directed mutagenesis. Our results reveal the reaction mechanism involved in methyltransferase-catalyzed cyclization and demonstrate that this coupling of C-methylation and cyclization of FPP by the FPPMT represents an alternative route of terpene biosynthesis that could increase the terpenoid diversity and structural space.

Despite this large number and the impressive complexity and diversity of these compounds, they are commonly synthesized from only two isomeric C5 units, isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP). Both compounds derive from either the mevalonate (MVA) or the 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway 3,4. The C5 building blocks are condensed by prenyl pyrophosphate synthases to form the linear precursors of all terpenoids with different chain lengths, including geranyl pyrophosphate (GPP, C10), farnesyl pyrophosphate (FPP, C15), geranylgeranyl pyrophosphate (GGPP, C20) and their respective isomers. These precursors serve as canonical substrates for terpene synthases. The reaction involves the formation of highly reactive carbocation intermediates that undergo diverse cyclization cascades to form a broad range of compounds comprising monoterpenes (C10), sesquiterpenes (C15) and diterpenes (C20) 2. The resulting terpenes can undergo different additional decorating reactions, e.g. oxidation, methylation, esterification, or carbon elimination reactions, which further increase terpenoid diversity. An aspect that has not yet been deeply investigated is the biological modification of the conventional building blocks and linear isoprenoid pyrophosphate precursors prior to the cyclization cascade of the terpene cyclases, which also leads to the expansion of the terpenoid repertoire. In 2007, Dickschat et al. 5 described the first example of prenyl pyrophosphate (GPP) precursor modification in the biosynthesis of 2-methylisoborneol, a contaminant of drinking water. Later on, it was shown in more detail that during the biosynthesis of this volatile by Streptomyces coelicolor, a GPP C-methyltransferase (GPPMT) catalyzes the methylation of GPP to yield a non-canonical acyclic allylic pyrophosphate intermediate, which is the substrate of the methylisoborneol synthase 6,7.
This finding led to the use of GPPMT in synthetic biology to expand the range of terpenoid structures with potentially new flavors or biological activities 8,9. Additionally, the C5 building blocks (IPP and DMAPP) can be mono- or dimethylated, enabling the biosynthesis of C11, C12, C16, and C17 prenyl pyrophosphates 10. Likewise, the biosynthetic pathway of the antitrypanosomal homoterpenoid longestin from Streptomyces argenteolus includes a methyltransferase that methylates homo-IPP to produce (3Z)-3-methyl IPP which, along with IPP, is selectively accepted as extender unit by a GGPP synthase homolog to yield the dimethylated intermediate of GGPP 11,12. Finally, in Lepidoptera the C6 compounds homo-IPP and homo-DMAPP enable the formation of FPP analogs with 16, 17, and 18 carbons 13.
In the last decade, it was shown that several Serratia plymuthica isolates produce the unique sesquiterpene sodorifen (1,2,4,5,6,7,8-heptamethyl-3-methylenebicyclo[3.2.1]oct-6-ene, C16H26), a polymethylated bicyclic volatile compound, with the main producer being S. plymuthica 4Rx13 14-16. The ecological role of sodorifen is so far unknown, but its production by the bacteria can be significantly up-regulated during interaction with other microorganisms 17,18. Transcriptome and genome analysis of S. plymuthica 4Rx13 highlighted a cluster of four genes encoding a terpene synthase, a C-methyltransferase, a DOXP synthase, and an IPP isomerase 19. Knockout studies of these genes by insertional mutagenesis showed that the terpene synthase (SODS), C-methyltransferase (FPPMT), and IPP isomerase are indispensable for the biosynthesis of sodorifen 19,20. Further studies on the biosynthesis of sodorifen revealed that FPP is methylated by an S-adenosyl methionine (SAM)-dependent C-methyltransferase and that SODS, in contrast to common sesquiterpene synthases, does not accept FPP as a substrate 21. Moreover, besides transferring a methyl group to carbon 10 of FPP, the FPPMT also exhibits a cyclase activity to produce the monocyclic compound pre-sodorifen pyrophosphate (PSPP). The FPPMT is the first known enzyme that catalyzes not only the methylation of FPP but also the cyclization of a prenyl pyrophosphate precursor in the biosynthesis of terpenoids. 13C isotope labeling experiments and NMR analysis revealed that the methylation of FPP leads to the formation of a five-membered ring, accompanied by several methyl and/or hydride shifts 21.
To confirm the reaction mechanism of the FPPMT, we have now identified this unusual cyclic prenyl pyrophosphate intermediate and elucidated the catalytic mechanism of the FPPMT involved in the formation of PSPP, based on a high-quality protein model obtained from the Robetta protein folding Web Server with subsequent refinements and quantum chemical calculations. The model guided a multitude of site-directed mutagenesis studies that strongly support the structure of the protein model and the calculations of the catalytic mechanism. Furthermore, the absolute configuration of PSPP as predicted by the model and DFT calculations was experimentally confirmed by CD spectroscopy of the corresponding hydrolysis product pre-sodorifen.

Results
Identification of pre-sodorifen pyrophosphate (PSPP). Serratia plymuthica 4Rx13 FPP C-methyltransferase (FPPMT) exhibits an unprecedented activity during the biosynthesis of the sesquiterpene sodorifen (Fig. 1). This enzyme catalyzes the methylation of FPP along with a cyclization reaction to produce a polymethylated compound with a five-membered ring named pre-sodorifen pyrophosphate (PSPP), which is a non-canonical substrate for the terpene synthase SODS 21. Its molecular structure was derived by NMR analysis of its hydrolysis product (the alcohol pre-sodorifen) that accumulates in the terpene synthase knockout mutant. However, the presence of the corresponding pyrophosphate has so far only been deduced from coupled enzyme assays 21.
Figure 1. Biosynthesis of sodorifen from FPP catalyzed by a C-methyltransferase (MT) to afford cyclic PSPP as a non-canonical substrate for the terpene synthase (SODS). The black dot on pre-sodorifen and sodorifen indicates the methyl group transferred from SAM.
Here we employed LC-MS to unambiguously establish the presence of this intriguing cyclic prenyl pyrophosphate intermediate, which has a theoretical mass of m/z 395.1 Da. Direct infusion of the enzyme assay reaction mix into the mass spectrometer (scan mode) confirmed the presence of a mass peak with m/z 395.1 Da, and fragmentation of the precursor ions furnished pyrophosphate and phosphate ions as the fragments with major intensity (Fig. S1). Hence, for the LC-MS detection in multiple reaction monitoring (MRM) mode, we optimized substrate-dependent parameters for the fragmentation of PSPP to phosphate. LC-MS separation of the products of the enzyme assay mixture (FPP + SAM + FPPMT) revealed three peaks corresponding to S-adenosyl methionine (SAM), FPP, and PSPP. The retention time of PSPP was close to that of FPP (Fig. 2A). PSPP formation was FPPMT dependent, since enzyme assays in the absence of FPPMT did not lead to any PSPP formation (Fig. 2B). This result was corroborated by a significant reduction in the amount of FPP compared to the control enzyme assays (Fig. 2).
LC-MS analysis allowed us to compare the fragmentation patterns of putative PSPP and FPP (Fig. S2A,B). The pattern of FPP in the negative ionization mode is well known 22 and was confirmed in our analysis. Apart from the most abundant mass fragments (phosphate and pyrophosphate), there was only one carbon fragment of FPP with sufficient intensity (C15H23O6P2 at m/z 363.1 Da) (Fig. S2B,C). In the fragmentation pattern of the putative PSPP, a carbon fragment at m/z 377.2 Da was observed. The additional methylene group (-CH2-) in the PSPP molecule resulted in a mass increase of about 14 Da compared to the corresponding carbon fragment of FPP.
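The mass arguments above can be checked with a short calculation. The following is a minimal sketch, not part of the original analysis: it assumes the fully protonated neutral compositions C15H28O7P2 for FPP and C16H30O7P2 for PSPP (FPP plus one CH2 unit), neither of which is stated explicitly in the text, and it uses standard monoisotopic atomic masses. The helper names are illustrative only.

```python
# Monoisotopic masses (u) of the most abundant isotopes (standard reference values).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462, "P": 30.97376200}
ELECTRON_MASS = 0.00054858  # u


def mono_mass(formula):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def mz_deprotonated(neutral):
    """m/z of the singly charged [M-H]- ion of a neutral composition."""
    ion = dict(neutral)
    ion["H"] -= 1  # remove one hydrogen atom ...
    return mono_mass(ion) + ELECTRON_MASS  # ... but keep its electron on the ion


# Assumed neutral compositions (not given in the text).
FPP = {"C": 15, "H": 28, "O": 7, "P": 2}
PSPP = {"C": 16, "H": 30, "O": 7, "P": 2}  # FPP + CH2
METHYLENE = {"C": 1, "H": 2}

if __name__ == "__main__":
    print(f"[FPP-H]-  m/z: {mz_deprotonated(FPP):.4f}")
    print(f"[PSPP-H]- m/z: {mz_deprotonated(PSPP):.4f}")
    print(f"CH2 increment: {mono_mass(METHYLENE):.4f}")
    print(f"Observed fragment shift: {377.2 - 363.1:.1f}")
```

Under these assumptions the deprotonated PSPP ion rounds to m/z 395.1, and the CH2 increment of about 14.02 u is consistent with the observed shift between the fragments at m/z 363.1 and 377.2.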
Protein structure model of the Serratia plymuthica FPP C-methyltransferase. To determine the catalytic reaction mechanism of the S. plymuthica FPPMT that converts FPP to PSPP, structural knowledge of the enzyme is a prerequisite. Since so far, its crystallization was not successful, we performed in silico homology modeling. Using the Robetta protein folding Web Server 23 , a model of the tertiary structure was obtained (Fig. 3). The co-factor SAM and the substrate FPP (magenta carbon atoms) were docked into the active pocket located at the center of the model. All criteria for a reasonable structure indicated by PROSA II and a PRO-CHECK analysis were fulfilled (Figs. S3, S4). The energy graphs were all in the negative area and a combined energy z-score of -10.15 by PROSA II indicated a native-like fold of the model.
The best-scored docking pose that fulfills a close distance (3.1 Å) of the SAM methyl carbon atom to the carbon atom at position 10 of FPP was used as the starting structure for all semi-empirical quantum mechanical PM7 calculations. The entire structure can be downloaded from the IPB home page 24. The binding site of FPP is characterized by a very narrow hydrophobic environment (Fig. 4A) and is formed by W46, F56, F58, L239, V273, and W306 (Fig. 4B). The diphosphate moiety of FPP is recognized by a magnesium ion ligated by E42 and forms a hydrogen bond with the protonated H45. A catalytic dyad is formed by Y39 and H191 and was identified to be of high importance for the catalytic mechanism.
Proposed catalytic mechanism of the formation of pre-sodorifen pyrophosphate. Semi-empirical quantum mechanical PM7 calculations were performed based on the starting model of the active site (Fig. 4B). This method is computationally fast, which matters particularly when, as in this case, a molecular system with 424 atoms must be calculated. At the same time, the average error in heats of formation for PM7 has been reported as less than 4 kcal/mol 25. Thus, the method is appropriate for studying a multitude of alternative reaction pathways to predict the most likely one. The entire catalytic mechanism is summarized in Fig. 5. Based on in vitro enzyme assays and incorporation experiments with 13C-labeled precursors, a reaction mechanism has previously been proposed for the formation of PSPP from FPP that commences with the transfer of the methyl group of SAM to the C10 carbon atom of FPP 21. The energy of the optimized starting structure (1, Fig. 5) was set to 0.0 kcal/mol as the relative reference energy; from there, an activation barrier of 20.5 kcal/mol (the inversion of the leaving methyl cation) has to be overcome. This rather high barrier of the initial step is not unusual for methyltransferases 26. The energy of the C10-methylated farnesyl cation complex (2) is 3.0 kcal/mol higher than the energy of the starting complex.
The starting structure of the active site was used for PM7 reaction coordinate calculations. All backbone atoms of the amino acid residues were fixed during all calculations. For better visualization, non-polar hydrogen atoms are not displayed. The substrate FPP is represented by green carbon atoms and SAM is shown in magenta. The green ball close to E42 represents Mg2+. Except for V236, V273, and F58, all labeled amino acid residues were experimentally mutated to alanine (Tables 1, S3), which resulted in inactivation or reduction of enzymatic activity.
However, in the next reaction step the dimethyl cation (C11) is in close distance (3.2 Å) to the C6 carbon atom of the double bond; thus, a six-membered ring (3) can easily be formed, which leads to an energy gain with respect to the starting complex of -8.4 kcal/mol accompanied by a very low barrier of 3.8 kcal/mol. As for alternative reactions, deprotonation from either terminal methyl group does not seem possible, as this could potentially only happen from the methyl group next to the cationic center. The formation of PSPP requires conversion of the six-membered ring into a five-membered ring, with several hydride and methyl shifts.
Figure 5. Intermediate steps of the catalytic mechanism for the formation of PSPP by the Serratia plymuthica FPP C-methyltransferase. The PM7 calculations include all atoms of the active site model shown in Fig. 4B. For better visualization only the essential parts are displayed; thus, non-polar hydrogen atoms that are not required for the understanding of the mechanism and the SAH formed in reaction steps (2-8) are not displayed in the 3D representations. Below each reaction arrow, the reaction enthalpy is given (in kcal/mol). Above each reaction arrow, the required activation energy (kcal/mol) is indicated by "#". The reaction starts with the methyl transfer from SAM (magenta carbon atom) (1) to the C10 carbon atom of FPP (green carbon atoms). The carbon atom colored in gray attached to FPP is in all 3D figures (steps 2-8) the one that has been transferred from SAM. Red dotted lines indicate the reaction coordinates. Histidine (H191) and tyrosine (Y39) are the amino acids that form the essential dyad.
The hypothesis that formation of an intermediate cyclopropyl ring with subsequent opening can lead to an energetically favored formation of the five-membered ring was tested by calculation. The distance between the C9 carbon atom and the C7 carbocation was shortened stepwise. At a distance of 1.9 Å between both atoms (representing the transition state at 16.9 kcal/mol), one of the protons at C9 jumps without barrier to the phenolic hydroxyl group of Y39. Y39 was identified as part of a catalytic dyad with H191, which serves as proton acceptor from Y39. Thus, the intermediate cyclopropyl moiety is formed instantaneously and without a further barrier (3 to 4). Since a cyclopropyl ring has high ring strain, this reaction is slightly disfavored by 3.5 kcal/mol with an activation barrier of 16.9 kcal/mol; subsequently, the ring can be opened by reprotonation from Y39 to C8 (4) with an energy gain of −6.8 kcal/mol and a barrier of 22.9 kcal/mol to form structure 5. Subsequently, a hydride transfer from C6 to C7 (5 to 6, energy effort 4.6 kcal/mol, activation barrier 19.5 kcal/mol) followed by a methyl anion transfer of C15 from C11 to C6 (6 to 7, energy effort 2.0 kcal/mol, activation barrier 17.9 kcal/mol) led to the correct relative stereochemistry of intermediate 7 (cis methyl groups at C6 and C7, and trans configuration relative to the migrated hydride at C7). The theoretically possible alternative transfer of the equatorial C12 instead of C15 to C6 is disfavored by 4 kcal/mol and requires the passage of an energy barrier of 24 kcal/mol. Since a proton of C10 (where the former methyl group of SAM was attached) was in close proximity (1.86 Å) to the N atom of H191, no alternative pathway seemed to be favored over the obvious proton abstraction by H191, which is accompanied by an energy gain of −7.7 kcal/mol to form PSPP. Altogether, the entire catalytic mechanism was thermodynamically favored by −9.5 kcal/mol.
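To put the calculated barriers into perspective, they can be converted into rough rate constants with the Eyring equation of transition-state theory, k = (kB·T/h)·exp(−ΔG‡/RT). The following is a minimal sketch under a strong assumption: the PM7 activation energies reported above are treated as if they were activation free energies, which they are not strictly (they are enthalpic values with an average error of several kcal/mol). The resulting numbers are therefore order-of-magnitude illustrations only, and the temperature of 298.15 K is an assumption as well.

```python
import math

K_B = 1.380649e-23         # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J*s
R_KCAL = 1.987204e-3       # gas constant, kcal/(mol*K)

# Activation barriers (kcal/mol) for the steps as reported in the text.
# No barrier is reported for the final deprotonation step, so it is omitted.
PM7_BARRIERS = {
    "1->2": 20.5,  # methyl transfer from SAM to C10
    "2->3": 3.8,   # six-membered ring closure
    "3->4": 16.9,  # cyclopropyl formation
    "4->5": 22.9,  # cyclopropyl opening by reprotonation
    "5->6": 19.5,  # hydride transfer C6 -> C7
    "6->7": 17.9,  # methyl transfer C11 -> C6
}


def eyring_rate(barrier_kcal, temperature=298.15):
    """Eyring rate constant (1/s), treating the barrier as a free energy (assumption)."""
    prefactor = K_B * temperature / H_PLANCK
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))


def highest_barrier(barriers):
    """Label of the step with the largest activation barrier."""
    return max(barriers, key=barriers.get)


if __name__ == "__main__":
    for step, barrier in PM7_BARRIERS.items():
        print(f"{step}: {barrier:5.1f} kcal/mol -> k ~ {eyring_rate(barrier):.2e} 1/s")
    print("Highest barrier:", highest_barrier(PM7_BARRIERS))
```

Within this simplified picture, the opening of the cyclopropyl ring (4 to 5) carries the highest barrier among the reported steps, slightly above that of the initial methyl transfer.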
All the positions of the carbon atoms are in agreement with the ones derived and experimentally determined including the cis-C14-C15 configuration. In summary, among the previously suggested reaction mechanisms 21 , the shown pathway in Fig. 5 is strongly favored and supported by detailed quantum mechanical calculations and the identification of the catalytic dyad Y39-H191. The graphical representation of the complete energy profile is shown in Fig. S5.
Several catalytic mechanisms and pathways as well as terpenoid formation have already been studied by quantum mechanical methods 27,28 and the application of the density functional theory like DFT (B3LYP) was demonstrated to be appropriate for carbocation reactions involving terpenes. Therefore, to evaluate and support the results of the semi-empirical PM7 calculations more advanced DFT calculations have been performed. Single point energies for all the intermediates and corresponding transition state structures (Fig. 5) were calculated using a smaller model of the active site (Fig. S6). The results are summarized in Table S2. The relative energies showed some slight differences in comparison to the PM7 calculations in between each reaction step and in the transition state energies. These differences may result from the usage of a truncated model of the active site. However, the energy gain of − 19.4 kcal/mol obtained by the DFT calculations for the entire reaction supports the suggested mechanism.
Determination of the stereochemistry of pre-sodorifen pyrophosphate by circular dichroism spectroscopy. The enzymatic cyclization of the achiral FPP to the chiral PSPP by the FPPMT involves the induction of chirality, which facilitates the comparison of the predicted and experimental data. Based on the quantum mechanical calculations of the catalytic reaction pathway the absolute stereochemistry of PSPP was predicted to be 6R,7S,9S, which prompted us to determine its absolute stereochemistry using circular dichroism spectroscopy.
Since PSPP has not yet been isolated, CD spectroscopy was performed with the corresponding alcohol pre-sodorifen. Its production is strongly upregulated in the S. plymuthica terpene synthase (SODS) knockout mutant 20 . The CD-spectrum of pre-sodorifen in 6R,7S,9S configuration was calculated with DFT using the B3LYP functional and subsequently compared with the experimentally obtained one. The CD spectrum of the natural product matched well with those calculated for (6R,7S,9S)-pre-sodorifen (but not with those of the 6S,7R,9R enantiomer) (Fig. 6). This is in agreement with the absolute stereochemistry predicted by in silico modeling of the reaction mechanism of the FPP C-methyltransferase.
Experimental validation of the structure model by site-directed mutagenesis. The in silico homology modeling identified amino acids that form the active pocket, and the modeling studies of the catalytic reaction mechanism suggested their putative functions. The importance of these amino acids was subsequently confirmed by site-directed mutagenesis. Almost all amino acid residues of the active site (Fig. 4B) were experimentally replaced by alanine by a single nucleotide mutation. The resulting mutant enzymes were tested in double (FPPMT + SODS, product: sodorifen) or coupled enzyme assays (FPPMT + alkaline phosphatase, product: pre-sodorifen). In total, twenty-six amino acids were mutated to alanine (Tables 1, S3). Among these mutations, 12 were located outside of the active site and did not show any change in enzyme activity (Figs. S7, S8, S9). All 14 mutations located within the active center led to a strong reduction or loss of enzyme activity, as almost no products (sodorifen or pre-sodorifen) could be detected (Table 1, Figs. S10, S11). Mutations of the amino acids W46, F56, and W306, proposed to stabilize the carbocation and form a hydrophobic environment in the binding site of FPP, resulted in completely inactive enzymes. Also, the enzyme assays performed with the mutated amino acids of the catalytic dyad, Y39 and H191, did not lead to any product. Altogether, the GC-MS analysis of the assay products obtained from all five of these mutant enzymes showed no detectable sodorifen or pre-sodorifen (Figs. S10, S11). H45 and E42 were proposed to be involved in binding the magnesium cation that interacts with the diphosphate moiety of FPP. Their mutation to alanine led to an inactive enzyme (E42A) or drastically reduced activity (H45A). In the case of E42A there were no detectable products upon GC-MS analysis, while H45A yielded ca. 25% of sodorifen and pre-sodorifen compared with the product profile of the wild-type enzyme. Moreover, mutations of the amino acids L239 and C241, located in the hydrophobic active pocket near the farnesyl moiety (particularly L239), also resulted in a strong reduction of enzymatic activity. Only ca. 20% of sodorifen and pre-sodorifen was produced by these mutant enzymes. Interestingly, the mutant enzymes L239A and C241A led to the production of six different new compounds (#1, #2, and #3 in the double assays and #4, #5, and #6 in the coupled assays) (Figs. S12, S13). However, the molecular structures of these potential derailment products remain so far unidentified.

Discussion
Pre-sodorifen pyrophosphate (PSPP) represents an unprecedented cyclic prenyl pyrophosphate proposed to serve as intermediate in the biosynthesis of the sesquiterpene sodorifen 21. Using LC-MS analysis it was now possible to detect and analyze this biosynthetic intermediate for the first time. After fragmentation in the MS, the major carbon fragment of PSPP exhibited a mass increase of 14.1 Da compared to FPP, indicating the presence of an additional methylene group (-CH2-). Ion pair chromatography showed that the retention time of PSPP is very close to that of FPP. This is due to the identical phosphate content and similar lipophilicity, both of which are critical separation parameters in ion pair chromatography. Moreover, the amount of FPP was significantly reduced (Fig. 2, Fig. S3) in the assay mixture incubated with FPP C-methyltransferase (FPPMT). This strongly indicated that PSPP formation is due to the conversion of FPP by the FPPMT enzyme. In conclusion, the identification of PSPP is in good agreement with the proposed reaction pathway involving both methylation and cyclization of FPP catalyzed by the SAM-dependent FPPMT 21.
In silico modeling of the catalytic mechanism of FPPMT showed that the biosynthesis of PSPP is initiated by the formation of a carbocation due to the SN2-like C-methylation of FPP. Carbocation formation is the typical initiation reaction for terpenoid biosynthesis in plants and microorganisms. Its formation is commonly initiated by conserved aspartate-rich motifs, present in the active site of the terpene cyclases, that bind cations and trigger the departure of the diphosphate moiety of FPP to form a carbocation that then undergoes the cyclization cascade 2. The aspartate-rich motifs were not identified in FPPMT, and although E42 and H45 are suggested to bind a magnesium cation that interacts with the diphosphate moiety of FPP, the carbocation is generated by methylation instead of elimination of the diphosphate moiety. Moreover, the subsequent cyclization cascade of this carbocation towards the formation of the five-membered ring of PSPP is guided by the His-Tyr (H191 and Y39) catalytic dyad localized near the reactive methyl group of SAM. A His-Tyr dyad located close to the reactive methyl group of SAM was also reported for the phosphoethanolamine N-methyltransferase from Plasmodium falciparum and is required for the methylation of phosphoethanolamine to phosphocholine. In the latter case, no cyclization reaction occurs and the dyad forms a catalytic lid that locks the ligand in the active site, while D128 deprotonates the substrate via a bridging water molecule, followed by a typical SN2-type methyl transfer from SAM to phosphoethanolamine 29,30. The only other known methyltransferase with cyclization activity is the TleD methyltransferase of Streptomyces blastmyceticus 31. The X-ray crystal structure of the TleD methyltransferase, which induces a cyclization reaction after methylation in the biosynthesis of the antibiotic teleocidin B, shows that the enzyme requires a hydrogen bond between Y21 and H157.
Both amino acids were shown not to be involved in the cyclization cascade but play the most important role in maintaining the local protein fold which is important for the TleD enzyme activity 32 . Therefore, the catalytic dyad of FPPMT is highly specific because the acceptance of a proton by the dyad does not terminate the reaction but instead, the proton is used to (re)-protonate the cyclopropyl moiety, which initiates ring opening followed by another rearrangement cascade like in terpenoid cyclization reactions (Fig. 5). Histidine and tyrosine perform general acid-base mediated catalysis in the cyclization mechanism of terpene cyclases. In the active site of the tobacco 5-epi-aristolochene synthase, for example, Y520 forms with two aspartate residues a catalytic triad, involved in the cyclization reaction of germacrene A during the biosynthesis of the sesquiterpene 5-epi-aristolochene 33 . Likewise, the side chain of H232, an essential and conserved residue among the oxidosqualene cyclases, is suggested as the only base that accepts a proton in the active site of lanosterol synthase to terminate the cyclization cascade during the biosynthesis of the triterpene lanosterol 34 . Besides this catalytic dyad (H191 and Y39), the amino acids W46, W306, F56, and L239 of the FPPMT create a hydrophobic environment for the substrate's lipophilic tail (farnesyl moiety) and the aromatic side chains of F56, W306 and W46 may stabilize the carbocation intermediate during the rearrangement reactions. This is also consistent with the binding site of prenyl pyrophosphate substrates of terpene synthases where the lipophilic tail of the substrate is buried in a hydrophobic pocket. During the cyclization cascade of the terpene cyclase, the carbocation intermediates can be stabilized by the π electrons of the aromatic ring of amino acids like phenylalanine, tyrosine, and tryptophan through cation − π interactions 2 . 
Such a hydrophobic active site is also found in prenyl pyrophosphate methyltransferases such as IPPMT of Streptomyces monomycini and GPPMT of Streptomyces coelicolor 10,35. The X-ray crystal structure of GPPMT showed that the binding site of GPP is defined by the side chains of several aromatic and other hydrophobic amino acids that favor the binding of the prenyl pyrophosphate substrate 35. In particular, the side chain of F222 was shown to stabilize the carbocation by cation-π interactions and by electrostatic interactions with the side chain of E173, while each phosphate group of the substrate (GPP) was coordinated to a single Mg2+ ion complexed by the side chain of N37. Nevertheless, the amino acids Y51 and H221 found in the active pocket of GPPMT were shown not to be involved in any acid-base catalysis 35. Furthermore, GPPMT showed a very low sequence identity (16.9%) with FPPMT and also turned out to be an unsuitable template in the FPPMT homology modeling studies 36. In summary, it seems likely that the amino acids Y39 and H191, which form the catalytic dyad of the Serratia plymuthica FPPMT, are conserved in several methyltransferases (Fig. S14) and could be very important for the methylation reaction. However, the function of the catalytic dyad in the cyclization reaction during the formation of PSPP seems not to be conserved and appears to have evolved in FPPMT.
Furthermore, site-directed mutagenesis exchange of the amino acids Y61, N219, D237, D272, L239, C241, and E297 to alanine led also to a strong reduction or loss of the FPPMT enzymatic activity. These amino acid residues are all closely located in the active site of the enzyme, most likely being important for the catalysis, e.g. Y61 could stabilize the correct positioning of SAM in the active site. Since the mentioned amino acids have the potential to form H bonds it can be speculated that they play a role in protein stability and help to form a high-fidelity active site for the PSPP biosynthetic mechanism. The mutants L239A and C241A were particularly interesting since new compounds were produced, the structures of which were so far not elucidated. Their mass spectra exhibited some similarities (Figs. S15, S16) with those of sodorifen, pre-sodorifen, or related compounds observed in S. plymuthica VOC profile, suggesting that these compounds might be intermediates of the cyclization cascade during the biosynthesis of PSPP. The elucidation of their structures will provide additional insights into the biosynthesis of PSPP. It will be also important to perform X-ray crystal structure and CD spectra analyses of the products of the FPPMT mutants compared to those of the wildtype (ongoing work), to support the suggested function of the targeted amino acids. In summary, the cyclization reaction catalyzed by FPPMT is similar to that of terpene cyclases and is particularly intriguing since no analogous reaction has been reported that includes cyclic prenyl pyrophosphate substrates for terpene synthases. In this context and given the fact that the SODS does not accept FPP as a substrate, it is an interesting but presently unsolved question why both FPPMT and SODS co-evolved to catalyze such a complex and unique reaction to produce sodorifen, as the ecological and biological function of sodorifen has also not yet been solved. 
SAM-dependent methyltransferases are ubiquitous in all branches of life and are involved in a multitude of biological reactions 37-40. Five structurally distinct classes have been described, and the large majority of known methyltransferases belong to class I 41,42. They have a characteristic structure called the Rossmann-like superfold, consisting of alternating β-strands and α-helices, which form an αβα sandwich structure 37,38. The tertiary structure model of the FPPMT obtained from the Robetta protein folding Web Server shows similar structural features, as it consists of the αβα fold. These results strongly indicate that FPPMT belongs to the class I methyltransferases. Early studies of the structural features of class I methyltransferases showed that they are a good example of convergent evolution in enzymes, as the SAM core fold is highly conserved among them despite the low overall methyltransferase amino acid sequence similarity 37,41-43. Alignment of FPPMT with 14 other microbial class I methyltransferases also revealed low sequence similarity 36. Nevertheless, the conserved GxGxG SAM-binding motif of class I methyltransferases 37 is formed by the amino acids G114, G116, and G118 of the FPPMT (Fig. S14). While the core SAM-binding motif is highly conserved, the binding site of class I methyltransferase substrates varies considerably according to the nature of the respective molecule. It also has to be kept in mind that methyltransferases can transfer a methyl group to the carbon, oxygen, nitrogen, or sulfur atom of a broad range of substrates, varying from macromolecules such as lipids, proteins, and nucleic acids to hormones and small molecules such as catecholamines 37-40. However, FPPMT is the first methyltransferase that accepts FPP as a substrate; consequently, its binding pocket does not share (conserved) similarities with other substrate binding sites.
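Locating a GxGxG SAM-binding motif in a protein sequence amounts to a simple pattern search. The following is a minimal sketch using a regular expression with a lookahead so that overlapping hits are also reported. The function name and the toy sequence are hypothetical; the toy sequence is not the FPPMT sequence, which is not reproduced in the text.

```python
import re

# Lookahead pattern: G, any residue, G, any residue, G (overlapping hits allowed).
GXGXG = re.compile(r"(?=(G.G.G))")


def find_gxgxg(sequence):
    """Return the 1-based positions of the three glycines for every GxGxG hit."""
    hits = []
    for match in GXGXG.finditer(sequence.upper()):
        first = match.start() + 1  # convert 0-based index to residue numbering
        hits.append((first, first + 2, first + 4))
    return hits


# Hypothetical toy sequence for illustration only.
TOY_SEQUENCE = "MKTLVDLGCGTGRLAI"

if __name__ == "__main__":
    for glycines in find_gxgxg(TOY_SEQUENCE):
        print("GxGxG motif with glycines at positions", glycines)
```

Applied to the real FPPMT sequence, a search of this kind would be expected to report the motif at G114, G116, and G118, as stated above. A match to this short pattern alone is of course not sufficient evidence for SAM binding.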
Recent data provide clear evidence for specialized bacterial C-methyltransferases that accept prenyl pyrophosphates such as IPP, DMAPP, and GPP as substrates. They catalyze the biosynthesis of a variety of methylated non-canonical substrates for the subsequently acting terpene synthases [5][6][7]10,12,13 . While GPPMT catalyzes the methylation of GPP followed by deprotonation of the carbocation to produce 2-methyl-GPP, FPPMT catalyzes a methylation at carbon 10 of FPP followed by a cyclization to synthesize PSPP as the substrate for the SODS. Although FPP and GPP are both terpenoid building blocks and both undergo C-methylation, the catalytic mechanisms of the two enzymes are completely different. It could be speculated that prenyl pyrophosphate methyltransferases adapted to these lipophilic compounds and that, with the increasing size of the substrates (longer hydrocarbon chain and more double bonds), they evolved to catalyze intramolecular cyclization reactions as well, similar to the cyclization reactions in sesquiterpene biosynthesis.
Altogether, the discovery of this new type of C-methyltransferase involved in terpenoid biosynthesis strongly indicates that substrates for terpenoid biosynthesis are more diverse than previously expected. Such substrate variations greatly increase the structural diversity of terpenoids, as shown for some GPP methyltransferases that were used in synthetic biology to generate additional terpenoid structures with potentially new or interesting properties 8,9,41 . It is also interesting to note that the repeated discovery of prenyl pyrophosphate methyltransferases highlights a new dimension of substrate promiscuity of the corresponding specialized terpene cyclases in bacteria 8,9,44 . Therefore, bacteria may be able to diversify and expand the terpenoid structural space by using non-canonical substrates. It is not yet known whether this is an evolutionarily old or recent adaptation of terpenoid metabolism. So far, methylated or cyclized prenyl pyrophosphate substrates of terpene synthases have been found only in the prokaryotic domain, as these specialized methyltransferases have not yet been discovered in plants and animals.

Methods
A search in the Protein Data Bank (PDB) for proteins homologous to the FPPMT revealed only three proteins (PDB codes 5KOK, the pavine N-methyltransferase 45 ; 5DOO, a protein-lysine methyltransferase from Rickettsia 46 ; and 3MGG, a methyltransferase from Methanosarcina mazei 47 ), all with very low sequence identity (5KOK 16.6%, 5DOO 14.8%, 3MGG 12.7%). This very low sequence identity was not sufficient for standard protein homology modeling. Therefore, the sequence was submitted to the Robetta Web Server 23 for ab initio modeling and/or threading. Five models were returned. Their native-fold quality was checked with PROSA II and their stereochemical quality with PROCHECK. The model with the lowest z-score value from the PROSA II analysis (Fig. S3) was used for further studies. The structure was superimposed on the X-ray structure of 5KOK, which contains S-adenosyl-L-homocysteine (SAH) as a cofactor. The tertiary structure of the Robetta model coincides very well with it in the central part of the enzyme (Fig. S17), which strongly supports the overall correctness of the model. Therefore, SAH could be merged into the model without any difficulties. SAH was manually modified to SAM by replacing the hydrogen atom with a methyl group using MOE (Molecular Operating Environment 2016.08, Chemical Computing Group Inc., Montreal, QC, Canada), and the structure was subsequently submitted to 20 steps of simulated annealing MD refinement using YASARA 48,49,50 . The quality of the final optimized structure was again checked with PROSA II 51,52 and PROCHECK 53 (Figs. S3 and S4). A total of 100 docking runs of FPP were performed with GOLD 54,55 . For the definition of the docking site, the methyl carbon atom of SAM was defined as the origin with a radius of 15 Å. The docking results were evaluated with the ChemPLP score 56,57 . The side chains of W46 and F56 were treated as flexible according to the rotamer library of GOLD.
The dockings were inspected for appropriate orientation of the terminal prenyl moiety to the SAM methyl carbon atom, i.e. the distance of this carbon atom to the C10-carbon atom was measured. Only two very similar docking arrangements fulfilled the cut-off criterion of a distance smaller than 4 Å. Since in both docking positions the diphosphate moiety was located close to the side chains of E42 and a protonated H45, a magnesium ion was manually included between the side chain of E42 and the diphosphate moiety. Both structures were finally optimized with Amber 14:EHT force field 58,59 in MOE keeping the backbone atoms of the protein fixed. The lipophilic potential (Fig. 6a) was also calculated with MOE.
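The distance criterion used to select docking poses can be sketched in a few lines of Python. This is only an illustration of the filter, not the procedure used with GOLD or MOE; the pose names and coordinates are hypothetical, and only the 4 Å cut-off is taken from the text.

```python
import math

CUTOFF_ANGSTROM = 4.0  # cut-off stated in the text


def select_poses(poses, cutoff=CUTOFF_ANGSTROM):
    """Keep poses whose SAM methyl carbon to FPP C10 distance is below the cut-off."""
    selected = []
    for pose in poses:
        d = math.dist(pose["sam_methyl_c"], pose["fpp_c10"])
        if d < cutoff:
            selected.append((pose["name"], round(d, 2)))
    return selected


# Hypothetical Cartesian coordinates in Å, for illustration only.
poses = [
    {"name": "pose_01", "sam_methyl_c": (0.0, 0.0, 0.0), "fpp_c10": (3.0, 0.0, 2.0)},
    {"name": "pose_02", "sam_methyl_c": (0.0, 0.0, 0.0), "fpp_c10": (4.0, 3.0, 0.0)},
    {"name": "pose_03", "sam_methyl_c": (0.0, 0.0, 0.0), "fpp_c10": (1.0, 2.0, 2.0)},
]
selected = select_poses(poses)
print(selected)
```

In this toy input, the second pose lies 5 Å from the methyl carbon and is discarded, while the other two pass the filter.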
Starting with both docking arrangements, a multitude of semi-empirical quantum mechanical reaction coordinate calculations were performed using PM7 60 as included in MOPAC2016 25 . For this purpose, the active site was cut from the protein structure model (see Fig. 4B). All backbone atoms were fixed during all quantum mechanical calculations to avoid distortions of the tertiary structure of the protein. For optimization, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm was used [61][62][63][64] . Scan and grid calculations were performed with a step size of ± 0.2 Å. For each reaction coordinate (scan) or pair of reaction coordinates (grid), the final heat of formation (ΔH f ) of the system was calculated directly, resulting in an energy profile (scan) or an energy hyperplane (grid) from which the corresponding energy pathway was extracted.
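The bookkeeping behind a one-dimensional scan can be sketched as follows. The sketch only generates the reaction-coordinate values for a 0.2 Å step size and reads a barrier and a reaction energy from a list of energies; it does not call MOPAC, the helper names are hypothetical, and the energy values are invented placeholders in arbitrary units.

```python
def scan_points(start, stop, step=0.2):
    """Reaction-coordinate values from start to stop (inclusive) in fixed steps."""
    n = round(abs(stop - start) / step)
    sign = 1 if stop >= start else -1
    return [round(start + sign * i * step, 3) for i in range(n + 1)]


def barrier_and_reaction_energy(energies):
    """Barrier: highest point relative to the first; reaction energy: last minus first."""
    e0 = energies[0]
    return max(energies) - e0, energies[-1] - e0


# Hypothetical scan: a bond-forming distance shortened from 3.6 Å to 1.6 Å.
coords = scan_points(3.6, 1.6)

# Placeholder relative energies (arbitrary units), one per scan point.
energies = [0.0, 1.5, 4.0, 8.5, 14.0, 18.5, 16.0, 9.0, 1.0, -6.0, -10.0]

barrier, reaction_energy = barrier_and_reaction_energy(energies)
ts_coord = coords[energies.index(max(energies))]
print(len(coords), ts_coord, barrier, reaction_energy)
```

A grid calculation extends the same idea to two coordinates, where the pathway is read from a two-dimensional table of energies rather than a list.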
Grids and scans were analyzed in MOE 2016.08 using a set of svl-scripts implemented by Richard Bartelt 65 . All images of the 3D-structures were rendered with PyMOL 66 .
All intermediates and the corresponding transition state structures (Fig. 5) were taken directly from the PM7-optimized structures for single point energy calculations and further energy optimizations at a higher level of theory using density functional theory (DFT) as implemented in the ab initio ORCA 4.0.0.2 program package 67 . The optimizations were done with the B3LYP functional, the def2-TZVP(-f) basis set, and TightSCF convergence settings [68][69][70][71] . To save computational time, the diphosphate binding site was removed and a truncated model of the active site was used (F86, Y39, H191, L239, see Fig. S6). Therefore, instead of FPP, farnesyl alcohol was used to form pre-sodorifen in the last step of the reaction. All backbone atoms and the hydroxyl group of the farnesyl alcohol were fixed during the optimization. Furthermore, SAM was fixed except for the three carbon atoms bound to the sulfur atom. The results are summarized in Table S3.
Measurement of the CD spectrum. Pre-sodorifen was obtained from the S. plymuthica terpene synthase mutant as previously described 21 . The CD measurement in n-hexane was performed using a Jasco J-715 spectrometer (JASCO, Deutschland GmbH).
Calculation of spectrum. The structure of pre-sodorifen that resulted as the product of the reaction mechanism calculation was optimized by density functional theory (DFT) using the B3LYP functional with the SV(P) basis set [68][69][70][71] as implemented in the ab initio ORCA 3.0.3 program package 72 . The influence of the experimentally used solvent n-hexane was included in the DFT calculations using the COSMO model 73 . The quantum chemical simulations of the CD spectra were also carried out using ORCA. For this purpose, the first 30 excited triplet states of the structure were calculated by applying the long-range corrected hybrid functional TD B3LYP/G with SV(P). The CD spectrum of the corresponding enantiomer was obtained by mirroring the calculated spectrum. The CD curves were visualized and compared with the experimental spectra using the software SpecDis 1.64 74 .
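Mirroring a calculated CD spectrum to obtain that of the enantiomer amounts to inverting the sign of the CD intensity at every wavelength, since enantiomers give CD curves of equal magnitude and opposite sign. A minimal Python sketch, with a hypothetical function name and invented data points, is:

```python
def mirror_cd_spectrum(spectrum):
    """Enantiomer spectrum: same wavelengths, sign-inverted CD intensities."""
    return [(wavelength, -intensity) for wavelength, intensity in spectrum]


# Hypothetical (wavelength in nm, CD intensity) pairs, for illustration only.
calculated = [(190.0, 2.5), (200.0, -1.0), (210.0, 0.0), (220.0, 0.8)]
enantiomer = mirror_cd_spectrum(calculated)
print(enantiomer)
```

Applying the function twice returns the original curve, which is a quick sanity check when preparing both enantiomer spectra for comparison with the experiment.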
Site-directed mutagenesis. The Serratia plymuthica 4Rx13 FPP methyltransferase (SOD_c20760) and terpene synthase (SOD_c20750) genes were cloned into the Champion pET151/D-TOPO vector (Thermo Scientific, St. Leon-Roth, Germany) 21 . Nucleotide changes in the genes were generated using a modified QuikChange Site-Directed Mutagenesis Kit protocol (Agilent, Böblingen, Germany), otherwise following the manufacturer's recommendations. The modifications were as follows: the PfuUltra HF DNA polymerase and DpnI of the kit were replaced by Phusion DNA polymerase (2 U/µL) and DpnI (10 U/µL), respectively, both from Thermo Scientific (St. Leon-Roth, Germany). PCR parameters: an initial denaturation of 2 min at 98 °C was followed by 16 cycles of denaturation at 98 °C for 30 s, annealing at 65 °C for 60 s, and elongation at 72 °C for 14 min. Reactions were finished by a final elongation at 72 °C for 10 min. Each amino acid of interest was changed to alanine; the primers used are shown in the supplement (Table S1). The methylated parental DNA template was digested by adding 0.5 µL of DpnI restriction enzyme to the PCR reaction tubes. The digestion was carried out for 1 h at 37 °C. Eighty ng of the mutated plasmid were used for the transformation of E. coli XL-blue cells. Stocks were stored at − 70 °C. Plasmids were re-isolated from single E. coli XL-blue clones using the NucleoSpin Plasmid Easy Pure Kit (Macherey-Nagel, Düren, Germany), and mutated sequences were confirmed by Sanger sequencing (Eurofins GATC Biotech, Konstanz, Germany).
Heterologous expression and purification of proteins. The proteins were expressed using the Champion pET151/D-TOPO protein expression system (Invitrogen, Thermo Scientific, St. Leon-Roth, Germany). Expression and purification of the wild type and mutated proteins were carried out as described previously 21 .
Briefly, E. coli BL21 (DE3) was used for the overexpression of His6-tagged proteins. A 150 mL bacterial culture was pre-incubated at 37 °C until an OD 600 of 0.8-1 was reached. Gene expression was then induced with 0.5 mM isopropylthio-β-galactoside (Carl Roth, Karlsruhe, Germany), and the culture was incubated for 20 h at 20 °C. Crude extracts were obtained by incubating the cell pellet with lysozyme (final concentration, 1 mg/mL), followed by sonication and centrifugation to separate cell debris from the protein-containing soluble fraction. The overexpressed protein was purified by Ni-NTA affinity chromatography (Qiagen, Hilden, Germany) according to the manufacturer's instructions. Protein concentrations were measured using the standard Bradford assay 75 . Protein purity was confirmed using SDS-PAGE (Fig. S18). The purified proteins were stored at − 20 °C or − 70 °C for further use.
Enzyme assay. Double enzyme assays to determine sodorifen formation were performed using the FPPMT wildtype or mutated enzyme together with the SODS enzyme. Reaction tubes containing 20 μg of each purified enzyme, 50 μL assay buffer (250 mM HEPES-KOH, 100 mM MgCl 2 , 2.5 mM MnCl 2 , 50% (v/v) glycerol, pH 8), 30 mM dithiothreitol, 2.3 mM of S-adenosyl methionine (Merck Sigma-Aldrich, Darmstadt, Germany), 0.06 mM of farnesyl pyrophosphate (Echelon Biosciences, Salt Lake City, USA) and double distilled water (ad 200 μL) were incubated at 37 °C for 3 h 30 min. To determine pre-sodorifen synthesis, coupled enzyme assays were performed as described above for the double enzyme assays, but the reaction was started with the FPPMT wildtype or mutant enzyme only (without the terpene synthase SODS). After incubation at 37 °C for 3 h 30 min, 10 U of alkaline phosphatase (Thermo Scientific, St. Leon-Roth, Germany) was added to the reaction mix, which was then incubated for 1 h at 37 °C. Subsequently, each enzyme assay was overlaid with 200 μL hexane (containing 5 ng/μL nonyl acetate as internal standard). The reaction products were extracted by vortexing for 30 s followed by centrifugation (2 min at 5000g). The top layer, representing the hexane phase, was removed for GC-MS analysis. For the analysis of pre-sodorifen pyrophosphate, enzyme assays were performed as described above except that FPP was incubated only with FPPMT at 37 °C for 3 h 30 min. Thereafter, proteins in the reaction mix were precipitated by the addition of 50% (v/v) acetonitrile (Carl Roth, Karlsruhe, Germany) and the reaction mixture was filtered using a 10 kDa molecular-mass cut-off Amicon Ultra filter (Merck Millipore, Darmstadt, Germany). The filtrate was lyophilized, reconstituted in 700 µL of acetonitrile/water 7:3, and analyzed by LC-MS.
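The dilution of the assay buffer in the final reaction volume can be worked out with a short calculation. The sketch below assumes that the concentrations given in parentheses refer to the buffer stock and that 50 μL of this stock is diluted to a total volume of 200 μL; this reading of the protocol is an assumption, and the function name is hypothetical.

```python
def final_concentration(stock, volume_added_ul, total_volume_ul):
    """Concentration after dilution: c_final = c_stock * V_added / V_total."""
    return stock * volume_added_ul / total_volume_ul


# Buffer composition as listed in the text, interpreted here as stock concentrations.
buffer_stock = {
    "HEPES-KOH (mM)": 250.0,
    "MgCl2 (mM)": 100.0,
    "MnCl2 (mM)": 2.5,
    "glycerol (% v/v)": 50.0,
}

# 50 µL of buffer in a 200 µL reaction (assumed 1:4 dilution).
final = {
    name: final_concentration(conc, 50.0, 200.0)
    for name, conc in buffer_stock.items()
}
for name, conc in final.items():
    print(f"{name}: {conc}")
```

Under this assumption the reaction would contain a quarter of each listed buffer concentration.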
GC-MS analysis. The volatile compounds were analyzed with a Shimadzu GC-MS-QP500 or QP2010 system (Kyoto, Japan) with a CTC autosampler (CTC Analytics, Zwingen, Switzerland) equipped with a DB5-MS column (60 m × 0.25 mm × 0.25 μm; J&W Scientific, Folsom, California, USA). Samples of 1 μL were injected at 200 °C using splitless mode. Helium was used as carrier gas at a flow rate of 1.1 mL/min. A temperature gradient was applied by starting from 35 °C for 2 min followed by an increase of 10 °C/min to 280 °C within 24.5 min, followed by 15 min at 280 °C. Electron ionization at 70 eV was used. Mass spectra were obtained using the scan mode (with m/z 40-280). Data were analyzed using the Lab Solution software (Shimadzu, Duisburg, Germany). Compound identity was confirmed by comparison of the mass spectra and GC retention times with those of sodorifen and pre-sodorifen.
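The oven temperature program described above can be written as a small piecewise function, which also confirms the stated ramp duration of 24.5 min. This is a sketch assuming an ideal linear ramp; the function and constant names are hypothetical, and the total run time is derived from the stated segments rather than reported in the text.

```python
INITIAL_TEMP_C = 35.0      # starting temperature
INITIAL_HOLD_MIN = 2.0     # initial isothermal hold
RAMP_RATE_C_PER_MIN = 10.0
FINAL_TEMP_C = 280.0
FINAL_HOLD_MIN = 15.0      # final isothermal hold

RAMP_MIN = (FINAL_TEMP_C - INITIAL_TEMP_C) / RAMP_RATE_C_PER_MIN
TOTAL_RUN_MIN = INITIAL_HOLD_MIN + RAMP_MIN + FINAL_HOLD_MIN


def oven_temperature(t_min):
    """Oven temperature in °C at time t_min, assuming an ideal linear ramp."""
    if not 0.0 <= t_min <= TOTAL_RUN_MIN:
        raise ValueError("time outside the temperature program")
    if t_min <= INITIAL_HOLD_MIN:
        return INITIAL_TEMP_C
    if t_min <= INITIAL_HOLD_MIN + RAMP_MIN:
        return INITIAL_TEMP_C + RAMP_RATE_C_PER_MIN * (t_min - INITIAL_HOLD_MIN)
    return FINAL_TEMP_C


print(RAMP_MIN, TOTAL_RUN_MIN, oven_temperature(12.0))
```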
LC-MS analysis. LC-MS analysis was performed using a Nexera X2 liquid chromatograph (Shimadzu Corporation, Kyoto, Japan) coupled to an AB Sciex QTRAP 5500 mass spectrometer (AB Sciex GmbH, Darmstadt, Germany). Data were analyzed using the Analyst Instrument and Data Processing Software Version 1.6.3. Proteins in the enzyme assay reaction mix were precipitated by the addition of 50% acetonitrile and removed by ultrafiltration. FPP, PSPP, and SAM were separated by ion-pair chromatography according to Balcke et al. 76 . Briefly, the samples were separated on a Nucleoshell RP18, 2.7 µm column (150 × 2 mm) (Macherey-Nagel, Düren, Germany) with a linear gradient of 10 mM aqueous tributylamine (eluent A) adjusted to pH 6.