Characterization of 10-Hydroxygeraniol Dehydrogenase from Catharanthus roseus Reveals Cascaded Enzymatic Activity in Iridoid Biosynthesis

Catharanthus roseus [L.] is a major source of the monoterpene indole alkaloids (MIAs), which are of significant interest due to their therapeutic value. These molecules are formed through an intermediate, cis-trans-nepetalactol, a cyclized product of 10-oxogeranial. One of the key enzymes involved in the biosynthesis of MIAs is an NAD(P)+-dependent oxidoreductase system, 10-hydroxygeraniol dehydrogenase (Cr10HGO), which catalyses the formation of 10-oxogeranial from 10-hydroxygeraniol via 10-oxogeraniol or 10-hydroxygeranial. This work describes the cloning and functional characterization of Cr10HGO from C. roseus and its role in iridoid biosynthesis. Substrate specificity studies indicated that Cr10HGO is more active on 10-hydroxygeraniol, 10-oxogeraniol and 10-hydroxygeranial than on monohydroxy linear terpene derivatives. Further, incubation of 10-hydroxygeraniol with Cr10HGO and iridoid synthase (CrIDS) in the presence of NADP+ yielded a major metabolite, which was characterized as (1R, 4aS, 7S, 7aR)-nepetalactol by comparing its retention time and mass fragmentation pattern, and by co-injection studies, with those of the synthesized compound. These results indicate that Cr10HGO acts in concert with iridoid synthase in the formation of (1R, 4aS, 7S, 7aR)-nepetalactol, an important intermediate in iridoid biosynthesis.

Monoterpene indole alkaloids (MIAs) are a multifarious class of natural products with distinct chemical and biological properties [1][2][3]. To date, over 3000 MIAs are known, with diverse structures and biological activities. C. roseus, a plant of the Apocynaceae family, is a rich source of the iridoid-derived MIAs and is known to contain over 200 alkaloids in various tissues. Two MIAs from this plant, vincristine and vinblastine, are widely prescribed as potent anti-cancer agents 4,5. These MIAs are synthesized from the condensation of tryptamine and the iridoid monoterpene secologanin. MIA biosynthesis diverges from the isoprenoid biosynthetic pathway at the 1'-4 chain elongation intermediate geranyl diphosphate (GPP), formed through head-to-tail condensation of isopentenyl diphosphate (IPP) with dimethylallyl diphosphate (DMAPP), catalyzed by geranyl diphosphate synthase (CrGDS) 6. Geraniol synthase (CrGS) 7 hydrolyses GPP into geraniol, which undergoes hydroxylation at C10 to form 10-hydroxygeraniol by the cytochrome P450 system, geraniol 10-hydroxylase (CrG10H) 8 (Fig. 1). Feeding experiments with labelled 10-hydroxygeraniol, 10-hydroxynerol and iridodiol in C. roseus and Lonicera morrowii suspension cultures clearly indicated that 10-hydroxygeraniol is oxidized to 10-oxogeranial by the oxidoreductase system, 10-hydroxygeraniol dehydrogenase (10HGO) [9][10][11][12] (Fig. 1). Recently, a short-chain reductive cyclase, iridoid synthase (CrIDS) 13, which cyclises 10-oxogeranial into an equilibrium mixture of cis-trans-nepetalactol and iridodials, has been characterized. The bicyclic compound, cis-trans-nepetalactol, is the key intermediate involved in the biosynthesis of a structurally diverse array of MIAs. Experiments using labelled intermediates indicated that one of the committed steps during the biosynthesis of iridoids is the oxidation of 10-hydroxygeraniol to its dialdehyde cognate, 10-oxogeranial 11. Ikeda et al. 14 purified an NADP+-dependent oxidoreductase from Rauwolfia serpentina cells which could convert 10-hydroxygeraniol into 10-oxogeraniol, 10-hydroxygeranial and 10-oxogeranial. However, it was found to have better activity on nerol and geraniol. The present work describes the cloning and functional characterization of Cr10HGO and a study of the orchestration of Cr10HGO activity with that of CrIDS in the biosynthesis of the desired (1R, 4aS, 7S, 7aR)-nepetalactol (Fig. 1). Substrate specificity studies also indicated that Cr10HGO prefers 10-hydroxygeraniol, 10-oxogeraniol and 10-hydroxygeranial over monohydroxy linear terpene derivatives.

Results and Discussion
Transcriptome Analysis. RNA sequencing is a powerful technique for profiling the transcriptome because of its high throughput, accuracy and reproducibility. In plants, high-throughput RNA sequencing has accelerated the discovery of novel genes, the study of transcription patterns, and functional analysis. In the present work, we performed RNA sequencing on an Illumina GAII Analyzer and screened for unigenes involved in the biosynthesis of secologanin in Catharanthus roseus. The raw RNA-seq paired reads from stem, root and leaf RNA were deposited with NCBI (Accession ID: SRR1693842) and were assembled using Velvet_1.1.05 (Oases).
Various approaches for functional annotation of the assembled transcripts were used to identify the genes involved in MIA biosynthesis in C. roseus. All 62,352 putative unigenes obtained were compared with the manually curated KEGG (Kyoto Encyclopedia of Genes and Genomes) databases of Arabidopsis thaliana (thale cress) and Oryza sativa japonica (Japanese rice) for functional annotation of genes by bidirectional BLAST. A total of 4335 unigenes were assigned a KEGG Orthology (KO) number, representing 327 KEGG pathways that cover the majority of plant biochemical pathways, including metabolism, cellular processes and genetic information processing. All the unique transcripts (62,352) were submitted to Virtual Ribosome-V1.1 to predict the ORF of maximum length for each unigene in all six reading frames. A total of 62,290 unigenes (99.9%) were identified as having an ORF starting at the ATG codon, of which 22,224 unigenes (35.64%) contained an ORF of ≥70 amino acids in length. To identify protein domain architecture, these 22,224 unigenes were submitted for Pfam analysis against the PfamA database. The transcripts for various Pfam domain matches were identified and run individually through BLAST to identify ORFs related to the mevalonate (MVA) pathway, the methylerythritol phosphate (MEP) pathway and the secologanin biosynthetic pathway. A summary of the ORFs identified is shown in Supplementary Table S1.
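The six-frame ORF prediction step described above can be illustrated with a minimal Python sketch. It is not the Virtual Ribosome implementation: the function names and the handling of edge cases are assumptions made for illustration. In particular, this sketch only counts ORFs that start at ATG and end at an in-frame stop codon, and it ignores ORFs that run off the end of the transcript. The 70-residue cutoff mirrors the threshold quoted in the text.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_orf(seq, min_aa=70):
    """Longest ATG-initiated, stop-terminated ORF over all six frames.

    Returns the protein sequence, or None if the best ORF is shorter
    than min_aa residues. Codons with ambiguous bases translate to 'X'.
    """
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            protein = None  # None means "not currently inside an ORF"
            for pos in range(frame, len(strand) - 2, 3):
                aa = CODON_TABLE.get(strand[pos:pos + 3], "X")
                if protein is None:
                    if aa == "M":
                        protein = "M"
                elif aa == "*":
                    if len(protein) > len(best):
                        best = protein
                    protein = None
                else:
                    protein += aa
    return best if len(best) >= min_aa else None
```

Applied to each unigene in turn, a filter such as `longest_orf(transcript) is not None` would reproduce the kind of length-based selection described above, under the stated simplifications.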
Cloning and functional characterization of Cr10HGO. An open reading frame (ORF) of 1083 bp (Supplementary Fig. S1), encoding a polypeptide of 360 amino acids (Supplementary Fig. S2), displaying 78% sequence identity with both cinnamyl alcohol dehydrogenase from Populus trichocarpa (GenBank ID: ACC63874 15) and geraniol dehydrogenase from Ocimum basilicum (GenBank ID: Q2KNL6 16) [...] Gas chromatography and mass spectrometric analyses of the reaction products after incubation of purified Cr10HGO protein with 10-hydroxygeraniol in the presence of NADP+ showed the formation of 10-oxogeranial, along with 10-oxogeraniol and 10-hydroxygeranial as minor products (Fig. 2a and Supplementary Fig. S3). The formation of these products was further confirmed by comparing the fragmentation patterns and by co-injection studies using the corresponding synthesized compounds, 10-oxogeranial, 10-oxogeraniol and 10-hydroxygeranial. Further, Cr10HGO efficiently converted 10-oxogeraniol and 10-hydroxygeranial into 10-oxogeranial in the presence of NADP+. However, when NADPH was used as cofactor, 10-hydroxygeraniol was found to be the major enzymatic product with the substrates 10-oxogeraniol, 10-hydroxygeranial and 10-oxogeranial (Fig. 2), indicating that the Cr10HGO-mediated reaction (Fig. 3) is reversible. The NADP+-dependent oxidoreductase purified from R. serpentina catalyzes dehydrogenation of nerol and geraniol more efficiently than that of 10-hydroxygeraniol 14. Similarly, the oxidoreductase purified from Nepeta racemosa 18 also showed better activity towards geraniol, nerol and 10-hydroxynerol than towards 10-hydroxygeraniol. The recently reported 8-HGO, which encodes the NADP+-dependent oxidoreductase from C. roseus, carries out the dehydrogenation of 10-hydroxygeraniol and also of other acyclic monoterpenes 19, but does not share much sequence similarity with Cr10HGO.
In contrast to these observations, monohydroxy terpene derivatives such as geraniol, nerol and farnesol were found to be poor substrates for Cr10HGO as compared to the reported 8-HGO 19 (Supplementary Table S2 and Supplementary Fig. S4). Studies on the effect of temperature on the Cr10HGO-mediated reaction revealed that Cr10HGO activity was optimal at 30 °C. The apparent Km values were found to be 1.50 mM for 10-hydroxygeraniol and 1.0 mM for 10-oxogeraniol and 10-hydroxygeranial at saturating concentrations of NADP+ (Supplementary Figures S5-S11, Supplementary Table S3). The kinetic studies also indicated that, of NAD+/(H) and NADP+/(H), the latter was the preferred coenzyme for Cr10HGO (Table 1). [...] Fig. S13), with a calculated molecular weight of 43.69 kDa. This polypeptide sequence showed high similarity with the reported IDS sequence but was shorter by six nucleotides (nucleotides 47-52 missing), thereby causing a deletion of two amino acids (Pro and Asn). Nevertheless, this shorter CrIDS efficiently carried out the reductive cyclization of 10-oxogeranial into four compounds (Supplementary Fig. S14). The GC and GC-MS analyses of the assay mixture extracts of 10-oxogeranial with CrIDS, in the presence of NADPH, indicated the presence of an equilibrium mixture of cis-trans-nepetalactol and iridodials (Rt: 8.5, 8.7, 8.8 and 9.2 min) (Fig. 4c), in line with the earlier study 13.
Cascaded enzyme activity of Cr10HGO with CrIDS. Surprisingly, when 10-hydroxygeraniol was incubated with Cr10HGO and CrIDS for 30 min in the presence of NADP+, it yielded a major metabolite (Rt: 9.2 min) (Fig. 4b). The major metabolite formed was identified as cis-trans-nepetalactol with stereochemistry 4aS, 7S, 7aR by comparing the retention time and by co-injection studies in GC and GC-MS analyses with the synthesized diastereomeric mixture of cis-trans-nepetalactols [containing (1R, 4aS, 7S, 7aR)-nepetalactol as the major diastereomer] 20-22, which arises from the asymmetry at carbon 1. These two diastereomers were not resolved under the GC conditions with the chiral capillary column. However, the acetylated diastereomers were separated under similar GC conditions (Supplementary Fig. S15). The stereochemistry of the major enzymatic product formed was determined as (1R, 4aS, 7S, 7aR)-nepetalactol by acetylation of the assay mixture, followed by GC and GC-MS analyses and comparison of retention times and co-injection studies with the acetylated derivatives of the synthesized nepetalactol mixture containing (1R, 4aS, 7S, 7aR)-nepetalactol as the major diastereomer 21 (Supplementary Figures S15, S16 and S17).
Further, prolonged incubation (beyond 30 min) of 10-hydroxygeraniol with Cr10HGO and CrIDS in the presence of NADP+ led to the formation of the open structures of iridodials in equilibrium with cis-trans-nepetalactol (Supplementary Fig. S18). The formation of a major metabolite by the combined action of the two enzymes clearly indicates the concerted enzymatic action of Cr10HGO and CrIDS in the formation of the desired (1R, 4aS, 7S, 7aR)-nepetalactol, an important intermediate in iridoid and MIA biosynthesis. As both Cr10HGO and CrIDS are cytoplasmic enzymes, the products of Cr10HGO [10-oxogeranial and NAD(P)H] will presumably be used by CrIDS to synthesize (1R, 4aS, 7S, 7aR)-nepetalactol, indicating a physiological enzyme cascade.
Cloning and Functional Characterization of CrGDS, CrGS and CrG10H. As part of ongoing efforts to elucidate the MIA biosynthetic pathway in C. roseus and to explore MIA production in heterologous systems, the full-length unigenes that ranked highly against known GDS and GS from various sources were used for the cloning and functional characterization of CrGDS and CrGS from C. roseus. CrG10H was cloned using primers designed from the reported gene encoding geraniol hydroxylase in C. roseus 23, as unigene 742 had a sequence similar to the reported one (Supplementary Table S4). These genes were cloned in various expression vectors compatible with E. coli or yeast systems, and the expressed proteins were purified by Ni-NTA chromatography, except CrG10H, which remained in the microsomal pellet when expressed in the yeast system (Supplementary Figures S19-S27). While we were characterizing CrGDS and CrGS from C. roseus, similar studies were reported elsewhere 6,7.
To understand the combinatorial action of these enzymes on their cognate substrates, we incubated dimethylallyl diphosphate (DMAPP) and isopentenyl diphosphate (IPP) with purified CrGDS and CrGS proteins and the yeast-expressed microsomal pellet containing CrG10H, in the presence of coenzymes, which yielded 10-hydroxygeraniol as the enzymatic product (Supplementary Figures S28 and S29). Furthermore, GC and GC-MS analyses showed that incubation of geraniol with a combination of CrG10H, Cr10HGO and CrIDS led to the formation of (1R, 4aS, 7S, 7aR)-nepetalactol (Supplementary Figures S30 and S31).
These experiments suggest in vivo proximity and "cross talk" of the enzymes involved in the biosynthetic cascade leading to the formation of the desired product. As CrG10H is localized on the endoplasmic reticulum, the product of this enzyme, 10-hydroxygeraniol, might be sequestered out to the subsequent enzymes of the MIA pathway. These observations are further supported by studies on the cellular localization of MIA biosynthetic pathway enzymes in a particular cell type in C. roseus [24][25][26][27]. It also appears that the formation of 10-hydroxygeraniol through the CrG10H-catalyzed reaction is the rate-limiting step for iridoid biosynthesis.
Concluding Remarks. Cloning and functional characterization of the 10-hydroxygeraniol dehydrogenase (Cr10HGO) system from C. roseus indicated that Cr10HGO prefers 10-hydroxygeraniol, 10-oxogeraniol and 10-hydroxygeranial over monohydroxy linear terpene derivatives. Concerted enzymatic function in the biosynthesis of cis-trans-nepetalactol has been demonstrated using 10-hydroxygeraniol and NADP+ in a combined Cr10HGO and CrIDS assay system. The stereochemistry of the enzymatic product was determined to be (1R, 4aS, 7S, 7aR)-nepetalactol, which is a key intermediate in the biosynthesis of iridoids and MIAs. Further, we have demonstrated the in vitro formation of (1R, 4aS, 7S, 7aR)-nepetalactol when geraniol was incubated with CrG10H, Cr10HGO and CrIDS.

Methods
Plant Sample Source and Strains used. The various tissues (leaves, stem and roots) were collected from C. roseus plants of height 26 cm above the ground, grown in a greenhouse. The tissues were flash-frozen in liquid nitrogen and stored at −80 °C until used. A summary of the strains and plasmids used in the study is given in Supplementary Table S4.
Isolation of Total RNA and cDNA Synthesis. Total RNA was extracted from the leaf, stem and root tissues of C. roseus using the Spectrum™ Plant Total RNA Isolation Kit from Sigma (Supplementary Fig. S32). These RNA samples were utilised for transcriptome sequencing (all combined in 1:1 ratios), and for construction of cDNA using the SuperScript III RT Kit from Invitrogen.
Transcriptome Sequencing and Assembly. Briefly, mRNA was purified from 1 µg of intact total RNA using oligo-dT beads (TruSeq RNA Sample Preparation Kit, Illumina). The purified mRNA was fragmented for 2 minutes at elevated temperature (94 °C) in the presence of divalent cations and reverse transcribed with SuperScript II reverse transcriptase by priming with random hexamers. Second-strand cDNA was synthesized in the presence of DNA polymerase I and RNase H. The cDNA was cleaned up using Agencourt AMPure XP SPRI beads (Beckman Coulter). Illumina adapters were ligated to the cDNA molecules after end repair and addition of a base. SPRI cleanup was performed after ligation. The library was amplified using 8 cycles of PCR for enrichment of adapter-ligated fragments. The prepared library was quantified using a NanoDrop and validated for quality by running an aliquot on a High Sensitivity Bioanalyzer Chip (Agilent).
A total of 15.49 million raw reads were generated with a length of 70 bp. Adapter trimming and low-quality trimming were performed throughout the sequence to obtain better-quality reads. High-quality reads (>20 Phred score) were then used for de novo assembly with varying hash lengths. The 12,409,039 reads (80.11%) obtained were assembled into 53,544 contigs with an optimized hash length of 49, having an average contig length of 1594.83 bp and an N50 value of 2485. These contigs were submitted as inputs to Oases_0.2.01 to generate 70,779 transcripts having an N50 value of 2355 and an average transcript length of 1457.98 bp. These transcripts were further subjected to cluster and assembly analysis using CD-HIT to remove redundancy, which resulted in a total of 62,352 unique transcripts with an average size of 1024 bp and an N50 value of 2375.
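For readers unfamiliar with the N50 statistic quoted for the contigs and transcripts, the following Python sketch shows one common way to compute it from a list of sequence lengths. The function name is illustrative and the code is not taken from Velvet, Oases or CD-HIT; it uses the usual definition, in which N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or transcript lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches half of the summed length; the length of
    the sequence at which this happens is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no sequence lengths given")
```

For example, for lengths 2 to 10 bp (total 54 bp), the three longest sequences (10 + 9 + 8 = 27 bp) already cover half of the total, so the N50 is 8.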
Cloning and Expression of Cr10HGO, CrGDS, CrGS, CrG10H and CrIDS. The full-length transcripts which showed high ranking with known GDS, GS, G10H, 10HGO and the reductive cyclization enzymes from various sources were used for the [...] Tables S5 and S6).
Yeast Expression and Microsome Preparation (CrG10H). Expression of active protein was carried out in INVSc1 yeast competent cells. Cells were grown overnight in synthetic complete medium without uracil (SC-U), containing 2% glucose, at 30 °C, then transferred to induction medium (SC-U, containing 20% galactose) and further incubated at 30 °C for 12 hours. The cells were centrifuged at 3000 × g for 10 minutes at 4 °C. The cell pellet obtained was washed with TEK buffer (50 mM Tris-HCl, 1 mM EDTA, pH 7.4, 100 mM KCl) (1 mL/g of cell pellet × 3) and centrifuged. The cell pellet (1 g/5 mL) was re-suspended in 50 mM Tris-HCl buffer (containing 1 mM EDTA, 600 mM sorbitol, 5 mM DTT and 0.25 mM PMSF, pH 7.4) and cells were lysed using a bead-beater (with acid-washed glass beads, 425-600 µm) for 6 cycles (pulse on 30 sec, pulse off 30 sec, manual rocking for 3 × 30 sec). The lysed cells were centrifuged at 1000 × g for 5 min at 4 °C to remove the glass beads. Further, the supernatant was subjected to centrifugation at 10,000 × g for 30 min at 4 °C. The 10,000 × g supernatant was centrifuged at 100,000 × g for 1 hr 30 min at 4 °C. The microsomal pellet thus obtained was suspended in TEG buffer (50 mM Tris-HCl, 1 mM EDTA, 30% glycerol, pH 7.5) and homogenized. The homogenized microsome fraction was aliquoted (0.2 mL), flash-frozen in liquid nitrogen and stored at −80 °C.
Chemical Synthesis. Geraniol acetate, citral, 10-oxoneryl acetate and (S)-citronellol were purchased from Sigma-Aldrich. The compounds 2, 3, 4, 5, 12 and 16 (Supplementary Schemes S1 and S2) were synthesized according to the procedures described earlier 21,22. Nepetalactol (11) was prepared according to the procedure described by Beckett et al. 20. Compound 4 was found to contain ~24% of 3. All the spectral data recorded for these compounds were in accordance with those reported in the literature (Supplementary Figures S35-S52).
Product Ratio Studies of Cr10HGO. To 0.1 mg of protein in 0.5 mL of sodium bicarbonate buffer (20 mM sodium bicarbonate, 10% v/v glycerol, pH 10.0) containing NADP+, 0.2 mM of 10-hydroxygeraniol was added and the mixture was incubated at 30 °C for 30 minutes. The mixture was then extracted thrice with dichloromethane (CH2Cl2). The combined organic phase was dried over sodium sulphate, reduced to ~50 µL with a stream of dry nitrogen and subjected to GC and GC-MS analyses. Assays with various other substrates (Supplementary Table S2) were carried out with Cr10HGO under identical conditions.
Determination of Kinetic Parameters. Steady-state kinetics was performed in 20 mM sodium bicarbonate, 10% v/v glycerol, pH 10.0, at 30 °C with varying substrate concentrations, ranging from 0.25 to 500.0 mM, with a saturating concentration of the cofactor NADP+ (500.0 mM), and vice versa. The reactions were followed by measuring changes in NADPH concentration at 340 nm. The kinetic data were fitted with the GraphPad Prism software and the parameters calculated using Michaelis-Menten plots. Similarly, kinetic parameters for 10-hydroxygeraniol (with NAD+), 10-oxogeraniol (with NADP+ or NADPH), 10-oxogeranial (with NADPH), 10-hydroxygeranial (with NADP+ or NADPH) and 10-hydroxynerol (with NADP+) were determined.
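The kinetic analysis above can be sketched in a few lines of Python. This is an illustration only and not the GraphPad Prism procedure used in the study: it swaps the nonlinear fit for a Hanes-Woolf linearisation (S/v = S/Vmax + Km/Vmax) solved by ordinary least squares, so that no external libraries are needed. The function names are hypothetical. The NADPH molar absorption coefficient of 6220 M^-1 cm^-1 at 340 nm is the commonly used literature value, and the 1 cm path length is an assumption.

```python
NADPH_EXTINCTION = 6220.0  # M^-1 cm^-1 at 340 nm (commonly used value)


def rate_from_absorbance(delta_a340_per_min, path_cm=1.0):
    """Convert an absorbance slope (A340 per min) into micromolar NADPH per min."""
    return delta_a340_per_min / (NADPH_EXTINCTION * path_cm) * 1e6


def michaelis_menten(s, vmax, km):
    """Michaelis-Menten rate at substrate concentration s."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Estimate (Vmax, Km) from paired substrate concentrations and rates.

    Fits the Hanes-Woolf line S/v = S/Vmax + Km/Vmax by least squares.
    Km is returned in the same concentration units as the input.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km
```

With noise-free rates generated from `michaelis_menten`, `fit_hanes_woolf` recovers the input parameters. With real data, a weighted nonlinear fit is generally preferable because linearisations distort the error structure.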
Combined Assays. The combined assay of CrGDS, CrGS and CrG10H was carried out by adding 0.1 mg of each purified protein (microsomal pellet in the case of CrG10H) to an assay mixture containing IPP (0.1 mM), DMAPP (0.1 mM), NADPH (1 mM), glucose 6-phosphate (2.5 mM), glucose 6-phosphate dehydrogenase (1 U), FAD (10 mM) and FMN (10 mM) in buffer (100 mM K2HPO4, 50 mM MOPSO, pH 7.6, 1 mM EDTA, 1 mM DTT, 10 mM MgCl2, 0.1 mM MnCl2), and incubating on a metabolic shaker for 3 hours at 50 rpm. After this incubation period, the aqueous phase was extracted three times with 0.5 mL of dichloromethane. The combined organic phase was dried over sodium sulphate, concentrated and subjected to GC and GC-MS analyses. The Cr10HGO and CrIDS combined assay was carried out by adding 0.1 mg of each purified protein to an assay mixture containing 0.2 mM 10-hydroxygeraniol and 0.2 mM NADP+ in assay buffer (20 mM MOPS, pH 7.0, 10% v/v glycerol) and incubating at 30 °C on a rotary shaker. Incubations were carried out for different time intervals (30 min to 6 hours). After this incubation period, the assay samples were extracted thrice with 0.5 mL of dichloromethane. Similarly, the combined assay for CrG10H, Cr10HGO and CrIDS was carried out in 2 mL of assay buffer (100 mM K2HPO4, 50 mM MOPS, pH 7.6, 1 mM EDTA, 1 mM DTT, 10 mM MgCl2, 0.1 mM MnCl2) containing the required cofactors and geraniol as substrate.
Product Analysis. 1 µL of the extract was injected onto i) a 30 m × 0.25 mm × 0.25 µm HP-5 capillary GC column with a temperature gradient from 60 to 120 °C at 20 °C per min, followed by a temperature gradient from 120 to 170 °C at 2.5 °C per min and a final temperature gradient from 170 to 190 °C at 20 °C per min (program 1), or ii) a 30 m × 0.25 mm × 0.12 µm Astec CHIRALDEX™ B-DA capillary column with a temperature gradient from 60 to 100 °C at 4 °C per min, followed by a temperature gradient from 60 to 160 °C at 1 °C per min, followed by a temperature gradient from 160 to 215 °C at 10 °C per min (program 2), or iii) a 30 m × 0.25 mm × 0.12 µm Astec CHIRALDEX™ B-DA capillary column with a temperature gradient from 60 to 160 °C at 1 °C per min, followed by a temperature gradient from 160 to 215 °C at 10 °C per min (program 3). Nitrogen was used as the carrier gas at a flow rate of 1 mL/min. Analyses by GC-MS were carried out under similar conditions at a helium flow rate of 1 mL/min. 1H and 13C NMR spectra in CDCl3 were recorded at 400.13 and 100.63 MHz, respectively. Chemical shifts are given in δ-values relative to TMS (tetramethylsilane) as internal standard. Exact molecular mass and molecular formula determinations were recorded using a Q Exactive Orbitrap spectrometer.
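As a quick consistency check on the oven programs, the total ramp time of a GC temperature program can be computed from its segments. The Python sketch below encodes programs 1 and 3 exactly as stated above; the function name is illustrative, and the calculation assumes no isothermal hold times, since none are given in the text.

```python
def ramp_minutes(segments):
    """Total ramp time in minutes for a GC oven program.

    Each segment is (start_C, end_C, rate_C_per_min). Hold times are
    not modelled because the text does not state any.
    """
    return sum(abs(end - start) / rate for start, end, rate in segments)


# Program 1 (HP-5 column) and program 3 (chiral column), as described in the text.
PROGRAM_1 = [(60, 120, 20), (120, 170, 2.5), (170, 190, 20)]
PROGRAM_3 = [(60, 160, 1), (160, 215, 10)]
```

Under these assumptions, program 1 ramps for 3 + 20 + 1 = 24 min and program 3 for 100 + 5.5 = 105.5 min.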