Functional Characterization of Novel Sesquiterpene Synthases from Indian Sandalwood, Santalum album

Indian sandalwood, Santalum album L., is highly valued for its fragrant heartwood oil, which is dominated by a blend of sesquiterpenes. Sesquiterpenes are formed through cyclization of farnesyl diphosphate (FPP), catalyzed by metal-dependent terpene cyclases. This report describes the cloning and functional characterization of five genes, identified by transcriptome sequencing of S. album, which encode two sesquisabinene synthases (SaSQS1, SaSQS2), bisabolene synthase (SaBS), santalene synthase (SaSS) and farnesyl diphosphate synthase (SaFDS). Using Illumina next-generation sequencing, 33.32 million raw reads were generated, which were assembled into 84,094 unigenes with an average length of 494.17 bp. Based on the transcriptome sequencing, SaFDS, the sesquisabinene synthases SaSQS1 and SaSQS2, SaBS and SaSS, involved in the biosynthesis of FPP, sesquisabinene, β-bisabolene and santalenes, respectively, were cloned and functionally characterized. The novel sesquiterpene synthases SaSQS1 and SaSQS2 were characterized as isoforms of sesquisabinene synthase with differing kinetic parameters and expression levels. Furthermore, the feasibility of preparative-scale microbial production of sesquisabinene using either SaSQS1 or SaSQS2 in non-optimized bacterial cells has been demonstrated. These results may pave the way for in vivo production of sandalwood sesquiterpenes in genetically tractable heterologous systems.


Prabhakar Lal Srivastava 1, Pankaj P. Daramwar 1, Ramakrishnan Krithika 1, Avinash Pandreka 1,2, S. Shiva Shankar 1 & Hirekodathakallu V. Thulasiram 1,2
Terpenoid or isoprenoid compounds are the most ancient and diverse collection of natural products and are found in all forms of life. Over 70,000 individual structures, containing a remarkable array of carbon skeletons and functional groups, have been reported 1. These structurally and stereochemically distinct molecules play crucial roles in plants, serving as hormones 2, photosynthetic pigments 3, electron carriers 4 and structural components of membranes 5,6, as well as in communication and defense 7. This structurally diverse collection of isoprenoid compounds is constructed from two simple five-carbon building blocks, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), which in turn are synthesized through the mevalonate (MVA) or methylerythritol phosphate (MEP) pathway. The allylic isoprenoid diphosphate (DMAPP) undergoes coupling with one or more IPP molecules, forming the 1′-4 linkage characteristic of head-to-tail condensation, to give the prenyl diphosphates geranyl diphosphate (GPP), farnesyl diphosphate (FPP) and geranylgeranyl diphosphate (GGPP). These short-chain prenyl diphosphates subsequently undergo cyclization reactions catalyzed by terpene cyclases and, ultimately, functional group modifications catalyzed by CYP450 mono-oxygenase systems to generate functionally active terpenoids 8. Farnesyl diphosphate synthase (FDS), a key chain elongation enzyme in isoprenoid biosynthesis, catalyzes the electrophilic condensation of one or two molecules of IPP (C5) with the allylic carbocations generated from the allylic diphosphates GPP (C10) or DMAPP (C5), respectively, to produce FPP (C15), which lies at the juncture of various isoprenoid biosynthetic branches, including sesquiterpene biosynthesis 9. Highly evolvable sesquiterpene synthases catalyze the multistep conversion of (E,E)-FPP into numerous acyclic and cyclic structures through a series of carbocationic rearrangements, alkylations, hydride shifts and cyclizations, which are initiated by ionization of the diphosphate anion, resulting in the generation of a reactive farnesyl carbocation 10,11. To date, over 7000 sesquiterpene molecules with more than 300 stereochemically distinct hydrocarbon skeletons have been reported 12. Although sesquiterpenes are traditionally used as flavors and fragrances, they are also known to possess various biological properties, including anticancer 13 and antimalarial 14 activities. In recent years, sesquiterpenes with farnesene, bisabolene and sabinene skeletons have been recognized as replacements for petroleum-derived jet-engine fuels 15,16.
In this manuscript, we present the transcriptome sequencing, cloning and functional characterization of genes that encode the enzymes involved in sesquiterpene biosynthesis in Indian sandalwood, S. album. Based on the transcriptome screening, one prenyltransferase (SaFDS) and four sesquiterpene synthases, namely two sesquisabinene synthases (SaSQS1 and SaSQS2), bisabolene synthase (SaBS) and santalene synthase (SaSS), were cloned and functionally characterized. Although sesquisabinene and its hydroxy derivative, sesquisabinene hydrate, have been reported in various natural sources 26,27, to the best of our knowledge there is no report on the isolation and functional characterization of a gene encoding sesquisabinene synthase. Furthermore, we evaluated the microbial production of sesquisabinene from both unigenes in non-optimized bacterial cells and demonstrated the feasibility of metabolic engineering with SaSQS1 and SaSQS2 for preparative-scale production of sesquisabinene in metabolically tractable heterologous systems.

Results and Discussion
RNA sequencing and de novo transcriptome assembly.
The cDNA library of S. album was constructed using mRNA purified from total RNA isolated from the interface of heartwood and sapwood. The resultant cDNA library was amplified by PCR to enrich the adaptor-ligated fragments and sequenced on one lane of the flow cell using paired-end sequencing on an Illumina GAII Analyzer. A total of 33,323,756 raw reads with a length of 150 bp were generated, corresponding to a 10.08 GB sequence data file. Adapter trimming and low-quality trimming were performed to obtain better-quality reads. High-quality reads (Phred score > 20) were then used for de novo assembly with hash lengths varying from 51 to 113. The 26,730,799 reads retained (80.22% of the raw reads; GenBank Accession: SRR1725543) were assembled into 58,221 contigs with an optimized hash length of 59, giving an average contig length of 571.67 bp and an N50 value of 863. These contigs were submitted as input to Oasis_0.2.01 to generate 90,478 transcripts with an N50 value of 695 and an average transcript length of 474.18 bp. These transcripts were further subjected to clustering and assembly analysis using CD-HIT to remove redundancy, which resulted in a total of 84,094 unique transcripts with an average size of 494.17 bp and an N50 value of 717, of which 24,912 transcripts (29.62%) were longer than 500 bp and 9,136 transcripts (10.86%) were longer than 1 kb (Table 1 and Supplementary Fig. S2).
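The assembly statistics quoted above (N50 and average length) can be illustrated with a minimal sketch. The function names and the toy length list are hypothetical and are not taken from the study's pipeline; the N50 definition used here is the common one (the length at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total).

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def mean_length(lengths):
    """Average sequence length in bp."""
    return sum(lengths) / len(lengths)


# Hypothetical toy assembly, not data from this study.
toy_contigs = [100, 200, 300, 400, 500]
print("N50:", n50(toy_contigs))
print("mean length:", mean_length(toy_contigs))
```

Because N50 weights long sequences more heavily than the mean does, an assembly can have an N50 well above its average length, as is the case for the contigs and unigenes reported here.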

Functional annotation.
Functional annotation of unigenes provides valuable insights into the genes involved in specific molecular functions and biological processes. Several approaches to functional annotation of the assembled transcripts were used to identify the genes involved in terpenoid biosynthesis in S. album. All 84,094 putative unigenes were compared with the manually curated KEGG (Kyoto Encyclopedia of Genes and Genomes) database of Arabidopsis thaliana (thale cress) and Oryza sativa japonica (Japanese rice) for functional annotation by bidirectional BLAST 28. KEGG Orthology (KO) numbers were assigned to 4,244 unigenes, representing 298 KEGG pathways covering the majority of plant biochemical pathways, including metabolism, cellular processes and genetic information processing. Thirty unigenes mapped specifically to enzymes involved in terpenoid backbone biosynthesis (Supplementary Fig. S4).
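One common reading of "bidirectional BLAST" is the reciprocal (bidirectional) best-hit criterion: a unigene and a reference gene are paired only if each is the other's top hit. The sketch below assumes that reading; the function name, the dictionary inputs and the identifiers are hypothetical, and it is not the annotation server's actual implementation.

```python
def reciprocal_best_hits(forward, reverse):
    """Return (query, subject) pairs that are each other's best hit.

    forward maps each query ID to its best subject ID;
    reverse maps each subject ID to its best query ID.
    """
    return sorted(
        (query, subject)
        for query, subject in forward.items()
        if reverse.get(subject) == query
    )


# Hypothetical best-hit tables (IDs are placeholders, not study data).
forward_hits = {"unigene_1": "ref_A", "unigene_2": "ref_B", "unigene_3": "ref_A"}
reverse_hits = {"ref_A": "unigene_1", "ref_B": "unigene_9"}
print(reciprocal_best_hits(forward_hits, reverse_hits))
```

Requiring agreement in both directions is stricter than a one-way search, which is consistent with only a small fraction of the unigenes receiving KO numbers.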
All the unique transcripts (84,094) were submitted to Virtual Ribosome-V1.1 to predict the ORF of maximum length for each unigene in all six frames. A total of 83,823 unigenes (99.6%) were identified as having an ORF starting at an ATG codon, of which 17,119 unigenes (20.35% of all unigenes) contained an ORF of ≥ 100 amino acids. To identify protein domain architecture, these 17,119 unigenes were submitted for Pfam analysis against the PfamA database. Of the 17,119 unigenes, only 10,668 could be assigned Pfam IDs. Eighteen unigenes containing the Pfam IDs PF01397 (terpene synthase N-terminal domain), PF03936 (terpene synthase family metal binding domain), PF00432 (prenyltransferase and squalene oxidase repeat) or PF13243 (prenyltransferase like) were selected for screening of terpene synthases. The 10,668 transcripts were also submitted to a megablastx search against the NCBI Nr database and the Swissprot/Uniprot database with an E-value ≤ 10⁻⁵, which resulted in 18 unigenes related to terpene synthases, 72 unigenes for CYP450 monooxygenases and 3 unigenes representing CYP450 reductases.
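The six-frame search for the longest ATG-initiated ORF can be sketched as follows. This is a simplified illustration, not the Virtual Ribosome algorithm itself: it assumes the standard stop codons, counts only ORFs closed by a stop codon, and uses hypothetical function names and a toy sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Longest ATG-to-stop ORF (stop codon included) over all six frames."""
    seq = seq.upper()
    best = ""
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOPS:
                    orf = strand[start:i + 3]
                    if len(orf) > len(best):
                        best = orf
                    start = None  # look for the next ORF in this frame
    return best


def encoded_residues(orf):
    """Number of amino acids encoded, excluding the stop codon."""
    return len(orf) // 3 - 1


# Hypothetical toy sequence, not a unigene from this study.
toy = "CCATGAAATTTGGGTAACC"
orf = longest_orf(toy)
print(orf, encoded_residues(orf))
```

A length filter such as `encoded_residues(orf) >= 100` would then correspond to the ≥ 100 amino acid cut-off described above.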

Screening and isolation of terpene synthases.
From transcriptome screening, five unigenes, SaFDS (Locus_19031_Transcript_1/1_Confidence_1.000_length_453 bp), SaSQS1 (Locus_33105_Transcript_1/1_Confidence_1.000_length_761 bp), SaSQS2 (Locus_8408_Transcript_1/1_Confidence_1.000_length_1016 bp), SaBS (Locus_5558_Transcript_1/1_Confidence_1.000_length_832 bp) and SaSS (Locus_1838_Transcript_1/1_Confidence_1.000_length_438 bp), were identified using BLAST analysis, based on their homology with known terpene synthases reported in the NCBI database. The EST fragment of SaFDS matched the FPP synthase reported from Panax quinquefolius with 88% identity at the amino acid level, but lacked its 5' and 3' sequences. The full-length cDNA sequence of SaFDS was obtained by performing 5' and 3' RACE reactions. The full-length ORF of SaFDS (GenBank Accession: KF011939) is composed of 1029 bp encoding a polypeptide of 342 amino acids with a calculated molecular weight of 39.4 kDa and a pI of 5.26. The deduced amino acid sequence of SaFDS resembled that reported earlier 25. Analysis of the SaFDS amino acid sequence revealed the presence of several highly conserved regions, including two aspartate-rich motifs, FARM (L, X4 LDDxxDxxxxRRG) and SARM (GxxFQxxDDxxD….GK), involved in binding of both the homoallylic (IPP) and allylic diphosphate substrates (GPP and DMAPP) 29-31.
The EST fragments of SaSQS1 and SaSQS2 lacked their 3' sequences. The full-length cDNA sequences of both unigenes were obtained by performing 3' RACE. The ORFs of SaSQS1 (GenBank Accession: KJ665776) and SaSQS2 (GenBank Accession: KJ665777) are each composed of 1701 bp encoding a polypeptide of 566 amino acids, with calculated molecular weights of 65.22 kDa (SaSQS1) and 65.44 kDa (SaSQS2) and pI values of 5.01 (SaSQS1) and 5.10 (SaSQS2). The two sequences are highly similar to each other, with 81.7% identity at the nucleotide level and 82.8% identity and 89.6% similarity at the protein sequence level (Supplementary Fig. S26). The deduced amino acid sequences of SaSQS1 and SaSQS2 matched β-bisabolene synthase from S. austrocaledonicum with 63% identity at the amino acid level.
The missing 3' end sequence of the EST fragment of SaBS was obtained by performing 3' RACE. The ORF of SaBS (GenBank Accession: KJ665778) is composed of 1731 bp encoding a polypeptide of 576 amino acids with a calculated molecular weight of 65.90 kDa and a pI of 5.48. Comparison of the SaBS amino acid sequence with reported terpene synthases showed resemblance to a monoterpene synthase (SamonoTPS) from S. album (99% identity), which was reported to produce traces of β-bisabolene with FPP 32. However, no detailed biochemical characterization of a β-bisabolene synthase from S. album has been reported.
The EST fragment of SaSS lacked its 5' and 3' sequences. To obtain the full-length cDNA sequence of SaSS, 5' and 3' RACE were performed. The ORF of SaSS (GenBank Accession: KF011938) is composed of 1710 bp encoding a polypeptide of 569 amino acids with a calculated molecular weight of 65.16 kDa and a pI of 5.63. The deduced amino acid sequence of SaSS was similar to that previously reported from S. album 25.
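The reported ORF and polypeptide lengths can be cross-checked with simple codon arithmetic: an ORF of n bp that includes its stop codon encodes n/3 − 1 residues. The sketch below applies this check to the five values quoted in the text; the function and variable names are illustrative only, and the assumption that each reported ORF length includes one stop codon is ours.

```python
# ORF lengths (bp) and polypeptide lengths (amino acids) as quoted in the text.
REPORTED = {
    "SaFDS": (1029, 342),
    "SaSQS1": (1701, 566),
    "SaSQS2": (1701, 566),
    "SaBS": (1731, 576),
    "SaSS": (1710, 569),
}


def residues_from_orf(orf_bp):
    """Residues encoded by an ORF whose length includes one stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1


def check_reported(reported=REPORTED):
    """Map each gene to True when its reported lengths agree."""
    return {gene: residues_from_orf(bp) == aa for gene, (bp, aa) in reported.items()}


for gene, consistent in check_reported().items():
    print(gene, "consistent" if consistent else "inconsistent")
```

Under that assumption, all five pairs of reported values are mutually consistent.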
The predicted polypeptide sequences of SaSQS1, SaSQS2, SaBS and SaSS lack an N-terminal organelle-targeting sequence, suggesting that these enzymes are directed to the cytoplasm. The deduced protein sequences of these sesquiterpene synthases share highly conserved residues with known sesquiterpene synthases, including the DDXXD motif 33-35 involved in substrate binding, the (D/N)DXX(S/T)XXXE motif 33,36 essential for metal binding, and the RRX8W motif 12,37, a characteristic feature of the TPS-b subfamily.
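The three conserved motifs named above translate directly into regular expressions, with X read as "any residue". The following is a minimal sketch of such a scan; the function name and the toy peptide are hypothetical, and the toy sequence is not one of the S. album enzymes.

```python
import re

# Lookahead patterns so that overlapping occurrences are also reported.
MOTIFS = {
    "DDXXD": re.compile(r"(?=DD..D)"),
    "(D/N)DXX(S/T)XXXE": re.compile(r"(?=[DN]D..[ST]...E)"),
    "RRX8W": re.compile(r"(?=RR.{8}W)"),
}


def find_motifs(protein):
    """Return 0-based start positions of each motif in a protein sequence."""
    protein = protein.upper()
    return {
        name: [match.start() for match in pattern.finditer(protein)]
        for name, pattern in MOTIFS.items()
    }


# Hypothetical toy peptide built to contain one copy of each motif.
toy = "MRR" + "A" * 8 + "WGG" + "DDLYD" + "GG" + "NDVKSYKVE" + "GG"
print(find_motifs(toy))
```

A pattern match shows only that the residues are present in the right spacing; assigning a role in substrate or metal binding still depends on the structural and biochemical evidence cited in the text.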

Heterologous expression and functional characterization of SaFDS, SaSQS1, SaSQS2, SaBS and SaSS.
The open reading frames of SaFDS, SaSQS1, SaSQS2, SaBS and SaSS were cloned in frame into suitable vectors with an N-terminal His6 tag for affinity purification, under the control of the T7 RNA polymerase promoter, for expression of soluble active protein in E. coli BL21 DE3 or Rosetta 2 DE3 cells (Supplementary Table S2). Recombinant His6-tagged proteins were purified to homogeneity by Ni2+-affinity chromatography with a yield of 10-30 mg/L of bacterial culture.
Incubation of recombinant SaFDS with equimolar concentrations of IPP and GPP, or with a 1:2 molar ratio of DMAPP and IPP, resulted in the formation of (E,E)-FPP, which on subsequent treatment with alkaline phosphatase yielded the hydrolyzed product (E,E)-farnesol. The product from both reactions was characterized as (E,E)-farnesol by GC and GC-MS analyses and co-injection studies using standard (E,E)-farnesol (Supplementary Fig. S27).
To assess the function of recombinant SaSQS1 and SaSQS2, enzyme assays were performed using purified protein with (E,E)-FPP as substrate in the presence of the divalent cation Mg2+. GC-MS analyses of the assay extracts indicated that both enzymes produced a sesquiterpene hydrocarbon (7) with m/z 204 as the predominant (>93%) enzymatic product, with traces of β-sesquiphellandrene (8) (~5%) and an unidentified metabolite (~2%) (Fig. 2a and Supplementary Figs. S35-S38). To characterize the enzymatic product (7), large-scale assays were performed using 80 mg of purified SaSQS1 protein with 60 mg of (E,E)-FPP (Method S1.4). The hexane extract of the assay mixture was subjected to silica gel column chromatography to obtain the pure product. Based on the spectral data, the enzymatic product was identified as sesquisabinene (7), and the data matched well with those of an earlier report 38. GC-MS analyses and GC co-injection studies using an Astec CHIRALDEX™ B-DA chiral capillary column (Supplementary Table S4, program 2) clearly indicated that both SaSQS1 and SaSQS2 catalyzed the cyclization of (E,E)-FPP to sesquisabinene in the presence of Mg2+ (Fig. 3 and Supplementary Fig. S28).
Incubation of recombinant SaBS with FPP produced β-bisabolene (9) as the major product (92%), with the solvated product α-bisabolol (10) contributing the rest of the assay product (8%). The enzymatic products were confirmed by GC-MS fragmentation analysis and GC co-injection studies with authentic standards (Fig. 2b and Supplementary Figs. S46-S49 and S66). GC analysis using an HP-chiral (20% β-cyclodextrin) capillary column 39 clearly indicated that the enzymatic product was a single enantiomer, which was characterized as (S)-β-bisabolene (Fig. 3 and Supplementary Fig. S29).
For the functional characterization of SaSS, enzyme assays were performed using purified recombinant protein with (E,E)-FPP in the presence of Mg2+. GC and GC-MS analyses of the assay extracts indicated the presence of six compounds (Fig. 2c; Rt: 16.1, 16.4, 16.7, 16.8, 17.0 and 17.6 min; Supplementary Table S4, program 1). The product ratios in the GC and GC-MS profiles were in a similar range to those of the earlier report on SaSS 25. Four metabolites eluting at 16.1, 16.4, 16.7 and 17.0 min were identified as α-santalene (1, 41.2 ± 1.0%), β-santalene (2, 29.5 ± 0.4%), epi-β-santalene (3, 4.4 ± 0.0%) and exo-α-bergamotene (4, 21.6 ± 0.6%), respectively, by comparing retention times and mass fragmentation patterns, and by co-injection studies, with purified compounds (Fig. 3 and Supplementary Figs. S39-S45, S65) 17. When the mass fragmentation patterns of the compounds eluting at 16.8 min and 17.6 min were compared with the NIST/Wiley mass spectral library, both matched farnesenes 25. However, when the SaSS assay extract was co-injected with synthesized (E)-β-farnesene (6) and a farnesene mixture 40, only the sesquiterpene at Rt 16.8 min co-eluted with synthesized (E)-β-farnesene, whereas the peak at Rt 17.6 min did not match any of the synthesized farnesenes (Supplementary Fig. S65 H-I). Surprisingly, a mutant of SaSS (Y539W, NCBI: JQ690659) was able to cyclize (E,E)-FPP into the compound with Rt 17.6 min (Supplementary Table S4, program 1) as one of its major products (data not shown). This compound was purified and characterized as exo-β-bergamotene (5) by comparing its spectral data with those of an earlier report 41. In GC and GC-MS co-injection studies with authentic compound 5, the SaSS-catalysed reaction product eluting at 17.6 min was identified as exo-β-bergamotene (5) (Supplementary Fig. S65 J-K).
All these sesquiterpenes (1-9) (Fig. 3) are formed through carbocationic cascade reactions and intramolecular cyclizations involving Wagner-Meerwein rearrangements of the farnesyl carbocation. Interestingly, when IPP and DMAPP/GPP were sequentially incubated with SaFDS and SaSS (Methods section), the ratios of the sesquiterpenes formed (1-6) were in the same range as those observed on incubating (E,E)-FPP with SaSS (Fig. 2c). Similarly, incubation of IPP and DMAPP/GPP in the presence of SaFDS and SaSQS1/SaSQS2/SaBS led to the formation of the respective sesquiterpenes, sesquisabinene or bisabolene (Supplementary Fig. S30 I-II). These results indicate that the combined assay strategy, using sequential catalysis by SaFDS and SaSS/SaSQS1/SaBS together with appropriately deuterium-labelled IPP and/or DMAPP/GPP, can be used to gain insights into the mechanisms involved in the biosynthesis of the corresponding sesquiterpenes.
In vivo production of sesquisabinene in microbial host.
Microbial production of sesquisabinene with both sesquiterpene synthases (SaSQS1 and SaSQS2) was performed using the in vivo expression system pETDuet-1:SaFDS:SaSQS1/SaSQS2 in C41DE3 cells containing the pRARE plasmid. GC and GC-MS analyses of both the cell pellet and broth extracts indicated the presence of sesquisabinene (7) in bacterial cells harbouring SaSQS1 and SaSQS2, whereas 7 was not detected in the bacterial culture containing the empty vector. Using a standard curve for sesquisabinene, GC-FID quantification of 7 was carried out under similar conditions for both pellet and broth extracts. Sesquisabinene production by SaSQS1 was ~3 mg/L of bacterial culture, whereas the yield was 1 mg/L for the SaSQS2 culture (Fig. 4). Traces of 7 were detected in the pellet extracts. The difference in sesquisabinene yields between the two synthases correlated well with the kinetic parameters of SaSQS1 and SaSQS2 (Table 2).
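Quantification against a standard curve amounts to fitting detector response to known amounts of the standard and inverting the fit for the sample. The sketch below assumes a linear GC-FID response and uses ordinary least squares; the calibration points and peak area are hypothetical placeholders, not measurements from this study, and the function names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_area(area, slope, intercept):
    """Invert the calibration line to estimate the amount from a peak area."""
    return (area - intercept) / slope


# Hypothetical calibration: standard amounts vs. peak areas (arbitrary units).
standard_amounts = [1.0, 2.0, 3.0, 4.0]
peak_areas = [10.0, 20.0, 30.0, 40.0]
slope, intercept = fit_line(standard_amounts, peak_areas)
print(amount_from_area(25.0, slope, intercept))
```

Converting the estimated amount in an extract to mg/L of culture additionally requires the extract and culture volumes, which are not given in this section.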

Phylogenetic analysis.
A neighbour-joining phylogenetic tree placed all the full-length sequences of sesquiterpene synthases isolated from S. album in a separate clade. Santalene synthase (SaSS), a moderately promiscuous enzyme, branches separately from the other sesquiterpene synthases (SaSQS1, SaSQS2 and SaBS), which show high fidelity and form single dominant products. The branch containing SaSQS1, SaSQS2 and SaBS diverges into two nodes, one representing β-bisabolene synthase (SaBS) and the other splitting into two clades representing the sesquisabinene synthases SaSQS1 and SaSQS2 (Fig. 5). Phylogenetic analysis suggested that all the sesquiterpene synthases in S. album evolved from a common ancestor more closely related to monoterpene synthases.

Molecular regulation of sesquiterpene biosynthesis in Santalum album.
What exactly triggers the formation of santalene derivatives at extremely high levels compared with other sesquiterpenes in sandalwood oil is unknown to date. Previous reports describe higher expression of SaFDS and SaSS in mature wood, but no comparative data are available for the expression levels of the other sesquiterpene synthases present at the interface of heartwood and sapwood of S. album 42. To investigate the molecular regulation of sesquiterpene biosynthesis, the kinetic parameters and expression levels of all the sesquiterpene synthases were determined. Steady-state kinetic constants of the sesquiterpene synthases were evaluated for FPP; SaSS had a very low Km (0.6 μM) and an exceptionally high kinetic efficiency (>50-100 fold) in comparison with the other sesquiterpene synthases (Table 2). Comparative expression analysis of all the characterized sesquiterpene synthases by semi-quantitative real-time PCR, with relative abundance compared against that of SaFDS, revealed a substantially higher expression level of SaSS (Fig. 6). The expression level of SaSQS1 was approximately 2-3 times higher than that of SaSQS2, and that of SaBS was very low in comparison with the other sesquiterpene synthases.
The pattern of kinetic constants and expression levels of the sesquiterpene synthases correlates with the sesquiterpene composition of sandalwood essential oil (Supplementary Fig. S70). These results suggest that the robust kinetic parameters and very high expression level of SaSS, compared with the other sesquiterpene synthases (SaSQS1, SaSQS2 and SaBS), lead to the biosynthesis of santalene mixtures in much higher amounts even at low cellular concentrations of (E,E)-FPP; the santalenes are subsequently hydroxylated by mono-oxygenase system/s to generate the respective sesquiterpene alcohols.
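The reasoning that a low Km and high catalytic efficiency favour santalene formation at low FPP concentrations can be made explicit with the standard Michaelis-Menten rate law. This is a sketch under the assumption that each synthase follows simple Michaelis-Menten kinetics and that all synthases draw on the same FPP pool; the symbols (v_i, k_cat, [E_i], [S]) are introduced here for illustration and do not appear in the original text.

```latex
% Michaelis-Menten rate for synthase i acting on FPP (assumed rate law)
v_i = \frac{k_{\mathrm{cat},i}\,[E_i]\,[S]}{K_{m,i} + [S]}

% Low-substrate limit, [S] \ll K_{m,i}
v_i \approx \left(\frac{k_{\mathrm{cat},i}}{K_{m,i}}\right)[E_i]\,[S]

% Flux ratio of SaSS to another synthase j competing for the same FPP pool
\frac{v_{\mathrm{SaSS}}}{v_{j}} \approx
\frac{\left(k_{\mathrm{cat}}/K_m\right)_{\mathrm{SaSS}}\,[E_{\mathrm{SaSS}}]}
     {\left(k_{\mathrm{cat}}/K_m\right)_{j}\,[E_{j}]}
```

In this limit the product ratio depends on the catalytic efficiency multiplied by the enzyme abundance, so a higher efficiency and a higher expression level of SaSS would act in the same direction.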

Concluding Remarks
Based on the transcriptome sequencing, we have isolated and functionally characterized one prenyltransferase (SaFDS) and two classes of sesquiterpene synthases. One class is represented by santalene synthase (SaSS), a multi-product enzyme, whereas the second class comprises the sesquisabinene synthases (SaSQS1 and SaSQS2) and β-bisabolene synthase (SaBS), which form single dominant products. The products formed were characterized by thorough spectral analysis, comparison of mass fragmentation patterns and co-injection studies using authentic standards.
The pattern of kinetic constants and expression levels of the sesquiterpene synthases was found to correlate with the sesquiterpene composition of sandalwood essential oil 17. Molecular and biochemical characterization of the four sesquiterpene synthases revealed that the robust kinetic parameters and very high expression level of SaSS, compared with SaSQS1, SaSQS2 and SaBS, could lead to the formation of santalene mixtures at a much higher level; these are further hydroxylated by the CYP450 system to generate the respective sesquiterpene alcohols. Functional characterization of SaSQS1 and SaSQS2 revealed that both enzymes, despite their different kinetic parameters, catalyze the formation of sesquisabinene as the predominant product from (E,E)-FPP. Furthermore, production of sesquisabinene in a heterologous bacterial system was validated by co-expressing SaFDS with SaSQS1 or SaSQS2. These results may pave the way for the large-scale production of these sesquiterpenes in metabolically tractable heterologous systems.

Methods
Plant material.
Wood shavings from the interface of heartwood and sapwood were collected at a height of 30-40 cm above ground level from mature sandalwood trees at the CSIR-NCL campus, Pune, using a Hagloff wood borer, flash-frozen in liquid nitrogen and stored at −80 °C until further use.
RNA isolation and transcriptome sequencing.
Total RNA was isolated from the interface of heartwood and sapwood of Indian sandalwood using a protocol originally reported for isolation of RNA from xylem tissue 43, with extensive modifications (Supplementary Method S1.1). The transcriptome library for sequencing was constructed according to the Illumina TruSeq RNA library protocol outlined in the "TruSeq RNA Sample Preparation Guide" at Genotypic Technology, Bangalore, India. The library was amplified using 8 cycles of PCR for enrichment of adapter-ligated fragments. Primary QC of the raw data was performed using the inbuilt tool SeqQC-V2.1.

Transcriptome assembly and functional annotation.
To obtain high-quality clean reads for de novo assembly, the raw reads were filtered by discarding reads containing adaptor sequence and poor-quality reads (Phred score < 20). The clean reads were first assembled into contigs using Velvet_1.1.05. The assembled contigs were given as input to Oasis_0.2.01 to generate transcripts. Redundancy in the transcripts was removed using CD-HIT. To assign molecular functions, biological processes and cellular components to the transcripts, functional annotation of the unigenes was performed using KEGG-KAAS analysis, Pfam domain analysis and megablastx searches against the NCBI Nr database, the SwissProt/Uniprot database and the Protein Data Bank (PDB) with an E-value ≤ 10⁻⁵.
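The quality filter described above can be sketched as follows. The text does not state whether the Phred threshold was applied per base or per read, nor the quality encoding; this sketch assumes a mean-per-read threshold and Phred+33 encoded quality strings, and the function names and records are hypothetical.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def passes_quality(quality, min_mean=20, offset=33):
    """True when the read's mean Phred score is at least min_mean."""
    scores = phred_scores(quality, offset)
    return bool(scores) and sum(scores) / len(scores) >= min_mean


def filter_reads(records, min_mean=20):
    """Keep (name, sequence, quality) records that pass the quality check."""
    return [record for record in records if passes_quality(record[2], min_mean)]


# Hypothetical reads, not data from this study.
reads = [
    ("read_1", "ACGT", "IIII"),  # Phred 40 at every base
    ("read_2", "ACGT", "####"),  # Phred 2 at every base
]
print([name for name, _, _ in filter_reads(reads)])
```

Adapter removal would be a separate step before this filter and is not shown here.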

Isolation and cloning of terpene synthases in expression vector.
Coding sequences of the prenyltransferase (SaFDS) and sesquiterpene synthases (SaSQS1, SaSQS2, SaSS and SaBS) were amplified from cDNA using full-length ORF primers carrying restriction enzyme (RE) sites at both ends (Supplementary Tables S1 and S2). PCR was performed using proofreading Taq DNA polymerase (Sigma-Aldrich). PCR products purified from agarose gel were digested with the restriction enzymes (NEB) corresponding to the sites incorporated at their ends and ligated into their expression vectors (pRSETB for SaFDS, pET32b for SaSS, and pET28a for SaSQS1, SaSQS2 and SaBS).
Bacterial expression and purification of active protein of SaFDS, SaSS, SaSQS1, SaSQS2 and SaBS.
For the expression of active protein, the recombinant plasmid pRSETB harbouring SaFDS was introduced into BL21 DE3 competent cells, whereas pET32b harbouring the coding sequence of SaSS and pET28a harbouring the coding sequences of SaSQS1, SaSQS2 and SaBS were introduced into Rosetta 2 DE3 competent cells. Recombinant protein was expressed by IPTG induction in the respective cells and purified to homogeneity using Ni-affinity column chromatography (Supplementary Method S1.3). Protein concentrations were determined using the Bradford method 44, and all fractions were analyzed by 10% SDS-PAGE.
Product ratio studies of sesquiterpene synthases.
The assay mixture contained purified recombinant protein (100 μg) in buffer (25 mM HEPES, 10% v/v glycerol, 5 mM dithiothreitol, 10 mM MgCl2, pH 7.4) with isoprenyl diphosphates as substrates (100 μM) in a final reaction volume of 400 μL. SaFDS was assayed using IPP and an allylic diphosphate (GPP/DMAPP) as the substrates. The assay mixture was incubated at 30 °C on a rotary shaker for 1 h. After this incubation period, alkaline phosphatase (80 units dissolved in glycine buffer, pH 10.5) was added and incubation was continued at 37 °C on a rotary shaker (150 rpm). After 1 h of incubation, the assay mixture was cooled to 4 °C and extracted with n-hexane (3 × 500 μL). The pooled organic layer was dried over anhydrous Na2SO4, reduced to ~50 μL with a stream of dry nitrogen and analyzed by GC and GC-MS. Sesquiterpene synthases (SaSS, SaSQS1, SaSQS2, SaBS) were assayed using (E,E)-FPP as the substrate by incubating at 30 °C on a rotary shaker for 2 h 25,45. After this incubation period, all the reaction mixtures were extracted with n-hexane (3 × 500 μL). The organic layers containing sesquiterpene products were dried over anhydrous Na2SO4 and reduced to ~50 μL with a stream of dry nitrogen. The extracts were analyzed by GC and GC-MS, and the sesquiterpene products formed were identified by co-injection with purified or authentic compounds and by comparing retention times and mass fragmentation. In the combined assays of SaFDS and the sesquiterpene synthases, the assay mixture containing SaFDS along with IPP and allylic diphosphates (DMAPP/GPP) (100 μM) was incubated for 1 h at 30 °C on a rotary shaker. After this incubation period, 100 μg of recombinant sesquiterpene synthase (SaSS/SaSQS1/SaSQS2/SaBS) was added and the incubation was continued for

Figure 4. Total ion chromatograms of in vivo production of sesquisabinene by SaSQS1 and SaSQS2: A) n-hexane extract of supernatant of empty vector control, B) n-hexane extract of pellet of empty vector control, C) n-hexane extract of supernatant of SaSQS1, D) n-hexane extract of pellet of SaSQS1, E) n-hexane extract of supernatant of SaSQS2, F) n-hexane extract of pellet of SaSQS2.

Table 2. Kinetic parameters of sesquiterpene synthases isolated from Sandalwood.