Comparative temporal metabolomics studies to investigate interspecies variation in three Ocimum species

Ocimum is among the most revered medicinal plant genera and comprises many species. Each species is distinct in its metabolite composition and medicinal properties, and some basil types are used mainly as aromatic and flavoring ingredients. It would therefore be informative to know how species that belong to the same genus, yet differ markedly in metabolic composition and operating pathways, are related. The present investigation set out to differentiate three commonly occurring Ocimum species of high medicinal value: Ocimum sanctum, O. gratissimum and O. kilimandscharicum. The parameters for the comparative analysis comprised temporal changes in the number of leaf trichomes, essential oil composition, expression of phenylpropanoid pathway genes and the activity of important enzymes. O. gratissimum was richest in phenylpropanoid accumulation and in expression of the corresponding genes when compared with O. sanctum, whereas O. kilimandscharicum accumulated terpenoids. To obtain an overview of this qualitative and quantitative regulation of terpenes and phenylpropenes, the expression patterns of some important transcription factors involved in secondary metabolism were also studied.


Results
Chemoprofiling of essential oil. Aiming [...] (Table 1). Only compounds with a peak area percent greater than or equal to 0.01 were considered for metabolic profiling (39 compounds from O. sanctum, 47 from O. gratissimum and 22 from O. kilimandscharicum). As evident from Fig. 1, the abundant essential oil constituents of O. kilimandscharicum (d-camphor, D-limonene, camphene and caryophyllene) were present in higher amounts from July to October, after which their contents started decreasing. A similar pattern was observed for the essential oil constituents of O. sanctum leaves (eugenol, β-elemene, caryophyllene and methyleugenol). However, no major effect of time and environmental factors was observed on the essential oil constituents of O. gratissimum (eugenol, β-ocimene, germacrene-D and caryophyllene) (Fig. 1). Eugenol was present in the highest concentration in the leaf essential oils of both O. sanctum and O. gratissimum, while camphor constituted the major proportion of the O. kilimandscharicum oil. As per the oil profiling (Supplementary Table 3), O. sanctum and O. gratissimum were rich in phenylpropenes, whereas O. kilimandscharicum was rich in terpenes. The essential oil constituents were categorized under five major classes of compounds, namely phenylpropenes, monoterpenes, monoterpene alcohols, sesquiterpenes and sesquiterpenoid alcohols. The array of essential oil constituents was highest in O. gratissimum, followed by O. sanctum, and lowest in O. kilimandscharicum (Supplementary Table 3).

Table 1. Essential oil yield of the three Ocimum species for six months. Data are means ± SD (at least three replicates).
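The area-percent cut-off used for the profiling above can be illustrated with a minimal Python sketch. The function names and the peak areas are hypothetical and serve only to show how a cut-off of 0.01 area percent would be applied; they are not the software or the data used in the study.

```python
def area_percent(peak_areas):
    """Convert raw peak areas to percent of the total integrated area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: 100.0 * area / total for name, area in peak_areas.items()}


def profiled_compounds(peak_areas, threshold=0.01):
    """Keep compounds whose area percent is >= threshold (0.01 in the text)."""
    percents = area_percent(peak_areas)
    return {name: pct for name, pct in percents.items() if pct >= threshold}


# Hypothetical peak areas in arbitrary units (assumed values, not study data).
example = {"eugenol": 7000.0, "caryophyllene": 2999.5, "trace peak": 0.5}
kept = profiled_compounds(example)
print(sorted(kept))  # the 0.005 % trace peak is excluded
```

Counting the entries that survive such a filter per species would give compound totals of the kind reported above.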

Glandular trichome density.
Trichome density, expressed as the average number of trichomes per mm² of leaf area, was recorded from the abaxial leaf surface in each of the six months from July to December, in a population of 10 plants per month, for O. sanctum, O. gratissimum and O. kilimandscharicum (Fig. 2). The greatest variation was observed in O. sanctum, where trichome density gradually increased from July to September and decreased from October onwards. In O. gratissimum, trichome density also increased from July to September, but fluctuated in October and November and finally decreased in December. O. kilimandscharicum, however, showed the least temporal variation in trichome density.

[...] Ocimum gratissimum (OG) and (C) Ocimum kilimandscharicum (OK). Data are given as means ± SD (at least three independent replicates).
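The trichome density measure described in this section (trichomes per mm² of leaf area, summarized as mean ± SD) can be sketched in Python as follows. The counts, the field area and the use of the sample standard deviation are assumptions made for illustration; the text does not state them.

```python
from statistics import mean, stdev


def trichome_density(count, area_mm2):
    """Trichomes per mm^2 for one counted leaf field."""
    if area_mm2 <= 0:
        raise ValueError("leaf area must be positive")
    return count / area_mm2


def summarize_density(counts, area_mm2):
    """Mean and sample SD of density over several plants."""
    densities = [trichome_density(c, area_mm2) for c in counts]
    return mean(densities), stdev(densities)


# Hypothetical counts from three plants, each over an assumed 0.5 mm^2 field.
avg, sd = summarize_density([12, 15, 9], area_mm2=0.5)
print(f"{avg:.1f} ± {sd:.1f} trichomes per mm^2")
```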
[...] transcriptome and the whole genome sequencing data of O. sanctum reported recently [38][39][40] (Supplementary Table 1). Figure 3 shows the detailed pathway leading to the biosynthesis of terpenes and phenylpropenes in the glandular trichomes of the three Ocimum species, as discussed earlier by Iijima et al. 41 . The results reveal that, overall, expression of phenylpropanoid pathway genes is higher in O. gratissimum than in O. sanctum and O. kilimandscharicum. In order of expression, the highest relative quotient (RQ) was observed for PAL, followed by CAD, C4H, 4CL, C3H, CCR, COMT, EOMT and EGS (Fig. 4).

[...] Ocimum sanctum (OS), (B) Ocimum gratissimum (OG) and (C) Ocimum kilimandscharicum (OK). Statistical analysis was performed using Tukey's HSD test, standard weighted-means analysis, at *P < 0.01, **P < 0.05 and ***P < 0.1 levels of significance. Arrows in the figure point to the glandular trichomes on the abaxial leaf surface; scale bar = 100 µm. Error bars represent ± SD (at least ten independent plants).

[...] Table 2). Only the primer sequence of the TF WRKY was designed from the O. basilicum 'TrichOME Database' (http://www.planttrichome.org/trichomedb/estbyspecies_detail.jsp?species=Ocimum%20basilicum) (Supplementary Table 2), because the annotations lacked WRKY, which is considered an important TF involved in the regulation of secondary metabolism pathways 42 . For the bHLH1 and MADS box TFs, two transcripts each were selected, as these transcript sequences were non-overlapping and showed equivalent transcript abundance. Expression of the MYB3 and MYB5 transcripts in O. gratissimum increased from the young stage (July) and then decreased upon maturity (December), while MYB2 expression increased from July to September and was nearly constant in the successive months up to December. However, the expression pattern of the three MYB family TFs in O. kilimandscharicum and O. sanctum was nearly unaffected over time.
Next, MYC transcript expression was also highest in O. gratissimum, where it increased from July up to November and fell in December; in O. sanctum it increased from July to December, and in O. kilimandscharicum it was lowest and unaffected over time. The PAP1 expression data reveal O. sanctum to be the highest PAP1-expressing species of the three; its expression increased as the plant matured and was highest in December. PAP1 expression in the other two species, O. gratissimum and O. kilimandscharicum, was very low compared with O. sanctum, and no noticeable temporal effect was observed, as transcript abundance was almost evenly distributed throughout the [...]

Data are means ± SD (for at least three replicates), and the y-axis represents the relative quotient (RQ). Statistical analysis was performed using Tukey's HSD test, standard weighted-means analysis, at *P < 0.01 and **P < 0.05 levels of significance, while 'ns' means non-significant.

Enzyme activity of some important phenylpropanoid pathway enzymes. Crude protein extracts obtained from young leaves of the mature plant were assayed for the activity of the key phenylpropanoid pathway enzymes leading to the phenylpropenes, as well as for an intermediate enzyme that might also be involved in phenylpropene biosynthesis. These enzymes were phenylalanine ammonia lyase (PAL), cinnamate 4-hydroxylase (C4H), 4-coumarate:CoA ligase (4CL) and cinnamyl alcohol dehydrogenase (CAD). Figure 6 shows the activity of the enzymes per mg protein. For all four enzymes, activity was highest in O. gratissimum, followed by O. sanctum, and lowest in O. kilimandscharicum.

Total phenolic content, anthocyanin content and chlorophyll content.
Total phenolics, anthocyanin and chlorophyll contents were estimated from the leaves of mature plants of all three Ocimum species, to assess their variability in these three constituents and to correlate their composition with the gene expression profiles of the phenylpropanoid pathway and the transcription factors.
The phenolics and pigment analysis shown in Fig. 7 indicated that O. gratissimum was the richest of the three in total phenolics, anthocyanin and chlorophyll pigments.

Data are means ± SD (for at least three replicates), and the y-axis represents the relative quotient (RQ). Statistical analysis was performed using Tukey's HSD test, standard weighted-means analysis, at *P < 0.01 and **P < 0.05 levels of significance, while 'ns' means non-significant.

Discussion
On comparing the oil yields and trichome densities of the three Ocimum species, it was observed that as trichome density decreases with plant maturity, oil yield also decreases (Table 1). The essential oil profiling showed that both O. sanctum and O. gratissimum were rich in eugenol, a phenylpropene, whereas O. kilimandscharicum was rich in camphor, a well-known terpenoid (Fig. 1 and Supplementary Table 3). As also reported by Joshi 43 [...]

[...] O. kilimandscharicum (OK) at maturity. L-Phenylalanine, trans-cinnamic acid, p-coumaric acid and coniferyl alcohol were used as the substrates for PAL, C4H, 4CL and CAD, respectively. Activities are given as means of two independent assays ± SD (standard deviation). Statistical analysis was performed using Tukey's HSD test, standard weighted-means analysis, at *P < 0.01 and **P < 0.05 levels of significance, while 'ns' means non-significant.

[...] trichome distribution throughout the six months, and also had the fewest trichomes per mm² of leaf area and the lowest oil yield compared with O. sanctum and O. gratissimum. Since apical leaves were used for the trichome density studies, the number of trichomes increased up to the optimum metabolite expression stage (July to October) and thereafter decreased owing to leaf expansion. While studying changes in leaf trichomes and epicuticular flavonoids during leaf development in birch taxa, Valkama et al. 46 also concluded that the rapid decline in the density of leaf trichomes is due to growth dilution in expanding leaves, as the total number of trichomes per leaf remained constant and neither development nor shedding of trichomes occurs at later growth stages. In another study, Adebooye et al. 47 described that the morphology and density of trichomes and stomata of Trichosanthes cucumerina are also affected by leaf age, with densities decreasing as leaf age increases. Tozin et al. 48 , while analyzing glandular trichome density and the essential oil profile of inflorescences and leaves of Lippia origanoides Kunth (family Verbenaceae), also reported a higher essential oil yield in the inflorescences than in the leaves. In a separate investigation, Werker 49 demonstrated that trichomes remain functional in mature leaves; in contrast, Gairola et al. 50 reported that at leaf maturity the functional role of trichomes becomes less important and they therefore senesce or wither. All of the differences in the trichome densities of the three Ocimum species could be related to genetic, physiological or evolutionary mechanisms operating within the genus, which need further investigation. The essential oil of the genus Ocimum is a reservoir of secondary metabolites, and oil yield has been suggested to correlate with the chromosome number of the species 39,51 . The chromosome numbers of O. sanctum var. CIM Ayu (2n = 16), O. gratissimum (2n = 40) and O. kilimandscharicum (2n = 38) 39,52,53 also support the earlier report of a correlation between essential oil yield and chromosome number.
In secondary metabolism, phenylpropanoid biosynthesis is one of the most important pathways, as it leads to the synthesis of a large group of natural products 22,54 . The core phenylpropanoid pathway involves three enzymes: PAL, C4H and 4CL. PAL is the first enzyme in the pathway and catalyzes the conversion of L-phenylalanine to trans-cinnamic acid. Subsequently C4H, a member of the cytochrome P450 superfamily, hydroxylates trans-cinnamic acid to para-coumaric acid. Lastly, 4CL catalyzes the formation of p-coumaroyl CoA from p-coumaric acid, leading to the production of hydroxycinnamic acids, monolignols/lignin, coumarins, benzoic acids, stilbenes, anthocyanins and flavonoids 22,55 . In Ocimum species the phenylpropanoid pathway is important because it leads to the synthesis of many commercially important phenylpropenes, such as eugenol, methyleugenol, chavicol and methylchavicol, in the leaf essential oil 38,39,44,56 . The phenylpropenes synthesized in the aerial parts help defend the plant against herbivores and pathogens 56 and are also important in the human diet 57,58 . Biosynthesis of these phenylpropenes is localized in specialized glands, known as glandular trichomes, on the leaf surface 3,56 .
PAL is an important enzyme because it links primary and secondary metabolism. It is also a key regulatory enzyme in phenolics biosynthesis 59 , as high PAL activity is usually associated with the accumulation of phenolic compounds in the fruit tissues of several species 60 . The highest expression of PAL (Fig. 4) also correlated with the highest enzyme activity (Fig. 6) and the highest total phenol and anthocyanin contents (Fig. 7A,B) in O. gratissimum, followed by O. sanctum and O. kilimandscharicum. Expression of the PAL, C4H and 4CL transcripts in all three Ocimum species first increased and then decreased as the plant attained maturity. Expression was optimal during the period of high oil yield for each of the three species. Xu et al. 61 also showed that the expression level of GbPAL from Ginkgo biloba was lowest at the beginning of leaf growth, increased gradually, decreased thereafter, then increased again and remained relatively constant. The results obtained at the metabolite, transcript and protein levels support the hypothesis 62 that the PAL, C4H and 4CL genes or gene families, irrespective of their variable sizes, represent a case of tight regulation, possibly mediated through large structural and functional similarity in TF binding to their promoters [63][64][65][66] . The real-time expression results for seven of the twelve TFs studied (bHLH1_25905, EREB, MADS box_50254, MYB3, MYB5, MYC and TTG1; Fig. 5) show an expression pattern similar to that of the PAL, C4H and 4CL transcripts in the three Ocimum species, and are therefore consistent with this hypothesis. As also suggested by Koopmann et al. 67 , the results of the present study point to the possible formation of a true multienzyme complex by the PAL, C4H and 4CL enzymes.
The activity of C3H in the biosynthesis of lignin and many other phenylpropanoid pathway products in plants has been well documented; however, conditions suitable for assaying the enzyme explicitly remain unclear. Although p-coumarate acts as the substrate of C3H, its significant activity towards other para-hydroxylated substrates cannot be ignored 67 . Franke et al. 68 revealed that CYP98A3 is encoded by the REF8 gene, which is required for the synthesis of wild-type lignin precursors and sinapate esters in Arabidopsis. Gang et al. 69 reported that the differential production of meta-hydroxylated phenylpropanoids in sweet basil is controlled by the activities of specific acyltransferases and hydroxylases found in the peltate glandular trichomes and leaves. In the present investigation, the expression pattern of the C3H transcript in the three Ocimum species did not show a specific trend. In O. sanctum and O. gratissimum, expression first increased, then decreased, increased again and finally decreased, while it was nearly constant in O. kilimandscharicum (Fig. 4). To date, C3H activity has been considered essential for lignin biosynthesis in plants 70,71 , but the possibility that it contributes to the synthesis of other compounds should not be overlooked. Similarly, our previous work 38 explored a new function of the 4CL gene in eugenol biosynthesis, beyond its conventional involvement in lignin biosynthesis. Hence, further investigation is required to establish the role of C3H in the synthesis and/or regulation of phenylpropene biosynthesis and to explain its highest transcript expression in O. sanctum. Expression of the next gene transcript, COMT, decreased as the plant attained maturity in O. gratissimum and O. sanctum, whereas in O. kilimandscharicum no major change in expression pattern was observed (Fig. 4).
It has already been discussed that essential oil metabolites decrease as the plant attains maturity. This pattern of COMT transcript expression may be explained by the experiments of Gang et al. 56 , in which the role of COMT in phenylpropene biosynthesis became evident when a northern blot, used to study the relative abundance of mRNA in the peltate glandular trichomes, showed high expression of COMT in glandular trichomes compared with the whole leaf.

Scientific Reports | (2020) 10:5234 | https://doi.org/10.1038/s41598-020-61957-5 | www.nature.com/scientificreports
To date, the activity of the CCR and CAD enzymes has been attributed to lignin biosynthesis [72][73][74] . The CCR and CAD genes encode the enzymes that catalyze the first and last steps of lignin monomer biosynthesis, respectively, and are closely related members of the short-chain dehydrogenase/reductase (SDR) superfamily 75 . Thus, the constant and increasing patterns of CCR and CAD transcript expression in all three Ocimum species could be correlated with plant aging (Fig. 4).
EGS and EOMT are the terminal genes; they encode the enzymes responsible for the synthesis of eugenol from coniferyl acetate and of methyleugenol from eugenol, respectively 76 , as also shown in Fig. 3. EGS transcript expression was highest in O. gratissimum, followed by O. sanctum, and the reverse was true for EOMT, while O. kilimandscharicum showed very low expression of EGS and even lower expression of EOMT (Fig. 4). As evident from Fig. 1, O. gratissimum and O. sanctum had high percentages of eugenol in their leaf essential oil, which accounts for the high expression of EGS in both species. Gang et al. 77 , while characterizing the phenylpropene O-methyltransferases from sweet basil with 13 phenolic acid substrates, demonstrated that the EOMT1 enzyme gave 100% activity with eugenol as the substrate and 29%, 26% and 24% activity with guaiacol, isoeugenol and chavicol, respectively. The essential oil profiles of O. sanctum and O. gratissimum (Supplementary Table 3) also confirm the presence of eugenol and isoeugenol in high percentages; hence the high expression of these two gene transcripts in the two species may be correlated with them. Since EGS and EOMT are localized in the glandular trichomes of the leaf, expression of their transcripts decreases as the leaf expands upon maturity.
Transcription factors (TFs) play a dominant role in the gene regulation of all aspects of plant growth and development, including secondary metabolism. Recent years have added to the number of TFs known to be involved in the regulation of plant secondary metabolism. However, the possibility that other mechanisms regulate specific pathways cannot be ruled out. Several families of TFs have been described as regulators of plant secondary metabolism, but a few important ones with equivalent digital gene expression in the comparative O. sanctum and O. basilicum transcriptome sequencing data 39 were selected for the temporal expression studies in the three Ocimum species. These include two non-overlapping transcripts each of bHLH1 (bHLH_21387 and bHLH_25905) and MADS box (MADS box_50254 and MADS box_43518), and single transcripts of EREB, MYB2, MYB3, MYB5, MYC, PAP1 and TTG1 (Fig. 5). The MADS box, MYB and bHLH (basic helix-loop-helix) families have expanded significantly in the past 100-600 million years and have been extensively reviewed 78 . TFs generally form complexes in order to regulate metabolic pathways, as several examples show. MYB and bHLH TFs have been reported to function cooperatively, and flavonoid biosynthesis is one of the best-studied pathways of combinatorial gene regulation through interactions between the two 79 . Gonzalez et al. 80 demonstrated regulation of the anthocyanin biosynthetic pathway by a transcriptional complex of TTG1 (transparent testa glabra1), bHLH and MYB TFs in Arabidopsis seedlings. In addition to this TF complex, MYC has also been suggested to be involved in the regulation of anthocyanin biosynthesis in Perilla frutescens 81 , a member of the family Lamiaceae, to which Ocimum also belongs. Considering the diverse functions of these TFs, it is extremely difficult to explain the intricate roles of the bHLH, MYB, MYC and TTG1 TFs.
PAP1 (production of anthocyanin pigment1) is well described for its involvement in the anthocyanin biosynthetic pathway 82 , but Sekhon et al. 83 and Pourtau et al. 84 have also reported PAP1 and PAP2 to be involved in senescence induced by pollination prevention in maize and by sugar application in Arabidopsis, respectively. Hence, the expression pattern of PAP1 in O. sanctum may be due to stress induced at the onset of the winter season in November and December, as Ocimum is a plant of the tropics. No specific trends were observed in the expression patterns of O. gratissimum and O. kilimandscharicum, which might mean that O. sanctum is comparatively more susceptible to cold; this needs experimental confirmation. MADS, in contrast, is an acronym for the four founder proteins on which the definition of this gene family is based: MCM1 (from Saccharomyces cerevisiae), AGAMOUS (from Arabidopsis), DEFICIENS (from Antirrhinum) and SRF (a human protein). The network of MADS box genes is imperative not only for floral organ identity but also for floral meristem identity 85 . The increasing temporal pattern of MADS box transcript expression in the three Ocimum species of the present study supports the involvement of MADS box genes in flower organ and meristem development. The ERF (ethylene response factor) family, formerly known as EREBP (ethylene-responsive element binding proteins), is implicated in the regulation of biological processes related to plant growth, metabolism, development and responses to abiotic and biotic stresses 86 . Since the EREB transcript shows an increasing and then decreasing temporal expression pattern in the three Ocimum species under investigation, it might be involved in a biological process that is active during plant development and slows down as the plant attains maturity.
WRKY proteins comprise a large family of TFs that are imperative to developmental and defense responses in plants; hence the higher expression of the WRKY transcript in the three Ocimum species towards plant maturity might reflect the plant's response to abiotic stresses. The phenylpropenes found in the glandular trichomes of Ocimum basilicum play an important role in plant resistance against herbivores 56 . Similarly, Valkama et al. 46 suggested that during birch leaf development the amount of osmiophilic material (phenolics containing o-dihydroxy groups) declines, although 20-40% of cells in aged trichomes still possess it.
This study provides a comparative description of trichome number and of the expression patterns of important genes of the phenylpropanoid biosynthesis pathway, as well as of transcription factors involved in secondary metabolism, with respect to the differential accumulation and regulation of essential oil metabolites and their composition among three Ocimum species. The final number of trichomes in an Ocimum leaf is established at the young stage and does not change during leaf development. In contrast, trichome density and phenylpropene content tend to decline with leaf age. Since basils are susceptible to the winter season, the expression patterns of the genes and transcription factors discussed here may be due to abiotic (cold) or biotic (insects, fungal pathogens etc.) stresses, which often affect the plant during this season. Very little literature is available from which to infer the interaction of such abiotic and biotic stresses on the plant. The present investigation of trichomes and gene expression could be exploited for genetically improving essential oil biosynthesis in Ocimum species, which are becoming highly desirable for the fragrance, flavor and pharmaceutical industries.

Methods
Extraction and analysis of essential oil. Collected plant leaves were hydro-distilled in a Clevenger-type apparatus for two hours. For analysis, 1 µl of essential oil diluted 1:10 in pentane was injected into the GC-MS (Agilent Technologies 7980 A gas chromatograph system with the 5977 A mass selective detector). An HP5-MS column (30 m × 250 µm, film thickness 0.25 µm) was used for peak separation. Helium was used as the carrier gas at a flow rate of 1 ml/min with a split ratio of 10:1. The samples were run with an initial hold at 40 °C for 5 min, followed by a ramp of 3 °C/min to 150 °C, a ramp of 5 °C/min to 200 °C and a ramp of 10 °C/min to 300 °C, with a final hold of 10 min. Mass spectrometry was conducted with a transfer line and ion source temperature of 230 °C, a quadrupole temperature of 150 °C, an ionization potential of 70 eV and a scan range of 50 to 550 atomic mass units. Version 2.0 g of the NIST/EPA/NIH mass spectral library was used for compound identification (Agilent Technologies, Palo Alto, CA, USA). The relative abundance of each constituent was taken as its area percent.

[...] and O. kilimandscharicum) were used to isolate the glandular trichomes following the method of Rastogi et al. 38 . Total RNA was isolated from the glandular trichomes using the Spectrum Plant Total RNA Kit (Sigma), and 2 µg of total RNA was used for cDNA synthesis with the RevertAid Premium First Strand cDNA Synthesis Kit (Thermo).
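The GC oven programme given for the essential oil analysis can be checked with a short Python sketch that adds up the hold and ramp times. This reflects one reading of the programme (40 °C for 5 min, 3 °C/min to 150 °C, 5 °C/min to 200 °C, 10 °C/min to 300 °C, 10 min final hold); the function name and the step layout are hypothetical and are not part of the instrument software.

```python
def gc_run_time(initial_temp, initial_hold, ramps, final_hold):
    """Total run time in minutes for a temperature programme.

    ramps is a list of (rate in deg C per min, target temperature in deg C)
    steps applied in order, starting from initial_temp.
    """
    total = initial_hold
    current = initial_temp
    for rate, target in ramps:
        if rate <= 0 or target <= current:
            raise ValueError("each ramp must heat at a positive rate")
        total += (target - current) / rate
        current = target
    return total + final_hold


# Assumed reading of the oven programme described in the text.
programme = [(3.0, 150.0), (5.0, 200.0), (10.0, 300.0)]
total_min = gc_run_time(40.0, 5.0, programme, 10.0)
print(f"{total_min:.1f} min")  # about 71.7 min under this reading
```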
Quantitative RT-PCR analysis. Quantitative real-time PCR was performed following the protocol of Rastogi et al. 38 , which uses SYBR Green chemistry (Thermo). Gene-specific primers were designed with Primer Express Software version 2.0 (Applied Biosystems) and ordered from Integrated DNA Technologies, India (Supplementary Tables 1 and 2). The experiment was conducted on a 7900HT Fast Real Time PCR System (Applied Biosystems) with five biological replicates, and reaction specificity was evaluated by analyzing the melting curve. The thermal cycling parameters were: 50 °C for 2 min (initial hold); 95 °C for 10 min (initial denaturation); and 40 amplification cycles (95 °C for 15 s and 60 °C for 1 min). Additional steps (60 °C for 15 s, 95 °C for 15 s and 37 °C for 2 min) followed to obtain the dissociation curve. Actin of O. sanctum (details in Supplementary Table 1) was used as the endogenous control to quantify relative mRNA levels 38,87 . The ∆∆Ct method was used for relative quantification of gene transcripts through Sequence Detection System (SDS) software version 2.2.1. The Ct (threshold cycle) values obtained from real-time PCR were used to calculate ∆Ct (Ct target − Ct endogenous control). ∆∆Ct was then calculated as the difference ∆Ct target − ∆Ct calibrator, and 2^(−∆∆Ct) was taken as the relative quotient (RQ).

Total phenols, anthocyanin and chlorophyll estimation. Anthocyanin content was estimated following the protocol of Neff and Chory 88 . Leaf samples (1 g, ground in liquid nitrogen) were incubated overnight in 150 ml of methanol acidified with 1% HCl, in triplicate. A further 100 ml of distilled water and 250 ml of chloroform were added to separate anthocyanins from chlorophylls. Absorbance at 530 nm and 657 nm was recorded with a spectrophotometer (Elico) to determine total anthocyanins. The relative amount of anthocyanin per gram of leaf sample was calculated by subtracting the absorbance at 657 nm from the absorbance at 530 nm.
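The ∆∆Ct quantification described in the quantitative RT-PCR section can be written out as a minimal Python sketch. The function names and Ct values are hypothetical, and the choice of calibrator sample is an assumption, since the text does not state which sample served as calibrator.

```python
def delta_ct(ct_target, ct_control):
    """Delta Ct: target gene Ct minus endogenous control (actin) Ct."""
    return ct_target - ct_control


def relative_quotient(ct_target_sample, ct_control_sample,
                      ct_target_calibrator, ct_control_calibrator):
    """RQ = 2^(-ddCt), with ddCt = dCt(sample) - dCt(calibrator)."""
    dd_ct = (delta_ct(ct_target_sample, ct_control_sample)
             - delta_ct(ct_target_calibrator, ct_control_calibrator))
    return 2.0 ** (-dd_ct)


# Hypothetical Ct values: the sample reaches threshold two cycles earlier
# than the calibrator after normalization, so RQ is 4.
rq = relative_quotient(22.0, 18.0, 24.0, 18.0)
print(rq)
```

An RQ above 1 indicates higher transcript abundance than in the calibrator, and an RQ below 1 indicates lower abundance.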
Total phenolic content was determined by the Folin-Ciocalteu method using gallic acid as the phenolic standard 89 . About 100 mg of powdered leaf sample of O. sanctum, O. kilimandscharicum and O. gratissimum was extracted with 0.5 µl of 80% ethanol in triplicate. The extract was centrifuged for 20 min and the supernatant was collected, evaporated to dryness and dissolved in 0.5 µl of water. Different aliquots (2-20 µl) of the dissolved extracts were pipetted into micro-centrifuge tubes, and the volume in each tube was made up to 300 µl with double-distilled water. About 50 µl of Folin-Ciocalteu reagent was added to each tube. After 3 min, 200 µl of 20% Na₂CO₃ solution was added to each tube and mixed thoroughly. Each tube was then placed in boiling water for exactly 1 min. The samples were cooled and their absorbance was measured at 650 nm in a micro-titer plate. The concentration of total phenols was estimated from the standard curve and expressed as mg phenols per 100 g of material.

Chlorophyll extraction was performed following the protocol of Sadasivam and Manickam 89 . About 1 g of powdered leaf sample of O. sanctum, O. kilimandscharicum and O. gratissimum was extracted with chilled 80% acetone in triplicate until the residue turned colorless. The supernatant was collected in a volumetric flask and the final volume was made up to 100 ml with chilled 80% acetone. Absorbance of the extracts was measured at 645 nm, 663 nm and 652 nm against 80% acetone as the blank. The amount of chlorophyll in the extract (mg chlorophyll per g tissue) was calculated from equations in which A = absorbance at the specific wavelength, V = final volume of the chlorophyll extract in 80% acetone and W = fresh weight of the tissue extracted.

Enzyme assays.
Young leaves were used to prepare the soluble protein extracts. Whole leaves of each species were weighed and ground in liquid nitrogen in triplicate. Extraction was carried out in ice-chilled protein extraction buffer (10:1, w/v) containing 50 mM BisTris [2-[bis(hydroxyethyl)amino]-2-(hydroxymethyl)-1-propane-1,3-diol] HCl, pH 8.0, 14 mM β-mercaptoethanol and 10% (w/v) glycerol, followed by incubation for 30 min on ice. The ground mixture was then centrifuged at 14,000 g for 20 min at 4 °C, and the clarified supernatant containing the protein extract was transferred to a new tube. The Bradford method 90 was used to quantify the protein concentration of the extract. The isolated protein was used to assay the PAL (phenylalanine ammonia lyase), C4H (cinnamate 4-hydroxylase), 4CL (4-coumarate:CoA ligase) and CAD (cinnamyl alcohol dehydrogenase) enzymes, whose activities were measured following the procedures described by Gang et al. 56 , Misra et al. 91 , Rastogi et al. 38 and Fu et al. 74 , respectively. Each enzyme assay was set up in a reaction volume of 1 ml containing 1 mg of the extracted plant protein. Two controls were included: one contained all reaction components except the protein, and the other all reaction components except the substrate. Reactions were incubated for 2 hours at 30 °C and stopped by the addition of 50 µl of 6 N HCl. The product was extracted twice by adding an equal volume of ethyl acetate, vortexing and centrifuging at 14,000 g for 5 min, followed by evaporation of the organic phase under vacuum. Product identity was verified by gradient high-performance liquid chromatography (HPLC) (LCMS-2010 EV, Shimadzu) as described by Proestos [...] at a flow rate of 1 ml/min.
Column effluent was monitored at 254 nm, 280 nm and 320 nm, and the product was identified from spectral scans on the photodiode-array detector by comparing its retention time and UV spectrum with those of authentic standards. For the 4CL enzyme assay, activity was measured as substrate utilization rather than product formation, because a p-coumaroyl CoA standard was not commercially available.
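For the chlorophyll estimation described earlier in the Methods, the equations themselves are not reproduced in the text. The Python sketch below therefore assumes the widely used Arnon coefficients for extracts in 80% acetone; the coefficients, function name and absorbance values are assumptions and may differ from the equations the authors applied. A, V and W follow the definitions given in the text, and the 652 nm reading is not used here.

```python
def chlorophyll_mg_per_g(a645, a663, volume_ml, weight_g):
    """Chlorophyll a, b and total (mg per g tissue), assumed Arnon equations.

    a645, a663: absorbance at 645 nm and 663 nm against an 80% acetone blank
    volume_ml:  final volume V of the extract in 80% acetone
    weight_g:   fresh weight W of the tissue extracted
    """
    if weight_g <= 0:
        raise ValueError("tissue weight must be positive")
    factor = volume_ml / (1000.0 * weight_g)
    chl_a = (12.7 * a663 - 2.69 * a645) * factor
    chl_b = (22.9 * a645 - 4.68 * a663) * factor
    total = (20.2 * a645 + 8.02 * a663) * factor
    return chl_a, chl_b, total


# Hypothetical absorbances for 1 g of tissue made up to 100 ml.
chl_a, chl_b, total = chlorophyll_mg_per_g(0.2, 0.5, 100.0, 1.0)
print(round(chl_a, 4), round(chl_b, 4), round(total, 4))
```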
Statistical analysis. One-Way Analysis of Variance (ANalysis Of VAriance) with post-hoc Tukey HSD (Honestly Significant Difference) test 93 was used for performing all the statistical analysis used in the study at *P < 0.01 and **P < 0.05 levels of significance with 'ns' meaning non-significant.