In silico mining and functional analysis of AP2/ERF gene in Withania somnifera

Withania somnifera owing to its strong and remarkable stress tolerance property is a reliable candidate for the determination of genes involved in mechanism of adaption/tolerance of various stress conditions. 187 AP2/ERF gene related transcripts (GRTs) were identified during comprehensive search in W. somnifera transcriptome repertoire. Major hits in homology search were observed from the model plant Arabidopsis and members of Solanaceae family. Cloning, expression analysis of the gene and genetic transient transformation with the gene (WsAP2) were performed to predict its functional role in planta. Enhanced expression of some of the pathway genes for terpenoid biosynthesis was observed in transformed tissues in comparison to the control tissues. It is speculated that WsAP2 gene crucially regulates the expression of GGPPS gene in addition to the regulation of other important genes of terpenoid pathway via induction of expression of other genes such as HMGR, CAS, DXS and DXR. To the best of our knowledge, this is the first report representing detailed study of AP2/ERF gene family in W. somnifera. It is also suggested from the study that gene might have role in eliciting responses to combat stress and attribute the strong stress tolerant property associated with the plant.

Sequence alignment and phylogenetic analysis. The query sequences from W. somnifera transcriptome were subjected to multiple sequence alignment along with the the sequences from S. lycopersicum, C. annum and N. tabacum and other plants using tools available (https://www.ebi.ac.uk/Tools). The dataset was then used for tree construction using MEGA 6.06 (https://www.megasoftware.net/). The extent of closeness among the sequences was determined through maximum likelihood using earlier method 32 . Highest log likelihood was applied to construct the tree with branch length obtained through calculation of occurrence of substitution at a particular position.
Motif analysis. MEME suite, a motif based sequence analysis tool was used for motif identification with average amino acid sequences of upto 25 (http://meme-suite.org/) with a value of 35 as maximum number of motifs allowed for analysis. MEME was used to predict the probability of amino acid at a particular site in the pattern 33 . Plant materials. Withania somnifera (NMITLI-118) plants were grown the experimental farm and glasshouse of the CSIR-Central Institute of Medicinal and Aromatic Plants, Lucknow, India. Young plantlets of five leaf stage (one month old) were taken for stress and elicitation treatments while mature (six months old) plants were harvested for tissue wide expression analysis and gene isolation studies.
Isolation of RNA and cDNA synthesis. Flowers of W. somnifera (NMITLI-118) were collected and frozen in liquid nitrogen immediately after harvesting. For the isolation of total RNA, TRI reagent (Sigma-Aldrich, US) was utilized following the standard manufacturer's instructions. The quality of RNA was assesed on 0.8% agarose gel and quantity was checked by spectrophotometric analysis (ND-1000 Nanodrop, NanoDrop Technologies, US). cDNA was synthesized by using the Revert Aid kit for cDNA synthseis (Fermentas, US) following the manufacturer's methods.
Tissue wide transcript quantification studies of WsAP2. For relative quantification of levels of transcripts of WsAP2, different plant parts (leaf, root, berry and flower) were collected from six months old mature plants of W. somnifera and immediately frozen in liquid nitrogen. Further, RNA was isolated and cDNA was synthesized following standard procedure provided by manufacturers (Fermentas) followed by real time PCR analysis using SYBR green master mix (Applied Biosystems). Gene specific real time primers WST11RTF and WST11RTR were used with keeping β-actin gene as an internal reference control. Reactions were carried out Stress and elicitor treatments. One month old plantlets of five leaf stage were employed for various elicitor and stress treatments such as gibberellic acid (GA), methyl jasmonate (MeJA), salicylic acid (SA), wounding, heat-shock and cold. For GA, SA and MeJA treatments, plantlets were dipped in distilled water supplemented with 0.1 mM and 1 mM SA, GA and MeJA separately for 3 hrs, 6hrs, 9hrs, 12hrs, 24 hrs, and 48 hrs, respectively. For heat shock treatment, plantlets were kept at 65 °C for 30 min and 1 hr, for cold treatment plantlets were kept at 4 °C for 30 min and 1 hr. The wound treatment was given by rubbing and puncturing the leaves by sterile syringe, followed by dipping the plantlets in distilled water for 30 min and 1 hr. RNA isolation was done followed by cDNA synthesis and real time transcript quantification of WsAP2 was carried out by methods described above using same set of primers.
Full length clone of WsAP2 (Node_3814) was obtained using cDNA synthesized from RNA of W somnifera flowers with gene specific primers. For amplification, Pfu DNA polymerase (Fermentas, US) was used and the PCR thermocyling program started from denaturation at 94 °C for 3 min, followed by 32-35 cycles at 94 °C for 40 sec, 55 °C for 1 min and 72 °C for 2.0 min, and terminated by a final extension step at 72 °C for 10 min. The purified product was then cloned in the pJET1.2 vector and transformed in E. coli DH5α cells. Positive clones were screened by colony PCR and used for plasmid isolation. Plasmid was digested by appropriate restriction enzymes and cloned into pBI121 vector containing CaMV 35 S promoter. Positive clones were confirmed by PCR and digestion and was further utilized for transformation in A. tumefaciens. Transient transformation of W. somnifera with WsAP2-pBI121 construct. Transient transformation was performed to transiently overexpress WsAP2 transcription factors in W. somnifera by the method described in our earlier studies 35 . The cycling parameters were same as described above and β-actin was taken as endogenous control for semi-quantitative PCR. The semi quantitative expression pattern of WsAP2 was visualized on 0.8% agarose gel. Quantitative real time PCR was performed to validate the results obtained from semi-quantitative PCR and to quantify the WsAP2 transcript abundance in transformed tissues. qRT-PCR was performed with an ABI PRISM 7500 Real-Time PCR System (ABI, US) with a thermocycling program of 95 °C for 30 sec, followed by 40 cycles of amplification (95 °C for 5 sec, 60 °C for 20 sec, 72 °C for 20 sec). For normalization of expression values, actin of W. somnifera was used as a reference gene. All the reactions were conducted with three biological replicates in 10 µl reaction volume.

GUS assay of putative transformants.
A. tumefaciens infected putative tissue transformants after transient expression were checked for GUS expression 36 . Briefly, transiently transformed leaves tissues were checked for expression of GUS by utilizing in-situ GUS reaction mixture. Tissues were incubated for overnight at 37 °C followed by washing with sterile double distilled water. Further, the tissues were dipped in 70% ethanol for removal of chlorophylls. The tissues were observed under a microscope (Leica EZ4D, Switzerland), and scored for the presence of GUS foci with photo-documentation.
Transcript abundance studies on secondary metabolite pathway genes in WsAP2 transformed tissues. The selected genes for this study were CAS (cycloartenol synthase), HMGR (3-hydroxy-3-methylglutaryl coenzyme A reductase), DXS (1-deoxy-D-xylulose-5-phosphate synthase), DXR (1-deoxy-D-xylulose 5-phosphate reductoisomerase) and GGPPS (geranylgeranyl diphosphate synthase). RNA was isolated from wild type and transiently transformed tissues of WsAP and cDNA was synthesized. This cDNA was utilized for qRT-PCR analysis as described above. The primer sequences of selected genes are mentioned in Table S1.

Mining and annotation of AP2/ERF GRTs. Transcripts with significant range of length for AP2/ERF
GRTs were observed in the transcriptome (Pooled berry tissue transcripts represented by red bars as various NODE ids and green and yellow bars denote transcripts named as various contig ids for leaf and root tissues respectively, as in our previous reports). Overall 187 AP2/ERF GRTs were found to be present in W. somnifera transcriptomes. Significant number of AP2/ERF GRTs of considerable lengths were observed in berry (89), leaf (36) and root (62) tissues during the study which included Blastx analysis for homology search, GO analysis, Interproscan assignment analysis etc. The transcripts having length between 500 to 3000 bp were plotted together for the three tissues ( Fig. 1). Top hits were observed for genes reported from A. thaliana, various members from Solanaceae family (S.lycopersicum, C. annum, N. tabacum, S. tuberosum etc.), G. max and various other plants (Fig. 2). The major descriptions for the hits observed in AP2/GRTs were AINTEGUMENTA-like 6, AP2 family protein, ERF family protein, cytokinin response factor, 4, AP2 6 l, ethylene responsive element binding factor2 and so on. 27 transcripts out of total input transcripts were observed to be annotated with Arabidopsis AP2/ERF gene sequences at PLANTTFDB containing important gene definitions (Table 1).
Gene ontology (GO) assignment. Gene ontology search assigned various terms to the gene transcripts such as cell growth, transport, response to stress, response to stimulus (abiotic, biotic and endogenous), signal transduction, flower and embryo development, secondary metabolic process and so on. Similarly, for metabolic function DNA-binding transcription activity, DNA binding, protein binding and so on were the major terms in addition to the transporter activity. Additionally, for cellular component the major expressed terms were nucleus, cytoplasm, plasma membrane, cytosol and some other terms (Fig. 3). (2020) 10:4877 | https://doi.org/10.1038/s41598-020-60090-7 www.nature.com/scientificreports www.nature.com/scientificreports/ Phylogenetic reconstruction of AP2/ERF GRTs in W. somnifera. Multiple sequence alignment was done using the amino acid sequences of putative AP2/ERF GRTs against proteins. MEGA 6.06 software was used for analysis of the phylogeny and molecular evolutionary pattern (Fig. 4). On analyzing the closeness of AP2/ ERF GRTs from W. somnifera with C. annum, S. lycopersicum and N. tabacum, 26 transcripts branched as separate groups in which starting from the first group anticlockwise 7 transcripts were classified as separate clades and 18 other transcripts together clubbed in other clade. Root contig 13458 was classified as a separate branch. In another group 21 W. somnifera AP2/ERF GRTs were grouped separately but lying in the same clade having 3 AP2/ERF genes from S. tuberosum, and 1 related gene from C. annum which were closely placed with root contigs 57460, 18800, 18921 and 6893. Leaf contig 28320, Node 26035 and root contig 45742 were branched separately. Similarly, some other transcripts from W. somnifera in the same group were observed to form different clades. 12 AP2/ERF GRTs were aligned together with 11 AP2 genes from Solanaceae species containing 6 sequences from C. annum, 1 from N. tabacum and 4 from S. tuberosum (Fig. 4). Node 7315, 95324 and leaf contig 7075, 14770  www.nature.com/scientificreports www.nature.com/scientificreports/ branched as additional separate groups,. Further, the next two groups mostly were populated with the reference gene sequences from Solanaceae species. Importantly, the AP2/ERF GRTs importantly that aligned with the above mentioned groups were Node 69997, 57223, 8963, 22774, 38413, and leaf contig 37257. Node 38413 was observed to be closely related to S. tuberosum, N. tabacum and C. annum assigned to the same group but branched differently (Fig. 4).  www.nature.com/scientificreports www.nature.com/scientificreports/ Conserved motif analysis. The annotated sequences contain enriched motifs for AP2/ERF related transcription factors. The analysis suggested that various AP2/ERF genes contain conserved motifs. Three motifs which were of considerable importance, have been shown for the selected sequences (Fig. 5). Major proportions of W. somnifera GRTs showed full AP2/ERF domains while some contained partial domains. The 3 majorly observed motifs were SKKLYRGVRQRPWGKWVAEIRLP as motif 1, AARAYDAAALKLRGKKA as motif 2, and KLNFPENRP as motif 3. The domain importantly present in all the sequences was IPR001471 which has been defined as AP2/ERF domain. The motifs are closely located in some of the sequences while distantly apart in various other sequences (Fig. 5).
cloning of WsAP2. The complete ORF of WsAP2 was cloned in pJET 1.2 blunt end vector and sequenced (Fig. 6a). The full length (1098 bp) amplicon of WsAP2 from W. somnifera codes for the predicted protein of 366 amino acids (Fig. 6a-e). The calculated mass of WsAP2 protein was 40.2 kDa. Homology matching of WsAP2 sequence with transcriptomic dataset (Node_3814) confirmed existence of 99 percent sequence identity. The phylogenetic tree constructed for WsAP2 sequence with the top hit matching sequences during homology search also revealed the closeness of the WsAP2 gene sequence with other members of family Solanaceae such as S. tuberosum (XP_006341423.1), N. tabacum (XP_016457719.1), N. attenuata (XP_019244135.1), C. annum (XP_016562990.1) and C. chinense (PHU24180.1) (Fig. 7).
Tissue specific abundance of WsAP2 transcripts. Since expression level of a gene varies across parts of a plant under various biotic and abiotic stresses 37,38 , we analyzed the expression levels of WsAP2 in different tissues of the W somnifera . Our investigations suggested that WsAP2 was expressed ubiquitously in all parts of W. somnifera. Comparative analysis of expression revealed that it was highest in flower followed by leaf. Lowest expression level were recorded in root tissue (Fig. 8a).  www.nature.com/scientificreports www.nature.com/scientificreports/ jasmonate and gibberellic acid), wounding, heat and cold treatments (Fig. 8a-e). In the case of 1 mM gibberellic acid treatment, increment in expression level was recorded after 9 hrs (95 folds) (Fig. 8b).The expression of the gene was upregulated upto 245 folds during the first 24 hrs after treatment with salicylic acid after which down regulation of gene occurred (Fig. 8c). The most profound effect was observed with MeJA treatment with 1 mM concentration where the elevated expression was recorded after 9 hrs of treatment (upto 155 folds) followed by a decreased expression at 12 hrs intervals (Fig. 8d). Wounding increased the transcripts level of WsAP2 after one hour of treatment (282 folds) while cold treatment reduced the expression of WsAP2 from 30 min (32 folds) to one hour (26 folds) post treatment. Heat treatment reduced the expression level of transcripts in one hour (16 folds) as compared to 30 min (38 folds) (Fig. 8e). www.nature.com/scientificreports www.nature.com/scientificreports/ Transcript abundance of WsAP2 and expression level of pathway genes in transiently transformed leaf tissues. The transient transformation assay was chosen for assessment of the functional role of WsAP2 gene. WsAP2 was successfully cloned in pBI121 (Fig. 6a,b) and overexpressed in W. somnifera leaf explant as indicated by the GUS histochemical analysis (Fig. 9a). Semi-quantitative as well as quantitative real time PCR analysis was done to assess the expression level of WsAP2 in transformed tissues of W. somnifera in comparison with wild type and empty vector control (Fig. 9b,c). Expression levels of transcripts of WsAP2 in transiently transformed tissues was 6 to 8 times higher in comparison to wild type tissues and empty vector transformed tissues. The data revealed that the elevated expression in transcripts level of WsAP2 gene in the transgenic tissues, with variation in transcript abundance in different transformed lines.

Modulation of
By using qRT-PCR analysis the expression level of secondary metabolic pathway (terpenoids) related genes like CAS, HMGR, DXS, DXR and GGPPS in transiently transformed tissues of WsAP2 was also monitored. It was found that all the transcripts showed increased expression in transformed tissues as compared to wild type and empty vector control but the expression of GGPPS was more pronounced (more than 80 folds) than the remaining gene. This result suggested that WsAP2 may be a master regulator of GGPPS gene and also regulated the MVA and MEP pathway of terpenoid biosynthesis via inducing the expression of HMGR, CAS, DXS and DXR gene (Fig. 9d).   www.nature.com/scientificreports www.nature.com/scientificreports/ be pest and disease free with improved yield and fewer requirements of fertilizers and water 39,40 . Environmental factors and pathogens have strong impact on plants during its development. Since AP2/ERF genes play significant role in plant development and various stress responses, whether biotic or abiotic, these present ideal candidature for investigating the regulatory mechanism of related process. The improved tolerance for biotic and abiotic stress requires special focus on plants under the physiological and molecular mechanism towards these stresses as for example in W. somnifera. The significance of the objective to identify and isolate AP2/ERF GRTs is towards understanding the molecular genetic basis which would facilitate the improvement of W. somnifera and provide the functional genetic resource meant for transgenic research. Transcription factors are very important in modulation of acclimatization of plant responses towards different external or internal cues. These significantly govern downstream gene expression in response to stress exposure via gene activation/repression in case of stress and signal transduction pathways. These are present in the plant genome in large numbers. In our previous study, we have demonstrated the presence of significant proportion of transcription factors in W. somnifera 29 . On the basis of conserved AP2-related domains AP2/GRTs were identified to be grouped as AP2/ERF superfamily members in W. somnifera transcriptome in this analysis. The availability of datasets for various plants at Plant (TFDB) facilitated the identification and comparison of families and groups at default parameters. The annotation as predicted, based on comparison with various databases (db) such as Arabidopsis genome db, Interpro db and TFCAT db confirmed the presence AP2/ERF domains in the sequences and thus suggests the correctness of annotation via these databases. The results of selected AP2/ERF genes through phylogenetic analysis were observed to be in confirmation with the results of whole transcripts phylogenetic comparison with other Solanaceae species members, as the selected Node_3814 was aligned close to the Nicotiana sp., S. tuberosum, Capsicum sp. Not every W. somnifera AP2/ERF gene has counterpart in N. tabacum, S. tuberosum, C. annnum depicting the probability that W. somnifera had undergone differential expansion separately. The candidate transcripts for AP2/ERF proteins were used in MEME motif analysis for identification of conserved motifs in families and subfamilies. The results from conserved motif analysis may provide leads to further classify the putative candidates, as identical motifs are likely to have similar function 41 .

Disscussion
Various transcription factors were cloned and characterized from W. somnifera but the role of AP2-ERF TF was still not understood. For the assesment of the role of AP2-ERF transcription factor in W. somnifera via in-silico methods followed by in-planta validation, this study was carried out.The transient overexpression study of WsAP2 showed that this gene was successfully transformed in W. somnifera. It was already known that AP2 transcription factor was involved in growth and development of plant as well as in the regulation of biotic and abiotic stress response 3,42 . It was reported that salicylic acid and jasmonic acid are important activators of defence related genes in plant system. In addition to this plant hormone gibberellic acid regulates the crosstalk of different signaling cascades occurring during different abiotic stress responses 43 , such as drought, salt, and cold. In our current study, the transient overexpression of WsAP2 was found to be induced not only by salicylic acid, methyl jasmonate and gibberellic acid but also by abiotic stresses like wound, heat and cold treatments. This has led to the conclusion that WsAP2 might act as a connecting link between different signalling pathways abiotic stress responses in plant. Jasmonic acid is involved in the rearrangement of gene expression of secondary metabolism in response to various types of environmental and developmental stimuli. Thus, jasmonic acid is a strong inducer of secondary metabolism 44 . Several jasmonate (methyl jasmonate) inducible AP2 transcription factors were studied to be involved in the regulation of secondary metabolic pathway related enzymes. For example overexpression of AaERF1 or AaERF2 were associated with increased accumulation of artemisinin and artemisinic acids 45,46 . In our study, it was found that after transient overexpression of WsAP2 in plant, the expression levels of secondary metabolism related genes like CAS, HMGR, DXS and DXR was increased and GGPPS were maximally induced after WsAP2 overexpression confirming the involvement of WsAP2 in terpenoid metabolism.