Genome-wide identification and analysis of GRAS transcription factors in the bottle gourd genome

GRAS genes belong to a plant-specific transcription factor (TF) family known to be involved in plant growth and development. In this study, we identified 37 genes in the bottle gourd genome that encode GRAS TFs. Except for SCLA, we were able to identify at least one gene from each of the 17 subfamilies. Gene structure and chromosomal analysis showed that the largest number of genes, seven, is present on Chr7, followed by six genes on Chr1. The subcellular location analysis revealed that most of the genes were localized in the nucleus, except for a few in the chloroplast and mitochondria. Additionally, we identified one tandem gene duplication event on Chr7 and three major motifs that were present in all the GRAS genes. Furthermore, the protein–protein interaction prediction and gene expression analysis showed that five candidate hub genes interact with various other genes and thus probably control the expression of their interacting partners in different plant tissues. Overall, this study provides a comprehensive analysis of GRAS transcription factors in the bottle gourd genome, which could be further extended to other vegetable crops.

www.nature.com/scientificreports/
Sequence analysis and subcellular location prediction. The ProtParam and BUSCA web servers were used to compute the protein properties and predict the subcellular location of the GRAS genes, respectively 26,27 . To identify the motif signatures present in the GRAS genes, the MEME suite (Multiple Em for Motif Elicitation) web server was used with the following settings: maximum number of motifs, 10; minimum motif length, 6; maximum motif length, 50; and minimum and maximum sites per motif, 2 and 37, respectively 28 .
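The motif search above was run on the MEME web server. As a rough sketch of how the same settings might be expressed for the command-line `meme` program, the snippet below assembles an argument list. The input file name `gras_proteins.fasta` and the output directory `meme_out` are hypothetical, and the assumption that these flags reproduce the web-form settings exactly has not been checked against the original run.

```python
def meme_arguments(fasta_path, out_dir, n_sequences=37):
    """Build a MEME argument list mirroring the settings stated in the text.

    fasta_path and out_dir are placeholders chosen for illustration.
    """
    return [
        "meme", fasta_path,
        "-protein",                     # input sequences are proteins
        "-oc", out_dir,                 # output directory (overwritten if present)
        "-nmotifs", "10",               # maximum number of motifs
        "-minw", "6",                   # minimum motif length
        "-maxw", "50",                  # maximum motif length
        "-minsites", "2",               # minimum sites per motif
        "-maxsites", str(n_sequences),  # maximum sites per motif
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so no MEME install is needed.
    print(" ".join(meme_arguments("gras_proteins.fasta", "meme_out")))
```

The list can be passed to `subprocess.run` on a machine where MEME is installed; here it is only printed so the settings can be reviewed.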
Phylogenetic tree. We considered 397 GRAS TFs from eight different species that had been classified into 17 different subfamilies 11 . A local BLAST database was constructed, and each of the 37 identified GRAS genes was subjected to a BLAST search against this database. Based on the best hit, each gene was assigned to the respective subfamily 11 . Subsequently, the final dataset of 434 genes was used to perform multiple sequence alignment (MSA) using the MUSCLE tool 29 , and MEGA7 was used to construct the phylogenetic tree using the JTT model with gamma distribution and complete deletion of gaps 30 . Finally, the tree was visualized using the iTOL (Interactive Tree of Life) software 31 .
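The best-hit assignment described above can be sketched in a few lines. This is a minimal illustration, assuming the BLAST results are in the standard 12-column tabular format (query ID in column 1, subject ID in column 2, bit score in column 12) and that "best hit" means the highest bit score. All sequence IDs and the reference-to-subfamily table are invented placeholders.

```python
import csv
import io


def assign_subfamilies(blast_tabular, subject_to_subfamily):
    """Map each query to the subfamily of its highest-bit-score hit."""
    best = {}  # query ID -> (bit score, subject ID)
    for row in csv.reader(io.StringIO(blast_tabular), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, subject)
    return {q: subject_to_subfamily[s] for q, (_, s) in best.items()}


# Hypothetical example: two queries, three hits.
EXAMPLE_HITS = (
    "queryA\trefDELLA1\t62.0\t500\t190\t0\t1\t500\t1\t500\t1e-150\t640\n"
    "queryA\trefPAT1\t35.0\t480\t312\t5\t10\t490\t8\t470\t1e-60\t210\n"
    "queryB\trefPAT1\t58.0\t510\t214\t2\t1\t510\t3\t512\t1e-140\t600\n"
)
EXAMPLE_SUBFAMILIES = {"refDELLA1": "DELLA", "refPAT1": "PAT"}

if __name__ == "__main__":
    print(assign_subfamilies(EXAMPLE_HITS, EXAMPLE_SUBFAMILIES))
```

Ranking by lowest e-value instead of highest bit score would be an equally plausible reading of "best hit"; the text does not say which criterion was applied.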

Protein-protein interaction (PPI) prediction and differential gene expression (DGE).
Bottle gourd and Cucumis melo (C. melo) are phylogenetically closely related species that belong to the same family, Cucurbitaceae; therefore, we used C. melo as a reference to search for GRAS genes and their interacting partners in the STRING database 32 . All the identified interacting partners were collected and queried against the bottle gourd genome using BLAST at an e-value threshold of 1e-10. The single best hit for each gene was used to construct a PPI network in Cytoscape 33 . Finally, the top five hub genes in the interaction network were predicted using the cytoHubba plugin of Cytoscape 34 . Further, to understand the contribution of these genes, we searched for transcriptome data from different bottle gourd tissues. We found one dataset in the Cucurbitaceae genome database with gene expression (FPKM) values for five different tissues. The FPKM values of all 37 GRAS genes were extracted and plotted as a heatmap to understand the relationship.
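The hub-gene step can be illustrated with a simple degree ranking, which is one of several scoring methods offered by cytoHubba. The sketch below is not the plugin itself: it assumes an undirected network given as a list of gene pairs, ignores self-loops and duplicate edges, and breaks ties alphabetically. The gene names are placeholders.

```python
from collections import defaultdict


def top_hubs(edges, k=5):
    """Return the k nodes with the most distinct neighbours, as (node, degree)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Sort by descending degree, then by name so that ties are reproducible.
    ranked = sorted(neighbours, key=lambda n: (-len(neighbours[n]), n))
    return [(n, len(neighbours[n])) for n in ranked[:k]]


# Hypothetical toy network: g1 is connected to every other node.
EXAMPLE_EDGES = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"),
                 ("g2", "g3"), ("g5", "g1")]

if __name__ == "__main__":
    print(top_hubs(EXAMPLE_EDGES, k=2))
```

Storing neighbours in sets means that an interaction listed twice, or in both directions, is counted once.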

Identification of GRAS transcription factors.
We identified 38, 37 and 37 genes encoding GRAS TFs using HMMScan, PfamScan, and InterProScan, respectively. Further, domain-based analysis using the SMART tool confirmed the presence of 37 genes from the GRAS family. Therefore, we selected these 37 genes for further in-depth analysis. In addition to the GRAS domain, we identified a DELLA domain in four genes, WD40 in one gene, and a maximum of nine low-complexity regions (Table 1). The search for chromosomal locations identified a minimum of one gene each on chromosomes 8 and 11, while a maximum of seven genes was found on chromosome 7 (Figure 1).
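The three domain searches above returned slightly different gene counts (38, 37 and 37). One simple way to reconcile such lists, shown here only as an illustration and not necessarily the procedure used in this study, is to keep the genes reported by every tool and inspect the remainder by hand. The gene IDs below are placeholders.

```python
def consensus_hits(*hit_lists):
    """Return the IDs reported by every tool."""
    return set.intersection(*(set(hits) for hits in hit_lists))


def tool_specific_hits(hits, *other_hit_lists):
    """Return the IDs reported by one tool but by none of the others."""
    others = set().union(*(set(h) for h in other_hit_lists))
    return set(hits) - others


# Hypothetical example: the first tool reports one extra gene.
HMMSCAN_HITS = ["geneA", "geneB", "geneC"]
PFAMSCAN_HITS = ["geneA", "geneB"]
INTERPROSCAN_HITS = ["geneB", "geneA"]

if __name__ == "__main__":
    print(sorted(consensus_hits(HMMSCAN_HITS, PFAMSCAN_HITS, INTERPROSCAN_HITS)))
    print(sorted(tool_specific_hits(HMMSCAN_HITS, PFAMSCAN_HITS, INTERPROSCAN_HITS)))
```

Genes flagged by `tool_specific_hits` are the natural candidates for a follow-up domain check such as the SMART analysis mentioned in the text.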
Analysis of gene structure, duplicated genes and sequence properties. As observed from the
Phylogenetic analysis. A phylogenetic tree was constructed from 434 sequences, including 397 from previously published work and 37 from the bottle gourd genome. As observed in Fig. 2, all 37 identified GRAS genes could be divided into 16 different subfamilies. Further analysis revealed that, except for SCLA, at least one gene from each subfamily was present in the genome (Fig. 2). A maximum of six genes was observed in the Phytochrome A signal transduction (PAT) subfamily, followed by four genes each in the HAM and DELLA subfamilies. In contrast, the LS and Required for arbuscule development (RAD) subfamilies were the smallest, with only one gene each.

Protein-protein interaction (PPI) prediction and DGE analysis. PPI prediction analysis revealed that a total of 178 unique genes from C. melo were involved in 467 possible interactions. Next, we searched for their homologs in the bottle gourd genome. Two genes (XP_008452266.1 and XP_008459103.1) had no homologs and were therefore excluded from further analysis. Finally, we obtained a set of 169 unique genes from the bottle gourd genome that possibly interact with each other to control various biological functions. Further, network analysis revealed 169 nodes and 467 edges, with a clustering coefficient of 0.452 and an average of 5.4 neighbours per node. Based on the maximum degree, we identified Lsi02G029020, Lsi02G024800, Lsi01G004880, Lsi04G000500, and Lsi06G005500 as the candidate hub genes, with 22, 21, 20, 19, and 19 interactions, respectively (Fig. 3). We also studied the expression level of the GRAS transcription factors in different plant tissues and observed that the Lsi01G004880 gene was upregulated in all five tissues, with higher expression (> 6-fold) in the root tissue (Fig. 4). Similarly, Lsi04G000500 and Lsi04G005500 were expressed in the root tissue, with little or no expression in the stem and leaves. On the other hand, the Lsi02G024800 gene showed down-regulation in the stem and leaves, while no significant change in the expression pattern of the Lsi02G029020 gene was observed (Fig. 4). The GRAS genes Lsi05G003650 and Lsi07G014150 were expressed in roots but showed no significant expression in the stem tissues.
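To make the network statistics reported above concrete, the sketch below computes the node count, the average number of neighbours, and a clustering coefficient for a small undirected graph. It assumes the network-level clustering coefficient is the mean of the per-node coefficients, with nodes that have fewer than two neighbours contributing zero; whether the Cytoscape analysis used in this study follows exactly this convention is an assumption. The toy edge list is hypothetical.

```python
from itertools import combinations


def network_stats(edges):
    """Return (node count, average neighbours, mean local clustering coefficient)."""
    adjacency = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    n_nodes = len(adjacency)
    avg_neighbours = sum(len(nb) for nb in adjacency.values()) / n_nodes

    local = []
    for nbrs in adjacency.values():
        k = len(nbrs)
        if k < 2:
            local.append(0.0)  # coefficient undefined for k < 2, counted as 0
            continue
        # Count the edges that exist among this node's neighbours.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adjacency[u])
        local.append(2 * links / (k * (k - 1)))
    return n_nodes, avg_neighbours, sum(local) / n_nodes


# Hypothetical toy graph: a triangle (a, b, c) with one pendant node d.
TOY_EDGES = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

if __name__ == "__main__":
    print(network_stats(TOY_EDGES))
```

In the toy graph, nodes a and b have a coefficient of 1, node c has 1/3, and node d has 0, giving a mean of 7/12.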

Functional annotation of Hub and interacting genes.
We investigated the function of the five candidate hub genes based on their involvement in biological and cellular processes. The Lsi02G024800 gene encodes a DELLA protein that acts as a repressor of GA-induced growth and interacts with genes encoding an auxin efflux carrier, a gibberellin receptor, and proteins involved in lignin degradation and detoxification. Thus, it eventually controls root growth, seed germination and stem elongation. Similarly, GO-term analysis showed that the Lsi02G029020 gene has DNA-binding transcription factor activity, is involved in the regulation of transcription and gene expression, and is localized in the nucleus. This gene interacts with transcriptional activator genes that control genes involved in stamen development, cell expansion, and flowering time and that also modulate root growth. Next, the hub gene Lsi01G004880 encodes a scarecrow-like protein that interacts with 10 other proteins, including a Zn-finger domain protein and a phytohormone protein. The phytohormone gene controls the phototropic response by modulating the light signal. Similarly, the Lsi04G000500 gene also belongs to the scarecrow-like class and interacts with genes such as serine/arginine-rich SC35-like splicing factor SCL28, jasmonate O-methyltransferase (JMT), and Nutcracker (NUC). JMT converts jasmonate into methyl jasmonate and plays an important role in plant defense. The NUC gene acts as a transcriptional activator and is involved in the regulation of flowering and asymmetric cell division. Lsi06G005500 also belongs to the scarecrow-like class and acts on auxin response factor (ARF), gibberellin 2-beta-dioxygenase 1 (GA2OX1), and phytochrome, which are involved in plant growth and development.
In the present study, we explored the GRAS family in the bottle gourd genome, including gene structure, chromosomal distribution, phylogenetic analysis, and gene expression in different tissues.
In this study, a total of 37 GRAS genes were computationally identified in the bottle gourd genome, which is fewer than in tomato and Cucurbita but more than in Arabidopsis. Previous studies reported that GRAS genes are mostly encoded by a single exon, with protein lengths of around 400-770 amino acids 35 . We observed a similar pattern in bottle gourd, with small variation in gene length. Intronless genes are a characteristic feature of prokaryotic genomes; this suggests horizontal gene transfer from prokaryotes as well as a close evolutionary relationship among the different members 4 . Further, our subcellular localization analysis agrees with previous studies showing that most of these genes are localized in the nucleus. We also found one tandem gene duplication event that could play an important role in the expansion of the GRAS gene family.
It is well known that some genes of the DELLA subfamily, such as GAI, RGA, and RGL, act as repressors of gibberellin signaling, while SCR and SHR are involved in radial root development 5,38,39 . Expression of the DELLA gene (a negative regulator of GA signaling) in multiple tissues highlighted its role in plant growth and development 40,41 . Similarly, SCL3 is involved in root elongation, and SCR interacts with the RGA gene, which ultimately controls the root meristem size in Arabidopsis 42 . Previous studies reported that mutation of the SCR gene disrupts asymmetric cell division and thus affects root growth and development. Thus, a higher expression of the SCR gene in plant roots is helpful for the radial organization of roots 43 .
Bottle gourd has been widely used as a rootstock to control different types of biotic and abiotic plant stress 44-47 . Previous studies suggested that the SCL14 gene in Arabidopsis is essential for the activation of stress-inducible promoters 48 . Likewise, overexpression of VaPAT1 and OsGRAS23 confers abiotic stress tolerance in Arabidopsis and rice 37,49 . In 2019, Garcia-Lozano et al. compared the transcriptomes of bottle gourd grafted on watermelon and vice versa 44 . They reported more than 400 mobile RNAs between the different heterografts and observed that the use of bottle gourd as rootstock increased the size and rind thickness of the watermelon fruits. Similarly, Liu et al. reported the differential expression of 787 genes between the watermelon homograft and heterograft (bottle gourd rootstock) 16 . To highlight the importance of bottle gourd as rootstock, Wang et al. (2020) analyzed the transcriptomes of bottle gourd (rootstock) and watermelon (scion) under chilling treatment. They reported that the bottle gourd homograft, as well as the heterograft, is tolerant to chilling stress compared to the watermelon homograft 50 .
Thus, an in-depth analysis of GRAS transcription factors in the bottle gourd genome will be helpful for enhancing the use of bottle gourd as a valuable rootstock material and could also be extended to other vegetable crops.