GRAS genes belong to a plant-specific transcription factor (TF) family known to be involved in plant growth and development. In this study, we identified 37 genes in the bottle gourd genome that encode GRAS TFs. Except for SCLA, we identified at least one gene from each of the 17 subfamilies. Gene structure and chromosomal analysis showed that the largest number of genes (seven) is present on Chr7, followed by six genes on Chr1. Subcellular location analysis revealed that most of the genes were localized in the nucleus, except for a few in the chloroplast and mitochondria. Additionally, we identified one tandem gene duplication event on Chr7 and three major motifs that were present in all the GRAS genes. Furthermore, protein–protein interaction prediction and gene expression analysis showed that five candidate hub genes interact with various other genes and thus probably control the expression of their interacting partners in different plant tissues. Overall, this study provides a comprehensive analysis of GRAS transcription factors in the bottle gourd genome, which could be further extended to other vegetable crops.
Transcription factors (TFs) are a class of proteins that control the functioning of various genes by binding to their promoters and are thus involved in gene regulation1. Previously, more than 320,000 TFs belonging to 58 transcription factor families have been reported from 165 different plant species. Of these, GRAS represents one of the major families involved in plant growth, development, cell signaling, and stress tolerance2. The GRAS family was first reported in bacteria and is characterized by three TFs, i.e. (i) GAI (gibberellic-acid insensitive), (ii) RGA (repressor of GAI), and (iii) SCR (scarecrow), with sizes ranging from 400 to 770 amino acids3,4. GAI and RGA are DELLA proteins that take part in the gibberellin (GA) and jasmonate (JA) responses as well as in light signaling. Likewise, SCR and Short Root (SHR) play a key role in the radial organization of the root by forming the SCR/SHR complex5.
It has also been reported that the C-terminal region of GRAS members is highly conserved, while the N-terminal region is highly divergent, which might provide specificity to each protein6. Furthermore, evolutionary analysis has suggested horizontal gene transfer (HGT) from bacteria to plants. The GRAS gene family is further categorized into subfamilies: eight in Arabidopsis and rice7, and 8 to 13 in species such as tomato, poplar, and castor bean8,9,10. Recently, Cenci et al. classified the GRAS family into 17 subfamilies, including DELLA, Lateral suppressor (LS), Hairy meristem (HAM), and SCR11. Previous studies show that it is one of the most widely explored transcription factor families in various plant species, including tomato, potato, buckwheat, and sweet orange12,13,14.
Bottle gourd (Lagenaria siceraria), a member of the Cucurbitaceae family, is commonly cultivated in tropical and subtropical regions and is believed to have originated in southern Africa15. It is a diploid species (2n = 2× = 22) with 22 chromosomes and belongs to the genus Lagenaria. In 2017, the first draft genome of Lagenaria siceraria cultivar USVL1VR-Ls was reported, with a total of 22,472 genes covering a 313.4 Mb region of the genome. Genome-wide studies in bottle gourd, such as the identification of graft-responsive mRNAs and miRNAs, have been carried out previously16,17. However, an in-depth genome-wide analysis of the GRAS transcription factor family is still lacking for bottle gourd. Therefore, considering the importance of bottle gourd as a rootstock material and the role of GRAS transcription factors in plant growth and development, we performed a comprehensive genome-wide search of the bottle gourd genome to identify the GRAS TFs, their phylogenetic relationships, and their expression patterns in different plant tissues.
Material and methods
The genome and proteome of bottle gourd (Lagenaria siceraria cultivar USVL1VR-Ls) were downloaded from the Cucurbit Genomics Database (CuGenDB, https://cucurbitgenomics.org/)15,18. The hidden Markov model (HMM) profile of the GRAS TF (PF03514.14) was downloaded from the Pfam database for use with the HMMER software19,20. We used PfamScan, InterProScan, and hmmscan to identify the GRAS transcription factors [Suppl. Figure-S1]20,21. PfamScan was used to search the complete proteome against the HMM profile at an e-value and domE cut-off of 1e−03. Similarly, hmmscan was used with an incdomE cut-off of 1e−03. InterProScan was run against the complete Pfam database, and the hits containing the GRAS domain (PF03514) were then retained22. The SMART tool (https://smart.embl-heidelberg.de/) was used to further confirm the presence of the GRAS domain in the respective hits23.
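The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: it assumes the hmmscan results were saved in HMMER's whitespace-delimited `--domtblout` layout, that the Pfam profile is named `GRAS`, and that a simple set intersection is an acceptable way to reconcile the three tools. The function name, protein IDs, and scores are all invented for the example.

```python
def gras_hits_from_domtblout(lines, domain_name="GRAS", cutoff=1e-3):
    """Collect protein IDs whose GRAS hit passes both e-value cut-offs.

    Assumes HMMER's --domtblout layout for hmmscan:
    field 0 = profile name, field 3 = query protein,
    field 6 = full-sequence E-value, field 12 = domain i-Evalue.
    """
    hits = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        profile, protein = fields[0], fields[3]
        seq_evalue, dom_evalue = float(fields[6]), float(fields[12])
        if profile == domain_name and seq_evalue <= cutoff and dom_evalue <= cutoff:
            hits.add(protein)
    return hits


# Hypothetical rows: protein IDs, accessions versions, and scores are invented.
example_rows = [
    "# target name accession tlen query name ...",
    "GRAS PF03514.14 374 protA - 520 1.2e-110 368.1 0.0 1 1 3.0e-114 1.5e-110 367.8 0.0 1 374 140 515 140 515 0.98 GRAS domain family",
    "GRAS PF03514.14 374 protB - 310 0.5 3.1 0.0 1 1 0.0004 0.7 2.9 0.0 40 90 12 60 10 65 0.70 GRAS domain family",
    "WD40 PF00400.35 38 protC - 450 2.0e-12 46.0 0.1 1 1 5.0e-15 2.5e-11 42.5 0.0 2 38 100 136 99 136 0.93 WD domain, G-beta repeat",
]
hmmscan_hits = gras_hits_from_domtblout(example_rows)

# Consensus across tools: keep only proteins reported by every search.
pfamscan_hits = {"protA"}      # hypothetical PfamScan result
interproscan_hits = {"protA"}  # hypothetical InterProScan result
consensus = hmmscan_hits & pfamscan_hits & interproscan_hits
print(sorted(consensus))
```

Here `protB` is dropped because its e-values exceed the cut-off and `protC` because its hit is to a different profile; only proteins supported by all three searches remain in `consensus`.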
Gene structure, chromosomal location, and gene duplication
To determine the chromosomal location of the genes encoding GRAS TFs, the nucleotide sequences of the respective hits were extracted and searched against the bottle gourd genome using BLAST24. To identify tandem gene duplications, the DupGen_finder software was used, which requires BLAST results and GFF files for the target and outgroup species25. In this study, BLASTP was run with an e-value cut-off of 1e−10 and a maximum of five target hits, with A. thaliana as the outgroup species.
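The idea behind tandem duplicate detection can be illustrated with a simplified adjacency rule: two genes are called a tandem pair when they are direct neighbours on the same chromosome and are also BLASTP hits of each other. The sketch below is an assumption-laden illustration of that rule only; it is not DupGen_finder's implementation, and the function name, the neighbouring gene names, and the gene order are hypothetical.

```python
def find_tandem_pairs(gene_order, homolog_pairs):
    """Return (chromosome, gene_a, gene_b) for adjacent homologous genes.

    gene_order:    dict mapping chromosome -> gene IDs sorted by start position
                   (as could be derived from a GFF file).
    homolog_pairs: iterable of (gene_a, gene_b) BLASTP hits that already
                   passed the e-value cut-off.
    """
    # Store pairs as unordered sets so that A-B and B-A are equivalent.
    homologs = {frozenset(pair) for pair in homolog_pairs if pair[0] != pair[1]}
    tandem = []
    for chrom, genes in gene_order.items():
        for left, right in zip(genes, genes[1:]):  # every adjacent pair
            if frozenset((left, right)) in homologs:
                tandem.append((chrom, left, right))
    return tandem


# Illustrative input: the neighbour and distant genes are placeholders.
gene_order = {
    "Chr7": ["neighbour_left", "Lsi07G002180.1", "Lsi07G002190.1", "neighbour_right"],
    "Chr1": ["distant_homolog", "other_gene"],
}
homolog_pairs = [
    ("Lsi07G002180.1", "Lsi07G002190.1"),   # adjacent and homologous
    ("Lsi07G002180.1", "distant_homolog"),  # homologous but on another chromosome
]
tandem_pairs = find_tandem_pairs(gene_order, homolog_pairs)
print(tandem_pairs)
```

Only the adjacent pair on Chr7 is reported; the homolog on Chr1 would belong to a different duplication class in a full analysis.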
Sequence analysis and subcellular location prediction
The ProtParam and BUSCA web servers were used to compute the protein properties and to predict the subcellular location of the GRAS genes, respectively26,27. To identify the motif signatures present in the GRAS genes, the MEME suite (Multiple EM for Motif Elicitation) web server was used with the following settings: maximum number of motifs, 10; minimum motif length, 6; maximum motif length, 50; and minimum and maximum sites per motif, 2 and 37, respectively28.
We considered 397 GRAS TFs from eight different species that had been classified into 17 subfamilies11. A local BLAST database was constructed from these sequences, and each of the 37 identified GRAS genes was searched against it. Based on the best hit, each gene was assigned to the corresponding subfamily11. Subsequently, the final dataset of 434 genes was aligned using the MUSCLE tool29, and MEGA7 was used to construct the phylogenetic tree using the JTT model with gamma distribution and complete deletion of gaps30. Finally, the tree was visualized using iTOL (Interactive Tree of Life)31.
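The best-hit assignment can be sketched as follows. This is a minimal example that assumes the BLAST search was written in the standard 12-column tabular format (`-outfmt 6`, with the bit score in the last column) and that "best hit" means the highest bit score; the source does not state which criterion was used. The function name, query and reference IDs, and all scores are hypothetical.

```python
def assign_subfamilies(blast_lines, subject_to_subfamily):
    """Map each query to the subfamily of its highest-scoring reference hit.

    Assumes BLAST tabular columns: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}  # query -> (subject, bitscore)
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        query, subject, bitscore = cols[0], cols[1], float(cols[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject_to_subfamily[subject] for query, (subject, _) in best.items()}


# Hypothetical reference annotation and BLAST rows, for illustration only.
subject_to_subfamily = {"ref_DELLA_1": "DELLA", "ref_PAT_1": "PAT"}
blast_lines = [
    "queryA\tref_PAT_1\t35.0\t380\t230\t9\t90\t470\t120\t500\t2e-40\t160",
    "queryA\tref_DELLA_1\t78.5\t500\t100\t3\t1\t500\t10\t509\t1e-150\t820",
    "queryB\tref_PAT_1\t66.2\t540\t170\t5\t1\t540\t1\t538\t0.0\t700",
]
assignments = assign_subfamilies(blast_lines, subject_to_subfamily)
print(assignments)
```

Because the comparison is made on the score rather than on row order, the assignment does not depend on how the BLAST output happens to be sorted.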
Protein–protein interaction (PPI) prediction and differential gene expression (DGE)
Bottle gourd and Cucumis melo (C. melo) are closely related species belonging to the Cucurbitaceae family; therefore, we used C. melo as a reference to search for GRAS genes and their interacting partners in the STRING database32. All the identified interacting partners were collected and queried against the bottle gourd genome at an e-value of 1e−10 using BLAST. The single best hit for each gene was used to construct a PPI network in Cytoscape33. Finally, the top five hub genes in the interaction network were predicted using the cytoHubba plugin of Cytoscape34. Further, to understand the contribution of these genes, we searched for transcriptome data from different bottle gourd tissues. We found one dataset in the Cucurbitaceae genome database with gene expression (FPKM) values for five different tissues. The FPKM values of all 37 GRAS genes were extracted and plotted as a heatmap to understand the relationship.
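Ranking hub genes by degree amounts to counting the distinct interaction partners of each node and keeping the top few. The sketch below illustrates that calculation on a toy edge list; it does not reproduce cytoHubba itself, and the function name, the node labels, and the alphabetical tie-breaking rule are assumptions made for the example.

```python
from collections import defaultdict


def top_hubs(edges, k=5):
    """Return the k nodes with the most distinct neighbours as (node, degree)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)  # sets collapse duplicate edges
        neighbours[b].add(a)
    # Sort by descending degree; break ties alphabetically for reproducibility.
    ranked = sorted(neighbours, key=lambda node: (-len(neighbours[node]), node))
    return [(node, len(neighbours[node])) for node in ranked[:k]]


# Toy interaction network with placeholder node names.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("B", "A")]
hubs = top_hubs(edges, k=2)
print(hubs)
```

In this toy network, node `A` has three partners and is ranked first; `B`, `C`, and `D` tie with two partners each, so the tie-breaking rule decides which of them is reported second.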
Identification of GRAS transcription factors
We identified 38, 37, and 37 genes encoding GRAS TFs using hmmscan, PfamScan, and InterProScan, respectively. Further, domain-based analysis using the SMART tool confirmed the presence of the GRAS domain in 37 genes. Therefore, we selected these 37 genes for further in-depth analysis. In addition to the GRAS domain, we identified a DELLA domain in four genes, a WD40 domain in one gene, and a maximum of nine low-complexity regions (Table 1). The chromosomal location search identified a minimum of one gene each on chromosomes 8 and 11 and a maximum of seven genes on chromosome 7 (Figure 1).
Analysis of gene structure, duplicated genes and sequence properties
As shown in Table 2, the GRAS family protein sequence length varies from 378 to 1466 amino acids, with an isoelectric point (pI) ranging from 4.7 to 8.22. Analysis of the gene structure revealed that 25 of the 37 genes were encoded by a single exon. However, one gene (Lsi06G016090.1), encoding a protein of 1466 amino acids, contained 13 exons (Table 2). We searched for gene duplication events and observed one tandem duplication on Chr7 involving the genes Lsi07G002180.1 and Lsi07G002190.1. Further, the prediction of subcellular location showed that most of the genes were confined to the nucleus, whereas 10 genes were assigned to the chloroplast and one each to the mitochondria, the endomembrane system, and the extracellular space. We also identified the 10 most prominent motifs in the GRAS genes, with motif lengths varying from 16 to 41 amino acids [Suppl-1.docx: Table-S1]. Except for the Lsi07G013900 gene, the predicted motifs were localized in the C-terminal region of the genes. We observed that three motifs, of length 21, 21, and 25, respectively, were highly conserved and present in all the GRAS genes [Suppl.-1.docx].
A phylogenetic tree was constructed from 434 sequences, including 397 from previously published work and 37 from the bottle gourd genome. As shown in Fig. 2, the 37 identified GRAS genes could be divided into 16 subfamilies. Further analysis revealed that, except for SCLA, at least one gene from each subfamily was present in the genome (Fig. 2). A maximum of six genes was observed in the Phytochrome A signal transduction (PAT) subfamily, followed by four genes from the HAM and DELLA subfamilies. In contrast, LS and Required for arbuscule development (RAD) were the smallest subfamilies, with only one gene each.
Protein–protein interaction (PPI) prediction and DGE analysis
PPI prediction analysis revealed that a total of 178 unique genes from C. melo were involved in 467 possible interactions. Next, we searched for their homologs in the bottle gourd genome. Two genes (XP_008452266.1 and XP_008459103.1) had no homologs and were therefore excluded from further analysis. Finally, we obtained a set of 169 unique genes from the bottle gourd genome that possibly interact with each other to control various biological functions. Network analysis revealed 169 nodes and 467 edges, with a clustering coefficient of 0.452 and an average of 5.4 neighbours per node. Based on the maximum degree, we identified Lsi02G029020, Lsi02G024800, Lsi01G004880, Lsi04G000500, and Lsi06G005500 as the candidate hub genes, with 22, 21, 20, 19, and 19 interactions, respectively (Fig. 3). We also studied the expression level of the GRAS transcription factors in different plant tissues and observed that the Lsi01G004880 gene was up-regulated in all five tissues, with higher expression (> 6-fold) in the root tissue (Fig. 4). Similarly, Lsi04G000500 and Lsi04G005500 were expressed in the root tissue but showed little or no expression in the stem and leaves. On the other hand, the Lsi02G024800 gene was down-regulated in the stem and leaves, while no significant change in the expression pattern of the Lsi02G029020 gene was observed (Fig. 4). The GRAS genes Lsi05G003650 and Lsi07G014150 were expressed in roots but showed no significant expression in the stem tissues.
Functional annotation of Hub and interacting genes
We investigated the function of the five candidate hub genes based on their involvement in biological and cellular processes. The Lsi02G024800 gene encodes a DELLA protein that acts as a repressor of GA-induced growth and interacts with genes encoding an auxin efflux carrier, a gibberellin receptor, and proteins involved in lignin degradation and detoxification. Thus, it eventually controls root growth, seed germination, and stem elongation. Similarly, GO-term analysis showed that the Lsi02G029020 gene has DNA-binding transcription factor activity, is involved in the regulation of transcription and gene expression, and is localized in the nucleus. This gene interacts with transcriptional activator genes that control the genes involved in stamen development, cell expansion, and flowering time and that also modulate root growth. Next, the hub gene Lsi01G004880 encodes a scarecrow-like protein that interacts with 10 other proteins, including a Zn-finger domain protein and a phytohormone protein. The phytohormone gene controls the phototropic response by modulating the light signal. Similarly, the Lsi04G000500 gene also belongs to the scarecrow-like class and interacts with genes such as serine/arginine-rich SC35-like splicing factor SCL28, jasmonate O-methyltransferase (JMT), and Nutcracker (NUC). The JMT gene converts jasmonate into methyl jasmonate and plays an important role in plant defense. The NUC gene acts as a transcriptional activator and is involved in the regulation of flowering and asymmetric cell division. Finally, the Lsi06G005500 gene also belongs to the scarecrow-like class and acts on auxin response factor (ARF), gibberellin 2-beta-dioxygenase 1 (GA2OX1), and phytochrome, which are involved in plant growth and development.
GRAS proteins have been recognized as important plant cellular components that play a role in signal transduction and in root and shoot development35,36, as well as in managing various kinds of biotic and abiotic stress37. In the present study, we explored the GRAS family in the bottle gourd genome, including gene structure, chromosomal distribution, phylogenetic relationships, and gene expression in different tissues.
In this study, a total of 37 GRAS genes were computationally identified in the bottle gourd genome, which is fewer than in tomato and Cucurbita but more than in Arabidopsis. Previous studies reported that GRAS genes are mostly encoded by a single exon, with a length of around 400–770 amino acids35. We observed a similar pattern in bottle gourd, with a small variation in gene length. Intronless genes are a characteristic feature of prokaryotic genomes; this suggests horizontal gene transfer from prokaryotes as well as a close evolutionary relationship among the different members4. Further, our subcellular localization analysis agrees with previous studies showing that most of these genes are localized in the nucleus. We also found one tandem gene duplication event, which could have played an important role in the expansion of the GRAS gene family.
It is well known that some genes of the DELLA subfamily, such as GAI, RGA, and RGL, act as repressors of gibberellin signaling, while SCR and SHR are involved in radial root development5,38,39. Expression of the DELLA genes (negative regulators of GA signaling) in multiple tissues highlights their role in plant growth and development40,41. Similarly, SCL3 is involved in root elongation, and SCR interacts with the RGA gene, which ultimately controls the root meristem size in Arabidopsis42. Previous studies reported that mutation of the SCR gene disrupts asymmetric cell division and thus affects root growth and development. Thus, a higher expression of the SCR gene in plant roots is helpful for the radial organization of roots43.
Bottle gourd has been widely used as a rootstock to control different types of biotic and abiotic plant stress44,45,46,47. Previous studies suggested that the SCL14 gene in Arabidopsis is essential for the activation of stress-inducible promoters48. Likewise, overexpression of VaPAT1 and OsGRAS23 confers abiotic stress tolerance in Arabidopsis and rice37,49. In 2019, Garcia-Lozano et al. compared the transcriptomes of bottle gourd grafted onto watermelon and vice versa44. They reported more than 400 mobile RNAs between the different heterografts and observed that the use of bottle gourd as rootstock increased the size and rind thickness of the watermelon fruits. Similarly, Liu et al. reported the differential expression of 787 genes between watermelon homografts and heterografts (bottle gourd rootstock)16. To highlight the importance of bottle gourd as rootstock, Wang et al. (2020) analyzed the transcriptomes of bottle gourd (rootstock) and watermelon (scion) under chilling treatment. They reported that bottle gourd homografts, as well as heterografts, are tolerant to chilling stress compared with watermelon homografts50. Thus, an in-depth analysis of GRAS transcription factors in the bottle gourd genome will be helpful for enhancing the use of bottle gourd as a valuable rootstock material and could also be extended to other vegetable crops.
Franco-Zorrilla, J. M. et al. DNA-binding specificities of plant transcription factors and their potential to define target genes. Proc. Natl. Acad. Sci. U. S. A. 111, 2367–2372 (2014).
Tian, F., Yang, D.-C., Meng, Y.-Q., Jin, J. & Gao, G. PlantRegMap: charting functional regulatory maps in plants. Nucleic Acids Res. 48, D1104–D1113 (2020).
Zhang, D., Iyer, L. M. & Aravind, L. Bacterial GRAS domain proteins throw new light on gibberellic acid response mechanisms. Bioinformatics 28, 2407–2411 (2012).
Zhang, H. et al. Genome-wide characterization of GRAS family genes in Medicago truncatula reveals their evolutionary dynamics and functional diversification. PLoS ONE 12, e0185439 (2017).
Helariutta, Y. et al. The SHORT-ROOT gene controls radial patterning of the Arabidopsis root through radial signaling. Cell 101, 555–567 (2000).
Song, X. M. et al. Genome-wide analysis of the GRAS gene family in Chinese cabbage (Brassica rapa ssp. pekinensis). Genomics 103, 135–146 (2014).
Tian, C., Wan, P., Sun, S., Li, J. & Chen, M. Genome-wide analysis of the GRAS gene family in rice and Arabidopsis. Plant Mol. Biol. 54, 519–532 (2004).
Chen, L. & Liu, Y.-G. Male sterility and fertility restoration in crops. Annu. Rev. Plant Biol. 65, 579–606 (2014).
Huang, W., Xian, Z., Kang, X., Tang, N. & Li, Z. Genome-wide identification, phylogeny and expression analysis of GRAS gene family in tomato. BMC Plant Biol. 15, 209 (2015).
Xu, W. et al. Genome-wide identification, evolutionary analysis, and stress responses of the GRAS gene family in castor beans. Int. J. Mol. Sci. 17, 1004 (2016).
Cenci, A. & Rouard, M. Evolutionary analyses of GRAS transcription factors in angiosperms. Front. Plant Sci. https://doi.org/10.3389/fpls.2017.00273 (2017).
Niu, Y., Zhao, T., Xu, X. & Li, J. Genome-wide identification and characterization of GRAS transcription factors in tomato (Solanum lycopersicum). PeerJ 2017, e3955 (2017).
Zhang, H. et al. Genome-wide identification, characterization, interaction network and expression profile of GRAS gene family in sweet orange (Citrus sinensis). Sci. Rep. 9, 1–16 (2019).
Liu, M. et al. Genome-wide identification, expression analysis and functional study of the GRAS gene family in Tartary buckwheat (Fagopyrum tataricum). BMC Plant Biol. 19, 342 (2019).
Wu, S. et al. The bottle gourd genome provides insights into Cucurbitaceae evolution and facilitates mapping of a Papaya ring-spot virus resistance locus. Plant J. 92, 963–975 (2017).
Liu, N. et al. Genome-wide identification and comparative analysis of grafting-responsive mRNA in watermelon grafted onto bottle gourd and squash rootstocks by high-throughput sequencing. Mol. Genet. Genomics 291, 621–633 (2016).
Liu, N., Yang, J., Guo, S., Xu, Y. & Zhang, M. Genome-wide identification and comparative analysis of conserved and novel MicroRNAs in grafted watermelon by high-throughput sequencing. PLoS ONE 8, e57359 (2013).
Zheng, Y. et al. Cucurbit genomics database (CuGenDB): a central portal for comparative and functional genomics of cucurbit crops. Nucleic Acids Res. 47, D1128–D1136 (2019).
Finn, R. D. et al. Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230 (2014).
Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240 (2014).
Finn, R. D. et al. InterPro in 2017—beyond protein family and domain annotations. Nucleic Acids Res. 45, D190–D199 (2017).
Schultz, J., Milpetz, F., Bork, P. & Ponting, C. P. SMART, a simple modular architecture research tool: identification of signaling domains. Proc. Natl. Acad. Sci. U. S. A. 95, 5857–5864 (1998).
Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).
Qiao, X. et al. Gene duplication and evolution in recurring polyploidization-diploidization cycles in plants. Genome Biol. 20, 38 (2019).
Savojardo, C., Martelli, P. L., Fariselli, P., Profiti, G. & Casadio, R. BUSCA: an integrative web server to predict subcellular localization of proteins. Nucleic Acids Res. 46, W459–W466 (2018).
Gasteiger, E. et al. Protein identification and analysis tools on the ExPASy server. in The Proteomics Protocols Handbook. Springer Protocols Handbooks (ed. Walker, J. M.) (Humana Press, 2005). https://doi.org/10.1385/1-59259-890-0:571.
Bailey, T. L. et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 37, W202–W208 (2009).
Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
Kumar, S., Stecher, G. & Tamura, K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33, 1870–1874 (2016).
Letunic, I. & Bork, P. Interactive tree of life (iTOL) v4: recent updates and new developments. Nucleic Acids Res. 47, W256–W259 (2019).
Szklarczyk, D. et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498–2504 (2003).
Chin, C. H. et al. cytoHubba: identifying hub objects and sub-networks from complex interactome. BMC Syst. Biol. 8, S11 (2014).
Bolle, C. The role of GRAS proteins in plant signal transduction and development. Planta 218, 683–692 (2004).
Hirsch, S. & Oldroyd, G. E. D. GRAS-domain transcription factors that regulate plant development. Plant Signal. Behav. 4, 698–700 (2009).
Xu, K. et al. OsGRAS23, a rice GRAS transcription factor gene, is involved in drought stress response through regulating expression of stress-responsive genes. BMC Plant Biol. 15, 141 (2015).
Willige, B. C. et al. The DELLA domain of GA INSENSITIVE mediates the interaction with the GA INSENSITIVE DWARF1A gibberellin receptor of Arabidopsis. Plant Cell 19, 1209–1220 (2007).
Gallagher, K. L. & Koizumi, K. Identification of SHRUBBY, a SHORT-ROOT and SCARECROW interacting protein that controls root growth and radial patterning. Development 140, 1292–1300 (2013).
Tyler, L. et al. DELLA proteins and gibberellin-regulated seed germination and floral development in Arabidopsis. Plant Physiol. 135, 1008–1019 (2004).
Chen, Y. et al. Identification and expression analysis of GRAS transcription factors in the wild relative of sweet potato Ipomoea trifida. BMC Genomics 20, 911 (2019).
Heo, J. O. et al. Funneling of gibberellin signaling by the GRAS transcription regulator SCARECROW-LIKE 3 in the Arabidopsis root. Proc. Natl. Acad. Sci. U. S. A. 108, 2166–2171 (2011).
Di Laurenzio, L. et al. The SCARECROW gene regulates an asymmetric cell division that is essential for generating the radial organization of the Arabidopsis root. Cell 86, 423–433 (1996).
Garcia-Lozano, M. et al. Transcriptome changes in reciprocal grafts involving watermelon and bottle gourd reveal molecular mechanisms involved in increase of the fruit size, rind toughness and soluble solids. Plant Mol. Biol. 102, 213–223 (2020).
Yang, Y., Yu, L., Wang, L. & Guo, S. Bottle gourd rootstock-grafting promotes photosynthesis by regulating the stomata and non-stomata performances in leaves of watermelon seedlings under NaCl stress. J. Plant Physiol. 186–187, 50–58 (2015).
Nawaz, M. A. et al. Improving vanadium stress tolerance of watermelon by grafting onto bottle gourd and pumpkin rootstock. Plant Growth Regul. 85, 41–56 (2018).
Mayrose, M., Ekengren, S. K., Melech-Bonfil, S., Martin, G. B. & Sessa, G. A novel link between tomato GRAS genes, plant disease resistance and mechanical stress response. Mol. Plant Pathol. 7, 593–604 (2006).
Fode, B., Siemsen, T., Thurow, C., Weigel, R. & Gatz, C. The Arabidopsis GRAS protein SCL14 interacts with class II TGA transcription factors and is essential for the activation of stress-inducible promoters. Plant Cell 20, 3122–3135 (2008).
Yuan, Y. et al. Overexpression of VaPAT1, a GRAS transcription factor from Vitis amurensis, confers abiotic stress tolerance in Arabidopsis. Plant Cell Rep. 35, 655–666 (2016).
Wang, Y. et al. A universal pipeline for mobile mRNA detection and insights into heterografting advantages under chilling stress. Hortic. Res. 7, 13 (2020).
The authors are thankful to the Department of Biotechnology (DBT) (project BTISNET) for providing the bioinformatics facilities at the School of Agricultural Biotechnology, PAU, Ludhiana.
The authors declare no competing interests.
Sidhu, N.S., Pruthi, G., Singh, S. et al. Genome-wide identification and analysis of GRAS transcription factors in the bottle gourd genome. Sci Rep 10, 14338 (2020). https://doi.org/10.1038/s41598-020-71240-2