Genome-wide identification and expression analysis of the plant specific LIM genes in Gossypium arboreum under phytohormone, salt and pathogen stress

Asiatic cotton (Gossypium arboreum), cultivated as ‘desi cotton’ in India, is renowned for its climate resilience and robustness against biotic and abiotic stresses. The genome of G. arboreum is therefore considered a valuable reserve of information for discovering novel genes or gene functions for trait improvement in the present context of cotton cultivation worldwide. In the present study, we carried out a genome-wide analysis of the LIM gene family in desi cotton and identified twenty LIM domain proteins (GaLIMs), which include sixteen animal CRP-like GaLIMs and four plant-specific GaLIMs with presence (GaDA1) or absence (GaDAR) of UIMs (Ubiquitin Interacting Motifs). Among the sixteen CRP-like GaLIMs, eleven had two conventional LIM domains, while five had a single LIM domain, which has not been reported in the LIM gene family of the plant species studied so far, except in Brassica rapa. Phylogenetic analysis of these twenty GaLIM proteins in comparison with LIMs of Arabidopsis, chickpea and poplar categorized them into distinct αLIM1, βLIM1, γLIM2 and δLIM2 groups among the CRP-like LIMs, and GaDA1 and GaDAR in the plant-specific LIM group. Domain analysis revealed the consensus sequences [(C-X2-C-X17-H-X2-C)-X2-(C-X2-C-X17-C-X2-H)] and [(C-X2-C-X17-H-X2-C)-X2-(C-X4-C-X15-C-X2-H)] to be conserved as the first and/or second LIM domains of animal CRP-like GaLIMs, respectively. Interestingly, the single LIM domain-containing GaLIM15 was found to contain a unique consensus with a longer inter-zinc-motif spacer but a shorter second zinc finger motif. All twenty GaLIMs showed variable spatio-temporal expression patterns and were accordingly further categorized into distinct groups of αLIM1, βLIM1, γLIM2, δLIM2 and plant-specific LIM (DA1/DAR). For the first time, the response of GaDA1/DAR to biotic and abiotic stresses was studied in cotton, involving treatments with phytohormones (jasmonic acid and abscisic acid), salt (NaCl) and a wilt-causing pathogen (Fusarium oxysporum). Expression patterns of GaDA1/DAR showed variable responses and identified GaDA2 as a probable candidate gene for stress tolerance in G. arboreum.


Chromosomal location and gene duplication.
To trace the chromosomal locations of GaLIM gene family members, the relevant data were retrieved from the G. arboreum genome and subsequently mapped using MapChart software version 2.30 38 . Gene duplication events were analyzed as described in earlier studies 39,40 . Synonymous (Ks) and non-synonymous (Ka) substitution rates were determined using DnaSP software, followed by estimation of the time of duplication events using the formula T = Ks/2λ, where the clock-like rate (λ) is 1.5 synonymous substitutions for every 10^8 years (that is, 1.5 × 10^-8 substitutions per synonymous site per year) for cotton 41 . The LIM protein sequences of G. arboreum and those of other plant species (P. trichocarpa, A. thaliana, C. arietinum) were aligned. The alignment of sixty-one LIMs was used to generate a phylogenetic tree using MEGA v.7.0 software 43 . A Maximum Likelihood (ML) tree was constructed using the JTT+G model. An ML tree was also generated for 37 DA1/DAR proteins from G. arboreum, C. arietinum, B. rapa and G. max using the JTT+G model 43-45 . Bootstrapping was performed with 1000 iterations and partial deletion was used for gap treatment. The number of amino acid residues, molecular weight and pI value of each GaLIM protein were calculated using the ProtParam tool (https://web.expasy.org/protparam). Subcellular localizations of GaLIM proteins were predicted using the online tool BaCelLo (Balanced Subcellular Localization Predictor) 46 .
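The duplication-time estimate T = Ks/2λ described above reduces to a short calculation. The following minimal Python sketch assumes λ = 1.5 × 10^-8 synonymous substitutions per site per year, the cotton rate quoted in the text. The function name and the example Ks values are hypothetical, chosen only to illustrate the arithmetic, and are not taken from the supplementary data.

```python
# Assumed clock-like rate for cotton: 1.5 synonymous substitutions
# per site per 10^8 years, i.e. 1.5e-8 per site per year.
LAMBDA_COTTON = 1.5e-8


def divergence_time_mya(ks: float, rate: float = LAMBDA_COTTON) -> float:
    """Return the duplication time T = Ks / (2 * rate), in million years ago."""
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if rate <= 0:
        raise ValueError("rate must be positive")
    years = ks / (2.0 * rate)
    return years / 1e6


if __name__ == "__main__":
    # Illustrative Ks values only (hypothetical, not from Table S7).
    for ks in (0.0543, 0.8571):
        print(f"Ks = {ks:.4f} -> T = {divergence_time_mya(ks):.2f} mya")
```

With these illustrative inputs the function returns about 1.81 and 28.57 mya. Dividing by 2λ rather than λ reflects that substitutions accumulate independently along both duplicated copies after the duplication event.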
Plant material. The diploid cotton variety G. arboreum cv. Roja, developed by ICAR-Central Institute for Cotton Research (ICAR-CICR), Nagpur, was used in the present study. The seeds of G. arboreum cv. Roja were grown both in vitro on half-strength Murashige and Skoog (MS) plant tissue culture medium and in vivo under field conditions at ICAR-CICR, Nagpur. For organ-specific expression analysis, hypocotyl, cotyledonary leaves and root tissues were collected from 15-day-old seedlings raised on MS tissue culture medium 47 , whereas anthers and ovaries were collected from flowers at 0 days post-anthesis (DPA), and ovules with fibres at 10 and 25 DPA, representing fibre developmental stages, were collected from field-grown plants. All
www.nature.com/scientificreports/
GaLIM14 and GaLIM20 were observed to be more closely related, with 92.02% similarity in protein sequence. BaCelLo predicted nuclear localization for eighteen GaLIMs; the exceptions were GaLIM4/PLIM2c and GaLIM18/GaDAR1, which were predicted to be secretory and chloroplast-localized proteins, respectively (Table 1). SMART analysis of the identified protein sequences grouped the LIM proteins of G. arboreum into two subfamilies (Supplementary Fig. S1 & S2). The first subfamily comprised sixteen GaLIM proteins similar to animal CRPs, and pair-wise amino acid analysis of those proteins showed a similarity index ranging from 31.94 to 92.02% (Supplementary Table S3). Notably, in addition to this, five proteins were predicted to have a single LIM domain (1LIM), with sizes ranging between 118 (12.8 kDa) and 202 (22.45 kDa) amino acids (Table 1). Among those five 1LIM proteins, GaLIM2 and GaLIM11 shared a common LIM domain consensus identical to that of the first LIM domain of 2LIM proteins, whereas GaLIM8 and GaLIM16 contained a consensus related to the second LIM domain of 2LIM proteins (Fig. 1). GaLIM15 had a unique LIM domain consensus of [(C-X2-C-X16-H-X2-C)-X8-(C-X9-C-X2-H)].
Distinctively, GaLIM15 had a relatively shorter second zinc finger motif due to a missing cysteine residue at the first position (Fig. 1). In addition, it featured a longer spacer of eight amino acids between the two Zn-finger motifs, unlike the commonly observed 2-3 amino acid spacers in other LIM domains.
Genomic localization and gene structural analysis. MapChart analysis revealed that the 20 identified GaLIMs were unevenly distributed across 9 of the 13 chromosomes (Chr.1 to Chr.13) of G. arboreum. A maximum of five GaLIMs were localized on Chr.8, while three each were observed on Chr.3, Chr.7 and Chr.13, and two on Chr.10. Chromosomes 4, 6, 11 and 12 contained one GaLIM each, whereas chromosomes 1, 2, 5 and 9 were devoid of any GaLIM. Plant-specific GaLIM (DA1/DAR) family members were localized on Chr.7, Chr.8 and Chr.13 (Table 1 & Fig. 2). Structural analysis of genomic regions with corresponding CDS revealed the organization of exon and intron in Table S4). The 2LIM proteins from G. arboreum (16 LIMs), Arabidopsis (6 LIMs), poplar (12 LIMs) and chickpea (9 LIMs) resulted in the segregation of 16 GaLIMs into four groups, namely αLIM1, βLIM1, γLIM2 and δLIM2, as supported by high bootstrap values (Fig. 3). Subsequently, the GaLIMs clustered within those four groups were renamed according to their tissue-specific expression patterns by adopting the nomenclature described by Arnaud et al. 7 . Table S7).
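The LIM domain consensus notation used in the domain analysis above, for example (C-X2-C-X17-H-X2-C)-X2-(C-X2-C-X17-C-X2-H), maps directly onto a regular expression, which makes it straightforward to scan protein sequences for a given motif architecture. The sketch below is only an illustration of that mapping: the helper names are hypothetical, and the test sequences are synthetic strings built from the consensus itself, not real GaLIM sequences.

```python
import re


def consensus_to_regex(consensus: str) -> str:
    """Translate 'C-X2-C-X17-H-X2-C' style notation into a regex string.

    'Xn' means any n residues; single letters are fixed residues.
    Parentheses in the notation only group the zinc fingers and are dropped.
    """
    parts = []
    for token in consensus.replace("(", "").replace(")", "").split("-"):
        if token.startswith("X"):
            parts.append(".{%d}" % int(token[1:]))
        else:
            parts.append(token)
    return "".join(parts)


def synthetic_sequence(consensus: str, filler: str = "A") -> str:
    """Build a synthetic sequence satisfying the consensus (illustration only)."""
    residues = []
    for token in consensus.replace("(", "").replace(")", "").split("-"):
        if token.startswith("X"):
            residues.append(filler * int(token[1:]))
        else:
            residues.append(token)
    return "".join(residues)


# Consensus strings as written in the text.
TWO_FINGER_STANDARD = "(C-X2-C-X17-H-X2-C)-X2-(C-X2-C-X17-C-X2-H)"
GALIM15_LIKE = "(C-X2-C-X16-H-X2-C)-X8-(C-X9-C-X2-H)"

if __name__ == "__main__":
    pattern = re.compile(consensus_to_regex(TWO_FINGER_STANDARD))
    for name, cons in (("standard", TWO_FINGER_STANDARD), ("GaLIM15-like", GALIM15_LIKE)):
        seq = "MM" + synthetic_sequence(cons) + "KK"
        print(name, "matches standard consensus:", bool(pattern.search(seq)))
```

A sequence built from the GaLIM15-like consensus does not match the standard pattern, which mirrors the point made in the text that its spacer and second zinc finger differ from the conventional architecture.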

Gene duplication. Paralogous gene pairs
Organ-specific gene expression and confirmation of phylogeny. Tissue- and/or developmental stage-specific expression of the 20 GaLIMs was confirmed through semi-quantitative PCR using cDNA templates from vegetative tissues (root, leaf and hypocotyl), floral tissues (anther and ovary) and developing fibres (10 DPA and 25 DPA) of G. arboreum cv. Roja (Fig. 4). As mentioned earlier, these 20 GaLIMs could be grouped into five major groups, namely αLIM1, βLIM1, γLIM2, δLIM2 and plant-specific LIM (DA1/DAR), based on their gene expression patterns, which also confirmed the deduced phylogenetic relationships.
αLIM1 group. The αLIM1 group comprises three members, namely GaWLIM1a, GaFLIM1a and GaFLIM1b. The pair-wise amino acid sequence similarities among them ranged from 76.06 to 79.59 percent, and their transcripts were detected in all cotton tissues under study. GaFLIM1a and GaWLIM1a showed a similar pattern of expression in all tissues/stages, with preferential expression during primary cell wall synthesis of cotton fibre (10 DPA) (Fig. 4). Notably, GaFLIM1b exhibited elevated expression in all stages and tissues under study, including both the early (10 DPA) and advanced (25 DPA) stages of cell wall biosynthesis of cotton fibre.
βLIM1 group. GaβLIM1 is the lone member of this group. It showed a pattern of gene expression similar to that of the GaαLIM1 group members, with preferential expression recorded in root, hypocotyl and cotton fibre at 10 DPA (Fig. 4).
γLIM2 group. This group featured members of the WLIM2 subgroup, with four duplicated members of GaWLIM2. Among those, GaWLIM2a and GaWLIM2c showed constitutively high expression in all the tissues and stages of cotton under study and shared 92.02 percent pair-wise identity at the amino acid level. In contrast, expression of GaWLIM2b and GaWLIM2d was observed in all the tissues except the reproductive parts (anther and ovary).
δLIM2 group. The δLIM2 group comprises three monophyletic subgroups with members from monocot PLIM2, eudicot PLIM2 and asterid δLIM2. Upon phylogenetic comparison of GaLIMs with members of the δLIM2 groups from Arabidopsis, chickpea and poplar, a total of eight GaLIMs were assigned to the δLIM2 group. Those GaLIMs were further separated into two monophyletic subgroups, namely PLIM2 and δLIM2 (PLIM2-like), based on expression patterns during pollen development. PLIM2 and δLIM2 each comprised four members; three of the four GaPLIM2 genes (GaPLIM2a, 2b and 2c) showed preferential expression in anther tissues, whereas GaPLIM2d was expressed in other tissues as well as in anther. Similarly, in the δLIM2 (PLIM2-like) subgroup, all members showed high expression during anther development and in immature cotton fibre at the 10 DPA stage, except GaδLIM2d, which was expressed exclusively at the anther development stage (Fig. 4). In addition, pair-wise amino acid sequence analysis of the four GaPLIM2 proteins showed a similarity range of 76.22 to 82.61 percent, indicating their close relatedness and probable duplication during evolution (Supplementary Table S2).
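The pair-wise percentages quoted for the groups above summarize how alike two aligned protein sequences are. The text does not state which tool or scoring scheme produced them, so the following Python sketch is only an assumption-laden illustration of the simplest variant, percent identity over the ungapped columns of a pre-computed pairwise alignment. The function name and the short aligned strings are hypothetical and are not GaLIM sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment (hypothetical): 5 ungapped columns, 4 identical.
    print(percent_identity("MKC-AL", "MRCGAL"))
```

Similarity indices that credit conservative substitutions (for example K to R) would give higher values than strict identity, so figures from different tools are not directly comparable.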

Expression patterns of plant-specific LIM (DA1/DAR) group. All the members of the plant-specific GaLIM (DA1/DAR) group showed constitutive expression in all the tissues and stages, with the exception of GaDA1, which had no detectable expression in root tissues. Notably, GaDA2 showed a higher expression level than GaDA1, GaDA3 or GaDAR1 (Fig. 4).

Expression analysis of plant-specific LIM genes in response to stresses. Hormonal treatment. The plant defense hormones ABA and JA were sprayed on 3-week-old cotton seedlings. The response of plant-specific LIM gene members to these hormones was quantified at the transcript level using quantitative real-time PCR (qRT-PCR). Analysis of the qRT-PCR data revealed that upon ABA treatment, only the GaDA3/GaLIM10 gene showed significant and steady up-regulation of expression, whereas the others, viz. GaDAR1, GaDA1 and GaDA2, exhibited variable expression patterns over time. GaDAR1 and GaDA2 showed significant down-regulation at 3 h and 6 h but significant up-regulation at 12 h, followed by significant and non-significant down-regulation at 24 h, respectively, relative to the control. GaDA1 showed significant down-regulation of expression compared to the control (Fig. 5a). In JA-treated seedlings, GaDA1 and GaDA2 showed significant up-regulation of expression after 6 h of treatment compared to the control. Except at 3 h, both genes followed a similar expression pattern at the different time points in response to JA treatment. On the other hand, GaDA3 and GaDAR1 showed variable expression at different time intervals upon JA treatment (Fig. 5b).
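The text above reports up- and down-regulation relative to control from qRT-PCR but does not state how relative expression was calculated. One widely used approach is the 2^-ΔΔCt method, and the sketch below assumes that method purely for illustration. The function name, the use of a single reference gene and the Ct values are all hypothetical and are not taken from this study.

```python
def relative_expression(
    ct_target_treated: float,
    ct_reference_treated: float,
    ct_target_control: float,
    ct_reference_control: float,
) -> float:
    """Fold change of a target gene by the 2^-ddCt method (assumed here).

    dCt  = Ct(target) - Ct(reference gene), computed per sample
    ddCt = dCt(treated) - dCt(control)
    fold = 2 ** (-ddCt); values > 1 indicate up-regulation vs control.
    """
    delta_treated = ct_target_treated - ct_reference_treated
    delta_control = ct_target_control - ct_reference_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Ct values: target amplifies two cycles earlier after treatment.
    fold = relative_expression(24.0, 18.0, 26.0, 18.0)
    print(f"fold change vs control: {fold:.1f}")
```

The method assumes roughly 100% amplification efficiency for both target and reference genes; when that does not hold, efficiency-corrected models are normally preferred.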
Salt treatment. The relative expression of genes in response to treatment with 200 mM NaCl was quantified, with mock-treated samples as the control. GaDAR1 responded significantly to the salt treatment, with an elevated transcript level in comparison to mock-treated samples. GaDA1 showed significant up-regulation at early induction periods (1 h and 3 h), but a steep decline in expression was noticed at 6 h, before significant up-regulation at 12 h and 24 h. In contrast, a significant decline in the transcript levels of GaDA2 and GaDA3 in response to salt treatment was noticed (Fig. 6).
Biotic stress. Expression analysis of LIM genes in response to challenge inoculation with F. oxysporum identified GaDA2 as a putative candidate gene, as it showed significant up-regulation across time intervals. GaDAR1 and GaDA3 showed induced responses during the early hours of treatment, but significant reductions in transcript level were observed for both at 24 h of treatment (Fig. 7). Upon treatment, expression of GaDA1 was significantly down-regulated.

Discussion
Biologically, cotton fibre is a single elongated epidermal cell whose development involves initiation, elongation, secondary cell wall synthesis and maturation phases. Its quality features, such as fibre length, strength and fineness, result from interactions between cellulosic and non-cellulosic components along with a network of cytoskeleton protein components. Fibre strength is strongly influenced by the thickness of the secondary cell wall and the ultra-structure of the cellulose 50,51 . A number of actin-binding proteins are known to play a pivotal role in fibre development from the elongation stage to the secondary wall synthesis stage 29,52 . Among such actin-binding proteins, LIM domain-containing protein families have recently been identified as potential regulators of actin dynamics and cytoskeleton reorganization in several plant species. Several LIM proteins, viz. GhWLIM1a, GhWLIM5, GhPLIM1 and GhXLIM6, were characterized for their roles in fibre and pollen development of upland cotton 16,18,31,32 . Apart from those few reports, a comprehensive understanding of the structure and function of LIM genes in cotton is still in its infancy. Hence, we attempted to unravel the LIM domain proteins of diploid G. arboreum using genomic information available in the public domain. To the best of our knowledge, this is the first report on comprehensive genome-wide characterization of LIM genes from G. arboreum. In the present study, 20, 18 and 35 LIM genes were identified in G. arboreum, G. raimondii and G. hirsutum, respectively. This disparity in the number of LIM genes among those three species might have resulted from differential selection pressure that occurred during the course of evolution and/or errors associated with the assembly of whole genome sequencing data 39 . The twenty candidate LIM proteins identified from G. arboreum contained either one or two LIM domains and were distributed across the genome.
The total number of LIM proteins identified from G. arboreum (twenty) was higher than those reported in A. thaliana (fourteen), P. trichocarpa (twelve) and C. arietinum (fifteen). Differences in the number of LIM genes between plant genera might be due to the differential expansion of the gene family that occurred during the course of their evolution. For instance, G. arboreum shared a common ancestor with Arabidopsis around 93 million years ago (mya). During the course of evolution, both underwent subsequent cycles of whole genome duplication, resulting in variable expansion of several gene families 34,53,54 . Our gene duplication analysis revealed that around 85% of the LIM gene family in G. arboreum showed predominantly segmental paralogous gene duplication, which might have been instrumental in the expansion of the GaLIM gene family. This is supported by our results indicating an approximate evolutionary timeline of 1.81 to 28.57 mya for the gene duplication events of GaLIMs (Supplementary Table S7), a range that includes the recent whole genome duplication of G. arboreum, which occurred around 13 to 20 mya 34 . In the phylogeny, these proteins were distinctly categorized into two sub-families (2LIM and plant-specific DA1/DAR) based on their number of LIM domains, and this sub-division was also supported by high bootstrap values (Fig. 3). A similar categorization of LIM gene family members was also deduced in different monocot and dicot plant families such as Brassicaceae, Solanaceae, Poaceae, Malvaceae and Fabaceae, suggesting their conserved evolutionary relatedness 7,11,12,14,15,17,18,24 . The first subfamily, called 2LIM (containing two LIM domains) and sharing features with animal CRPs, harbours sixteen GaLIMs, while the other four GaLIMs clustered with the plant-specific DA1/DAR subfamily.
Interestingly, in our study, out of 16 CRP-like GaLIMs, only 11 had two LIM domains (2LIM), while the remaining five LIM proteins were predicted to possess a single LIM domain (Supplementary Fig. S1). This finding is in contrast to earlier reports, which suggest that plant LIM proteins sharing structural analogy with animal CRPs typically contain two LIM domains separated by a spacer of around 50 amino acids 55 12 . The role of the LIM domain in plants is not clearly understood. However, the few LIM genes characterized so far are known to be associated with transcriptional regulation of phenylpropanoid pathway genes involved in secondary cell wall biosynthesis and pollen tube development, apart from their implicit role in actin dynamics 11,16,18,56 . Hence, it will be interesting to explore and functionally validate CRP-like proteins containing a single LIM domain in other plants, including cotton. The available whole genome information of various plants may be explored to derive deeper insights into plant LIMs.
There are subgroups of αLIM1, βLIM1, γLIM2 and δLIM2 which exhibit tissue- and/or stage-specific functions 7 . In the present study, expression pattern and phylogenetic analyses confirmed the presence of four distinct LIM subgroups, viz. FLIM1/XLIM1, WLIM1, WLIM2 and PLIM2, distributed within αLIM1, βLIM1, γLIM2 and δLIM2 (Figs. 3 and 4). The absence of subgroup PLIM1 was observed, which is in agreement with other plant species such as A. thaliana, O. sativa, P. trichocarpa, C. arietinum and B. rapa 7,12,14 . δLIM2 formed a large group with 8 of the 16 GaLIM members. In Arabidopsis, members of the PLIM2 group (AtPLIM2a-2c) were reported to be highly expressed in reproductive tissues, particularly in pollen, and are functionally linked to actin dynamics 57 . Similarly, a member of PLIM2 (PtPLIM2a) from poplar showed remarkably elevated expression in mature anthers 7 . In our study, four GaLIMs (GaPLIM2a-2d) clustered with PLIM2 group members from A. thaliana, P. trichocarpa and C. arietinum (Fig. 3). Moreover, GaPLIM2a and GaPLIM2c showed preferential expression in anthers, which suggested a functional analogy of GaPLIMs with those of A. thaliana, P. trichocarpa and C. arietinum (Fig. 4). These observations are also supported by the fact that PLIM2 exhibited high transcript abundance during pollen development in the majority of eudicots 7 . The other two GaPLIMs (GaPLIM2b and 2d), sharing close similarity with GaPLIM2a and 2c, were also expressed in anther tissue. In addition, they showed expression at cotton fibre development stages, indicating dual functions in relation to pollen and cotton fibre development. This is in agreement with earlier studies on expression patterns in poplar (PtPLIM2a and 2b) 24 . Gene duplication (tandem and/or segmental) is considered to be a key driving force for the expansion of gene families. Moreover, the duplicated gene members may retain their original function or undergo pseudogenization, subfunctionalization or neofunctionalization during evolution 40,64 .
Another sub-clade of δLIM (GaδLIM2/GaPLIM2-like), distantly related to the PLIM2 subgroup, was also observed in the phylogeny of GaLIMs. During the evolutionary expansion of asterid δLIM, the members diverged following PLIM2 gene duplication and formed a separate monophyletic clade, namely PLIM2-like (earlier δLIM2), distinct from the previously identified PLIM2 members from tobacco (NtPLIM2) and sunflower (HaPLIM2) 7,24 . However, despite differences in sequence identity, the members of PLIM2-like were also found to be strongly expressed in pollen. The phylogenetic and expression analyses revealed that four GaLIMs (GaδLIM2a-2d) clustered with P. trichocarpa δLIMs (PtδLIM2a and 2b) (Figs. 3 and 4). The expression specificities of GaδLIM2 genes suggested functional conservation during their evolutionary divergence from parental GaPLIM2. The expression patterns of GaδLIM2 members also indicated their potential roles during pollen development and cotton fibre development. Interestingly, δLIM2 members are also known to perform dual functions as actin modulators during pollen development and cottony hair formation, along with other functions in leaf and vascular tissues 24 .
GaWLIM1 members showed ubiquitous expression, with a tendency towards transcript abundance particularly during the cotton fibre developmental stage (Fig. 4). Several reports have elucidated the roles of WLIM1, including F/G/XLIM genes, in lignin biosynthesis and secondary cell wall biosynthesis via modulation of actin dynamics and of the expression patterns of PAL box genes involved in the phenylpropanoid pathway 58 . In P. trichocarpa, PtWLIM1a and PtWLIM1b are associated with lignin biosynthesis during secondary xylem formation 24 . In the present study, three GaLIMs (GaWLIM1, GaFLIM1a and 1b) showed close similarity with poplar WLIM1 members (Fig. 3), suggesting that GaWLIM1 may share the actin-modulating feature as well. This is further supported by studies in upland cotton, where GhLIMs (GhXLIM6, GhWLIM1a and GhWLIM5) were reported to play regulatory roles during cotton fibre development by interacting with F-actin dynamics and acting as negative regulators of fibre development 16,58 . Another distinct GaLIM (GaβLIM1a) was also expressed in all the tissues of G. arboreum, including the early fibre development stage corresponding to 10 DPA (Fig. 4). Despite exhibiting a pattern of gene expression similar to that of the GaWLIM1 subgroup, it formed a separate cluster distinct from GaWLIM1 owing to its sequence divergence and grouped with βLIM members of poplar and chickpea in the phylogeny; hence it was designated GaβLIM1a. βLIM expression in different tissues of poplar 7 and chickpea 14 supported this classification and nomenclature. Apart from the sixteen CRP-like LIM proteins, four other plant-specific GaLIMs (DA1/DAR) were found in G. arboreum, with presence or absence of UIM and a highly conserved C-terminal domain when compared with those of the Arabidopsis and chickpea genomes. Further, phylogenetic analysis classified the GaDA1 proteins of G. arboreum into three categories, i.e., GaDA1, GaDA2 and GaDA3 (Fig. 3).
Plant-specific LIMs of the DA1 and DAR types are known to be involved in biotic and abiotic stress responses and organ size regulation 13,14,22,59 . In our study, plant-specific GaDA1/GaDAR showed ubiquitous expression in all tissues. Similarly, in chickpea, the majority of CaDA1 and CaDAR members were expressed in most tissues and developmental phases. The functional significance of this LIM sub-family was investigated in Arabidopsis, where those genes were found to regulate organ size, apoptosis and freezing tolerance 60 , which points to their roles in biotic and abiotic stresses. Studies in chickpea under biotic and abiotic stresses also drew a similar correlation between DA1/DAR and stress response 22 . The phylogenetic closeness of GaDA1/DAR to their A. thaliana, B. rapa, G. max and C. arietinum counterparts seems to support our results (Supplementary Fig. S8). Hence, to investigate the role of GaDA1/DAR under stress, certain abiotic and biotic stresses were imposed. For this, application of defense hormones (JA and ABA), salt treatment (NaCl) and inoculation with a wilt-causing pathogen (F. oxysporum) were used to understand the response of GaDA1/DAR genes. Expression of all the members of the GaDA1/DAR family was altered compared to the control under the influence of the stress treatments. Among the notable changes, expression of GaDA1 in the ABA-treated and pathogen-inoculated treatments followed an almost similar pattern of down-regulation with respect to time and control. On the contrary, there was a steady up-regulation of GaDA2 in the JA and F. oxysporum treatments. A strong correlation of ABA and JA with the pathogenesis of F. oxysporum demonstrates the interference of this pathogen with the signaling pathways of ABA and JA 61 . Incidentally, the GaDA2 LIM protein harbors an additional SCOP domain (SCOP d1i9za) at amino acid positions 19 to 158, which belongs to the exonuclease-endonuclease-phosphatase (EEP) domain superfamily.
The EEP superfamily of proteins contains a catalytic domain characterized by phosphodiester bond cleavage activity, with substrates including nucleic acids, phospholipids and some proteins 62 . All these features are therefore relatable to a typical component of the cell repair mechanism, which often gets activated under stress. This probably justifies the designation of GaDA2 as a candidate gene for F. oxysporum-related biotic stress in G. arboreum. Apart from this, DA1 also contains a Ubiquitin Interacting Motif (UIM) that may participate in the ubiquitination process leading to several crucial cellular phenomena such as nucleotide repair and stress response 63 . The role of DAR LIM members in biotic and abiotic stress response has been demonstrated in B. rapa 12 and Arabidopsis 60 , respectively. In view of the variable responses of GaDAR under different abiotic and biotic stresses in cotton, further functional validation studies are required to ascertain their exact role in stress response.

Conclusion
This comprehensive study exploring the whole genome information of G. arboreum has revealed twenty LIM domain-containing proteins and categorized them into distinct groups on the basis of gene expression patterns and phylogenetic relationships. The study has identified a novel group of animal CRP-like GaLIM proteins containing only a single LIM domain with a unique Zn-finger motif architecture. This finding is the first of its kind in cotton and the second in plants after Brassica. The present study also widens the scope for a detailed understanding of the roles of each GaLIM protein corresponding to its LIM domain organization in plants. Moreover, quantitative gene expression studies have revealed plant-specific GaDA2 as a candidate gene for stress response in G. arboreum. These findings can be further validated and applied to develop a potential genetic marker for cotton breeding programs. This study sets benchmark parameters to search for new LIM-domain proteins in unexplored genomes of Gossypium species.