Introduction

Chickpea or Bengal gram (Cicer arietinum L.) is one of the most important food legume crops cultivated globally1. According to FAOSTAT (2020)2, among the major legume crops, soybeans lead with a production of approximately 355.37 million tons (mt), followed by peas at 34.99 mt, lima and common beans at 27.41 mt, chickpeas at 15.06 mt, broad beans at 7.47 mt s, cowpeas at 9.04 mt, and pigeonpeas at 5.05 mt. This positions chickpea as the fourth-largest legume crop globally2. India is the largest producer (11 million tons) where it produces approximately 73% of the total global production of chickpea2. The chickpea production globally is adversely affected by drought stress, which can cause up to 50% yield losses1. Prevalence of drought at pod filling stage, commonly referred to as terminal drought, results in reduced flower and pod production and reduced seed size in chickpea plants. With the ongoing global warming, the world is predicted to face even hotter climate and erratic rainfalls in the near future3,4. This will lead to an increase in both the incidence and intensity of drought stress during pod filling stage. Hence, it becomes imperative to understand the genetic basis and mechanisms for drought tolerance in chickpea for varietal improvement.

A small genome size of 740 Mbp and the availability of whole genome sequence of chickpea, renders the crop amenable to genomics-assisted breeding for accelerated development of varieties with enhanced drought tolerance. Drought tolerance is a complex phenomenon, which is governed by several genes and involves an intricate network of stress response pathways.

Various physiological and morphological traits have been used to screen drought tolerance in plants, for example water use efficiency (WUE), stomatal conductance (SC), relative water content (RWC), membrane stability index (MSI), canopy temperature depression (CTD), root biomass and architecture, leaf chlorophyll content, leaf rolling, yield and yield components under stress5,6. However, the genetics of many of these traits is not sufficiently studied in chickpea. The significance of these traits and their utility in chickpea improvement for drought tolerance has been substantiated with the observation that as many as 13 Meta-QTLs for plant vigor are co-mapped to a previously identified “QTL hotspot” region for drought tolerance7. It is believed that a component trait-based selection of plants is more useful inbreeding for improved drought tolerance. Different types of molecular markers have been used to identify the QTL/genes for these traits. A total of 93 QTLs for drought related traits were mapped using SSR markers in a bi-parental mapping population of the cross ILC 588/ ILC 32798,9. The “QTL-hotspot” harboring 12 QTLs, governing different drought tolerance traits, with up to 58.2% of the phenotypic variation explained, was mapped on chickpea chromosome 4, by genotyping using 241 SSR markers of an intra-specific RIL population derived from cross ICC 4958 × ICC 188210. This QTL-hotspot became a focal point of investigation for many chickpea geneticists and breeders. The “QTL hotspot” on CaLG04 was narrowed down to about 14 cm, by using additional SNP markers11. With the advent of high throughput next generation sequencing (NGS) technologies, single nucleotide polymorphism (SNP) markers have become a favorite choice for genotyping. The advantages of SNP markers include high abundance in the genome, biallelic nature, high reproducibility, low mutation rates and amenability to automation12. Some SNPs in the protein coding regions may directly affect the agronomic traits by changing the amino acid sequence of the protein, hence these are known as functional SNPs or perfect markers. Due to these reasons, SNPs are being increasingly used for genotyping in many crops including rice, wheat, barley, sorghum and soybean13,14,15,16.

Genotyping by Sequencing (GBS) allows bar-coding for large-scale multiplexing of samples to achieve high throughput SNP genotyping with lower cost per sample as compared to high-density SNP chip arrays13,17,18. The problem of poor uniformity in genome coverage in GBS analysis has been addressed by restriction site associated DNA (RAD) sequencing, which involves sequential steps of restriction enzyme digestion of genomic DNA for reduced representation, adapter ligation, PCR amplification and sequencing. The approach has been widely used for diversity assessment, trait mapping and genome wide association studies (GWAS) in several crops19,20. Further, QTLs for seed traits in chickpea have been mapped using GBS to generate a high-density linkage map of 3,363 SNP markers12.

Realizing the importance of drought responsive morpho-physiological traits, which allow chickpea plants to withstand drought stress, the present study was undertaken to map the QTLs for drought stress responsive traits viz. membrane stability index (MSI), relative water content (RWC), 100 seed weight (SW) and seed yield per plant (YLD) under drought and to identify candidate genes underlying these QTL regions.

Results

SNPs between the parents and annotation of genes

A total of 42 Gb data with 209 million reads were generated for the two parents and 186 RILs of the Pusa 362/SBD 377 mapping population. The sequence data has been deposited in the NCBI short reads archive (SRA) database (SRA, 2021) under accession numbers SRR13002229 and SRR13002230. The sequence reads were subjected to quality check (QC). Filtering criterion included removal of barcodes, ApeKI restriction enzyme overhangs, and base quality Phred score of ≥ 15 for at least 80% of the bases in a read. After QC, a total of 177 million high quality reads (85% of the original reads) were processed for further analysis. The average number of reads per genotype was 1,125,483 with a variation of 29.88% between individuals. After alignment to the reference genome a total of 35,502 raw SNPs were identified. Of these, 3237 were high quality SNPs identified after stringent filtering, and the sequence information of these 3237 SNPs is provided in Supplementary Table S1. The distribution of SNPs across eight chickpea chromosomes is shown in Table 1, Fig. 1a and Supplementary Table S2. The highest number of SNPs were located on chromosome 6 (636) and the lowest number on chromosome 8 (128). Base transitions were 1.5 times more frequent than transversions.

Table 1 Summary statistics of chickpea SNP linkage map constructed using GBS data from ‘Pusa 362/SBD 377’ RIL population.
Figure 1
figure 1

Distribution of SNPs between Pusa 362 and SBD 377 across the chickpea genome. (a) Circos diagram depicting chromosome wise distribution of SNPs in 10 kb window size. Track 1 (the outermost circle) represents the 8 chickpea chromosomes (CaLG1-8) in red colour. Tracks 2 , 3, 4, and 5 represent genes, exons, raw SNPs and filtered SNPs, respectively. (b) Percent distribution of SNPs in different genomic regions.

Genomic distribution of the identified SNPs in intergenic, exonic, intronic and genic UTR regions was determined using the chickpea genome annotation project database (Chickpea Genome Analysis Project, 2013). The highest proportion of the identified SNPs were located in the intergenic regions (48.52%), followed by introns (23.98%), exons (22.48%) and UTRs (5%) (Fig. 1b). There were 784 gene models identified which contained the exonic SNPs (Supplementary Table S3). The pentatricopeptide repeat-containing proteins were found to have the highest number of exonic SNPs followed by genes encoding serine/threonine-protein kinases, E3 ubiquitin protein ligases, ethylene-overproduction proteins and ABC transporter families. Out of the 784 genes, 462 showed a significant matches with entries in the nucleotide data base of NCBI. These genes were classified in to three major GO groups i.e., biological process, molecular function and cellular components (Fig. S1). The genes categorized under the “biological processes” are mostly associated with different metabolic processes. Most of the genes belonging to the “molecular function” group possess ‘catalytic activity’ followed by ‘binding activity’. To identify SNPs in the genes for transcription factors (TF), HMM (Hidden Markov Model) was applied using HMMBUILD (HMMER 3.0, 2015). The most represented families of TFs identified were bHLH (97 genes), MYB (75 genes), NAC (69 genes), ERF (54 genes), WRKY (51 genes), and bZIP (38 genes) (Fig. S2).

Trait segregation and location of QTLs for drought tolerance

A F11 RIL population developed by crossing chickpea varieties Pusa 362 (100 seed weight- 16.3 g, yield/plant- 125.67 g, RWC-70.99%, and MSI- 80.99%) and SBD 377 (100 seed weight- 32.39 g, yield/plant- 32 g, RWC- 40.88% and MSI- 53.88%), showed a considerable amount genetic variation for traits under drought stress. The RIL population showed segregation for all the four drought tolerance traits analyzed in this study. The 3 years descriptive statistics for these traits are summarized in Table 2. For all the traits analyzed, a normal frequency distribution curve was generated for three years (Fig. 2) and a relatively higher degree of genetic variation was observed. The CV of RWC was 9.4%, 9.21%, and 9.12% during 2014–15, 2015–16 and 2016–17 respectively. The CV for MSI was 27.99% (2014–15), 27.17% (2015–16), and 26.27% (2016–17). The CV of YLD was 46.57% (2014–15), 46.89% (2015–16), and 46.5% (2016–17). The CV for 100SW was 18.32% (2014–15), 17.45% (2015–16), and 17.1% (2016–17). The Pearson correlations showed the significant correlation among four traits under drought stress to establish relationship between traits under study (Fig. 3).

Table 2 Descriptive statistics for the traits analysed in the ‘Pusa 362/ SBD 377’ RIL population of chickpea.
Figure 2
figure 2

Histogram and phenotypic distribution of (a) Relative Water Content (RWC), (b) Membrane Stability Index (MSI), (c) Yield per plant (YLD), and (d) 100 Seed Weight (100SW) in the F11 population derived from the Pusa 362 X SBD377.

Figure 3
figure 3

Pearson’s correlation coefficient of relative water content (RWC), membrane stability index (MSI), yield per plant (YLD), and 100 seed weight (100SW) under drought stress. P1 = Year 2014–15, P2 = Year 2015–16, P3 = Year 2016–17.

Genome wide RAD SNPs were utilized to generate a high-density linkage map of the RIL population. A total 3,237 SNPs identified by stringent filtering of data for SNP quality and goodness of fit to the expected 1:1 segregation ratio for RILs were used to develop linkage maps of the eight chickpea chromosomes (Fig. 4, Supplementary Table S4). The linkage groups were numbered CaLG1-CaLG8 according to the earlier published maps12. The lengths of individual linkage groups ranged from 81.0 cm for LG8) to 178.1 cm for LG6 and the total map length of the eight linkage groups was 1069.1 cm. The number of SNPs mapped per linkage group varied from 128 (LG8) to 636 (LG6) followed closely by LG4 which had 633 mapped SNPs. The average marker interval for genome was 0.33 cm with LG4 exhibiting highest density linkage group (0.23 cm) and LG8 having the lowest marker density (0.63 cm). A summary of the marker information for the eight linkage groups of chickpea is presented in Table 1.

Figure 4
figure 4

Intra-specific linkage map of chickpea constructed using the RIL population derived from the parental lines Pusa 362 X SBD 377. The scale shown on the left is in cm. The eight chromosomes are shown as vertical bars and each horizontal line on the bar represents a SNP marker. Aggregation on horizontal lines reflects higher marker density on that chromosome.

We utilized this high-density genetic map constructed using GBS based SNPs to identify QTLs for the four traits, which were distributed on all the linkage groups (LGs) except LG3 and LG8 (Fig. 5, Table 3). The analysis using genome-wide composite interval mapping (GCIM) method identified 11 QTLs associated to RWC, MSI, YLD, and 100SW on five different Linkage Groups (LG 1, 4, 5, 6, and 7) based on year 2014–15 phenotypic results. However, in next years (2015–16), five QTLs have been identified to be linked to the traits RWC, MSI, and YLD on the three different Linkage Groups (LG 1, 4, 5) and 10 QTLs with the traits RWC, MSI, and YLD on five different Linkage Groups (LG 1, 2, 4, 5, 7) in the year 2016–17. All the 22 identified QTLs (LOD ≥ 2.5) represented four different traits located on different LGs of chickpea except LG 3, and LG 8.The QTLs for RWC (qRWC1.1 and qRWC5.1) identified on LG 1 and LG 5 were observed consistently for three years in the same physical position under drought stress (Table 3; Fig. 5). For qRWC1.1 the LOD values of 2.77 and phenotypic variance of 80.76 were obtained for the trait from the pooled data analysis. Similarly for qRWC5.1 the LOD value of 5.01 and phenotypic variance of 3.1 were observed from pooled data analysis (Table 3; Fig. 5).The additive effect 15.36 for qRWC1.1and 1.47 for qRWC5.1 was observed form pooled data (Table 3). The position of qRWC1.1 was at 69 cm with in the marker interval of S1_21926081 and S1_24070908 on LG 1 QTL qRWC1.1 and peak marker was identified for qRWC5.1 at 51.97 cm on LG5. The QTLs for MSI were detected on four different linkage groups namely LG2, LG4, LG4, LG5, and LG7. However the QTLs on LG4 (qMSI4.1) was found to be consistent across the year and at same physical position in pooled data analysis (Table 3; Fig. 5 ). The QTLs for MSI on LG2 (qMSI2.1) LG5 (qMSI5.1), and on LG7 (qMSI7.1 and qMSI7.2) were detected in experimental year 2015–16 and in pooled data analysis. The LOD values for these QTLs ranged from 5.15 (qMSI4.1) to 2.64 (qMSI7.2) in pooled data analysis. These QTLs explained phenotypic variance of 90.68% (qMSI7.1), 0.2% (qMSI7.2), and 9.12% (qMSI4.1), respectively in pooled data analysis. The position of qMSI2.1 was at 107 cm with single peak marker S2_35782996 on LG2, similarly for qMSI4.1 peak marker was identified at 41.42 cm position on LG4. Other QTLs, qMSI7.1 and qMSI7.2 were mapped at 35.61 cm and 62.8 cm position within marker intervals of S7_11812441-S7_11870532 and S7_20579963-S7_21595527 respectively on LG7 (Table 3, Fig. 5). In the case of q100SW, the two QTLs were mapped on two different linkage groups i.e. LGs 5, and 7 in a single experimental year 2014–15. The LOD values ranged from 3.08 (qSWD5.1) to 2.97 (qSWD7.1). The phenotypic variance explained (PVE) by these two QTLs ranged between 44.15% (qSWD5.1) and 49.58% (qSWD7.1) (Table 3, Fig. 5).

Figure 5
figure 5figure 5

Quantitative trait loci (QTLs) for relative water content (RWC), membrane stability index (MSI), yield per plant (YLD), and seed weight (100SW). The broken lines indicates the genome-wide significance LOD threshold, and the vertical red lines for main QTLs.

Table 3 Quantitative trait loci (QTLs) associated with the traits: membrane stability index (MSI), relative water content (RWC), seed weight under drought (SWD) and yield under drought (YLD) LG: Linkage Group. Positive additive effect indicates that the trait is contributed by drought tolerant parent Pusa 362, whereas negative additive value indicates contribution by sensitive parent SBD 377.

Three QTLs (qYLD1.1, qYLD5.2, and qYLD7.1) were identified on three different linkage groups including LGs 1, 5, and 7 for the trait yield per plant in pooled data analysis. The LOD values ranged from 6.38 (qYLD1.1) to 2.885 (qYLD7.1). The phenotypic variance explained by these three mapped QTLs ranged from 67.81% (qYLD1.1) to 6.29% (qYLD5.2). The pooled data analysis of this result showed that the alleles from parent Pusa 362 favored yield at all the loci except qYLD5.2 and qYLD7.1 (Table 3). The flanking regions of the identified SNPs were used to anchor the consistent QTLs to the chickpea physical map. The QTL interval for yield, qYLD1.1 on chromosome 1 spanning 1.63 Mb region were flanked by markers S1_36014093-S1_37647656at 110 cm position while QTL qRWC flanked by S1_21926081 and S1_24070908 spanning 2.14 Mb region. The QTLs qMSI7.1, qMSI7.2 and qYLD7.1 occupied a physical distance of around 0.06, 1.02, and 1.75 Mb respectively on the chromosome 7 (Fig. 5 Table 3).

Candidate genes for drought tolerance in the QTL intervals

The 9 consistent QTLs mapped in this study spanning total of 6.6 Mb region on five chickpea chromosomes were analysed for the identification of candidate genes for membrane stability complex (MSI), relative water content (RWC), and yield per plant (YLD). A total number of 369 genes were identified in the 6.6 Mb regions (Supplementary Table S5). In-silico expression profiling was done for all 369 genes using the available transcriptome data on root and shoot tissues from the chickpea varieties, Hashem (drought sensitive) and Bivanij (drought tolerant). Out of 369 genes, 326 genes were differentially expressed under drought conditions. Among these 369 genes, 37 genes were identified for qMSI, 80 genes for qRWC, and 252 genes for qYLD QTL (Fig. 6, Supplementary Table S5). These candidate genes included genes coding for proteins such as DEAD box ATP dependent RNA helicase, chaperonins, sugar transport proteins, E3 ubiquitin protein ligase, heat shock 70 kDa proteins, aquaporins, amino acid permeases, UDP glycosyl transferase, ABC transporter proteins, MADS-box protein, etc. Genes coding for transcription factors (TFs) like ethylene-responsive transcription factor, MYB, zinc finger protein, WRKY were also found to have drought specific expression. Gene ontology (GO) was done to identify characteristic biological attributes of in silico RNA-Seq. data. The GO studies revealed that cell redox homeostasis, DNA conformation change, protein modification process, regulation of transcription were enriched under drought stress conditions. The identification of these enriched terms holds tremendous potential in expanding our comprehension of the precise impact of differentially expressed genes within the context of drought stress, and thereby offering valuable insight for further investigation and understanding. Therefore, these genes are putative candidate genes in the mapped QTL regions, which require further validation by genetic fine mapping and transformation studies. To investigate the status of differentially expressed genes in different class of pathways, gene expression information was mapped to the KEGG pathways. The KEGG analysis was done to bring down the number of candidate genes. This reduced the candidate genes from 369 to 99(Supplementary Table S5). Pathways analysis and functional annotation showed that differentially expressed genes were clustered in several signaling pathways such as Arginin and proline metabolism, cellular senescence, biosynthesis of secondary metabolites, ubiquitin mediated proteolysis, pyruvate metabolism, MAPK signaling pathways, glutathione metabolism (Supplementary Table S5).

Figure 6
figure 6

(a) Differential expression of selected putative candidate genes in different chickpea root and shoot under control and drought stress conditions. (b). Heatmap depicting differentially expressed genes with haplotype identified.Color scale in block represents differences in gene expression; highest (red color) and lowest (green color) gene expression. Hashem root drought control-HRDC, Hashem shoot drought control-HSDC, Hashem root drought stress-HRDS, Hashem shoot drought stress-HSDS. Bivanij root drought control-BRDC, Bivanij shoot drought control-BSDC, Bivanij root drought stress -BRDS, Bivanij shoot drought stress-BSDS.

To validate the identity of the mapped QTLs as well as the candidate genes mined, we undertook the haplotype analysis. We first mined the number of genes harboring SNPs within the QTL intervals and identified 5 out of 9 consistent QTLs which harbored genic SNPs. Two QTLs with maximum number of SNPs, qRWC1.1 (10 SNPs) and qYLD7.1 (121 SNPs) were selected for haplotype analysis. In the qRWC1.1 region, six candidate genes were found of which only one gene, LOC101514724, encoding ATP-dependent DNA helicase Q-like 3, was associated with more than two SNPs. However, no significant difference was found in RWC between the two haplotypes of qRWC1.1 (64.7% vs 66.5%). Of the 74 candidate genes in qYLD7.1, four candidate genes, LOC101498779, LOC101501730, LOC101514620 and LOC101494835 harboured more than one SNP and hence were selected for haplotype analysis. Two haplotypes each, identified in LOC101494835 and LOC101498779, showed a significant difference in yield (65.31 g vs 89.51 g and 65.61 g vs 89.99 g grams). The haplotype GGTC/TACT were associated with loci S7_993031, S7_994976, S7_995674, and S7_995691 in LOC101498779, which encodes arogenate dehydratase/prephenate dehydratase 1 and KEGG pathways analysis showed that this gene came under metabolic pathways including biosynthesis of secondary metabolites and amino acids phenylalanine. Similarly, the haplotype CGAA/TTTG was associated with loci S7_2228317, S7_2228318, S7_2228372m, and S7_2230434 in LOC101494835 gene, whose orthologs in Glycine max encodes MHD domain-containing proteins that are associated with flavonoid biosynthesis. This gene was observed to be differentially expressed between the drought tolerant and drought sensitive varieties of chickpea (Fig. 6B) providing transcriptome level evidence of the significance of this gene under drought stress in chickpea.

Discussion

Efficiency of GBS based SNPs in QTLs mapping

Terminal drought that is, shortage of water during the reproductive phase is the most detrimental form of drought stress, leading to a huge yield losses in agricultural crops (45–69%)6. In order to devise breeding strategies for the development of drought tolerant chickpea varieties, there is an urgent need to identify novel QTL/genes for drought stress tolerance. A high-density, saturated genetic map can provide high-resolution molecular markers linked with the genomic regions controlling the complex drought tolerance traits for utilization in marker-assisted breeding. Deployment of molecular breeding for development of drought resilient superior chickpea cultivar, it is pre-requisite to unravel the genetic basis of drought tolerance and identification of tightly linked molecular marker. Most of the early molecular linkage maps of chickpea were constructed by using SSR markers with marker low-density, a problem further compounded by low level of inter-varietal polymorphism due to narrow genetic base of the crop21,22,23. The large QTL intervals mapped using SSR markers, may lead to introgression of undesirable linkage drag during MAS. SNP markers overcome this limitation due to their higher abundance in the genome. High-density chip arrays and GBS platforms allow simultaneous identification and genotyping of thousands of polymorphic SNPs for creating high-density linkage maps and QTL mapping. In chickpea, SNP markers have been used in recent past for mapping QTLs for drought tolerance, seed traits, Ascochyta blight, Fe and Zn content, pod borer resistance and yield traits12,17,24,25,26,27,28,29.

In the present study, GBS approach was used for high throughput SNP genotyping of a RIL population derived from cross between two desi chickpea varieties, Pusa 362 (tolerant) and SBD 377 (sensitive), with contrasting values for four drought tolerance parameters, namely MSI, RWC, SW and YLD under drought stress. The traits were selected keeping in view the impact of reproductive stage drought stress on drastic reduction of grain yield. MSI and RWC are physiological traits which measure the cellular response to drought stress and are considered as reliable parameters for quantifying the plant drought stress tolerance. SWD and YLD under drought stress are a direct measure of drought tolerance. Two parents (Pusa 362 and SBD 377) were selected due to their contrasting phenotypic traits associated with drought tolerance (Table 2). Pusa 362 exhibits superior drought tolerance traits when compared to SBD 377. This can be observed in its higher Relative Water Content (RWC) of 70.99% compared to 40.88% in SBD 377, suggesting that Pusa 362 retains more water under drought conditions. Moreover, Pusa 362 has a Membrane Stability Index (MSI) of 80.99% indicating greater cell membrane stability under drought stress than SBD 377, which has an MSI of 53.88%. Yield differences are also evident between the two parental lines. Pusa 362 has a yield per plant of 125.67 g, which is almost four times higher than that of SBD 377, which stands at 32 g. This demonstrates the superior productivity of Pusa 362 under the same conditions. Seed weight contrasts between the two varieties are evident as well, with Pusa 362 having a 100-seed weight of 16.3 g, while SBD 377 has almost double that at 32.39 g. These differences in drought tolerance and yield traits between the two parental lines led to the observed genetic variation in their F11 RIL population offspring. The contrasting traits of the parents serve as a foundation to better understand the phenotypic range and variation present in the RILs, providing valuable insights into the genetic determinants of drought tolerance in chickpea. While the QTLs controlling RWC and MSI have been mapped in several crops including rice, wheat and maize, and recently the QTLs for cell membrane stability traits for heat stress have reported in chickpea30. The F11 RIL population developed and used in this study exhibited large variation for these traits with normal frequency distribution indicating quantitative inheritance over three consecutive years. In present study, the presence of high coefficients of variation (CV) for YLD (> 46%) across the consecutive three years within the F11 RIL population have observed. By the F11 generation, generally most loci in recombinant inbred lines (RILs) should be homozygous, effectively fixing most genetic variability31. However, despite these expectations, we observed a notable high CV, suggesting the influence of various intertwined factors. Residual heterozygosity, although expected to be minimal, could still persist, leading to some ongoing genetic segregation and, consequently, variability. Epistatic interactions are also important contributors, given that yield is a multifaceted trait, and interactions between various QTLs can manifest in broader phenotypic variability than individual QTL effects alone32. Additionally, while controlled environments, such as those under rainout shelters, minimize variability, micro-environmental fluctuations like subtle changes in temperature or humidity can still impact plant yield33. It's imperative to also consider that even in such controlled settings, the genetic makeup of plants can interact in complex ways with these subtle environmental nuances, resulting in gene-by-environment (GxE) interactions and the observed variability34. This interplay of genetic and environmental factors in our F11 RIL population underscores the challenges of interpreting high CV in legume yield studies, even under ostensibly controlled conditions. Significant positive correlations of MSI, RWC and SWD with grain yield under drought stress, suggesting that these are genuine parameters for assessing drought tolerance. Positive correlation between 100 seed weight and grain yield under drought stress were observed and it has also been reported earlier in chickpea35. It indicates that yield related traits could be vital parameters and primary objective for developing drought tolerant crop plants and legumes including chickpeas. The traits targeted for improving drought tolerance must have strong correlation with grain yield under drought stress conditions36.

Elucidation and genotyping of genome wide SNPs and linkage map construction

A total of 35,502 SNPs were identified between Pusa 362 and SBD 377 by alignment of Illumina sequence reads to the chickpea reference genome15, but only 3237 (9%) of these were suitable for linkage mapping, after filtering for missing values and segregation distortion. This underlines the limitation of GBS in finding commonly genotyped SNPs across individuals due to random Poison distribution of the sequence reads in the genome of individual samples. This can be improved by increasing the fold coverage of sequence data which adds to the sequencing and informatics costs. Hence, recently there has been a shift towards use of high-density chip array-based SNP genotyping for QTL mapping in crop plants including chickpeas17,37,38. The highest number of SNPs were mapped to chromosome 6 (636), followed closely by chromosome 4 (633). These account for 19% of the SNP markers mapped. The mapping of higher number of SNPs on the 4th chromosome (CaLG4) has previously been reported by Jaganathan et al. 201524 and has been attributed to the possible presence of repeat rich regions on the chromosome. About 51% of the SNPs were mapped to the genic regions of the genome with 23% of intronic SNPs. 22% of the genic SNPs mapped to the exons in the genome. Among the genes which carried highest number of exonic SNPs were the genes coding for pentatricopeptide repeat-containing proteins. These proteins have been shown to be involved in regulating plant responses to drought, salinity and cold39. The presence of large number of exonic SNPs in these genes might be indicative of their role in drought stress tolerance in chickpea. The average marker density across the chromosomes was 0.33 cm, which is similar to chickpea SNP maps reported by Verma et al. 201512(3,228 SNPs spanning 1006.98 cm), Barmukh et al. 2020, 202128,29(3,873 SNP loci spanning 949.27 cm and 3,818 SNP loci spanning 1064.14 cm), denser than inter and intra-specific maps reported by Gaur et al. 201240 (1063 markers,1.7 cm), Hiremath et al. 201241 (1328 markers, 0.59 cm), Deokar et al. 201442 (1336 markers, 0.59 cm), but lower density than chip array-based linkage map reported by17 (13 679 SNPs spanning 1033.67 cm and 7769 SNPs spanning 1076.35 cm).

High-Resolution QTL mapping

Utilization of RIL population with high density genetic map and phenotyping under rainout shelter allowed to precise mapping of genomic region/s for drought stress in chickpea. A positive and significant correlation between the yield and yield related traits like seed weight could be an important parameter for developing drought resilient chickpea genotypes, which indicated that yield traits have considered to be the important objective for improving drought tolerance in agriculturally important crops including chickpea. We have identified 9 consistent QTLs for RWC (qRWC1.1, qRWC5.1), MSI (qMSI2.1, qMSI4.1, qMSI7.1, and qMSI7.2), and YLD (qYLD1.1, qYLD5.2, and qYLD7.1) which may be used for marker assisted breeding for drought tolerance in chickpea. In addition to mapping of QTLs for yield under drought stress environment, to the best of our knowledge, this is the first study that reports mapping of QTLs for physiological traits like MSI and RWC. Four QTLs for MSI, namely qMSI2.1, qMSI4.1, qMSI7.1, and qMSI7.2 with PVE range from 90.68% (qMSI7.1) to 0.2% (qMSI7.2) were found on linkage groups LG 2, LG 4 and LG 7, respectively, while the two consistent QTLs for RWC viz., qRWC1.1 (PVE- 80.76), and qRWC5.1 (PVE- 3.1), were identified on LG 1, and 5 respectively. Three major genomic regions spanning three robust QTLs (PVE: 67.81–6.29%) associated with yield per plant agronomic traits (YLD) were identified and mapped on three different Linkage Groups (LG 1, LG 5, and LG 7) of high density genetic map of chickpea. On the basis of physical position of SNPs marker covering the respective QTL intervals, the reliability of mapped QTLs for YLD were determined by comparing their underlying genomic region with earlier studies on genomic mapping10,25,43,44. In contrast to previous studies the QTLs identified in this study, especially the QTLs for yield were observed to overlap with previously mapped QTLs10,25. All these nine mapped QTLs reported by our study can be considered as robust QTLs showing phenotypic expression and have major effects except qRWC5.1 (PVE 3.1%), qMSI4.1 (PVE 9.12%), qMSI7.2 (PVE 0.2%), and qYLD5.2 (PVE 6.29%) on traits individually with PVE > 10% each. The consistent QTLs across different studies, can be of immense utility for pyramiding for reproductive stage drought tolerance in chickpea.

Candidate gene analysis associated with mapped QTLs

To identify the candidate genes in the QTL intervals, the genetic map developed in the present study was integrated with available genome and transcriptome resources of chickpea. The kabuli chickpea (CDF Frontier) genome15 was used for this analysis, because it is better assembled than the desi chickpea (ICC 4958) genome14 and therefore the possibility of locating the marker intervals on the chromosomes is higher. A total of 326 putative candidate genes showed differential gene expression. These genes encoded proteins associated with different metabolic processes and pathways such as biosynthetic pathways, signaling pathways, photosynthesis, etc. A large majority of these genes are involved in the biosynthesis of osmolytes like amino acids, polyamines and sugar alcohols. These genes have been shown to express under abiotic stress conditions as adaptive features to maintain plant water potential45,46. Similarly, the genes associated with photosynthetic, starch biosynthesis and UDP-glucose biosynthesis pathways, have been shown to be induced under drought stress in chickpea46. The synthesis of UDP-Glucose is a pre-requisite for imparting mechanical strength and cell-wall remodeling to protect plant from abiotic stress47. Genes coding for different TF families were identified and were differentially expressed under drought stress conditions. These are also involved in hormonal signaling, in response to hormones like abscisic acid, auxin, gibberellin and cytokinin, indicating a vital role of plant hormones and their cross-talk in drought stress48,49,50,51.Utilizing and predictive annotation approaches to identify candidate genes located at loci harboring SNPs has the potential to serve as a highly effective strategy in pinpointing causal genes52. To ascertain the veracity of the mapped QTLs and the potential candidate genes involved, haplotype analysis was performed, which revealed the presence of 5 QTLs among the 9 initially identified QTLs. From these, 2 QTLs, namely qRWC1.1 and qYLD7.1, were selected due to their high SNPs density within the QTL regions. While the candidate gene based haplotype analysis could not support the findings of qRWC1.1, two candidate genes in the qYLD7.1 displayed two distinct haplotypes that exhibited a substantial phenotypic differences. These two haplotypes could be associated with metabolic pathways for secondary metabolites biosynthesis including amino acid and flavonoid biosynthesis as found from the KEGG and GO analyses. Arogenate Dehydratase/Prephenate Dehydratase 1 (LOC101498779) gene encodes an enzyme involved in the terminal steps of phenylalanine biosynthesis. Phenylalanine is not just an essential amino acid; it is also a precursor to many secondary metabolites, including flavonoids, lignins, and alkaloids. Phenylalanine-derived compounds can play protective roles under stress. Lignins, for example, can contribute to cell wall rigidity and reduce water loss53. The exact function of MHD Domain-Containing Proteins (LOC101494835) in plant might be uncertain, the orthologs in Glycine max are associated with flavonoid biosynthesis. Flavonoids are versatile secondary metabolites playing roles in UV protection, signaling, and stress responses. Flavonoids antioxidant properties make them crucial in scavenging reactive oxygen species, which increase under stress conditions like drought. They can protect cellular structures from oxidative damage and maintain cell homeostasis54. Flavonoids have an antioxidant properties and have the ability to eliminate reactive oxygen species (ROS) in response to drought stress55. By impeding the metabolic functions of enzymes involved in ROS production pathways flavonoids induce activation of the antioxidant defense system. Given that phenylalanine is a precursor to flavonoids, these two genes might be part of a coordinated response to drought. Enhanced phenylalanine production could fuel flavonoid biosynthesis, boosting the plant's antioxidant capacity during drought56.These identified genes could impact other stress responses. For instance, flavonoids also play a role in UV protection, pathogen defense, and allelopathy. An upregulated flavonoid pathway could enhance multiple stress defenses simultaneously. Phenylalanine-derived compounds like lignins can also play roles in pathogen defense and structural integrity57.The two identified candidate genes play roles in interconnected pathways pivotal for plant stress responses. Beyond their direct metabolic functions, they contribute to a broader network of responses that can enhance plant resilience not just to drought but potentially other stresses as well. This holistic understanding underscores their significance and potential as targets for crop improvement. The QTL regions associated with these genes, whose role in drought stress has been well known, strongly support the hypothesis for association of the identified QTLs to drought stress tolerance.

Conclusions

The high-density linkage map developed in this study is likely to be of immense utility for high resolution mapping of QTLs for drought tolerance traits in chickpea. This enabled identification of closely linked SNP markers, which can be used for the introgression of superior alleles of the genes for drought tolerance traits through MAS for developing drought tolerant cultivars. In this study we identified novel QTLs for physiological traits, membrane stability index (MSI), relative water content (RWC), and yield/plant (YLD) under drought stress. The QTLs for membrane stability index (MSI) and relative water content (RWC), have been mapped for the first time in chickpea. To validate the accuracy of the mapped QTLs and determine the potential candidate genes implicated, a comprehensive haplotype analysis was conducted. The analysis unveiled the existence of five QTLs out of the initial nine QTLs that were identified. The identification of differentially expressed candidate genes in these QTL regions, paves way for better understanding of the biological pathways and molecular mechanisms involved in drought stress response with ultimate aim of introgressing these genomic regions, using MABC, for developing superior drought tolerant chickpea genotypes.

Materials and methods

Plant material and genomic DNA extraction

An F11 RIL population developed by crossing chickpea varieties Pusa 362 (drought tolerant, small seeded) and SBD 377 (drought sensitive, bold seeded) comprising of 186 individuals were used to identification of genomic regions and candidate genes for drought tolerance. Fresh young leaves were used for genomic DNA isolation using the CTAB method with minor modifications58. The isolated DNA was checked for quality by electrophoresis in 0.8% agarose gel and quantified using Nanodrop 8000 spectrophotometer equipment (Thermo-Scientific).

Phenotyping for drought tolerance parameters and statistical analysis

The set of 186 RILs along with the parents Pusa 362 and SBD 377 were grown for phenotyping in augmented block design for three consecutive years during the rabi season (2014–15, 2015–16 and 2016–17) under rainout shelter conditions at the ICAR-IIPR, Kanpur, Uttar Pradesh, India (26° 26′ 59.7228’’N, and 80° 19′ 54.7356’’E). Each experimental block consisted of a single 1 m plot with a spacing of 0.3 m between rows. In chickpea reproductive phase (particularly flowering and pod formation) is consider as a key step to evaluate drought traits. Exposure to water deficit at reproductive stage may cause yield loss through flower and pod abortion and is consider as a simple and effective screening techniques for terminal drought. Drought stress was imposed by withholding irrigation at the pod formation stage after 90 days after sowing. The parental lines and RILs were phenotyped for four drought tolerance parameters: (i) membrane stability index (MSI), (ii) relative water content (RWC), (ii) seed weight under drought (SWD) and yield /plant (YLD) under drought stress, over three consecutive years from 2014 to 2017. For determination of RWC, the third fully expanded leaf from top of the plants was collected before noon. The leaves were weighed to record the fresh weight (FW), then immersed in double distilled water in Petri plates for 4 h and weighed again to record turgid weight (TW). The samples were then dried at 65 °C for 48 h and weighed to record the dry weights (DW). The data was recorded in triplicates for each of the RILs and parental genotypes. The RWC (%) was calculated using the following formula:

$$ {\text{RWC }}\left( \% \right) \, = \, \left( {{\text{FW}} - {\text{ DW}}} \right) \, / \, \left( {{\text{TW}} - {\text{ DW}}} \right) \, \times { 1}00 $$

To measure the MSI, leaf tissue (0.5 g) was collected in glass vials containing 10 ml of double distilled water. These samples were incubated in a water bath at 40 °C for 30 min. After cooling the tubes to room temperature, electrical conductivity (C1) of the solution was recorded with a conductivity meter (Sanco, India). The tubes were then kept in a boiling water (100 °C) bath for 10 min and electrical conductivity measured (C2), after the tubes attained the room temperature (25 °C). The membrane stability index (MSI) using the following formula:

$$ {\text{MSI }}\left( \% \right) \, = \, \left[ {{1} - \, \left( {{\text{C1}}/{\text{C2}}} \right)} \right] \, \times { 1}00 $$

The seed and yield traits were recorded after threshing and cleaning of seeds. Average 100 seed weight under drought (SWD) was calculated by weighing 100 mature seeds from 10 representative plants of each RIL and the parental genotypes at a seed moisture content of 10%. Average seed yield per plant was recorded from the seed weights of five plants for each RIL and parental genotypes. Statistical analysis to compute mean, standard deviation (SD), coefficient of variation (CV), correlation coefficients, for the traits was performed by using XLSTAT/SPSS v17.0.

Illumina sequencing and SNP allele calling

To prepare libraries for GBS analysis, genomic DNA from the parental genotypes (Pusa 362 X SBD 377) and 186 RILs was subjected to restriction digestion with a type II restriction endonuclease ApeKI according to the protocol of Elshire et al.,13. All libraries were pooled and single end sequencing of the fragments was performed using Illumina TrueSeq V3.0 sequencing chemistry, with a read length of 100 bp. The raw sequence reads obtained in the FASTQ file format were processed by filtering through GBS analysis pipeline implemented in TASSEL v3.059. The reads were filtered for the criterion including the presence of perfectly matched barcodes with the four bps remnant sequences of the RE site and a minimum Phred score of 10 across the first 72 bases. The reads were sorted, de-multiplexed and further trimmed to 64 bases for analysis. The filtered reads were mapped to the reference chickpea genome15 using Burrows-Wheeler alignment software (BWA). The missing data were imputed using Beagle v3.3.260, while retaining the default parameters. The perfectly aligned sequences were used for SNP calling with a with a minor allele frequency (MAF) ≥ 0.2, minimum sequence read depth: 10X and SNP base quality ≥ 20. The distribution of the identified SNPs was visualized across the 8 linkage groups (LGs) using Circos v0.6161.

Structural and functional annotation of the SNPs

The distribution of SNPs in different genomic regions, namely intergenic, exonic, intronic and UTR, was determined to analyse their functional or structural significance. Gene ontology (GO) analysis was performed for the genes containing SNPs to classify them into three principal GO categories, using in-house custom perl scripts and SNP effect predictor (snpEff c3.1 h)62. The Benjamini and Hochberg false discovery rate (FDR) correction at a significance level of 5% was done using the BiINGO plugin of Cytoscape v2.663. To identify SNPs present in the transcription factors (TFs) family genes, the peptide sequences of different TF belonging to five leguminous crops (C. arietinum, Glycine max, Cajanus cajan, Lotus japonicas and Medicago truncatula) were retrieved from the Plant Transcription Factor database. These sequences were utilized to build HMM profiles for all TF families and categorize the genes with SNPs into different TF families.

Linkage map construction and QTL mapping

The genome wide SNPs derived from the GBS data on Pusa 362/ SBD 377 RILs were analyzed for segregation distortion against the expected 1:1 ratio using χ2-test (p < 0.05). Markers showing segregation distortion were excluded from the analysis. The linkage map was constructed using JOINMAP 4.1 program64. Ordering of the markers was done using regression mapping algorithm RECORD (REcombination Counting and ORDering)65. The Kosambi function was used to estimate the map distance between markers66 and MapChart67 was used for visualization the genetic map. The QTLs associated with the traits under consideration were mapped by integrating the genotypic and phenotypic data on the 186 RILs using QTL.gCIMapping68. The default settings were used with a walk speed of 1 cm, window size of 10 cm and a minimum LOD threshold score of 2.5 calculated at 10,000 permutations and significance level, (P ≤ 0.05) were used to identify significant QTLs for the traits. The phenotypic variance explained (PVE %) and the additive effect explained by the QTLs were also determined.

Identification of candidate genes in the QTL intervals

The genomic sequence between the SNP loci flanking the identified QTL intervals were retrieved by BLAST search against the reference chickpea genome15 sequence to identify the putative candidate genes controlling the traits The identified sequences were used for gene prediction using FGENESH69. In-silico expression analysis was performed for the identified candidate genes using available transcriptome data from the root and shoot tissues of chickpea genotypes ‘Bivanij’ (drought tolerant) and ‘Hashem’ (drought sensitive), exposed to drought stress (root-SRX3087500, SRX3087499, SRX3087504, SRX3087503; shoot-SRX3087498, SRX3087497,SRX3087502, SRX3087501)70. The transcriptome data was retrieved from NCBI and mapped to the putative candidate genes with the help of the 454 Roche gsMapper (Newblerv2.3.5). A heatmap was generated using the FPKM (Fragments per kilobase per million mapped reads) values to depict the differential gene expression by using the TIGR Multiple Experiment Viewer (MEV) software71,72. Biological significance of differentially expressed genes was explored by GO term enrichment analysis including biological processes, molecular functions and cellular components using Blast2GO 73. The KEGG enrichment was done using David v6.8. Haplotype analysis for selected genes were done using Haploview 74.

Ethical approval

The authors declare that the experimental research work involving the growth of plants in this study, was conducted in compliance with relevant institutional, and national guidelines and legislation.