Fusarium wilt (FW) is one of the most important biotic stresses causing yield losses in pigeonpea. Genetic improvement of pigeonpea through genomics-assisted breeding (GAB) is an economically feasible option for the development of high yielding FW resistant genotypes. In this context, two recombinant inbred lines (RILs) (ICPB 2049 × ICPL 99050 designated as PRIL_A and ICPL 20096 × ICPL 332 designated as PRIL_B) and one F2 (ICPL 85063 × ICPL 87119) populations were used for the development of high density genetic maps. Genotyping-by-sequencing (GBS) approach was used to identify and genotype SNPs in three mapping populations. As a result, three high density genetic maps with 964, 1101 and 557 SNPs with an average marker distance of 1.16, 0.84 and 2.60 cM were developed in PRIL_A, PRIL_B and F2, respectively. Based on the multi-location and multi-year phenotypic data of FW resistance a total of 14 quantitative trait loci (QTLs) including six major QTLs explaining >10% phenotypic variance explained (PVE) were identified. Comparative analysis across the populations has revealed three important QTLs (qFW11.1, qFW11.2 and qFW11.3) with upto 56.45% PVE for FW resistance. This is the first report of QTL mapping for FW resistance in pigeonpea and identified genomic region could be utilized in GAB.
Pigeonpea (Cajanus cajan (L.) Millsp.) is the sixth most important food legume crop in the world. Pigeonpea cultivars contain about 20 to 23% protein from the total composition of seeds which also possess carbohydrates (65 to 70%) and fiber. Pigeonpea “Dal” is a major source of dietary protein for more than a billion people in the developing world and a rich source of essential amino acids, like methionine, lysine and tryptophan1,2,3. Globally pigeonpea is cultivated on 6.23 million ha (m ha) with an annual production of 4.74 million tons (mt). India is the primary pigeonpea growing country, where more than 85% of the world’s pigeonpea production (3.29 mt, cultivated in 5.06 m ha area) and consumption takes place (http://faostat3.fao.org/home/, as of August 2016). Its productivity measured in experimental plots (~3700 kg/ha) could not be translated in farmers’ fields and average yield (<1000 kg/ha) remained stagnant during the last six decades3. This is mainly due to cultivation of pigeonpea with limited or no inputs in heavy diseases pressure by small-holder farmers. Fusarium wilt (FW) and sterility mosaic disease (SMD) are two epidemic diseases in pigeonpea that can cause complete yield losses in susceptible cultivars.
FW caused by Fusarium udum Butler has the global importance as it frequently occurs in both India and Africa, the major growing zones in the world. The annual losses due to FW have been reported to be US $ 71 million4 and 470, 000 t of grain in India and 30,000 t of grain in Africa5. As yield losses due to FW are significant, it is worthwhile to manage and further increase the disease resistance in pigeonpea cultivars. The most pragmatic, economical and environment-friendly approach for FW management is the utilization of genetic resistance6. FW resistance in pigeonpea has been found to be controlled by different gene actions while using different donor and suseptible parents. The phenotypic data of segregating mapping populations revealed single to multiple genes with complementary to duplicate gene actions6, 7.
Genomics-assisted breeding (GAB) provides strategies to combine indepth genome information with the trait phenotyping data for effective and efficient selection procedures8. For implementing GAB, it is essential to develop genetic maps and identify quantitative trait loci (QTLs) or genes associated with desired traits. Subsequently identified QTLs/genes could be deployed in breeding programs through marker-assisted selection (MAS) or some other molecular breeding approach9,10,11. In the case of pigeonpea, limited genetic diversity in cultivated gene pool coupled with inadequate genomic resources caused serious impediments for applying GAB12. However, in recent past, ample genomic and genetic resources such as segregating populations (~20 mapping populations, including recombinant inbred lines; introgression lines, multi-parent advanced generation intercross (MAGIC) population and nested association mapping (NAM) population), molecular markers (>3,000, simple sequence repeat (SSR) markers; 15,360 diversity arrays technology (DArT arrays; 10,000 single nucleotide polymorphism (SNP) markers; 1,616 Kompetitive Allele Specific PCR (KASPar) assays; 48 SNPs- based Golden-Gate VeraCode assays and 60K SNPs- based “Axiom®Cajanus SNP array”), genetic maps (8 maps with SSRs, DArTs, SNPs), transcriptome assemblies (CcTA v1 and CcTA v2), draft genome sequence etc. have been developed13, 14. Recently re-sequencing data through next generation sequencing (NGS) technologies have also been generated for identifying genome-wide variations in parental lines of mapping populations, segregating for economically important traits including FW resistance15. Additionally sequencing-based bulked segregant analysis (Seq-BSA) by using a few lines from the mapping population has been used to identify candidate SNPs for resistance to FW and SMD in pigeonpea16. In continuation to detect all possible minor and major affect loci contributing to FW resistance in pigeonpea genome, NGS based genetic mapping by employing the entire mapping population has been planned in the present study.
NGS is also providing opportunities to develop high-density genetic maps by deploying high throughput genotyping approaches such as genotyping-by-sequencing (GBS). In fact, GBS has been extensively used in a number of crop species for analysing genetic diversity17, developing dense genetic maps18, 19, refining the target genomic regions20, establishing marker-trait associations21 as well as deploying in genomic selection22, 23. Therefore, in the present study, GBS based genetic maps for three intra-specific populations have been developed. Populations were phenotyped for one to two years at two to three locations. Subsequently detailed analyses of genotyping data together with phenotyping data have provided consistent genomic regions associated with FW resistance in pigeonpea.
Mapping populations and phenotypic evaluation
Three mapping populations including two pigeonpea recombinant inbred lines (PRILs: PRIL_A (ICPB 2049 × ICPL 99050) and PRIL_B (ICPL 20096 × ICPL 332) and one F2:3 population (ICPL 85063 × ICPL 87119) along with their parental lines were phenotyped for FW resistance as per the standard procedures. PRIL_A and PRIL_B mapping populations were phenotyped for two years (2012–2013 and 2013–2014) in three replications. However, F2 individuals were phenotyped for only one year (2015–2016) at single location.
The phenotyping data of resistant parents (ICPL 20096, ICPB 2049 and ICPL 87119) and susceptible parents (ICPL 332, ICPL 99050 and ICPL 85063) of mapping populations for FW resistance were collected from the previous study16. The mean PDI score during 2012–2013 and 2013–2014 among PRIL_A mapping population ranged from 0 to 98.95 at Patancheru location and 5.73 to 92.75 at Gulbarga location (Supplementary Fig. S1(a,b)). The mean PDI score of (2012–2013 and 2013–2014) PRIL_B mapping population showed 0 to 95.6 at Patancheru location, 0 to 68.09 at Gulbarga location and 0 to 94.25 at Tandur location (Supplementary Fig. S1(c–e)). In the case of F2:3 population, the PDI score (2014–2015) ranged from 0 to 83.3, whereas the resistant (ICPL 87119) and susceptible parent (ICPL 85063) showed PDI score of 3.36 and 91.6 at Patancheru location, respectively (Supplementary Fig. S1(f)). The range of genetic variations observed in phenotyping data indicated several genes involved in FW resistance and the genetic material is sufficient for QTL mapping (Table 1).
High throughput sequencing and SNP discovery
A total of 46.90 Gb (464.40 million reads), 34.37 Gb (340.30 million reads) and 32.96 Gb (326.34 million reads) clean GBS reads were generated using HiSeq2500 platform for PRIL_A, PRIL_B (sequence data generated, SNPs identified in PRIL_B have been taken from the companion paper being published in Scientific Reports) and F2 mapping populations, respectively. The reads from individual progenies ranged from 0.80 to 8.79 million reads in PRIL_A (Supplementary Fig. S2), 0.80 to 5.10 million in PRIL_B, (Supplementary Fig. S3) and 0.80 to 5.46 millions reads for F2s (Supplementary Fig. S4). Also, a total of 3.17 (ICPB 2049) and 6.62 (ICPL 99050) million reads of PRIL_A parents, 1.21 (ICPL 20096) and 1.52 (ICPL 332) million reads of PRIL_B parents and 4.12 (ICPL 85063) and 3.61 (ICPL 87119) million reads of F2 parents were generated. To identify the genome-wide SNPs, TASSEL-GBS pipeline was used with sequence data for all parents along with respective mapping populations. As a result, a total of 0.10, 0.21 and 0.08 million SNPs were identified in PRIL_A, PRIL_B and F2s, respectively (Table 2). Detected SNPs were non-uniformly distributed across different pseudomolecules/CcLGs. For instance, 2,918 (CcLG05) to 21,711 (CcLG11) SNPs in PRIL_A, 5,682 (CcLG05) to 41,292 (CcLG11) SNPs in PRIL_B and 2,580 (CcLG05) to 18,522 (CcLG11) SNPs in F2s were distributed across the linkage groups. Further detected SNPs were filtered based on percent heterozygosity, minor allele frequency and missing data information. As a result, a total of 985, 1,789 and 4,209 SNPs were retained for further analysis in PRIL_A, PRIL_B and F2s respectively. The numbers of SNPs reduced from several millions to few hundred, due to stringent selection criteria used in the present analysis.
SNPs- based genetic maps
The identified genome-wide SNPs were evaluated against expected Mendelian segregation ratios through Chi-square analyses in PRIL_A, PRIL_B and F2 mapping populations, respectively. The markers with distorted segregation ratio were removed using specific criteria for PRIL_A, PRIL_B and F2 (see material and methods for details). As a result, 964, 1101 and 557 SNPs were used for the construction of genetic maps in PRIL_A, PRIL_B and F2, respectively. The genetic map derived from PRIL_A was comprised of 964 SNPs distributed on 11 linkage groups (Fig. 1 and Table 2). The PRIL_A genetic map encompassed 1120.56 cM, with linkage groups ranged from 57.83 cM (CcLG05) to 182.54 cM (CcLG02). The number of SNPs mapped to each linkage group varied from 20 SNPs in CcLG05 to 177 SNPs in CcLG11, with a mean of 88 SNPs per linkage group. The inter-marker distance in linkage group ranged from 0.65 (CcLG11) to 2.89 (CcLG05) with a mean inter-marker distance of 1.16. The genetic map for PRIL_B comprised of 1101 SNPs over 11 linkage groups with total map length of 921.20 cM (Fig. 2 and Table 2). The length of individual linkage group was ranged from 30.25 cM (CcLG05) to 118.39 cM (CcLG11). The number of SNPs mapped to each linkage group varied from 10 SNPs in CcLG05 to 382 SNPs in CcLG11, with a mean of 100 SNPs per linkage group. The inter-marker distance in linkage groups ranged from 0.31 (CcLG11) to 3.09 (CcLG01) with a mean inter-marker distance of 0.84. The genetic map in F2 population consisted of 557 SNPs with a total length of 1446.50 cM (Fig. 3 and Table 2). The number of SNPs in each linkage group ranged from 7 (CcLG03) to 314 (CcLG11) with the map length of 78.48 cM (CcLG03) to 228.20 cM (CcLG02). The average marker distance in linkage groups ranged from 0.68 cM (CcLG11) to 11.20 cM (CcLG03) with mean average inter-marker distance of 2.60 cM, across linkage groups. The developed linkage maps in all three mapping populations together with respective phenotyping data were used for QTL analysis.
QTLs for FW resistance
Phenotyping data together with SNP genotyping data were used for QTL analysis in PRIL_A, PRIL_B and F2 populations using composite interval mapping (CIM). Based on the phenotypic variance explained (PVE), identified QTLs were classified as major (≥10% PVE) and minor QTLs (<10% PVE). The identifed QTLs were also classified as stable (appeared in more than one location) and consistent QTLs (appeared in more than one year). For each population details on QTLs identified have been explained below.
QTLs in PRIL_A
A total of eight QTLs were identified in PRIL_A dispersed on six linkage groups (CcLG01, CcLG02, CcLG03, CcLG04, CcLG06, and CcLG11) with PVE ranged from 6.55 (qFW1.1) to 14.67% (qFW3.1) (Table 3). Four QTLs, namely qFW3.1 (14.67%), qFW6.1 (10.71%), qFW11.1 (12.11%) and qFW11.2 (10.04%) were identified as major effect QTLs and remaining four QTLs showed minor effects (qFW1.1, qFW2.1, qFW4.1 and qFW11.3) for FW resistance. Two QTLs were also identified as stable QTLs (appeared in more than one location), namely, qFW4.1 (Patancheru 2012–2013 and Gulbarga 2013–2014) and qFW11.1 (Gulbarga 2012–2013 and Patancheru 2013–2014). However, none of identified QTLs was found consistent QTL across the years and at both the locations.
QTLs in PRIL_B
QTL mapping analysis in PRIL_B identified a total of six QTLs on three linkage groups (CcLG03, CcLG07, and CcLG11) and PVE ranged from 7.92 (qFW3.2) to 15.26% (qFW7.1) (Table 4). Out of six, two QTLs, namely qFW7.1 (15.26%) and qFW11.4 (14.72%) were identified as major QTLs. Whereas, four QTLs (qFW3.2, qFW11.3, qFW11.2 and qFW11.1) with PVE ranged from 7.92 to 9.71% were defined as minor effect QTLs. A single consistent (appeared in more than one year/season) QTL (qFW11.4 appeared at Patancheru 2012–2013 and 2013–2014) and two stable QTLs namely qFW11.1 (Gulbarga 2012–2013 and Patancheru 2012–2013) and qFW11.2 (Gulbarga 2012–2013 and Patancheru 2012–2013) were detected in PRIL_B.
QTLs in F2 population
In the case of F2 population, a total of five QTLs, one on CcLG05 (qFW5.1), two on CcLG08 (qFW8.1 and qFW8.2) and two on CcLG11 (qFW11.1 and qFW11.3) were identified (Table 5). The PVE by the QTLs ranged from 2.75 (qFW8.1) to 56.45% (qFW11.3). Based on the PVE classification two major QTLs namely qFW5.1 (15.68%) and qFW11.3 (56.45%) were reported in the F2 population.
Candidate genomic regions for FW resistance
Composite Interval Mapping (CIM) analysis in the above mentioned populations have provided a total of 14 significant QTLs across all the linkage groups except CcLG09 and CcLG10 (Supplementary Table S1). Out of 14 QTLs, 7 QTLs showed major effects and the proportion of PVE by individual QTLs ranged from 10.04 (qFW11.4) to 56.45%. (qFW11.1). Comparative analysis of QTLs identified between PRIL_A and PRIL_B revealed three QTLs on CcLG11 (qFW11.1, qFW11.2 and qFW11.3) as common in both mapping populations based on the SNP positions in genome assembly. Across all the three populations i.e. PRIL_A, PRIL_B and F2, only one QTL, namely qFW11.1 was found common. To identify the common genomic regions on CcLG11 an iterative map was created using the genetic information of all three genetic maps (for PRIL_A, PRIL_B and F2 population). Based on the presence of common markers among three populations on CcLG11, an iterative linkage map of ~118 cM length has been created. As a result, three QTLs from PRIL_A, four QTLs from PRIL_B and two QTLs from F2 mapping population were visualized on CcLG11. Comparative analysis across all three mapping populations resulted in the identification of three important QTLs namely qFW11.1, qFW11.2 and qFW11.3 (Fig. 4). These genomic regions on CcLG11 can be considered as first choice for the breeder to introgress FW resistance in susceptible cultivars through GAB.
To understand the genetic nature of FW resistance in pigeonpea few studies have been conducted in past6, 7. Efforts in identifying trait associated markers through BSA or through marker association analysis have also been published. For instance, two random amplified polymorphic DNA (RAPD) markers24, four sequence characterized amplified region (SCAR) markers25 and six simple sequence repeat (SSR) markers26, 27 were reported for FW resistance. Recently, Seq-BSA together with non-synonymous SNP (nsSNPs) based approach was applied to identify four candidate SNPs in four genes associated with FW resistance16. However, no QTL mapping experiment has been conducted till date to find out genome-wide regions controlling FW resistance in pigeonpea. The identification and introgression of QTLs for FW resistance in popular but susceptible pigeonpea varieties/parental lines though GAB is a fast-track approach to the development of new breeding lines with enhanced resistance to FW. Therfore, this study aimed at identifying major and stable FW resistant QTLs from different donor background through GBS based genetic maps and multi-year and multi-location phenotyping.
Mapping populations in the present study were phenotyped in one to two crop seasons at one to three locations for FW resistance. Detailed analysis of phenotyping data showed that parents were true to the type in terms of disease reactions and huge variations among progenies were observed across locations and years. The results indicated that sufficient variatons were present to map the FW resistance. However, due to location × environment interations, huge variaion was observed in the average PDI score. The phenotypic variation ultimately, affects the identifcation of stable QTLs across the years.
Construction of dense genetic maps is useful in trait mapping studies, including fine mapping, positional cloning, construction of genome assembly and further improvement of genome sequences28, 29. The current availability of more than >3000 SSR markers; diversity arrays technology (DArT) array with 15,360 features; 1616 Kompetitive Allele Specific PCR (KASP) provides an opportunity to develop high-density genetic maps30. However, due to the low level of genetic polymorphism in Cajanus spp, it was difficult to develop dense genetic maps. In this direction, six SSR-markers based intra-specific genetic maps with 59 to 140 SSR loci were constructed, while the consensus map constructed using six component maps had only 339 SSR markers31. To overcome the problems mentioned above at some extent, first high density genetic map with 875 KASP and 35 SSR markers was developed for an inter-specific population with an average inter marker distance of 1.11 cM28. Therefore GBS seems to be a promising approach for the construction of dense intra-specific genetic maps in pigeonpea. Construction of GBS based high density genetic maps is the common approach in the majority of the crop plants30, 32, 33. In the present study GBS approach was used for the construction of three genetic maps for intra-specific populations.
In the present study, we have used GBS approach to discover and genotype SNPs in three mapping populations, namely PRIL_A, PRIL_B and F2s. Sequencing of mapping populations resulted in the identification of 0.10, 0.21 and 0.08 million SNPs in PRIL_A, PRIL_B and F2:3, respectively. Detected SNPs in the present analysis were comparatively higher to any of the previously reported genetic mapping studies in pigeonpea28. Further, a stringent selection criterion including missing percentage, minor allele frequency and percent heterozygosity was adopted to filter out the SNPs to select the panel of robust SNPs for constructing high-density genetic maps. The adopted stringent criteria reduce the number of SNPs from millions to few hundred, however, this is very common in GBS data, and similar results have been reported in many other GBS based genetic mapping studies19, 20. In summary, in the present study, a total of 985 (0.96%), 1789 (0.84%) and 4209 SNPs (4.89%) were finally obtained in PRIL_A, PRIL_B and F2 respectively. The number of mapped markers was low in comparison to the identified SNPs between parental lines may be due to limited sequencing depth used for GBS. The numbers of SNPs in F2 were found higher in comparison to PRILs and this may be attributed to different parameters used to select SNPs in RIL and F2.
As compared to earlier genetic maps, dense genetic maps were constructed using 964 (97.86% SNPs mapped), 1101 (61.54% SNPs mapped) and 557 (13.23% SNPs mapped) SNP loci for PRIL_A, PRIL_B and F2 populations, respectively. The number of SNPs mapped in F2 population was quite low as compared to RILs and this may be due to segregation distortion of SNPs in F2s. Similar observations were also reported in maize while constructing GBS based genetic map for F2 population34. The current dense genetic maps (964 SNPs in PRIL_A), 1101 SNPs in PRIL_B and 557 SNPs in F2) will be useful for the development of high-density consensus map and fine mapping of QTLs28, 31, 35. However, in the present study we could not develop a consensus map due to lack of common markers across the mapping populations. This may be because of inherent nature of GBS, which takes random sites in the genome for detecting polymorphism.
The developed genetic maps have shown big gaps (higher marker intervals) on few linkage groups (CcLG05 and CcLG08 in PRIL_A; CcLG01 and CcLG05 in PRIL_B). In the case of F2 mapping population, two linkage groups showed >10 cM of inter marker distance (CcLG03 and CcLG05), which was quite high for QTL mapping experiments. In this context, currently developed 60K “Axiom®Cajanus SNP array” (unpublished) will be useful in enriching these genetic maps in future for trait discovery programs in pigeonpea.
QTL mapping for FW resistance have revealed 14 significant QTLs with six of these QTLs showed major effects with PVE of more than 10%. The identified QTLs in the present study are novel as this is the first report of mapping QTLs for FW resistance in pigeonpea. As expected, only few QTLs were stable and consistent and this might be due to the presence of environment interactions/pathogenic variability across the locations. It is interesting to note that we found three QTLs on CcLG11 (qFW11.1, qFW11.2, and qFW11.3) with significant effects while using three different mapping populations. Comparative analysis of the present study with the Seq-BSA based FW resistance gene mapping16 revealed two earlier identified SNPs at CcLG11 (32606065 bp and 35228097 bp) in the close vicinity or in the identified QTL region of qFW11.1 (37133265–43777543 bp in PRIL_A; 23244327–45340470 bp in PRIL_B and 33516474–41835381 bp in F2). However, other SNPs present on the CcLG02 (27426866 bp and 27861114 bp) identified earlier using seq-BSA approach were comparatively far from the QTL qFW2.1 (15580586–16115010 bp) identified in this study. These comparative analyses across two different studies revealed the importance of the genomic region present at CcLG11 in controlling FW resistance. It was surprising to note that despite using GBS based genotyping approach and multi-location and multi-year phenotypic data sets we had detected QTLs with upto 15.26 PVE in RILs, which was quite low and thus, indicating the complex nature of FW resistance. In the case of the F2 population, a QTLs namely, qFW11.3 with 56.45 PVE was detected. However, the same QTLs was detected as minor QTLs in both of the RIL populations. The major reason for such differences in PVE showed in RILs and F2 is possibly due to the difference in resolution of the genetic maps, as high density mapping can reduce the false positives in QTL detection36. Altogether, these results suggested that the dense genetic map provides accurate detection of QTLs for FW resistance in pigeonpea.
In conclusion, three intra-specific dense genetic maps were constructed using GBS approach. The marker density in the maps was ranged from one in 0.84 to 2.60 cM. Based on these genetic maps and phenotypic data, 14 significant QTLs were identified with PVE ranged from 2.65% to 56.45%. Six of these QTLs showed major effects with PVE of more than 10% each. Comparative analysis revealed three important QTLs namely qFW11.1, qFW11.2 and qFW11.3 detected across the populations. All the QTLs identified in the present study for FW resistance were novel. Some of these QTLs can be deployed in GAB for development of FW resistant pigeonpea genotypes and for molecular dissection of FW resistance.
Two recombinant inbred line (RIL) populations generated by crossing ICPB 2049 (FW susceptible) × ICPL 99050 (FW resistant) (designated as PRIL_A), ICPL 20096 (FW resistant) × ICPL 332 (FW susceptible) (designated as PRIL_B) comprising of 188 individuals using single seed descent method12. Together with RILs, one early generation (F2) mapping population consisting of 168 lines by crossing ICPL 85063 (FW susceptible) × ICPL 87119 (FW resistant) was also utilized in the present study.
Phenotyping for FW resistance
PRIL_A were phenotyped in sick plot nursery at two locations namely Patancheru (Telangana State, India) and Gulbarga (Karnataka, India) for FW resistance during crop season 2012–2013 and 2013–2014. PRIL_B was phenotyped at three different locations, namely Patancheru (Telangana State, India), Gulbarga (Karnataka, India) and Tandur (Telangana State, India) during crop season 2012–2013 and 2013–2014. Field phenotyping of mapping populations was done in sick plot nurseries with standard procedures37. The phenotyping of RILs was conducted in three replications using randomized complete block design (RCBD). The disease score of highly susceptible local checks at regular intervals was recorded. The phenotyping data of parental lines of RIL mapping populations for FW resistance were collected from the previous study16. The phenotyping of F2 population was completed in F2:3 plants to avoid the loss of the plants and to get replicated data during the year 2015–2016 at Patancheru (Telangana State, India). BLUPs were estimated from multi-location data and the arcsine transformed values used for QTL analysis. The experimental plots were four meters long with a row to row spacing of 75 cm and 20 cm spacing between plants. Individual plants in each population were evaluated based on the percent disease incidence (PDI) score at 90 days after sowing (DAS).
DNA isolation and sequencing of GBS libraries
Two to three young leaves from individual plants in PRIL_A (188 individual), and F2s (168 individuals) and parental lines were used to isolate genomic DNA using NucleoSpin Plant II kit (Macherey-Nagel, Düren, Germany). The GBS data of PRIL_B have been taken from the companion paper being published in Scientific Reports). The quality and quantity of DNA was checked on 0.8% agarose gel and then using Qubit 2.0 fluorometer (Thermo Fisher Scientific Inc., USA).
GBS approach was used for simultaneous SNP discovery and genotyping of mapping populations38. 10 ng genomic DNA from each sample was restriction digested using ApeKI (recognition site: G/CWCG) endonuclease. The digested product was ligated with uniquely barcoded adaptors using T4 DNA ligase enzyme and was further incubated at 22 °C for 1 h and heated at 65 °C for 30 min to inactivate the T4 ligase. Such digested ligated products from each sample were mixed in equal proportion to construct the GBS libraries, which were then amplified, purified to remove excess adapters. The DNA libraries were sequenced on HiSeq 2500 platform (Illumina Inc, San Diego, CA, USA) to generate genome-wide sequence reads.
Sequence reads from raw FASTQ files were used for SNP identification and genotyping using reference based GBS analysis pipeline implemented in TASSEL v4.039. The draft genome sequence information of pigeonpea variety ‘Asha’ was used as reference assembly40. Briefly, the sequencing reads were searched for perfectly matched barcode with the expected four base remnant of the enzyme cut site using the in-house script. The barcode containing reads were sorted, de-multiplexed according to barcode sequence and trimmed to first 64 bases starting from enzyme cut site. Further, those reads containing ‘N’ within first 64 bases were rejected. The remaining good quality reads (called as tags) were aligned against the draft genome sequence of pigeonpea using Burrows-Wheeler Alignment tool (BWA)41. The alignment file was then processed through GBS analysis pipeline for SNP calling and genotyping.
The individuals with less than 80 Mb data were not selected for further analysis to avoid false positive detection. Due to the presence of the different levels of heterozygosity in PRIL_A and F2, different criteria were used to filter SNPs identified in these populations. In the case of PRIL_A, lines with more than 50% missing data and minor allele frequency (MAF) of ≤0.3 were filtered out. In F2, SNPs with contrasting alleles in parental genotypes and having <30% missing data were retained for further study. Further, imputation of missing data was carried out using FSFHap algorithm implemented in TASSEL v4.039 in both the mapping populations. The imputed SNPs were further filtered with minor allele frequency (MAF) cut off of 0.2 to remove missing data, and such filtered SNPs were used for genetic mapping and QTL studies.
Construction of genetic maps
For the linkage analysis, identified SNPs were first tested against the expected segregation ratios. The homozygous SNPs with respect to the two parents were expected to segregate in a 1:2:1 ratio in F2 population, whereas a pair of homozygous SNP alleles, in PRIL_A was expected to segregate in a 1:1 ratio. Markers showing significant segregation distortion (P < 0.01, χ2 test) were removed from PRIL_A.The genetic map information of PRIL_B have been taken from the companion paper being published in Scientific Reports). As segregation distortion of SNPs was the major issue in F2s, therefore, SNPs showing expected segregation at a P-value of <10−9 were retained and used for the construction of genetic map34.
To construct the genetic maps, Joinmap V4.0 was used. The grouping and ordering of markers were carried out using regression mapping algorithm with a maximum recombination frequency of 0.4 at minimum logarithm of odds (LOD) value of 3 using the command “LOD groupings” and “create groups for mapping” into respective linkage groups (LG). The Kosambi mapping function was used to convert recombination fraction into map units. After developing the framework genetic maps with the marker orders, the unmapped markers were integrated into different linkage groups at recombination frequency up to 50% using ripple command. The visualization of genetic maps was done using the software MapChart 2.3042. Iterative map projection of specific linkage group was created using BioMercator V4.243.
QTL analysis and visualization
QTL analysis was conducted with WinQTLCart2.5 software program (http://statgen.ncsu.edu/qtlcart/WinQTLCart) using composite interval mapping (CIM). The CIM analysis was run using 1.0 cM as scanning interval between markers and tentative QTL with a window size of 10.0, model 6, 1,000 permutations at a whole genome-wide significance level of P < 0.05. The location of each QTL was determined according to its LOD peak location and surrounding region. The LOD score values (2.5) were used to determine the significance of QTL. A QTL is named as qFWY.a with ‘FW’ being the trait abbreviation fusarium wilt, ‘Y’ the number of the linkage group, ‘a’ the letter to specify different QTLs for the same trait in one linkage group based on the physical positions of left and right QTL linked markers the nomenclature of QTL was designated as novel or not. MapChart 2.3042 was used to project QTLs on the linkage groups.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We thank Abdul Gafoor and Meriga Sudhakar for excellent technical assistance. Authors are thankful to the United States Agency for International Development (USAID), for providing funding support. This work has been undertaken as part of the CGIAR Research Program on Grain Legumes. ICRISAT is a member of CGIAR Consortium.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.