Bambara groundnut is an emerging legume grown in Africa and Asia. It is commonly known as a "poor man's crop" or "women's crop"1 and has more recently been described as a "crop for the new millennium"2. The present binomial name, Vigna subterranea (L.) Verdc., was proposed by Verdcourt in 1980 (ref. 3), and its chromosome number is 2n = 2x = 22 (ref. 4). On the African continent the crop ranks third after Arachis hypogaea and Vigna unguiculata5, although it is grown as a food supplement rather than as a profitable crop because of its low status6. The word "Bambara" comes from a place name near Timbuktu in central Mali, West Africa, and "groundnut" refers to the fact that its pods set below the soil surface; together these give the common name "Bambara groundnut"7. Because it is rich in carbohydrate (63–65%), protein (18–20%) and oil (17–18%), Bambara groundnut has been described as a complete, well-balanced food for human nutrition8. According to ref. 9, it contains 32.72% essential amino acids and 66.10% non-essential amino acids. The seed is regarded as a balanced food because, compared with other food legumes, it is rich in iron and its protein contains high levels of lysine and methionine10. Lysine is the major essential amino acid and represents 10.3% of the total essential amino acids10. Bambara groundnut can meet the regular protein needs of marginal consumers for whom animal protein is hard to obtain because of its high cost11. The seeds contain considerable amounts of minerals (Ca: 260 mg; K: 1723.25 mg; Fe: 3.6 mg; Na: 75.25 mg per 100 g dry seed weight); moreover, because it is rich in potassium (K), it is also reported to help alleviate diabetes by stimulating the hormone insulin12. The immature fresh seeds can be boiled to make pudding (Moi-Moi or Okpa)13, the crop is used as fodder to feed animals14, and leaf extract has medicinal value as an anti-vomiting agent15.
In many developing countries where the cultivation of other major crops is difficult, Bambara groundnut can be grown readily because it tolerates drought and suffers little disease and insect infestation16; as a legume, it can also fix atmospheric nitrogen through nodulation17. The crop can give a high yield with low inputs and is mostly grown by women in sole culture without modern techniques18; growers sell 10–40% of their total yield in the market and consume the rest themselves19. According to FAOSTAT20, the annual production is about 42,023 kg ha−1, of which Africa produces half, with Burkina Faso being the major producing country. Bambara groundnut can adapt well to tropical areas such as Malaysia, where the cultivation of major crops (rice, wheat, maize, etc.) is increasingly challenging because of drought and unpredictable rainfall patterns21. The average yield of Bambara groundnut is low: 0.650–0.850 t ha−1 (ref. 22), 28.96 g plant−1 (ref. 12) and 0.38–1.6 t ha−1 (ref. 23). Production is limited mainly by the lack of improved cultural techniques24, and improvement of the crop was neglected by researchers for many years because of a lack of funding and the low priority given to it25. Numerous attempts at varietal improvement through hybridization have been unsuccessful because of the cleistogamous and autogamous nature of the flower and the knowledge gap on the reproductive biology of this underdeveloped crop26. In Malaysia, the breeding approach for Bambara groundnut is undetermined and no commercial high-yielding variety is available, so there is a need to identify commercially high-yielding cultivars for specific growing areas27. Germplasm screening based on agronomic variables is the initial step in identifying the target characters of interest28.
The present research examines the genetic divergence of fifteen Bambara groundnut accessions in order to reveal the existing variation and to select material for developing high-yielding pure lines for crop improvement. Accordingly, the study provides evidence on the diversity among Bambara groundnut landraces introduced from Africa (Nigeria) to Asia (Malaysia). All applicable modern techniques may be used to improve this crop, but a dual approach that links conventional breeding with molecular breeding is more successful than the use of either approach alone. Nevertheless, trait improvement is possible through direct selection supported by the evaluation of different genetic parameters. The effectiveness of selection depends strongly on heritability, and estimates of heritability and genetic gain are powerful tools for the improvement of particular traits29. Hence, the core aim of this study was to determine the inherent variation of Bambara groundnut landraces using both qualitative and quantitative traits, through the evaluation of character associations, variance components and different genetic parameters, leading to the identification of high-yielding candidates from which pure lines will be developed for commercial cultivation.

Materials and methods

Experimental site

The experiment was conducted at the Institute of Tropical Agriculture and Food Security (ITAFoS), University Agricultural Park, Universiti Putra Malaysia (UPM), Malaysia. Based on the Global Positioning System (GPS), the research location was at 2°58′54.0″N latitude and 101°42′53.8″E longitude. The seeds of the accessions were sown under open-field conditions during the 2018–2019 cropping season. The soil pH is 6.6 to 7.5, with a sandy loam to clay loam texture (Dept. of Land Management, UPM). Fifteen accessions of Bambara groundnut were selected for this research, all of them African accessions collected from local markets in Nigeria. The landraces used in this research are listed in Table 1. Five randomly chosen plants were used to evaluate genetic variability based on agronomic traits30.

Table 1 The line-up of Bambara groundnut accessions with source of collection.

Experimental design

The experiment was conducted in a randomized complete block design (RCBD) with three replications. Each experimental plot comprised two rows and measured 1.6 m × 0.80 m. The spacing was 30 cm between plants, 50 cm between rows, 1.5 m between plots and 2.0 m between replications, following Unigwe et al.30. During the growing season, the recommended cultural practices, such as land clearing, land preparation, weeding, irrigation and fertilizer application, were carried out. The recommended fertilizer rates were used (100% N = 45 kg N/ha, 100% P = 54 kg P2O5/ha, 100% K = 45 kg K2O/ha): all of the phosphorus and potassium was applied during field preparation, whereas 70% of the N was applied 5 weeks after sowing31.

Parameters recorded for data analysis

Twenty-seven quantitative and 14 qualitative characters (Table 2) were considered during the morphological characterization. For ease of description, the quantitative traits were grouped as (1) phenological traits, (2) growth and vegetative traits and (3) yield traits. Following the Bambara groundnut descriptors and descriptor states of IPGRI, IITA, BAMNET32, data were recorded from 5 randomly selected plants in each plot at several growth stages in the field, and post-harvest data were recorded in the physiology laboratory.

Table 2 Twenty-seven quantitative and 14 qualitative traits measured according to IPGRI, IITA, BAMNET32.

Statistical analysis

SAS (Statistical Analysis Software) version 9.3 was used to test for significant differences with the analysis of variance (ANOVA) procedure and to compare the means of significant traits using the least significant difference (LSD) at P ≤ 0.05. Correlations between the quantitative variables were determined using Pearson's33 correlation coefficient formula. Genotypic and phenotypic variances were calculated following the formulae given by Singh and Choudhary34. The phenotypic (PCV) and genotypic (GCV) coefficients of variation were estimated with the formulae given by Khan et al.23, and the relative difference (RD) between PCV and GCV was also estimated. Following Robinson et al.35 and Khan et al.23, PCV and GCV values were categorized as low (0–10%), intermediate (10–20%) and high (≥ 20%). Broad-sense heritability (\({h}_{b}^{2}\)) was estimated using the formula given by Falconer36 and Khan et al.23. In accordance with Johnson et al.37 and Khan et al.23, heritability was categorized as low (0–30%), intermediate (30–60%) and high (> 60%). Genetic advance (GA), as a percentage of the mean, was calculated at 5% selection intensity (K) following the method of ref. 37, and was categorized as low (0–10%), intermediate (10–20%) and high (> 20%), following Khan et al.23. K is a constant that represents the selection intensity; according to Adewale et al.38, its value is 2.06 at 5% selection intensity. Genetic gain (%) was estimated as genetic advance (GA) × 100; it is also categorized37,39 as low (0–10%), intermediate (10–20%) and high (≥ 20%). Genetic diversity was investigated using the Euclidean distance method as well as Dice's and Jaccard's similarity coefficients.
In addition, the genetic inter-relationships among the Bambara groundnut accessions were estimated, and displayed as a dendrogram, using the Unweighted Pair Group Method with Arithmetic Average (UPGMA) and the sequential, agglomerative, hierarchic and non-overlapping (SAHN) clustering algorithm. For this analysis, NTSYS version 2.1 (Numerical Taxonomy Multivariate Analysis System; Exeter Software, Setauket, NY, USA)40 was used. The same software was used for principal component analysis (PCA) to produce two-dimensional (2D) plots, whereas the Multivariate Statistical Package (MVSP) was used for the PCA biplot loadings. The Shannon diversity index and the Shannon equitability index (a synonym for evenness) were calculated using the formulae given by Shannon41 and Hennink and Zeven42. For germplasm selection, the rank summation index (RSI) was estimated using the formula reported by Onwubiko et al.43 and Mulumba and Mock44. MDP tools were used for the correlation network and pattern search plot, and the ClustVis bio tool was used for the heatmap analysis.
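The genetic parameters described above can be illustrated with a short Python sketch. It is not the authors' SAS code. It assumes the textbook forms GCV = √σ²g / mean × 100, PCV = √σ²p / mean × 100, h²b = σ²g / σ²p, GA = K · σp · h²b and, as a further assumption, RD = (PCV − GCV) / PCV × 100; the function and variable names are hypothetical. The inputs are the yield variances and the grand mean yield (1180 kg ha−1) quoted in the Results, for which the sketch gives a heritability of about 98.10%, an RD of about 0.95% and a genetic advance of roughly 114% of the mean.

```python
import math


def classify(value, low_cut, high_cut):
    """Label a percentage as low, intermediate or high using two cut-offs."""
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "intermediate"
    return "high"


def genetic_parameters(var_g, var_p, mean, k=2.06):
    """Return GCV, PCV, RD, heritability and genetic advance.

    var_g, var_p: genotypic and phenotypic variances.
    mean: grand mean of the trait.
    k: selection differential (2.06 at 5% selection intensity).
    The RD definition used here is an assumption, not taken from the paper.
    """
    sd_p = math.sqrt(var_p)
    h2 = var_g / var_p                      # broad-sense heritability (fraction)
    gcv = math.sqrt(var_g) / mean * 100     # genotypic coefficient of variation
    pcv = sd_p / mean * 100                 # phenotypic coefficient of variation
    rd = (pcv - gcv) / pcv * 100            # relative difference (assumed form)
    ga = k * sd_p * h2                      # genetic advance in trait units
    return {
        "GCV": gcv,
        "PCV": pcv,
        "RD": rd,
        "h2": h2 * 100,
        "GA": ga,
        "GAM": ga / mean * 100,             # genetic advance as % of mean
    }


# Yield variances and grand mean as quoted in the Results section.
yield_params = genetic_parameters(434458.5, 442869.6, 1180.0)
for name, value in yield_params.items():
    print(f"{name}: {value:.2f}")
print("heritability class:", classify(yield_params["h2"], 30, 60))
print("GCV class:", classify(yield_params["GCV"], 10, 20))
```

The category boundaries (10% and 20% for GCV, PCV and GA; 30% and 60% for heritability) follow the thresholds stated in the text; how values exactly on a boundary are labelled is a choice made here for illustration.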


Assessment of qualitative variation

The frequency distributions of some qualitative variables are summarized in Fig. 1 and Fig. 2. After two weeks, 46.67% of the accessions had greenish stems, 20% had striped stems and 33.33% had reddish stems.

Figure 1

Some qualitative features of Bambara groundnut seeds: large and irregular eye pattern [two thick lines joined together forming an almost triangular shape; a (top left), a (bottom left) & d]; thin and sharp circle around the eye and very light stripe with creamy seed coat [b (top right)]; no eye pattern [a (bottom right), a (top right), a (centre), b (top left, bottom left, bottom right, centre) & c]; black stripe on one side of the hilum [a (top left)]; marbled with reddish seed coat [b (centre)]; few rhomboid spots on one side of the hilum (c); many rhomboid spots almost covering the entire hilum on both sides [a (bottom right), b (top left)]; brownish seed coat [a (centre)]; cream seed coat [b (bottom right)]; dark reddish seed coat [b (bottom left)]; black seed coat [a (top right)]; cream with blackish seed coat (d).

The terminal leaflets had two different colours: 73.33% of the accessions had greenish leaflets and 26.67% had purple leaflets. The terminal leaflets were lanceolate in 53.33% of the accessions, oval in 26.67% and elliptic in 20%. Among the 15 characterized accessions, three growth habits were found: bunch type (13.33%), semi-bunch type (53.33%) and spreading type (33.33%). Among the landraces, 33.33% had sparse hair on their stems, 20% had dense hair and 46.67% had none. Most of the landraces had reddish-brown (46.67%) or brown (33.33%) pods; some had yellowish-brown (13.33%) or purple (6.67%) pods. Most accessions had oval seeds (73.33%) and a few had round seeds (26.67%). Seed colour was cream or red (26.67% each), black-cream or cream-purple (20.00% each), and black in only 6.67% of the accessions. Black eye colour was found in 13.33% of the landraces, whereas 86.67% had no eye colour.

Figure 2

Graphical display of the frequency distribution of qualitative traits of Bambara groundnut accessions. GH growth habit, SH stem hairiness, FSC color of first stem, TLS shape of terminal leaflet, TLC terminal leaf color, PP petiole pigmentation, PS pod shape, PC pod color, PT pod texture, SS seed shape, SC seed color, EC eye color, TP testa pattern, TCEP testa color with eye pattern round hilum.

Estimation of quantitative parameters

Morphological diversity

For crop improvement, plant breeders treat yield and its related traits as the key parameters. In the present research, 27 quantitative traits of 15 Bambara groundnut landraces were analysed in order to select the best-performing accessions for the next breeding programme. The significance of variation revealed by the analysis of variance, together with the mean, standard error of the mean (SEm), standard deviation (SD) and coefficient of variation (CV%), is displayed in Table 3. Except for days to fifty percent flowering, seed length and seed width, all other quantitative traits showed highly significant (P ≤ 0.01) differences. No significant difference was found for plant height (24.36 cm ± 0.39). No significant variation was observed within replications. The highest and lowest values of each trait across all plants are presented in Table 3. Sixteen of the 27 quantitative traits had a coefficient of variation (CV) ≥ 20%; across all traits, CV ranged from 5.96% (shelling %) to 58.55% (dry seed weight per plant, g). The average time to maturity was 131.56 ± 1.18 days, and the differences were statistically significant (P ≤ 0.01). The highest standard deviation (SD) was found for yield in kg/ha (650.98; SEm ± 97.04), whereas the lowest was for internode length (SD = 0.49; SEm ± 0.07) (Table 3). The standard error (SE) indicates the reliability of the mean: a lower SE indicates that the sample mean is a more precise reflection of the true population mean.
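The link between the standard deviation and the standard error quoted above can be checked with a small Python sketch. It assumes the usual relation SEm = SD / √n and, as a further assumption not stated in the text, n = 45 observations per trait (15 accessions × 3 replications). Under that assumption the reported SD values reproduce the reported SEm values for yield and internode length.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean from a standard deviation and sample size."""
    return sd / math.sqrt(n)


# Assumption: one value per accession per replication (15 x 3 = 45).
n_obs = 15 * 3

se_yield = standard_error(650.98, n_obs)      # SD of yield (kg/ha) from Table 3
se_internode = standard_error(0.49, n_obs)    # SD of internode length from Table 3

print(f"SEm yield: {se_yield:.2f}")
print(f"SEm internode length: {se_internode:.2f}")
```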

Table 3 Summary of the significant variation revealed by analysis of variance (ANOVA).

The maximum and minimum values over all accessions and the mean comparisons based on the least significant difference (LSD, P ≤ 0.05) are shown in Tables 4 and 5. Fig. 3 shows the graphical relationship of DSW (g) and HSW (g) with yield (kg ha−1). Days to 50% flowering varied from 31 to 49 days after sowing (DAS), and 66.67% of the accessions reached 50% flowering before 40 DAS. Most of the landraces (80%) took more than 120 days to mature, ranging from 122 to 141 DAS. Genotype G1 was identified as a short-duration line, maturing at 119 DAS (Table 4). Genotype G2 recorded the highest values for BFW (614.67 g) and BDW (361.57 g) (Table 4) and for TNP (93.67), NMP (74.33), FPW (568.56 g), DPW (359.01 g), NSP (92.17), DSW (274.96 g), SW (10.74 mm), HSW (360.15 g), Shel% (76.68%) and yield (2991.77 kg ha−1) (Table 5), whereas accession G13 had the lowest values. The next highest yield was recorded for genotype G9 (2226.30 kg ha−1), followed by G3 (1557.61 kg ha−1), G6 (1414.67 kg ha−1) and G10 (1250.73 kg ha−1) (Tables 4 and 5).

Table 4 The mean and mean comparison of 3 phenological and 9 vegetative traits of 15 Bambara groundnut accessions.
Table 5 The mean and mean comparison of 15 yield related traits of 15 Bambara groundnut accessions.
Figure 3

Graphical relationship of dry seed weight (DSW) and hundred seed weight (HSW) with yield (kg ha−1) for Bambara groundnut landraces.

Analysis of correlation matrix

The phenotypic correlations among the 27 quantitative traits of the fifteen Bambara groundnut accessions are given in Table 6. Days to 50% flowering showed a negative, intermediate (0.25 ≤ |r| < 0.75) and significant association with NB (r = −0.29; P = 0.04), NNS (r = 0.33; P = 0.02), TNP (r = −0.38; P = 0.00), NMP (r = −0.35; P = 0.01) and NSP (r = −0.38; P = 0.00). A positive and very strong (0.75 ≤ r < 1) significant association with the total number of pods was found for NMP (r = 0.93; P ≤ 0.00), DPW (r = 0.76; P ≤ 0.00), NSP (r = 0.99; P ≤ 0.00), DSW (r = 0.77; P ≤ 0.00) and yield (r = 0.76; P ≤ 0.00). Yield (kg ha−1) showed a positive, perfect (r = 1.00) and highly significant association with DPW (r = 1.00; P ≤ 0.00), and a positive, very strong (0.75 ≤ r < 1) significant association with DSW (r = 0.99; P ≤ 0.00), NSP (r = 0.76; P ≤ 0.00) and HSW (r = 0.76; P ≤ 0.00) per plant (Table 6). The correlation network and the pattern search plot give a more visual illustration of the correlation matrix. In the correlation network (Fig. 4a), each node represents a variable; its colour is based on the defined geographical population group (Gombe, 7 accessions; Kwami, 2 accessions; Akko, 2 accessions; Alkaleri, 2 accessions; Sokoto, 2 accessions) and its size on the number of correlations involving that variable. The traits yield, DPW, FPW, HSW, NP and NL showed larger node sizes than the other traits. Two variables are connected by an edge if the correlation between them meets the P-value (0.05) and correlation thresholds. The edge size reflects the magnitude of the correlation. This helps in identifying biologically meaningful relationships or associations between groups or features. The pattern search plot (Fig. 4b), on the other hand, shows the top 25 features (at most) correlated with the sample of interest. The features are ranked by their correlation; blue bars represent negative correlations and red bars positive correlations. The deeper the colour (darker blue or red), the stronger the correlation. The mini heatmap on the right shows whether the abundance of each feature is higher (red) or lower (blue) in each population group.
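As an illustration of how the correlation strengths quoted above can be derived, the following Python sketch computes Pearson's r for two traits and labels it using the bands given in the text (intermediate for 0.25 ≤ |r| < 0.75, strong for 0.75 ≤ |r| < 1, perfect for |r| = 1). The label "weak" for |r| < 0.25, the function names and the trait values are assumptions made for the example; they are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def strength(r):
    """Label the magnitude of r; the sign is reported separately."""
    a = abs(r)
    if math.isclose(a, 1.0):
        return "perfect"
    if a >= 0.75:
        return "strong"
    if a >= 0.25:
        return "intermediate"
    return "weak"  # label assumed for illustration


# Hypothetical per-plant values for two traits (not data from this study).
pods_per_plant = [40, 55, 62, 71, 93]
dry_seed_weight = [80, 120, 150, 160, 270]

r = pearson_r(pods_per_plant, dry_seed_weight)
direction = "positive" if r > 0 else "negative"
print(f"r = {r:.2f} ({direction}, {strength(r)})")
```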

Table 6 Pearson’s Correlation matrix (r) for 27 quantitative traits of Bambara groundnut accessions.
Figure 4

(a) Correlation network illustrating the relationships of 27 quantitative descriptors with five geographic populations. Each node is coloured according to one of five defined geographic populations of BG, and each edge represents the correlation between two traits. (b) Pattern search plot giving insight into the correlation and abundance of the top 25 traits for the five Bambara groundnut populations (1 = Gombe; 2 = Kwami; 3 = Akko; 4 = Sokoto; 5 = Alkaleri). Days to emergence = DTE (day), days to 50% flowering = D50%F (day), days to maturity = DTM (day), plant height = PH (cm), number of branches per plant = NB, number of stems per plant = NS, number of petioles per plant = NP, number of leaves per plant = NL, number of nodes per stem = NNS, internode length = IL (cm), biomass fresh weight per plant = BFW (g), biomass dry weight per plant = BDW (g), total number of pods per plant = TNP, number of mature pods per plant = NMP, number of immature pods per plant = NIP, fresh pod weight = FPW (g), dry pod weight = DPW (g), pod length = PL (mm), pod width = PW (mm), number of seeds per plant = NSP, dry seed weight per plant = DSW (g), seed length = SL (mm), seed width = SW (mm), hundred seed weight = HSW (g), shelling percent = Shel%, harvest index = HI (%) and yield = Yld (kg ha−1).

Abundance (richness) analysis for traits

Abundance sketching (stacked bar plot) provides an overall as well as a comparative abundance profile across landraces at different variable levels (Fig. 5a). The variation present in the data at multiple variable levels can be visualized for each individual genotype. It also summarizes and compares the abundance of the different variables. The pie chart helps in visualizing the degree of diversity, or the abundance composition of variables, across the BG samples, and gives the composition of each group through direct quantitative comparison of abundances as percentages (Fig. 5b). In the stacked bar (area) plot, each bar represents an individual genotype under its respective location (shown at the top of the area plot). Among the traits, yield, NL and BFW occupied the largest areas, with different colours shown for each genotype; according to the pie chart, these traits accounted for 28%, 21% and 7%, respectively.

Figure 5

Relative abundance sketching: (a) stacked bar/area plot showing the richness of traits in each genotype and (b) pie chart showing the richness percentages with a unique color code for each trait.

Analysis of genetic components

Variance and covariance, heritability in broad sense, relative differences, and genetic advances

The output of the genetic component analysis is compiled in Table 7. The phenotypic variance (\({\sigma }_{p}^{2}\)) was higher than the genotypic variance (\({\sigma }_{g}^{2}\)) for all the traits evaluated. Grain yield (kg ha−1) had the highest genotypic (434,458.5) and phenotypic (442,869.6) variances, whereas internode length had the lowest genotypic (0.11) and phenotypic (0.24) variances. The traits TNP (PCV 19.48% and GCV 18.53%), PW (PCV 14.55% and GCV 10.95%), NSP (PCV 19.92% and GCV 18.95%), SL (PCV 16.37% and GCV 9.30%), SW (PCV 13.61% and GCV 7.29%) and Shel% (PCV 6.05% and GCV 4.50%) showed phenotypic (PCV) and genotypic (GCV) coefficients of variation below 20%. For the improvement of this crop, further selection could be based on the traits with GCV ≥ 20% (BFW, BDW, FPW, DPW, DSW and grain yield), which indicates a high degree of variability in these traits, with the variation attributable to the effect of additive genes. Because of their low GCV values (≤ 10%), the vegetative traits (PH, D50%F and DTM) and yield traits (SL, SW and Shel%) offer limited scope for selection, owing to the effect of the environment on their phenotypic expression.

Table 7 Variance components, relative difference, heritability and genetic advance of 27 quantitative traits.

The relative difference (RD) expresses the gap between GCV and the corresponding PCV; the estimated RD values varied from 0.22% (fresh pod weight) to 57.46% (plant height) (Table 7). A relatively low difference between GCV and PCV was recorded for DTM (9.74%), NB (2.59%), BFW (1.17%), BDW (5.18%), TNP (4.91%), FPW (0.22%), DPW (0.95%), PL (3.15%), DSW (0.57%), HSW (4.71%) and yield in kg ha−1 (0.95%), indicating that the variation in these traits is due to genetic effects and that they should respond well to direct selection. In contrast, traits with a larger difference between their PCV and GCV values show variability that is largely due to environmental effects and would respond poorly to direct selection.

Broad-sense heritability was high for most of the traits evaluated (Table 7) and ranged from 18.09% (PH) to 99.55% (FPW). High (≥ 60%) heritability was measured for DTM (81.47%), NB (94.89%), BFW (97.68%), BDW (89.91%), TNP (90.42%), PL (93.79%), DPW (98.10%), DSW (98.86%), HSW (90.80%), HI (83.93%) and yield (98.10%), which indicates limited environmental influence. Heritability of 30–60% was recorded for D50%F (30.32%), PW (56.64%), Shel% (55.22%) and SL (32.25%), indicating that these traits are moderately heritable, whereas PH (18.09%) and SW (28.71%) showed low heritability (below 30%). Dry seed weight had the highest genetic advance as a percentage of the mean, or genetic gain (122.01%), whereas plant height (PH) had the lowest (3.97%) (Table 7). High genetic gain was also recorded for grain yield (113.94%), DPW (113.94%), BDW (106.66%), FPW (93.54%) and BFW (88.82%), together with high heritability. In our study, most of the yield-contributing traits showed moderate to high heritability with genetic advance ≥ 20%, except SW (GA 8.05%) and Shel% (GA 6.88%), for which GA was ≤ 10%. Considering a high genotypic coefficient of variation together with high heritability and genetic advance is a more powerful selection tool for crop improvement than considering any single genetic parameter. These traits appear to be governed by additive genes with a limited response to the environment and are therefore noteworthy for the selection procedure.

Clustering patterns

In this study, homogenized data were used to calculate the Euclidean distances among the 15 Bambara groundnut accessions, and a UPGMA dendrogram was constructed (Fig. 6). To distinguish the relationships within the population, the 15 accessions were clustered into five major groups based on their twenty-seven quantitative traits, with the dendrogram cut at a dissimilarity coefficient of 1.16 for ease of interpretation. Table 8 shows the mean performance of each group. Group I, represented by G1, G6 and G12, was characterized by early emergence (6 days) but a longer time to flowering (close to 44 days), together with medium time to maturity and medium plant height (24.45 cm). Its internode length was the greatest, whereas its hundred seed weight and yield (kg/ha) were intermediate among the five groups. Group II contained the largest number of accessions (G4, G5, G7, G11, G14 and G15) and was characterized by the longest time to maturity (137 days) and the shortest pod length (3.37 mm) compared with the other groups. Group III consisted of a single accession (G13), which was distinguished by minimum values for plant height, branch number, stem number, leaf number, node number, internode length, biomass fresh and dry weight, total pod number, number of mature pods, dry and fresh pod weight, pod width, seed number, dry seed weight, seed length and width, hundred seed weight, shelling percent and yield (kg/ha). Group IV comprised four accessions (G2, G3, G9 and G10) with maximum values for plant height (cm), total pod number, number of immature pods, dry and fresh pod weight, pod length and width, seed number, dry seed weight, seed length and width, hundred seed weight, harvest index and yield (kg/ha) as their most distinctive characters. The accessions in this group had the highest mean yield, about 2 t/ha, owing to the maximum values of the yield-contributing traits. The last group, Group V, contained only one accession (G8), which took a long time to emerge (9 days) but flowered early, with maximum values for branch number, stem number, node number, biomass fresh and dry weight, mature pod number, pod length, hundred seed weight and shelling percent as its most distinctive characters. Group IV gave the highest yield (36.48%), whereas Group III gave the lowest (10.71%). Groups I (19.44%) and V (18.67%) were close to each other in terms of yield traits (Table 8). Moreover, the mean yield of cluster IV was 70.07% higher (+) than the grand mean yield (1180 kg ha−1), whereas clusters I (9.34%), II (31.44%), III (50.08%) and V (12.95%) yielded less (−) than the grand mean.
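The comparison of a cluster's mean yield with the grand mean can be reproduced with a few lines of Python. The sketch below uses the yields of the four Group IV accessions (G2, G3, G9 and G10) and the grand mean of 1180 kg ha−1 quoted in the text; the formula (cluster mean − grand mean) / grand mean × 100 is assumed to be the one intended. It gives a deviation of about +70%, close to the reported 70.07%; the small difference presumably reflects rounding of the grand mean.

```python
# Yields (kg/ha) of the Group IV accessions as reported in the text.
group_iv_yields = {"G2": 2991.77, "G9": 2226.30, "G3": 1557.61, "G10": 1250.73}
grand_mean = 1180.0  # grand mean yield (kg/ha) as quoted in the text

cluster_mean = sum(group_iv_yields.values()) / len(group_iv_yields)

# Percentage deviation of the cluster mean from the grand mean
# (positive = higher than the grand mean). Formula assumed for illustration.
deviation_pct = (cluster_mean - grand_mean) / grand_mean * 100

print(f"cluster mean: {cluster_mean:.2f} kg/ha")
print(f"deviation from grand mean: {deviation_pct:+.2f}%")
```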

Figure 6

Dendrogram showing the relationships among the Bambara groundnut landraces revealed by the non-overlapping (SAHN) UPGMA method.

Table 8 Mean values of quantitative traits for 5 groups revealed by cluster analysis.

Heatmap analysis for genotypes and agro-morphological traits

A heatmap is a data visualization technique that displays the magnitude of a phenomenon as colour in two dimensions. The variation in colour may be by hue or intensity, giving the reader clear visual cues about how the phenomenon is clustered or varies over space. It visualizes the relative patterns of high-abundance features against a background of features that are mostly low-abundance or absent. Heatmap analysis of the agro-morphological descriptors was carried out to give a chromatic evaluation of the Bambara groundnut genotypes. The analysis constructed two dendrograms: the first, in the vertical direction, represents the Bambara groundnut accessions, and the second, in the horizontal direction, represents the traits that influenced this distribution. Dendrogram 1 showed two major groups: group (a) comprised two genotypes, Maikai (G2) and Giiwa (G9), and group (b) comprised the remaining 13 genotypes (Fig. 7). Dendrogram 2 also displayed two major groups: group (a) comprised the traits DTE, DTM, D50%F and NP, whereas the other 23 traits belonged to group (b). The genotypes Maikai and Giiwa, on the left side of dendrogram 1, appear in the same cluster, with higher values of yield, FPW, DPW, HSW, TNP, NMP, SL, SW, PL, PW, BFW and BDW, which separate these two genotypes from the others. On the right side of dendrogram 1, two sub-groups were identified: the first, on the left, included three genotypes (Karu, Roko and Dai) with lower values for all traits evaluated, and similar values were recorded for Hawayenzekai and Duna in the right-hand sub-group of cluster (b). Higher values were recorded for the genotypes Katawa (NB, BDW, BFW), Jatau (IL and D50%F), Bidiyashi (D50%F), Exsokoto (NNS and DTE), Bidillali (IL) and Maibergo (NS, Shell% and DTE); the lower values shown by all other traits are responsible for the formation of different sub-groups among the genotypes. Interestingly, the groups and sub-groups in dendrogram 2 sharply highlighted the differing effects of the diverse Bambara groundnut landraces.

Figure 7

Heatmap and hierarchical clustering (double dendrogram) of morphological descriptors of Bambara groundnut landraces, constructed using ClustVis tools. The heatmap plot describes the relative abundance of each Bambara groundnut genotype (columns) within each feature (rows). The color code (blue to dark red) displays the row z-score: red indicates high abundance and blue low abundance. The dendrogram shows the hierarchical clustering of Bambara groundnut genotypes based on Euclidean distance and Ward's cluster agglomeration method.

Valuation of principal component analysis

Based on the results in Table 9, the first principal component (PC1) accounted for 45.88% of the total variation, and the characters responsible for separating genotypes along this axis were TNP (highest, 0.271), NMP, FPW, DPW, NSP, DSW, SL, HSW and YLD (kg ha−1), all with high positive coefficients. The second principal component (PC2), associated with the traits D50%F, PH, NP, NL, FPW, DPW, DSW, HI (maximum, 0.447) and YLD (kg ha−1), accounted for 10.64% of the total variation. Principal component 3 (PC3) explained 8.78% of the total variation and reflected differences in PH, NB, NS, NP, NL, NNS, IL (largest, 0.472), BFW and BDW. Principal component 4 (PC4) accounted for 8.19% of the total variation and consisted mostly of the traits D50%F, NS, NP, NL, HSW and Shell% (maximum, 0.444). Principal component 5 (PC5) explained 6.75% of the variation and comprised D50%F (maximum, 0.479), DTM, DSW and YLD (kg ha−1). The last component, principal component 6 (PC6), accounted for 5.69% of the total variation and consisted of the traits DTE, DTM, NP, NL, BDW, NIP, PW and SL. Together, the first six principal components covered close to 86% of the cumulative variation. The two-dimensional (Fig. 8a) and three-dimensional (Fig. 8b) plots showed that most of the accessions were dispersed at short distances from one another, whereas a few were dispersed at large distances, as reflected by the eigenvectors (Table 9). The accessions farthest from the centroid were G2, G3, G8, G9, G12 and G13, whereas the other accessions lay close to the centroid. PC1 and PC2 explained 45.88% and 10.64% of the variation, respectively, so PC1 captured the largest share of the total variation (Table 9).
The PCA biplot displays the variables and the cases (accessions) simultaneously; it shows how strongly each trait influences a principal component, how the traits are correlated with one another, and how far the genotypes lie from each other. A small angle between two vectors (Fig. 9) indicates a high positive correlation (e.g. DSW and yield), an angle of 90° indicates no correlation, and an angle greater than 90° and approaching 180° indicates a negative correlation between the traits (e.g. HSW and NL).
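The angle rule for reading a biplot can be illustrated with a minimal Python sketch. The two-dimensional loading vectors below are hypothetical values chosen only for illustration and are not taken from Table 9; the 10° tolerance around 90° is likewise an arbitrary assumption. The angle approximates the correlation only to the extent that the two plotted components capture most of the variation.

```python
import math

def angle_between(u, v):
    """Angle in degrees between two loading vectors on the biplot plane."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))

def interpret_angle(theta, tolerance=10.0):
    """Translate an angle into the qualitative reading used for biplots."""
    if abs(theta - 90.0) <= tolerance:
        return "approximately uncorrelated"
    if theta < 90.0:
        return "positively correlated"
    return "negatively correlated"

# Hypothetical (PC1, PC2) loadings, for illustration only.
loadings = {
    "DSW": (0.30, 0.10),
    "YLD": (0.31, 0.12),
    "HSW": (0.25, -0.20),
    "NL": (-0.20, 0.30),
}

for a, b in [("DSW", "YLD"), ("HSW", "NL")]:
    theta = angle_between(loadings[a], loadings[b])
    print(f"{a} vs {b}: {theta:.1f} degrees, {interpret_angle(theta)}")
```

With these made-up loadings the first pair forms a small angle and the second pair an obtuse one, mirroring the qualitative reading of the DSW and yield and the HSW and NL examples in the text.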

Table 9 Eigenvectors and eigenvalues for the first six principal component axes for 27 agronomic traits of Bambara groundnut accessions.
Figure 8

PCA-2D (a) and PCA-3D (b) graphical relationship among the Bambara groundnut accessions based on Euclidean distance.

Figure 9

PCA biplot showing the loadings of variables (27) and cases (15) for the Bambara groundnut accessions, produced using MVSP.

Evaluation of Shannon–Weaver diversity (H’ Index)

The Shannon–Weaver diversity index was used to assess the phenotypic diversity for each trait. The estimates of the Shannon–Weaver diversity index (H) and evenness (EH) for the twenty-seven traits are shown in Table 9. The index ranged from 2.57 for dry seed weight per plant to 2.71 for plant height, maturity date and shelling percent, as well as for days to fifty percent flowering, number of nodes per stem, internode length, pod width, seed length and seed width, indicating that maximum diversity (H = 2.71) was present among these traits. Evenness (equitability) varied from 0.95 to 1.00. The maximum evenness (EH = 1.00) was recorded for days to fifty percent flowering, maturity date, plant height, number of nodes per stem, internode length, pod width, seed width and shelling percent, whereas the minimum (EH = 0.95) was recorded for biomass dry weight, dry pod weight, dry seed weight and yield (kg ha−1).
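For readers who want to reproduce such indices, the following Python sketch computes H and evenness for one trait. It assumes, as one common convention, that the proportion p_i is each accession's share of the trait total and that evenness is H divided by ln(n); the text does not state which definition was used, so this is an illustration rather than the authors' procedure. Under this assumption the upper bound for 15 accessions is ln(15) ≈ 2.71, which coincides with the maximum reported above.

```python
import math

def shannon_index(values):
    """Shannon-Weaver index H = -sum(p_i * ln p_i) over positive entries.

    Assumption: p_i is the share of accession i in the trait total.
    """
    total = sum(values)
    proportions = [v / total for v in values if v > 0]
    return -sum(p * math.log(p) for p in proportions)

def evenness(values):
    """Evenness E_H = H / ln(n), with n the number of positive entries."""
    n = sum(1 for v in values if v > 0)
    if n < 2:
        return 0.0  # evenness is undefined for fewer than two classes
    return shannon_index(values) / math.log(n)

# Hypothetical trait means for 15 accessions (illustrative only).
uniform_trait = [10.0] * 15
skewed_trait = [30.0, 22.0, 15.0] + [5.0] * 12

for name, trait in [("uniform", uniform_trait), ("skewed", skewed_trait)]:
    print(f"{name}: H = {shannon_index(trait):.2f}, EH = {evenness(trait):.2f}")
```

A trait whose values are nearly equal across accessions gives H close to ln(15) and evenness close to 1, whereas a trait dominated by a few accessions gives lower values of both.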

Selection of Bambara groundnut accessions based on Rank summation index (RSI)

Pearson's correlation coefficients were estimated for all traits to identify those positively or negatively associated with yield. The traits strongly and positively associated with yield were used to compute the rank summation index (RSI) to guide the selection of high yielding Bambara groundnut accessions from the population. Correlations between pairs of traits were considered likely to be significant when the value of the coefficient was higher than 0.20 (ref. 43). Following the RSI method of Mulumba and Mock44, the accessions were first ranked on the mean value of each trait positively correlated with yield (1 = topmost and 15 = lowermost); the ranks were then summed across traits to estimate the overall performance of each accession. The accession with the lowest RSI score therefore has the highest yield potential. The RSI ranking of the Bambara groundnut accessions, based on the traits strongly, positively and significantly correlated with yield (fresh pod weight, dry pod weight, number of seeds per plant and dry seed weight per plant), is presented in Table 10. Based on yield, the genotypes G2 (2991 kg ha−1), G9 (2226 kg ha−1), G3 (1557 kg ha−1), G6 (1414 kg ha−1), G10 (1250 kg ha−1) and G1 (1101 kg ha−1) were identified as the high yielding accessions in the population evaluated, with the lowest RSI values of 4, 8, 16, 21, 23 and 28, respectively, whereas G13 had the lowest yield (558 kg ha−1) and the highest RSI value of 59.
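The ranking and summation steps can be sketched in a few lines of Python. The accession names and trait means below are hypothetical and serve only to show the mechanics; the tie rule (tied values share the smallest rank) is an assumption, since the text does not say how ties were handled.

```python
def rank_descending(values):
    """Rank values so the largest gets rank 1; ties share the smallest rank."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def rank_summation_index(trait_table):
    """Sum the per-trait ranks of each accession.

    trait_table maps a trait name to a list of accession means, with
    accessions in the same order for every trait.
    """
    n_accessions = len(next(iter(trait_table.values())))
    rsi = [0] * n_accessions
    for values in trait_table.values():
        for i, rank in enumerate(rank_descending(values)):
            rsi[i] += rank
    return rsi

# Hypothetical means for three accessions and two traits (not from Table 10).
accessions = ["A", "B", "C"]
traits = {
    "FPW": [120.0, 80.0, 95.0],
    "DSW": [60.0, 30.0, 45.0],
}

scores = rank_summation_index(traits)
for name, score in sorted(zip(accessions, scores), key=lambda pair: pair[1]):
    print(name, score)  # lowest RSI first, i.e. best overall performer first
```

With four traits and 15 accessions, as in this study, the smallest possible RSI is 4 (ranked first for every trait) and the largest is 60, which helps put the reported values of 4 for G2 and 59 for G13 into context.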

Table 10 Mean values of the traits strongly, positively and significantly correlated with yield, and their rank summation index (RSI).


Qualitative disparity

Significant variation was found for all the qualitative traits, in agreement with Gbaguidi et al.45, who also found significant variation among all qualitative traits. We recorded three types of growth habit; similar observations were made by Ntundu et al.1 in Tanzania, Khan et al.23 in Malaysia and Azam-Ali et al.46 in Cameroon. We categorized the vegetative growth of Bambara groundnut as bunch type, semi-bunch type and spreading type, which matches the result of Doku47; a highly significant difference among qualitative traits was also noted by Egbadzor et al.48.

Quantitative traits

The 27 quantitative traits showed wide genetic variation; similar variation was confirmed by Ntundu et al.1 and Aliyu et al.49 in Vigna subterranea (L.) Verdc. and in cowpea (Vigna unguiculata L.)50. The high coefficients of variation (CV) estimated in our study indicate a large degree of heterogeneity, as confirmed by Goli et al.51 in Bambara groundnut. We found D50%F to be close to 39 days, whereas Mohammed52 recorded close to 68 days in Ghana. The indeterminate53 flowering habit makes flowering time an important mechanism of adaptation to the environment54. Variation in flowering time has been reported by Goli et al.51 (38 to 68 days), Massawe et al.10 (64 to 76 days), Masindeni55 (43 to 80 days) and Ouedraogo et al.8 (32 to 53 days). Several environmental factors (photoperiod, temperature, altitude and soil structure), as well as genotype, are responsible for flowering in Bambara groundnut28, where flowering was reported between 36 and 53 days. In our study, genotypes G1, G2, G3, G8, G9 and G10 were identified as early flowering lines; early flowering ensures early maturity56. A significant difference (P ≤ 0.01) was recorded for days to maturity (119.67 to 141.33 days), which is supported by Goli et al.51 and Masindeni55; owing to diverse cultivars and multiple environmental factors, maturation time varies from 90 to 180 days57. Plant height showed no significant variation, in agreement with Ntundu et al.1 in Tanzania and Shegro et al.28 in South Africa. The yield and yield related traits TNP, NSP, FPW, DPW, DSW, PL, PW, NMP, NIP and HSW showed high genetic variation; similar variation was reported by Shegro et al.28, who attributed it to the effect of genotype by environment interaction on Bambara groundnut yield. Hundred seed weight, which varied from 177.52 g to 360.15 g, is a vital morphological trait linked to yield23,52,55,58 and also influences yield directly.
Bambara groundnut yields of 146.6 to 2678.6 kg ha−1 were recorded by Gbaguidi et al.45, an average of 703.3 kg ha−1 by FAO20 and 1058.8 kg ha−1 by Dansi et al.59 in West Africa, whereas we recorded yields from 588.98 to 2991.77 kg ha−1. The average yield estimated by FAO20 is thus lower than our mean yield of 1180 kg ha−1.

Correlation coefficient

In plant breeding, the correlation matrix is a prominent tool for judging the degree of association between two or more variables, as supported by Mohammed52. In programmes for selecting superior genotypes, the correlation matrix can serve as a valuable measure60. A strong, positive and significant correlation was identified between the total number of pods (TNP) and the traits NMP, DPW, NSP, DSW and yield; this result is consistent with the studies of Pranesh et al.61 and Jonah et al.62. We found a moderate, positive and highly significant association of plant height (PH) with TNP, NMP, FPW and yield, which suggests that selection based on these traits may benefit yield enhancement of this crop as well as fodder production for animal feeding. Similar recommendations were made by Mohammed52 in Cote d’Ivoire and by ref. 63 in Cameroon. Nankar et al.64 described a correlation network and pattern search plot for phenotypic diversity in tomato, which also supports our findings. The relative abundance study using the area plot and pie chart revealed the richness of trait diversity for each genotype. Traits with a larger share in both the area plot and the pie chart contribute strongly to the diversity among the landraces.

Genetic components

For the selection programme, the variation present among the traits was taken into consideration, and its usefulness depends on the degree of heritability. Estimating genetic advance together with heritability is an important approach in crop improvement for predicting the expected gain from selection. Various studies have reported that selection can be effective for improving a specific trait when the available genetic variation is considered together with the degree of heritability29,65. Considering heritability and genetic advance together is more effective than using heritability alone66,67. In line with the findings of Adebola et al.68, we found higher phenotypic than genotypic variance for all traits, indicating that trait expression is influenced by the environment. The GCV and PCV values obtained were categorized according to the suggested scale of 0–10% for low, 10–20% for moderate and ≥ 20% for high variation23,69,70. Intermediate to high genetic advance together with heritability was found for all yield related traits except seed width and shelling%, indicating that these traits have significant potential in the selection process because of low environmental influence, as supported by Meena et al.71. Traits with low heritability and genetic advance can be improved through heterosis breeding, as supported by Usman et al.70. The relative difference between GCV and PCV was higher for plant height, seed width, days to 50% flowering and seed length, which is a sign of a stronger environmental effect; these traits are difficult to improve through direct selection. Traits with a lower difference are less influenced by the environment and may give desirable, strong and significant results in a crop improvement programme, as supported by Umar et al.29 and Usman et al.70. Direct selection can be effective for traits with low relative differences72.
Using the heritability and genetic advance classes23,37 (more than 60% for high, 30–60% for moderate and 0–30% for low), we found that the traits BFW (Hb = 97.68%, GA = 88.82%), BDW (Hb = 89.91%, GA = 106.66%), FPW (Hb = 99.55%, GA = 93.54%), DPW (Hb = 98.10%, GA = 113.94%), DSW (Hb = 98.42%, GA = 122.01%) and yield (Hb = 98.10%, GA = 113.94%) were highly heritable and had high genetic advance. This suggests that direct selection based on these traits, which are governed by additive gene effects, can be effective for crop improvement; similar findings were documented by previous researchers65,73. Low to moderate heritability and genetic advance values may hinder the improvement of a trait because environmental effects outweigh genetic effects, as stated by Ridzuan et al.74. Effective selection can therefore be achieved only by choosing traits with higher GCV, PCV, Hb and GA, which indicate that additive gene effects are stronger than environmental effects70.
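The genetic parameters discussed here can be computed from variance components as in the following Python sketch. The formulas are the ones commonly used in the plant breeding literature and are an assumption on our part, because this section does not restate them; the selection differential k = 2.06 corresponds to the conventional 5% selection intensity. The variance components and mean in the example are hypothetical.

```python
import math

def genetic_parameters(var_g, var_p, mean, k=2.06):
    """Return GCV, PCV, broad-sense heritability and genetic advance.

    var_g: genotypic variance, var_p: phenotypic variance, mean: grand mean.
    k = 2.06 is the standardized selection differential at 5% intensity
    (an assumed, conventional choice).
    """
    gcv = math.sqrt(var_g) / mean * 100      # genotypic coefficient of variation (%)
    pcv = math.sqrt(var_p) / mean * 100      # phenotypic coefficient of variation (%)
    hb = var_g / var_p * 100                 # broad-sense heritability (%)
    ga = k * math.sqrt(var_p) * hb / 100     # genetic advance in trait units
    gam = ga / mean * 100                    # genetic advance as % of the mean
    return {"GCV": gcv, "PCV": pcv, "Hb": hb, "GA": ga, "GAM": gam}

def classify_variation(cv_percent):
    """Scale quoted in the text: below 10% low, 10-20% moderate, >= 20% high."""
    if cv_percent < 10:
        return "low"
    if cv_percent < 20:
        return "moderate"
    return "high"

# Hypothetical variance components for one trait (illustrative only).
params = genetic_parameters(var_g=81.0, var_p=100.0, mean=50.0)
for key, value in params.items():
    print(f"{key}: {value:.2f}")
print("GCV class:", classify_variation(params["GCV"]))
print("PCV class:", classify_variation(params["PCV"]))
```

Because the phenotypic variance includes the environmental component, PCV is never smaller than GCV, and the gap between the two gives a rough indication of how strongly the environment influences a trait.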

Clustering patterns

Five clusters were constructed from the 27 quantitative traits at a distance coefficient of 1.16, which indicates a degree of diversity among the genotypes. Cluster V was considered a potential group of genotypes for crop improvement because of its high yielding capacity. Previous researchers30,45,75,76 constructed similar clusters and found significant variation in the morphological traits of Bambara groundnut. Unigwe et al.30 identified four distinct groups of Bambara groundnut genotypes in South Africa using the UPGMA method. Flowering time is an important determinant of final yield and contributes positively to the yield of the best group, so selection from this class could be effective, as noted by Tourél et al.77. Flowering in Bambara groundnut is indeterminate up to the harvesting stage, as explained by Kumaga et al.78. However, early flowering is considered a desirable agronomic trait for quick maturity, uniform yield and crop production in general78; thus, early flowering accessions should be regarded as the best for Bambara groundnut production79. The groups obtained from cluster analysis of quantitative characteristics illustrate the performance of Bambara groundnut accessions cultivated in Benin, which would serve as a future guideline for the improvement of this crop22,80. Clustering and characterizing accessions on the basis of their agro-morphological traits and genetic similarity is crucial for identifying and selecting the best parents for hybridisation81. Additionally, cluster IV produced a mean yield 70.05% higher than the grand mean yield of 1180 kg ha−1, while the other groups gave lower yields; this finding is supported by Onwubiko et al.73. Therefore, the current research provides plant breeders with significant information on the similarity and grouping of accessions obtained through univariate and multivariate methods.
The heatmap analysis depicted the degree of correspondence among the morphological traits evaluated in the Bambara groundnut genotypes, and this result is consistently supported by Virga et al.82.
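The two building blocks behind such a heatmap, row-wise z-scores and Euclidean distances between genotypes, can be sketched in plain Python. The trait matrix below is hypothetical, the use of the sample standard deviation is an assumption, and the Ward agglomeration step that a clustering tool applies afterwards is deliberately left out.

```python
import math
import statistics

def row_z_scores(row):
    """Centre a trait row on its mean and scale by its sample standard deviation."""
    mean = statistics.fmean(row)
    sd = statistics.stdev(row)
    return [(x - mean) / sd for x in row]

def euclidean(a, b):
    """Euclidean distance between two equally long profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical matrix: rows are traits, columns are three genotypes.
matrix = {
    "TNP": [40.0, 55.0, 70.0],
    "DSW": [20.0, 35.0, 50.0],
    "PH": [25.0, 24.0, 26.0],
}

scaled = {trait: row_z_scores(row) for trait, row in matrix.items()}

# Each genotype's profile is its column of scaled values across all traits.
profiles = list(zip(*scaled.values()))
for i in range(len(profiles)):
    for j in range(i + 1, len(profiles)):
        distance = euclidean(profiles[i], profiles[j])
        print(f"genotype {i + 1} vs genotype {j + 1}: {distance:.2f}")
```

Scaling each row first keeps traits measured in large units, such as yield in kg ha−1, from dominating the distances between genotypes.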

Principal component analysis

Principal component analysis (PCA) serves to re-validate the cluster analysis. PCA is effective for estimating the total variation present in a set of characters, as noted by Johnson39. The first axis (PC1) explains the largest portion of the total variation in any PCA83. In our findings, PC1 accounted for a larger proportion of the variation (45.88%) than PC2 (10.64%). A similar pattern was reported by Mohammed et al.84, with 19% (PC1) and 14% (PC2) of the total variation in Bambara groundnut. The results of several researchers, such as Usman et al.70, Farhad et al.85 and Maqbool et al.86, support our findings. Shegro et al.28 grouped 20 Bambara groundnut accessions by PCA using quantitative traits. Adéoti et al.87 and Mih et al.88 concluded that PC1 is the most powerful criterion for selection aimed at yield improvement. In our study, total number of pods, number of mature pods, number of seeds, dry seed weight and yield (kg ha−1) had high values in PC1. This finding is supported by Stoilova & Pereira80, who described the number of pods and the number of seeds per plant as the most important components of yield. The cluster analysis together with the principal component analysis revealed the common associations among landraces in terms of seed yield and related agronomic traits.

Shannon diversity index (H) and evenness (E)

Shannon’s diversity index (H) is another index that is generally used to characterize species diversity in a community. It accounts for both the richness and the evenness of the species present and is used in a wide variety of fields. The estimated H’ index varied among the phenotypic traits from 2.57 for dry seed weight per plant to 2.71 for plant height, maturity date and shelling percent. In our study the observed diversity index was more than 2.50 for most of the traits evaluated, and a higher value indicates higher diversity; the H’ index typically ranges from 1.5 to 3.5 and only rarely reaches 4.5 (refs. 89,90). Olukolu et al.22 reported H’ index values for nineteen qualitative traits (0.1 to 0.15) and twenty-eight numerical traits (0.09 to 0.16) of Bambara groundnut, which supports our findings. Bonny et al.76 evaluated the diversity of qualitative traits in Bambara groundnut landraces and reported findings similar to ours.

Selection of high yielding accession using RSI method

The correlation study identified the traits that have a positive and significant correlation with yield, which gives valuable information to breeding programmes for the selection of high yielding Bambara groundnut. Similar conclusions were reached by Onwubiko et al.43 and Ajala et al.91, who obtained two high yielding genotypes from 33 accessions using the RSI method. Other agronomic evaluations in this study, such as clustering and principal component analysis, also confirmed the accuracy of the selection index result.


The present study provides evidence that the yield and yield related traits of Bambara groundnut (Vigna subterranea [L.] Verdc.) can be improved through selection guided by genetic parameters such as GCV, PCV, Hb and GA. Based on the recorded data and the supplementary analyses (heatmap study, correlation network and abundance analysis), it can be stated that a considerable degree of variation exists in almost all the agronomic traits evaluated in this study. Moderate to perfect significant associations were noted between yield and its related traits. Additionally, this research describes selection criteria based on the traits that were strongly and positively correlated with grain yield. PCV and GCV values of more than 20% were estimated for all traits except TNP, PW, NSP, SL, SW and shell%; in addition, six traits (DSW, DPW, FPW, BDW, BFW and grain yield) showed high (≥ 20%) genetic advance (as a percentage of the mean) together with high heritability. A high degree of divergence was detected among the tested landraces based on the H’-index (2.57–2.71) as well as on Euclidean distance clustering (into five groups). Considering all the statistical findings, the genotypes G2, G3, G8 and G9 were identified as promising high yielding lines and can be used as distant parents in a hybridization programme. We suggest that further research be conducted to achieve homogeneity of the genotypes through improvement of yield and its contributing traits. At the same time, emphasis must be placed on intensive research on these potentially high yielding lines, combining conventional breeding with molecular approaches.