Introduction

Bambara groundnut (Vigna subterranea (L.) Verdc.; 2n = 2x = 22) is a legume crop of the genus Vigna, in the family Fabaceae and subfamily Faboideae, that has received little attention1. It originated in West Africa and is now a common food crop in African countries. It is also grown successfully in Asia, including Malaysia, Thailand, India, the Philippines, and Sri Lanka, as well as in Brazil2. Bambara groundnut is the third most common legume in Africa after groundnut and cowpea because it can withstand drought and can be grown in low-fertility soils where other crops fail3. By fixing nitrogen, it helps to increase soil fertility4. Surplus Bambara groundnuts are often sold in local markets, raising income for resource-limited farmers5. Bambara groundnut is a reliable source of food for low-income people6. The world's population continues to rise rapidly, necessitating an increase in the production of this crop to counter potential demographic pressures. In Malaysia, reported average yields were 1.18 t ha⁻¹ (ref. 7), 0.38 to 1.6 t ha⁻¹ (ref. 8), and 0.97 to 3.41 t ha⁻¹ (ref. 9), whereas 0.7 to 2.0 t ha⁻¹ (ref. 10) was reported in Indonesia. Under optimal farming conditions, the crop can produce up to 4.0 t ha⁻¹ (ref. 11) or 5.0 t ha⁻¹ (ref. 3) of dry pods.

Scientists have investigated many ways to increase yields. These include selecting high-yielding varieties well adapted to particular growing areas, quality seed, crop establishment, nursery management, weed management, and post-harvest management. Breeding for high yield has been identified as the most sustainable approach since yield traits are heritable. However, there is a high genotype by environment (G × E) interaction for grain yield, and more information is needed to identify broadly adapted, high-yielding genotypes. The lack of a modern production system and of commercial high-yielding cultivars is the major limitation to the expansion of this crop; many growers still use traditional landrace varieties12. In Malaysia in particular, a scarcity of high-yielding cultivars is one of the main constraints to Bambara groundnut production. Plant breeders have used both conventional and molecular approaches to produce improved Bambara groundnut varieties1. They routinely evaluate cultivars in multi-environment trials using yield and its associated traits, as well as phenotypic expression. The variability in yield over environments (seasons and locations) is related to both biotic and abiotic environmental factors13.

In most crops, genotype by environment (G × E) interactions are common14 as certain genotypes have adapted to particular environments. A G × E interaction is characterized by a difference in the relative performance of two or more genotypes for specific traits when they are evaluated in two or more environments. This interaction usually changes genotype ranking across environments15. G × E interaction exacerbates the uncertainty in identifying the target population of environments (TPE), the lack of suitable selection criteria, and the difficulty of finding test locations that represent the target environments. Therefore, the focus of Bambara groundnut breeding programs has changed from developing genotypes with general adaptation to identifying genotypes suited to particular conditions. Higher Bambara groundnut grain yield has the potential to improve food security. To discover superior and stable Bambara groundnut genotypes that are suited to several environments, the candidate lines must be evaluated across those environments.

In order to assess G × E interaction in a multi-environmental yield trial, analysis of variance (ANOVA) is used in this study to test for differences between genotype, location and environment. Among those who have contributed to this work are Zobel et al.16 and Oladosu et al.15. Priority should be given to crop management and agronomical issues, especially during the vegetative stage and harvesting, according to Vadiveloo and Phang17, since improved lines generally require better growing conditions to achieve their yield potential. While the focus of any breeding program is to improve yield and grain quality, the identification of correlated traits that contribute to yield stability is important since yield is the net result of many plant processes. The current study aims to evaluate the contribution of several yield-related traits to yield stability in high yielding cultivars across four environments.

Materials and methods

Plant materials

The research work was conducted at the Institute of Tropical Agriculture and Food Security (ITAFoS), University Putra Malaysia (UPM), Malaysia. A set of 30 accessions of V. subterranea was used in this study. Initially, we collected 15 landrace seed samples from Nigeria, from which we obtained 150 plants. These were selfed for three generations and then evaluated6,7,8. From that evaluation, we selected 44 lines, which we selfed for two more generations and evaluated. We advanced the best 30 of these, representing 11 of the 15 original accessions, to this experiment. With regard to plant material, we complied with relevant institutional, national, and international guidelines and legislation. We collected the plant seeds or specimens with the proper permission of the institution's authority, following national and international strategies, and deposited them in GenBank, ITAFoS, UPM. We also had appropriate permission from farm or field owners during collection and experimentation. We affirm that during the collection and execution of the experiment, the authors complied with the IUCN Statement on Research Involving Species at Risk of Extinction and the Convention on the Trade in Endangered Species of Wild Fauna and Flora. The name and ID of each accession are listed in Table 1.

Table 1 The list of thirty selected Bambara groundnut accessions used in this study.

Environment and location

Four field trials were conducted in two nearby fields in two cropping seasons (2020 and 2021) in Malaysia. These four environments represented a range of conditions in temperature, rainfall, soil type, soil structure, soil pH and management practices. Details of the environmental conditions are presented in Table 2. The soil properties of the experimental site are listed in Table 3.

Table 2 Environmental description of the experimental site.
Table 3 Characterization of soil properties of the experimental region.

Experimental design and intercultural practices

The experiment was set up as a randomized complete block design (RCBD) with three replications in each environment. The experimental plot consisted of two rows 1.6 m × 0.80 m each. Following Khan et al.8, the spacing was 30 cm between plants, 50 cm between rows, 1.5 m between plots, and 2.0 m between replications. Recommended intercultural activities such as field planning, land clearing, weeding, irrigation, and fertilizing were carried out during the growing season. The prescribed fertilizer rates were 100% N = 45 kg N/ha, 100% P = 54 kg P2O5/ha, and 100% K = 45 kg K2O/ha; all of the phosphorus and potassium were applied during final land preparation, whereas 70% of the N was applied five weeks after planting18. The field was ploughed following the usual cultural practices of the local farmers. Pest and disease control was carried out where necessary, and hand weeding was conducted as needed.

Data collection

Following the Bambara groundnut descriptors of IPGRI, IITA, BAMNET19, twenty-seven quantitative traits (Table 4) were assessed during data collection. Data were recorded from five randomly selected plants in each plot of each replication at several growth stages in the field, and post-harvest data were recorded in the plant physiology lab.

Table 4 Twenty-seven quantitative traits measured according to IPGRI, IITA, BAMNET19.

Statistical analysis

Analysis of variance (ANOVA)

Although data were collected on individual, randomly selected plants, they were analysed on a plot-mean basis. Standard analyses of variance (ANOVA) were performed on the data using SAS version 9.4. The sources of variance tested were genotype (G), environment (E), and genotype by environment (G × E) interaction. For each trait, the mean, range, coefficient of variation (CV) and standard deviation were computed. Mean comparisons were carried out with the least significant difference (LSD) test at the 5% probability level. Finally, Pearson correlations among the traits were calculated on a genotype-mean basis, using the means reported in Table 8, following Pearson20 in XLSTAT. Table 5 outlines the ANOVA with the expected mean squares for pooled locations and seasons (environments).
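As a minimal sketch of the correlation step, the following Python snippet computes Pearson's r between two traits on a genotype-mean basis. The trait values and the conversion factor are hypothetical and are not taken from the tables of this study, which used SAS and XLSTAT rather than this code. Because the second trait here is an exact linear function of the first, r equals 1, which is what to expect whenever one trait is derived directly from another.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical genotype means (illustrative numbers only)
dry_pod_weight_g = [430.0, 395.0, 344.0, 410.0, 380.0]
# Hypothetical conversion: yield as an exact linear function of dry pod weight
yield_kg_ha = [w * 6.0 for w in dry_pod_weight_g]

r = pearson_r(dry_pod_weight_g, yield_kg_ha)
print(f"r = {r:.2f}")
```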

Table 5 Sketch of ANOVA table and EMS for pooled locations and seasons.

The variance components were derived from the expected mean squares in Table 5 using “SAS proc varcomp” with a restricted maximum likelihood (REML) approach. The phenotypic variance was computed as follows:

$${\upsigma }_{\mathrm{p}}^{2}={\upsigma }_{\mathrm{g}}^{2}+{\upsigma }_{\mathrm{gE}}^{2}+{\upsigma }_{\mathrm{e}}^{2}$$

where: \({\upsigma }_{\mathrm{g}}^{2}\) is the genotypic variance, \({\upsigma }_{\mathrm{gE}}^{2}\) is the G × E variance, and \({\upsigma }_{\mathrm{e}}^{2}\) is the mean error variance.

Estimation of variance components, heritability, and genetic advance

Phenotypic and genotypic coefficient of variation

According to Singh and Chaudhary21, the estimations of phenotypic and genotypic coefficients of variation were derived as follows:

$$(\mathrm{a})\mathrm{ PCV }\left(\mathrm{\%}\right)=\frac{\sqrt{{\upsigma }_{\mathrm{p}}^{2}}}{\overline{\mathrm{X}}} \times 100 \,\,\,\,\,\,\,\,\,\,\, (\mathrm{ b })\mathrm{ GCV }(\mathrm{\%})=\frac{\sqrt{{\upsigma }_{\mathrm{g}}^{2}}}{\overline{\mathrm{X}} } \times 100$$

where: PCV = Phenotypic coefficient of variation; GCV = Genotypic coefficient of variation; \(\overline{\mathrm{X} }\) = Grand average of the characteristics; \({\upsigma }_{\mathrm{p}}^{2}\) = Phenotypic variance; \({\upsigma }_{\mathrm{g}}^{2}\) = Genotypic variance. According to Sivasubramanian and Madhava22, GCV and PCV levels were classified as low (0–10%), moderate (10–20%), and high (≥ 20%).
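The two coefficients and the classification thresholds above can be sketched in Python as follows. The variance components and the grand mean are hypothetical illustration values, and assigning a value of exactly 10% or 20% to the higher class is an assumption, since the stated classes share their boundaries.

```python
import math

def coefficient_of_variation(variance, grand_mean):
    """Return sqrt(variance) / grand_mean * 100 (same form for PCV and GCV)."""
    return math.sqrt(variance) / grand_mean * 100.0

def classify_cv(cv_percent):
    """Low / moderate / high classes; boundary handling is an assumption."""
    if cv_percent < 10.0:
        return "low"
    if cv_percent < 20.0:
        return "moderate"
    return "high"

# Hypothetical variance components for one trait (not taken from Table 9)
sigma2_g, sigma2_ge, sigma2_e = 4.0, 3.0, 2.0
sigma2_p = sigma2_g + sigma2_ge + sigma2_e  # phenotypic variance
grand_mean = 25.0

pcv = coefficient_of_variation(sigma2_p, grand_mean)
gcv = coefficient_of_variation(sigma2_g, grand_mean)
print(f"PCV = {pcv:.2f}% ({classify_cv(pcv)}), GCV = {gcv:.2f}% ({classify_cv(gcv)})")
```

Because the genotypic variance is one part of the phenotypic variance, GCV can never exceed PCV for the same trait.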

Heritability

The ratio of genotypic variation (\({\upsigma }_{\mathrm{g}}^{2}\)) to phenotypic variation (\({\upsigma }_{\mathrm{p}}^{2}\)) is defined as broad-sense heritability (\({h}_{\mathrm{b}}^{2}\)). Falconer23, defines the formula of heritability as follows:

$${h}_{b}^{2}\left(\%\right)=\frac{{\sigma }_{g}^{2}}{{ \sigma }_{p}^{2}} \times 100$$

where: The genotypic variance is denoted by \({\upsigma }_{\mathrm{g}}^{2}\), while the phenotypic variance is denoted by \({\upsigma }_{\mathrm{p}}^{2}\). According to Johnson et al.24 heritability percentages are classified as low (0–30%), moderate (30–60%), and high (≥ 60%).
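As a hedged worked example of the heritability formula, note that GCV and PCV share the same grand mean, so the variance ratio equals (GCV / PCV)². The sketch below applies this to the GCV (24.57%) and PCV (25.91%) reported for biomass dry weight in the Results; the function names are illustrative, and the handling of the exact class boundaries is an assumption.

```python
def broad_sense_heritability(sigma2_g, sigma2_p):
    """Broad-sense heritability (%) = genotypic variance / phenotypic variance * 100."""
    return sigma2_g / sigma2_p * 100.0

def classify_heritability(h2_percent):
    """Low (< 30%), moderate (30 to < 60%), high (>= 60%); boundaries are an assumption."""
    if h2_percent < 30.0:
        return "low"
    if h2_percent < 60.0:
        return "moderate"
    return "high"

# GCV and PCV reported for biomass dry weight; squaring each gives variances
# on a common scale, so their ratio equals sigma2_g / sigma2_p.
gcv_bdw, pcv_bdw = 24.57, 25.91
h2_bdw = broad_sense_heritability(gcv_bdw ** 2, pcv_bdw ** 2)
print(f"h2 = {h2_bdw:.2f}% ({classify_heritability(h2_bdw)})")
```

The result is close to 89.9%, in line with the 89.93% reported for this trait; the small difference comes from rounding in the published coefficients.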

Genetic advance

The genetic advance (GA), expressed as a percentage of the mean, was calculated following Johnson et al.24 at a selection intensity of 5%. Following Johnson et al.24, the genetic advance was classified as low (0–10%), moderate (10–20%), and high (> 20%).

$$GA\left(\%\right)=K\times \frac{\sqrt{{\sigma }_{p}^{2}}}{\overline{X}} \times {h }_{b}^{2} \times 100$$

where: K is the selection differential constant, which depends on the selection intensity and, according to Khan et al.25, equals 2.06 at 5% selection intensity; \(\sqrt{{\upsigma }_{\mathrm{p}}^{2}}\) = Phenotypic standard deviation; \({\mathrm{h}}_{\mathrm{b}}^{2}\) = Broad-sense heritability (expressed as a proportion in this formula); and \(\overline{\mathrm{X} }\) = Grand mean of the trait.
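The genetic advance formula can be checked against values reported in the Results. Since PCV (%) is the phenotypic standard deviation divided by the grand mean and multiplied by 100, GA (%) reduces to K × PCV × h², with h² entered as a proportion. The sketch below, whose function names are illustrative, applies this to biomass dry weight (PCV = 25.91%, heritability = 89.93%).

```python
K_AT_5_PERCENT = 2.06  # selection differential constant at 5% selection intensity

def genetic_advance_percent(pcv_percent, h2_percent, k=K_AT_5_PERCENT):
    """GA as % of mean = K * (sqrt(sigma2_p) / mean * 100) * h2, with h2 as a proportion."""
    return k * pcv_percent * (h2_percent / 100.0)

def classify_ga(ga_percent):
    """Low (< 10%), moderate (10 to 20%), high (> 20%)."""
    if ga_percent < 10.0:
        return "low"
    if ga_percent <= 20.0:
        return "moderate"
    return "high"

# Values reported for biomass dry weight in the Results
ga_bdw = genetic_advance_percent(pcv_percent=25.91, h2_percent=89.93)
print(f"GA = {ga_bdw:.2f}% of the mean ({classify_ga(ga_bdw)})")
```

The computed value is about 48%, which agrees with the genetic gain reported for this trait and confirms that heritability enters the formula as a proportion rather than a percentage.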

Multivariate analysis

To examine the relationships between the different variables in this study, the correlation coefficients were calculated using SAS software (version 9.4), and the correlation heat map was generated using XLSTAT. The PCA variable and case loading plots, together with a single plot of the principal axes, eigenvalues, and cumulative variance, were created using XLSTAT. For two-way (double-dendrogram) clustering and the constellation plot, we used JMP ver. 16 software with Ward’s hierarchical clustering (WHC) method. The scatter plot, density plot, and PCA 3D plot were produced with the NCSS 2021 program.

Results

Analysis of variance and mean performance

A greater understanding of the contribution of genotypes, environment, and their interaction as sources of heterogeneity is critical for developing more stable genotypes. In the combined analysis of variance used to quantify interactions and describe heterogeneity for agro-morphological traits, the mean squares for genotype, environment and genotype × environment (G × E) were, depending on the trait, highly significant (p ≤ 0.01), significant (p ≤ 0.05) or non-significant (p > 0.05). The mean squares related to the G × E interactions of the growth traits from the combined analysis of variance are summarised in Table 6. The analysis reflects broad differences in genotype response to the environments for virtually all traits. Here, we focus on the traits significantly related to yield. The mean performance of the genotypes is displayed in Table 7.

Table 6 Mean square for growth and yield traits of 30 Bambara groundnut accessions revealed by ANOVA.
Table 7 Mean performance and comparison of 30 Bambara groundnut genotypes tested in four environments.

Analysis of variance and mean performance for vegetative components

Days to emergence (DTE) differed significantly (p ≤ 0.01) among environments and genotypes and for the genotype by environment (G × E) interaction. The emergence period ranged from 9 days (S5G18) to 13 days (S5G12), with an average of 11 days (Table 7). Days to 50% flowering (D50%F) showed highly significant variation (p ≤ 0.01) for all sources of variation. Genotype S5G10 took the longest (46 days) to reach 50% flowering, while genotypes S5G1 and S5G2 reached 50% flowering in a relatively short time (32 days); the average was 39 days after planting (Table 7). Days to maturity (DTM) differed among genotypes and environments and for their interaction. The longest time to maturity was observed in S5G10 (132 days), while the shortest was observed in S5G11, S5G26, S5G28, and S5G24 (127 days), followed by S5G27 (128 days) (Table 7). Plant height (PH, cm) varied significantly (p ≤ 0.05) among genotypes (G) and environments (E) and for G × E (Table 6). The tallest genotype was S5G17 (32.74 cm) and the shortest was S5G7 (27.2 cm), with an average of 29.49 cm (Table 7). Biomass fresh weight (BFW, g) per plant differed significantly (p ≤ 0.01) for all sources of variation. The average fresh biomass was 453 g, with a range of 322 g (S5G24) to 647 g (S5G17), followed by 635 g (S5G28) (Table 7). Biomass dry weight (BDW, g) per plant also differed significantly (p ≤ 0.01) for all sources of variation. The average dry biomass was 252 g, with a range of 163 g (S5G12) to 384 g (S5G17), followed by 373 g (S5G28) (Table 7).

Analysis of variance and mean performance for yield components

The number of pods per plant (TNP) differed significantly (p ≤ 0.01) for genotype (G), environment (E) and G × E. The average number of pods per plant was 84, with a range of 73 (S5G24) to 92 (S5G4) (Table 7). Fresh pod weight (FPW) differed highly significantly (p ≤ 0.01) for all sources of variation. The average weight of fresh pods was 633.61 g, with a range of 558.78 g (S5G27) to 684.58 g (S5G3) (Table 7). For dry pod weight (DPW, g), all sources of variation except G × E were highly significant (p ≤ 0.01). The average weight of dry pods was 395.57 g, with a range of 343.57 g (S5G30) to 430 g (S5G1) (Table 7). For dry seed weight (DSW, g), the G × E interaction was significant at p ≤ 0.01. The average weight of dry seed was 305.49 g, with a range of 261.30 g (S5G30) to 335.73 g (S5G1) (Table 7). Hundred seed weight (HSW, g) showed highly significant variation (p ≤ 0.01) for genotypes, environments, and G × E.

The average hundred seed weight was 176.32 g, with a range of 148.45 g (S5G28) to 197.16 g (S5G5) (Table 7). For shelling percentage (Shell%), all sources of variation were highly significant (p ≤ 0.01). The average shelling percentage was 77%, with a range of 74% (S5G11) to 80% (S5G22) (Table 7). For harvest index (HI, %), genotypes, environments, and their interaction (G × E) showed highly significant variation (p ≤ 0.01). The average harvest index was 62%, with a range of 50% (S5G28) to 72% (S5G12), followed by S5G15 (71%) (Table 7). Yield per hectare differed highly significantly (p ≤ 0.01) among genotypes and environments, whereas the G × E interaction was significant at p ≤ 0.05. The average yield was 2354.59 kg/ha, with a range of 2045.12 kg/ha (S5G30) to 2560.29 kg/ha (S5G1) (Table 7).

Estimation of the relationship between traits

Correlation between growth and yield components

The correlations among overall trait means for the vegetative and yield components are shown in Table 8. Of the 27 traits, we considered 12 to be vegetative traits. Days to emergence showed a weak negative association with dry pod weight (r = − 0.19), dry seed weight (r = − 0.12), and yield (r = − 0.19), and a positive correlation, weak to moderate in strength, with harvest index (r = 0.16), days to maturity (r = 0.40), and days to 50% flowering (r = 0.58). Days to 50% flowering showed a positive, intermediate and significant association with DTM (r = 0.38), whereas negative significant associations were recorded with total number of pods (r = − 0.40), dry pod weight (r = − 0.48), harvest index (r = − 0.13), and yield (r = − 0.48). Days to maturity had no meaningful association with plant height (r = − 0.09), though positive and significant associations were noted with dry pod weight (r = 0.37), fresh pod weight (r = 0.29), hundred seed weight (r = 0.46), and yield (r = 0.37). Plant height had a positive and significant association with number of branches (r = 0.38) and hundred seed weight (r = 0.20), though negative and non-significant associations were observed with dry pod weight (r = − 0.04), harvest index (r = − 0.30), and yield (r = − 0.04). Number of branches exhibited a positive and significant correlation with biomass fresh and dry weight (r = 0.36), but there was no significant relation with yield (r = 0.13). There was no meaningful correlation of yield with number of petioles or number of leaves, though a perfect positive and highly significant correlation (r = 1.00) was found between number of petioles and number of leaves. Biomass fresh weight showed a negative significant association with harvest index (r = − 0.97), though no significant association was noted with fresh pod weight (r = − 0.03), dry pod weight (r = − 0.15), dry seed weight (r = − 0.20), or yield (r = − 0.15). Harvest index (r = − 0.96) and hundred seed weight (r = − 0.22) were also negatively correlated with biomass dry weight.

Table 8 Pearson's correlation (r) estimates for 27 phenotypes of 30 accessions of Bambara groundnut.

Correlation between yield and yield components

The correlations among the 15 yield-related components over the combined analysis are shown in Table 8. The total number of pods showed a strong, significant positive correlation with mature pods (r = 0.92) and was moderately associated with yield (r = 0.75), fresh pod weight (r = 0.78), and dry pod weight (r = 0.75). Moreover, hundred seed weight (r = 0.63) and dry seed weight (r = 0.67) showed a moderate association with total number of pods. Yield per hectare had a strong, significant positive association with number of mature pods (r = 0.83), fresh pod weight (r = 0.83), and dry seed weight (r = 0.94). A highly significant and perfect correlation was recorded between dry pod weight and yield (r = 1.00), whereas positive associations were found with hundred seed weight (r = 0.84) and harvest index (r = 0.36). Pod width (r = 0.48) and pod length (r = 0.58) showed a considerable degree of association with yield per hectare. A moderate positive association with yield was observed for seed width (r = 0.39) and seed length (r = 0.74). Figure 1 shows a color map of the clustered phenotypic traits, whereas Fig. 2 depicts the relationship between yield and its strongly contributing factors. In the color map, red and yellow indicate negative and positive correlations between traits, respectively, and the intensity of the color reflects the magnitude of the association. From the color map, we established that the total number of pods, fresh pod weight, dry pod weight, yield, dry seed weight, and hundred seed weight showed intense yellow, indicating strong positive relationships among these traits. Yield (kg per hectare) was derived directly from dry pod weight per plant (g), which explains the direct linear relationship observed between yield and dry pod weight (Fig. 2).

Figure 1

Correlation cluster heatmap showing the graphical relationship among the 27 traits, produced with XLSTAT. Note: Red and yellow indicate negative and positive correlation, respectively, among the tested traits. The darker the hue, the stronger the relationship between the traits, and vice versa.

Figure 2

Scatter matrix with density and dot plots illustrating the relationship between yield and its components in Bambara groundnut accessions, produced with NCSS 2021. (A) Total number of pods (TNP) vs yield, (B) fresh pod weight (g) (FPW) vs yield, (C) hundred seed weight (g) (HSW) vs yield, (D) dry pod weight (g) (DPW) vs yield.

Variance component analysis

Variation in every population is caused by genetic and environmental influences. Only genetic variability is heritable from generation to generation; however, distinguishing between heritable and non-heritable variation complicates the selection process for breeders. As a result, before beginning a prudent breeding effort, breeders must differentiate the heritable from the non-heritable variation. Table 9 shows the estimated phenotypic coefficient of variation (PCV) and genotypic coefficient of variation (GCV) for all traits. We considered genotypic variance (\({\upsigma }_{\mathrm{g}}^{2}\)), genotype by season variance (\({\upsigma }_{\mathrm{gs}}^{2}\)), genotype by location variance (\({\upsigma }_{\mathrm{gl}}^{2}\)), genotype by location by season variance (\({\upsigma }_{\mathrm{gls}}^{2}\)), and error variance (\({\upsigma }_{\mathrm{e}}^{2}\)) as the variance components that collectively contribute to phenotypic variance (\({\upsigma }_{\mathrm{p}}^{2}\)). However, the larger the non-genetic variance components relative to the genotypic variance, the lower the heritability and genetic advance26. The PCV ranged from 5.2% for days to maturity to 25.91% for biomass dry weight (g), while the GCV ranged from 1.15% for shelling percent to 24.57% for biomass dry weight (g). Number of petioles, number of leaves, and biomass fresh and dry weight (g) all had high PCV values of more than 20%, whereas days to emergence, days to 50% flowering (d), number of branches, number of stems, number of nodes, internode length (mm), pod length, pod width, seed length, and seed width all had moderate PCV values ranging from 10 to 20% (Table 9). We observed low genotypic coefficients of variation (GCV) for most of the evaluated traits, except biomass fresh weight (GCV = 20.11%) and biomass dry weight (GCV = 24.57%), which exceeded 20%.

Table 9 Estimation of variance component, heritability, and genetic advance for 27 phenotypic traits in 30 Bambara groundnut accessions.

Selection programs rely on the heritable portion of trait variation. Assessing genetic advance together with heritability is a useful tool in crop improvement for determining the expected benefit from selection. The traits in this investigation showed low to high heritability estimates, ranging from 3.98% for shelling percent to 92.05% for harvest index (Table 9). Broad-sense heritability was high for biomass dry weight (89.93%), biomass fresh weight (84.13%), dry pod weight (78.07%), harvest index (92.05%), and yield (78.07%). Moderate heritability was recorded for days to 50% flowering (34.75%), days to maturity (33.27%), number of seeds (51.5%), dry seed weight (49.23%) and hundred seed weight (48.35%) (Table 9); the remaining traits had low heritability. Genetic advance is a measure of how far a population may be shifted by selection. High heritability alone does not always imply high genetic gain; it is most informative when combined with high genetic advance. Vegetative and yield component traits had genetic advance values ranging from low to high (Table 9). The maximum genetic gain was found for biomass dry weight (g) (48%), followed by biomass fresh weight (g) (38%), whereas genetic gain at or near the moderate range was recorded for traits such as number of immature pods (12.35%), harvest index (18.31%), dry pod weight (10.34%), hundred seed weight (9.78%), and yield (10.34%) (Table 9). Traits that combine moderate to high heritability with genetic gain have the potential for successful selection in genetic improvement. Because their expression is likely controlled by additive gene action, simple phenotypic selection may be enough to improve the next generation. Understanding the extent and nature of variability across genotypes for particular traits is a prerequisite for simultaneous selection on several traits in Bambara groundnut improvement. We also estimated the Shannon diversity index (Fig. 3) to explore the extent of diversity; genotype S5G30 showed the maximum value of more than 2.33, followed by S5G25 with a value close to 2.33 (Fig. 3).
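The text does not state how the Shannon index was computed. As a point of reference only, the standard form is H' = −Σ p_i ln p_i over the relative frequencies p_i of the classes observed, and the sketch below implements that form on hypothetical counts. The use of the natural logarithm and the way traits would be binned into classes are assumptions, not details taken from this study.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over classes with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical class counts: same number of classes, different evenness
even = shannon_index([5, 5, 5, 5])     # all classes equally frequent
uneven = shannon_index([17, 1, 1, 1])  # one class dominates
print(f"even = {even:.3f}, uneven = {uneven:.3f}")
```

For k equally frequent classes the index reaches its maximum, ln k, so a higher value indicates that observations are spread more evenly over more classes.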

Figure 3

Estimation of the Shannon diversity index of the 30 evaluated accessions. The legend on the right side of the figure indicates the 11 populations of Bambara groundnut from which the 30 accessions were sampled. Samples are Duna (S5G6, S5G18, S5G24, S5G28); Maikai (S5G1, S5G2, S5G3, S5G14); Cancaraki (S5G7, S5G19); Roko (S5G8, S5G29); Bidillali (S5G4, S5G20); Jatau (S5G5, S5G30); Maibergo (S5G15, S5G16); Katawa (S5G17, S5G26); Giiwa (S5G11, S5G21, S5G22, S5G25); Karu (S5G13, S5G23, S5G27); and Exsokoto (S5G9, S5G10, S5G12).

Clustering pattern

Genetic divergence analysis is a standard tool for parental selection, as it reveals the degree of divergence across existing genotypes. Clustering provides a compact indication of the degree and pattern of genetic variation, which is important for selecting the desired genotypes. The phenotypic data were used in this study to compute the phylogenetic relationship among the 30 Bambara groundnut genotypes. The Ward hierarchical cluster analysis produced a two-way dendrogram (Fig. 4A) with distinct clusters indicating relationships among the tested genotypes. The two-way dendrogram combines two dendrograms in the same plot: the horizontal dendrogram groups the genotypes and the vertical one groups the variables. Cluster I contained eight genotypes: S5G1, S5G3, S5G2, S5G4, S5G5, S5G6, S5G7, and S5G11. Cluster II comprised six genotypes: S5G20, S5G8, S5G9, S5G16, S5G10, and S5G13. The largest number of genotypes (9) was assembled in cluster III, the smallest (3) in cluster IV, and cluster V had four accessions. In the vertical dendrogram, ten of the 27 traits (TNP, NMP, FPW, SL, PW, DPW, Yield, NSP, DSW, and HSW) fell into cluster I. Cluster II contained three vegetative traits (NNS, NP, and NL), whereas eight variables were grouped into cluster III (BDW, BFW, IL, SW, PL, NB, Shell%, and PH). Cluster IV assembled DTE, D50%F, DTM, NS, NIP, and HI. This grouping is consistent with the correlation analysis, i.e., there were meaningful relationships among the traits within a cluster. The heatmap accompanying the clusters gives a color-coded visualization of the relationship between traits and genotypes. In the heatmap, each cell represents the intersection of an accession and a variable. Based on the Z-score, red and blue represent high and low trait values for an accession, respectively. The intensity of the color indicates the magnitude of the value: the more intense the color, the more extreme the value. The constellation diagram (Fig. 4B) shows the position of the point representing each accession. In the constellation plot, the accessions of clusters I, II, III, IV, and V are represented by indigo circle, red cross, blue square, green cross, and pink triangle symbols, respectively.
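The Z-score coloring of the heatmap can be illustrated with a short sketch. It standardizes one trait across accessions so that values above the trait mean become positive and values below it become negative. The accession labels and trait values are hypothetical, and the use of the sample standard deviation is an assumption, since the exact scaling applied by the clustering software is not stated here.

```python
import statistics

def z_scores(values):
    """Standardize values to mean 0 and unit (sample) standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption for this sketch
    return [(v - mean) / sd for v in values]

# Hypothetical trait values for four placeholder accessions
trait = {"A1": 322.0, "A2": 453.0, "A3": 647.0, "A4": 390.0}
z = dict(zip(trait, z_scores(list(trait.values()))))

for name, score in z.items():
    side = "above" if score > 0 else "at or below"
    print(f"{name}: z = {score:+.2f} ({side} the trait mean)")
```

Standardizing each trait in this way puts traits measured in different units (days, grams, centimetres) on a common scale before they are compared or clustered.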

Figure 4

Cluster analysis: (A) double dendrogram and (B) constellation plot illustrating the phylogenetic relationship of V. subterranea genotypes. Accessions in clusters I, II, III, IV, and V are shown as indigo, green, blue, and orange symbols.

Principal component analysis

Principal component analysis (PCA) has been widely used in agricultural research for sorting traits and categorizing accessions. In the current study, the first seven principal components (PCs) accounted for 83.38% of the total variance (Table 10 and Fig. 5). The first PC accounted for the greatest proportion of variance among all PCs, while each subsequent PC accounted for a progressively smaller amount of variation. The proportions of variance for PC1 and PC2 were 33.87% and 15.13%, respectively, while the 7th PC accounted for 4.07% of the variation. The traits contributing to PC1 and PC2 had the most variability, with high coefficients of variation. Table 10 shows the factor loadings of the traits obtained from the principal component analysis. Most of the evaluated traits loaded positively on PC1, except DTE (− 0.08), D50%F (− 0.18), NP (− 0.01), NL (− 0.01), NNS (− 0.01), BDW (− 0.01), SHEL% (− 0.01), NIP (− 0.09), IL (− 0.02), and BFW (− 0.02). For PC2, plant height, branch number, total number of pods, biomass fresh and dry weight, pod length, and seed width contributed substantially. Figure 6A depicts the sample loadings (30 accessions) and Fig. 6B the variable loadings (27 morphological traits) from the PCA. In the variable loading plot, all the traits were positioned on the positive side except D50%F, NIP, Shell%, and DTE. The angle between two vectors indicates the degree of association between the respective traits: the smaller the angle, the stronger the relationship, and vice versa. An angle of 90° between traits indicates no association, while an angle of more than 90° suggests a negative relationship. The density plot (Fig. 6C) shows the dispersion of the genotypes based on yield potential, and Fig. 6D gives a three-dimensional (3D) visualization of the 30 evaluated accessions.
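Two of the PCA readings described above lend themselves to a small sketch: accumulating the proportion of variance over successive components, and interpreting the angle between two trait vectors in the loading plot. The variance proportions for PC1 and PC2 are the reported ones; the loading vectors and the function names are hypothetical and serve only to illustrate the angle rule.

```python
import math

def cumulative(percentages):
    """Running total of the variance explained by successive components."""
    total, out = 0.0, []
    for p in percentages:
        total += p
        out.append(total)
    return out

def angle_degrees(u, v):
    """Angle between two loading vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    cos = max(-1.0, min(1.0, dot / norm))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos))

def interpret(angle):
    """Acute: positive association; right angle: none; obtuse: negative."""
    if abs(angle - 90.0) < 1e-9:
        return "no association"
    return "positive" if angle < 90.0 else "negative"

# Reported proportions of variance for PC1 and PC2
print(cumulative([33.87, 15.13]))

# Hypothetical (PC1, PC2) loadings for three placeholder traits
trait_a, trait_b, trait_c = (0.8, 0.1), (0.7, 0.3), (-0.2, 0.9)
print(interpret(angle_degrees(trait_a, trait_b)))  # small angle
print(interpret(angle_degrees(trait_a, trait_c)))  # angle above 90 degrees
```

The first two components together account for roughly 49% of the total variance, so a two-dimensional loading plot shows only about half of the variation, and angles read from it are approximations of the underlying correlations.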

Table 10 Eigenvalues, variation percentage, and eigenvectors revealed by PCA analysis.
Figure 5

Graphical illustration of eigen values, axis, and % cumulative variation revealed by XLSTAT.

Figure 6

The PCA depicts (A) the loading of samples (genotypes), (B) the loading of variables (traits), (C) a density plot showing the distribution of genotypes based on yield potential, and (D) a PCA 3D plot, produced with XLSTAT and NCSS 2021.

Discussion

Analysis of variance and mean performance for vegetative and yield components

A significant variation presents in the pooled analysis of variance for the 27 traits of the 30 Bambara groundnut genotypes. Among the vegetative trait, the coefficient of variation ranged from 5.43% for days to maturity to 27.96% for biomass fresh weight (g) however, for the yield component it varied from 6.17% (shelling percent) to 26.24% (hundred seed weight). Masindeni26, reported CV = 24.79% for grain yield, CV = 9.32% for hundred seed weight and CV = 31.86% for pods number per plant over six locations. In terms of grain yield and hundred seed weight, our findings revealed highly significant (p ≤ 0.01) variations across genotypes. For almost all characteristics, including dry pod weight, hundred seed weight, and dry seed weight, the G × E interaction was highly significant (p ≤ 0.01), though statistically shelling percent had no meaningful difference. This implies that the genotypes' ordering was not consistent. Similar findings of significance of G × E have been reported by Masindeni26, in Bambara groundnut, Oladosu et al.15 in rice, and Ali et al.27 in groundnut. . In this current research, a significant level of divergence among GE interaction and genotype effect indicated that certain of the presence of diverse multi-environments with different genotypes as well as high yield potential28. According to Yan et al.29, the GEI over a wide extent of mega- environmental trail comprises of two interactions namely, crossover and non-crossover interactions. The yield performance among the tested accessions over multi-environments persistently designates as non-crossover interaction whereas crossover interaction illustrates the comparative alternation in genotypes ranking over a wide range of environments. Plant breeders can either select genotypes for a certain environment or widely adjusted genotypes across the multi-environmental conditions when GEI is subjected to the influence of predictable components30. 
However, the development of relatively stable genotypes across multi-environment conditions is necessary when GEI is driven by unpredictable components31. For a better explanation and understanding of GEI, yield stability analysis across locations and seasons may improve both the reproducibility and the heritability of the traits evaluated32.
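The distinction between crossover and non-crossover interaction can be illustrated with a minimal Python sketch that checks whether the genotype ranking changes between environments. The genotype labels, environment labels, and yield values are hypothetical and are not data from this study; the function names are introduced only for illustration.

```python
def rank_genotypes(yields):
    """Return genotype names ordered from highest to lowest yield."""
    return sorted(yields, key=yields.get, reverse=True)


def has_crossover(yield_by_env):
    """True if the genotype ranking differs between any two environments."""
    rankings = {tuple(rank_genotypes(y)) for y in yield_by_env.values()}
    return len(rankings) > 1


# Hypothetical yields (t/ha) for three genotypes in two environments.
# Non-crossover: the magnitudes change but the ranking is preserved.
non_crossover = {
    "E1": {"GA": 2.0, "GB": 1.5, "GC": 1.0},
    "E2": {"GA": 3.0, "GB": 1.8, "GC": 1.1},
}
# Crossover: GA and GB swap ranks between E1 and E2.
crossover = {
    "E1": {"GA": 2.0, "GB": 1.5, "GC": 1.0},
    "E2": {"GA": 1.2, "GB": 1.9, "GC": 1.0},
}

print(has_crossover(non_crossover))  # False
print(has_crossover(crossover))      # True
```

A rank comparison of this kind only flags that a crossover exists; it does not test whether the rank change is statistically significant.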

Estimation of the relationship between traits

The correlation matrix can be a useful tool for assessing traits in genotype selection programs33. Mohammed34 supports the use of correlation matrices in plant breeding as a popular method for determining the degree of relationship between two or more traits. This is in line with the findings of Pranesh et al.35 and Jonah et al.36, who found a strong, positive, and significant association between the total number of pods and traits such as number of mature pods, dry pod weight, number of seeds, dry seed weight, and yield. We found a weak to intermediate positive relationship between plant height (PH) and yield, biomass fresh weight, and biomass dry weight, and we propose that selection based on these traits may be beneficial for increasing yield as well as fodder production for animal feed. Our findings are corroborated by the reports of Mohammed34 in Côte d'Ivoire and Zenabou et al.37 in Cameroon on Bambara groundnut. Zenabou et al.38 reported that days to 50% flowering had a negative correlation with fresh and dry pod weight, which is consistent with our findings and may be due to limited variation in the planting materials. Plant height was negatively associated with yield but positively associated with shelling percentage, hundred seed weight, and fresh pod weight, and these findings agree with Ahmad39. The number of petioles and leaves per plant showed a positive correlation with total number of pods, fresh pod weight, dry seed weight, hundred seed weight, and shelling percent but a negative correlation with harvest index and yield; a similar observation was made by Unigwe et al.40. Yield components such as total number of pods, fresh and dry pod weight, number of seeds, dry seed weight, hundred seed weight, and harvest index showed a significant positive correlation with grain yield in our investigation. 
These results are comparable with the reports on Bambara groundnut by Khan et al.8, Mohammed34, Pranesh et al.35, Khan et al.41, and Onwubiko et al.42. This suggests that these traits could be targeted in selection to increase Bambara groundnut production.
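As an illustration of how a trait correlation matrix of this kind is built, the following Python sketch computes Pearson correlation coefficients between every pair of traits. The trait names and values are hypothetical, and the use of the Pearson coefficient is an assumption made here for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("samples must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(traits):
    """Nested dict holding r for every pair of traits."""
    names = list(traits)
    return {a: {b: pearson_r(traits[a], traits[b]) for b in names}
            for a in names}


# Hypothetical genotype means for three traits (not data from this study).
traits = {
    "total_pods": [20, 25, 30, 35, 40],
    "yield_t_ha": [1.0, 1.2, 1.5, 1.7, 2.0],
    "days_to_flowering": [50, 48, 45, 44, 40],
}

matrix = correlation_matrix(traits)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

In this toy data set, total pods and yield rise together (positive r), while days to flowering falls as yield rises (negative r), mirroring the direction of the associations discussed above.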

Variance component analysis

The phenotypic variance was greater than the other components studied, namely genotypic variance and genotype by environment variance, for all traits, similar to the earlier reports by Khan et al.8,25, indicating that trait expression is influenced by the environment. According to Sivasubramanian and Madhavamenon22, GCV and PCV values are deemed low when they range between 0 and 10%, moderate between 10 and 20%, and high above 20%. Because the coefficient of variation is independent of the measurement unit, it is more reliable when comparing trials25. Selection may be beneficial for traits showing a high GCV and a PCV only slightly above it, in order to isolate promising cultivars. Similar findings were reported in Bambara groundnut by Onwubiko et al.43, Gonné et al.44, and Naik45, and in groundnut by Ambros46. According to earlier findings, selection for the improvement of a specific trait may be successful when genetic variation is used effectively together with a high degree of heritability47,48. Broad-sense heritability is the fraction of the total phenotypic variation among individuals in a particular population that can be attributed to genetic variation. A high genotypic coefficient of variation coupled with high heritability and high genetic advance gives a better indication than any individual measure of the variance components49. According to Johnson et al.24, heritability is deemed low between 0 and 30%, moderate between 30 and 60%, and high above 60%. Breeders can use high heritability for certain traits to choose superior genotypes based on phenotypic observations43,48. 
Low heritability, on the other hand, denotes a small heritable component of variation and a greater influence of the environment on the expression of the trait; as a result, selection based on such traits is ineffective, according to Ridzuan et al.50. Jaiswal et al.51, however, emphasized that traits with high heritability do not always give a higher genetic advance; hence, high heritability coupled with high genetic advance gives a more reliable outcome. Furthermore, broad-sense heritability encompasses both additive and non-additive gene action; hence, heritability should be considered together with genetic advance when predicting the outcome of selecting superior genotypes52. Genetic advance is deemed low at values from 0 to 10%, moderate at 10–20%, and high at > 20%24. In our investigation, moderate to high heritability coupled with moderate to high genetic advance was identified for the yield components, which indicates medium to high environmental influence on these traits. A similar observation was reported by Masindeni26 and Khaliqi et al.9 in Bambara groundnut. However, for traits with low to medium heritability, improvement in the following generation may not give the expected results, since it has been proposed that non-additive gene action (i.e., epistasis) and the interaction between genotype and environment play a substantial role in the expression of such traits26. Based on the variability and heritability estimates, direct selection can improve traits such as hundred seed weight (g), harvest index, biomass fresh and dry weight (g), number of seeds, dry seed weight, and yield per hectare in Bambara groundnut. The high heritability and genetic advance observed for biomass fresh weight and biomass dry weight, which are representative traits for yield per hectare, are supported by Molosiwa49. 
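The quantities discussed above can be sketched in Python under the commonly used definitions: GCV = √σ²g / mean × 100, PCV = √σ²p / mean × 100, broad-sense heritability h²b = σ²g / σ²p × 100, and genetic advance GA = K × √σ²p × h²b, with K = 2.06 at 5% selection intensity. These formulas and the value of K are assumptions stated here for illustration, since the estimators used in this study are not restated in this section; the variance values are hypothetical. The classification thresholds follow the scales cited above.

```python
import math

# Standardized selection differential at 5% selection intensity (assumed).
K_5_PERCENT = 2.06


def classify(value, low_upper, moderate_upper):
    """Label a percentage as low, moderate, or high.

    How a value exactly on a threshold is labeled is a choice made here
    for illustration; the cited scales do not specify it.
    """
    if value < low_upper:
        return "low"
    if value <= moderate_upper:
        return "moderate"
    return "high"


def variance_summary(sigma2_g, sigma2_p, mean, k=K_5_PERCENT):
    """Compute GCV, PCV, heritability, and genetic advance (% of mean)."""
    gcv = math.sqrt(sigma2_g) / mean * 100
    pcv = math.sqrt(sigma2_p) / mean * 100
    h2 = sigma2_g / sigma2_p * 100
    ga = k * math.sqrt(sigma2_p) * (h2 / 100)
    ga_percent = ga / mean * 100
    return {
        "GCV": (gcv, classify(gcv, 10, 20)),
        "PCV": (pcv, classify(pcv, 10, 20)),
        "h2b": (h2, classify(h2, 30, 60)),
        "GA%": (ga_percent, classify(ga_percent, 10, 20)),
    }


# Hypothetical variance components and trait mean (not data from this study).
summary = variance_summary(sigma2_g=9.0, sigma2_p=12.25, mean=20.0)
for name, (value, label) in summary.items():
    print(f"{name}: {value:.2f}% ({label})")
```

Reporting the class label next to each estimate makes it easier to spot traits that combine high heritability with high genetic advance, which is the combination the text identifies as most useful for selection.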
We recorded a Shannon diversity index ranging from 2.25 to 2.34, indicating that the evaluated genotypes showed a substantial level of variation across environments. Similar findings were noted by Khan et al.7, who stated a typical range of 1.5 to 3.5 for the Shannon diversity index.
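For reference, the Shannon diversity index is commonly defined as H' = −Σ pᵢ ln pᵢ, where pᵢ is the proportion of observations in class i. The sketch below assumes this definition with the natural logarithm; how phenotypic classes were formed in this study is not restated here, and the counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity index H' = -sum(p_i * ln p_i) from class counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    # Empty classes contribute nothing to the index and are skipped.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Hypothetical number of accessions in each phenotypic class.
even_classes = [3] * 10                              # ten equally common classes
uneven_classes = [12, 6, 4, 3, 2, 1, 1, 1]           # dominated by one class

print(round(shannon_index(even_classes), 3))
print(round(shannon_index(uneven_classes), 3))
```

Under this definition, S equally frequent classes give H' = ln S, so an index of about 2.3 corresponds to roughly ten equally common classes.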

Clustering pattern and PCA analysis

Based on the heatmap, we observed that the genotypes in clusters I and II captured more of the red hue, in association with yield and its contributing traits under cluster I. Our findings are supported by Khan et al.7,8 in Bambara groundnut. Several analyses using various agglomeration approaches were tried to obtain the best possible classification of accessions. The Ward method appeared to be the agglomeration method producing the best results53. The constellation plot (Fig. 4B) arranges the accessions as endpoints and each cluster join as a new point, with lines drawn to represent membership. The wider the lines, the greater the distance between groups. The current clustering results are supported by previous research: Unigwe et al.40 and Bonny et al.54 found substantial variation in the morphological features of Bambara groundnut, as did Kumari et al.53 in maize and Doumbia et al.55 in cowpea. In the PCA sample loading, all genotypes except accessions S5G12, S5G24, S5G19, S5G16, S5G23, S5G22, and S5G29 were placed in the positive part of the PCA plot. These findings are consistent with the reports of Khan et al.7,8. The goal of principal component analysis is to summarize the total variance in a group of traits through components that successively account for the most variability in the data50. In general, traits are inter-correlated to various degrees, so not all of the principal components are needed to summarise the data properly. In any PCA, the first axis (PC1) explains the greatest proportion of the overall variance56. Shegro et al.57 used PCA to categorize 20 Bambara groundnut accessions, whereas Mohammed58 found that PC1 and PC2 contributed 19% and 14% of the overall variation, respectively.
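The share of variation attributed to each principal component follows from the eigenvalues of the correlation (or covariance) matrix: each component's share is its eigenvalue divided by the sum of all eigenvalues. The Python sketch below illustrates this calculation with hypothetical eigenvalues; it does not reproduce the PCA of this study.

```python
def explained_variance(eigenvalues):
    """Return per-component and cumulative variance shares (in percent).

    Components are ordered by decreasing eigenvalue, so the first entry
    corresponds to PC1.
    """
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must sum to a positive number")
    ordered = sorted(eigenvalues, reverse=True)
    shares = [100 * ev / total for ev in ordered]
    cumulative = []
    running = 0.0
    for share in shares:
        running += share
        cumulative.append(running)
    return shares, cumulative


# Hypothetical eigenvalues for five components (not from this study).
shares, cumulative = explained_variance([4.0, 2.0, 1.0, 0.5, 0.5])
for i, (share, cum) in enumerate(zip(shares, cumulative), start=1):
    print(f"PC{i}: {share:.2f}% (cumulative {cum:.2f}%)")
```

Because the components are sorted by eigenvalue, PC1 always carries the largest share, and the cumulative column shows how many components are needed to summarise a chosen fraction of the variation.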

Conclusion

The combined analysis of variance indicated that genotype (G), environment (E), and the genotype by environment (G × E) interaction showed highly significant variation in vegetative, yield, and yield component traits. According to the mean comparison results, Bambara groundnut genotypes G1, G3, G5, G6, G8, G7, G2, G4, G10, G13, G11, and G14 were the closest to the ideal genotype, with superior yield across environments. These genotypes were grouped into similar clusters by Ward's hierarchical clustering method, which assembled the accessions into five distinct clusters. For the pooled data, the PCA accounted for 49.01% of the variation, contributed by PC1 (33.87%) and PC2 (15.13%). Farmers will obtain high yields and steady income if better genotypes with the capacity to give consistently high yield over different conditions are identified and certified for cropping. According to the results above, the planting materials have a sufficient level of genetic variation, which indicates that there is enough diversity to be exploited by selection. Higher GCV, broad-sense heritability, and genetic advance were shown by several yield component traits, particularly dry pod weight, hundred seed weight, biomass fresh and dry weight, dry seed weight, harvest index, and total number of pods, which significantly affect yield. As a result, they would respond to positive selection. In determining yield and yield components, the environment played a greater role than genotype, and the G × E interaction accounted for significant variation, which complicates genotype selection; additional statistical analysis is therefore necessary to estimate the stability of each genotype across all environments. The application of stability statistics is recommended to analyze genotype stability, since these reveal several features of the G × E interaction and allow the detection of stable genotypes across environments. 
Moreover, statistical techniques such as univariate and multivariate analysis can be more informative in describing and understanding the G × E interaction alongside variance component analysis. Overall, these results will assist plant breeders in improving this crop and in selecting superior lines for future breeding programs.