Exploring physicochemical and cytogenomic diversity of African cowpea and common bean

In sub-Saharan Africa, grain legumes (pulses) are essential food sources and play an important role in sustainable agriculture. Among the major pulse crops, the native cowpea (Vigna unguiculata) and introduced common bean (Phaseolus vulgaris) stand out. This paper has two main goals. First, we provide a comprehensive view of the available genetic resources of these genera in Africa, including data on germplasm collections and mapping biodiversity-rich areas. Second, we investigate patterns of physicochemical and cytogenomic variation across Africa to explore the geographical structuring of variation between native and introduced beans. Our results revealed that 73 Vigna and 5 Phaseolus species occur in tropical regions of Africa, with 8 countries accounting for more than 20 native species. Conversely, germplasm collections are poorly represented when compared to the worldwide collections. Regarding the nuclear DNA content, on average, V. unguiculata presents significantly higher values than P. vulgaris. Also, V. unguiculata is enriched in B, Mg, S, and Zn, while P. vulgaris has more Fe, Ca, and Cu. Overall, our study suggests that the physicochemical and cytogenomic diversity of native Vigna species is higher than previously thought, representing valuable food resources to reduce food insecurity and hunger, particularly of people living in African developing countries.


Results
Species diversity in Africa. Our study identified 73 Vigna species and 5 Phaseolus species occurring in Africa (Fig. 1a), representing 69.5% and 5.6%, respectively, of the total number of species currently accepted. The diversity of Vigna is much higher, and most species are native to Africa (63 species) (Supplementary Table S1). Thirteen of them are endemic, occurring in only one country; for instance, V. angivensis, V. bosseri, V. keraudrenii, and V. microsperma occur in Madagascar; V. mendesii and V. ramanniana in Angola; and V. monantha and V. somaliensis in Somalia. Among native Vigna, six are cultivated by African populations (namely, V. subterranea, V. parkeri, V. unguiculata, V. marina, V. luteola and V. vexillata), mainly for human consumption but also for animal fodder and medicinal uses. The introduced species of Vigna are native to Asia (seven species) and America (three species), and most of them (six species) are cultivated.
The genus Phaseolus has low species diversity in Africa, and four out of the five species identified originate from America. These species are extensively cultivated in Africa mainly for human and animal consumption. Only one species, P. massaiensis, is native to Africa (Supplementary Table S1) and is restricted to Tanzania.

Conservation status and germplasm collections. The information available at the IUCN Red List 34 revealed that 61.9% of the native species were already evaluated according to the IUCN criteria and categories: one species, V. dolomitica, is classified as Critically Endangered, four species as Endangered, one species as Near Threatened, three species as Data Deficient, and 30 species as Least Concern. Twenty-four native species remain unevaluated, including P. massaiensis (Fig. 1b).
The analysis of the accessions available worldwide reveals that only 50 of the 78 studied species (64.1%) are preserved in gene banks (Supplementary Table S1). The species P. vulgaris, V. unguiculata and the introduced V. radiata have the highest numbers of accessions worldwide, ca. 136,000, 40,000 and 16,000, respectively. Six species (7.7%) have 1000-10,000 accessions, 10 species (12.8%) have 100-1000, 15 species (19.2%) have 10-100, and 16 species have fewer than 10.
To explore Angola's crop diversity conserved in gene banks, we consulted the available data in the Genesys database 35, which contains 530 accessions of five species, with around 95% of them belonging to P. vulgaris (290) and V. unguiculata (217). However, most of them are stored at the SADC Plant Genetic Resources Centre in Zambia, and none have been reported in Angolan gene banks or in other local institutions. In Mozambique, 58 accessions were collected (i.e., 46 of V. unguiculata, 8 of V. radiata, and the remainder belonging to three other species). Like Angola, Mozambique does not have accessions registered in national gene banks, and most are stored at the International Institute of Tropical Agriculture in Nigeria. Cabo Verde has only two accessions (P. lunatus and P. vulgaris) gathered in the country, and both are stored at the Portuguese Plant Germplasm Bank. More details are provided in Supplementary Table S1.
Biodiversity-rich areas in Africa. The highest number of Vigna species (native and introduced) was found in Central Africa, with a maximum of 26 species per cell (200 × 200 km) in the Democratic Republic of Congo (Fig. 2a). Other areas in West and East Africa, such as Togo, Benin, Nigeria, Cameroon, Tanzania, Malawi, and Zambia, are also very rich in Vigna species. Southern Africa presents low richness values, with a maximum of 9 species found in a cell. In Africa, the diversity of Phaseolus species reaches a maximum of 3 species per cell (Fig. 2b), which is found in Cameroon, Ethiopia, eastern Democratic Republic of Congo, Rwanda, Burundi, Zambia and Malawi. The number of records from Southern Africa, Namibia and Angola is very low for this genus. The Northern African region presents the lowest species richness for both genera.
In terms of native species, four countries stand out for hosting a high number of Vigna species (Fig. 3): Democratic Republic of Congo with 32 species, Tanzania with 30 species, Zambia with 28 species, and Cameroon with 27 species. Burundi, Angola, Nigeria, and Malawi also host great diversity, with more than 20 native Vigna species each. These results reflect the data available in herbaria and museums worldwide, which are affected by the collection effort in each region over time.
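The per-cell richness values reported above amount to counting the distinct species among the occurrence records that fall inside each grid cell. The following is a minimal sketch of that counting step, assuming the records have already been projected to kilometre coordinates; the function name `richness_per_cell` and the demo records are hypothetical, and this is not the GIS workflow used in the study.

```python
from collections import defaultdict
from math import floor

CELL_KM = 200  # cell edge length used in the text (200 x 200 km)

def richness_per_cell(records, cell_km=CELL_KM):
    """Count distinct species per grid cell.

    records: iterable of (species, x_km, y_km) tuples, with coordinates
    assumed to be already projected to kilometres (hypothetical input).
    """
    species_by_cell = defaultdict(set)
    for species, x_km, y_km in records:
        cell = (floor(x_km / cell_km), floor(y_km / cell_km))
        species_by_cell[cell].add(species)  # repeated records collapse in the set
    return {cell: len(names) for cell, names in species_by_cell.items()}

# Made-up occurrence records, for illustration only.
demo_records = [
    ("Vigna unguiculata", 10.0, 15.0),
    ("Vigna unguiculata", 50.0, 120.0),   # same cell, same species
    ("Vigna subterranea", 199.0, 1.0),    # same cell, second species
    ("Phaseolus vulgaris", 250.0, 30.0),  # neighbouring cell
]
richness = richness_per_cell(demo_records)
```

Because the count depends only on which records exist, cells with little collecting effort will show low richness regardless of the true flora, which is the caveat raised above.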
Physicochemical characterization of P. vulgaris and V. unguiculata. Physical characterization. Our results showed that the studied seeds of V. unguiculata and P. vulgaris presented a varied array of colors and seed shapes.
Vigna unguiculata samples are mostly globose (94%), and most of them (82%) presented brown seeds. The hilum was always white, while the color of the rim varied between black and brown and was rarely yellow (MP04Vu). Seed length in V. unguiculata varied from 5.5 mm (UI31Vu) to 10.3 mm (CC27Vu), width ranged from 4.7 mm (UI31Vu) to 8.2 mm (MP34Vu), and height ranged between 3.2 mm (UI31Vu) and 5.9 mm (MP34Vu).
Based on a correlation matrix, a heatmap was constructed (Fig. 4) using Euclidean distances and the UPGMA method, in which the columns are the clusters of minerals and the rows are the clusters of African bean accessions. In the plotted grid, boxes for each factor combination are encoded by dark red colors for the highest values and dark blue for the lowest values of correlation between mineral contents and accessions. The heatmap also plots a dendrogram from a cluster analysis showing the hierarchy of values for both accessions and mineral traits. Four groups of minerals could be discriminated in Fig. 4: Group A encompasses B, Mg, S, and Zn; Group B is composed of Mn and Na; Group C includes Cu, Fe, and Ca; and Group D is defined by P and K. Figure 4 also reveals two clusters of accessions corresponding to the two distinct species. Cluster 1 comprises the 17 V. unguiculata accessions and is associated with a higher content of the Group A minerals (B, Mg, S, and Zn). Cluster 2 comprises the 21 P. vulgaris accessions and is associated with a higher content of the Group C minerals (Cu, Fe, and Ca). Groups B and D did not exhibit a clear pattern for Clusters 1 and 2. Furthermore, a comparative analysis between the mineral content of the two studied species and the geographical origin of accessions did not reveal a clear pattern (Fig. 5). Nonetheless, Cabo Verde presented extreme values in P. vulgaris for the majority of minerals (B, Ca, Cu, Fe, K, Mn, Zn) (Fig. 5a), while V. unguiculata showed less variation between geographical origins (Fig. 5b).
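The grouping of accessions described above rests on three steps: standardizing each mineral to mean 0 and standard deviation 1, computing Euclidean distances, and merging clusters by average linkage (UPGMA). A small pure-Python sketch of these steps follows. The accession labels and mineral values are invented for illustration, and the naive merge loop is a stand-in for the R routines used in the study, not a reproduction of them.

```python
from itertools import combinations
from math import dist
from statistics import mean, stdev

def standardize(rows):
    """Scale each column to mean 0 and sample standard deviation 1."""
    columns = list(zip(*rows))
    stats = [(mean(col), stdev(col)) for col in columns]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]

def upgma(points, labels, n_clusters):
    """Merge the closest pair of clusters (average linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def avg_distance(a, b):
        # UPGMA distance: mean Euclidean distance over all cross-cluster pairs
        return mean(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: avg_distance(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return [sorted(labels[i] for i in cluster) for cluster in clusters]

# Hypothetical accessions with two invented mineral columns (Zn, Fe; mg/kg).
demo_labels = ["Vu-a", "Vu-b", "Pv-a", "Pv-b"]
demo_minerals = [[40.0, 50.0], [42.0, 52.0], [28.0, 70.0], [30.0, 72.0]]

scaled = standardize(demo_minerals)
groups = upgma(scaled, demo_labels, n_clusters=2)
```

Standardizing first keeps minerals present at high concentrations (such as K) from dominating the Euclidean distance over trace elements.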

Discussion
Our study reveals a much greater diversity of Vigna species in Africa (73 species, 63 native) than of Phaseolus (5 species, 1 native), as expected 14,36. This great disparity is explained by the original centers of diversity of the two genera: Vigna has its centers in Africa and Asia 14,37, while Phaseolus has America as its center 38. Only one Phaseolus species (P. massaiensis) 9 is native to Africa; it is poorly studied and occurs only in Tanzania. The other Phaseolus species (P. vulgaris, P. lunatus, P. acutifolius, and P. coccineus) are native to America and grown in Africa, where they were probably introduced during the sixteenth century as a food source 39.
Based on specimen occurrence data, we found that the areas of the highest diversity of Vigna species, located in Western and Eastern Africa, were mainly associated with regions with a tropical climate. Our results are consistent with previous studies 14,16,40, which identify these areas as significant biodiversity hotspots of African flora and specifically of the genus Vigna. Eight countries (i.e. Democratic Republic of the Congo, Tanzania, Zambia, Cameroon, Burundi, Angola, Nigeria, and Malawi) present more than 20 native species, highlighting their central role in the conservation of African Vigna diversity.
The conservation status has been assessed over the last few years, but only 60.9% (39 of the 64 native species) have been evaluated according to the IUCN criteria 34. Among them, six are classified in the threat categories (one critically endangered and five endangered) and one as near threatened, but this number is likely to increase in the near future, because the pressure on natural resources continues to increase in most African countries, driven by economic and population growth as well as climate change 41,42. Currently, the specific threats to African native pulse species are not well known 14, but some studies focusing on the conservation of native biodiversity of African legumes 41-43 identified deforestation, agriculture and urbanization, harvesting for medicinal uses, fires, and invasive species as the main threats. Three species are classified as Data Deficient and 25 remain unevaluated, revealing the lack of knowledge about them. The evaluation of the conservation status, as well as the study of their threats, distribution and ecology, based upon extensive fieldwork, is a required step to protect native species 16,44,45. Of particular concern are V. monantha, V. bosseri and V. keraudrenii, which are classified as endangered and are not preserved in gene banks.
Despite the recognized importance of African native pulses, the information available on ex situ conservation, and on the germplasm collections preserved in national institutions of the three studied countries (Angola, Mozambique and Cabo Verde), is still very limited. Nevertheless, the role of the International Institute of Tropical Agriculture (IITA) 46 in Africa must be highlighted: it keeps collections of both crops and non-crops, within (in situ) or outside (ex situ) their natural environment. IITA's gene bank holds over 28,000 accessions of plant material of major African food crops and maintains the world's largest assemblage of cowpea germplasm collections, with over 15,000 accessions 47. As stated above, Angola, Mozambique and Cabo Verde revealed a high genetic diversity; however, no accessions of the studied species were reported in the national gene banks. Our results agree with previous studies, which recognized the lack of ex situ conservation strategies for Crop Wild Relatives native to Angola 16 and Cabo Verde 48,49. As the genera Vigna and Phaseolus include several cultivated species and their Crop Wild Relatives (i.e., other taxa with a close genetic relationship to cultivated species), their conservation is extremely important to guarantee the supply of new genetic material, which is essential for future crop improvement.
The Vigna genus includes 13 species cultivated in Africa, which constitute an essential source of nutrients for humans and domestic animals. Most of these species (7) are native to Asia and have only recently been introduced to Africa 14 . However, the cultivation of Vigna in Africa mainly focuses on two native species, V. unguiculata and V. subterranea.
Vigna unguiculata is one of the most important indigenous grain legumes of sub-Saharan Africa, having been first domesticated in Northeast Africa, and secondarily in West Africa and in the Indian sub-continent 15,50. Cowpea matures earlier than cereals, becoming an important source of income for the rural population before maize, millets and cassava are harvested 14,24. It is widely cultivated in almost all African countries and was probably introduced around 300 BC in Europe, 200 BC in India, and in the seventeenth century to tropical America by the Spanish 9.
Mineral content analyses revealed that V. unguiculata and P. vulgaris are excellent sources of minerals. Vigna unguiculata had, on average, a higher content of B, Mg, S, and Zn, while P. vulgaris had more Fe, Ca, and Cu. Fe content ranged from 49.0 mg/kg (MA21Pv) to 79.5 mg/kg (CV49Pv) in P. vulgaris and from 39.1 mg/kg (MP04Vu) to 78.0 mg/kg (HI25Vu) in V. unguiculata. For P. vulgaris these values were lower than the Fe contents reported by Di Bella et al. 51 and Gelin et al. 52, which were on average 86.2 mg/kg and 86.9 mg/kg, respectively, whilst for V. unguiculata the contents agreed with Dakora and Belane's study 53, where the obtained values ranged from 61 mg/kg to 67 mg/kg. Moreover, Yeken et al. 54 found much higher values of Cu, Zn, Mn, and Fe in biofortified P. vulgaris but lower values of Ca and Mg, for which our accessions show higher content. However, Gondwe et al. 55, focusing on V. unguiculata from Swaziland, found much lower contents of Ca, Fe, and Zn than those found in our study, even for improved varieties. Content variation within the same species can be explained by several interrelated reasons: (i) the nutritional content of legumes can vary greatly and is highly dependent on soil fertility, which directly influences the supply and availability of most nutrient elements 56; (ii) the nutritional content varies among varieties and genotypes 46,57; and (iii) the nutritional content is influenced by the preharvest conditions of the plant, the maturity of the edible product at harvest, and postharvest handling and storage conditions 58. Additionally, three different aspects account for the content of micronutrients in the seeds, namely micronutrient uptake, transport and remobilization from other parts of the plant, and storage capacity 56. The mechanisms and proteins involved in all these processes differ for each species due to different evolutionary processes.
Our results highlight the importance of P. vulgaris and V. unguiculata as sources of essential micronutrients. Chronic deficiency of essential vitamins and minerals is a scourge affecting more than two billion people globally and accounts for approximately 7% of the global disease burden 59. This form of undernutrition is known as 'hidden hunger' and occurs due to an insufficient intake and absorption of essential vitamins and minerals 60. Deficiencies in zinc, iron, and iodine have the strongest negative impact on public health, but other minerals, such as calcium, selenium, magnesium, and fluoride, also contribute to the health burden 61. Research carried out by Muthayya et al. 62, which calculated the Hidden Hunger Index (HHI-PD) for preschool-age children, revealed that sub-Saharan Africa was the most affected region, with high rates of anemia due to iron and vitamin-A deficiency. Moreover, a study by Joy et al. 63 identified the top three mineral deficiencies, Ca being the highest, followed by Zn and I (iodine). Regarding Fe deficiency, recent studies have noted an increasing percentage over the last 5 years, from 24.8% 64 to 54% 65. Therefore, the inclusion of grain beans in the diets of these populations will aid the fight against food insecurity and several other forms of malnutrition.
Comparing the mean values with the nutrient reference values (NRVs) for the daily intake of minerals by adults reported by the European Union 66, V. unguiculata and P. vulgaris appear to be excellent mineral sources. A usual intake of 100 g per day is enough to reach about 50% of the NRV for K (V. unguiculata, 49.0%; P. vulgaris, 49.5%), more than 50% for P (V. unguiculata, 59.3%; P. vulgaris, 58.9%), and, for Mg, 54.5% (V. unguiculata) and 45.5% (P. vulgaris).
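The NRV percentages above follow from simple arithmetic: the mineral supplied by one portion divided by the daily reference value. A hedged sketch is given below. The NRV figures are assumed here from Annex XIII of Regulation (EU) No 1169/2011 and should be checked against reference 66; the 100 g portion, the function name `percent_nrv`, and the example potassium content are illustrative assumptions, not data from this study.

```python
# EU nutrient reference values for adults, in mg/day. Assumed here from
# Regulation (EU) No 1169/2011, Annex XIII; verify against reference 66.
NRV_MG = {"K": 2000.0, "P": 700.0, "Mg": 375.0}

def percent_nrv(mineral, content_mg_per_kg, portion_g=100.0):
    """Share of the daily NRV supplied by one portion of seeds.

    content_mg_per_kg: mineral content of the seeds (mg per kg).
    portion_g: portion size in grams (100 g is an assumption).
    """
    intake_mg = content_mg_per_kg * portion_g / 1000.0  # mg in the portion
    return intake_mg * 100.0 / NRV_MG[mineral]

# Hypothetical seed potassium content of 9800 mg/kg, for illustration only.
k_share = percent_nrv("K", 9800.0)
```

With these assumed inputs the potassium share comes out close to the 49% reported for V. unguiculata, which is the order of magnitude expected for a 100 g portion of pulses.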
Regarding cytogenomic diversity, our study reveals a higher genome size variation in V. unguiculata (1414.7 ± 86.2 Mbp) than in P. vulgaris (1337.4 ± 33.3 Mbp). However, our results do not reveal a correlation between the genome size variation and the studied locations (i.e. Eastern region of Africa: Mozambique; Western region of Africa: Angola; and the Northeastern Atlantic Ocean: Cabo Verde Islands). In particular, very small intraspecific variation in nuclear DNA content was found among P. vulgaris samples collected from different regions and different habitats (e.g. samples collected in humid zones of Malanje, and others in semi-arid zones of Cunene, close to the Kalahari Desert; see Supplementary Table S2). This species is introduced and more domesticated than V. unguiculata, a pattern also observed in other crops, such as Daucus species, as demonstrated by Nowicka et al. 67. Guilengue et al. 68 found a 9.2% genome size variability among a tarwi (Lupinus mutabilis Sweet) collection, an Andean crop that is believed to have been domesticated ca. 2600 years ago. The genome size variability of the P. vulgaris accessions analyzed in the present study is 7.8%, suggesting that P. vulgaris might have undergone selective pressures within the studied African countries associated with the domestication process. On the other hand, the V. unguiculata accessions analyzed in this study present a 17.6% genome size variability, showing a higher diversity and illustrating the native nature of this crop in Africa.
Crop genetic diversity is highly dynamic. Differences could be related to several factors, such as the agroecological conditions of cultivation and farmers' preferences in seed selection and the management of seed lots, among others, which have important effects on the chemical composition of the grains and on population genetics 69. Moreover, climate and growth conditions may act as mutagenic factors, causing structural changes in chromosomes (adding and deleting regions), which result in differences in DNA among different samples of the same species 70. Comparative molecular cytogenetic analysis of native and introduced lineages could provide important information on possible karyotypic reorganization and evolution based upon regional environmental stresses. Modern plant-breeding and crop improvement programmes must be conducted following a holistic methodology, and research focusing on only one aspect of a crop will not provide enough information to decide which breeding technology may be more suitable. An initial step of plant genome mapping is the accurate estimation of nuclear genome sizes 71,72. Nutrient content and genome size are known to be connected to life form, with geophytes being among the extreme examples of large genomes, linked, among other factors, to nutrient storage capacity, while genome size is usually reduced in annual species such as therophytes 73. The life strategy of a species and a particular environment may limit genome size increase, such as in improved polyploids, or the genome size may cause a species to adopt specific life strategies or colonize certain environments 73. A priori information on distribution, genome size, and mineral storage capacity is thus fundamental, allowing breeding companies and governmental agricultural stakeholders to determine the applicability and interest of implementing an improvement or simple breeding programme.

Final remarks
Based on extensive research, our study exposes the great diversity of native and introduced Vigna and Phaseolus species in Africa. Our results highlight the importance of V. unguiculata and P. vulgaris as sources of essential nutrients, making them a vital resource for the poorest populations. If complemented with further investigation on genetic traits, this study may also be relevant to increase the efficiency of breeding programs for both commercial production and smallholder farming systems in Africa.
Thus, devising holistic strategies (e.g., combining socio-economy, ecogeography, in situ and ex situ conservation, and nutrigenomics) to generate metadata will be essential to achieve the sustainable use of understudied and overused plant resources in sub-Saharan Africa. We believe that this is a key point for further inquiry to respond to some issues raised by the Sustainable Development Goals, such as increasing nutritional wellbeing and food security, and reducing hunger and poverty.
Finally, our study highlights the lack of taxonomic, genetic and ecogeographic knowledge of native Vigna and Phaseolus, and that greater efforts should be made to improve the in situ and ex situ conservation of these species, especially the more restricted ones. The coordination of national, regional and international conservation strategies is important to ensure the preservation of native edible African flora.

Species diversity, distribution and conservation in Africa. Initial identification of [...] 14,36,81-83 were consulted. With the information collected, we constructed a comprehensive dataset that included the scientific name of each taxon, English common names, native status in Africa, native distribution, conservation category according to the IUCN Red List 34, and ex situ conservation data.
The geographical distribution of Vigna and Phaseolus species in Africa was estimated based on occurrence data available on the GBIF database 84 [...]

Sampling of P. vulgaris and V. unguiculata. A total of 38 accessions (Supplementary Table S2) of the most cultivated species, P. vulgaris (n = 21) and V. unguiculata (n = 17), were collected between 2018 and 2019 in Angola (20), Mozambique (13), and Cabo Verde (5) (Supplementary Fig. S3). Each sample was divided into two parts: one was used for the chemical analysis (n = 38) and the other was cultivated to obtain developed and mature leaves for flow cytometric analysis (n = 33). Cultivation took place between June 2019 and February 2020, in a greenhouse under controlled conditions (i.e., using the same kind of well-drained fertile soil in pots of equal dimensions, at an optimal germination temperature of approximately 25 °C) at Instituto Superior de Agronomia of the University of Lisbon (ISA/UL), as shown in Fig. 7. The germination experiments were performed in triplicate for each accession. Herbarium vouchers were deposited in the Herbarium LISI, and the identification of plant specimens was done by some of the authors (SC, MB and MR). Permissions to collect all the seeds used in this study were obtained. This study complies with local and national regulations.
[...] seed phenotypic traits was performed (i.e. quantitative: length, width, and height; and qualitative: shape, color and hilum). For each bean accession, seed length, width, and height were obtained for 10 seeds by using a digital calliper with an accuracy of 0.01 mm. Specifically, seed length was measured from the base to the tip, while seed width was measured from the hilum to the opposite side. The mean values were recorded in millimeters. The shape and color of the bean, and the description of the hilum, were obtained through direct observation of a randomized pool of 10 seeds.
Mineral analysis. Mineral contents (Na, K, Ca, Mg, P, S, Fe, Cu, Zn, Mn, B) in P. vulgaris and V. unguiculata seeds were determined by inductively coupled plasma optical emission spectrometry (ICP-OES). Grain beans were ground into a powder with a stainless-steel grinder (Kunft coffee mill). A 0.3 g portion of bean powder from each sample was weighed, digested with a mixture of nitric acid and hydrochloric acid (1:3, v/v) at 105 °C for 90 min, and analyzed using the Thermo Scientific iCAP 7000 Series ICP-OES spectrometer (Thermo Scientific, Cambridge, UK). Procedural blanks were obtained following the above-mentioned protocol. Calibration curves of five different concentrations were applied to calculate the concentration of each mineral. Standard solutions of minerals for ICP-OES were from Panreac (Barcelona, Spain), and all the water used was purified using a Milli-Q water system (Millipore, Bedford, MA, USA). All the measurements were carried out in triplicate, and the results are expressed in mg per kg wet weight.
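In its simplest form, turning an emission signal into a concentration with a five-point calibration curve is a linear least-squares fit followed by inversion of the fitted line. The sketch below assumes a linear detector response and uses invented standards and intensities. The digest volume (`DIGEST_VOLUME_L`) is a hypothetical value, since the text does not state the final volume, and the instrument software's actual fitting options are not described.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def solution_concentration(signal, slope, intercept):
    """Invert the calibration line: signal -> concentration in the digest (mg/L)."""
    return (signal - intercept) / slope

def content_mg_per_kg(conc_mg_per_l, volume_l, sample_mass_g):
    """Convert the digest concentration to mineral content of the sample (mg/kg)."""
    return conc_mg_per_l * volume_l / (sample_mass_g / 1000.0)

# Five invented calibration standards (mg/L) and emission intensities.
standards = [0.0, 1.0, 2.0, 5.0, 10.0]
intensities = [5.0, 105.0, 205.0, 505.0, 1005.0]
slope, intercept = fit_line(standards, intensities)

SAMPLE_MASS_G = 0.3      # portion weighed per sample, as stated in the text
DIGEST_VOLUME_L = 0.025  # hypothetical final volume; not given in the text

conc = solution_concentration(305.0, slope, intercept)
content = content_mg_per_kg(conc, DIGEST_VOLUME_L, SAMPLE_MASS_G)
```

In practice the procedural blank reading would be handled before this step, and each triplicate measurement would be converted separately before averaging.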
Nuclear DNA content estimation. For each cultivated sample (n = 33), young leaves in healthy conditions were randomly collected and immediately analysed in the laboratory. Nuclear DNA content was measured by flow cytometry using Solanum lycopersicum 'Stupické' [2C = 1.96 pg] 88 as DNA standard. Each sample was chopped with a razor blade along with the standard in the presence of 1 mL of buffer 89. The nuclear suspension obtained was filtered using a 30 μm nylon filter to separate cells from plant debris. After filtration, 50 μg/mL of propidium iodide (PI; Sigma-Aldrich) and 50 μg/mL of RNase (Sigma-Aldrich) were added to stain DNA and prevent staining of double-stranded RNA, respectively. The samples were maintained at room temperature and analyzed using a CyFlow Space flow cytometer (Sysmex, Norderstedt, Germany) as previously described by Guilengue et al. 68. The reproducibility of the results was assessed using five independent replicates for each accession. FloMax software v2.4d (Sysmex) was used to measure nuclear DNA content, and three graphics were generated from data measurement: fluorescence pulse integral in linear scale (FL); fluorescence pulse integral in linear scale versus time; and fluorescence pulse integral in linear scale versus side light scatter in logarithmic scale (SSC). The absolute DNA amount of a sample was calculated based on the values of the G1 peak means, as suggested by Doležel and Bartoš 90:
$$\text{Sample 2C DNA content} = \frac{\text{sample G1 peak mean}}{\text{standard G1 peak mean}} \times \text{standard 2C DNA content}$$
The results generated from 2C DNA (in picograms) were transformed to million base pairs using the following conversion: 1 pg = 978 Mbp 91. The coefficient of variation (CV, %) of G1 peaks in the FL histograms, and estimates of the CV of the genome size of each accession, were used to assess the reliability of the results.
Statistical analysis. All data measurements are presented as mean values. In order to compare morphometric measurements, mineral content, and genome size across the accessions, a univariate analysis (UA) was performed. The test of means was carried out using the Scott-Knott test for all variables at 5% significance with the ScottKnott package 92. To perform a multivariate analysis, the mineral content data were standardized (mean = 0 and standard deviation = 1). Cluster analysis was performed based on Euclidean distance and the average method for the 38 accessions (heatmap function). Results of the cluster analysis were visualized with the ggplot function of the ggplot2 package 93. All analyses were performed in RStudio version 1.1.456 94.
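The 2C calculation and the picogram-to-Mbp conversion given under nuclear DNA content estimation can be sketched in a few lines. The standard's 2C value (1.96 pg) and the factor 978 Mbp per pg are taken from the text; the function names and the G1 peak means and replicate values in the demo are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean, stdev

STANDARD_2C_PG = 1.96  # Solanum lycopersicum 'Stupické', as given in the text
MBP_PER_PG = 978       # conversion factor given in the text (1 pg = 978 Mbp)

def sample_2c_pg(sample_g1_mean, standard_g1_mean, standard_2c_pg=STANDARD_2C_PG):
    """2C DNA content (pg) from the ratio of sample to standard G1 peak means."""
    return sample_g1_mean / standard_g1_mean * standard_2c_pg

def pg_to_mbp(dna_pg):
    """Convert a DNA amount in picograms to million base pairs."""
    return dna_pg * MBP_PER_PG

def cv_percent(values):
    """Coefficient of variation: sample standard deviation / mean x 100."""
    return stdev(values) / mean(values) * 100.0

# Invented G1 peak means (fluorescence channel units), for illustration only.
demo_pg = sample_2c_pg(sample_g1_mean=150.0, standard_g1_mean=200.0)
demo_mbp = pg_to_mbp(demo_pg)

# Invented 2C estimates (pg) from five replicates of one accession.
replicate_cv = cv_percent([1.45, 1.47, 1.46, 1.48, 1.44])
```

Because the sample and the standard are chopped and stained together, the ratio of peak means cancels out run-to-run differences in staining and instrument gain, which is why the estimate is expressed relative to the standard.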