Introduction

Balanites aegyptiaca Del., a member of the family Zygophyllaceae commonly known as the desert date tree, is a long-lived, spiny dicotyledonous tree up to 10 m tall that is widely distributed in arid and semi-arid regions of Africa and South Asia1,2,3. In Mauritania, B. aegyptiaca is a xerophytic tree found throughout the country, in both the Sahelian and the Saharan zones. The desert date is a multipurpose tree that offers protection against desertification and provides food for humans and animals, in addition to many medicinal uses2,3. For instance, the fruit of B. aegyptiaca (locally known as Tooga) is used in traditional medicine to treat diabetes, asthma, epilepsy, malaria and other conditions. Moreover, the seed, or kernel, of the B. aegyptiaca fruit is very rich in oil (46.0–54.7%), particularly unsaturated fatty acids (up to 75% of the total fatty acids), and in protein (26.1–34.3%)1,4. Some authors have successfully tested B. aegyptiaca oil for biodiesel production and showed that it can serve as an alternative to diesel5,6.

Owing to its extreme resistance to drought and its diverse natural regeneration strategies (seeds, suckers and rejuvenation), B. aegyptiaca has recently been identified as one of the native plants for restoring the degraded Sahelian ecosystem in the so-called Great Green Wall (GGW) project7. This project aims to create a 15-km-wide (north–south) green belt of trees south of the Sahara Desert across more than 11 African countries (from Senegal to Djibouti)8,9.

With two-thirds of its surface area lying in the Sahara Desert, Mauritania, like many other Sahelian countries, faced a severe drought during the 1970s–1980s that had huge socio-economic and environmental impacts (massive rural exodus, food insecurity and desertification). The limited biodiversity that naturally characterizes arid and desert regions was significantly reduced by these conditions.

Studying genetic variability is of great relevance to plant genetic resource management programs. The information it provides makes it possible to identify genotypes of interest and to use them in establishing effective conservation strategies.

Today, molecular markers are considered far more suitable for analyzing genetic diversity than morphological and biochemical traits because they segregate as single genes and are not affected by the environment10. A vast array of molecular markers, including restriction fragment length polymorphism (RFLP), randomly amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), inter simple sequence repeats (ISSR) and microsatellites or simple sequence repeats (SSR), has been used to assess genetic diversity in plant species from different parts of the world11,12,13,14,15. ISSRs are segments of DNA flanked on both sides by microsatellites, which are short DNA motifs 2–5 nucleotides long repeated multiple times16. PCR-ISSR is a quick, cost-effective and highly reproducible molecular technique based on PCR amplification of such multilocus inter-microsatellite sequences. ISSRs have proved useful in the analysis of genetic variation below the species level, mainly in studying population structure and differentiation17. Data on the genetic diversity of B. aegyptiaca germplasm worldwide are very limited. Domyati et al.29 assessed genetic diversity in 7 medicinal plants, including B. aegyptiaca, using ISSR, RAPD and AFLP markers. They showed that ISSR markers revealed higher genetic diversity in B. aegyptiaca than the other markers (RAPD, AFLP). Moreover, AFLP markers have been successfully used to assess genetic diversity among B. aegyptiaca collected from different geographical regions18.

In the present study, we evaluated genetic diversity among and within B. aegyptiaca populations collected from different bioclimatic zones in Mauritania using ISSR markers. The main objective was to provide baseline molecular information on this economically and ecologically important neglected crop, in support of better conservation and restoration programs for the Sahelian ecosystem.

Results

Polymorphism of the ISSR markers

The 16 ISSR primers used generated 505 polymorphic bands ranging in size from 40 to 3369 bp. The primer BTH3 [(AG)8TC] produced the lowest number of bands (21), whereas the primer BTH2 [(AG)8T] produced the highest (42), with an average of 31.5 fragments per primer (Table 1). The PIC values ranged from 0.13 to 0.29 with an average of 0.23. The marker index (MI) and resolving power (Rp) averaged 7.28 (range 3.3–10.3) and 9.9 (range 4.5–14.6), respectively.

Table 1 ISSR primers used and their respective marker performance indexes.

At the population level, the number of observed alleles (Na) ranged from 0.62 to 1.39 with an average of 1.12, while the number of effective alleles (Ne) ranged from 1.13 to 1.23 with an average of 1.2 (Table 2). The number of private alleles (Pa) ranged from 2 to 14 with an average of 8.29. Two populations from the Saharo–Sahelian zone, Aghchorguit and Boutilimit, had the lowest (2) and the highest (14) number of private alleles, respectively. Shannon's information index (I) ranged between 0.13 and 0.24 with an average of 0.21. The highest percentage of polymorphic loci (%P = 68.32%) was observed in the Yaghref 2 population and the lowest (%P = 30.1%) in the Tazyazet population (Table 2).

Table 2 Genetic diversity within B. aegyptiaca populations from Mauritania.

Genetic differentiation and gene flow

The coefficient of genetic differentiation (GST) between populations and gene flow (Nm) were 0.21 and 1.91, respectively. Analysis of molecular variance showed that 20% of the molecular variance was between populations (Table 3).

Table 3 AMOVA for B. aegyptiaca populations from Mauritania.

Cluster analysis

A dendrogram based on Jaccard's similarity coefficient, which ranged from 0.14 to 0.56, was constructed using the whole ISSR data matrix (Fig. 1). The dendrogram divided the populations into two main groups: the larger one contains five geographically distant populations (Aleg, Aghchorguit, Boutilimit, Yaghref 1 and Yaghref 2), and the second contains the Tazyazet and Chami populations, both from the Saharan zone.

Figure 1

UPGMA dendrogram of 7 natural populations of B. aegyptiaca based on Jaccard’s similarity coefficient, constructed with the software NTSYS-PC version 2.02 (Exeter Software, New York).

The dendrogram grouped the studied populations into two major clusters and the individuals were separated according to their populations. Some populations of each region were placed into the same sub-cluster and showed high similarities with each other.

Principal coordinates analysis

Principal coordinates analysis (PCoA) was carried out to provide a spatial representation of the genetic diversity among the desert date populations (Fig. 2). The first two principal coordinates accounted for 21.3% of the total variance (12.37% and 8.93%, respectively). The resulting plot confirmed the clustering pattern observed with the UPGMA analysis and the classification of the B. aegyptiaca populations into two major groups. Indeed, the Chami and Tazyazet populations from the Saharan zone are genetically distant from the rest of the populations (Aleg, Aghchorguit, Boutilimit, Yaghref 1 and Yaghref 2).

Figure 2

Two-dimensional representation of the first two axes of the principal coordinates analysis (PCoA), based on the matrix of genetic distances among 91 samples from 7 populations.

Population structure analysis

To provide further evidence and infer population structure, Bayesian assignment analyses were used. Based on the STRUCTURE log-likelihoods and the ∆K method, the most likely number of clusters was K = 2, followed by K = 5, indicating that the B. aegyptiaca populations studied could be grouped into two main populations and five subpopulations (Fig. 3). The Tazyazet and Chami populations form one cluster at both K = 2 and K = 5, whereas the Aleg, Aghchorguit, Boutilimit, Yaghref 1 and Yaghref 2 populations are more similar to one another and form a single genetic group at K = 2. This group was further separated into 4 sub-clusters at K = 5. The UPGMA and PCoA analyses above gave a similar result, showing two main clusters, which is consistent with the STRUCTURE results at K = 2.

Figure 3

Bayesian structure results for the populations of B. aegyptiaca obtained with the STRUCTURE program37. STRUCTURE HARVESTER, based on the approach of Evanno et al.27, indicates that Delta K reached its highest peak at K = 2, followed by K = 5 (a). (b) Table output of the Evanno method; the yellow highlight shows the maximum value in the Delta K column. From top to bottom, the clusters at K = 2 (b) and at K = 5 (c). Each population is represented by a single vertical bar. The bar is divided into K colors, where K represents the number of genetic groups assumed, as identified by the STRUCTURE program. For population numbers, see Table 2.

Discussion

B. aegyptiaca is a woody plant native to Mauritania and the wider Sahel, where it plays an important ecological and socio-economic role. In the present study, genetic diversity among seven natural B. aegyptiaca populations from different bioclimatic zones in Mauritania was assessed using a set of ISSR primers. The study revealed a large number of markers (505 polymorphic bands from the 16 ISSR primers tested) compared with that reported for B. aegyptiaca from Egypt (177 bands from 17 ISSR primers)29. Khamis et al.18 reported 477 polymorphic bands using AFLP markers in B. aegyptiaca from different geographical origins. However, the polymorphism information content (PIC) obtained in our study (0.23) is lower than that reported by Domyati et al.29, who found a PIC value of 0.37. Variable PIC values have been reported for other perennial dicotyledonous species such as jujube (Z. mauritiana) (PIC = 0.42) and cork oak (Quercus suber L.) (PIC = 0.28)30,31. Moreover, the marker index (MI) in the present study (7.3) was very low compared with the value of 53.34 found by Domyati et al.29 in B. aegyptiaca using ISSR markers. The difference in marker informativeness and performance between our study and that of Domyati et al.29 could result from differences in the primer sequences used and in the genetic background of the populations tested. Nevertheless, the level of polymorphism found in our study is consistent with that reported for similar long-lived perennial species such as cork oak31 and baobab (Adansonia digitata)32.

Moreover, natural populations of B. aegyptiaca in the Saharan zone are subjected to several anthropogenic (the need for fuel wood), animal (mainly dromedary grazing) and environmental (extreme aridity) pressures, which could explain the low variability observed. It is worth noting that B. aegyptiaca trees in this desert region, which probably constitutes the northern limit of the species distribution in the country, are widely scattered and not abundant.

In addition, private alleles were noted, particularly in the Saharan and Saharo-Sahelian populations. These alleles provide information about the differentiation of the species and could be involved in its adaptation to the local environment. Private alleles also arise as a result of mutation33 and can be studied to reveal genes involved in adaptation. These results can be used to identify ecotypes of this tree.

The AMOVA values also provide insight into the intra- and inter-population differentiation of B. aegyptiaca. Indeed, AMOVA showed that 80% of the total genetic variation was attributable to variation within populations rather than between populations (GST = 0.21). These findings are comparable to those reported by Alansi et al.14 in Z. spina-christi L. (GST = 0.17) and Medhi et al.34 in Xanthoxylum spp. (GST = 0.24). Furthermore, 100% of the total variation in B. aegyptiaca populations from Egypt was attributed to differences within populations29. The high level of intra-population diversity in B. aegyptiaca corroborates the observation that long-lived, cross-pollinated and widespread plant species maintain a high level of intra-population diversity and a low level of inter-population differentiation35. For instance, an allopollination rate of 37% was reported for B. aegyptiaca from Senegal. This cross-pollination is brought about by wind and insects36. It is worth noting that geographically distant populations are genetically related, as revealed by the principal coordinates analysis, the UPGMA dendrogram based on Jaccard's similarity coefficient, and the population structure analysis. This finding can be explained by the absence of geographical barriers between the regions studied and by effective gene flow (Nm = 1.91). One factor that could maintain gene flow between geographically separated populations is seed dispersal through animal grazing35. B. aegyptiaca is one of the preferred forage species of the dromedary camel, so seeds can be carried from one area to another as camels graze during transhumance across the Sahara, which may explain why this species is common in both the Sahel and the Sahara.

To our knowledge, this is the first study addressing the genetic diversity of B. aegyptiaca from Mauritania. It demonstrates that ISSR markers offer a useful approach for characterizing genetic diversity within and among B. aegyptiaca populations. Further studies of genetic diversity using other ISSR primers and additional B. aegyptiaca populations may help in selecting plants of interest, particularly in the context of the GGW project. Comparative studies with different molecular, and even morphological, markers would also be useful.

Materials and methods

Plant material

In this study, we evaluated 91 accessions belonging to 7 natural populations of B. aegyptiaca. These populations were collected from Aleg in the Sahelian zone (rainfall > 200 mm); Boutilimit and Aghchorguit in the Saharo-Sahelian zone (rainfall between 100 and 200 mm); and Yaghref 1, Yaghref 2, Tazyazet and Chami in the Saharan zone (rainfall < 100 mm) (Fig. 4).

Figure 4

Map of Mauritania showing the study sites and rainfall patterns.

Fresh leaves were collected from 10 to 15 trees per population at each study site. Because the desert date tree can reproduce asexually through suckering19, a minimum distance of 20 m was maintained between sampled trees for the majority of populations. The leaves were transported in a container with ice to the laboratory, where they were either stored at −80 °C or lyophilized for further processing.

DNA extraction

Genomic DNA was extracted from 20 mg of lyophilized leaves using the commercial Isolate II Plant DNA Mini Kit (Bioline, France) according to the manufacturer’s instructions. DNA integrity and purity were estimated using a spectrophotometer (NanoDrop 2000, USA) and 1% agarose gel electrophoresis.

DNA amplification by ISSR

We tested 17 anchored ISSR primers, of which 16 gave reproducible bands. Table 1 summarizes the main characteristics of the ISSR primers used. PCR was carried out according to the protocol of Zietkiewicz et al.20 with slight modifications. Briefly, the 25 µl PCR mixture contained 4 ng/µl genomic DNA, 1 U of Taq DNA polymerase, 1 × PCR buffer, 2 mM of each dNTP, 2.5 mM MgCl2 and 4 µM of primer. DNA amplification was performed in a 96-well thermal cycler (Veriti®, California, USA). Cycling included a 5 min initial denaturation at 94 °C, followed by 32 cycles of 30 s at 94 °C, 45 s at the annealing temperature of each primer and 2 min at 72 °C, and a final elongation step of 7 min at 72 °C. The ISSR-PCR products were separated on a 2.8% agarose gel, stained with ethidium bromide and visualized under UV light. Band size was estimated by comparing the bands with a 1 kb ladder (Invitrogen, USA).

Data analysis

The binary data matrix was analyzed using GelCompar II software (version 2.5, Applied Maths, Kortrijk, Belgium). Only clear and sharp bands were considered as ISSR markers; they were coded as 1 for presence and 0 for absence. From this binary presence/absence matrix, we calculated the polymorphism information content (PIC) according to the formula described by Roldán-Ruiz et al.21: PIC_i = 2f_i(1 − f_i), where PIC_i is the polymorphism information content of marker i, f_i is the frequency of present fragments and (1 − f_i) is the frequency of absent fragments. The marker index (MI) was calculated according to the formula of Powell et al.22: MI = PIC × EMR, where EMR (effective multiplex ratio) is defined as the product of the fraction of polymorphic loci (β) and the number of polymorphic loci (n). The resolving power (Rp) of each primer was calculated using the formula of Prevost and Wilkinson23: Rp = ΣIb, where Ib is the informativeness of a fragment, calculated as Ib = 1 − [2 × |0.5 − p|], where p is the proportion of genotypes containing the fragment.
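As an illustration only, the three primer indices defined above can be sketched in Python. The function name marker_indices and the toy matrix are hypothetical, and taking the PIC of a primer as the mean of the per-locus PIC values is an assumption of this sketch; the text does not state how per-locus values were aggregated, and the study itself did not use this code.

```python
def marker_indices(matrix):
    """Compute PIC, EMR, MI and Rp for one primer.

    matrix: list of rows (one per sample); each row is a list of 0/1
    band scores for the loci amplified by that primer.
    """
    n_samples = len(matrix)
    n_loci = len(matrix[0])

    # Band frequency at each locus (f_i in the PIC formula, p in the Ib formula)
    freqs = [sum(row[j] for row in matrix) / n_samples for j in range(n_loci)]

    # PIC_i = 2 f_i (1 - f_i); assumption: primer PIC = mean over its loci
    pic = sum(2 * f * (1 - f) for f in freqs) / n_loci

    # EMR = beta * n, with n = number of polymorphic loci and
    # beta = fraction of loci that are polymorphic
    n_poly = sum(1 for f in freqs if 0 < f < 1)
    beta = n_poly / n_loci
    emr = beta * n_poly

    # MI = PIC x EMR
    mi = pic * emr

    # Rp = sum of Ib, with Ib = 1 - 2|0.5 - p|
    rp = sum(1 - 2 * abs(0.5 - f) for f in freqs)

    return {"PIC": pic, "EMR": emr, "MI": mi, "Rp": rp}


# Hypothetical toy data: 4 samples scored at 3 loci
toy = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
]
result = marker_indices(toy)
print(result)
```

In this toy matrix the first locus is monomorphic, so it contributes nothing to PIC or Rp and lowers β. When every band of a primer is polymorphic, β equals 1 and EMR reduces to the number of bands.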

To estimate genetic diversity, five parameters were calculated using GenAlEx v6.5 (refs. 23,24): the number of alleles (Na), the number of effective alleles (Ne), the number of private alleles (Pa), Shannon’s information index (I) and the percentage of polymorphic loci (%P).
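For readers unfamiliar with these parameters, the sketch below shows how Ne, I and %P can be derived from a binary band matrix. It assumes the treatment commonly applied to dominant markers, in which the null-allele frequency is estimated under Hardy-Weinberg equilibrium as q = sqrt(1 − band frequency) and p = 1 − q. Whether the analysis reported here used exactly these settings is not stated in the text, and the function names and toy data are hypothetical.

```python
import math


def locus_diversity(band_freq):
    """Return (Ne, I) for one dominant locus from its band frequency."""
    # Assumption: Hardy-Weinberg estimate of the null-allele frequency
    q = math.sqrt(1.0 - band_freq)
    p = 1.0 - q
    ne = 1.0 / (p * p + q * q)  # number of effective alleles
    # Shannon's index; terms with zero frequency contribute nothing
    shannon = -sum(x * math.log(x) for x in (p, q) if x > 0)
    return ne, shannon


def population_diversity(matrix):
    """Mean Ne, mean I and %P over all loci of one population.

    matrix: list of rows (one per tree), each a list of 0/1 band scores.
    """
    n_samples = len(matrix)
    n_loci = len(matrix[0])
    freqs = [sum(row[j] for row in matrix) / n_samples for j in range(n_loci)]
    per_locus = [locus_diversity(f) for f in freqs]
    mean_ne = sum(ne for ne, _ in per_locus) / n_loci
    mean_i = sum(i for _, i in per_locus) / n_loci
    # A locus is counted as polymorphic if the band is neither fixed nor absent
    pct_poly = 100.0 * sum(1 for f in freqs if 0 < f < 1) / n_loci
    return {"Ne": mean_ne, "I": mean_i, "%P": pct_poly}


# Hypothetical population: 4 trees scored at 3 loci
pop = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
]
stats = population_diversity(pop)
print(stats)
```

Only the second locus is polymorphic in this toy population, so it alone raises Ne above 1 and I above 0.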

POPGENE version 1.32 (ref. 25) was used to compute the coefficient of gene differentiation (Gst) and the gene flow (Nm). Principal coordinate analysis (PCoA) and analysis of molecular variance (AMOVA) were also performed using the GenAlEx v6.5 program.
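The link between Gst and Nm can be illustrated with a short calculation. The sketch assumes the widely used relation Nm = 0.5(1 − Gst)/Gst; the text does not state which estimator was applied, so this is an assumption, and the function name is hypothetical.

```python
def gene_flow_from_gst(gst):
    """Estimate Nm from Gst, assuming Nm = 0.5 * (1 - Gst) / Gst."""
    if not 0 < gst <= 1:
        raise ValueError("Gst must lie in the interval (0, 1]")
    return 0.5 * (1.0 - gst) / gst


# Using the rounded Gst reported in the Results
nm = gene_flow_from_gst(0.21)
print(round(nm, 2))
```

With the rounded value Gst = 0.21 this relation gives Nm of about 1.88, close to the reported 1.91; a gap of this size would be expected if the unrounded Gst were slightly below 0.21.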

The data were analyzed using the SIMQUAL (similarity for qualitative data) method to generate Jaccard similarity coefficients. These similarity coefficients were used to construct a dendrogram by the Unweighted Pair-Group Method with Arithmetic mean (UPGMA), employing the SAHN (Sequential Agglomerative Hierarchical and Nested clustering) routine of the NTSYS-PC v. 2.02 program (Applied Biostatistics, Setauket, N.Y.)26. The genetic structure pattern was analyzed using the Bayesian algorithm implemented in the STRUCTURE 2.3.4 software, which infers the number of genetically distinct clusters (K) without a priori group designation. Twenty independent simulations were carried out for each value of K (the number of populations). The range of possible clusters was set from 1 to 10. Analysis parameters included a burn-in period of 30,000 iterations and 50,000 Markov Chain Monte Carlo (MCMC) repetitions, with the admixture and correlated allele frequencies models selected. The most credible number of populations was assessed using Evanno’s ΔK method27, based on the maximum value of ΔK, with the online tool Structure Harvester28.
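Two of the quantities used in this paragraph, the Jaccard similarity coefficient and ΔK, can be sketched as follows. This is an illustration and not the NTSYS-PC or Structure Harvester implementation: the function names and the log-likelihood values are hypothetical, and ΔK is computed here from the mean Ln P(D) at each K divided by the sample standard deviation across runs, which is an assumption about how the replicate runs are combined.

```python
from statistics import mean, stdev


def jaccard_similarity(a, b):
    """Jaccard similarity of two 0/1 band profiles: shared bands divided by
    bands present in at least one profile (joint absences are ignored)."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0


def delta_k(lnp_runs):
    """Delta K for each interior K.

    lnp_runs: dict mapping K to the list of Ln P(D) values from replicate runs.
    Delta K(K) = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), with L the mean Ln P(D).
    """
    ks = sorted(lnp_runs)
    means = {k: mean(lnp_runs[k]) for k in ks}
    result = {}
    for k in ks:
        # Delta K is undefined at the smallest and largest K tested
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            sd = stdev(lnp_runs[k])
            result[k] = second_diff / sd if sd > 0 else float("nan")
    return result


# Hypothetical band profiles of two trees
sim = jaccard_similarity([1, 1, 0, 1, 0], [1, 0, 0, 1, 1])

# Hypothetical Ln P(D) values, three runs per K (the study used twenty)
runs = {
    1: [-1000, -1002, -998],
    2: [-900, -901, -899],
    3: [-890, -892, -888],
    4: [-885, -880, -890],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(sim, dk, best_k)
```

Because ΔK depends on the change in slope of the likelihood curve, it cannot be evaluated at the first or last K tested, so a test range of 1 to 10 yields ΔK values for K = 2 to 9 only.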