Inter simple sequence repeat markers to assess genetic diversity of the desert date (Balanites aegyptiaca Del.) for Sahelian ecosystem restoration

Drought and desertification have been the major environmental constraints facing Sahelian agro-ecosystems for decades. Assessing the genetic diversity of native tree species is critical to assist ecosystem restoration efforts. Here we describe the genetic diversity and structure of seven natural populations of Balanites aegyptiaca L. distributed across the Sahelian-Saharan zone of Mauritania using 16 polymorphic ISSR primers. These generated 505 polymorphic bands. Polymorphism information content (PIC) varied from 0.13 to 0.29 with an average of 0.23, marker index (MI) averaged 7.3 (range 3.3–10.3) and resolving power (RP) ranged from 4.53 to 14.6 with an average of 9.9. The number of observed alleles (Na) ranged from 0.62 to 1.39, the effective number of alleles (Ne) varied from 1.26 to 1.37, and Shannon’s information index (I) ranged from 0.25 to 0.36. AMOVA showed that 80% of the genetic variation was found within populations, which is supported by a low level of genetic differentiation between populations (GST = 0.21) and an overall estimate of gene flow among populations of Nm = 1.9. The dendrogram based on Jaccard's similarity coefficient and the structure analysis divided the seven populations into two main clusters, one of which grouped two populations from the Saharan zone. Our results provide baseline data for genetic conservation programs of this neglected Sahelian crop, which has an important economic and ecological role.

www.nature.com/scientificreports/ environmental impacts (massive rural exodus, food insecurity and desertification). The limited biodiversity that naturally characterizes arid and desert regions has been significantly reduced by these environmental conditions. Studying genetic variability is of great relevance in plant genetic resource management programs. Through the information it provides, it is possible to identify genotypes of interest and use them to establish effective conservation strategies.
Today, molecular markers are far more suitable for analyzing genetic diversity than morphological and biochemical traits because they segregate as single genes and are not affected by the environment 10. A vast array of molecular markers, including restriction fragment length polymorphism (RFLP), randomly amplified polymorphic DNA (RAPD), amplified fragment length polymorphism (AFLP), inter simple sequence repeats (ISSR) and microsatellites or simple sequence repeats (SSR), have been used to assess genetic diversity in plant species from different parts of the world [11][12][13][14][15]. ISSRs are segments of DNA flanked on both sides by microsatellites, which are short DNA motifs of 2-5 nucleotides repeated multiple times 16. PCR-ISSR is a quick, cost-effective and highly reproducible molecular technique based on PCR amplification of such multilocus inter-microsatellite sequences. ISSRs have been found useful in the analysis of genetic variation below the species level, mainly in studying population structure and differentiation 17. Data on the genetic diversity of B. aegyptiaca germplasm worldwide are very limited. Domyati et al. 29 assessed genetic diversity in 7 medicinal plants, including B. aegyptiaca, using ISSR, RAPD and AFLP. They showed that ISSR markers revealed high genetic diversity in B. aegyptiaca compared to the other markers (RAPD, AFLP). Moreover, AFLP markers have been successfully used to assess genetic diversity among B. aegyptiaca accessions collected from different geographical regions 18.
In the present study, we evaluated genetic diversity among and within B. aegyptiaca populations collected from different bioclimatic zones in Mauritania using ISSR markers. The main objective was to provide baseline molecular information on this economically and ecologically important neglected crop to support better conservation and restoration programs for the Sahelian ecosystem.

Results
Polymorphism of the ISSR markers. The 16 ISSR primers used generated 505 polymorphic bands ranging in size from 40 to 3369 bp. The primer BTH3 [(AG)8TC] produced the lowest number of bands (21), whereas the primer BTH2 [(AG)8T] produced the highest (42), with an average of 31.5 fragments per primer (Table 1). The PIC values ranged from 0.13 to 0.29 with an average of 0.23. Marker index (MI) and resolving power (Rp) averaged 7.28 (range 3.3-10.3) and 9.9 (range 4.5-14.6), respectively.
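As an illustration of how per-primer statistics of this kind can be derived from a presence/absence matrix, here is a minimal sketch. It assumes the per-band form PIC = 2f(1 - f) commonly attributed to the Roldán-Ruiz approach, and the common definition of resolving power as the sum of band informativeness, Ib = 1 - 2|0.5 - f|. Whether the study used exactly these variants is not stated in the text, and the function names and toy matrix are hypothetical.

```python
# Sketch only: formulas and names are assumptions, the matrix is toy data.

def band_frequencies(matrix):
    """Fraction of individuals showing each band (rows = individuals, columns = bands)."""
    n_individuals = len(matrix)
    n_bands = len(matrix[0])
    return [sum(row[j] for row in matrix) / n_individuals for j in range(n_bands)]


def primer_pic(freqs):
    """Mean PIC over the bands of one primer, with PIC_i = 2 * f_i * (1 - f_i)."""
    return sum(2 * f * (1 - f) for f in freqs) / len(freqs)


def resolving_power(freqs):
    """Rp as the sum of band informativeness, Ib = 1 - 2 * |0.5 - f|."""
    return sum(1 - 2 * abs(0.5 - f) for f in freqs)


# Toy matrix: 4 individuals scored for 3 bands of a single primer.
toy = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
freqs = band_frequencies(toy)
print(freqs, primer_pic(freqs), resolving_power(freqs))
```

The reported averages are roughly consistent with a marker index computed as PIC multiplied by the number of polymorphic bands per primer (0.23 × 31.5 ≈ 7.2), although this is an inference from the summary values rather than a definition given in the text.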
At the population level, the number of observed alleles (Na) ranged from 0.62 to 1.39 with an average of 1.12, while the number of effective alleles (Ne) ranged from 1.13 to 1.23 with an average of 1.2 (Table 2), and the number of private alleles (Pa) ranged from 2 to 14 with an average of 8.29. Populations from the Saharo-Sahelian zone (Aghchorguit and Boutilimit) had the lowest (2) and the highest (14) number of private alleles, respectively. The Shannon index (I) ranged between 0.13 and 0.24 with an average of 0.21. The highest percentage of polymorphic loci (%P = 68.32%) was observed in the population of Yaghref 2 and the lowest (%P = 30.1%) in the population of Tazyazet (Table 2).
Genetic differentiation and gene flow. The coefficient of genetic differentiation (GST) between populations and gene flow (Nm) were 0.21 and 1.91, respectively. Analysis of molecular variance showed that 20% of the molecular variance was between populations (Table 3).
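The relationship between the two values above can be sketched with the estimator Nm = 0.5(1 - GST)/GST, which is the form POPGENE reports for gene flow estimated from GST. The function name is hypothetical, and the use of this exact estimator in the study is an assumption.

```python
# Sketch only: assumes Nm = 0.5 * (1 - Gst) / Gst.

def gene_flow_from_gst(gst):
    """Estimate the number of migrants per generation (Nm) from Gst."""
    if not 0 < gst < 1:
        raise ValueError("Gst must lie strictly between 0 and 1")
    return 0.5 * (1 - gst) / gst


print(round(gene_flow_from_gst(0.21), 2))
```

With the rounded GST of 0.21 this gives approximately 1.88, close to the reported 1.91. The small difference would be expected if the published GST was rounded from a slightly lower value.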
Cluster analysis. The dendrogram based on Jaccard's similarity coefficient, which ranged from 0.14 to 0.56, was constructed using the whole ISSR data matrix (Fig. 1). The dendrogram divided the populations into two main groups: the larger one contains five geographically distant populations (Aleg, Aghchorguit, Boutilimit, Yaghref 1 and Yaghref 2) and the second contains the Tazyazet and Chami populations, both from the Saharan zone. Within these two clusters, individuals were separated according to their populations. Some populations of each region were placed into the same sub-cluster and showed high similarity with each other.
Principal coordinates analysis. Principal coordinates analysis (PCoA) was carried out to provide a spatial representation of the genetic diversity among the desert date populations (Fig. 2). The first two principal coordinates accounted for 21.3% of the total variance (12.37% and 8.93%, respectively). The resulting plot confirmed the clustering pattern observed using UPGMA analysis and the classification of the B. aegyptiaca populations into two major groups. Indeed, the populations of Chami and Tazyazet from the Saharan zone are genetically distant from the rest of the populations (Aleg, Aghchorguit, Boutilimit, Yaghref 1 and Yaghref 2).
Population structure analysis. To provide further evidence and infer population structure, Bayesian assignment analyses were used. The maximum log-likelihood given by STRUCTURE and the ∆K method indicated K = 2, followed by K = 5, suggesting that the B. aegyptiaca populations studied could be grouped into two main populations and five subpopulations (Fig. 3). The populations Tazyazet and Chami form one cluster at both K = 2 and K = 5, whereas the populations Aleg, Aghchorguit, Boutilimit, Yaghref 1 and Yaghref 2 are more similar to one another and form one genetic group at K = 2. This group was further separated into 4 sub-clusters at K = 5.
The above analyses (UPGMA and PCoA) give similar results and show two main clusters, which is consistent with the STRUCTURE results at K = 2.
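The clustering step behind the dendrogram can be illustrated with a minimal, dependency-free sketch of Jaccard similarity followed by UPGMA (average linkage). The study itself used NTSYS-PC. The function names and band profiles below are hypothetical toy data, not values from the study.

```python
# Sketch only: toy profiles, not study data.
from itertools import combinations


def jaccard(a, b):
    """Shared bands divided by bands present in at least one of the two profiles."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 0.0


def upgma(profiles):
    """Agglomerate by mean pairwise similarity; return (cluster_a, cluster_b, similarity) steps."""
    names = list(profiles)
    sim = {frozenset(p): jaccard(profiles[p[0]], profiles[p[1]])
           for p in combinations(names, 2)}

    def avg(c1, c2):
        # UPGMA: unweighted mean over all member pairs of the two clusters.
        return sum(sim[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: avg(*pair))
        merges.append((c1, c2, avg(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


toy_profiles = {
    "A": [1, 1, 0, 1, 0, 1],
    "B": [1, 1, 0, 0, 0, 1],
    "C": [0, 0, 1, 1, 1, 0],
    "D": [0, 0, 1, 0, 1, 0],
}
for step in upgma(toy_profiles):
    print(step)
```

In this toy example A joins B and C joins D before the two pairs merge at a low similarity, which mirrors the kind of two-cluster pattern described above.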

Discussion
B. aegyptiaca is a woody plant endemic to Mauritania as well as the Sahel, where it plays an important ecological and socio-economic role. In the present study, genetic diversity among seven natural B. aegyptiaca populations from different bioclimatic zones in Mauritania was assessed using a set of ISSR primers. The study revealed a large number of markers (505 polymorphic bands from the 16 ISSR primers tested) compared to that reported for B. aegyptiaca from Egypt (177 bands from 17 ISSR primers) 29. Khamis et al. 18 reported 477 polymorphic bands using AFLP markers in B. aegyptiaca from different geographical origins. However, the polymorphism information content (PIC) obtained in our study (0.23) is lower than that reported by Domyati et al. 29. The marker index (MI) for the present study populations (7.3) was very low compared to the value of 53.34 found by Domyati et al. 29 in B. aegyptiaca using ISSR markers. The difference in marker informativeness and performance between our study and that of Domyati et al. 29 could result from differences in the primer sequences used and in the genetic background of the populations tested. Nevertheless, the level of polymorphism found in our study is consistent with that reported for similar long-lived perennial species such as cork oak 31 and baobab (Adansonia digitata) 32. Moreover, natural populations of B. aegyptiaca in the Saharan zone are subjected to several anthropogenic (the need for fuel wood), animal (mainly dromedary grazing) and environmental (extreme aridity) pressures, which could explain the low observed variability. It is worth noting that B. aegyptiaca trees in this desert region, which probably constitutes the northern limit of the species distribution in the country, are widely dispersed and not abundant.
In addition, private alleles were noted, particularly in the Saharan and Saharo-Sahelian populations. These alleles provide information about differentiation within the species and could be involved in its adaptation to the local environment. Private alleles arise through mutation 33 and can be studied to reveal genes involved in adaptation. These results can be used to identify the ecotypes of this tree.
The AMOVA values also provide insight into intra- and inter-population differentiation of B. aegyptiaca. Indeed, AMOVA showed that 80% of the total genetic variation was attributable to variation within populations rather than between populations (GST = 0.21). These findings are comparable to those reported by Alansi et al. 14 34 in Xanthoxylum spp. (GST = 0.24). Furthermore, it was observed that 100% of the total variation in B. aegyptiaca populations from Egypt was attributable to differences within populations 29. The high level of intra-population diversity in B. aegyptiaca corroborates the view that long-lived, cross-pollinated and widespread plant species maintain a high level of intra-population diversity and a low level of inter-population differentiation 35. For instance, an allopollination rate of 37% was reported for B. aegyptiaca from Senegal. This cross-pollination is brought about by wind and insects 36. It is worth noting that geographically distant populations are genetically related, as revealed by principal coordinates analysis, the UPGMA dendrogram based on Jaccard's similarity coefficient, and population structure analysis. This finding can be explained by the absence of geographical barriers between the regions studied and by effective gene flow (Nm = 1.91). One factor that could maintain gene flow between geographically separated populations is seed dispersal through animal grazing 35. B. aegyptiaca is one of the preferred forage species of the dromedary camel, so seeds can be transferred as camels graze from one area to another during transhumance across the Sahara, which would explain why this species is common in the Sahel and Sahara.
To our knowledge, this is the first study addressing the genetic diversity of B. aegyptiaca from Mauritania. It demonstrates that ISSR markers offer a useful approach for characterizing genetic diversity within and among B. aegyptiaca populations. Further studies for analyzing genetic diversity using other ISSR primers and B. aegyptiaca

Materials and methods
Plant material. In this study, we evaluated 91 accessions belonging to 7 natural populations of B. aegyptiaca. These populations were collected from Aleg in the Sahelian zone (rainfall > 200 mm), Boutilimit and Aghchorguit in the Saharo-Sahelian zone (rainfall between 100 and 200 mm), and Yaghref 1, Yaghref 2, Tazyazet and Chami in the Saharan zone (rainfall < 100 mm) (Fig. 4).
Fresh leaves were collected from 10 to 15 trees per population at each study site. Because the desert date tree can reproduce asexually through suckering 19, a minimum distance of 20 m was maintained between sampled trees for the majority of populations. The leaves were transported on ice in a container to the laboratory, where they were either stored at −80 °C or lyophilized for further processing.
Data analysis. The binary data matrix was analyzed using GelCompar II software (version 2.5, Applied Maths, Kortrijk, Belgium). Only clear and sharp bands were considered as ISSR markers. They were coded as 1 for presence and 0 for absence. From this binary presence/absence matrix, we calculated the polymorphism information content (PIC) according to the formula described by Roldán-Ruiz et al. 21. To estimate genetic diversity, five parameters were calculated using GenAlex v6.5 23,24: the number of alleles (Na), the number of effective alleles (Ne), the number of private alleles (Pa), Shannon's information index (I) and the percentage of polymorphic loci (%P). POPGENE version 1.32 25 was used to compute the coefficient of gene differentiation (GST) and the gene flow (Nm). Principal coordinates analysis (PCoA) and analysis of molecular variance (AMOVA) were also performed using GenAlex v6.5.
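Two of the population-level parameters listed above, the percentage of polymorphic loci and the number of private bands, can be illustrated directly on a presence/absence matrix. The sketch below assumes that a locus counts as polymorphic in a population when its band is present in some but not all individuals, and that a private band is one found in a single population only. GenAlEx's exact conventions may differ, and the function names and toy data are hypothetical.

```python
# Sketch only: definitions are assumptions, data are toy values.

def percent_polymorphic(pop_rows):
    """Percentage of loci whose band is present in some, but not all, individuals."""
    n_loci = len(pop_rows[0])
    n_ind = len(pop_rows)
    polymorphic = sum(
        1 for j in range(n_loci) if 0 < sum(row[j] for row in pop_rows) < n_ind
    )
    return 100.0 * polymorphic / n_loci


def private_bands(populations):
    """Count bands present in exactly one population and absent from all others."""
    present = {
        name: {j for j in range(len(rows[0])) if any(row[j] for row in rows)}
        for name, rows in populations.items()
    }
    counts = {}
    for name, bands in present.items():
        others = set().union(*(b for other, b in present.items() if other != name))
        counts[name] = len(bands - others)
    return counts


toy_pops = {
    "P1": [[1, 0, 1, 0], [1, 1, 1, 0]],
    "P2": [[0, 0, 1, 1], [0, 0, 1, 0]],
}
print({name: percent_polymorphic(rows) for name, rows in toy_pops.items()})
print(private_bands(toy_pops))
```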
The data were analyzed using the SIMQUAL (similarity for qualitative data) method to generate Jaccard similarity coefficients. These similarity coefficients were used to construct a dendrogram using the Unweighted Pair-Group Method with Arithmetic mean (UPGMA), employing the SAHN (Sequential Agglomerative Hierarchical and Nested clustering) routine of the NTSYS-PC v. 2.02 program (Applied Biostatistics, Setauket, N.Y.) 26
37 The STRUCTURE HARVESTER output, based on the approach of Evanno et al. 27, indicates that Delta K reached its highest peak at K = 2, followed by K = 5 (a). (b) Table output of the Evanno method results, in which the yellow highlight shows the maximum value in the Delta K column. From top to bottom, the clusters at K = 2 (b) and at K = 5 (c). Each population is represented by a single vertical bar. The bar is divided into K colors, where K represents the number of genetic groups assumed, as identified by the STRUCTURE program. For population numbers, see Table 2.
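The Delta K statistic referred to above can be sketched as follows. This minimal version assumes ΔK = |L(K+1) - 2L(K) + L(K-1)| / s[L(K)], with the second difference taken on the per-K mean log-likelihoods across replicate runs and s the standard deviation across runs. The log-likelihood values are invented toy numbers, not results from this study, and the function name is hypothetical.

```python
# Sketch only: toy log-likelihoods, not study results.
from statistics import mean, stdev


def delta_k(lnp_runs):
    """lnp_runs maps K to a list of Ln P(D) values from replicate STRUCTURE runs."""
    means = {k: mean(values) for k, values in lnp_runs.items()}
    result = {}
    for k in sorted(lnp_runs):
        # Delta K is undefined at the smallest and largest K tested.
        if k - 1 in means and k + 1 in means:
            second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_diff / stdev(lnp_runs[k])
    return result


toy_runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-890.0, -894.0],
    4: [-885.0, -891.0],
}
scores = delta_k(toy_runs)
print(max(scores, key=scores.get), scores)
```

The K with the largest ΔK is taken as the most likely number of clusters. Because ΔK cannot be computed for the lowest and highest K tested, the range of K explored affects which values can be compared.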