Introduction

The origin of the Berber people is not clearly established. North Africa was peopled around the 16th millennium B.C. by a late Palaeolithic culture (Iberomarusian) [1] and then by a more advanced Mesolithic culture (Capsian). Transition to agriculture (Neolithic) occurred around 9,500–7,000 B.C., spreading from the Near East to Egypt. Berbers may be the descendants of the Capsian and the later Neolithic peoples [2]. Berber kingdoms declined under the impact of Greek invasions (457–404 B.C.), the Roman Punic wars (264–146 B.C.), and Roman settlements in the area [3–5]. The Arab invasion (7th–8th centuries) brought islamisation and the dispersal of Berber culture; the Berber leader Tariq even invaded Spain in 710 A.D., and the advance reached as far as Poitiers in France.

Present-day populations of North Africa are mostly Arabic-speaking, whatever their remote origin. Berbers, however, with their languages and customs, still live in small niches of Northern Morocco and Algeria, and in some Northern Oases of the Sahara, including those of the Mzab (Algeria). The Tuaregs also speak Berber languages. They inhabit the South of the Sahara and have been involved for centuries in trans-Saharan trade. Tuaregs have their own culture that probably diverged from the Berber world through isolation.

We studied haemoglobin (Hb) variants and blood groups in the population of the Mzab. Due to social, religious, and geographic isolation, it has remained unmixed for centuries and may be one of the most representative of Berber identity. The finding of Hb D-Ouled Rabah, considered as a ‘private marker’ of the Kel Kummer Tuaregs, suggested a common origin.

Material and Methods

We studied 598 children (359 males, 239 females; aged 5–19 years) attending Koranic schools of the Ibadite rite in three oases of the Mzab and in Ghardaia, the main city, where pupils come from all the oases. Sampling was at random and informed parental consent was obtained.

Hb screening was performed by alkaline electrophoresis, complemented where necessary by isoelectric focusing (IEF) [6]. The Hb D mutation was identified by direct sequencing of PCR-amplified DNA from a homozygous subject. It abolishes a MaeII site, and this was used for confirmation in all the Hb D carriers. β-Globin gene haplotypes were based upon the markers shown in figure 1. PCR-RFLPs were determined as described [7–10], repeats by DNA sequencing [9, 10], and the β-gene framework by denaturing gradient gel electrophoresis [11].
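The MaeII confirmation step can be illustrated with a minimal sketch. It assumes the MaeII recognition sequence ACGT and a short stretch of the normal β-globin coding sequence (codons 18–21, Val-Asn-Val-Asp) taken from the standard reference sequence rather than from this study; the C → A change at codon 19 is the one reported in the Results. The function and variable names are hypothetical.

```python
# Recognition sequence of MaeII (cuts A^CGT). The site is palindromic,
# so scanning a single strand is sufficient.
MAEII_SITE = "ACGT"

# Assumption: codons 18-21 of the normal beta-globin coding sequence
# (GTG AAC GTG GAT = Val-Asn-Val-Asp), from the standard reference
# sequence, not from the data of this study.
NORMAL = "GTGAACGTGGAT"
CODON_19_START = 3  # 0-based offset of codon 19 (AAC) within NORMAL


def mutate_codon19(seq: str) -> str:
    """Apply the C -> A change at the third base of codon 19 (AAC -> AAA)."""
    pos = CODON_19_START + 2
    if seq[pos] != "C":
        raise ValueError("expected C at the third position of codon 19")
    return seq[:pos] + "A" + seq[pos + 1:]


def site_positions(seq: str, site: str = MAEII_SITE) -> list[int]:
    """Return the 0-based start position of every occurrence of the site."""
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]


MUTANT = mutate_codon19(NORMAL)

print("normal:", site_positions(NORMAL))   # site spans codons 19-20
print("mutant:", site_positions(MUTANT))   # site lost after AAC -> AAA
```

Under these assumptions the site straddles codons 19 and 20, so an amplicon from a normal allele would be cut by MaeII there while one carrying the mutation would not.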

Fig. 1. β-Globin gene haplotypes. Locations of the polymorphic sites that make up the haplotype, together with the restriction enzymes with which they are detected, are shown below the map; + and − indicate the presence or absence of the polymorphic restriction sites, respectively. Sequences of the polymorphic AT repeats in both the (AT)xN12(AT)y region of the LCR-HS2 and the (AT)xTy region in the β-silencer are also indicated. The β-gene frameworks are indicated by their corresponding sequence and restriction site polymorphisms.

Blood group typing used classical methods. Maximum likelihood allele and haplotype frequencies were estimated using GENEF2, Lalouel’s adaptation of Yasuda’s ALL-TYPE program [12]. Genetic distances between any pair of populations were obtained both as co-ancestry coefficients or linearised Fst values [13], and as

$$D_{ij} = \frac{1}{2}\sum_{i=1}^{k} \left| x_i - y_i \right|,$$

where xi and yi are the frequencies of allele or haplotype i in populations X and Y, respectively, and k is the size of the frequency vector [14]. The two genetic distance matrices were compared by a Mantel test [15], using the NTSYS package [16]. Dij distances were used for principal co-ordinate analysis [17]. The significance of the genetic distances between populations was tested using a resampling algorithm [18]. The genetic structure of three population groups (Arabic-speaking, Berbers and Tuaregs) was estimated by analysis of molecular variance (AMOVA) [18] with the ARLEQUIN package [Schneider, Kuffner, Roesseli, Excoffier, unpubl. data].
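As an illustration of the Dij distance defined above, the following minimal sketch computes it for every pair of populations. The allele frequencies are hypothetical values chosen for the example, not data from this study, and the function names are likewise illustrative.

```python
from itertools import combinations


def dij(x, y):
    """Half the sum of absolute frequency differences between two populations."""
    if len(x) != len(y):
        raise ValueError("frequency vectors must have the same length k")
    return 0.5 * sum(abs(xi - yi) for xi, yi in zip(x, y))


def distance_matrix(freqs):
    """Build a symmetric matrix of Dij values from {population: frequencies}."""
    names = list(freqs)
    matrix = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        matrix[a][b] = matrix[b][a] = dij(freqs[a], freqs[b])
    return matrix


# Hypothetical ABO allele frequencies (A, B, O); illustration only.
example = {
    "pop1": [0.15, 0.05, 0.80],
    "pop2": [0.20, 0.10, 0.70],
    "pop3": [0.25, 0.15, 0.60],
}

for name, row in distance_matrix(example).items():
    print(name, {other: round(d, 3) for other, d in row.items()})
```

When each frequency vector sums to 1, Dij lies between 0 (identical frequencies) and 1 (no allele shared).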

Results

Haemoglobin Analysis

Two Hb variants (Hb C and D) were observed in the Mozabites with a similar frequency (table 1). Carriers belonged to different families and one D/D homozygote was observed. The D variant was identified as Hb D-Ouled Rabah (codon 19 C → A, β19 Asn → Lys) [6, 19], up to now considered a private marker of the Kel Kummer Tuaregs [20]. The genetic background of the mutation was established on DNA from the D/D individual from the Mzab and from an EBV-immortalised cell line from a D/D Kel Kummer individual. In both cases, the mutation was the same. Studied markers included RFLPs, repeats in the LCR HS2 and 5′ to the β-gene, and point mutations in β-IVS2 defining the β-gene frameworks (fig. 1). Only markers 3′ from the mutation are identical in the two populations. In the 5′ region, 5 out of 8 markers are different, including the repeat 639 nt 5′ from the mutation.

Table 1. Red blood cell phenotypes and gene frequencies in the Mozabites

Blood Group Analysis

Results obtained for the Mozabite population are given in table 1 in comparison with those published for the Kel Kummers. Mozabites exhibit a high frequency of allele O (79.3%) and 2.1% of allele K. Among seven RH haplotypes, R0 is the most frequent (32.0%), followed by R1 (28.6%) and r (25.3%). These data were compared to those published for geographically related populations. Genetic distances independently obtained from the Dij values and co-ancestry coefficients are significantly correlated (ABO: r = 0.95, p < 0.001; RH: r = 0.79, p < 0.001).
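The kind of matrix correlation reported above can be sketched with a simple permutation-based Mantel test. This is a generic illustration, not the NTSYS implementation used in the study: the one-sided p-value, the number of permutations, the function names, and the two small distance matrices are all assumptions made for the example.

```python
import random
from math import sqrt


def upper_triangle(m):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(a, b):
    """Pearson correlation of two equal-length lists (non-zero variance assumed)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / sqrt(va * vb)


def mantel(m1, m2, permutations=999, seed=0):
    """Return (r, one-sided p) by permuting rows and columns of m2 together."""
    rng = random.Random(seed)
    n = len(m1)
    flat1 = upper_triangle(m1)
    observed = pearson(flat1, upper_triangle(m2))
    hits = 1  # the observed arrangement counts as one permutation
    order = list(range(n))
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[m2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(flat1, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Hypothetical distance matrices for four populations; illustration only.
d1 = [[0.00, 0.10, 0.20, 0.50],
      [0.10, 0.00, 0.15, 0.45],
      [0.20, 0.15, 0.00, 0.40],
      [0.50, 0.45, 0.40, 0.00]]
d2 = [[0.00, 0.02, 0.05, 0.12],
      [0.02, 0.00, 0.03, 0.10],
      [0.05, 0.03, 0.00, 0.11],
      [0.12, 0.10, 0.11, 0.00]]

r, p = mantel(d1, d2)
print(f"r = {r:.3f}, p = {p:.3f}")
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations, which is why an ordinary significance test for r would not be appropriate here.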

According to the ABO system, the Mozabites are closely related to other Berber-speaking groups among whom they are genetically intermediate (fig. 2A). For this system, they are not significantly different from the Kel Kummers, but the same result is obtained between Mozabites and other Berbers (No. 2 and 36 in fig. 2A) and non-Berbers (No. 8, 9, 31, 40, and 41). Most Berber groups exhibit high O frequencies, in contrast to the Arabic groups. Among the Berbers, the Tuaregs are genetically the most distant from the latter. Overall, Berber groups are more differentiated from each other (Fst = 0.023, p < 0.001) than the Arabic-speaking groups (Fst = 0.006, p < 0.001), while the Tuaregs are homogeneous (Fst = 0.009, 0.03 < p < 0.04). The genetic differentiation between Arabic- and Berber-speaking groups (including Tuaregs) is low but significant (Fct = 0.013, p < 0.01). However, it loses significance (Fct = 0.009, 0.04 < p < 0.05) when the Tuaregs are removed from the comparison. The Arabic groups vs. the Tuaregs taken alone are highly significantly differentiated (Fct = 0.054, p < 0.001).

Fig. 2. Principal co-ordinate analysis for geographically related populations tested for ABO and RH systems. (A) 22 populations tested for the ABO system. First axis (horizontal): 88% of total variance; second axis (vertical): 5% of total variance. (B) 36 populations tested for the RH system. First axis (horizontal): 49% of total variance; second axis (vertical): 17% of total variance. Among Berber-speaking populations (▲, ) Tuaregs are individualised (); Arabic-speaking populations (□). Mozabites (the figures in parentheses indicate the size of the studied population for both blood group systems, or for ABO then for RH when it is different for each system; the references are indicated in square brackets): 1 = Ghardaia, Algeria (531; [present study]); Berbers: 6 = Zaian, Oran, Algeria (985−630; [35]); 7 = Ait Haddidou, Central Atlas, Morocco (256; [51]); 37 = Kossovitch, M’sirda-Fouaga, Algeria (503; [35]); 38 = Gaud, M’sirda-Fouaga, Algeria (191; [35]); 42 = Messerlin, M’sirda-Fouaga (850; [35]); 36 = Arabs M’Sirdas Fouaga, Oran, Algeria (245; [35]); Tuaregs: 2 = Isseqqamaren, Ahaggar, Algeria (160; [36]); 3 = Isseqqamaren, Tassili n’Ajjer, Algeria (129; [36]); 4 = Air, Niger (164−93; [37]); 5 = Kel Kummer, Adrar des Iforas, Mali (286; [38]); Arabic-speaking populations: 8 = Algerians, Tidikelt, Algeria (268; [39]); 9 = Algerians, Hoggar, Algeria (132; [40]); 10 = Moroccans, Ksar Glagla (149; [41]); 11 = Mostaganem (127); 12 = Chlef (199); 13 = Blida (172); 14 = Guelma (262); 15 = Jijel (168); 16 = Tiaret (114); 17 = Sidi bel Abbes (112); 18 = Medea (104); 19 = Tizi Ouzou (455); 20 = Constantine (220); 21 = Bouira (186); 22 = Tlemcen (137); 23 = Algiers (315); 24 = Tebessa (125); 25 = Annaba (135); 26 = Batna (155); 27 = Setif (333); 28 = Skikda (148); 29 = Bejaia (164) (11–29 = Algerian samples from Aireche et al. [42]); 30 = Egyptians, Mansurah (250; [43]); 31 = Arabs Chaamba, Metlili, Algeria (232; [35]); 32 = Tunisians (1,986−474; [44, 45]); 33 = Libyans, Benghazi, Tripoli (168; [46]); 34 = Libyans, Benghazi (6,000−2,071; [47]); 35 = Moroccan Jews, Tafilalet (146; [48]); 39 = Reguibat, M’Sirda-Fouaga, Algeria (401; [35]); 40 = Algerians, Saoura (293; [49]); 41 = Towara Bedouins, Sinai (202; [50]).

Overall, the analysis of the RH system (fig. 2B) reveals a similar pattern. The Mozabites are genetically close to the other Berbers. In this analysis, however, the Kel Kummers occupy a peculiar position due to a high R1 frequency and a low r frequency. Accordingly, unlike for ABO, the genetic distance between Mozabites and Kel Kummers is highly significant (p < 0.001). Other differences are observed compared to ABO. Berber and Arabic groups show comparable levels of intra-group diversity (Arabs: Fst = 0.033, p < 0.001; Berbers: Fst = 0.031, p < 0.001). Indeed, some Arabic-speaking populations in Southern Algeria and the Moroccan Atlas (No. 8, 9, 10; fig. 2) are very distant from the others and close to the Berbers. Thus, Arabs and Berbers are not significantly differentiated (Fct = 0.012, p = 0.18). This also holds when the Tuaregs are removed (Fct = 0.008, p = 0.14), and even when Arabic groups are compared to the Tuaregs alone (Fct = 0.021, p = 0.19), apparently because of the peculiar Kel Kummer RH distribution.

Discussion

Next to blood groups, Hb variants have been studied more extensively and in more human populations than any other genetic marker. Common variants (Hbs S, C, E, A2, D-Punjab, etc.) are found at polymorphic frequencies among geographically widely distributed populations [21]. Some rare variants are frequent in specific populations or isolates because of genetic drift and/or founder effect. Distribution of Hb variants in Algeria reflects successive waves of migration, endogamy, and selective pressure by malaria [22]. Up to now, little was known about the Southern desert part of the country. Our study in the Mzab reveals only two variants, Hb C and D, with a similar frequency. The largest epidemiological study in Algeria included 69 subjects of Mozabite origin in whom only Hb C was observed (gene frequency 0.051 vs. 0.014 in our series) [23]. As in our study, Hb S was not observed in the Mozabites.

The presence of Hb C in North Africa probably relates to war expeditions and the slave trade from its epicentre on the Voltaic plateau during the 15th century [21, 24–26]. Its high frequency among Mozabites may be the result of genetic drift or it may indicate that their isolation was not as tight as believed. Alternatively, as for Hb S in Sicily [27], it may have been introduced at a low frequency and may have later expanded by selective pressure.

Finding Hb D-Ouled Rabah in the Mzab is surprising as it was considered a ‘private marker’ of the Kel Kummer Tuaregs [28]. The β-globin haplotype surrounding the mutation is different in the two ethnic groups, but all the differences are 5′ from the mutation. A sporadic case of Hb D-Ouled Rabah was reported in China [29], but the hypothesis of independent mutations in two linguistically and geographically close populations (Mozabites and Kel Kummer) is improbable. A hot spot of recombination spans 9.1 kb from 5′ of the δ-gene to 5′ of the β-globin gene [30]. Thus, a common origin of the mutation is likely, and recombination must have occurred 3′ of, or within, the β-silencer AT repeat. Its origin should be rather ancient to explain the high gene frequency of a rare variant in two groups considered as ethnic isolates.

As this observation suggests a link between the settled Mozabites and the nomadic Tuaregs, we determined blood group frequencies in the Mozabites, and compared our results to those published for the Tuaregs. This comparison was also extended to all the data reported to date concerning Berber- and Arabic-speaking groups in North Africa. Independent analyses of the ABO and RH data converge to show that the Mozabites are close to other Berber-speaking populations. However, they do not confirm a particularly close relationship between Mozabites and Kel Kummers. Actually, most Tuareg communities, although speaking Berber languages, exhibit common genetic features which tend to differentiate them from other Berbers and relate them to sub-Saharan Africans. Cavalli-Sforza et al. [2] propose a common origin between the Tuaregs and an Afro-Asiatic population, the Beja from Sudan. The Tuaregs may have differentiated from the Beja 5,000 years ago and migrated westwards where they were exposed to a Berber influence. Previous results on RH polymorphism indicate that the Beja are genetically closer to North Africans than all other tested Afro-Asiatic populations, except the Tutsi [31, 32]. Another explanation would be a common origin with the Berber people, and a differentiation of the Tuaregs due to their nomadic way of life and higher isolation in the Sahara desert.

Among the Tuaregs, the Kel Kummers distinguish themselves by some of the highest O and R1 frequencies ever observed in North Africa, and by the apparent lack of K antigen. A founder effect has been invoked to explain the high frequency of Hb D-Ouled Rabah in the Kel Kummers, and the analysis of blood groups and other classical polymorphisms (for example HLA) reinforces this hypothesis [33, 34].

In conclusion, the finding of Hb D-Ouled Rabah in the Mozabites contradicts the idea that it is a Kel Kummer private marker. However, these two populations are not particularly close genetically. Hb D-Ouled Rabah may be sporadic in North Africa [19], and possibly specific to Berber-speaking populations.