Introduction

Mitochondrial DNA (mtDNA) and the non-recombining portion of the Y-chromosome (NRY) are uniparentally inherited through females and males, respectively, allowing the reconstruction of sex-specific demographic patterns. Usually, genetic distances among human populations are significantly larger for NRY than for mtDNA. This has been explained by a higher rate of female than male migration, mainly because of patrilocality, a common mating rule in food-producing communities.1 In this respect, the Berber populations of North Africa deserve particular attention. It seems that, since the Neolithic transition, sedentary and pastoralist Berbers lived in largely isolated small tribal groups composed of a few paternal familial clans, scattered throughout the Maghreb, mainly in the present-day countries of Morocco, Algeria and Tunisia. In spite of the successive cultural influences of the Punic, Roman, Arab and Ottoman colonizers, this demographic structure remained until recently.2 As geographic barriers are less important in Tunisia than in Morocco and Algeria, Tunisia underwent the strongest Arabization process. In fact, Berber-speaking communities in this country are limited to a few southern villages and to the south-eastern island of Jerba. In spite of this, all the population genetic analyses carried out in Tunisia, using different markers and Arab- or Berber-speaking ethnic groups, have revealed a strong genetic structure weakly affected by the Arab domination.3, 4, 5, 6 Only the present-day massive migration from rural villages to large cities, within each country and abroad, might have attenuated this strong tribal identity. It is expected that changes in genetic structure from rural to urban communities might have important consequences in demographic studies. Sampling of rural villages will produce a fine-scale genetic patchiness, whereas sampling in urban towns will show a more uniform landscape.

To complete the study of the Tunisian Berbers, we analyzed here the mtDNA and Y-chromosome profiles of two other Berber isolates from central Tunisia. We then compared the global patterns of these uniparental lineages in Berber- and Arab-speaking communities and in rural and urban areas of this country to assess the relative importance of geographic and cultural barriers in its genetic structure.

Materials and methods

Samples

Whole blood samples were taken from eighty males from the rural areas of Bou Omrane (40) and Bou Saâd (40), both belonging to the Governorate of Gafsa in central Tunisia. DNA extractions were carried out following a protocol based on the use of proteinase K, dithiothreitol and sodium dodecyl sulfate.7 A total of 93 DNA samples (46 Arab-speaking and 47 Berber-speaking individuals) from unrelated native islanders from Jerba, previously analyzed for six Y-STR (short tandem repeat) loci,8 were also included in the present study; 59 of them had previously been analyzed for mtDNA variation in Loueslati et al.6

mtDNA and NRY analyses

mtDNA amplification, first hypervariable region sequencing and restriction fragment length polymorphism characterization were accomplished as in González et al.,9 and sequences sorted into haplogroups following van Oven and Kayser.10

For NRY analysis, 17 biallelic markers (M2, M9, M17, M34, M45, M60, M78, M81, M89, M96, M172, M173, M267, M269, M304, SRY2627 and SRY10831) and an Alu polymorphism were hierarchically characterized.11 Y-chromosome haplogroups were designated according to Karafet et al.12 STR amplification was carried out using the AmpFlSTR Yfiler kit (Applied Biosystems, Foster City, CA, USA), following the manufacturer's recommendations. The amplified products were compared with commercial ladders (Applied Biosystems) using GeneMapper ID Software v3.2 (Applied Biosystems). The GATA H4.1 locus nomenclature is in accordance with the International Society of Forensic Genetics recommendations.13

Data analysis

In addition to our own samples, 586 individuals for mtDNA and 768 for NRY, from both rural and urban areas of Tunisia, were taken from the literature and used for comparisons (Table 1). Additional information and references for these samples are detailed in Supplementary Table S1.

Table 1 Tunisian samples analyzed (a) for mtDNA and Y-chromosome and (b) for mtDNA or Y-chromosome

For genetic comparisons with mtDNA published data, only first hypervariable region positions from 16 024 to 16 383 were taken into account. In addition, transversions 16 182C and 16 183C and indels within 16 184–16 193 were not considered. For Y-chromosome STR (Y-STR) analysis, haplotypes were determined taking into account 6 (DYS19, DYS389I, DYS390, DYS391, DYS392 and DYS393), 9 (by the addition of DYS389b, DYS438 and DYS439), 12 (by further addition of DYS385a, DYS385b and DYS437) or 17 loci (by further addition of DYS448, DYS456, DYS458, DYS635 and GATA H4.1), depending on the available information.
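The mtDNA filtering rule described above can be sketched in a few lines. This is only an illustrative sketch, not the procedure actually used: the variant notation (for example `16193.1C` for an insertion and `16188d` or `16188del` for a deletion) and the function names are assumptions made for the example.

```python
import re

# Limits stated in the text for comparisons with published mtDNA data.
HVR1_START, HVR1_END = 16024, 16383
EXCLUDED_TRANSVERSIONS = {"16182C", "16183C"}
INDEL_START, INDEL_END = 16184, 16193


def is_indel(variant):
    """Assumed notation: '16193.1C' (insertion), '16188d' or '16188del' (deletion)."""
    v = variant.lower()
    return "." in v or v.endswith("d") or v.endswith("del")


def comparable_variants(variants):
    """Keep only the variants that would enter a cross-study comparison."""
    kept = []
    for variant in variants:
        match = re.match(r"(\d+)", variant)
        if match is None:
            continue  # not a position-based variant name
        position = int(match.group(1))
        if not HVR1_START <= position <= HVR1_END:
            continue  # outside the 16024-16383 segment
        if variant.upper() in EXCLUDED_TRANSVERSIONS:
            continue  # transversions 16182C and 16183C are ignored
        if INDEL_START <= position <= INDEL_END and is_indel(variant):
            continue  # indels within 16184-16193 are ignored
        kept.append(variant)
    return kept


# Hypothetical haplotype, for illustration only.
example = ["16051G", "16182C", "16183C", "16189C", "16193.1C", "16223T", "16519C", "73G"]
print(comparable_variants(example))
```

Applying the same filter to every data set before computing distances keeps haplotypes from studies with different sequencing ranges comparable.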

In all the analyses, small villages (<15 000 inhabitants) or groups of small villages from rural areas in the same Governorate were considered as rural populations. Samples from large towns such as Tunis or Sfax, mixed samples taken from different Governorates, or those taken from migratory populations outside Tunisia, that form heterogeneous pools, were considered as urban (Supplementary Table S1). To perform this classification we used the 2004 Tunisian population census (http://www.ins.nat.tn).

Gene diversity (h) of populations was calculated according to Nei,14 as implemented in the Arlequin 3.11 software.15
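For reference, Nei's unbiased gene diversity for haploid data is h = n/(n−1) × (1 − Σ p_i²), where n is the sample size and p_i the frequency of the i-th haplotype. The following is a minimal sketch of that formula; the function name and the toy haplotype labels are hypothetical, and the code is not a substitute for the Arlequin implementation.

```python
from collections import Counter


def gene_diversity(haplotypes):
    """Nei's unbiased gene diversity, h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are required")
    counts = Counter(haplotypes)
    sum_squared_freqs = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_squared_freqs)


# Toy sample: a population dominated by one lineage has low diversity.
drifted = ["E-M81"] * 9 + ["J1-M267"]
mixed = ["E-M81", "J1-M267", "E-M78", "R-M269", "E-M81"]
print(gene_diversity(drifted), gene_diversity(mixed))
```

A sample in which every individual carries a different haplotype gives h = 1, and a sample fixed for a single haplotype gives h = 0, which is why low values are read as a signal of drift and endogamy.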

The Arlequin 3.11 package was also used to calculate genetic distances and to perform analyses of molecular variance (AMOVA). Genetic distances were estimated using haplogroup- and haplotype-frequency-based linearized FSTs.16 For both mtDNA and NRY data, two hierarchical AMOVA designs were performed: one in which populations were assigned to ethnic groups (Arabs, Berbers and Andalusians), and one in which populations were assigned to either a rural or an urban group. Following Wilkins and Marlowe,17 estimates of sex-biased migration, based on partitioning of genetic variance, were calculated for populations analyzed for both mtDNA and NRY. The male/female Nm ratio was calculated assuming that Nm=(1/FST)−1. All the above-mentioned analyses were performed for each of the four available Y-STR sets (6, 9, 12 and 17 Y-STRs).
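The male/female Nm ratio follows directly from the relation Nm=(1/FST)−1 given above. The sketch below shows the arithmetic only; the function names are hypothetical, and the two input values are illustrative, borrowed from the rural among-population variance percentages reported in the Results (2.8% for mtDNA and 13.1% for six Y-STRs) rather than from Table 4.

```python
def nm_from_fst(fst):
    """Number of migrants per generation under Nm = (1/FST) - 1."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in the interval (0, 1]")
    return 1.0 / fst - 1.0


def male_female_nm_ratio(fst_nry, fst_mtdna):
    """Ratio of male (NRY) to female (mtDNA) Nm; values <1 suggest higher female migration."""
    return nm_from_fst(fst_nry) / nm_from_fst(fst_mtdna)


# Illustrative inputs only (assumed to stand in for FST): 0.131 for NRY, 0.028 for mtDNA.
ratio = male_female_nm_ratio(0.131, 0.028)
print(round(ratio, 3))  # 0.191
```

Because Nm decreases as FST increases, the stronger male differentiation translates into a ratio below 1.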

A Mantel test with 1000 permutations was used to test the correlation between the genetic distances FST/(1–FST)18 and the logarithm of geographic distances, as implemented in the Arlequin software.
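The logic of this test can be sketched as follows: population labels of one matrix are permuted, and the observed correlation is compared with the permutation distribution. This is a simplified illustration, not Arlequin's code; the function names, the one-sided P-value with the (hits+1)/(permutations+1) convention, and the toy matrices are all assumptions.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the above-diagonal entries of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def linearized_fst(fst_matrix):
    """Transform each pairwise FST into FST/(1 - FST)."""
    return [[f / (1.0 - f) for f in row] for row in fst_matrix]


def log_distances(km_matrix):
    """Natural log of geographic distances; the zero diagonal is left at 0."""
    return [[math.log(d) if d > 0 else 0.0 for d in row] for row in km_matrix]


def mantel_test(genetic, geographic, permutations=1000, seed=0):
    """Return (r, one-sided P) by permuting rows and columns of one matrix together."""
    n = len(genetic)
    x = upper_triangle(genetic)
    observed = pearson(x, upper_triangle(geographic))
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[geographic[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Made-up pairwise FST values and distances (km) for four hypothetical populations.
fst = [[0.00, 0.02, 0.05, 0.09], [0.02, 0.00, 0.03, 0.06],
       [0.05, 0.03, 0.00, 0.04], [0.09, 0.06, 0.04, 0.00]]
km = [[0, 40, 120, 300], [40, 0, 90, 260], [120, 90, 0, 180], [300, 260, 180, 0]]
r, p = mantel_test(linearized_fst(fst), log_distances(km))
print(round(r, 3), round(p, 3))
```

Rows and columns are permuted together so that each permuted matrix remains a valid distance matrix for a relabeled set of populations.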

Multidimensional scaling analyses, based on mtDNA and Y-chromosome haplotypic genetic distances, were performed using the SPSS software package (PASW Statistics version 18).

Results

mtDNA diversity in Bou Omrane and Bou Saâd

Mitochondrial haplotypes and their haplogroup assignation for the Berber samples analyzed in this study are displayed in Supplementary Table S2. In spite of their geographical proximity (Figure 1), there is a great differentiation between the two samples. The Eurasian haplogroup H (47.5%) and the sub-Saharan African haplogroup L3b (15%) are the most abundant in Bou Omrane, whereas the sub-Saharan African haplogroup L0a1b (27.5%) and the Eurasian haplogroup V (17.5%) predominate in Bou Saâd. However, the North African autochthonous haplogroup U6 is present in both populations with frequencies of 5% (Bou Omrane) and 7.5% (Bou Saâd). In all, only four haplotypes (within the 16 024–16 383 segment) are shared between the populations. Congruently, FST distances at haplogroup (0.180; P<0.001) and haplotypic (0.119; P<0.001) levels are highly significant. On the other hand, their diversity values, in the lowest range of all the Tunisian populations analyzed (Supplementary Table S3), suggest a high degree of genetic drift. It has been stated that isolation and inbreeding are the main characteristics of the Tunisian rural communities.6, 19 However, when our FST values are compared with those obtained for a Berber and an Arab sample from Jerba, which are also in close geographical proximity and were studied at the same level of mtDNA resolution,6 the Jerbian haplogroup (0.048; P<0.01) and haplotypic (0.000; P=nonsignificant (ns)) differentiation is one order of magnitude lower than ours.

Figure 1

Map showing the geographical situation of the rural populations studied. Those analyzed for mitochondrial DNA are represented by circles, those for the non-recombining portion of the Y-chromosome by triangles, and those for both markers by the superposition of the two symbols. Backgrounds are as follows: black for Arabs, white for Berbers, striped for Andalusians and gray for unknown ethnicity.

Y-chromosome diversity in Bou Omrane and Bou Saâd

Figure 2 shows the Y-chromosome haplogroup distribution in Bou Omrane, Bou Saâd and the Arab and Berber Jerbian samples studied here. Haplotypes obtained for Bou Omrane and Bou Saâd, based on 17 Y-STRs, are listed in Supplementary Table S4. In contrast to the mtDNA data, the autochthonous North African haplogroup E-M81 is the most abundant in all four samples. In principle, this fact reinforces, once more, the supposition that the Arab domination of the North African Berber communities had more of a cultural than a demic impact.19, 20 At the haplogroup level, there are no significant differences between Bou Omrane and Bou Saâd (0.005; P=ns), nor between the Jerba samples (0.019; P=ns). However, when their haplotypic composition, based on six common STRs,8 is taken into account, significant differences exist both between the Bou Omrane and Bou Saâd Berbers (0.059; P<0.05) and between the Arab and Berber Jerba samples (0.034; P<0.01). In addition, distances between Berbers from Jerba and Berbers from Bou Omrane and Bou Saâd (0.631±0.179) are similar (t=0.99; ns) to those between Arabs and Berbers from different areas (0.469±0.147). Finally, haplotypic diversities of the Jerbian Arabs (0.96±0.02) and Jerbian Berbers (0.83±0.05) are rather similar and significantly higher (t=4.6; P<0.05) than those obtained for Berbers of Bou Omrane (0.23±0.09) and Bou Saâd (0.52±0.09). As with mtDNA, the Y-chromosome results indicate that the Bou Omrane and Bou Saâd Berber communities are more isolated and endogamic than the Arab and Berber Jerba communities.

Figure 2

Y-chromosome tree, taken from Karafet et al.,12 representing the genealogical relationships of the haplogroups characterized in this study and their absolute frequencies in Bou Omrane (OmB), Bou Saâd (SâB), Jerbian Arabs (JeA) and Jerbian Berbers (JeB). Asterisk indicates underived lineages.

High female and male population structure in Tunisia

Mitochondrial AMOVA haplotypic analysis, involving all 14 available rural Tunisian populations treated as a single group (Table 2), indicated that <3% (P<0.001) of the variance is explained by differentiation among populations. The mean value ( × 100) of all possible pair-wise FST distances (data not shown) was 2.72±0.28, with 56% of them being statistically significant. This important genetic structure of the female sub-population, found within a country as small as Tunisia, is in the same range as that found when comparing samples from different European countries.6 When the same analysis is applied to haplotypes defined by three different sets of Y-chromosome STRs (Table 2), the heterogeneity found for males is much higher than that found for females. The variance ascribed to among-population differentiation ranges from 13.1 to 9.0%, being highly significant in all cases (P<0.001). Average ( × 100) pair-wise FST distances (data not shown) ranged from 18.3±2.9 to 10.8±1.8, with around 91% of the comparisons being statistically significant. However, when pair-wise FST distances are graphically represented as bidimensional plots, it can be observed that for mtDNA (Figure 3a), differentiation is mainly because of the strong divergence of the Berber isolates of Bou Saâd, Bou Omrane and Chenini-Douiret. In fact, when these samples are excluded, the average FST distance diminishes to 0.96±0.09, with only 29% of the comparisons being statistically significant. In contrast, for the Y-6-STR set (Figure 3b), divergence among populations is a more global issue, although distances were greater between rural than between urban samples.
In previous studies, patterns of autosomal genetic variation in African populations have been explained by cultural (mainly linguistic) and geographical barriers.21 More recently, the contrasting patterns of mtDNA and Y-chromosome variation have revealed sex-biased demographic processes involving asymmetric gene flow between populations, patrilocality and polygyny.22, 23, 24, 25

Table 2 AMOVA haplotypic analysis and mean pairwise FST distances between rural Tunisian populations
Figure 3

Multidimensional scaling plot based on mitochondrial DNA (a) and Y-chromosome six STR (b) haplotypic FST distances. Codes are as in Table 1.

Lack of female and male ethnic differentiation in Tunisia

Arab- and Berber-speaking communities are the main ethnic groups in Tunisia. In addition, some Andalusian communities of Berber or Arab origin, which settled in Tunisia in the early 17th century after being expelled from the Iberian Peninsula, have also been studied for Y-chromosome and mtDNA markers.3, 26 From previous analyses, using different sets of autosomal and uniparental markers, it was deduced that the ethnic component has a minor role in the genetic structure found in the Tunisian population.3, 6 In agreement with those results, the more complete partition of the haplotypic variation of the mtDNA and the Y-chromosome by the AMOVA analyses performed here, involving three Tunisian ethnic groups (Berbers, Arabs and Andalusians), showed a lack of significant male and female differentiation among ethnic groups, contrasting with the high heterogeneity found among populations within groups (Table 3).

Table 3 Percentage of variation within and among ethnic groups

Lack of male correlation between genetic and geographic distances in Tunisia

To assess whether the male and female patterns of Tunisian populations agree with a model of isolation by distance, we tested the correlation between geographic and haplotypic FST pair-wise distances. When using all the samples, Mantel tests gave no significant correlation coefficients for mtDNA (r=0.01; P=0.40, ns) or for the different sets of Y-STRs (6 (r=0.20; P=0.08, ns); 9 (r=−0.03; P=0.60, ns); 12 (r=0.02; P=0.41, ns)). However, when the more divergent samples of Bou Omrane and Bou Saâd were excluded, a significant correlation between geographic and genetic distances was detected for mtDNA (r=0.23; P=0.006) but not for any of the male Y-STR sets (6 (r=0.19; P=0.125, ns); 9 (r=−0.22; P=0.69, ns); 12 (r=−0.55; P=0.94, ns)). Congruently, a lack of correlation was also found (r=0.11; P=0.314, ns) when a Mantel test comparing Y-chromosome and mtDNA genetic distance matrices was performed.

Patrilocality might explain the Nm male/female migration ratio found in Tunisian populations

Table 4 shows the Nm values obtained for male and female samples from the same populations. In all cases, the Nm ratio of males versus females gives values significantly <1, which has been taken as a crude measure of higher female than male migration across populations.17 As the population genetic structure is also higher in males than in females and as there is a stronger lack of correlation between genetic and geographic distances in males than in females, it seems that patrilocality, a common marriage system in Tunisian communities, might be mainly responsible for this sex-biased population structure. It is worth mentioning that this mating system was also invoked as the main cause of the asymmetric mtDNA and Y-chromosome patterns found in Bedouin tribal groups from Sinai compared with Nile Delta and Valley Egyptian communities.27

Table 4 Male versus female Nm ratio for different sets of markers

Evidence of endogamy in rural versus urban Tunisian populations

Kinship and tribal norms have kept Tunisian rural communities as endogamic isolates. However, these strict social rules weaken, after a few generations, when people migrate to urban centers. This should mix the rural genetic variation, increasing genetic diversity within urban towns but decreasing genetic structure among them. These predictions are fully confirmed. The mean genetic diversities for mtDNA (0.94±0.01) and for the most representative six Y-STR set (0.70±0.09) in rural samples are significantly lower (P<0.001 and P<0.05, respectively) than the corresponding mtDNA (0.99±0.00) and six Y-STR (0.95±0.01) values in urban samples. On the other hand, AMOVA analyses show that the percentages of variation among populations for mtDNA (2.8%; P<0.0001) and for the six Y-STR set (13.1%; P<0.0001) in rural samples are also higher than those obtained for mtDNA (0.3%; P=0.065) and six Y-STRs (0.4%; P=0.015) in urban areas. Similar results have been obtained when rural and urban samples have been compared in Iran28 and in Jordan.29, 30

Discussion

Phylogeography of mtDNA and Y-chromosome haplogroups in Tunisia

As in other North African regions, the global Tunisian mtDNA (Supplementary Table S5) and Y-chromosome (Supplementary Table S6) haplogroup profiles can be subdivided into three main regional components. To begin with, there is a sub-Saharan African contribution comprising all L mtDNA lineages (34.1%) and the Y-chromosome haplogroups A, B, E-M96, E-M2 and E-M35 (16.3%). There is also a native North African component represented by the U6 and M1 mtDNA haplogroups (8.2%) and by the Y-chromosome E-M81 haplogroup (48.2%). The remaining lineages have to be assigned to a more generalized western Eurasian origin. Within them, the mtDNA R0a and J1b haplogroups and the Y-chromosome J1-M267 haplogroup, the most representative clades in the Arabian Peninsula,11, 31 might be taken as the result of Arab gene flow mediated by the spread of Islam in historic times, although the influence of previous Paleolithic and Neolithic spreads from the Near East cannot be ruled out.32, 33, 34 There is evidence that the sub-Saharan African gene flow into Tunisia shows a strong sex bias, the female contribution (34%) being significantly (P<0.0001) larger than the male contribution (16%). In contrast, the putative Arab influence seems to be mainly male-driven (P<0.0001), as mtDNA lineages R0a and J1b together represent only 3% of the Tunisian female gene pool, whereas J1-M267 lineages amount to 17% of the Tunisian male gene pool. Taking the present-day frequencies of these lineages in the Arabian Peninsula as representative of those carried to North Africa by the 7th century Islamic expansion, the Arab male genetic input into Tunisia could be as high as 38%, whereas the female counterpart was significantly lower, ranging from 13 to 17%.

Male and female regional differentiation

When the global Tunisian male and female gene profiles are compared with those of Morocco to the west and Libya and Egypt to the east (Supplementary Tables S5, S6 and Figure 4), it is evident that the mtDNA sub-Saharan African L haplogroups show a significantly higher frequency in Tunisia than in both Morocco (P<0.0001) and Egypt (P<0.0003). In contrast, the corresponding male contribution, around 15%, is rather uniform in the whole area, even including the Arabian Peninsula, with the exception of the endogamic sample representing Libya.35 In addition, several geographic haplogroup frequency trends can also be appreciated. Although the mtDNA haplogroup H and the autochthonous female U6 and male E-M81 haplogroups show a decreasing gradient from west to east, as previously detected,36, 37, 38 other lineages present decreasing gradients from east to west. Some of them, such as the male haplogroups E-M78 and F-M89 and the female haplogroups M1 and T, show their highest frequencies in Egypt. Others, such as the male and female J lineages, the male P-M45 and the female R0a haplogroups, show their highest frequencies in Arabia. These patterns are in agreement with the supposition that the most important genetic expansions affecting North Africa since Paleolithic times had a Near Eastern provenance,32, 36, 38, 39, 40 with only minor western influences.26, 37, 41, 42

Figure 4

Mitochondrial DNA (a) and Y-chromosome (b) haplogroup frequencies (%) for total Tunisian sample and for nearby regions: Morocco (MOR), Libya (LIB), Egypt (EGY) and Arabian peninsula (ARP).

Sex-biased differences between Tunisian rural and urban populations

Accepting that urbanization is a relatively recent process and that immigration from villages to towns is not biased, we might take urban populations as pools of those from isolated rural areas. Under this assumption, it is expected that total haplogroup frequencies from rural and urban populations should be alike. However, this is not the case for Tunisia (Supplementary Tables S5 and S6). To begin with, the sub-Saharan African component shows a strong sex bias. Male sub-Saharan African lineages are significantly (P<0.0001) more abundant in the rural conglomerate, whereas sub-Saharan African female lineages in the total urban population strongly exceed (P<0.0001) those present in the rural population. This is congruent with the proposed easier diffusion of women from different social groups through polygyny and patrilocality.22, 23 The same trend is observed between autochthonous male and female lineages. Although frequencies for the maternal lineages M1 and U6 are rather similar in the rural and urban total populations, the male E-M81 lineage is significantly (P<0.0001) more frequent in rural than in urban samples. Another sex-biased pattern is also evident for the putative Arab contribution. There is again a lack of significant differences between rural and urban females for mtDNA haplogroup J1b and R0a frequencies, whereas the Y-chromosome J1-M267 lineage is significantly (P<0.0001) more abundant in the urban than in the rural total population. These results could be explained by supposing that Arabization in Tunisia was a military enterprise and, therefore, mainly driven by men who displaced native Berbers to geographically marginal areas but frequently married Berber women. This scenario seems congruent with the history of the Arab domination in North Africa.43 These data suggest that the demic impact of the Arab rule, at least in Tunisia, could be higher than previously supposed. However, this influence was not detected when Arab- and Berber-speaking communities were compared.

Microdifferentiation within rural Tunisian communities

In addition to the differences found between the total rural and urban populations, a significantly larger microdifferentiation exists within rural samples than within urban samples, as exemplified by the large differences in male and female haplotype frequencies found here between the neighboring populations of Bou Omrane and Bou Saâd. It is evident that geography is not mainly responsible for this high rural genetic structure, which is better interpreted as the consequence of socio-cultural factors.

For centuries, the optimal exploitation of land and livestock promoted endogamic tribal structures in Tunisia. To keep inheritance undivided, male primogeniture was common practice, and patrilocality and polygyny were social rules. As a consequence, the exchange of women between tribes was more frequent than that of men. All these cultural rules, also observed in Arab societies, are faithfully reflected in the genetic structure of the Tunisian populations, which show a higher male than female genetic differentiation, not significantly correlated with geography, and a loss of genetic diversity within communities due to genetic drift and endogamy mainly promoted by cultural isolation. However, it seems that these cultural and genetic structures are being quickly eroded in urban areas.

Accession numbers

The eighty new first hypervariable region mitochondrial DNA sequences are available under GenBank accession numbers: JN233721-800.