Introduction

The increasing resolution of the Y-chromosome phylogeny and the fine-grained sampling strategies are refining the worldwide geographical distribution of this male-specific marker. Previous continental-range studies detected broad and smooth haplogroup frequency clines (Malaspina et al. 1998; Rosser et al. 2000; Semino et al. 2000; Underhill et al. 2000; Karafet et al. 2001; Wells et al. 2001). However, studies focused on discrete regions (Stefan et al. 2001; Weale et al. 2001; Qamar et al. 2002; Zerjal et al. 2002; Di Giacomo et al. 2003; Cinnioglu et al. 2004; Flores et al. 2004; Nasidze et al. 2004, 2005) and on specific haplogroups (Cruciani et al. 2004; Di Giacomo et al. 2004; Rootsi et al. 2004; Semino et al. 2004) have found geographic barriers, strong microdifferentiation, and additional minor clines, stressing the importance of local isolation and secondary diffusions in the making of the modern Y-chromosome landscape.

In the Middle East, Syria, Lebanon, Israel, Palestine, and Jordan make up the Levant, a region with a rich archaeological record of continued human occupation since Paleolithic times. In particular, the central position of Jordan in this area made it an important human migratory route throughout history. In ancient times, it served as a major trade route with neighboring civilizations and empires (Redman 1978). Over the past 100 years, it has received many refugees as a result of wars and civil conflicts in the neighboring countries. Jordanians are Arabs, except for a few small communities of Circassians, Armenians, and Kurds, which represent <1% of a population of 5.2 million. Around half of the Jordanians are of Palestinian origin, and about 70% of the population is urban, while less than 6% of the rural population is seminomadic.

Classical genetic studies (Cavalli-Sforza et al. 1994 and references therein) have substantiated our views about the Middle East, which is considered a crucial region for gene flow among surrounding areas. However, the Y-chromosome analyses of Middle Eastern populations have focused mainly on comparisons of the Jewish gene pool with other populations from the same area (Hammer et al. 2000; Nebel et al. 2001; Shen et al. 2004). In the present article, we add high-resolution Y-chromosome information for this region by analyzing two samples from Jordan using 46 binary markers.

Materials and methods

Samples

We analyzed samples from a total of 146 unrelated Jordanian donors, of whom 101 were from the capital, Amman, and 45 were from the Dead Sea in the Jordan Valley. Informed consent was obtained from all the donors. The inhabitants of the Amman metropolitan area are the most representative of the Jordanian population as a whole because most of the population of Amman migrated from various regions of Jordan during the past 50 years seeking better economic and educational opportunities. In contrast, the dwellers of the Jordan Valley show much less genetic heterogeneity. Residents of the Jordan Valley number about 110,000 people who live 390 meters below sea level and therefore under mild chronic hyperoxia. There has been little immigration from other areas of Jordan to the Jordan Valley, so its people might retain the genetic features of the original inhabitants of this region. Twelve additional populations from the Middle East and surrounding regions were taken from the literature and used for comparisons (Table 1). To make such comparisons reliable, populations were pooled by country or ethnic group to keep sample sizes large. Analyses were performed at low and high levels of resolution. For low resolution, haplogroup frequencies were reduced to the same level as the sample with the lowest haplogroup subdivision. For high resolution, populations with levels of haplogroup discrimination similar to that of our Jordan sample were used as published. For those analyzed at lower levels, the haplogroup frequency in the subsample with less phylogenetic resolution was subdivided according to the subhaplogroup proportions observed in the phylogenetically more resolved subsample from the same region, provided that there were no significant haplogroup frequency differences between subsamples at the low-resolution haplogroup level. Further subhaplogroup subdivisions were carried out using samples from the same origin that were studied at high resolution only for particular haplogroups.
In three instances, the subdivision of haplogroup E* had to be inferred from that of the closest neighboring populations. The Iranian frequencies were extrapolated from Azeri (Cruciani et al. 2004), the Syrian from a mean of Palestinian and Druze Arabs (Cruciani et al. 2004), and that of Kurds from southeastern Turks (Cruciani et al. 2004).
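The two bookkeeping steps described above, collapsing haplogroups to a common lower resolution and splitting a low-resolution count according to the proportions seen in a better-resolved subsample, can be sketched in Python as follows. This is a minimal illustration only: the function names, the parent mapping, and the toy counts are hypothetical and are not taken from the original analysis, and the preliminary test for frequency differences between subsamples is not included.

```python
def collapse(counts, parent_of):
    """Merge subclade counts into their parent haplogroup.

    counts: dict mapping haplogroup label -> number of chromosomes.
    parent_of: dict mapping subclade label -> parent label (assumed mapping).
    Labels absent from parent_of are kept unchanged.
    """
    collapsed = {}
    for haplogroup, n in counts.items():
        key = parent_of.get(haplogroup, haplogroup)
        collapsed[key] = collapsed.get(key, 0) + n
    return collapsed


def subdivide(count, reference_subclades):
    """Split a low-resolution count among subclades in proportion to the
    subclade counts observed in a better-resolved reference subsample."""
    total = sum(reference_subclades.values())
    if total == 0:
        raise ValueError("reference subsample carries no chromosomes")
    return {hg: count * n / total for hg, n in reference_subclades.items()}


# Toy numbers, for illustration only.
high_res = {"J1-M267": 3, "J2-M172": 5, "E3b": 2}
low_res = collapse(high_res, {"J1-M267": "J", "J2-M172": "J"})
split_j = subdivide(10, {"J1-M267": 1, "J2-M172": 3})
print(low_res)
print(split_j)
```

The subdivided values are expected counts and may be fractional; whether to round them before computing frequencies is a choice left open in this sketch.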

Table 1 Y-chromosomal haplogroup frequencies and diversities in Middle East and adjacent populations

Biallelic markers

A total of 46 binary markers were genotyped (Fig. 1). Procedures for 44 of them were detailed elsewhere (Flores et al. 2003, 2004; Alonso et al. 2005). In addition, M17 was typed by PCR–RFLP using a mismatch primer (Thomas et al. 1999) and P25 by sequencing using the BigDye Terminator kit v.3.1 and an ABI Prism 310 Genetic Analyzer. According to the Y-Chromosome Consortium (YCC) standardized nomenclature (YCC 2002; Jobling and Tyler-Smith 2003), lineages are referred to by haplogroup and terminal mutation. For comparisons with previous works, we used the marker correspondences recorded in the YCC.

Fig. 1

Genealogic relationships, nomenclature, and frequencies of the Y-chromosome binary haplogroups. Only informative markers are included. The status of the underlined marker was inferred. Markers P25, Tat, SRY10831.2, SRY2627, M2, M20, M26, M52, M65, M70, M92, M107, M122, M124, M148, M153, M163, M165, M166, M175, M224, M342, and M377 were also genotyped but not observed

Statistical analysis

Population genetic diversity measured as H (Nei 1987), analysis of molecular variance (AMOVA) (Excoffier et al. 1992), and pairwise FST genetic distances (Slatkin 1995) based on haplogroup frequencies were calculated using the ARLEQUIN 2000 package (Schneider et al. 2000). Principal component (PC) analysis was performed using the SPSS package 11.5 (SPSS, Inc.).
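As a rough illustration of two of these statistics, the Python sketch below computes Nei's unbiased gene diversity and a frequency-based FST from haplogroup counts. The FST is obtained through a one-way AMOVA in which any two different haplogroups are assumed to be at distance 1. It is not the ARLEQUIN implementation: the function names and the toy counts are hypothetical, and permutation-based significance testing is omitted.

```python
from collections import Counter


def nei_diversity(counts):
    """Nei's unbiased gene diversity: H = n/(n-1) * (1 - sum(p_i^2))."""
    n = sum(counts.values())
    if n < 2:
        raise ValueError("at least two chromosomes are required")
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_p2)


def _ssd(counts):
    """Sum of squared deviations for one group under a 0/1 distance."""
    n = sum(counts.values())
    return n / 2 * (1 - sum((c / n) ** 2 for c in counts.values()))


def amova_fst(populations):
    """FST from a one-way AMOVA on haplogroup counts.

    populations: list of dicts, each mapping haplogroup -> count.
    Returns sigma_a^2 / (sigma_a^2 + sigma_w^2); small negative values
    can occur when populations are very similar.
    """
    sizes = [sum(p.values()) for p in populations]
    n_total, n_pops = sum(sizes), len(populations)
    pooled = Counter()
    for p in populations:
        pooled.update(p)  # adds counts haplogroup by haplogroup
    ssd_within = sum(_ssd(p) for p in populations)
    ssd_among = _ssd(pooled) - ssd_within
    sigma_w = ssd_within / (n_total - n_pops)
    # Average effective sample size for unequal population sizes.
    n0 = (n_total - sum(s * s for s in sizes) / n_total) / (n_pops - 1)
    sigma_a = (ssd_among / (n_pops - 1) - sigma_w) / n0
    if sigma_a + sigma_w == 0:
        raise ValueError("no variation in the data")
    return sigma_a / (sigma_a + sigma_w)


# Toy counts, for illustration only.
pop_a = {"J": 6, "E": 3, "R": 1}
pop_b = {"J": 2, "E": 2, "R": 6}
print(nei_diversity(pop_a), nei_diversity(pop_b))
print(amova_fst([pop_a, pop_b]))
```

With only two populations in the list, `amova_fst` gives a pairwise value; with all populations it gives an overall value.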

Results

Y-haplogroup frequencies in Jordan

The haplogroup phylogenetic relationships and frequency distributions in the Amman and Dead Sea samples are depicted in Fig. 1. Only six of the 18 clades found in Amman are also present in the Dead Sea region. Whereas the J subhaplogroups characterized by mutations M267 and M172 are the modal haplogroups in Amman, as in all studied Middle East populations from the literature, R1*-M173 and E3b3a-M34 are the most abundant haplogroups in the Dead Sea. A quantification of the heterogeneity using AMOVA showed that 14.7% (P<0.0001) of the variance is found between populations, which highlights the divergence of these populations.

The Jordan position in the Middle East

In general, we did not detect great frequency deviations in the processes of merging samples from the same geographical area and of subdividing haplogroup frequencies into those of their subclade components. A notable exception was I-M170 in Iran. Although this haplogroup was not detected in a sample of 52 chromosomes from different Iranian regions and was found at only 2% in Iranians from Samarkand (Wells et al. 2001), it has recently been found in Teheran and Isfahan with a mean frequency of 17.6% (Nasidze et al. 2004). There were also appreciable differences for several markers, including M170, among Kurdish populations that could not clearly be attributed to geography or language (Nasidze et al. 2005). Small sample sizes for some of the populations could be one of the causes of that heterogeneity. AMOVA of the 14 populations studied showed that 89% of the variance is found within populations and 11% among populations. The significant overall FST=0.107 (P<0.001) also points to notable geographic structuring. This regional heterogeneity is confirmed by the pairwise genetic distances between populations at both high and low levels of haplogroup resolution (Table 2). Again, the difference between Amman and the Dead Sea is evident, as are their relationships to the other populations. While the Dead Sea sample shows significant differences in all pairwise comparisons, Amman does not differ significantly from its geographic neighbors Syria and Palestine using full-resolution data, nor from Iraq, Lebanon, Syria, and Palestine when collapsed haplogroup information is used. Note that, as expected, significance levels are higher when the total haplogroup information is taken into account. Figure 2 shows the results of PC analysis with both sets of data. In general, results are highly congruent: the first component clearly separates the eastern regions of the Middle East (Pakistan, Iran, and Kurds) from the African samples and Oman.
The main haplogroups responsible for the positive displacement are R-M17, H-M52, P-M45, and L-M20, with high frequencies in areas to the northeast of the Middle East, and the negative displacement is mainly driven by African haplogroups. The second component again displaces Oman, Somalia, and Egypt, mainly due to their relative abundance of lineages of sub-Saharan African origin, leaving a core of samples that includes all Levantine and Fertile Crescent populations and Greece. Once more, the outlier position of the Dead Sea sample is reflected. Furthermore, after Somalia (0.387), this area has one of the lowest genetic diversities of the region (Table 1).
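The 89%/11% AMOVA partition and the overall FST=0.107 reported in this section are two readings of the same quantity. As a worked check, assuming a one-way design with a single among-population variance component and a within-population component (the symbols below are introduced here for illustration and do not appear in the original analysis):

```latex
\[
\sigma_T^2 = \sigma_a^2 + \sigma_w^2 ,
\qquad
F_{ST} = \frac{\sigma_a^2}{\sigma_T^2} = 0.107 \approx 11\% ,
\qquad
\frac{\sigma_w^2}{\sigma_T^2} = 1 - F_{ST} = 0.893 \approx 89\% .
\]
```

Here \(\sigma_a^2\) is the among-population component, \(\sigma_w^2\) the within-population component, and \(\sigma_T^2\) their sum.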

Table 2 Pairwise FST values between populations without (above diagonal) and with (below diagonal) collapsed data. ns not significant
Fig. 2

Principal component plots based on Y-chromosome haplogroup frequency data in Middle East and adjacent populations: a expanded data; b collapsed data

Discussion

Intrapopulation differentiation in Jordan

As Bedouin tribes had an important role in the colonization of southeast Jordan, the haplogroup composition of the Dead Sea might be expected to reflect genetic affinities with them, but that is not the case. The most striking characteristic of the Dead Sea sample is the high prevalence of R1*-M173 lineages (40%), in contrast with their absence, and that of the derived R1b3-M269 lineages, in the Bedouin of Nebel et al. (2001) and with their low frequency in Amman. It is worth mentioning that until now, similar frequencies for R1*-M173 have only been found in northern Cameroon (Cruciani et al. 2002). The possibility that the Dead Sea and Cameroon are isolated remnants of a past broad human expansion deserves further study. Interestingly, when the molecular heterogeneity of the G6PD locus was compared between the Amman and the Dead Sea samples, a lower number of different variants and a higher incidence of the African G6PD-A allele were detected in the latter (Karadsheh, personal communication). Another singularity of the Dead Sea is its high frequency (31%) of E3b3a-M34, a derived clade of E3b3-M123 that is found in only 7% of Bedouins (Cruciani et al. 2004). Until now, the highest frequencies for this marker (23.5%) had been found in Ethiopians from Amhara (Cruciani et al. 2004). In contrast, most Bedouin chromosomes (63%) belong to haplogroup J1-M267 (Semino et al. 2004) compared with 9% in the Dead Sea. All this evidence points to the Dead Sea as an isolated region, perhaps with past ties to sub-Saharan and eastern Africa. Strong drift and/or founder effects might be responsible for its present anomalous haplogroup frequencies. This strong microgeographic differentiation has been previously detected in several Middle Eastern areas, such as the Caucasus (Weale et al. 2001; Nasidze et al. 2004), Pakistan (Qamar et al. 2002), Anatolia (Cinnioglu et al. 2004), or among Kurdish populations (Nasidze et al. 2005).
However, it seems that when several close rural isolates are pooled into larger samples or when samples are obtained from large urban centers, those sharp microdifferences disappear, giving rise to the smooth, clinal gradients usually detected in past global Y-chromosome analyses (Rosser et al. 2000; Semino et al. 2000; Wells et al. 2001). This is the case for the Amman sample, which fits well within its geographic region and therefore seems to be representative of Jordan.

Y-chromosome haplogroups in the Middle East

As in all countries of the Levant or the Fertile Crescent, the most prevalent haplogroup in Jordan is J (56%). Its two main subclades (J1-M267 and J2-M172) show opposite latitudinal gradients in the Middle East: J1-M267 is more abundant in adjacent southern areas such as Somalia, Egypt, and Oman, while J2-M172 is more common in the Levant, including Syria, Iraq, and Lebanon, with decreasing frequencies northward to Turkey and the Caucasus. The fact that J1-M267 and J2-M172 frequencies decrease with distance from the Levant in all directions reinforces this region as the most probable origin of their dispersals (Semino et al. 1996; Rosser et al. 2000; Quintana-Murci et al. 2001). Recently, J2 has been further subdivided into several subclades (J2e-M12, J2f-M67, J2f1-M92) that show additional geographic substructure (Scozzari et al. 2001; Cinnioglu et al. 2004; Di Giacomo et al. 2004; Semino et al. 2004). The J2e-M12 clade has been detected from the Iberian Peninsula to Central Asia and from India to Russia (Scozzari et al. 2001; Kivisild et al. 2003) but always at low frequencies. In Europe, a possible focus of dispersion has been localized in the southern Balkans (Semino et al. 2004). In the Levant, it was found in Palestinians (Scozzari et al. 2001) at frequencies similar to that in our Jordan sample (3%) and could represent an Aegean influence on this area. J2f-M67 is also a widespread clade that is most frequent in the Caucasus (Semino et al. 2004), whereas its derivative J2f1-M92 shows a peak of radiation in western Turkey (Di Giacomo et al. 2004; Semino et al. 2004). The fact that all our J2f-M67 chromosomes from Jordan are ancestral for M92 points to a northeastern and not to an Aegean influence.

Haplogroup E3b is the second most abundant in Jordan (18%). E3b2-M81 could indicate a rather recent gene flow from North Africa (Cruciani et al. 2004; Semino et al. 2004), while E3b1-M78 and E3b3-M123 are probable signals of Paleolithic dispersals from eastern Africa to the Levant through Egypt (Underhill et al. 2001; Cruciani et al. 2004; Luis et al. 2004). Secondary diffusions from the Middle East must have happened in order to explain the wide Eurasian distributions of these clades. The E3b3a-M34 derivative of E3b3-M123 testifies to one of these reexpansions from the Middle East to Africa and western Asia (Cruciani et al. 2004). In fact, four of the five E3b3-M123 chromosomes detected in Jordan belong to the E3b3a-M34 subclade.

P-92R7 is the third most numerous haplogroup in Jordan (12%), but most chromosomes were characterized as R1*-M173, R1a1-M17, and R1b3*-M269. It has been proposed that P-92R7 emerged in Central Asia in the Paleolithic period and that the R1*-M173 branch traces its most ancient westward expansion (Underhill et al. 2001; Wells et al. 2001). This expansion reached west Africa, where the undifferentiated R1*-M173 has been detected at high frequency (Cruciani et al. 2002). The presence of this clade in Oman may suggest the possibility of a southern route involving the Horn of Africa. However, the lack of R1*-M173 in Somalia (Sanchez et al. 2005) and its presence in Jordan and Egypt point to the Levant as the alternative bridge of passage of R1*-M173 to Africa (Luis et al. 2004). All these results point to the Levant as both a crossroads of migrations and a main focus of expansions.

Finally, the microgeographic differentiation detected in different regions, including the Levant, and the local and global clines observed for the Y-chromosome could be modeled as a landscape of isolated human spots with occasional demic flashes interconnecting them. Climatic changes, resource exhaustion, or technical advances might be some of the causes that prompted those expansions. This genetic model broadly parallels the archaeological record in the Levant (Redman 1978).