Introduction

Sequence analysis of hypervariable region 1 (HVR1) of mitochondrial DNA (mtDNA) is a powerful tool for understanding human history and evolution, at both the global (Vigilant et al. 1991; Jorde et al. 1995) and the regional (Ward et al. 1991; Mountain et al. 1995) level. HVR1, located in the control region (about 400 bases sequenced), is generally used because it shows the highest rate of evolution in the whole mtDNA molecule. A wide range of literature concerning mtDNA mutations is available.

Broad clines from the Levant into northern and western Europe have been observed for many classes of genetic markers (Menozzi et al. 1978; Sokal and Menozzi 1982; Sokal et al. 1989; Richards et al. 1996). This genetic diffusion was highlighted in particular by studies using classical markers, autosomal microsatellites and Y chromosome markers (Cavalli-Sforza et al. 1994; Semino et al. 1996; Malaspina et al. 2000). This gradual diffusion into western and northern Europe undeniably exists, but its origin is still debated.

Because two separate expansions, the Upper Paleolithic and the Neolithic, share the same origin in the Middle East, their differential effects on the genetic makeup of Europe are not completely clear. This study of mtDNA control region sequences in well-defined isolated Mediterranean populations could provide information on the history of these ancient migrations. We investigated isolated populations originating from some major islands of the Western Mediterranean (Sardinia, Corsica and the Balearic Islands), and from some other Mediterranean areas (Andalusia, Tuscany and Morocco).

The HVR1 sequences obtained for these isolates are compared with data published in relation to other populations (Table 1).

Table 1 Populations, sample sizes (N), references and GenBank accession numbers for original sequence data

Genetic characteristics of Mediterranean populations

The sequences of 1,477 individuals were studied. The number of variable sites ranged from 32 (Basques) to 97 (Tunisian Berbers). The proportion of different haplotypes (or sequences) varied from 35.3% (Algeria) to 93.2% (Turkey). Haplotypic diversity (Nei 1984) for each population declined with geographical distance from east to west. The average number of pairwise nucleotide differences decreased from 7.615 in Egypt to 3.112 in Galicia. A Fu test (Excoffier and Yang 1999) was consistent, for all populations, with a model of demographic expansion (data available on http://fst.univ-corse.fr/ heading “La recherche”, sub-heading “Lab génétique humaine”). The neighbor-joining tree (Fig. 1) showed a partition between European populations and North African populations. The Middle East sample and the Sardinian, Corsican and Iberian Peninsula populations formed a homogeneous cluster that appears separated from the one constituted by the Balearic and San Pietro Islands and Northern Italy. Within this cluster, Sicily, Tuscany and Turkey occupy a peculiar position, being strongly separated from Eastern populations. North African populations, particularly Algerians and Berbers from Morocco, showed high variability.
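The diversity parameters reported above can be illustrated with a minimal Python sketch. It assumes aligned sequences of equal length held as plain strings; the function names and the four toy sequences are hypothetical and are not taken from the study's data. Haplotype diversity is computed with the usual unbiased estimator, n/(n-1) × (1 − Σp_i²), and the mean number of pairwise differences is a raw count of mismatched sites averaged over all pairs, with no substitution-model correction, which may differ from the settings used by the authors.

```python
from collections import Counter
from itertools import combinations


def variable_sites(seqs):
    """Count alignment columns in which more than one base is observed."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def haplotype_diversity(seqs):
    """Unbiased haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def mean_pairwise_differences(seqs):
    """Average number of mismatched sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


# Hypothetical toy alignment (not real HVR1 data).
sample = ["ACGT", "ACGT", "ACGA", "TCGA"]

print(variable_sites(sample))                        # 2
print(len(set(sample)) / len(sample))                # 0.75 (proportion of distinct haplotypes)
print(round(haplotype_diversity(sample), 4))         # 0.8333
print(round(mean_pairwise_differences(sample), 4))   # 1.1667
```

With real data, each population in Table 1 would be passed separately, so that the values can be compared along the east to west axis.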

Fig. 1
Neighbor-joining mtDNA sequence tree based on pairwise genetic distances. Numbers at nodes indicate bootstrap values (0–100%) obtained by 1,000 bootstrap replications (Felsenstein 1989; Nei and Kumar 2000; Kumar et al. 2004)

An AMOVA was performed with populations divided into five groups: (1) Corsica–Sardinia, (2) Iberian Peninsula, (3) Continental Italy (including Sicily), (4) North Africa, and (5) Middle East–Turkey. The results showed genetic similarity among Corsica–Sardinia, Iberian Peninsula and Continental Italy (0.143<P<0.264). Maghrebian populations displayed a significant differentiation from western European populations (P<0.001). Eastern populations were differentiated from Iberian, Sardinian and Corsican populations (P<0.001) and were characterized by genetic similarities with Italians (P=0.089) and, especially, with Maghrebians (P=0.123).
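The logic of such a test can be sketched in Python under strong simplifying assumptions. The sketch below is a one-level AMOVA (individuals within groups only), not the hierarchical design described above, and it does not reproduce the software or settings used by the authors. The distance between two sequences is taken to be the raw number of differing sites, the significance is obtained by permuting individuals among groups, and all function names and toy sequences are hypothetical.

```python
import random


def n_diff(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def ssd(seqs):
    """Sum of squared deviations: summed pairwise distances divided by n."""
    n = len(seqs)
    total = sum(n_diff(seqs[i], seqs[j])
                for i in range(n) for j in range(i + 1, n))
    return total / n


def phi_st(groups):
    """One-level Phi_ST for a list of groups (each a list of sequences).

    Assumes at least two groups, more individuals than groups, and
    some variation in the data (otherwise the ratio is undefined).
    """
    p = len(groups)
    n_total = sum(len(g) for g in groups)
    pooled = [s for g in groups for s in g]
    ssd_within = sum(ssd(g) for g in groups)
    ssd_among = ssd(pooled) - ssd_within
    msd_among = ssd_among / (p - 1)
    msd_within = ssd_within / (n_total - p)
    # Average group size corrected for unequal sampling.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (p - 1)
    var_among = (msd_among - msd_within) / n0
    return var_among / (var_among + msd_within)


def permutation_test(groups, n_perm=999, seed=0):
    """Return (observed Phi_ST, permutation P-value)."""
    rng = random.Random(seed)
    observed = phi_st(groups)
    pooled = [s for g in groups for s in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if phi_st(shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy groups (not real data).
toy = [["AAAA", "AAAA", "AAAT"], ["TTTT", "TTTT", "TTTA"]]
print(phi_st(toy))  # 0.8125
```

A large P-value from `permutation_test` would be read, as in the text, as an absence of significant differentiation between the compared groups.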

Paleolithic or Neolithic diffusion?

To date, mtDNA sequence research has been used mostly for understanding human evolutionary history. Studies on Basques (Bertranpetit et al. 1995), Saami (Tambets et al. 2004) or Icelanders (Wittig et al. 2003) are good examples of its application to specific evolutionary events. Genetic variation at the molecular level reflects the demographic history of populations, which may in turn account for their genetic structure. This population history may have influenced a number of parameters, including number of sequences, allelic partition, distribution of mutated sites, amount of sequence divergence, and distribution of pairwise differences.

In this study, we attempted to elucidate ancestral gene flow in the Mediterranean area. Diversity parameters declined from east to west, suggesting that these major mtDNA lineages were brought from the Middle East into Europe. These values are in agreement with an early Upper Paleolithic origin (Torroni et al. 2001; Richards et al. 2000). After the last maximum glaciation (LMG), climatic improvements may have led to these expansions. Continental areas (Sicily included) were reached before islands and mountains. During the Neolithic period, the most important influence from the Middle East was the spread of agriculture 10,000–6,000 years ago. For some authors (Cavalli-Sforza et al. 1994), this spread was the major cause underlying the observed genetic structure. However, different estimates consistently give dates older than the Neolithic expansion into Europe (Excoffier and Yang 1999).

The two major refuges were southwestern Europe (France, Spain) and Ukraine, but other neighbouring refuges could have existed (Torroni et al. 2001). In Europe, after the LMG, the different refuges were connected by several expansion routes. However, these lines of connection remain rather unclear. Our genetic data on current isolated European populations could cast some light on this question.

Some population genetic portraits

Our results supply elements confirming the history of settlement of some Mediterranean areas:

  1.

    Northwest African populations are relatively heterogeneous in their mtDNA sequence pools. These populations have received more gene flow from the East, as evidenced by the neighbor-joining tree (Fig. 1) and genetic parameters.

  2.

    This study shows genetic similarity between Corsica and Sardinia, previously reported using classical and nuclear markers (Vona et al. 1995, 2002, 2003; Tofanelli et al. 2004). We demonstrate the maternal genetic influences of southwestern European refuges during the Upper Paleolithic period. This result is in accordance with the paternal genetic influences shown by the 3.1G Y chromosome haplotype gradient (Malaspina et al. 2000). The genetic similarities among Iberian, Corsican and Sardinian populations seem to be reflected in the high frequencies of the β°39 thalassemic mutation (Falchi et al. 2005).

  3.

    The Balearic Islands show a moderate genetic affinity with Iberian populations (Fig. 1). This could be attributed to the late settlement of these islands, confirmed by archeological data dating the first human traces to 5,000 years ago (Fox et al. 1996). This settlement started from several origins, as confirmed by the amount of haplotype diversity compared with the other Iberian populations. Some analyses with classical markers show a considerable gene flow from European groups (Moral 1985; Miguel et al. 1990; Picornell et al. 1996). Recent studies on mtDNA and microsatellites confirmed this flow (Massanet et al. 1997; Tomas et al. 2000; Picornell et al. 2005). An AMOVA comparing the Balearic Islands with the other Iberian populations and Continental Italian populations points to multiple influences (Balearic Islands vs Iberian populations: 0.39%, P=0.193; Balearic Islands vs Continental Italy, including Sicily: 0.02%, P=0.739).

  4.

    Despite historical population movements from the south to the north of the Mediterranean (e.g. the Moslem invasions of the seventh to eleventh centuries), the Andalusian isolate, geographically near north Africa, shows genetic similarities with Iberian populations. Y haplogroup diffusion highlighted these genetic similarities (Flores et al. 2004). This genetic flow has been explained by the slow Christian reconquest (Comas et al. 2000; Esteban et al. 2004) and by the effect of the geographical barrier imposed by the Mediterranean sea (Plaza et al. 2003).

  5.

    Another remarkable observation relates to the island of San Pietro (Sardinia). Our analyses consistently show that San Pietro mtDNA sequences present features that are genetically closest to those of Continental Italian populations. This is especially apparent (1) in the distribution of pairwise differences, and hence in haplotypic diversity, and (2) in its position in the population tree (Fig. 1). This genetic affinity with Continental Italy is confirmed by an AMOVA comparing the San Pietro population with Corsican and Sardinian isolates, and with Continental Italian populations. The results clearly illustrate a genetic differentiation between San Pietro and other insular isolates (1.08%; P<0.001), and a strong genetic influence of Continental Italian populations (0.03%; P=0.751). These results confirm the effect of historic influences on the genetic structure of San Pietro inhabitants. San Pietro Island was not inhabited until 1736, when King Charles Emanuel III of Savoy offered it to a group of colonists from Tabarka (Tunisia), to which they had emigrated from Pegli (Liguria, Italy) in 1542. San Pietro Island has an interesting and original culture: even the ancient dialect spoken by the inhabitants has remained the same as in Pegli (Vona et al. 1996).

Concluding remarks

The nucleotide sequences of HVR1 mtDNA from ten isolates sampled in the western Mediterranean (Corsica, Sardinia, Morocco, Spain, and Continental Italy) were investigated through extensive analysis of sequence data. Most mtDNA haplogroups coalesced in Paleolithic times, and this has been interpreted as a consequence of expansions from glacial refuges in the LMG period. These data have been regarded as inconsistent with the Neolithic model. This study confirms these interpretations, and also points to local phenomena that still contain Paleolithic characteristics and traces of contacts between specific populations. The archeological and paleoanthropological data have shown Paleolithic human traces in all the studied areas.

This study also clarifies some characteristics of settlement. The Corsican and Sardinian islands, although geographically, culturally and historically close to Continental Italian populations, still show genetic characteristics of the Paleolithic refuges of southwestern Europe. In the same way, the isolated Andalusian population shows more genetic affinity with Iberian populations than with populations close to the Maghreb, in spite of its many historical contacts with the latter over many centuries. The population of the Balearic Islands seems to harbour genetic characteristics consistent with several European flows. However clear this genetic diffusion seems to be, the routes used remain obscure.