Introduction

The greater white-toothed shrew, Crocidura russula Hermann, 1780, is a medium-sized (7–15 g) Palaearctic shrew that is widespread in southern and central Europe, North Africa, and several Mediterranean islands such as Sardinia, Pantelleria, and Ibiza. In the northern part of its range (eg Switzerland), it relies on suitable man-made sites (ie, it is a synanthropic species) to overcome the energetic stress of the cold season (Genoud and Hausser, 1979; Hausser, 1984). Anthropophilic habits and the short range of juvenile dispersal were found to create genetically subdivided populations (Favre et al, 1997; Balloux et al, 1998). Synanthropy also occurs at lower latitudes (eg on Pantelleria, personal observation by Sarà), but there it is not required for winter survival owing to the mild Mediterranean climate.

Three conventional subspecies, C. r. russula in Europe, C. r. yebalensis in Morocco (and probably southern Spain), and C. r. cf. agilis in Tunisia, are normally recognized in the Palaearctic continental range. Previous genetic investigations have shown a strong homogeneity between C. r. russula and C. r. yebalensis (Catzeflis, 1983, 1984; Genoud and Hutterer, 1990), together with a smooth and gradual morphological cline from Morocco to Germany (Sarà and Vogel, 1996). Conversely, C. r. cf. agilis from Tunisia and C. r. cossyrensis from Pantelleria show some differences in karyotype and isoenzymes (Vogel et al, 1992) with respect to the European and Moroccan populations. While exploring the species’ geographical variation throughout its Palaearctic range, Sarà and Vogel (1996) identified a stepped clinal variation (sensu Endler, 1977) from Tunisia to Morocco, with a morphological interruption in Eastern Algeria.

Previous knowledge regarding biochemical and morphological differences suggested testing the hypothesis of two different lineages within the C. russula range, a situation that may also have led to the colonization of northern Europe and the Mediterranean islands by different genetic stocks. To verify the Sarà and Vogel hypothesis, our investigation focused on analyzing two regions of mitochondrial DNA (mtDNA), the control region and the 12S-rRNA gene. In many vertebrates, the mtDNA control region is characterized by the presence of highly variable tandem repeats (Fumagalli et al, 1996; Casane et al, 1997; Lunt et al, 1998). These tandem repeats vary from two to hundreds of base pairs, and can occur in each of the two domains that flank the conserved sequence in the center of the control region (Casane et al, 1997; Zardoya and Meyer, 1998). The majority of the mutations in tandem repeat arrays involve the gain or loss of a single repeated unit, while the overall length of the array of repeat units is constrained (Casane et al, 1997; Estoup et al, 2002). This leads to a high level of homoplasy, which can be expected to cause an underestimation of the distance between populations when the variation is scored as the number of tandemly repeated copies (Broughton and Dowling, 1997; Estoup et al, 2002). In this study, we have used the primary sequences of the single R2 motifs, as recorded by Fumagalli et al (1996), in the right domain of the mitochondrial control region. This domain is adjacent to the tRNAphe gene and contains the site of initiation of H-strand replication (OH). In addition, to further support the hypothesis of a subdivision into two C. russula lineages, we have also analyzed the 12S-rRNA gene, which is generally useful as a marker at taxonomic levels above the population or species level.

Materials and methods

Sample collection

In all, 29 C. russula specimens were obtained from trapping sessions and stored in the theriological collections of museums (Figure 1). Of these, 11 specimens came from Tunisia (six from the small Kneiss Island and five from the Aïn Draham area); five came from Pantelleria, a small island off the coast of Sicily, Southern Italy; two came from the island of Sardinia (S. Pietru); 10 came from Spain (three from Heyos del Espino, four from Eugi, and three from Montseny); and one came from Belgium (Meròn). A Moroccan sample was not available; however, this was not considered a problem, since shrews from this area are known to be genetically homogeneous with the European group (Catzeflis, 1983, 1984; Genoud and Hutterer, 1990).

Figure 1

The greater white-toothed shrew (C. russula) range in the West Palaearctic and the eight sampled localities. (1) Tunisia (Kneiss Island); (2) Tunisia (Aïn Draham); (3) Pantelleria Island (Montagna grande); (4) Sardinia (S. Pietru); (5) Spain (Heyos del Espino); (6) Spain (Eugi); (7) Spain (Montseny); and (8) Belgium (Meròn).

All the specimens were used for sequencing both the 12S-rRNA gene and the control region fragment (right domain). Samples were kept frozen at −20°C until their tissues had been removed; the muscle tissue, which was suitable for mtDNA analysis, was then stored in ethanol at 4°C.

Polymerase chain reaction and sequencing

Total DNA was isolated from skeletal muscle tissue, which had been preserved in ethanol, with a DNeasy Tissue Kit (QIAGEN). The Crocidurinae-specific primers (Fumagalli et al, 1996) LC2, 5′-CAGTCAATGGAATCAGGACATAA-3′, and HSC, 5′-TTGTTTTAGGGGTTTGGCAAGA-3′, were used to amplify a portion of the right domain of the control region, which was downstream of the Conserved Sequence Block 1 (CSB1) and which contained the CSB2. The partial 12S-rRNA gene was amplified using the primers of Bouchon et al (1994): L12, 5′-GAAAAGCTTCAAACTGGGATTAGATACCCCACTAT-3′, and H12, 5′-TGACTGCAGAGGGTGACGGGCGGTGTGT-3′. DNA was amplified in 50 μl reaction volumes containing 2 U of Taq polymerase, 5 μl of a 10 × reaction buffer (200 mM (NH4)2SO4, 750 mM Tris-HCl pH 8.8, 0.1% (v/v) Tween), 4 μl of 25 mM MgCl2, 10 μl of 200 mM dNTPs, 10 pmol of each primer and approximately 200 ng of DNA. The polymerase chain reaction (PCR) amplification conditions were as follows: strand denaturation at 95°C for 5 min, annealing at 55°C (for LC2-HSC) and 50°C (for L12-H12) for 1 min and primer extension at 72°C for 1 min, repeated for 30 cycles. The PCR products were separated in a 2% w/v horizontal agarose gel stained with ethidium bromide. A single band was cut from the standard agarose gel, the PCR fragment was extracted and purified using a QIAquick Gel Extraction Kit (QIAGEN), and the DNA was sequenced in an automated ABI PRISM 3700 sequencer (Applied Biosystems).
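As a quick check on the reaction mix described above, the final concentrations of some components in the 50 μl reaction can be worked out from the stated stock volumes. The short Python sketch below is illustrative only; the helper name `final_concentration` is hypothetical, and the calculation assumes that the MgCl2 is contributed solely by the separately added stock.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution rule C1 * V1 = C2 * V2, solved for the final concentration C2."""
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_conc * stock_volume_ul / total_volume_ul


TOTAL_UL = 50.0  # reaction volume stated in the text

# 4 ul of 25 mM MgCl2 in 50 ul gives 2 mM (assuming no other Mg source)
mgcl2_mM = final_concentration(25.0, 4.0, TOTAL_UL)

# 5 ul of 10x buffer in 50 ul gives a 1x working concentration
buffer_fold = final_concentration(10.0, 5.0, TOTAL_UL)

# 10 pmol of primer in 50 ul is 0.2 pmol/ul, i.e. 0.2 uM
primer_uM = 10.0 / TOTAL_UL

# about 200 ng of template in 50 ul is about 4 ng/ul
dna_ng_per_ul = 200.0 / TOTAL_UL

print(mgcl2_mM, buffer_fold, primer_uM, dna_ng_per_ul)
```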

We observed heteroplasmy in all the specimens (100% as found by Fumagalli et al, 1996); the 29 specimens all had a clear two-band zymogram, and this permitted us to cut a single band on the gel for the analysis. Nucleotide sequences were determined for both strands. The extent of intraindividual length polymorphism was not evaluated. In the following text, we refer to the 5′–3′ DNA orientation of the light strand of mtDNA.

Phylogenetic analysis

The sequences were aligned with the Clustal-W program (Thompson et al, 1994). The control region sequence corresponded to positions 942–1052 in the D-loop region of C. russula in the EMBL databank, Accession Number (AN) X90952 (Fumagalli et al, 1996). Multiple alignments between all individuals were carried out using one single R2 tandem motif for each individual. In this way, all the R2 variants proved to be homologous. The alignment, which would be used as an input for phylogenetic analysis, was constructed manually by repeating these R2 motifs three times for each sequence (as shown in Figure 2), since three was the maximum number of variants scored in a single sample.

Figure 2

Partial sequence (135 bp) of the right domain in the light strand of the control region of C. russula. The sequence corresponds to positions 933–1052 of the Swiss C. russula control region (AN: X90952). The CSB2 of mammals and the repeated 12 bp R2 motifs are shown. The sequences are written as they were aligned, with the R2 motifs repeated three times to allow the alignment of each R2 variant found in our samples; see Table 1. The R2 motifs are written along each line in this order: Spain and Belgium, R2 × 3 (three times); Sardinia, R2-I × 3; Tunisia, R2-I × 2, R2-II × 1; and Tunisia and Pantelleria, R2-I × 1, R2-III × 1, R2-II × 1.

Evolutionary distances for the control region sequences were calculated according to the Kimura two-parameter model (Kimura, 1980). The clustering algorithm used for the tree construction was the Unweighted Pair Group Method with Arithmetic Mean (UPGMA), which assumes an equal rate of evolution along all branches. This assumption is applicable when reconstructing a tree of populations that are thought to be closely related, when short sequences are used, or when using linear distance measures, as is the case with the Kimura two-parameter model (Avise, 1994; Li, 1997).
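The Kimura two-parameter distance mentioned above can be illustrated with a short Python sketch. It uses the standard formula d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences. The function name `k2p_distance` is hypothetical, and skipping gapped or ambiguous positions is an assumption made here for illustration; it is not a description of the settings used in the study.

```python
import math

# Purine <-> purine and pyrimidine <-> pyrimidine changes are transitions
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Kimura (1980) two-parameter distance between two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: positions with gaps or ambiguity codes are ignored
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites    # proportion of transitional differences
    q = transversions / sites  # proportion of transversional differences
    term1 = 1.0 - 2.0 * p - q
    term2 = 1.0 - 2.0 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

A matrix of such pairwise distances is the kind of input that a clustering method such as UPGMA operates on.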

The partial sequence for the 12S-rRNA gene (GenBank AN AF441243) corresponded to positions 455–844 of the mt-12S ribosomal RNA gene of Suncus murinus (Crocidurinae) (AN AB032845). Genetic distances were calculated as the proportion of nucleotide sites at which a pair of sequences differed, and were used to construct a neighbor-joining (NJ) tree (Saitou and Nei, 1987) with the MEGA program version 2.1 (Kumar et al, 2001). The robustness of the inferences was assessed by means of the bootstrap method (1000 replicates) (Felsenstein, 1985).
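The proportion-of-differences distance and the column resampling that underlies the bootstrap can be sketched as follows. This is a minimal illustration in Python, not the MEGA implementation: the function names are hypothetical, and excluding gapped positions pair by pair is an assumption.

```python
import random


def p_distance(seq1, seq2):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped sites are skipped for this pair
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def bootstrap_columns(alignment, rng):
    """Return one bootstrap replicate: alignment columns drawn with replacement."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in columns) for seq in alignment]


# Usage: distances recomputed on each replicate feed one bootstrap tree
rng = random.Random(0)
replicate = bootstrap_columns(["ACGTACGT", "ACGTACGA"], rng)
print(p_distance(*replicate))
```

Repeating the resampling step many times (1000 in the study) and rebuilding the tree on each replicate gives the support values for each node.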

Results

Excluding the size variation due to the tandemly repeated sequences, the amplified fragment of the control region was 135 bp long, and it was located in the right domain between the LC2 and HSC primers, including the CSB2 element (Figure 2). We chose to analyze the right domain on account of its known high divergence in shrew species (Fumagalli et al, 1996). As reported by Fumagalli et al (1996), all 29 individuals had a tandem array of R2 repeat units, which are 12 bp long and correspond to positions 955–966 in the C. russula D-loop region (AN X90952); 12–23 copies of repeats per array were recorded in our specimens. We did not analyze the genetic variation with respect to the array length, given the small sample size investigated and the high level of heteroplasmy recorded in C. russula. The tandemly repeated R2 motifs were of the same length (12 bp) in all the specimens, but they did not show the same primary structure (Table 1). This is the first time that such a polymorphism has been recorded for C. russula.

Table 1 R2 repeated motif variants sequenced in the C. russula specimens

The sequences of the repeated units R2 divide the specimens into two geographic groups. The first includes the Spanish and Belgian specimens (ie the European group and, from our previous assumption, also the northwest-African populations); the fragment of the right domain had an array of tandem repeats with a base composition matching that already described by Fumagalli et al (1996). We observed uniformity in the repeats for the European specimens, except for a single imperfect monomer at the beginning of each repeat array, which was 10 bp in length (5′-ACACACGTGT-3′); it was not included in the fragment alignment given in Figure 2. The existence of the single imperfect monomer flanking the European R2 repeats may suggest that the duplication events leading to the origin of these sequences occurred early in the history of this taxon (Stewart and Baker, 1994).

The second group comprises the specimens from Tunisia, Pantelleria, and Sardinia (ie, the northeast-African populations), characterized by the presence of nucleotide polymorphism in the repeat variants. We recorded three R2 motif variants, named R2-I, R2-II, and R2-III, which were differentiated by two transitions, thereby determining three distinct haplotypes (Table 1, Figure 2). The two specimens from Sardinia showed only the R2-I motifs (in 15–16 copies) at the 3′ end of the array. Nine of the Tunisian specimens showed the R2-I motifs (in 11–18 copies) plus a single R2-II motif at the 3′ end of the array. The five specimens from Pantelleria and the other two from Tunisia (Aïn Draham) also had the R2-I motifs (in 6–10 copies) at the 5′ end, plus the single copy of R2-II at the 3′ end of the array, and, in addition, the R2-III motifs (in 9–11 copies) were always near the 3′ end (Figure 2).

The purine/pyrimidine alternation is characteristic of many repetitive motifs in the control region of mammalian mtDNAs (Ursing and Arnason, 1998), and this feature was also found in the European as well as the northeast-African R2 units, except for the final two bases. As is the case with many other species of shrew (Fumagalli et al, 1996), the R2 array was directly flanked on its 3′ end by other, very short repeats. The (ACGC)2 motif (ie ACGC repeated twice in a 5′–3′ direction) was recorded only in the specimens originating from Spain and Belgium, as already reported by Fumagalli et al (1996). Two others were new: [ATA(AC)2(TG)2] was recorded in the Sardinia sample and [(AC)3(AT)2] in all the specimens from Tunisia and Pantelleria.

The UPGMA tree clustered the distinct groups of mtDNA haplotypes (Figure 3) into two separate lineages, which is congruent with the hypothesis of a geographic separation occurring in this species. The UPGMA separated the two main geographical groups by the two principal R2 motifs, that is, the already known ‘European’ R2 of Fumagalli et al (1996) and the new R2-I motif. The UPGMA tree divided the northeast-African group into three further clusters, characterized by the presence/absence of the R2-II and R2-III variants. The first motif is present only in shrews from Tunisia and Pantelleria, whereas the second, although found in two Tunisian specimens, was fixed in all the individuals from Pantelleria (Figure 3).

Figure 3

Bootstrap consensus tree obtained from the UPGMA clustering of Kimura two-parameter distances for the right domain of the C. russula control region. The different tandemly repeated R2 motifs (see Table 1) are shown where they are inferred to have appeared in the different lineages (black boxes) and, within the northeast-African group, in the different demes (grey boxes). Along the tree, new R2 variants are added to the preceding ones and are retained within the clade.

We further examined the above relationships using both the NJ and the UPGMA methods with several measures of distance other than the Kimura-2P (ie, Jukes–Cantor, Tamura–Nei, Tajima–Nei methods, data not shown). The UPGMA gave exactly the same results for each of the measures used. By contrast, all the NJ trees gave a non-parsimonious result, in which the R2-II sequence would have evolved twice in independent lineages, and this interpretation was therefore rejected.

In addition to the UPGMA cluster of the D-loop sequences, an NJ tree for the 12S-rRNA was constructed. The amplified 12S-rRNA gene fragment was 394 bp in length and was identical among all the specimens within the European group and within the northeast-African group (Figure 4). The differences between the two main geographic groups consisted of five transitions, one transversion, and two indels. The sequences of the 12S-rRNA gene identified two clusters (Figure 5), which corresponded again to the two lineages separated by the R2 and R2-I sequences in the UPGMA dendrogram (Figure 3). The nucleotide sequence divergence of the conserved 12S-rRNA gene separating these two sister groups was about 1%.

Figure 4

Partial sequence (394 bp) of the mitochondrial 12S-rRNA gene sequenced from the northeast-African (Tunisia, Pantelleria and Sardinia OTUs) and the European lineage (Spain and Belgium OTUs) of C. russula (C.r.), aligned to the position 455–844 of Suncus murinus (AN AB032845).

Figure 5

Phylogenetic tree based on the partial 12S-rRNA gene of the northeast-African (including Tunisia, Pantelleria and Sardinia) and the European (Spain and Belgium) lineages of C. russula, with Suncus murinus as outgroup. The tree represents the topology estimated by the neighbor-joining method.

Discussion

The polymorphism in the SSRs of the mitochondrial control region and the 12S-rRNA gene sequences divided the C. russula species into two sister groups, thereby revealing two divergent lineages, the northeast-African and the European (Figures 4 and 5). This fact confirmed the two evolutionary lineages previously hypothesized by Sarà and Vogel (1996). The European clade was characterized exclusively by the R2 unit in the tandem array of the SSRs, while the northeast-African clade was marked by the R2-I sequence. This latter was present in all the 18 northeast-African specimens and could thus be considered as a characteristic marker for this lineage.

The two lineages also differed in the base composition and length of the 12S-rRNA fragment. The evolutionary rate of the mammalian 12S-rRNA gene is less heterogeneous than that of the D-loop region, and its analysis permits us to estimate the date of divergence between two lineages, assuming a common ancestor. We used the conventional rate of 2% mitochondrial sequence divergence per million years between pairs of lineages (Brown et al, 1979) to estimate the putative time of divergence between the two C. russula lineages. This estimate only applies to the first 10 Myr of the species separation process (Avise, 1994). The 1% sequence divergence, as highlighted by the NJ tree, would therefore place the split some 500 000 years ago.
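The dating arithmetic above can be written out explicitly. Because the 2% per million years figure is a rate of divergence between a pair of lineages, the divergence time is simply the observed pairwise divergence divided by that rate. The Python sketch below is illustrative; the function name `divergence_time_myr` is hypothetical.

```python
def divergence_time_myr(pairwise_divergence, rate_per_myr=0.02):
    """Divergence time in Myr from pairwise sequence divergence.

    rate_per_myr is the pairwise divergence rate (0.02 = 2% per Myr,
    the conventional mtDNA rate cited in the text).
    """
    if rate_per_myr <= 0:
        raise ValueError("rate must be positive")
    if pairwise_divergence < 0:
        raise ValueError("divergence cannot be negative")
    return pairwise_divergence / rate_per_myr


# About 1% divergence at 2% per Myr corresponds to about 0.5 Myr
t = divergence_time_myr(0.01)
print(f"{t:.2f} Myr, ie about {t * 1_000_000:,.0f} years")
```

The estimate scales linearly with the assumed rate, so a different calibration would shift the date proportionally.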

This divergence time is congruent with the paleobiogeographical data available. C. russula is an ancient Afrotropical lineage (Maddalena and Ruedi, 1994), and its oldest fossils are recorded in Mid-Pleistocene sites located in Morocco and Western Algeria (Rzebik-Kowalska, 1988). Climatic fluctuations were responsible for important changes in the North African fauna throughout the Quaternary period (Kowalski, 1991). The beginning of the Mid-Pleistocene in Africa (the Taourirtian period) coincided with the Günz glaciation in Europe, and it was characterized by a cool and humid climate (Kowalski and Rzebik-Kowalska, 1991). During the cold Taourirtian period, the entire region of North Africa and the Sahara was characterized by enormous expansions of freshwater masses, that is, the ‘Big Lakes Age’ (Monod, 1963; Faure, 1987). Thereafter, these lakes retracted in the dry Ougartian period, but the process was repeated until the last glacial Würmian acme, when two refugial areas in northwestern and northeastern Africa are known to have existed (Brown and Gibson, 1983). In the last half million years, climatic fluctuations due to the glaciation process may therefore have split the ancestral species’ range into at least two (eastern and western) refuges, thus prompting the separation of the two modern lineages. This ecoclimatic reconstruction can also be applied to other taxa living in the area, such as some amphibians (Anura; Lanza et al, 1986) or insects (Phasmidae; Bullini and Nascetti, 1987). Thereafter, the northwestern population spread from Morocco to Spain through human agency, and from there invaded France (in 3500–2000 BC, according to Poitevin et al, 1986) and North Europe (Vogel et al, 2003). On the other side of the range, the expansion of the eastern lineage was presumably blocked by the Sahara, but further studies on the Libyan C. alexandrisi are required to verify this hypothesis.

The existence of the other two northeastern-African haplotypes describes a pattern that is concordant with geographical isolation from the continental source (Tunisia), where shrews living in distant Sardinia are more divergent and presumably more isolated than those in the closer Pantelleria. Apart from geographical distance, such a population structure could also reflect the recent colonization of Pantelleria and Sardinia. Pantelleria was completely destroyed by volcanic eruptions some 40 000 years ago and the first human settlement dates from no more than 4000 years ago, while the first C. russula evidence in Sardinia dates from 6000 BC (Alcover and Vesmanis, 1985). Differences between populations on an even smaller geographical scale have been reported by Ehinger et al (2002), who suggested that the present-day haplotype distribution of the C. russula mtDNA control region may have been shaped by historical events. Ehinger et al (2002) hypothesized that, during range expansion, rare long-distance colonists created isolated populations, thereby inducing a spatial clustering of genotypes that could persist for hundreds of generations. Balloux et al (1998) further suggested that the species’ breeding strategy (monogamy, male philopatry, and small breeding units) may be responsible for the high level of population differentiation. A single C. russula pair is capable of colonizing empty sites and establishing a persistent population (Vogel, 1999). Some evidence of small and closed breeding units was found in Tunisia, where two shrews trapped in a meadow at Aïn Draham were found to be genetically more similar to the Pantelleria population than to other individuals trapped in bushes a few kilometers away. The distances (3–5 km) between the two trapping sites in the Aïn Draham area are comparable to those reported in Balloux et al (1998).

The above observation is testimony to the capacity of small nuclei of this species, presumably carried involuntarily by human beings, to colonize islands such as Pantelleria and Sardinia.

In conclusion, and in accordance with the phylogeographic pattern identified in this paper, a taxonomic revision is required that ranks C. r. cf. agilis Loche, 1867 from Tunisia as a full species, including C. r. ichnusae Festa, 1912 (Sardinia) and C. r. cossyrensis Contoli, 1990 (Pantelleria).