Introduction

Hybrid zones can be regarded as natural filters for the identification of genes that distinguish closely related populations or species (Barton and Bengtsson, 1986; Arnold, 1997; Martinsen et al., 2001; Wu, 2001; Orr et al., 2004; Wu and Ting, 2004). When two hybridizing populations harbour alternative sets of coadapted alleles at different loci, hybrids may suffer reduced fitness because the allelic combinations specific to one population do not function properly when arrayed against the novel genetic background of the other population (Bateson, 1909; Dobzhansky, 1936; Muller, 1942). This type of negative epistasis should result in reduced levels of introgression at the interacting genes relative to a genome-wide average for unlinked marker loci (Payseur et al., 2004; Dopman et al., 2005; Payseur and Hoekstra, 2005). Alternatively, when alleles that are specific to one population are recombined against the genetic background of another population, the novel combinations of alleles at different loci may provide an important source of adaptive genetic variation (Barton and Hewitt, 1985; Woodruff, 1989; Rieseberg et al., 1996; Arnold et al., 1999, 2001; Allendorf et al., 2001). This type of positive epistasis should result in increased levels of introgression at the interacting genes relative to a genome-wide average for unlinked marker loci. Introgression in hybrid zones may also be promoted by other mechanisms, such as additive effects of alleles at loci controlling fitness-related traits (for example, Rieseberg et al., 2003; Martin et al., 2006).

In Europe, most hybrid zones are the result of secondary contact between phylogenetically distinct populations (or ‘phylogroups’) that were isolated in allopatry during the glacial cycles of the Quaternary (Hewitt, 1996). The European rabbit (Oryctolagus cuniculus) originated in the Iberian Peninsula and is composed of two divergent phylogroups that correspond to the subspecies O. c. algirus and O. c. cuniculus. Analysis of mtDNA variation has revealed two highly divergent lineages with distinct geographic distributions consistent with an allopatric origin: type A (predominant in O. c. algirus) in southwest Iberia and type B (predominant in O. c. cuniculus) in the northeast and the rest of the colonized areas (Biju-Duval et al., 1991; Monnerot et al., 1994; Branco et al., 2000). Levels of divergence at mitochondrial and X-linked loci suggest that the split between the two groups dates back to the early Pleistocene, 2 million years BP (Biju-Duval et al., 1991; Geraldes et al., 2006), and spatial patterns of variation in mtDNA, Y-chromosome and two centromeric X-chromosome markers reveal a narrow, sharply delineated zone of secondary contact between the two phylogroups in central Iberia (Branco et al., 2002; Geraldes et al., 2005, 2006). In contrast to the patterns revealed by uniparentally inherited, haploid markers, multilocus surveys of nuclear-encoded allozyme variation have suggested a more recent divergence time of 275 000–550 000 years BP and also reveal a much broader zone of admixture (Ferrand and Branco, 2007).

In comparisons between O. c. algirus and O. c. cuniculus, surveys of allozyme variation at 21 polymorphic loci revealed a surprisingly large range of divergence levels (mean FST=0.16, range 0–0.54; Branco, 2000; Campos et al., 2007). This broad range of FST values presumably reflects stochastic variation due to drift during the period of allopatric divergence, as well as locus-specific differences in ongoing rates of introgression across the contact zone. Interestingly, the two loci that fell at opposite ends of the distribution of FST values, haemoglobin α-chain (HBA) (FST=0.54) and haemoglobin β-chain (HBB) (FST close to zero; Campos et al., 2007), encode interacting subunits of the same multimeric protein. Specifically, the HBA and HBB genes encode the α- and β-chain subunits of the tetrameric haemoglobin protein. In rabbits, as in all other amniote vertebrates, the α- and β-globin genes are located on different chromosomes (Xu and Hardison, 1989, 1991). The unusually high level of differentiation at HBA is attributable to the fact that the two main protein electromorphs exhibit pronounced allele frequency differences between the two subspecies: the HBA*1 allele is present at high frequency in O. c. cuniculus to the north, and the alternative HBA*2 allele is present at high frequency in O. c. algirus to the south. Differences in electrophoretic mobility of the two HBA alleles are attributable to a set of three amino acid polymorphisms that are in complete linkage disequilibrium with one another: HBA*1 is defined by the three-site amino acid haplotype, α29Val/48Phe/49Thr and HBA*2 is defined by the alternative haplotype, α29Leu/48Leu/49Ser (Hunter and Munro, 1969; Hardison et al., 1991; N Ferrand, unpublished).

The unusually low level of differentiation at HBB is attributable to the fact that the two main protein alleles, HBB*1 and HBB*2, are present at frequencies of 0.10 and 0.90, respectively, across the entire range of the species. This uniform pattern of allele frequency variation is all the more remarkable given the pronounced level of genetic subdivision revealed by other unlinked markers (Branco et al., 2000; Geraldes et al., 2005, 2006; Ferrand and Branco, 2007). Similar to the case with HBA, differences in electrophoretic mobility of the two HBB alleles are attributable to a set of four amino acid polymorphisms that are in complete linkage disequilibrium with one another: HBB*1 is defined by the four-site amino acid haplotype, β52His/56Ser/76Asn/112Val and HBB*2 is defined by the alternative haplotype, β52Asn/56Asn/76Ser/112Ile (Galizzi, 1970; Bricker and Garrick, 1974; Campos et al., 2007). The fact that the two major alleles at both HBA and HBB are distinguished by three and four replacement substitutions, respectively, suggests that both polymorphisms are fairly old and are probably related to allopatric divergence during the Quaternary.

The objective of this study was to assess whether it is necessary to invoke some form of natural selection to account for the contrasting spatial patterns of allele frequency variation at the HBA and HBB genes. We also assessed whether epistasis between the two unlinked genes may help explain the observed patterns of geographic variation. We conducted a multilocus analysis using a set of 25 polymorphic allozyme loci and a set of 15 population samples from the Iberian Peninsula. Our results suggest that the contrasting levels of spatial differentiation at these two globin genes cannot be reconciled under a neutral model of population structure, and that patterns of variation in the α- and β-chain subunits of rabbit haemoglobin have been shaped by different modes of selection.

Materials and methods

Sampling

The data used in this study resulted from a long-standing project on genetic characterization of wild and domestic European rabbit populations developed in CIBIO laboratories. We analyzed data from a total of 324 wild rabbits that were collected from 15 localities across the Iberian Peninsula (Figure 1). The sampling localities cover the range of both subspecies as well as the zone of secondary contact between them. Previous allozyme surveys demonstrated that these samples cluster into clearly delineated groups that are referable to the subspecies ‘algirus’ in southwest Iberia (Doñana, Huelva, Infantado, Las Lomas, Santarém and Vila Viçosa), and the subspecies ‘cuniculus’ in northeast Iberia (Lérida, Navarra and Tudela; Ferrand and Branco, 2007). These two subspecies also represent the putative ancestral populations that were isolated during the glacial periods. A third group, ‘hybrid’, represents the populations from the zone of secondary contact in central Iberia (Bragança, Idanha, Badajoz, Cabreira, Toledo and Alicante).

Figure 1

Distribution map of the three population groups used in the multilocus analysis of 25 allozyme loci in the European rabbit. Group algirus (dark grey shaded area): Santarém (San), Infantado (Inf), Vila Viçosa (VV), Huelva (Hue), Doñana (Don) and Las Lomas (Ll). Group cuniculus (light grey shaded area): Navarra (Nav), Tudela (Tud), Lérida (Lle). Populations from the admixture area (white area): Cabreira (Cab), Bragança (Bra), Idanha (Id), Badajoz (Ba), Toledo (Tol) and Alicante (Alt).

Electrophoretic analysis

We collected polymorphism data for a total of 25 loci (Table 1). Thirteen loci were analyzed using starch gel electrophoresis: adenosine deaminase (ADA), diaphorase I, galactose-1-phosphate uridyltransferase (GALT), haemoglobin α-chain (HBA), haemoglobin β-chain (HBB), mannose phosphate isomerase (MPI), nucleoside phosphorylase (NP), peptidase A (PEPA), peptidase B (PEPB), peptidase C (PEPC), peptidase D (PEPD), phosphogluconate dehydrogenase (PGD) and phosphoglucomutase 2 (PGM2). Three loci were analyzed using agarose gel electrophoresis: carbonic anhydrase I (CAI) and II (CAII) and transferrin (TF). Six loci were screened using isoelectric focusing: acid phosphatase 3 (ACP3), albumin (ALB), vitamin D-binding protein (GC), glucose phosphate isomerase, haemopexin (HPX) and isocitrate dehydrogenase (IDH). Finally, hybrid isoelectric focusing was used in the analysis of three additional loci: antithrombin III (AT3), properdin factor B (BF) and haptoglobin (HP) (see Table 1 and references therein for details).

Table 1 Protein locus, chromosome location and separation systems used in the survey of outlier loci in natural populations of the European rabbit

Codominant segregation of alleles has been verified for all loci used in this analysis (Ferrand, 1995; unpublished results). To determine whether observed genotypic proportions deviated from Hardy–Weinberg expectations, we applied a randomization test to the Weir and Cockerham (1984) estimator of the inbreeding coefficient f (=FIS). Confidence limits and probability values for f were calculated by permuting alleles within individual samples 1000 times. To test for linkage disequilibrium (LD) between each pairwise combination of loci, we used a contingency table test on diploid genotypes based on the log-likelihood ratio G-statistic. Bonferroni corrections were applied to the results as appropriate (Rice, 1989). All analyses were implemented in the Fstat v2.9.3.2 program (Goudet, 2001).
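The logic of the randomization test for Hardy–Weinberg proportions can be illustrated with a minimal Python sketch. It is not the Fstat implementation: for brevity it assumes the simple estimator f = 1 − Ho/He (with Nei's unbiased He) in place of the Weir and Cockerham (1984) estimator, and the function names `fis` and `permutation_p` are hypothetical.

```python
import random


def fis(genotypes):
    """Simplified inbreeding coefficient f = 1 - Ho/He for one locus in one sample.

    genotypes: list of (allele, allele) tuples, one per diploid individual.
    Assumption: this is NOT the Weir and Cockerham (1984) estimator.
    """
    n = len(genotypes)
    alleles = [a for g in genotypes for a in g]
    freqs = {a: alleles.count(a) / (2 * n) for a in set(alleles)}
    # Expected heterozygosity with the small-sample correction 2n / (2n - 1).
    he = (2 * n / (2 * n - 1)) * (1 - sum(p * p for p in freqs.values()))
    # Observed heterozygosity: proportion of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    return 1 - ho / he if he > 0 else 0.0


def permutation_p(genotypes, n_perm=1000, seed=0):
    """Observed f and a one-sided P-value for heterozygote deficit.

    Alleles are shuffled among individuals within the sample, which mimics
    random union of gametes, and f is recomputed for each shuffled sample.
    """
    rng = random.Random(seed)
    observed = fis(genotypes)
    alleles = [a for g in genotypes for a in g]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(alleles)
        shuffled = list(zip(alleles[0::2], alleles[1::2]))
        if fis(shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so that P is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `permutation_p` on the genotypes of one locus in one locality returns the observed f and the proportion of permuted samples with an f at least as large. A test for heterozygote excess would count permutations with f less than or equal to the observed value instead.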

Additionally, we specifically tested the null hypothesis that genotypes at HBA and HBB are independent of one another. Using the program GenePop web version 3.1c (updated from version 1.2; Raymond and Rousset, 1995), we tested for evidence of pairwise LD by performing a Fisher's exact test on contingency tables of diploid genotypes.
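A test of independence between diploid genotypes at two loci can be sketched as follows. This is an illustration only: it computes the log-likelihood ratio G-statistic on the two-locus genotype table and obtains a P-value by Monte Carlo permutation, which stands in for (and is not the same algorithm as) the tests implemented in Fstat and GenePop. The names `g_statistic` and `independence_test` are hypothetical.

```python
import math
import random
from collections import Counter


def g_statistic(pairs):
    """Log-likelihood ratio G for a contingency table of paired genotypes.

    pairs: list of (genotype_at_locus_1, genotype_at_locus_2), one per individual.
    """
    n = len(pairs)
    joint = Counter(pairs)
    row = Counter(a for a, _ in pairs)
    col = Counter(b for _, b in pairs)
    g = 0.0
    for (a, b), obs in joint.items():
        expected = row[a] * col[b] / n  # expected count under independence
        g += 2.0 * obs * math.log(obs / expected)
    return g


def independence_test(pairs, n_perm=1000, seed=0):
    """Observed G and a permutation P-value for genotypic association."""
    rng = random.Random(seed)
    observed = g_statistic(pairs)
    left = [a for a, _ in pairs]
    right = [b for _, b in pairs]
    hits = 0
    for _ in range(n_perm):
        # Shuffling one locus breaks any association but keeps both marginals.
        rng.shuffle(right)
        if g_statistic(list(zip(left, right))) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

For HBA and HBB, each element of `pairs` would hold one individual's genotype at the two loci, for example `("1/2", "2/2")`. The genotype labels are arbitrary as long as they are used consistently.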

Simulation analysis

For the ‘algirus’ and the ‘cuniculus’ groups, single-locus measures of genetic differentiation were obtained using the θ (=FST) estimator of Weir and Cockerham (1984). For each data set, locus-specific departures from neutral expectations were tested by comparing observed FST values (conditioned on heterozygosity) to a null distribution generated by a coalescent-based simulation model (Beaumont and Nichols, 1996; Beaumont and Balding, 2004). To generate the null distributions, we used a nonequilibrium model of population structure that incorporated the history of divergence between the two subspecies as well as internal subdivision within each subspecies. Specifically, we considered a scenario in which two populations (=O. c. cuniculus and O. c. algirus) diverged from a panmictic ancestral population at a specified time in the past. We used an iterative fitting procedure to generate the expected neutral distribution for the total sample. In each set of iterations, divergence times were varied between 2.5 × 10⁵ and 5 × 10⁶ years BP to produce a null distribution in which equal numbers of loci fell above and below the median quantile. This range of divergence times for the two subspecies spans the range of empirical estimates based on mtDNA (Biju-Duval et al., 1991; Branco et al., 2000), X-linked markers (Geraldes et al., 2006) and nuclear-encoded allozymes (Ferrand and Branco, 2007). In the simulation model, each of the two subspecies was subdivided into a network of equal-sized demes that were interconnected by ongoing migration. Within each of the two subspecies, the rate of migration among demes was set equal to the value that produced the observed level of differentiation under an island model of population structure. This model allowed us to account for the effects of ongoing migration in shaping patterns of allele frequency variation within each subspecies, which was not possible, for example, using the outlier-detection approach of Vitalis et al. (2001), which is based on a drift and divergence model of population structure. We also modelled the effects of bottlenecks following the initial sundering of the ancestral population into two refugial isolates. At the time of the initial divergence, each of the two populations underwent a 5- to 10-fold reduction in population size. The duration of the bottleneck was varied between 10 000 and 500 000 years, and was followed by a stepwise increase to the contemporary population size. Coalescent simulations were conducted under the infinite alleles model (IAM) as well as the stepwise mutation model (SMM). In each set of iterations, sample sizes were set equal to the median of actual sample sizes in the specific data set under consideration. Coalescent simulations were used to generate a total of 50 000 paired values of FST and H, which were then used to compute the 0.975, 0.50 and 0.025 quantiles of the conditional distribution (Beaumont and Nichols, 1996; Storz and Nachman, 2003; Storz and Dubach, 2004). Loci with FST values that exceeded the 0.975 quantile of the null distribution were considered candidates for spatially varying selection (that is, the observed heterogeneity in allele frequencies exceeded neutral expectations). Conversely, loci with FST values that fell below the 0.025 quantile of the distribution were considered candidates for spatially uniform selection (that is, the observed heterogeneity in allele frequencies fell below neutral expectations).
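The final classification step, in which an observed FST is compared with the quantiles of simulated values of similar heterozygosity, can be sketched as follows. The sketch assumes that the simulated (H, FST) pairs have already been produced by a separate coalescent program and does not attempt the coalescent simulation itself. The fixed heterozygosity window, the interpolation rule and the names `quantile` and `classify_locus` are illustrative assumptions rather than details of the published method.

```python
def quantile(sorted_vals, q):
    """Quantile of an already sorted list, using linear interpolation."""
    if not sorted_vals:
        raise ValueError("no simulated values fall in the heterozygosity window")
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def classify_locus(h_obs, fst_obs, simulated, window=0.05):
    """Classify one locus against simulated neutral (H, FST) pairs.

    simulated: iterable of (heterozygosity, fst) tuples from a neutral model.
    window: half-width of the heterozygosity band used for conditioning
            (an assumed smoothing choice, not taken from the source).
    """
    # Condition on heterozygosity: keep only simulated loci with H close to h_obs.
    nearby = sorted(f for h, f in simulated if abs(h - h_obs) <= window)
    lower, median, upper = (quantile(nearby, q) for q in (0.025, 0.5, 0.975))
    if fst_obs > upper:
        label = "candidate: spatially varying selection"
    elif fst_obs < lower:
        label = "candidate: spatially uniform selection"
    else:
        label = "consistent with neutrality"
    return label, (lower, median, upper)
```

The function returns both the label and the three quantiles, so the same call can be used to draw the envelope of the null distribution across a grid of heterozygosity values.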

Results and discussion

Allozyme polymorphism

Diversity measured as the number of alleles was consistently higher in algirus relative to cuniculus (Table 1). This same discrepancy in diversity levels was observed for mtDNA, X-chromosome markers and nuclear-encoded allozymes (Branco et al., 2000; Geraldes et al., 2006; Ferrand and Branco, 2007), and suggests that algirus descended from a refugial population that maintained a larger effective size during the Quaternary glacial cycles.

Samples from all localities conformed to Hardy–Weinberg genotypic proportions, with the exception of the Alicante sample (FIS=0.148; P=0.002). We detected a within-sample deficit of heterozygotes at a total of nine loci (ADA in Cabreira, GALT in Lérida, HBA in Vila Viçosa, HBB in Alicante, HPX in Huelva and Toledo, MPI in Huelva, PEPA in Las Lomas and Idanha, PEPD in Santarém and PGD in Toledo). We detected a within-sample excess of heterozygotes in a single locus × locality combination (AT3 in Tudela). A single pair of loci exhibited a statistically significant level of pairwise linkage disequilibrium in the total sample (CAII × PEPC, P=0.01). However, significant linkage disequilibrium between these two loci was only detected in two of the individual samples: Alicante and Toledo. We also detected significant linkage disequilibrium in the following locus × locality combinations: ALB × AT3, BF × TF and GALT × PEPA in Badajoz, NP × PEPD in Infantado, HBA × MPI, GC × PEPA and ADA × MPI in Las Lomas, HBB × PEPC in Navarra and BF × PEPC and PEPB × PEPD in Santarém.

Levels and patterns of population differentiation

Across the Iberian Peninsula, the weighted mean FST value for all 25 loci was 0.200. All loci exhibited a good fit to the neutral model of population structure, with the exception of HBA and HBB (Figure 2). The HBA locus exhibited the highest level of differentiation (FST=0.465; Table 1) and exceeded the upper 0.975 quantile of the null distribution (P=0.00033 under the IAM and 0.00028 under the SMM). By contrast, the HBB locus exhibited the lowest level of differentiation (FST≈0) and fell below the lower 0.025 quantile of the distribution (P<0.00001 under both the IAM and SMM). For both loci, observed departures from neutral expectations remained statistically significant under both mutation models even at a Bonferroni-corrected α-level of 0.002 (=0.05/25 loci). Locus-specific P-values were nearly identical for simulations under the IAM and SMM. Hereafter, we restrict the discussion to results obtained under the IAM. HBA and HBB also exhibited statistically significant departures from neutral expectations in the simulation models that incorporated population bottlenecks following the initial divergence of northern and southern phylogroups (P<0.0001 in all cases). In simulation models that spanned the full range of bottleneck durations and population size changes (see Materials and methods), the observed FST for HBA consistently exceeded the upper 0.975 quantile of the null distribution and the observed FST for HBB consistently fell below the lower 0.025 quantile.
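The Bonferroni comparison described above amounts to a few lines of arithmetic, sketched here in Python. The P-values are those reported in the text; the HBB entry uses the reported upper bound (P<0.00001) rather than an exact value, and the helper name `bonferroni_alpha` is hypothetical.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Locus-specific P-values as reported in the text.
p_values = {
    "HBA (IAM)": 0.00033,
    "HBA (SMM)": 0.00028,
    "HBB (upper bound, IAM and SMM)": 0.00001,
}

threshold = bonferroni_alpha(0.05, 25)  # 25 loci tested
significant = {name: p < threshold for name, p in p_values.items()}

for name, flag in significant.items():
    print(f"{name}: P = {p_values[name]:.5f}, below {threshold:.3f}? {flag}")
```

All three values fall below the corrected threshold of 0.002, in line with the statement that both loci remain significant after correction.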

Figure 2

Distribution of transformed P-values for all 25 loci under a neutral model of population structure. The left vertical bar depicts the 0.05 cutoff. The right vertical bar depicts the Bonferroni-adjusted cutoff (=0.05/25). Locus-specific FST values are given in Table 1.

The two globin genes, HBA and HBB, also exhibited strongly contrasting patterns of spatial differentiation within the range of O. c. cuniculus in northern Iberia. In the comparisons among the three samples from Lérida, Navarra and Tudela, the weighted mean FST for all 25 loci was 0.212. Of these 25 loci, HBA exhibited the highest FST value (0.646) and HBB exhibited one of the lowest values (≈0). By contrast, in the comparison among the six population samples of O. c. algirus from southern Iberia, the weighted mean FST for all 25 loci was 0.083 and the range was −0.010 to 0.160. FST values for HBA and HBB (0.065 and 0.017, respectively) were not especially high or low relative to the multilocus average.
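As a rough guide to what these within-subspecies means imply about gene flow, the sketch below converts FST into the number of migrants per generation (Nm). It assumes the classical infinite-island approximation FST = 1/(1 + 4Nm); the simulation model described in Materials and methods may have used a different (for example, finite-island) expression, so the numbers are indicative only. The function name `migrants_per_generation` is hypothetical.

```python
def migrants_per_generation(fst):
    """Nm implied by FST under the infinite-island approximation FST = 1 / (1 + 4Nm)."""
    if not 0 < fst < 1:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


# Weighted mean FST values reported in the text for each subspecies.
for label, fst in (("algirus", 0.083), ("cuniculus", 0.212)):
    nm = migrants_per_generation(fst)
    print(f"{label}: mean FST = {fst:.3f} -> Nm of about {nm:.2f}")
```

Under this approximation the reported means correspond to roughly 2.8 migrants per generation among the algirus samples and roughly 0.9 among the cuniculus samples.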

Functional significance of haemoglobin polymorphism

The red blood cells of O. c. algirus primarily contain (HBA*2)2(HBB*2)2 haemoglobin tetramers, whereas the red blood cells of O. c. cuniculus primarily contain (HBA*1)2(HBB*2)2 tetramers. With reference to the amino acid replacements underlying each HBA and HBB variant, β112 is the only residue located in an intersubunit contact surface. Specifically, the β112 residue is located within a 6 Å radius of α117Phe and α122His. Relative to β112Val, the β112Ile mutant is slightly more bulky and slightly more hydrophobic. Mutations in these intersubunit contact surfaces may affect the transition in quaternary structure between the oxy- and deoxy-states of the HB tetramer. However, a more detailed functional analysis will be required to determine whether the different β112 mutants have any effect on the equilibrium ratio of oxy- and deoxy-haemoglobin in the rabbit red blood cells.

The fact that HBA and HBB encode polypeptides that are assembled into the same multimeric protein suggests the possibility that the non-neutral patterns of allele frequency variation observed at each of the two genes may reflect some form of epistatic selection. Since the allosteric mechanism of haemoglobin oxygenation and deoxygenation requires coordinated shifts between the two α/β dimers that comprise each tetramer, one possibility is that one allelic type of α-chain polypeptide only functions well with a particular allelic type of β-chain polypeptide, and vice versa. If this were the case, we might expect to see a highly non-random pattern of association between alleles at each of the two genes. In principle, if selection were sufficiently strong, we might observe repulsion-phase LD between the incompatible HBA and HBB alleles, and coupling-phase LD between the compatible HBA and HBB alleles. We would also expect to see high levels of differentiation at both loci, reflecting the fact that algirus-specific HBA alleles only function properly in combination with algirus-specific HBB alleles and likewise for cuniculus-specific alleles. Interestingly, a case of epistatic selection maintaining linkage disequilibrium between two physically unlinked loci in the European rabbit has already been described (van der Loo et al., 1987, 1996; van der Loo, 1993). However, since we observed no statistically significant LD between the HBA and HBB genes (Table 2), and since only HBA exhibits elevated differentiation between the two subspecies, this type of epistasis does not appear to explain the observed patterns of variation. Overall, patterns of variation at the two loci do not appear to conform to any simple model of two-locus epistatic selection.

Table 2 Probability values for genotypic disequilibrium between HBA and HBB in each population and across all populations

Contrasting patterns of differentiation in HBA and HBB

The high level of differentiation at HBA reflects the fact that each of the two main alleles is either restricted to, or more common in, one population group (HBA*1 in cuniculus and HBA*2 in algirus populations; Figure 3; see also Campos et al., 2007; Ferrand and Branco, 2007). A third allele, HBA*3, is the most common allele in the contact zone and has no association with mtDNA haplotype background (Campos et al., 2007). The prevalence of this allele in the zone of admixture suggests that it is a novel recombinant that originated by crossing-over between the parental alleles, HBA*1 and HBA*2. The creation of novel alleles in hybrid zones has been documented in several species (for example, Woodruff, 1989; Bradley et al., 1993; Godinho et al., 2006; Sequeira, 2006), which, together with its geographical distribution, makes a hybrid origin of HBA*3 a plausible hypothesis.

Figure 3

Geographical distribution of the alleles of HBA (left map) and HBB (right map) detected by electrophoresis in wild rabbit populations from the Iberian Peninsula. Populations and the three geographic groups are as described in Figure 1 except for Portimão (prt) from the algirus group (populations from the dark grey shaded area), Ciudad Real (cre) and Amoladeras (amo) from the admixture area (white area) and Tarragona (tar) from the cuniculus group (populations from the light grey shaded area). In the HBA map, only the three common alleles are represented (see text for details on the other alleles). Adapted from Campos et al., 2007.

HBB is characterized by lower-than-expected FST values, which are nearly zero in subspecies comparisons (Campos et al., 2007). This low level of differentiation reflects a surprisingly uniform distribution of allele frequencies across the species range (Figure 3), especially considering that unlinked markers exhibit high levels of genetic subdivision (Branco et al., 2000; Geraldes et al., 2005).

The two globin genes exhibit patterns of variation that are consistent with entirely different modes of natural selection. However, in each case it is possible that the true target of selection is actually a different, closely linked locus. HBA and HBB are members of the α- and β-globin gene families, respectively. Each family consists of multiple genes that control haemoglobin synthesis during different stages of development (Hardison, 2001). In rabbits, the α-globin gene family spans approximately 60 kb on chromosome 6 (Hardison et al., 1991; Xu and Hardison, 1991) and the β-globin gene family spans 45 kb on chromosome 1 (Margot et al., 1989; Xu and Hardison, 1989). If the observed patterns of variation at the HBA and HBB genes are attributable to hitchhiking associated with selection at linked loci, in each case the true target of selection is likely to be another closely linked globin gene.

In mammals and other vertebrates, haemoglobin polymorphism plays a well-documented role in adaptation to hypoxic environments (Poyart et al., 1992; Weber and Fago, 2004; Storz, 2007; Storz et al., 2007), and in humans, amino acid and deletion polymorphisms in the α- and β-globin genes have been implicated in resistance to malaria (for example, Agarwal et al., 2000; Ohashi et al., 2004; Kwiatkowski, 2005; Williams et al., 2005). In the case of the European rabbit, patterns of HBA and HBB variation across the Iberian Peninsula do not correlate with altitude or any obvious environmental factors. It thus seems unlikely that the HB polymorphism is involved in some form of adaptation to the abiotic environment. Another possibility is that the observed patterns of α- and β-globin polymorphism reflect selection for resistance to blood-borne pathogens, as allelic variation in haemoglobin function is known to play an important role in modulating the reduction–oxidation status of red blood cells. In humans, although the selective agent responsible for the most common haemoglobinopathies in sub-Saharan Africa has long been known, negative epistasis between two of these conditions was reported only recently (Williams et al., 2005). It may therefore prove difficult to unravel the causes of the observed patterns of allele frequency variation at the two globin loci in the European rabbit. Nevertheless, although the mechanisms of selection acting on this system remain to be discovered, and experimental physiological and expression assays have yet to be carried out, the intriguing patterns of variation seen at the rabbit globin loci warrant further study to elucidate the possible causes of fitness variation.