Introduction

All species are subdivided, display population structure and are to some extent influenced by spatially heterogeneous landscapes. Spatial heterogeneity affects, among other things, dispersal and, consequently, gene flow (Holderegger et al., 2006). A central issue is whether changes in landscape features create barriers to gene flow and as such induce population structure. In conservation and management of species, it is important that conservation or management units are defined on relevant biological criteria rather than being arbitrarily defined (Bonin et al., 2007).

To understand dispersal dynamics at a local scale, it is important to understand the patterns and processes of gene flow at larger scales (Manel et al., 2003). For example, in studies of patch occupancy dynamics, such as in metapopulation models, dispersal is a crucial factor (Taylor et al., 1993; Hanski, 2001; Hanski and Ovaskainen, 2003). However, the available knowledge of dispersal is often sparse and becomes a limiting factor when modelling movement patterns or patch occupancy. At large scales, traditional mark-recapture methods for estimating dispersal become unfeasible because an impossibly large number of individuals would need to be monitored. This problem may be overcome by using molecular genetic techniques (Waser and Strobeck, 1998). Landscape genetics has emerged from a combination of spatial statistics, molecular genetic techniques and landscape ecological theories (Manel et al., 2003; Holderegger and Wagner, 2006). This approach uses individuals as the study unit and attempts to address whether geographic and environmental structures affect gene flow and genetic structure.

Classically, genetic studies have used a priori defined populations when studying gene flow. However, identifying populations in advance may be undesirable due to potential biases arising from unidentified migrants and cryptic spatial structure (Rousset, 1999; Sumner et al., 2001; Manel et al., 2003). Analytical tools using Bayesian clustering algorithms to detect population structure use individuals as the study unit and thus require no prior knowledge of discrete populations. In such approaches, it is possible to assign genotyped individuals to k populations, where k may be unknown (Pritchard et al., 2000; Dawson and Belkhir, 2001). Treating allele frequency as a random variable, these methods find units defined by Hardy–Weinberg equilibrium, and any signal of linkage disequilibrium is assumed to be due to population structure rather than physical linkage (Pritchard et al., 2000). Thus it is possible to make inferences about the genetic structure of a sample of individuals (Pritchard et al., 2000). Unknown parameters are integrated out using Markov chain Monte Carlo (MCMC) methods. In a recent extension of Bayesian detection of population structure, the genetic discontinuities between populations are assumed to be spatially organized through the so-called coloured Poisson–Voronoi tessellation (Ripley 1981; Dupanloup et al., 2002; Guillot et al., 2005b), and land cover features associated with genetic discontinuities can then be identified from thematic maps.

In continuous populations, neighbourhood size is a basic population entity (Wright 1943; Slatkin and Barton, 1989), and is usually defined as NS = 4πDσ², where D is the population density and σ² is the mean squared axial distance between related individuals; the σ parameter determines how much genetic differentiation increases with distance (Rousset, 2000; Sumner et al., 2001). From the definition of neighbourhood size, and given knowledge about effective density, it is thus possible to estimate dispersal distance (Rousset, 2000). Under theoretical models of isolation by distance, kinship and a distance measure called Rousset's a (described in Rousset, 2000) are expected to vary approximately linearly with the logarithm of distance (Rousset 1997; 2000). Thus, neighbourhood size can be approximated by using kinship and Rousset's a coefficients (Hardy and Vekemans, 2002). Simulation studies have shown that estimates of a are fairly robust to biases introduced by temporal changes in dispersal, density reduction and spatial expansions with constant density, but less so to variation in mutation rate and density increases in the recent past (Leblois et al., 2003; 2004).

Previous studies of a range of organisms have used a combination of the landscape genetic approach and genetic estimates of dispersal distance to infer levels of population structure and gene flow in wild animals (for example, Coulon et al., 2006; Fontaine et al., 2007; Janssens et al., 2008). In this paper, we attempt a similar approach using data from the hazel grouse (Bonasa bonasia), which is a forest-breeding bird species with poor dispersal ability and ecological features that indicate a widespread, but patchy distribution (Swenson, 1991a, 1991b; Åberg et al., 2000). In Sweden as well as throughout most of its range, the hazel grouse distribution is tied to the occurrence of continuous old-growth coniferous forest. Previous genetic studies have used a phylogeographic approach to detect broad patterns of genetic structure within the entire range of the species (Baba et al., 2002). There are, however, hitherto no studies of genetic structure at the regional level, that is, at intermediate geographic scales. The species is ecologically well studied, and direct estimates of dispersal using radiotelemetry are available (Swenson, 1991a, 1991b; Swenson and Danielsen, 1995). Moreover, density measures and sex ratio estimates from a former study (Swenson, 1991a) allow the calculation of effective density to indirectly estimate dispersal distance of hazel grouse.

In this study, we used 12 microsatellite markers to study genetic differentiation, dispersal and population structure of hazel grouse at a regional scale. The aims of the present study were the following: (1) to quantify basic levels of genetic diversity in Swedish hazel grouse and to test whether genetic distance between individuals increases with geographic distance; (2) to establish the genetic neighbourhood size, estimate gene flow and hence effective dispersal, and compare these estimates with what is known from ecological studies; and (3) to investigate the landscape genetic pattern of hazel grouse in northern Sweden. In doing this, we establish the number of populations (units in Hardy–Weinberg and linkage equilibrium) and study whether any landscape features coincide with genetic discontinuities. Hence this study contributes to the general understanding of how geographic and environmental structures at large scales affect gene flow, dispersal patterns and population structure of a forest-breeding bird species sensitive to a heterogeneous landscape.

Methods

Georeferencing

To study genetic differences and population genetic structure at a regional scale, we used an available tissue sample collection covering the whole of northern Sweden (north of the river Dalälven), assembled and stored by the Swedish Museum of Natural History. The museum collection consisted of hazel grouse wings collected by hunters during 1978–1986 (see Hörnfeldt (1978) for methods). From these wings, samples from 1981 and 1982 were selected because these years included wings from geographic locations that covered most of the distribution in northern Sweden (Figure 1). We genotyped each of the 613 georeferenced individuals in this collection at 12 microsatellite loci to quantify genetic diversity and to infer the number of populations and dispersal distance for hazel grouse in northern Sweden. The aim was to locate genetic boundaries and tie those to geographic structures. The sex ratio in the sample was not known, but the sample is probably dominated by males because of the hunting technique, in which a whistle pipe imitating male song is used. Males are more likely than females to respond to this stimulus (Swenson, 1991a, 1991b). For the 613 hazel grouse wings, location information in the form of coordinates or the name of the sampling locality was available to georeference the samples. One of the map systems (GSD-fastighetskartan) in Sweden divides the land cover into specific maps with 5 × 5 km extents; thus using this system, the georeferencing was limited to a resolution of 5 × 5 km. Using the available centre coordinates of each map, each wing sample was assigned a coordinate.

Figure 1

Distribution of the collected and georeferenced wing sample within northern Sweden (each point can contain more than one individual). Approximately 60°–68° N – 13°–20° E was covered. This is the main taiga region of Sweden, and compared with the southern part of Sweden, the forests are generally less fragmented.

Genotyping

DNA was extracted from the tissue samples by a high salt purification protocol. Genetic variation was determined at 12 microsatellite loci as described in Segelbacher et al. (2000) and Piertney and Höglund (2001). PCR conditions and temperature profiles followed the original publications; with slight modifications, the protocol was adjusted to allow amplification of the 12 loci in three multiplex reactions (Table 1). The markers were chosen and the multiplex protocol was tested so as to allow reliable genotyping of all tetraonid species; although we cannot completely rule out the presence of null alleles and allelic dropout, these were kept to a minimum (J Höglund et al., unpublished data). The post-PCR products were prepared by diluting 1 μl post-PCR product with 9 μl ddH2O. Further, 2 μl of the diluted PCR product was dispensed in 8 μl of ET-ROX size marker and genotyped using MegaBACE (Amersham Biosciences, Buckinghamshire, UK). Allele sizes and genotypes for each sample were scored using the software Fragment Profiler (Fragment Profiler 1.2, Amersham Biosciences, 2003). The scoring was performed by applying a peak filter (Table 1). To check for discrepancies between peak filter results and genotype data, histograms of peak frequencies and bins were used to manually search for peaks indicating alleles.

Table 1 Microsatellite loci used in the study and peak filter settings used in Fragment Profiler to screen for alleles

Data analyses

Using the Excel add-in Micro-Satellite Toolkit (Park, 2001), the data set was checked to confirm that it did not contain non-numeric, non-integer or negative values. Descriptive genetic diversity was quantified as expected heterozygosity (HE), observed heterozygosity (HO), number of alleles at each locus and allele frequencies by using the Micro-Satellite Toolkit. Tests for deviations from Hardy–Weinberg equilibrium and linkage disequilibrium within genetic clusters, and for population differentiation among genetic clusters (see below), were performed using Genepop on the web (http://genepop.curtin.edu.au/). Confidence limits for FIS were obtained with the jack-knifed estimates in Genetix (Belkhir et al., 2000).
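
As an illustration of the diversity measures named above, the following minimal Python sketch computes observed and expected heterozygosity for a single locus. It is not the Micro-Satellite Toolkit implementation; the function name, the genotype values and the use of Nei's unbiased estimator for HE are assumptions made for this example.

```python
from collections import Counter

def heterozygosity(genotypes):
    """Return (H_O, unbiased H_E) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals carrying two different alleles.
    h_obs = sum(a != b for a, b in genotypes) / n
    # Allele frequencies from the 2n sampled gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    # Assumed estimator: Nei's unbiased expected heterozygosity.
    h_exp = (2 * n / (2 * n - 1)) * (1 - sum(p * p for p in freqs))
    return h_obs, h_exp

# Hypothetical genotypes (allele sizes in base pairs) for five birds at one locus.
locus = [(150, 152), (150, 150), (152, 154), (150, 152), (152, 152)]
h_obs, h_exp = heterozygosity(locus)
print(f"H_O = {h_obs:.3f}, H_E = {h_exp:.3f}")
```

Averaging such per-locus values over all loci would give multilocus summaries of the kind reported in the Results.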

Spatial structure and isolation by distance between individuals were analysed with the software SPAGeDi v. 1.2 (Hardy and Vekemans, 2002). We used two estimators of genetic distance between individuals: kinship (Loiselle et al., 1995) and a (Rousset, 2000). Kinship is expected to decrease as the geographic distance between individuals increases whereas the reverse is predicted for a. Spatial autocorrelation correlograms were obtained from SPAGeDi. We obtained variance estimates for each distance class by jack-knifing over loci.
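
The jack-knife over loci mentioned above can be sketched as follows. This is a simplified illustration, not SPAGeDi's procedure: it assumes the multilocus estimate is a plain unweighted mean of per-locus values, and the per-locus numbers are hypothetical.

```python
import math

def jackknife_over_loci(per_locus_values):
    """Delete-one jackknife mean and standard error of a multilocus average."""
    n = len(per_locus_values)
    total = sum(per_locus_values)
    # Leave-one-out estimates: the multilocus mean recomputed without locus i.
    loo = [(total - v) / (n - 1) for v in per_locus_values]
    loo_mean = sum(loo) / n
    # Jackknife variance: (n - 1) / n times the sum of squared deviations.
    var = (n - 1) / n * sum((x - loo_mean) ** 2 for x in loo)
    return loo_mean, math.sqrt(var)

# Hypothetical per-locus kinship estimates for one distance class.
per_locus = [0.02, 0.04, 0.03, 0.05, 0.01]
mean_kinship, se_kinship = jackknife_over_loci(per_locus)
print(f"mean = {mean_kinship:.4f}, s.e. = {se_kinship:.4f}")
```

For an unweighted mean the jackknife standard error reduces to the usual standard error of the mean; the approach becomes more useful when loci are weighted or the statistic is a ratio.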

Neighbourhood size was estimated using a kinship coefficient and Rousset's a coefficient (Rousset, 1997, 2000). Under theoretical models of isolation by distance, kinship is expected to vary approximately linearly with the logarithm of distance, and an estimate of neighbourhood size can likewise be obtained from Rousset's a (Rousset, 1997, 2000). We used these models to approximate neighbourhood size among our samples. Using kinship, NS ≈ −(1 − F)/b_log, where b_log is the slope of the regression of the kinship coefficients on log distance and F is a mean jack-knifed estimate of the inbreeding coefficient (Hardy and Vekemans, 2002). With Rousset's a coefficient, neighbourhood size was approximated by the inverse of the regression slope, NS ≈ 1/b_log, where b_log is here the slope of the regression of a on log distance. Significance of the regression slopes was tested by 105 random permutations of individual locations (similar to a Mantel test) in SPAGeDi (Hardy and Vekemans, 2002).
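
The kinship-based approximation can be sketched in Python as below. The pairwise distances and kinship values are simulated, not taken from the study, and the use of the natural logarithm and of the jack-knifed FIS reported in the Results as the value of F are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pairwise data: distances (km) and kinship values simulated
# with a known slope on ln(distance); real values would come from SPAGeDi.
distance_km = rng.uniform(5, 800, size=2000)
true_slope = -0.006
kinship = 0.03 + true_slope * np.log(distance_km) + rng.normal(0, 0.02, size=2000)

# Slope of the regression of kinship on the logarithm of distance.
b_log, intercept = np.polyfit(np.log(distance_km), kinship, 1)

# Assumed inbreeding coefficient (the jack-knifed mean F_IS from the Results).
F = 0.1632
ns_kinship = -(1 - F) / b_log
print(f"b_log = {b_log:.4f}, NS = {ns_kinship:.1f}")
```

A negative slope gives a positive neighbourhood size; the shallower the decline of kinship with distance, the larger the estimate.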

Neighbourhood size is usually defined as NS = 4πDσ², where D is the population density and σ² is the mean squared axial distance; the σ parameter determines how much genetic differentiation increases with distance (Rousset, 2000; Sumner et al., 2001). Thus, it is possible to estimate dispersal distance if density and neighbourhood size are known. With available data on male hazel grouse density and the information that there were 40 percent more males than females in a study area located near Grimsö research station in south central Sweden (Swenson, 1991a, 1991b), effective density could be estimated. Hence, with estimates of neighbourhood size and effective density, dispersal distances were estimated using both kinship and Rousset's a.

To calculate dispersal distance, we used the density estimates and sex ratio in Swenson (1991a, 1991b) to calculate the effective population size Ne using equation 1. To obtain effective density De, we divided Ne by the size of the study area in Swenson (1991a, 1991b).
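
Equation 1 is not reproduced in this text. The sketch below therefore assumes the standard formula for effective size under an unequal sex ratio, Ne = 4NmNf/(Nm + Nf), which may differ from the equation actually used. The bird counts are hypothetical; only the 40 percent male excess and the 195-hectare study area come from the text.

```python
def effective_size(n_males, n_females):
    """Effective size under an unequal sex ratio (assumed form of equation 1)."""
    return 4 * n_males * n_females / (n_males + n_females)

# Hypothetical census: 5 females and 40 percent more males (7).
n_females = 5
n_males = 7
area_km2 = 1.95  # 195 hectares, the study-area size quoted in the Results

ne = effective_size(n_males, n_females)
de = ne / area_km2  # effective density, individuals per square kilometre
print(f"Ne = {ne:.2f}, De = {de:.2f} per km^2")
```

Under this formula a skewed sex ratio always gives an effective size below the census count, which in turn lowers the effective density.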

To determine population structure and the location of possible genetic boundaries, we used the R package Geneland (R Development Core Team, 2005; Guillot et al., 2005a). This program infers population structure and genetic boundaries by using a Bayesian clustering method (Guillot et al., 2005b, building on work by Pritchard et al., 2000). Georeferenced multi-locus genotyped individuals of unknown origin were (probabilistically) assigned to a population by the use of a Bayesian cluster model, implemented using MCMC methods (Guillot et al., 2005b). The assumption in this model is that each individual is a member of one population at Hardy–Weinberg equilibrium. Within populations, individuals are assumed to be randomly distributed, and linkage disequilibrium due to population structure may exist between loci. The allele frequencies are assumed to follow independent Dirichlet distributions. Furthermore, populations are assumed to be organized by Poisson–Voronoi tessellation (Dupanloup et al., 2002).
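
The spatial prior can be pictured with a small simulation of a coloured Poisson–Voronoi tessellation. This is only a conceptual sketch and does not use or reproduce Geneland: the unit-square domain, the Poisson mean of 8 nuclei and the function name are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Nuclei of the tessellation: a Poisson number of points placed uniformly
# in the unit square, each given one of k population labels ("colours").
k = 2
n_nuclei = max(1, rng.poisson(8))
nuclei = rng.uniform(0, 1, size=(n_nuclei, 2))
colours = rng.integers(0, k, size=n_nuclei)

def population_at(points):
    """Label each point with the colour of its nearest nucleus (its Voronoi cell)."""
    d2 = ((points[:, None, :] - nuclei[None, :, :]) ** 2).sum(axis=2)
    return colours[d2.argmin(axis=1)]

# Hypothetical sampling sites; sites in the same cell share a population label.
sites = rng.uniform(0, 1, size=(10, 2))
labels = population_at(sites)
print(labels)
```

In the Bayesian model the nuclei, their colours and the allele frequencies are all unknown and are updated by MCMC, so the map of population membership is a posterior summary rather than a single tessellation like this one.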

To implement the MCMC method, both priors and initial values for parameters have to be provided. For the number of populations, we assumed a uniform prior between 1 and 100. Each MCMC run was initialized in a state with 10 populations. As no iteration of the MCMC ever reached 100 populations, identical results would have been obtained with flat priors over larger ranges. With the parameters provided, the MCMC was run for 1 000 000 iterations with a thinning of 100 (every hundredth iteration was stored for analysis). To account for uncertainty in the positioning of individuals, we used a prior additive noise blurring of coordinates in the MCMC model (Guillot et al., 2005b). With the estimated number of panmictic populations from the first run, the model was re-run with the number of populations fixed at two, and Voronoi tessellation of observed genetic data resulted in maps of posterior probabilities of population membership. To determine whether any of the potential barriers (for example, rivers, mountains or main roads) coincided with genetic discontinuities, a thematic map (1:200 000) including the potential barriers was compared visually with the tessellation map showing the genetic discontinuities.

We also used the model-based clustering algorithm implemented in Structure v. 2.1 (Pritchard et al., 2000; Falush et al., 2003) to find the most likely number of populations (k). The burn-in period consisted of 100 000 replications, after which 1 000 000 MCMC iterations were run for a number of clusters from k=1 to k=10 under a model assuming admixture and allowing for correlation of allele frequencies between clusters. We adopted the approach suggested by Evanno et al. (2005) to calculate the most likely value of k.
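
The ΔK statistic of Evanno et al. (2005) can be sketched as follows. The log-likelihood values and the number of replicate runs per k are hypothetical, and taking the second-order difference on the run means (rather than run by run) is a simplifying choice of this example.

```python
import numpy as np

# Hypothetical ln P(D|K) values: rows are K = 1..5, columns are replicate runs.
ln_prob = np.array([
    [-5200.0, -5201.0, -5199.5],
    [-5100.0, -5101.5, -5099.0],
    [-5095.0, -5098.0, -5092.0],
    [-5093.0, -5099.0, -5088.0],
    [-5092.0, -5101.0, -5085.0],
])
k_values = np.arange(1, ln_prob.shape[0] + 1)

mean_l = ln_prob.mean(axis=1)
sd_l = ln_prob.std(axis=1, ddof=1)

# Second-order rate of change |L(K+1) - 2 L(K) + L(K-1)| for each interior K.
second_diff = np.abs(mean_l[2:] - 2 * mean_l[1:-1] + mean_l[:-2])
# Delta K: that difference divided by the s.d. of L(K) across replicate runs.
delta_k = second_diff / sd_l[1:-1]
best_k = int(k_values[1:-1][delta_k.argmax()])
print(dict(zip(k_values[1:-1].tolist(), delta_k.round(2).tolist())), "best K:", best_k)
```

Because the statistic needs values at k − 1 and k + 1, it cannot be evaluated at the smallest or largest k tested, so it can never select k = 1.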

Results

The analyses suggest that genetic diversity in Swedish hazel grouse is of the same magnitude as in several other grouse species. The primers, developed for black grouse, capercaillie and chicken, were suitable for hazel grouse as well. Screening the data with Fragment Profiler and Micro-Satellite Toolkit showed that two of the microsatellite markers (ADL257 and TUT2) were monomorphic among the sampled hazel grouse, and they were consequently removed from the data set.

The expected unbiased heterozygosity for the Swedish hazel grouse population was 0.561±0.066 s.d. and the observed heterozygosity was 0.466±0.007 s.d., suggesting that the Swedish hazel grouse do not suffer from loss of genetic diversity compared with other grouse species. Using jack-knife estimators over loci, mean FIS was estimated at 0.1632±0.0545 s.e. The average number of alleles per locus was 10.5±6.04 s.d. We found evidence of two genetic clusters (see below). FIS was significantly different from zero in both of the clusters (0.160 in the ‘northern’ and 0.216 in the ‘southern’ cluster) but no loci were found in significant linkage disequilibrium within clusters after Bonferroni correction. Population differentiation among the clusters was weak but significant (FST=0.0052, χ²=43.270, d.f.=20, P=0.0019).

The kinship coefficient decreased with geographic distance and Rousset's a increased with distance (both significant at P<0.001, Figure 2), allowing the calculation of neighbourhood size and corresponding genetic dispersal distances.

Figure 2

Spatial autocorrelation correlogram for estimated kinship within 10 distance classes (left) and correlogram for the Rousset's a coefficient with 10 distance classes (right). Error bars represent s.e.

Combining the neighbourhood size estimates with the effective density made it possible to estimate dispersal distances. Neighbourhood size determined by the kinship coefficient was 158.27 (min=117.52, max=261.63), and from Rousset's a, neighbourhood size was estimated at 62.85 (min=39.78, max=149.64). We estimated that the effective population size in the study area (195 hectares) of Swenson (1991a) was Ne=10.72, corresponding to De=5.5. Solving NS = 4πDeσ² for the mean axial parent–offspring dispersal distance σ yielded estimates of 1514 m (min=1304, max=1946) and 954 m (min=759, max=1472), respectively.
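
The last step can be checked with a short calculation. The sketch below assumes that De is expressed in individuals per square kilometre (10.72 individuals over 1.95 km², an interpretation not stated explicitly in the text); the function name is illustrative.

```python
import math

def sigma_km(neighbourhood_size, effective_density_per_km2):
    """Solve NS = 4 * pi * De * sigma^2 for sigma, in kilometres."""
    return math.sqrt(neighbourhood_size / (4 * math.pi * effective_density_per_km2))

# Assumed units: Ne divided by the 195 ha (1.95 km^2) study area.
de = 10.72 / 1.95
for label, ns in [("kinship", 158.27), ("Rousset's a", 62.85)]:
    print(f"{label}: sigma = {1000 * sigma_km(ns, de):.0f} m")
```

Under this assumption the calculation gives values within about a metre of the reported 1514 m and 954 m, and applying it to the minimum and maximum neighbourhood sizes yields ranges close to those quoted.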

Using Geneland we found the number of populations to be k=2, with posterior probability very close to 1. Maps of the probability of population membership for each pixel within northern Sweden suggested that one population had a more southern distribution and the other a more northern one, but no sharp border between the populations was evident (Figure 3). We calculated the modal population for each pixel, and this map also shows that there was no clear south–north subdivision between the two populations (Figure 3). Comparing the genetic discontinuities with a thematic map, we found no evidence of geographical barriers. With Structure we also found that the most likely number of clusters (k) was 2 (Table 2). However, most sampled individuals were not assigned unambiguously to either cluster, and the posterior probability of cluster membership varied quite smoothly over individuals (Figure 4).

Figure 3

Tessellation map of the inferred populations suggesting that one population has a more northern distribution and the other has a southern distribution (I). The modal population for each pixel shows the distribution of two populations, although no sharp border is evident (II).

Table 2 Results of the Structure analysis and the calculations to infer the number of clusters (k)
Figure 4

Summary plot of estimates of Q (estimated membership coefficients for each individual in the two clusters) sorted by Q as inferred from the software Structure 2.1. Each individual is represented by a single vertical line broken into two coloured segments with lengths proportional to each of the inferred clusters.

Discussion

We found levels of genetic diversity in Swedish hazel grouse to be of the same magnitude as in other grouse species genotyped at many of the same microsatellite loci (capercaillie Tetrao urogallus: Segelbacher et al. (2003); black grouse T. tetrix: Höglund et al. (2007)). However, the levels of diversity are not strictly comparable between species as it is well known that not only amplification probability but also polymorphism decreases when loci developed in one species are cross-amplified in phylogenetically related species (Primmer et al., 1996). The TUT primers were originally cloned in capercaillie (Segelbacher et al., 2003) and the BG primers in black grouse (Piertney and Höglund, 2001). Hence, it may be expected that for this reason, the reported levels of microsatellite genetic diversity in capercaillie and black grouse may be higher and thus not ascribable to differing population processes among the species. Nevertheless, the levels of genetic diversity reported in this study suggest that Swedish hazel grouse have substantial levels of genetic diversity and do not suffer loss of genetic diversity compared with other grouse species.

We did find a significant FIS (deviation from Hardy–Weinberg expectations) in the entire sample. This is most likely explained by a Wahlund effect (Wahlund, 1928). When two population samples are lumped and analysed for departures from Hardy–Weinberg expectations as a single unit, the number of homozygotes becomes artificially increased because of the hidden population structure. As Geneland and Structure analyses found strong support for two genetic populations in Swedish hazel grouse, the significant FIS may be partly explained by this. However, significant heterozygote deficiency remained within clusters after individuals were assigned to one of the clusters, which argues against this being the sole explanation.
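
The Wahlund effect can be made concrete with a toy calculation. The sketch below assumes a single biallelic locus and two equally sized populations, each in Hardy–Weinberg equilibrium; the allele frequencies are hypothetical and are not estimates from the hazel grouse data.

```python
def wahlund_fis(p1, p2):
    """Apparent F_IS when two equal-sized Hardy-Weinberg populations are pooled.

    p1, p2: frequency of one allele at a biallelic locus in each population.
    """
    # Observed heterozygosity of the pooled sample: mean of the within-population values.
    h_obs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity computed from the pooled allele frequency.
    p_bar = (p1 + p2) / 2
    h_exp = 2 * p_bar * (1 - p_bar)
    return 1 - h_obs / h_exp

# Identical, slightly different and strongly different allele frequencies.
for p1, p2 in [(0.5, 0.5), (0.45, 0.55), (0.2, 0.8)]:
    print(p1, p2, round(wahlund_fis(p1, p2), 4))
```

In this simplified setting the pooled heterozygote deficit grows with the difference in allele frequency between the populations, so weakly differentiated populations produce only a small deficit, in line with the argument that the Wahlund effect is unlikely to be the sole explanation here.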

Kinship coefficients decreased and genetic distance among individuals increased with geographic distance. This allowed us to calculate neighbourhood size and corresponding genetic dispersal distances of roughly 900–1500 m per generation. These estimates are comparable with those from ecological studies. Juvenile dispersal in Fennoscandia has been found to be 800 m (Swenson, 1991b). In the southeastern French Alps, an average dispersal distance of 4 km has been reported for post-juvenile hazel grouse (Montadert and Leonárd, 2006). The longer average distance in France may be because two males dispersed 15 and 29.4 km, respectively. Hence, the median distance of 1.6 km found in the French study is more comparable with our results.

The landscape genetic pattern of hazel grouse in northern Sweden revealed evidence of two genetic populations: one with a more northern distribution and one with a more southern distribution. Similar patterns have been found in many studies of other animals (for example, bears: Taberlet et al. (1995); willow warblers: Bensch et al. (2002); shrews: Andersson (2004)). The proposed explanation for this north–south divide is the post-glacial reinvasion of the Scandinavian Peninsula following the retreat of the inland ice approximately 10 000 years ago. The southern limit of the last large block of inland ice was situated in mid northern Sweden at an approximate latitude of 62° N. The Scandinavian Peninsula was thus re-colonized from two directions: one through a land bridge in the southwest, the other through Finland from the northeast. When the last inland ice finally melted away, previously separated populations came into secondary contact and in some instances formed hybrid zones (see references above). In the case of hazel grouse, we could not detect a sharp hybrid zone. Rather, we interpret the data as evidence of quite substantial migration and clinal variation between two genetic populations.

Both Structure and Geneland analyses suggested two populations (k=2), but the spatial autocorrelation analyses strongly hint at an isolation by distance pattern. Furthermore, when plotting population membership sorted by Q (the most likely population for any individual), the pattern could be interpreted as clinal variation. Neither Geneland nor Structure may perform well when there is clinal variation, isolation by distance (Pritchard and Wen, 2004) and no sharp discontinuities between populations. We interpret the mosaic pattern of the Voronoi tessellation as the result of this clinal variation. With a strong north–south cline, k=2 is a more likely result than k=1. We suggest that the two populations were indeed separated for generations by the previous sheet of inland ice and that this period of separation allowed divergence of the populations. When the two populations came into contact again, they became admixed. With a Voronoi tessellation, each pixel on a map is forced to belong to one of the detected genetic clusters, and without a hybrid zone or a geographical divide, the pattern appears as a mosaic.

We could not find evidence of any landscape features coinciding with genetic discontinuities. Rather it seems as if hazel grouse can disperse rather freely in the boreal taiga zone of Sweden. Earlier studies have indicated that the hazel grouse is a poor disperser (Swenson, 1991b; Swenson and Danielsen, 1995; Åberg et al., 2000) and avoids open land (Sahlsten, manuscript). However, Montadert and Leonárd (2006) did find that radio-tagged hazel grouse could disperse over longer distances than previously thought (mean 4 km) and also over unsuitable habitats. Given that northern Sweden to a large extent is covered by unbroken forests suitable for hazel grouse, it is perhaps not surprising that we do find evidence of substantial dispersal and no sharp genetic boundaries between the populations. The main potential physical barriers to gene flow existing within the studied area are several moderately sized rivers flowing west to east from the Scandinavian mountains to the Baltic Sea. However, none of these appear to impose any barriers to gene flow in hazel grouse. This is somewhat surprising as previous studies have suggested that rivers in Scotland can prevent gene flow in red grouse (Lagopus lagopus scoticus, Piertney et al., 1998).

In summary, we found that there is evidence of a population structure reminiscent of what has been found in many other Scandinavian animals with a basic north–south divide. Genetic distance increased with geographic distance between individuals. However, we could not find any evidence that geographic and environmental structures affected gene flow and dispersal patterns for the forest-breeding hazel grouse. This may suggest that the boreal taiga region of northern Sweden is in general a good habitat to sustain hazel grouse.