Introduction

Sex bias in natal dispersal is common; in most mammalian species, males are dispersers whereas females are philopatric, and the opposite trend is shown in birds (Greenwood, 1980). Exploring why the sexes differ in their dispersal patterns can shed light on the evolutionary causes of dispersal in general (Goudet et al., 2002), and accurate characterization of dispersal behavior is integral to our understanding of the social structure, mating system and population genetic structure of a species. Yet detection of sex-biased dispersal can be tricky because a dispersal event may occur once in an animal's lifetime, and such events can be difficult to observe directly.

Measuring dispersal

In the last few decades molecular genetics has provided a means of investigating sex-biased dispersal within and among populations (reviewed in Lawson-Handley and Perrin, 2007). Several powerful approaches have been developed to detect individual dispersers through assignment tests or to characterize general patterns of dispersal through summary statistics of population genetic structure (F-statistics, relatedness). Most of these approaches utilize autosomal microsatellites as molecular markers, either alone (Mossman and Waser, 1999; Petit et al., 2001; Waser et al., 2001; Goudet et al., 2002) or in tandem with a uniparentally inherited marker such as mitochondrial DNA (mtDNA) or a Y-chromosome locus (Girman et al., 1997; Escorza-Trevino and Dizon, 2000). The expectation inherent to all these approaches is that greater genetic structure will be evident in the philopatric sex compared with the dispersing sex, thus comparisons of sex-specific FST estimates should reveal the direction (and suggest the relative strength) of sex-bias in dispersal (Goudet et al., 2002).

Because mitochondrial DNA is matrilineally inherited, it is commonly used to infer female-biased dispersal rates (Prugnolle and de Meeus, 2002). When mtDNA haplotype distribution patterns are examined in isolation, inferences can be made about female dispersal behavior without respect to males, but this approach is qualitative and not widely applied (Hoelzer et al., 1994). However, it is possible to use mtDNA alone to infer the relative dispersal of both sexes by extending methods developed for autosomal, bi-parentally inherited markers. For instance, the comparisons of sex-specific population differentiation from haplotype frequency data can indicate which sex disperses more (Escorza-Trevino and Dizon, 2000; Yang et al., 2003).

Using sex-specific fixation indices to estimate instantaneous dispersal rates

Vitalis (2002) developed a method to quantitatively measure sex bias in instantaneous dispersal rates using data from biparentally inherited markers such as microsatellites. This approach allows the inference of sex-specific dispersal rates by comparing sex-specific estimates of genetic differentiation (FST) measured before and after dispersal. This intuitive method can be further extended to incorporate the hierarchical structure within social species (Fontanillas et al., 2004), as it has been recognized that social organization can strongly influence correlations of gene frequencies (Chesser, 1991; Slatkin and Voelm, 1991; Sugg and Chesser, 1994; Chesser and Baker, 1996; Vigouroux and Couvet, 2000). In this study, we develop and use an extension of Vitalis' (2002) method to estimate instantaneous dispersal rates through analyses of mtDNA haplotype distribution patterns in a social mammal, the collared peccary (Pecari tajacu, family Tayassuidae).

We sampled extensively within three populations separated by long distances, with the goal of quantifying local dispersal among breeding groups within populations. We then compared sex- and age-specific estimates of population differentiation based solely on mtDNA haplotype frequencies, using probability-based estimates of intra-class correlations of gene frequencies among social groups within populations. We used a resampling approach to test for the significance of the observed age and sex bias in dispersal. Last, the fixation indices generated by these analyses were used to calculate single-generation sex-specific dispersal rates. Previously, mtDNA has been used primarily to infer female dispersal patterns, but we show that this matrilineally inherited genetic marker can be used to quantify male dispersal rates in the absence of nuclear population genetic data.

Materials and methods

Study species

The collared peccary is a socially complex, pig-like ungulate that forms stable, mixed-sex herds of 3 to 30 individuals (Sowls, 1978). These groups associate throughout the year and vigorously defend territories against other social groups (Ellisor and Harwell, 1969; Bissonette, 1982; Hellgren et al., 1984). Herds are socially cohesive and attempts to immigrate may be met with aggression, although direct observational data on dispersal behavior are still scarce. Male exchange between groups and solitary wandering of both sexes have been observed, but natal dispersal has not been adequately described (Ellisor and Harwell, 1969; Day, 1985; Gabor and Hellgren, 2000). Previously, few population genetic data existed for P. tajacu (but see Gongora et al., 2006). Theimer and Keim (1994) utilized mtDNA variation to measure sequence divergence and geographic partitioning in Arizona populations, but their samples were not associated with social groups. There was sufficient heterogeneity in mtDNA haplotype distribution to indicate limited female dispersal across regions (rather than among neighboring herds as is considered here), although it was not clear if the patterns observed were also a signature of founding events (Theimer and Keim, 1994).

Sampling

Data were collected from three wild populations of P. tajacu in Texas. In the mid-1990s, 102 whole blood samples were collected from the Chaparral Wildlife Management Area (CWMA) in south Texas (Gabor and Hellgren, 2000). These samples were taken from live-trapped animals from 13 social groups, but not all group members were sampled. In 2005, we collected 31 ear snip tissue samples from live-trapped animals from four groups in the Welder Wildlife Refuge (WWR) in south Texas. In 2006–2007 we similarly sampled 134 animals from 13 groups in Big Bend Ranch State Park (BB) in west Texas, along the Texas–Mexico border. The WWR and BB populations were sampled extensively; every social group at these locations was identified through direct and remote camera observation and trapped in large corrals over several sessions. Groups ranged in size from 2 to 18 animals and mean group size was 8.9. Individuals were uniquely marked with numbered ear tags and the strongest possible effort was made to trap and sample every unmarked individual. All samples include associated data on age class (adult, subadult, juvenile, infant), sex, territory location and social group affiliation. Age class was assigned according to behavior and morphological traits such as pelage, body size and testicular development. Individuals showing immature characteristics such as ginger or spotted pelage, undescended or partially descended testicles, adult-oriented following behavior, or estimated body size of less than 9 kg were classed as infants or juveniles, whereas individuals that weighed 10–13 kg were classed as ‘subadults’ that were on the cusp of sexual maturity. Whole blood samples were frozen at −20 °C, and tissue samples were stored in lysis buffer at room temperature until DNA was extracted for long-term storage at 4 °C.

Genetic analysis

Blood clot samples (0.5 g) were digested by rotating for 12 h at 55 °C in 750 μl of lysis buffer (100 mM Tris-Cl pH 8, 10 mM EDTA, 1% SDS, ddH2O), 40 μl of proteinase K (10 mg ml−1) and 2 μl of streptokinase (10 U μl−1). Tissue samples (5 × 5 mm) were digested by rotating for 24 h at 55 °C in 750 μl of lysis buffer and 20 μl of proteinase K (10 mg ml−1). Genomic DNA was extracted from blood using a standard phenol–chloroform method, and from tissue samples using either a phenol–chloroform–isopropanol method or ammonium acetate method (Sambrook and Russell, 2001). All DNA precipitations were washed twice in 70% ethanol, and DNA pellets were resuspended in 250 μl of TLE (10 mM Tris-Cl, 0.1 mM EDTA). A 449 bp region between sites 15 390 and 15 900 of the collared peccary mtDNA D-loop was amplified from genomic DNA using porcine primers (Alves et al., 2003). This sequence lies in the hypervariable 5′ end of the mitochondrial control region and does not code for any known protein product. PCR volumes were 25 μl and contained final concentrations of the following reagents: 1.5 mM MgCl2; 0.5 μM each primer; 0.21 mM dNTPs; 1.25 U Taq polymerase. PCRs were performed in an Eppendorf MasterCycler using the following temperature profile: denaturation for 3 min at 94 °C, followed by 30 cycles of 94 °C for 4 s, 55 °C for 4 s, and 72 °C for 12 s; finishing with a 15 min extension step at 72 °C. PCR products were cleaned using a low sodium protocol; 28 μl of a mixture containing 500 ml of absolute ethanol and 20 μl of 3 M NaOAc (pH 5.2) was added to each sample, shaken for 15 min, and centrifuged at 2051 g for 35 min. This step was followed by 70% ethanol precipitation under centrifugation (twice) and resuspension in 20 μl ddH2O.

PCR products were then directly sequenced in both directions using Big Dye 3.1 chemistry. Sequencing products were purified using the low sodium protocol described above, and then electrophoresed using an AB Prism 3730XL sequencer (Applied Biosystems, Foster City, CA, USA). Sequence data were aligned and edited with Sequencher 4.5 (Gene Codes). Nuclear copies of mtDNA genes (numts) can greatly confound evolutionary analyses, and we avoided numts using methods described in Triant and DeWoody (2007). For example, a few individuals (<5%) harbored apparently heterozygous sites, so we reamplified their DNA and completely resequenced the amplicons in both directions. In every case, this procedure completely resolved the mismatch and suggested that the initial discrepancy was probably a result of Taq error.

We converted sequences into NEXUS format and imported them into PAUP* 4.0 (Swofford, 2003) for haplotype assignment. Haplotypes were determined through reconstruction of unrooted phylogenetic trees using a neighbor-joining algorithm. Direct sequencing of a sub-set of the CWMA population revealed that some of the mtDNA haplotypes could be discriminated by restriction digest with the MboI enzyme, but all individuals from WWR and BB were typed by direct sequencing.

Statistical analyses

Among-population differentiation

MtDNA haplotype frequencies were calculated by hand for all three populations. Genetic differentiation among populations was inferred from FST estimates (Weir and Cockerham, 1984) and exact tests of population differentiation (Raymond and Rousset, 1995) using the software package Arlequin Version 3.1 (Excoffier et al., 2005). For the latter, P-values were estimated from a Markov chain set to 110 000 steps including 10 000 dememorization steps. All analyses were based on pure haplotype frequency data rather than nucleotide differences.

Within-population differentiation

Because P. tajacu populations are subdivided into breeding groups, we incorporated breeding group as a hierarchical level. We calculated identity probabilities by simple counting of identical pairs of genes at different hierarchical levels (Q1 for pairs of genes within groups, Q2 for pairs of genes sampled among groups within populations, and Q3 for pairs of genes sampled in different populations). We then estimated the intra-class correlations by taking appropriate ratios of identity probabilities, weighted according to the number of pairs in each sample (see Rousset, 2007), following the definitions of F-statistics as functions of identity probabilities between pairs of genes (see Appendix). As the distances among populations are large in this study (range of 225 km–945 km among the three sampling sites), we considered the three populations as independent replicates in the analysis, and we restricted our analyses to estimate within-population dispersal. We focused on the level of genetic differentiation among social groups within populations as measured by the parameter FGP. The notation is adapted from Wright (1965). This approach is different from that of Fontanillas et al. (2004) who considered dispersal both among populations and among breeding groups. Although the samples from each site were collected in different years, FGP estimates do not depend upon identity between pairs of genes from different populations and temporally discontinuous sampling is therefore unlikely to undermine the approach. We employed a bootstrapping procedure to calculate confidence limits around estimates of FGP for each class of individuals. Using the statistical software package R (R Development Core Team, 2008), we generated 25 000 bootstrap samples, with each sample being produced by random resampling (with replacement) of the 255 nucleotide sites from the mtDNA haplotypes (254 sites+1 indel). 
This allowed us to calculate FGP estimates for each sample and generate a distribution; the end points of the confidence interval were then calculated as the 2.5% and 97.5% percentiles of this distribution. This procedure is strictly equivalent to that implemented in the software package Arlequin Version 3.1 (Excoffier et al., 2005) to generate 95% confidence limits by bootstrapping genetic differentiation values in a locus-by-locus AMOVA (see, for example, Langergraber et al., 2007).
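The pair-counting estimator described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the Appendix is not reproduced here, so the form FGP = (Q1 − Q2)/(1 − Q2) is assumed from the usual definition of F-statistics as ratios of identity probabilities, and the function name `f_gp` and the (population, group, haplotype) input layout are hypothetical. Counts are pooled across populations, so each population contributes in proportion to its number of pairs.

```python
from itertools import combinations

def f_gp(samples):
    """Estimate F_GP from (population, group, haplotype) tuples.

    Assumed estimator (not taken from the paper's Appendix):
    F_GP = (Q1 - Q2) / (1 - Q2), where Q1 is the identity probability
    for pairs within groups and Q2 for pairs among groups within populations.
    """
    same_within = pairs_within = 0   # identical / total pairs within groups
    same_among = pairs_among = 0     # identical / total pairs among groups
    for (pop1, grp1, hap1), (pop2, grp2, hap2) in combinations(samples, 2):
        if pop1 != pop2:
            continue  # pairs from different populations (Q3) are not needed
        identical = hap1 == hap2
        if grp1 == grp2:
            pairs_within += 1
            same_within += identical
        else:
            pairs_among += 1
            same_among += identical
    q1 = same_within / pairs_within
    q2 = same_among / pairs_among
    # Undefined when Q2 == 1 (no haplotype variation among groups).
    return (q1 - q2) / (1 - q2)

# Toy data: two groups, each fixed for a different haplotype.
toy = [("P", "g1", "A"), ("P", "g1", "A"), ("P", "g2", "B"), ("P", "g2", "B")]
print(f_gp(toy))
```

In the toy data every within-group pair is identical and no among-group pair is, so the sketch returns the maximum value of 1. Real data would also need a guard for classes with no within-group pairs or no haplotype variation.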

Class-specific analyses

Dispersal is a trait that can be partitioned into pre- and post-dispersal conditions; our first analysis therefore partitioned the data by age. We performed independent analyses on data partitioned into two age sets, one for adults and one for immatures (the latter including both juveniles and infants). Subadults were classed as immatures and then as adults in sequential analyses. Each age-specific data set was composed of individuals assigned to their respective populations and social groups, and intra-class correlations (FGP) were calculated among social groups within populations from identity probabilities of pairs of genes (see the previous section: ‘Within-population differentiation’). Only those social groups containing a representative individual from each treatment were included in the analysis (for example, in the independent analyses on adult and immature data sets, a social group must have contained at least 1 adult and 1 immature to be included). We then duplicated the analysis with the data partitioned by sex rather than age. From these results, we were able to distinguish a putative class of dispersing individuals from a putative class of non-dispersers. We therefore performed a posteriori, independent analyses on data sets of putative dispersers and non-dispersers.

We used a resampling scheme after Goudet et al. (2002) to test whether the estimated fixation indices among social groups within replicate populations (FGP) for specific classes (age, sex or putative dispersal class) departed significantly from the null hypothesis that dispersal is independent from the class of individuals. Resampling tests were all performed with the statistical software package R (R Development Core Team, 2008). For each class, we generated 25 000 randomized datasets, by re-assigning the age (or sex, or dispersal class) of each haplotype randomly within each breeding group. By doing so, we kept the number of individuals from each class constant within each breeding group. We calculated the probabilities of identities between pairs of genes for each resampled dataset, and obtained the distribution of class-specific FGP estimates under the null hypothesis that dispersal behavior or capability is independent from age, sex, or dispersal class. We then calculated P-values as the proportion of times where FGP from the randomized data sets was larger than or equal to the observed FGP on the original dataset.
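The randomization test described above can be illustrated with a short Python sketch. It is not the authors' R code: the function names, the (group, class, haplotype) record layout, the single-population simplification and the assumed estimator FGP = (Q1 − Q2)/(1 − Q2) are all hypothetical choices made for illustration. Class labels are shuffled within each breeding group, which keeps the number of individuals of each class constant within groups.

```python
import random
from itertools import combinations

def f_gp(items):
    """F_GP for one population from (group, haplotype) pairs (assumed estimator).

    Requires at least one within-group pair and one among-group pair.
    """
    same_w = pairs_w = same_a = pairs_a = 0
    for (g1, h1), (g2, h2) in combinations(items, 2):
        if g1 == g2:
            pairs_w += 1
            same_w += h1 == h2
        else:
            pairs_a += 1
            same_a += h1 == h2
    q1, q2 = same_w / pairs_w, same_a / pairs_a
    # Convention for this sketch only: report 0 when Q2 == 1 (no variation).
    return 0.0 if q2 == 1 else (q1 - q2) / (1 - q2)

def permutation_p_value(records, target_class, n_perm=1000, seed=0):
    """Share of label-shuffled data sets whose class-specific F_GP >= observed."""
    rng = random.Random(seed)

    def class_stat(recs):
        return f_gp([(g, h) for g, c, h in recs if c == target_class])

    observed = class_stat(records)
    by_group = {}
    for idx, (g, _, _) in enumerate(records):
        by_group.setdefault(g, []).append(idx)

    hits = 0
    for _ in range(n_perm):
        shuffled = list(records)
        for idxs in by_group.values():
            labels = [records[i][1] for i in idxs]
            rng.shuffle(labels)  # reassign class labels within this group only
            for i, label in zip(idxs, labels):
                g, _, h = records[i]
                shuffled[i] = (g, label, h)
        if class_stat(shuffled) >= observed:
            hits += 1
    return hits / n_perm

# Toy data: females fixed for a group-specific haplotype, males mixed.
toy = [("g1", "F", "A"), ("g1", "F", "A"), ("g1", "M", "A"), ("g1", "M", "B"),
       ("g2", "F", "B"), ("g2", "F", "B"), ("g2", "M", "A"), ("g2", "M", "B")]
print(permutation_p_value(toy, "F"))
```

A small P-value for a class means that few random reassignments produce as much among-group structure as observed, which is the pattern expected for a philopatric class.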

Estimating dispersal

To calculate a sex-specific dispersal rate within a single generation, we adapted the approach of Vitalis (2002) and extended it to mtDNA data. In Vitalis (2002), the sex-specific dispersal rate is obtained from the ratio of the sex-specific differentiation evaluated after juvenile dispersal ($F_{GP}^{x}$) to the differentiation evaluated before dispersal ($F_{GP}^{*}$). Appendix 1 shows that this relationship also applies to uniparentally inherited markers, and:

$$m_x \approx 1 - \sqrt{F_{GP}^{x} / F_{GP}^{*}} \qquad (1)$$

gives the sex-specific dispersal rate. In this study, we use this simple model to compare fixation indices before and after dispersal at the within-population level, focusing on dispersal of individuals among breeding groups. This equation assumes that the number of breeding groups, n, is large (infinite); by considering an infinitely large n, we slightly overestimate the dispersal rate $m_x$ (for example, 10% relative bias with n=10). We estimated instantaneous sex-specific dispersal rates for P. tajacu by applying equation (1), using fixation index estimates for adult males, adult females and all immatures of both sexes (Table 2). Confidence intervals for dispersal rates were obtained by means of a bootstrap procedure, similar to that used for FGP (see the previous section: ‘Within-population differentiation’), modified as follows. For each bootstrap sample, FGP estimates were calculated for adult males (resp. adult females) and all immatures, and male- (resp. female-) specific dispersal rates were calculated using equation (1). Confidence intervals for sex-specific dispersal rates were then derived from the 0.025 and 0.975 quantiles of the bootstrap distribution.
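The estimator and its percentile confidence interval can be sketched as follows. The sketch assumes that equation (1) takes the large-n form m ≈ 1 − sqrt(FGP after dispersal / FGP before dispersal), which is consistent with the point estimate reported in the Results; the function names and the linear-interpolation percentile rule are illustrative assumptions, and the input values are made up.

```python
import math

def dispersal_rate(f_post, f_pre):
    """Assumed large-n form of equation (1): m = 1 - sqrt(F_post / F_pre)."""
    if f_pre <= 0 or f_post < 0:
        raise ValueError("ratio undefined for these F_GP values")
    return 1.0 - math.sqrt(f_post / f_pre)

def percentile(sorted_values, p):
    """Linear-interpolation percentile of an already sorted list (0 <= p <= 1)."""
    pos = p * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

def bootstrap_ci(f_pairs, lower=0.025, upper=0.975):
    """Percentile interval from (F_post, F_pre) pairs, one pair per bootstrap sample."""
    rates = sorted(dispersal_rate(f_post, f_pre) for f_post, f_pre in f_pairs)
    return percentile(rates, lower), percentile(rates, upper)

# Hypothetical bootstrap replicates (not data from this study).
replicates = [(0.25, 1.0), (0.16, 1.0), (0.36, 1.0)]
print(dispersal_rate(0.25, 1.0), bootstrap_ci(replicates))
```

Each bootstrap sample supplies its own pair of FGP values, so the interval reflects uncertainty in both the pre- and post-dispersal estimates. Samples in which the ratio is undefined would need to be handled explicitly in a real analysis.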

Results

mtDNA haplotype distribution patterns

A total of 18 nucleotide sites were variable (17 substitutions and a single indel) over 449 bp. We recovered six mtDNA haplotypes from 267 individual collared peccaries among the three sites sampled (Table 1). Haplotype A was observed in all sampling sites, but haplotype B was unique to the CWMA, and haplotype C was found in both the WWR and the CWMA. The BB population was almost fixed for haplotype E (96%). Haplotypes F and G were only found in the CWMA, and were represented by single individuals (both males).

Table 1 Distribution of mtDNA haplotypes in three wild populations of P. tajacu in Texas, across sex and age classes

We overlaid mtDNA haplotype distribution onto the social group territory distribution for all populations. At the local level, haplotype distribution did not show geographic structuring in the CWMA or the WWR; all haplotypes present at each sampling site were found distributed throughout that site. In the BB population, haplotype A was found only in the eastern portion of the sampling site. At the regional level across Texas, we observed significant population differentiation. Pairwise FST estimates ranged from 0.31 to 0.86 between populations and pairwise exact tests of population differentiation were highly significant (P=0.001), indicating that these populations are significantly divergent from one another.

Patterns of genetic variation revealed by F-statistics as functions of identity probabilities

Because dispersal status is often dependent upon age, we tested for an age bias in dispersal. To that end, we pooled infants and juveniles (categorized hereafter as ‘immatures’) in one class, and adults in another class. It was not clear if individuals categorized as subadults were sufficiently developed to be considered as adults; we therefore performed a preliminary analysis on adult-only and immature-only data sets partitioned into social groups, which revealed a decrease in FGP when subadults were included in the adult class (not shown). This result indicates that individual genetic variation in the subadult class is apportioned among rather than within social groups, and therefore subadults were classed as adults in all subsequent analyses. We estimated fixation indices among social groups for each age class, with individuals partitioned into known breeding groups (Table 2). FGP for adults (0.30 [0.03, 0.38]) is much smaller than that for immatures (0.60 [0.60, 1.00]), as would be expected if the adult class included dispersed individuals. To test for significance of these quantitative differences, we used a randomization approach, and generated randomized data sets by assigning an age randomly to each mtDNA haplotype. Under the null hypothesis that dispersal is not age-biased, we expect the observed FGP of adults and immatures not to depart significantly from the null distribution. For adults, there was a large proportion of randomized data sets with a differentiation among groups within populations (FGP) larger than the observed, although this proportion did not achieve significance (P=0.79; Figure 1a). In contrast, for immatures of both sexes, there was only a small proportion of randomized data sets giving an FGP larger than the observed, although the test was not significant (P=0.20; Figure 1b). In general terms, these results suggest a greater amount of dispersal among social groups for adults when compared to immatures.

Table 2 Intra-class correlations for pairs of genes among social groups within replicate populations estimated by means of identity probabilities

Figure 1 Re-sampled data null distributions for each class of individuals: (a) adults, (b) immatures, (c) males, (d) females, (e) dispersers, (f) non-dispersers. The observed FGP for each analysis is represented by a hatched vertical line. Significance was tested over 25 000 permutations. Histogram class heights are represented as black dots, and the smoothed density was obtained using the Average Shifted Histogram (ASH) algorithm (Scott, 1992) with smoothing parameter m=20.

To test for a signal of sex-biased dispersal, intra-class correlations (FGP) were estimated for each sex with individuals partitioned into known breeding groups (Table 2). These results show that FGP among social groups is much smaller for males (0.23 (0.09, 0.36)) than it is for females (0.90 (0.87, 1.00)), which indicates that even when pre-dispersal age individuals are included in the male class the sex difference is still apparent. To test the significance of the sex difference, we used a randomization approach identical to the one described for age bias, and generated randomized data sets by assigning a sex randomly to each mtDNA haplotype. For males, there was a very large proportion of randomized data sets with a larger FGP than the observed (P=0.98; Figure 1c). In contrast, for females, there was only a very small proportion of randomized data sets giving an FGP larger than the observed, and the test was therefore highly significant (P<0.001; Figure 1d). These results suggest that dispersal is strongly male-biased in P. tajacu.

Because the inferred dispersal pattern of P. tajacu is one of adult male dispersal, we conducted a further, a posteriori analysis on data partitioned by putative dispersal condition: the data were partitioned into ‘dispersers’ (adult males) and ‘philopatrics’ (immature males and all females), and separate analyses were performed on individuals assigned to breeding groups. As expected, FGP for the philopatric class was much larger (0.76 (0.73, 0.99)) than was seen for adult males (0.24 (0.07, 0.36)). For adult males, there was a very large proportion of randomized data sets with an FGP larger than the observed (P=0.99; Figure 1e). In contrast, for putative non-dispersers, the test was highly significant, with very few data sets giving an FGP larger than the observed (P<0.001; Figure 1f).

Dispersal rate estimates

The instantaneous sex-specific dispersal rate among social groups within populations was estimated using equation (1). We used the FGP estimates among social groups within populations (Table 2) for adult males (dispersers; FGP=0.24), for adult females (FGP=0.91), and for pre-dispersal individuals of both sexes (also categorized as ‘immature’; FGP=0.60). This yielded a male-specific dispersal rate estimate (m) of 0.37 (0.32, 0.65). Equation (1) is meaningful only if there is a significant difference between FGP measured after dispersal and before dispersal. As the confidence limits of FGP for adult females (0.88, 1.00) and immatures (0.60, 1.00) largely overlap, we were unable to calculate a female-specific dispersal rate from equation (1).
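As a check on the arithmetic, the point estimates can be worked through by hand. This assumes that equation (1) has the large-n form $m_x \approx 1 - \sqrt{F_{GP}^{x}/F_{GP}^{*}}$, with $F_{GP}^{x}$ the post-dispersal, sex-specific estimate and $F_{GP}^{*}$ the pre-dispersal (immature) estimate; this form is an assumption, although it reproduces the reported male value.

```latex
\begin{align*}
m_{\text{male}} &\approx 1 - \sqrt{\frac{0.24}{0.60}} = 1 - \sqrt{0.40} \approx 1 - 0.632 \approx 0.37 \\
m_{\text{female}} &\approx 1 - \sqrt{\frac{0.91}{0.60}} \approx 1 - 1.232 \approx -0.23
\end{align*}
```

Under this assumed form, the female point estimates give a ratio greater than 1 and hence a negative value, which cannot be interpreted as a rate. This agrees with the statement above that a female-specific rate could not be calculated.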

Discussion

We have demonstrated that maternally inherited genes can be used to describe the contemporary dispersal patterns of males (and the overall dispersal patterns of females) within an analytical framework based on intra-class genetic correlations. This was accomplished through comparisons of age- and sex-specific intra-class correlations partitioned hierarchically within populations. A second aim was to show that instantaneous sex-specific dispersal rates could be calculated from sex-specific estimates of differentiation using single-locus haplotypic data.

Dispersal in Pecari tajacu

In our study, we quantitatively showed that dispersal in collared peccaries is strongly biased toward males, and that approximately one-third of males dispersed from their natal groups in this single generation. This is a minimum estimate, as some individuals die before or during dispersal, and the lack of mtDNA variation undoubtedly prevented our detection of dispersal between some groups. Moreover, the pronounced local genetic structure indicates that males preferentially disperse over short distances, perhaps into neighboring herds; this is congruent with trapping data (Gabor and Hellgren, 2000). The results from the age-based analysis indicate that dispersal in this species is usually accomplished by subadults (18–24 months). At this age, they have not reached their full body mass and may be forced out by larger, resident males.

Measuring dispersal biases

Our approach allowed us to organize data into age classes, sex classes, social groups and populations and then test hypotheses about the dispersal rate of each class. For example, by performing separate analyses on sex-specific data sets, we were able both to detect a sex bias in dispersal and to determine which sex contributed to the pattern. Because the method relies on contrasts of sex-specific estimates of population differentiation, rather than absolutes, the power to detect differences among hierarchies is limited only by the intensity of the bias (Vitalis, 2002). In this study, there was sufficient contrast between pre- and post-dispersal age classes in males to provide a direct estimate of the instantaneous dispersal rate.

The method presented here should be applicable to any species in which there is a bias in dispersal, whether that bias is conditional on sex, age or some other phenotype, so long as trait variation can be readily distinguished and assigned to different hierarchical levels. This approach does not impose spatial distance (or a distance proxy) onto the analysis, as is seen in other approaches such as spatial autocorrelation (Smouse and Peakall, 1999). Such approaches force investigators to make assumptions about how distance interacts with the social organization when it may be inappropriate or irrelevant (for example, when sampling a highly mobile species, or at a scale where an individual is equally likely to disperse to any location under consideration). Our approach removes metric distance and location from the equation, and shifts the focus onto how the genetic variation is distributed across space irrespective of distance, which is especially useful for addressing questions of how sociality influences dispersal.

Measuring sex-biased dispersal with uni-parentally inherited markers

The approach discussed herein relies upon contrasts: we compared the genetic structure of the pre-dispersal class to the sex-specific genetic structure of the post-dispersal class to estimate instantaneous dispersal within a single generation (Lawson-Handley and Perrin, 2007). When autosomal markers are used, the expectation is that genetic structure will be more apparent in the pre-dispersal class compared with the post-dispersal class as a whole, and even more apparent in the non-dispersing sex (whichever sex it may be). When a uniparentally inherited marker is used, the expectation is similar, but not identical, to what is seen for biparentally inherited markers. For instance, under a system of male-biased dispersal, mtDNA haplotypes are carried within males into breeding groups, but males do not contribute mtDNA to the subsequent generation, and thus the contrast between pre-dispersal individuals and adult males is substantial. However, under a system of female-biased dispersal, haplotypes would be redistributed within and among populations in each generation. Thus, a contrast between genetic differentiation for pre- and post-dispersal individuals would be difficult to detect. As a result, this approach is most useful for deriving instantaneous sex-specific dispersal rates with mtDNA data under a system of male-biased dispersal, or a doubly uniparental system of mitochondrial DNA inheritance (for example, Mytilus mussels). Here, we use mtDNA haplotypes as a tag, but any physical or genetic tag that could be identified in males and females before and after dispersal may have the same role as mtDNA markers in this context.

We have shown that mtDNA can be used in isolation to estimate sex-specific dispersal in the current generation. The main caveat is that mtDNA is, in effect, a single genetic marker that might be biased by selection (Bazin et al., 2006). Yet, because we based our analyses upon differences in variation between males and females within a single generation, it is difficult to imagine a pattern of selection that would undermine the approach.