Introduction

A central theme of ecological and evolutionary research has been the importance of phenotypic and genetic variation within a species as a foundation for selection, adaptation and evolution. Of particular interest are species with phenotypically divergent populations, which may be in the early stages of speciation. Teleost fishes show extensive intraspecific variation, usually related to preferred habitat, foraging strategy or spawning site. For example, limnetic and benthic forms (also known as morphs or ecotypes) exist within salmonids, percids and gasterosteids (Skulason and Smith, 1995). Such intraspecific polymorphisms provide excellent opportunities to investigate the interaction between selection and gene flow in the evolution of local adaptation and genetic and phenotypic divergence (for example, Moore and Hendry, 2005).

An often neglected aspect of the gene flow-selection balance of population divergence is how phenotypic and genetic characteristics of dispersers compare to those of their philopatric conspecifics. Most models assume that dispersers are a random sample of the source populations (Hendry, 2004), although there is evidence that they are not (Murren et al., 2001). However, phenotypic characters of dispersers are generally related to their dispersal ability (for example, presence of wings in insects; Harrison, 1980), not to the local environment in recipient habitats. If dispersers are ‘phenotypically pre-adapted’ to recipient environments, selection may be lower and gene flow higher than otherwise expected, and phenotypic divergence at highly heritable traits may still be maintained.

Sockeye salmon (Oncorhynchus nerka) display an array of phenotypically divergent life history ecotypes, including non-anadromous (kokanee) and anadromous (sockeye) forms, and sea-type, river-type and lake-type anadromous fish (Burgner, 1991). Within lake-type fish, there are two primary ecotypes, which spawn in creeks and on beaches, respectively (Wood, 1995). These two ecotypes show morphological differences; notably, beach males are much deeper bodied than creek males, likely as a result of the different balances between natural and sexual selection (Quinn and Foote, 1994; Quinn et al., 2001a). This extensive variability has apparently evolved repeatedly in different river systems within the last few thousand years, as sockeye populations only expanded to their current range following the last Pleistocene glaciation (Wood, 1995). Studies of transplanted populations suggest that the process can occur quite rapidly (Hendry et al., 2000).

Patterns of molecular genetic differentiation are sometimes associated with differences in life history. Numerous studies have been conducted on population differentiation in sockeye, with the general result that individuals reared in different lakes are genetically distinct (reviewed in Wood, 1995). Patterns of differentiation have also been found within lakes, sometimes but not consistently associated with differences in spawning timing and life history (Taylor et al., 1997; Ramstad et al., 2003). In general there is evidence of an ‘isolation by distance’ pattern (Hendry et al., 2004), but there is essentially no information about phenotypically divergent but very proximate populations, on the order of 1 km apart or less. Such proximate populations are of scientific interest because differences do not reflect isolation by distance but more direct mechanisms of ecological and behavioral isolation, potentially revealing much about patterns of dispersal between populations.

The phenotypic and genetic diversity in sockeye salmon may arise from responses to different environmental conditions or from local adaptation. Evolutionary models suggest that phenotypically plastic generalists should evolve when migration rates between habitats are high, whereas low rates of gene flow favor the evolution of locally adapted habitat specialists (Sultan and Spencer, 2002). Although almost all sockeye salmon home to their natal sites for spawning (Burgner, 1991), some individuals disperse (stray) and spawn elsewhere. From here, we will refer to dispersers as strays. This interaction between straying and homing is key to balancing dispersal ability with the benefits of local adaptation (Quinn, 1999; Hendry et al., 2004), yet there is virtually no information on why individual fish stray or whether some salmon are predisposed to stray by virtue of their phenotype. Information on how often and why individuals stray is, therefore, crucial for developing a better understanding of Pacific salmon and their evolution.

Little Togiak Lake (Figure 1) is an ideal study site for investigating fine-scale differentiation and straying between beach and creek sockeye ecotypes. The lake, located within the Wood River system in southwestern Alaska, has two small creeks supporting sockeye salmon populations, A creek and C creek. In addition, sockeye salmon spawn on lake beaches immediately adjacent to the mouths of these creeks (A beach and C beach) and on several other beaches throughout the lake, including those at the north and south ends (north beach and south beach), which are separated by over 10 km (Figure 1). These breeding populations can be used to make informative comparisons for several reasons. First, beach and creek ecotypes come into close physical contact at the creek mouths (that is, the beach fish spawn within about 50 m of the creek), making geographic distances a minimal barrier to interbreeding. Second, A and C creeks are separated by 1.5 km of shoreline, so the distance between the two sites is at least an order of magnitude larger than between each creek and its respective beach. Third, genetic differentiation among the ecotypes (see Results) allowed reliable identification of strays whose phenotypes could be compared with averages in source and recipient populations. Using data from these two creeks and four beaches, we investigated genetic and phenotypic relationships among sockeye salmon spawning ecotypes. Our objectives were as follows: (1) to quantify genetic differentiation between and within the two types of spawning habitats, (2) to estimate patterns and rates of straying within and between ecotypes and (3) to determine whether the fish that strayed were a random or phenotypically biased subset of the population.

Figure 1

Locations where mature sockeye salmon were sampled. A beach and C beach are located immediately adjacent to the mouths of A creek and C creek, respectively.

Materials and methods

Study area

Samples were collected from six spawning locations in Little Togiak Lake (Figure 1): two creeks (A creek and C creek) and four discrete beaches (A beach, C beach, a beach at the southern end of the lake (south beach) and a beach at the northern end (north beach)). According to field surveys conducted from 2002 to 2004, A and C creeks typically had annual returns of 200–400 individuals, whereas A and C beaches had returns of 10–70 individuals.

Beach spawners generally spawn later than creek spawners due to the different thermal regimes of typical beach and creek habitats (Burgner, 1991), but the spawning times of the A and C beach populations overlap considerably with those of A and C creeks. Sockeye salmon spawn in the creeks from late July to late August, and on the beaches at least through August. Only samples from this period of overlap (8 August 2002, 5–11 August 2003 and 8–9 August 2004) were analyzed to avoid confounding effects of temporal differentiation.

Sample collection

In 2002–2004, 26–48 fish were sampled from each sampling site (except A beach in 2003; Table 1). A total of 601 individuals were analyzed, representing approximately 10–50% of the individuals spawning in each site each year. Creek spawners were captured by dip net, and beach spawners by beach seine. To avoid sampling fish in one habitat that were going to spawn in the other, beach spawner samples were collected only from fish that had settled on their nests (redds), and creek spawner samples were collected only from fish that had entered the creeks, assuming that entry was coincident with the initiation of breeding activities. Beach seining was also conducted at least 10 m from creek mouths to avoid accidental capture of creek spawners holding in the mouths before entering the creeks. Approximately 1 cm² of fin tissue was collected per fish and stored in 95% ethanol.

Table 1 Summary of genetic data characteristics

Microsatellite analysis

Eleven tetranucleotide repeat microsatellite loci and one dinucleotide repeat locus were used to analyze genetic variation. These loci were One100, One102, One103, One108, One109, One112, One114 (Olsen et al., 2000), One110c (J Seeb, personal communication, F: 5′-GAGTGGCCGTCGTTTTACCCTCCATTTCAATCTCATCC-3′ and R: 5′-GCGCATGGTCATAGCTGTTACAGAGAACAGTGAGGGAGC-3′), Ots3 (Olsen et al., 1996), Ots103 (Beacham et al., 1998), Ots107 (Nelson and Beacham, 1999) and OtsG68 (Williamson et al., 2002). DNA was extracted with Qiagen DNeasy kits, following the manufacturer's protocol. Each PCR was carried out using 2 μl of a 1:4 dilution of the extracted DNA in diluted TE buffer pH 8.0, 1.5 mM MgCl₂, 1 mM dNTPs, 2 μM of each primer and 0.5 Units Taq DNA polymerase to make up a total 10 μl reaction volume. The optimized PCR conditions (Tanneal: annealing temperature) consisted of (1) a 6-cycle touchdown with a 1-min denaturing step at 95 °C, a 30-s annealing step at (Tanneal+5) °C (decreasing by 1 °C per cycle) and a 15-s extension at 72 °C; (2) 22 cycles of a 1-min denaturing step at 92 °C, a 30-s annealing step at Tanneal and a 15-s extension at 72 °C; and (3) a final extension of 20 min at 72 °C. Tanneal for all loci except Ots3 and Ots107 was 56 °C. Ots3 and Ots107 had Tanneal of 51 °C and 47 °C, respectively. All forward primers were labeled with fluorescent dye, and the labeled PCR fragments were size-separated on a MegaBACE 1000 (GE Healthcare Life Sciences, Piscataway, NJ, USA) sequencer with appropriate size standards. Allele fragment sizes were estimated using Genetic Profiler genotyping software (GE Healthcare Life Sciences).

To assess genotyping error rates, 10 individuals from each 2004 sample, a total of 60 individuals, were re-analyzed for all steps from DNA extraction to allele scoring. Genotyping error rate was quantified as the percentage of allele calls that differed between analyses.
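The per-allele error rate described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the data layout (dictionaries keyed by individual and locus, each holding a pair of allele sizes), the function name `allele_error_rate` and the example allele sizes are all hypothetical, and alleles are compared as unordered pairs.

```python
from collections import Counter


def allele_error_rate(first_run, second_run):
    """Percentage of allele calls that differ between two genotyping runs.

    Both arguments are hypothetical dictionaries mapping
    (individual, locus) -> (allele_1, allele_2), with allele sizes in bp.
    Only genotypes scored in both runs are compared.
    """
    compared = 0
    mismatched = 0
    for key in first_run.keys() & second_run.keys():
        a = Counter(first_run[key])
        b = Counter(second_run[key])
        shared = sum((a & b).values())  # alleles that agree, ignoring order
        compared += 2
        mismatched += 2 - shared
    if compared == 0:
        raise ValueError("no genotypes in common between the two runs")
    return 100.0 * mismatched / compared


# Made-up allele sizes for two fish at two loci (illustration only)
run_1 = {("fish1", "One100"): (250, 254), ("fish1", "Ots3"): (80, 82),
         ("fish2", "One100"): (246, 246), ("fish2", "Ots3"): (82, 84)}
run_2 = {("fish1", "One100"): (254, 250), ("fish1", "Ots3"): (80, 82),
         ("fish2", "One100"): (246, 250), ("fish2", "Ots3"): (82, 84)}

# One of the eight allele calls differs between the runs
print(allele_error_rate(run_1, run_2))
```

Restricting the same calculation to the keys of a single locus would give a per-locus error rate of the kind reported in the Results.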

Genotype frequencies were tested for departures from Hardy–Weinberg equilibrium (Guo and Thompson, 1992) and for linkage disequilibrium with GENEPOP 3.3 (Raymond and Rousset, 1995). Pairwise FST values were calculated following Weir and Cockerham (1984) in GENEPOP, and significance of pairwise differentiation was tested in FSTAT (Goudet, 1995). MICROCHECKER (van Oosterhout et al., 2004) was used to check for evidence of null alleles, large allele dropout and accidental scoring of stutter bands. Allelic richness was calculated in FSTAT for every locus based on a minimum sample size of 26 individuals. A beach 2003, which had an extremely low sample size of four fish, was excluded from allelic richness calculations. Observed and expected heterozygosities were obtained from GENALEX (Peakall and Smouse, 2006). Because GENEPOP showed that linkage disequilibrium was high in some samples, potential causes of the disequilibrium were investigated by estimating effective population sizes and relatedness among individuals. Temporal and spatial separation among populations was assessed using analysis of molecular variance in ARLEQUIN 3.0 (Excoffier et al., 2005), grouping the different sampling years within each site. Effective population sizes were calculated via the linkage disequilibrium method using LDNE (Waples and Do, in press), excluding alleles with frequencies less than 0.02.

Bayesian cluster analysis, as implemented in STRUCTURE (Pritchard et al., 2000; Falush et al., 2003), was used to estimate the number of populations in the entire data set. STRUCTURE groups individual genotypes into populations so that Hardy–Weinberg and linkage disequilibria are minimized. An admixture model with correlated allele frequencies was used, with a burn-in of 200 000 iterations followed by 400 000 iterations of the Markov chain. The putative number of populations (K) was varied from 1 to 16, and calculations were carried out three times for each K value. The K value with the highest likelihood and the lowest variance in likelihood among the three runs was taken as the most likely number of populations.

Putative strays between populations identified by STRUCTURE (see Results) were detected by a permutation procedure (Paetkau et al., 2004) implemented in GENECLASS2 (Piry et al., 2004). Briefly, individuals were ranked according to the ratio Lh/Lmax, where Lh is the likelihood of belonging to the population where the individual was sampled (home population), and Lmax is the maximum likelihood of that individual belonging to any of the populations. Rejection zones for the null hypothesis that the individual was sampled in its natal population were created by resampling gametes (multilocus haploid genotypes) from existing data sets, combining them to form diploid individuals and creating expected distributions of Lh/Lmax. Probabilities of individual i belonging to population l were calculated as $L_{i,l} / \sum_{j} L_{i,j}$, summing over all j populations (Piry et al., 2004). Putative strays were identified when they could be excluded at the 0.05 level from their home population and could be assigned at higher than 95% probability to another population.
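The two quantities used in this procedure, and the final decision rule, can be illustrated with a short sketch. It assumes that the assignment probability of an individual to a population is that population's likelihood divided by the sum of the individual's likelihoods over all populations. The function names, the likelihood values and the resampling-based P value passed to `is_putative_stray` are hypothetical; the gamete-resampling step that GENECLASS2 uses to obtain that P value is not reproduced here.

```python
def exclusion_ratio(likelihoods, home):
    """Return Lh / Lmax for one individual.

    `likelihoods` maps population name -> likelihood of the individual's
    multilocus genotype in that population; `home` is the sampling site.
    """
    return likelihoods[home] / max(likelihoods.values())


def assignment_scores(likelihoods):
    """Likelihood of each population divided by the sum over all populations."""
    total = sum(likelihoods.values())
    return {pop: value / total for pop, value in likelihoods.items()}


def is_putative_stray(p_home, home, scores, alpha=0.05, min_score=0.95):
    """Apply the two criteria from the text.

    `p_home` is the resampling-based probability for the home population
    (computed elsewhere; hypothetical input here).
    """
    best_other = max(score for pop, score in scores.items() if pop != home)
    return p_home < alpha and best_other > min_score


# Made-up likelihoods for one fish sampled on a beach (illustration only)
fish = {"beach": 2.0e-18, "C creek": 9.7e-17,
        "A creek 2002-2003": 0.9e-18, "A creek 2004": 0.1e-18}

scores = assignment_scores(fish)
print(exclusion_ratio(fish, "beach"))
print(scores["C creek"])
print(is_putative_stray(0.01, "beach", scores))
```

With these invented numbers the fish would be flagged as a putative stray from C creek, because both criteria are met; failing either one leaves it unclassified.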

To evaluate the power of this approach for stray identification, we simulated 10 data sets with the same sample sizes and number of strays as found in the observed data. Straying and philopatric individuals were simulated by combining gametes (haploid multilocus genotypes) drawn randomly from respective source populations in POPTOOLS (Add-In for Excel, Greg Hood, CSIRO, http://www.cse.csiro.au/poptools/). Simulated data sets were transformed into GENEPOP files and submitted to the GENECLASS2 analysis as above, with only 10 sets analyzed because of the computation time required to carry out the exclusion analyses. The known number and origin of simulated strays were compared with GENECLASS2 results as a measure of the reliability of assignments.

Morphological analysis

For morphological comparisons, two measurements were taken: dorso-ventral body depth, from the anterior insertion of the dorsal fin to the belly and perpendicular to the long axis of the fish, and body length, from mid-eye to the end of the hypural bone. The 2002 samples were used opportunistically, and body measurements were taken only of beach spawners, not of creek spawners. To allow comparisons independent of allometric growth, body depths were standardized to a length of 450 mm using the adjustment equation provided by Ihssen et al. (1981). The standard length of 450 mm approximates the long-term average of sockeye populations and has been used in other studies to facilitate comparison among populations (Blair et al., 1993; Quinn et al., 2001b). Males and females were analyzed separately because sockeye salmon are sexually dimorphic, with males typically exhibiting larger trait values than females (Blair et al., 1993). Female body depth was considered less comparable because it changes over the spawning season as eggs are released, and females were not consistently captured in pre-spawning condition. Nevertheless, we did analyze the available female data. A two-way analysis of variance tested the effects of year and sample site on body measurements, and independent t-tests were used to compare mean trait values between creek and beach ecotypes.
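The length standardization can be sketched as below. The adjustment equation itself is not reproduced in the text, so this sketch assumes the log-log allometric form commonly used for this purpose, in which the adjusted depth equals the observed depth multiplied by (standard length / observed length) raised to the power b, where b is the slope of log depth on log length. The function names, the way b is estimated and the example measurements are assumptions for illustration only.

```python
import math


def allometric_slope(lengths, depths):
    """Least-squares slope b of log(depth) regressed on log(length)."""
    xs = [math.log(v) for v in lengths]
    ys = [math.log(v) for v in depths]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def standardize_depth(depth, length, slope, standard_length=450.0):
    """Adjust an observed depth (mm) to the depth expected at standard_length (mm)."""
    return depth * (standard_length / length) ** slope


# Synthetic fish that follow depth = 0.05 * length**1.3 exactly (not real data)
lengths = [400.0, 450.0, 500.0, 520.0]
depths = [0.05 * length ** 1.3 for length in lengths]

b = allometric_slope(lengths, depths)
adjusted = [standardize_depth(d, l, b) for d, l in zip(depths, lengths)]
print(b)
print(adjusted)
```

Because the synthetic fish lie exactly on one allometric curve, all adjusted depths coincide; with real measurements the residual variation around the curve is what remains for comparison among groups.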

Because of small and unequal sample sizes, differences in traits between putative strays and non-strays were determined using the Welch statistic, which is appropriate for small sample sizes and unequal variances (Brown and Forsythe, 1974), and Kruskal–Wallis and Mann–Whitney tests, which are nonparametric tests also robust to unequal variances (Kruskal and Wallis, 1952).
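For two groups, the Welch approach reduces to a t statistic with separate variance estimates and Welch–Satterthwaite degrees of freedom; the corresponding two-group F value is the square of that t. A minimal standard-library sketch follows. The function name and the body depth values are hypothetical, and the computation of the P value from the t distribution is omitted.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group mean, using sample variances
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Made-up standardized body depths in mm (illustration only)
strays = [150.0, 158.0, 149.0]
residents = [130.0, 135.0, 128.0, 133.0, 138.0]

t_value, df_value = welch_t(strays, residents)
print(t_value, df_value, t_value ** 2)  # t, df and the two-group Welch F
```

The degrees of freedom are generally non-integer and fall below the pooled-variance value when group variances or sizes differ, which is what makes the test conservative for small, unbalanced samples.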

Results

Microsatellite variation

All microsatellite loci were variable in all samples (Table 1), and genetic variability was higher in beach (HE=0.789) than in creek spawners (HE=0.730; average over years and sites). Allelic richness was higher at 11 of 12 loci for the beaches (average allelic richness over all loci=16.7) than for the creeks (average allelic richness over all loci=13.9). Genotyping error rate was minimal, with per locus error rates ranging from 0 to 1.7%. The per-allele error rate over all allele calls was 0.7%.

About 13% of tests for departures from Hardy–Weinberg equilibrium were significant at the P<0.05 level (Table 1). Linkage disequilibrium also occurred more frequently than expected by chance (P<0.05) and varied both temporally and spatially. Disequilibria were most prevalent in the 2004 samples, and, on average, in samples collected from C creek, C beach and south beach (Table 1).

There was relatively high and significant differentiation between beach and creek spawners from each location, and A creek spawners appeared to be more differentiated from A beach spawners than C creek spawners were from C beach spawners (mean across years: A: FST=0.048, C: FST=0.025, Table 2). Low but sometimes significant differentiation was detected among samples of beach spawners (FST=0.007 over all years and sites), whereas creek spawners showed higher differentiation (FST=0.038 over all years and sites). No among-year variation was detected at A beach, south beach and north beach. In contrast, the 2004 C beach, A creek and C creek samples differed significantly from their respective 2002 and 2003 samples. An analysis of molecular variance indicated that differences among sites were slightly smaller than differences among years within sites (FST=0.027, FSC=0.015, FCT=0.013), with all differentiation measures being highly significant (P<0.001). When the genetically distinct A creek 2004 sample was removed from the analysis of molecular variance, differences among sites were notably larger than differences among years within sites (FST=0.022, FSC=0.006, FCT=0.017). Again, all three measures were highly significant (P<0.001). Estimated effective population sizes ranged from about 40 to 160 individuals for samples with significant linkage disequilibrium (Table 1).

Table 2 Pairwise FST values and significance at the P<0.05 level after applying a Bonferroni correction (indicated by asterisk, below diagonal) and number of loci at which FST was significant (P<0.05, above diagonal)

Morphological variation

The two-way analysis of variance and two-sided t-tests showed that the two ecotypes differed in morphological traits, but trait values also varied with sampling year and sampling site (Table 3). Creek females were shorter than beach females (means: 434.8 vs 446.8 mm, P=0.004) but not significantly different in standardized body depth (P=0.364). For males, year, site and the interaction between year and site all affected body length and standardized body depth (P<0.01 in all cases for length, MSE=1388.0, and for depth, MSE=115.7). Male creek spawners were shorter than beach spawners (mean lengths 439.1 vs 476.9 mm, P<0.001) and also less deep-bodied for a given length (mean depths 133.0 vs 177.4 mm, P<0.001). Standardized body depth showed a clearly bimodal distribution and was considered a distinguishing characteristic for beach and creek males (Figure 2).

Table 3 Morphological information (in mm) from beach and creek populations
Figure 2

Frequency histograms of male standardized body depths. BB represents individuals genetically assigned as beach fish and sampled on a beach. CC represents individuals genetically assigned as creek fish and sampled in a creek, with creek-to-creek strays colored in gray. BC represents individuals genetically assigned as beach fish and sampled in a creek, and CB represents individuals genetically assigned as creek fish and sampled on a beach.

Putative populations and straying rates

The STRUCTURE analysis showed a clear increase in likelihood from putative population number K=1 to K=2 in all runs, with a maximum likelihood at K=5. Assuming five populations, one population was mostly C creek fish, one was A creek fish sampled in 2002 and 2003, one was A creek fish sampled in 2004 and a mixture of two populations was beach fish. STRUCTURE considered most beach fish ‘hybrids’ of these two beach populations, meaning there was no clear separation of the two populations. The two populations constituting all beach samples were thereafter combined and considered a general beach population. These four populations were used in subsequent GENECLASS2 analyses identifying putative strays.

In total, 56 individuals were excluded from the population where they were sampled (home population) by the permutation procedure in GENECLASS2. Of those, 27 individuals were assigned to one of the other three populations with higher than 95% probability—these individuals were, thus, likely strays. One of the putative strays was a female sampled in A creek 2003 but genetically assigned to A creek 2004. This fish may have returned to freshwater 1 year earlier than most of her cohort, though no age data were available to support this notion. As we were interested in spatial patterns of straying, this individual was not considered in further analyses.

Simulations demonstrated that GENECLASS2 was generally able to reliably identify true strays and assign them to their population of origin. The program also misidentified some non-strays as strays but generally failed to assign those false strays to any specific source population. An average of 22.9 false strays were identified (4.0% of all non-strays, range 9–29 individuals) per simulation run, and only 7.5 of those false strays (1.3%, range 3–11 individuals) were assigned to a source population. In contrast, 25.4 of the 27 true strays were detected (94.1%, range 23–27 individuals), 20.6 with a source population (76.3%, range 15–24 individuals) and 20.4 with the correct source population (75.6%, range 15–23 individuals). These simulations showed that our method would detect 75% of true strays while producing very few (at most 11 individuals over all populations) false strays. Most of the 27 strays identified in the original data set were, therefore, likely to be true strays.
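The percentages in this paragraph can be reproduced from the reported averages with simple arithmetic. The sketch below assumes that each simulated data set contained 601 individuals, of which 27 were strays, so that 574 were non-strays; that denominator is inferred from the statement that the simulations used the observed sample sizes and number of strays. Variable and key names are illustrative.

```python
N_SAMPLED = 601          # individuals per simulated data set (assumed)
N_TRUE_STRAYS = 27       # simulated strays per data set
N_NON_STRAYS = N_SAMPLED - N_TRUE_STRAYS  # 574, inferred denominator


def percent(count, total):
    """Express a mean count as a percentage of a total."""
    return 100.0 * count / total


summary = {
    "false strays flagged": percent(22.9, N_NON_STRAYS),
    "false strays assigned to a source": percent(7.5, N_NON_STRAYS),
    "true strays detected": percent(25.4, N_TRUE_STRAYS),
    "true strays assigned to a source": percent(20.6, N_TRUE_STRAYS),
    "true strays assigned to the correct source": percent(20.4, N_TRUE_STRAYS),
}

for label, value in summary.items():
    print(f"{label}: {value:.1f}%")
```

The last entry is the power of the full procedure, and the second entry is the false-positive rate that matters in practice, since only strays assigned to a source population were retained.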

More strays between habitats moved from creeks to beaches (N=12) than from beach to creek (N=5). Most of the putative strays into A and C beaches were from their respective creek; two of three creek strays into A beach came from A creek and nine of nine creek strays into C beach came from C creek. No creek strays were found at north or south beach. More of the strays were males than females (13 males and 10 females across all years; 3 fish had no sex data). Nine of 14 strays found in creeks came from the other creek. Stray rates also varied among creeks; nine strays were found in C creek (8.7%) and only five in A creek (4.5%). Strays between beaches could not be detected because the beach samples were genetically indistinguishable from each other.

Male strays between habitats differed phenotypically from other, non-stray fish of the same genetic origin in terms of standardized body depth. Groups differed overall according to Welch (F=154.3, P<0.001) and Kruskal–Wallis tests (H=115.5, P<0.001). In addition, all pairwise comparisons were significant according to both Welch and Mann–Whitney tests. Males genetically assigned to a creek but sampled on a beach (CB, Figure 2) had significantly deeper bodies than the creek males that spawned in the creek (F=9.12, P=0.022; Z=−3.22, P=0.001) but significantly shallower body depths than other individuals spawning on the beaches (F=11.6, P=0.012; Z=−2.95, P=0.003). Likewise, males genetically assigned to a beach but sampled in the creeks (BC, Figure 2) were more shallow-bodied than males both assigned to and sampled on a beach (F=87.0, P=0.007; Z=−2.92, P=0.004), although they did not differ from other creek fish (F=2.32, P=0.26; Z=−1.44, P=0.15). In other words, individuals that strayed between habitats were a non-random subset of their population of origin, phenotypically resembling the fish in the population into which they strayed. No morphological comparisons were possible in strays between the two creeks because body depth data were available for only two of the nine strays. Nevertheless, these two males had body depths well within the range of non-stray creek fish (Figure 2).

Discussion

The salient findings of this study were as follows: (1) despite very close physical proximity, the microsatellite allele frequencies of creek spawning sockeye differed dramatically from those of fish spawning in the adjacent beach, (2) the fish spawning in the two creeks differed in allele frequencies far more than the beach spawning groups differed from each other and (3) most importantly, the morphology of straying sockeye resembled that of the population into which they strayed more than that of their native population.

Despite their extreme geographic proximity, creek and beach populations showed surprisingly high levels of genetic differentiation. Several factors may contribute to such differentiation. Accurate natal homing has been suggested by data from parasites, otolith and scale microstructure, and morphology and age distribution (Quinn et al., 1987, 1999). Indeed, recent experimental evidence indicates that sockeye salmon can home to specific regions within a single small creek (Quinn et al., 2006). However, the FST values between these creeks and beaches separated by less than 1 km (0.048 and 0.025 at A and C, respectively) are on the same order of magnitude as FST values observed between populations separated by thousands of kilometers (Beacham et al., 2006). Thus, assuming that straying is more likely to proximate than to distant sites (Hard and Heard, 1999), other factors must be contributing to differentiation in the A and C populations.

Both A and C creeks have small annual spawning populations of about 400 individuals. Furthermore, sex ratio bias, population fluctuations and high variance in reproductive success may reduce the effective population size (Ne) compared to census size, N (Hedrick et al., 2000; Hauser et al., 2002). In Pacific salmon, estimates of the Ne/N ratio can vary between 0.02 and 0.3 (Shrimpton and Heath, 2003; Waples, 2004), and so effective size of the creek populations may be sufficiently small for considerable drift to occur. The significant temporal differentiation in both creeks also suggests small effective populations. Beach spawners may stray among beach sites and constitute a larger population, experiencing less genetic drift and being less differentiated from each other. Lower genetic diversity in creek than in beach spawners supports the hypothesis of smaller effective populations of creek spawners, as has also been suggested elsewhere (Habicht et al., 2004). Therefore, small effective population size in creeks, coupled with accurate homing and strong selection against strays (as suggested by morphological differences) may maintain small-scale genetic differentiation. More detailed consideration of straying fish may help to disentangle these factors.

Most identified strays between habitats moved from creeks to beaches. Presumably, beach fish seldom attempted or succeeded in entering the creeks. Both A and C creeks are very shallow, and deep-bodied beach males may have trouble migrating upstream. Larger-bodied fish are also more vulnerable to bear predation in shallow creeks, whereas predation is greatly reduced on beaches (Quinn et al., 2001a). Moving from a creek to a beach may be physically easier, but strays from creeks to beaches may not have successfully spawned in the new habitat. Sexual selection probably reduces the reproductive success of such strays, because female choice and male competition favor larger, deeper-bodied males on beaches (Quinn and Foote, 1994). Indeed, creek spawners appear to have had little reproductive success on the beaches (as suggested by Hendry et al., 2000; Hendry, 2001), given the high levels of genetic differentiation between beaches and creeks. In total, 17 strays between the two habitats were identified out of 601 sampled individuals. Extrapolated to total lake population numbers, the total number of strays between habitats in 2002–2004 was over 100 fish. This number greatly exceeds long-term equilibrium expectations of gene flow per generation (Nem=15) from the commonly used (and criticized, Whitlock and McCauley, 1999) equation FST=1/(4Nem+1) (A beach and creek: FST=0.048, Nem=5.0; C beach and creek: FST=0.025, Nem=9.8). Phenotypic and genetic differentiation may, therefore, be maintained, at least in part, by selection against strays.
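The gene flow estimates quoted above follow from rearranging FST=1/(4Nem+1) to Nem=(1/FST−1)/4. A minimal sketch of that arithmetic is given below; the function name is illustrative, and the caveats of Whitlock and McCauley (1999) about interpreting such estimates apply unchanged.

```python
def nem_from_fst(fst):
    """Effective migrants per generation implied by FST = 1 / (4*Nem + 1)."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


# FST values reported in the Results
comparisons = {
    "A beach vs A creek": 0.048,
    "C beach vs C creek": 0.025,
    "creek vs creek": 0.038,
}

for label, fst in comparisons.items():
    print(f"{label}: FST={fst}, Nem={nem_from_fst(fst):.2f}")
```

The implied values, roughly 5 to 10 effective migrants per generation for each creek and beach pair, are what the extrapolated count of over 100 strays in three years is being compared against.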

Despite indication of selection against strays, our results suggested that straying fish were phenotypically more similar to the recipient population than were other fish of the same putative genetic origin. Male strays from beaches to creeks had similar body depths to the creek spawners (Figure 2) and had shallower bodies than other beach fish. Correspondingly, creek-origin strays on beaches were deeper bodied than most of the creek fish but less deep bodied than other beach spawners. The creek-to-creek strays, although limited in number, had body depths similar to those of other creek fish (Figure 2), a result consistent with the hypothesis that strays resemble other individuals in their recipient populations and are not necessarily morphological outliers in their source populations. Directed ‘morphological bias’ in strays between habitats would greatly reduce selection pressure against them. Body depth is a highly heritable trait in Pacific salmon (Chinook salmon, heritability of hump size 0.91±0.27 (h²±s.e.); Kinnison et al., 2003), and selection could act very effectively on this trait. Current models of the effects of selection against strays on adaptation (Hendry, 2004) do not consider such phenotypically biased strays and may, therefore, overestimate selection coefficients and, thus, the speed of divergent evolution.

To our knowledge, this is the first study providing evidence for such phenotypic bias of strays in relation to recipient habitats, not only in salmon but in any species. In salmon, some studies suggested that younger fish were less prone to stray than older fish (Quinn et al., 1991; Labelle, 1992), and one study indicated a male sex bias in strays (Hard and Heard, 1999). Both natal and recipient habitat quality may affect straying rate (Quinn and Fresh, 1984; Quinn et al., 1991), but there is no evidence thus far that habitat quality could pre-select fish for straying. The decision whether to stray or to home can be seen as a reflection of the balance of interacting benefits and costs (Hendry et al., 2004), and the close proximity of creeks and beaches in Little Togiak Lake may allow fish to better assess that balance. For example, a shallow-bodied beach spawner male may ‘decide’ that its competitive chances are better in the creek where males are smaller, whereas deep-bodied creek spawners may avoid stranding and bear predation by staying on the beach. Alternatively, that same beach spawner might be ‘forced’ to stray due to competition and low chance of reproductive success in its natal habitat. It remains to be seen if similar patterns can be found in populations that are further apart, though recent radio tagging experiments suggest that individuals do show exploratory behavior over tens or even a few hundred kilometers (Young and Woody, 2007).

Although high genetic differentiation exists between the two creeks, many fish straying from one creek to the other were observed. Indeed, the discrepancy between long-term estimates of gene flow (FST=0.038, Nem=6.3) and assignment of potential strays (N=9; about 50 strays when extrapolated to the entire populations) is comparable to the discrepancy in short-term and long-term estimates of gene flow between habitats. This discrepancy may suggest selection against strays, although it may also indicate that fish visit proximate creeks before homing to their natal stream. No clear morphological differences were observed between creeks, and strays between creeks appeared to be a random sample of their population (Figure 2). Although the scope for selection may appear more limited against strays from another creek than against strays from a beach, less observable traits, such as egg and juvenile traits, may vary between creeks. Further research is required to investigate such differences, which could show a high level of local adaptation in populations inhabiting very similar creeks.

Among beach populations, on the other hand, no genetic or morphological differentiation was found. Although we were, therefore, unable to identify strays from one beach to another, beach-to-beach straying likely occurred because beach sites are easily accessible from one another, and the habitats are similar. Patterns of genetic diversity and differentiation support the notion of higher straying rates between beaches than between creeks. Generally, higher diversity was found on the beaches than in creeks, even though beach spawners occurred in clusters of fewer than 200 fish, whereas A and C creeks each contained approximately 300–500 spawners. Beach spawner aggregations may, therefore, maintain high rates of exchange with other beach aggregations and follow a classic metapopulation model (Hanski, 1998), with local extinctions and recolonizations, but sufficient connectivity to prevent genetic differentiation. Indeed, a recent radio telemetry study showed more direct migration in creek spawners than in beach spawners, which supports the notion of more accurate homing in the former (Young and Woody, 2007).

The genetic data showed some disequilibrium patterns, but those patterns should not have affected the findings on phenotype and straying. Genotype frequencies deviated significantly from both Hardy–Weinberg and linkage equilibrium. Theoretically, populations are in Hardy–Weinberg equilibrium if they are infinitely large and have random mating, no inbreeding, no selection and no migration. Most of these assumptions were likely violated in the study populations, and so determining the precise cause of the Hardy–Weinberg disequilibrium was not possible. High levels of linkage disequilibrium may be indicative of physical linkage between loci, population mixing between genetically differentiated populations, low effective population size (Ne) or sampling of related individuals. Physical linkage was unlikely because many of these loci have been used together in other studies without showing notable linkage disequilibrium (for example, Olsen et al., 2004). There was also no indication of higher than random relatedness among individuals (data not shown). The most likely contributor to linkage disequilibrium was small effective population size, as estimates of effective population sizes from linkage disequilibrium were 10–20% of census sizes, which is within the range reported for Pacific salmonids (Waples, 2004).

Whatever its cause, the linkage disequilibria likely led STRUCTURE to cluster all beach samples into two populations without geographic or temporal pattern. This probable artifact may raise some skepticism regarding the STRUCTURE output, but the pattern of distinct creek populations and a general beach population was corroborated by FST values as well as by another Bayesian clustering program, BAPS 3 (Corander et al., 2004). For analysis of putative strays, we used a method specifically designed to detect first generation migrants (Paetkau et al., 2004), and our simulations demonstrated that we could detect most strays with few false detections. Any false strays would make the phenotypic comparisons more conservative by reducing differences between putative strays and philopatric fish. In addition, Hauser et al. (2006) demonstrated high (>90%) assignment success with only eight loci in populations with low FST values (FST=0.02). In comparison, this study used 12 loci for beach-creek pairs with FST values of 0.048 and 0.025, and should, therefore, have higher power. Perhaps most importantly, the clear morphological differences between putative strays and non-strays indicated that the assignment results reflected biological reality.

Small effective population size is also the most likely explanation for the temporal genetic differentiation observed at A creek, C creek and C beach between 2004 and the other two sampling years. Such temporal differentiation is exacerbated in species with overlapping generations, especially when most individuals mature and breed at the same age (Jorde and Ryman, 1995). If, for instance, fish consistently spawn as 4-year-olds, fish from every fourth year would be genetically more similar to each other than to spawners in the intervening years. Indeed, data from A and C creeks indicated that 70.9% of females (n=1424) and 75.5% of males (n=1062) collected over 23 years were age 4 (unpublished records, Fisheries Research Institute, University of Washington). Temporal differentiation is probably not due to gene flow from other populations, as genetic analysis of other Wood River populations showed that the A and C creek populations are highly differentiated and that there are no potential source populations within the drainage system (unpublished data).

Another potential concern was that sample sizes of strays were relatively low: only 10 male strays between habitats were identified. However, approaches using genetic assignment tests are necessarily based on small numbers of identified strays; if there were many reproductively successful strays, genetic differentiation would probably be insufficient to detect those strays genetically. We were very stringent with our genetic assignments, which reduced the number of identified strays but also reduced the likelihood of identifying philopatric fish as strays. Our simulations demonstrated that this stringent approach was successful, misidentifying very few philopatric fish as strays. To allow for the resulting small sample sizes, we tested for differences between putative strays and non-strays with Welch, Kruskal–Wallis and Mann–Whitney tests, which are robust to the unequal variances that accompany small sample sizes (Kruskal and Wallis, 1952; Brown and Forsythe, 1974). Nevertheless, conclusions from these small samples are somewhat preliminary and need to be confirmed by larger samples and more sampling years.
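To illustrate why a Welch-type comparison suits a small group of strays set against a larger group of philopatric fish, the following minimal sketch computes Welch's unequal-variance t statistic and the Welch–Satterthwaite degrees of freedom. It is not the analysis code used in this study: the function name is ours, and the body depth values are hypothetical numbers invented solely for the example.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test.

    t  = (mean_a - mean_b) / sqrt(s_a^2/n_a + s_b^2/n_b)
    df = Welch-Satterthwaite approximation.
    """
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1)
    )
    return t, df


# HYPOTHETICAL body depths (mm); NOT data from this study.
strays = [118.0, 121.5, 116.0, 123.0, 119.5]
philopatric = [131.0, 127.5, 140.0, 122.0, 135.5, 129.0,
               144.0, 126.5, 133.0, 138.5, 124.0, 130.5]

t_stat, dof = welch_t(strays, philopatric)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Because the degrees of freedom are estimated from the two sample variances rather than pooled, the test does not assume that a handful of strays and a larger philopatric sample share a common variance.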

In summary, we demonstrated high genetic and morphological differentiation between and within ecotypes on a very small geographic scale, suggesting complex patterns of local adaptation and demographic connectivity. More importantly, we showed that straying individuals represent a biased sample of the source population, thus potentially moderating selective effects against those strays. More research is needed to confirm such effects in other traits and in other populations. However, our results suggest that models of adaptive evolution need to consider correlations between phenotypic and genetic characters and the propensity to disperse.