Introduction

Species have limits to their geographic range beyond which reproduction and immigration are insufficient to sustain a population. Although the presence or absence of a species is a simple binary, the ecological genetics and evolutionary processes determining that range limit pose endless questions for researchers. Investigations into the evolution of range limits inform current concerns, particularly in the areas of conservation biology, climate change and bioinvasions.

Most studies of range limits concentrate on species in endemic regions. Recent bioinvasions are less likely to be studied because of the assumption that bottlenecks during the invasion remove genetic variation required for evolution (Garcia-Ramos and Rodriguez, 2002). That assumption is challenged by studies documenting bioinvasions where propagules were either larger than assumed (Lockwood et al., 2005) or comprised multiple introductions from different source populations, resulting in admixtures with high genetic variability (Facon et al., 2008). Repeated introduction of the same species is increasingly considered to be a major factor in the success of bioinvasions (Lockwood et al., 2005). Therefore, the evolutionary potential of recent bioinvasions cannot be assumed to be limited by a lack of genetic variation (van Heerwaarden et al., 2009).

Evolution at the range limits of bioinvasions may therefore depend on parameters similar to those governing the evolution of endemics. Initially, short-term opportunities for population growth in newly colonized environments may promote rapid evolution (Reznick and Ghalambor, 2001). In the longer term, local adaptation in peripheral populations may be opposed by genetic drift where migration rates and population sizes are low. Alternatively, gene flow from large central populations, not subject to the same selective regime, may oppose adaptation in peripheral populations. Various models have sought to quantify the joint action of drift and gene flow on peripheral populations (Barton, 2001; Lenormand, 2002). Some suggest an optimum level of migration that is sufficient to counter the deleterious effects of drift while still permitting local adaptation (Alleaume-Benharira et al., 2006), although they do not account for additional complex interactions between migration, habitat quality, mode of selection and genetic architecture (Kawecki, 2008). Nevertheless, knowledge of population structure is fundamental to understanding the evolutionary potential of an invasive species.

In this study, we investigated a recent range expansion of the tephritid fruit fly, Bactrocera tryoni, which provides an opportunity to study a bioinvasion characterized by abundant genetic variation. As with other economically significant bioinvasions, there is considerable interest in whether peripheral populations have the potential to adapt to conditions beyond the current range limit. Here, we have investigated the population genetic structuring of B. tryoni from its core regions through to populations at the present species border to determine how gene flow and genetic drift are likely to influence evolutionary potential of the peripheral populations.

Materials and methods

B. tryoni population expansion

Following the European settlement of Australia, the horticultural pest B. tryoni has spread southwards from tropical northeastern Australia, following the planting and trade of exotic horticultural crops (Birch, 1961). Since the mid-twentieth century, B. tryoni has colonized the drier inland temperate region of southeast Australia, our present study region (Figure 1). Here it appears to have reached its present climatic limits, bounded by cold to the south, mountains to the east and desert to the west (May, 1963; Meats, 1981; Sutherst et al., 2000).

Figure 1

Map of the study region, showing sampling sites. Sites within the FFEZ quarantine area are indicated by letters. There are no permanent populations of B. tryoni either to the west or south of the region. Grey lines are sealed roads; urban areas appear as irregular black shapes. North is to the top.

Throughout the study region, B. tryoni is confined to rural towns where artificial irrigation allows the fly to survive summer moisture stress (Dominiak et al., 2006). The towns also contain the succession of exotic host fruit trees necessary to support four generations per year (there is no diapause in B. tryoni). As the unaided dispersal ability of B. tryoni is only a few kilometres, dispersal between towns in inland regions is most likely to be human-assisted, mainly in infested fruit (Meats et al., 2003; Dominiak et al., 2006). The study region therefore has an island biogeography, with towns separated by broad-acre grazing and cropping regions, unsuitable for fruit fly.

Genetic structuring studies to the north and east of this study region showed minimal detectable genetic differentiation among core populations of B. tryoni (Yu et al., 2001; Gilchrist et al., 2006). Genetic differentiation between Site 1 (Figure 1) and the central populations to the north and east was at the limit of detectability (FST <0.005). Site 1 appears to be at the edge of an abrupt transition zone where the homogeneous central populations give way to fragmented town-based populations. A preliminary survey of the towns in the present study area showed higher levels of genetic differentiation between towns (0.01<FST<0.12; Gilchrist et al., 2006). In contrast to the central populations, the inland populations appear to have a metapopulation structure approaching the current southern species border.

Population sampling

Collections of male B. tryoni were carried out simultaneously across the study area in February–March 2002, at the time of peak summer populations. At some sites sampling was repeated in 2003 and 2004. All towns in the region with a population over 1000 people were sampled, along with several smaller towns. Sites were numbered in approximate order from north to south.

Since 1994, the western section of the study area has been maintained as a B. tryoni quarantine area, the Fruit Fly Exclusion Zone (FFEZ; Jessup et al., 2007; Figure 1), containing a permanent monitoring grid of 3000 B. tryoni male-lure traps. Between 1994 and mid-1996 only two flies were caught in the FFEZ and none were trapped in the second half of 1996 (Meats and Clift, 2003), indicating no permanent populations in the FFEZ at that time. However, since 1997, there has been an increasing number of outbreaks, with over 1000 flies trapped each year between 2000 and 2004.

Fly samples from the FFEZ were obtained from the trapping grid. If more than 10 flies were trapped in any one town in February–March 2002–2004, those flies were included in the population structuring analysis. In cases where FFEZ samples contained only small numbers of flies (<10), those flies were analysed separately and screened for possible inclusion in the assignment analysis (see below).

Genotyping

Individuals were typed using the standard fluorescent PCR methods detailed in Yu et al. (2001). All samples were typed for at least 26 of a set of 29 microsatellites (Kinnear et al., 1998; Wang et al., 2003). Likely null alleles were detected using tests for homozygote excess (Micro-checker software; van Oosterhout et al., 2004). We tested for deviation from Hardy–Weinberg equilibrium proportions within populations across loci by testing for significant inbreeding (f), using the Genetix program (Belkhir et al., 2002). Overall, 95% confidence intervals for f were estimated from 10 000 bootstrap replicates. Allelic richness and significance values were calculated as detailed in Gilchrist et al. (2006).
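The per-locus quantities summarized in Table 1 can be illustrated with a minimal sketch. It assumes diploid genotypes stored as allele pairs and uses the simple ratio f = 1 − Ho/He, without the small-sample corrections or bootstrap confidence intervals that Genetix applies. The function name and input layout are hypothetical.

```python
from collections import Counter

def heterozygosity_stats(genotypes):
    """Return (Ho, He, f) for one locus in one sample.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    This is an uncorrected illustration, not the estimator used by Genetix.
    """
    n = len(genotypes)
    # Observed heterozygosity: fraction of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies from the 2n gene copies in the sample.
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    # Expected heterozygosity under Hardy-Weinberg proportions.
    he = 1.0 - sum(p * p for p in freqs)
    # Inbreeding coefficient: proportional deficit of heterozygotes.
    f = 1.0 - ho / he if he > 0 else 0.0
    return ho, he, f
```

A positive f indicates a deficit of heterozygotes relative to Hardy–Weinberg expectations, which may reflect inbreeding, null alleles or the pooling of differentiated populations.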

A hierarchical analysis of molecular variance was performed using Arlequin version 2.0 (Schneider et al., 2000). The hierarchical levels used were individual, population and region (outside the FFEZ versus inside FFEZ). Significance of variance components was calculated using 50 000 permutations. Pairwise differences between populations, measured as FST, were tested using exact tests in FSTAT version 2.9.3 (Goudet, 2002) with 5000 permutations. Mantel tests were performed using FSTAT with 5000 permutations. In simultaneous statistical tests, critical significance levels were corrected using the sequential Bonferroni test.
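The sequential Bonferroni correction mentioned above can be sketched as follows. This is a generic step-down (Holm-type) procedure written for illustration. The function name and the default alpha are assumptions, and it is not the code used by FSTAT.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans, True where a test remains significant.

    The smallest p-value is compared with alpha/m, the next with alpha/(m-1),
    and so on. Testing stops at the first non-significant result.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    significant = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            significant[idx] = True
        else:
            break  # all remaining (larger) p-values are declared non-significant
    return significant
```

For example, with four tests the smallest p-value must fall at or below 0.0125 to be declared significant at an overall level of 0.05.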

Population structuring, assignment testing and migration rate estimation

Genetic structuring of the samples was investigated in parallel using both the Structure software (Pritchard et al., 2000; Falush et al., 2003) and BAPS 5.2 (Corander et al., 2008). For all Structure runs highly repeatable results were obtained with a run length of 50 000 iterations, following a burn-in of 50 000 iterations. Individuals were classified as originating from a particular source if the posterior probability was P>0.9. For all BAPS analyses, 10 replicates were run for all pre-defined k (that is, number of clusters).
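The classification rule (posterior probability >0.9) can be expressed as a short sketch. It assumes that membership probabilities have already been exported from the clustering software into dictionaries. The function names and data layout are hypothetical and do not correspond to any Structure or BAPS interface.

```python
def assign_cluster(membership, threshold=0.9):
    """Return the cluster name if its probability exceeds the threshold, else None.

    membership: dict mapping cluster name -> posterior probability for one individual.
    """
    best = max(membership, key=membership.get)
    return best if membership[best] > threshold else None

def assigned_proportions(individuals, threshold=0.9):
    """Proportion of individuals at a site classified to each cluster.

    Individuals below the threshold are counted under the key None (unassigned).
    """
    counts = {}
    for membership in individuals:
        label = assign_cluster(membership, threshold)
        counts[label] = counts.get(label, 0) + 1
    n = len(individuals)
    return {label: c / n for label, c in counts.items()}
```

Proportions of this kind, computed per site, would correspond to the segments of a pie graph for that site.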

The Structure assignment testing function and the BAPS trained clustering function were used to identify source populations of likely migrants into the FFEZ. It was important that the FFEZ candidate flies tested were actual immigrants from outside the FFEZ, rather than the offspring of invading flies, in order to avoid biases due to concurrent founder effects, admixture of genotypes and unequal sampling. Accordingly, we excluded flies trapped at sites where it was likely that a viable population was present. Using 30 years of data on spot infestations of B. tryoni in the FFEZ, Meats and Clift (2005) found that with a trapping rate below 0.025 flies/trap/week, at least 71% of outbreaks became extinct without any control measures (the remaining populations were subjected to control spraying, so their natural fate was unknown). Therefore, we performed assignment tests only on flies from sites where the trapping rate was below 0.025 flies/trap/week. Of 8348 trapped flies, 99 satisfied our criteria and were used in the assignment testing/trained clustering.

We estimated recent migration between populations (i.e. during the previous 1–3 generations) using the program BayesAss+ version 1.2 (Wilson and Rannala, 2003). The program estimates the proportion of migrants between breeding aggregations, with the migration rate expressed as a proportion of the recipient population, so that migration rates into large populations appear smaller. The program was run under the default settings for iterations and burn-in. Longer runs or runs with different delta values did not alter the results.

Effective population sizes (Ne) were estimated using the MNe 1.0 software (Wang and Whitlock, 2003), which uses a maximum likelihood method to estimate Ne from temporal variation of allele frequencies, limiting estimates to sites with more than one sample. Samples from the study region collected 12 months apart were separated by four generations (Meats, 1981).
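The logic of temporal Ne estimation can be illustrated with a simpler moment-based estimator. This is a different technique from the maximum likelihood method implemented in MNe and is shown only as a sketch. It assumes the standardized variance Fc of Nei and Tajima and the usual correction for sampling at both time points. The function names and example values are hypothetical.

```python
def nei_tajima_fc(freqs_t0, freqs_t1):
    """Standardized variance in allele frequency between two samples at one locus.

    freqs_t0, freqs_t1: allele frequencies for the same alleles at the two time points.
    """
    terms = []
    for x, y in zip(freqs_t0, freqs_t1):
        denom = (x + y) / 2.0 - x * y
        if denom > 0:  # skip alleles absent from, or fixed in, both samples
            terms.append((x - y) ** 2 / denom)
    return sum(terms) / len(terms)

def temporal_ne(fc, generations, n_sampled_t0, n_sampled_t1):
    """Moment-based Ne from Fc, corrected for the sampling variance of both samples."""
    drift = fc - 1.0 / (2 * n_sampled_t0) - 1.0 / (2 * n_sampled_t1)
    if drift <= 0:
        return float("inf")  # sampling noise alone can explain the observed change
    return generations / (2.0 * drift)

# Samples taken 12 months apart are separated by four generations in the study region.
example_ne = temporal_ne(nei_tajima_fc([0.5, 0.5], [0.6, 0.4]), 4, 50, 50)
```

Estimates of this kind become unbounded when the observed change in allele frequency is no larger than that expected from sampling error, which is one reason likelihood-based methods are often preferred.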

Results

The descriptive statistics of the 927 flies are shown in Table 1. The presence of null alleles across the 29 microsatellite loci was tested in the samples from sites 1 to 14, giving 406 tests for null alleles. Assuming a Type I error rate of 0.05, approximately 20 significant tests were expected across the 14 populations by chance alone. Therefore, we inferred null alleles at a locus only where 2 or more of the 14 populations showed a significant excess of homozygotes. This process excluded eight microsatellites and all results in this paper are based on the reduced set of 21 microsatellites.
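The arithmetic behind the null allele criterion can be checked with a brief sketch. The binomial calculation assumes that tests are independent across populations, which is a simplification introduced here rather than a claim made in the analysis. The function names are hypothetical.

```python
from math import comb

def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of significant tests when every null hypothesis is true."""
    return n_tests * alpha

def prob_at_least_k_significant(n_populations, k, alpha=0.05):
    """Probability that k or more of n independent tests are significant by chance."""
    below_k = sum(
        comb(n_populations, j) * alpha**j * (1 - alpha) ** (n_populations - j)
        for j in range(k)
    )
    return 1.0 - below_k

n_tests = 29 * 14  # loci x populations
chance_hits = expected_false_positives(n_tests)
per_locus_risk = prob_at_least_k_significant(14, 2)
```

Under the independence assumption, per_locus_risk gives the chance that a locus without null alleles would still meet the exclusion criterion.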

Table 1 Summary statistics for the samples included in the clustering analysis: sample size averaged across loci (n), expected heterozygosity (He), observed heterozygosity (Ho), inbreeding coefficient (f) and the reduction in allelic richness

Inbreeding among the samples was low, indicating that the town-based samples represented single populations (Table 1). Of 34 samples, 10 showed a significant decrease in allelic richness compared with site 1. The reductions in allelic richness were significantly correlated with distance to the west (r=0.7, P<0.0001) but not distance to the south (r=0.07, P=0.7).

Exact tests for pairwise genetic differentiation among the 2002 samples showed that 174 of 210 pairwise comparisons were significant after correction for multiple comparisons, including populations separated by as little as 33 km (sites 4 and 5; see Supplementary Material 1). The analysis of molecular variance showed that 95.7% of variance was explained by within-population variation (FST=0.043, P<0.001). Of the remainder, 3.8% of variance was due to variation between populations within regions (FSC=0.038, P<0.001) and only 0.5% was explained by variation between samples inside and outside the FFEZ (FCT=0.005, P=0.04).
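The relationship between the variance percentages and the hierarchical fixation indices can be sketched as follows, using the standard definitions for a three-level analysis of molecular variance. The function name is hypothetical and the calculation is an illustration rather than a reproduction of the Arlequin output.

```python
def amova_f_statistics(pct_among_regions, pct_among_pops, pct_within_pops):
    """Return (FCT, FSC, FST) from the three variance components.

    FCT: among regions, relative to the total.
    FSC: among populations within regions, relative to the within-region variance.
    FST: among populations (both upper levels), relative to the total.
    """
    total = pct_among_regions + pct_among_pops + pct_within_pops
    fct = pct_among_regions / total
    fsc = pct_among_pops / (pct_among_pops + pct_within_pops)
    fst = (pct_among_regions + pct_among_pops) / total
    return fct, fsc, fst

# Percentages reported in the text: 0.5% among regions, 3.8% among populations
# within regions and 95.7% within populations.
fct, fsc, fst = amova_f_statistics(0.5, 3.8, 95.7)
```

The three indices are linked by the identity (1 − FST) = (1 − FSC)(1 − FCT), which provides a quick consistency check on reported values.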

Genetic structuring in 2002

Analysis of the twenty-two 2002 samples using either Structure or BAPS identified four distinct town-based clusters at sites 16, 17, H and D (Figure 2). These were the four most westerly sites. With Structure, at least 78% of the individuals from each of the four sites showed a high probability of membership (>0.9) of the cluster based on that site. Thus, each of the four westerly clusters had a specific geographic basis.

Figure 2

Population clustering analysis produced using Structure software. Each pie-graph shows the proportion of individuals at that site that had a probability of membership >0.9 of one of the eight clusters (see text). Each pattern of graph fill corresponds to one of the eight clusters. Unlabelled pie graphs indicate 2002 samples. The samples of 2003 are indicated by ‘03’ and 2004 samples by ‘04’. If there were consecutive samples from the same site, they are arranged vertically, with the earliest sample at the top. The thick dashed line indicates the border of the FFEZ and the thick lines are state boundaries.

The remaining 18 samples from 2002 were analysed with Structure after the four distinct clusters were removed from the data. Posterior probabilities began to plateau when the number of pre-defined clusters (K) was 2, at which point 195 out of 466 individuals showed a probability of membership >0.9 of one or other of the two clusters. Geographic mapping of the pattern of cluster membership at each site showed a north-south cline (Figure 2). Analysis of the same 18 samples using the BAPS software with a pre-defined value of K=2 using no spatial information also showed a north-south cline similar to that found using Structure (Figure 3). For later analyses, we referred to the combination of samples 1–8 as the North group and samples 9–15 as the South group.

Figure 3

Plot of the microsatellite cline calculated using both Structure and BAPS software. Each point shows the proportion of individuals at that site belonging to the North cluster calculated by Structure or BAPS. Distances were taken from site 1. Lines of best fit are dashed for Structure and solid for BAPS.

A Mantel test showed a significant relationship between geographic separation and genetic distance for the populations in the two clusters (r=0.39, P<0.0002). A plot of Slatkin's (1993) gene flow estimator against pairwise geographical distance for sites 1–15 (Figure 4) produced a significant regression slope of b=–0.30 (95% confidence interval –0.43 to –0.19), indicating restriction of gene flow across the study area at a scale of approximately 100 km.

Figure 4

Plot of Slatkin's gene flow estimator (M-hat) as a function of distance for populations outside the FFEZ. Values less than 1 indicate restricted gene flow.
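The isolation-by-distance regression plotted in Figure 4 can be illustrated with a minimal sketch. It assumes Slatkin's estimator in the form M-hat = (1/FST − 1)/4 and an ordinary least-squares fit on log10 axes. The function names are hypothetical. Because pairwise values are not independent, the significance of the slope would still need a permutation approach such as a Mantel test.

```python
import math

def slatkin_m_hat(fst):
    """Pairwise gene flow estimator, M-hat = (1/FST - 1) / 4."""
    return (1.0 / fst - 1.0) / 4.0

def log_log_slope(distances_km, fst_values):
    """Least-squares slope of log10(M-hat) on log10(distance) for population pairs."""
    xs = [math.log10(d) for d in distances_km]
    ys = [math.log10(slatkin_m_hat(f)) for f in fst_values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx
```

A negative slope indicates that estimated gene flow declines with geographic separation, and the distance at which the fitted line crosses M-hat = 1 gives a rough scale for restricted gene flow.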

At three sites within the FFEZ (sites C, W and T), new large outbreaks occurred in 2002. At each site, more than 50 flies were trapped in 2002, whereas no flies were trapped in the previous 6 months. Samples from these sites were not identified by either Structure or BAPS as distinct clusters in the initial analysis. Instead, they were mixtures of the six clusters identified above (Figure 2).

In summary, we identified six main populations among the 2002 samples (North, South, and sites 16, 17, D and H). It was unlikely that there were other major populations present in the region in 2002, as our sampling included all but one town with a human population over 1000 (site 18, sampled in 2003, see below). Other small trappings of flies (<10 flies per site) from within the FFEZ were unlikely to be identified as distinct populations, as their contribution to posterior probabilities of different K values would be small.

Genetic structuring in 2003 and 2004

The 10 samples from 2003 consisted of five repeat samplings from 2002 sites, two new sites (site 18, the only large town not sampled in 2002 and site 19) and three samples from new outbreaks within the FFEZ (Sites L, N and B). With both Structure and BAPS software, the six populations from 2002 were used as learner samples for the 2003 samples. Eight of the 2003 samples showed a strong correspondence to the 2002 population from the same area as shown by the matching population assignment pie charts in Figure 2. Notably, in the clinal region, the five 2003 samples all showed proportional membership of the North and South groups consistent with their position in the observed cline (Figure 2).

Both Structure and BAPS identified two new clusters (Sites L and B), each corresponding to a new 2003 outbreak of B. tryoni in the FFEZ. A third FFEZ outbreak from site N contained individuals that clustered only with the South group (Structure: 6 out of 15 with P>0.9).

In 2004, samples from Sites D and N were analysed using the eight clusters identified in 2002 and 2003 as learners. No additional clusters were identified. Instead, flies from both sites grouped exclusively with the earlier samples from the same towns, indicating that those populations successfully overwinter (Figure 2).

Assignment testing of immigrants

The pattern of current immigration into the FFEZ was investigated using assignment testing in Structure and trained clustering in BAPS. The eight source populations were the population clusters identified in the preceding sections. With Structure, preliminary testing using flies of known origin showed that approximately half could be assigned to the correct population (P>0.9). For BAPS, a change in log(likelihood) of –2 or less was used as the minimum requirement for assignment.

Of the 99 flies selected as likely immigrants, Structure assigned 45 to source populations, with 25 of those assigned to the South group. BAPS assigned 67 individuals to source populations, with 32 assigned to the South group. The two methods were reasonably concordant: of the 45 flies assigned by Structure, 32 were assigned to the same source population by BAPS. Figure 5 compares the performance of Structure and BAPS by showing the trapping locations of individuals assigned to the two main source populations (North and South). Those results also suggest isolation-by-distance for immigration from the North and South source populations.

Figure 5

Maps comparing the assignment testing results for Structure and BAPS software. Each map indicates the source population (either North or South) and the software used to calculate assignments. Filled circles indicate trapping sites of flies assigned to the indicated source population. Empty circles indicate sites where flies were tested, but from which no individuals were assigned to the relevant source population. The dashed line is the border of the quarantine area (FFEZ).

The performance of Structure and BAPS differed for assignments involving the younger FFEZ source populations. Structure assigned 10 FFEZ flies as originating from site D, whereas BAPS assigned only one. Conversely, BAPS assigned 16 flies to the site L population (2003), whereas Structure assigned only four of those flies to site L. This difference may have been due to sites D and L having much smaller sample sizes than the combined North and South groups.

Recent migration rate and effective population sizes

Migration matrices were calculated for the 2002 samples. The use of individual sites within the North and South clusters did not produce stable results, as the overall genetic differentiation for these samples (FST≈0.03) fell within the range in which BayesAss estimates are unreliable (FST<0.05; Faubet et al., 2007). Stable results were obtained by grouping populations into the North (sites 1–8) and South (sites 9–15) clusters, along with the four other 2002 source populations and the remaining 2002 FFEZ samples (C, W and T). Table 2 shows that among those samples, the South cluster was a source of migrants to other clusters (with varying levels of confidence), while immigration to the South cluster was not significant. Ne estimates could only be calculated where sampling was repeated in different years (Table 3). Those results show that only Site 11 at the centre of the South cluster had Ne>1000.

Table 2 Estimates of recent migration rates (shown × 1000 for clarity) in 2002, with source populations shown in the left column and recipient populations along the top row
Table 3 The effective population size (Ne) estimates for the six sites with sampling repeated in 2 or more years

Discussion

Our study has uncovered an unusual situation where a population close to the species range limit is a regional source population rather than a sink. The genetic differentiation between the southern population and the core population appears stable, suggesting that migration from the core population may be insufficient to counter local adaptation in the peripheral population. Accordingly, the southern population identified in this study presents an opportunity to study adaptation in a marginal population that contains levels of genetic variation comparable with central populations. In contrast, the populations in the western section of the study region are all characterized by a large genetic differentiation and significant reductions in allelic richness, suggesting that the western populations receive little gene flow and are likely to be dominated by genetic drift. This system therefore provides an opportunity to compare populations close to a species border, where one is large and contains relatively high levels of genetic variation, whereas others are isolated with lower levels of genetic variation.

The unanticipated feature of our results was the cline in microsatellite allele frequencies. Earlier studies of central populations of B. tryoni showed minimal genetic differentiation between populations spread over almost 3000 km of tropical Australia (Yu et al., 2001; Gilchrist et al., 2006). Consequently, the microsatellite cline identified here, which covers the southern 500 km of the species range, was unexpected. There are two non-exclusive explanations for the clinal variation in allele frequency: natural selection or genetic drift coupled with isolation-by-distance (Gould and Johnston, 1972; Endler, 1977).

The recent origin of the cline does not exclude a selection-based explanation. The invasion of the study region by B. tryoni has taken place sometime within the past 60 years, representing at most 240 generations over which natural selection could act to form the cline. This is comparable with the rate at which a wing size cline was established in Drosophila subobscura (Gilchrist et al., 2001). A selection-based explanation is usually founded on the existence of an environmental gradient parallel to the cline. In the study area, the major environmental gradients (mean temperature, annual rainfall, humidity and elevation), all run perpendicular to the cline, that is, east-west. However, monthly average temperatures also vary latitudinally, decreasing to the south as expected for the Southern hemisphere. At the southern end of the cline, the winter breeding gap (during which B. tryoni enters ovarian diapause) is a few weeks longer than in the North. Bateman (1967) found no evidence of temperature adaptation in southern populations of B. tryoni. However, O’Loughlin et al. (1984) studied field cage populations of B. tryoni in areas immediately to the south of our study area (where there were no permanent populations in 2002–2004). They found that B. tryoni failed to produce enough generations in the warmer months to build populations that are sufficiently large to overwinter (Meats, 1981), suggesting that small changes in development time may increase the persistence of southern populations. Thermal adaptation has been intensively studied in Drosophila species in eastern Australia (Hoffmann et al., 2003). Investigations of thermal adaptation in B. tryoni may provide useful indicators of the generality of evolutionary responses found in Drosophila.

The observed cline in microsatellite frequencies may also result from neutral evolutionary processes (Endler, 1977). The simplest model is one where there is secondary contact of populations that have previously differentiated by genetic drift. Those populations may either have been in complete isolation or partially isolated by an isolation-by-distance mechanism. Nevertheless, it has been suggested that selection might be inferred where a subset of neutral loci show a clinal pattern parallel to an environmental gradient (Gockel et al., 2001). However, simulations by Vasemägi (2006) have shown that isolation-by-distance alone can produce similar correlations between a subset of neutral markers and geographical separation. Therefore, a cline in one or a few neutral loci does not necessarily provide evidence of selection where isolation-by-distance has been demonstrated.

Accordingly, our results cannot exclude the possibility that the cline formed due to neutral processes. First, there was isolation-by-distance at the scale of the cline. Second, we have shown that populations at either end of the cline were genetically differentiated source populations, thereby providing the migrants necessary for admixture along the cline. The simulations of Vasemägi (2006) also suggest that clines may form in the absence of selection. Generalizing from simulations requires caution since, as with many tests for neutrality, results are highly sensitive to the demographic assumptions (Nielsen, 2005). Vasemägi's simulations assumed island populations each with a population of 1000 and migration rates ranging from 0.001 to 0.05. Similar parameters were observed in our results: effective population sizes were of the order of 10²–10³ and migration rates (inferred with caution from FST values; Whitlock and McCauley, 1999) were in the region of 0.005–0.05. A final ecological factor to consider is that larger towns support larger fruit fly populations due to the greater number of exotic host fruit trees (NSW Department of Primary Industries data). The towns at either end of the observed cline are the largest towns in inland New South Wales. They have human populations an order of magnitude larger than any other towns in the study region. In the South, sites 11, 15 and 19 have a combined population of 80 000 people, with the next largest town having only 6000 people. Similarly in the North, site 2 has a population of 30 600, whereas the remaining towns average approximately 5000 people. The towns at either end of the cline have relatively high patch quality and may therefore produce correspondingly high propagule pressures.
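The conversion from FST to a migration rate referred to above can be sketched under the island-model approximation FST ≈ 1/(4·Ne·m + 1). This is an idealized relationship whose assumptions rarely hold in natural populations (Whitlock and McCauley, 1999), so the output should be read only as an order-of-magnitude guide. The function name is hypothetical, and the input values are taken from the overall FST and the range of Ne reported in this study.

```python
def island_model_migration_rate(fst, ne):
    """Migration rate m implied by FST under the infinite island model.

    Rearranged from FST = 1 / (4 * Ne * m + 1).
    """
    return (1.0 / fst - 1.0) / (4.0 * ne)

# Overall FST of 0.043 combined with effective sizes of 1000 and 100.
m_large = island_model_migration_rate(0.043, 1000)
m_small = island_model_migration_rate(0.043, 100)
```

With these inputs the implied migration rates span roughly one order of magnitude, depending on the effective size assumed.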

In summary, we have uncovered an unusual situation in which an invading pest species has reached a climatic range limit but has formed a differentiated source population at that limit. Future investigations will seek to identify any phenotypic divergence between that peripheral population and the core populations, as such results may explain the cline and indicate the evolutionary path of the southern population. Those studies should also provide a useful comparison with the numerous Drosophila-based studies already undertaken in the same region.