Introduction

A major goal of evolutionary biology is to understand selection in nature and how this influences phenotypic and genotypic evolution (Dobzhansky, 1970; Lewontin, 1974; Wilkinson et al., 2015). One of the most significant developments in this regard was the rejection of Fisher’s assertion that selection would be so weak that it could not be measured in the wild (Fisher, 1999). Work by the ecological geneticist Ford did much to dispel this belief and Ford is largely responsible for the first field estimates of selection and for developing techniques to detect differential survival in nature (Dowdeswell et al., 1949).

One of Ford’s most striking findings came from work on the Isles of Scilly, an archipelago off the Cornish coast in the extreme south west of the British Isles. These islands lie close together, separated by at most 15 km and at least 150 m. The islands fall into two distinct size classes: small islands of 20 hectares or less and large islands of 130 hectares or more. Ford and colleagues (Fisher, Dowdeswell and McWhirter to name a few) extensively studied the wing-spot patterns of female meadow brown butterflies, Maniola jurtina, on this island chain (Creed et al., 1964).

Ford and colleagues scored hind wing-spot patterns across the islands. Meadow browns have a melanic wing-spot polymorphism on the underside of the hind wing, with females having between 0 and 5 spots with a modal number of 2 spots. In work spanning 14 years (from 1946 to 1959), largely on north-west islands, Ford and co-workers found that female wing-spot distributions on each island changed very little over time, with two exceptions discussed below (Ford, 1964). However, although wing-spot patterns on the larger islands were broadly similar, having more or less identical proportions of 0, 1 and 2 spot females, they differed greatly on the smaller islands where populations were significantly differentiated in their wing-spot pattern distributions. Furthermore, major ecological disturbance was found to be associated with a change of wing-spot frequencies. For example, removal of a cattle herd from the island of Tean resulted in vegetational changes and the emergence of a new wing-spot distribution pattern that subsequently remained constant. Similar changes were seen on White Island after a severe storm (Ford, 1964). More recent work in the 1970s documented wing-spot patterns similar to those initially reported by Ford (Handford, 1973). However, no surveys have been conducted on the islands subsequent to the 1970s and thus longer-term patterns of wing-spot stability remain to be elucidated.

Ford and colleagues described wing-spot patterns as a ‘type of variation which responds quickly to the effects of selection’ (Dowdeswell et al., 1949). It was thought that Maniola wing spots were under multifactorial control and modern molecular genetic approaches have confirmed that butterfly wing patterns can indeed be polygenic (Beldade and Brakefield, 2002). Aside from establishing a link between environmental changes and alterations in wing-spot frequencies, the actual selective forces thought to be affecting spot patterning have never been clearly outlined. However, McWhirter (1969) used crosses to investigate the heritability of spot patterns, and although the sample size used for this analysis was small, he estimated heritability to be 60% for females. Subsequent work based on larger sample sizes confirmed that female spottiness is highly heritable (h²=0.89) (Brakefield, 1984; Brakefield and Vannoordwijk, 1985). Coupled with the long-term stability of spot patterns and rapid phenotypic responses to environmental change, this high heritability has been interpreted as being consistent with the theory of island-specific selection (Brakefield, 1984; Brakefield and Vannoordwijk, 1985).

To assess movement patterns, mark-release-recapture studies of M. jurtina were originally performed on the small (0.16 km²) and uninhabited island of Tean. Three discrete butterfly populations were found, separated by areas of unsuitable habitat, and Ford and colleagues investigated how much migration there was between the habitat patches. Of the 183 marked butterflies that were recaptured, only four were found in new areas. During the same period, M. jurtina were also marked and released on St Martin’s, the closest large island to Tean, and no migration was observed between the two islands. These findings together suggested that dispersal may be restricted both within and between islands (Dowdeswell et al., 1949).

The discovery of restricted dispersal was taken as further evidence for strong selection operating on the different islands because, for the most part, population sizes were considered too large (>15 000 individuals per season on Tean and St Helen’s) for phenotypic differences to be attributed to genetic drift (Dowdeswell et al., 1957; Ford, 1975). However, Dobzhansky and Pavlovsky (1957) argued that ‘genetic divergence was initiated by the island populations being derived from small numbers of immigrants from the mainland or from other islands’, and hence that founder effects and genetic drift could have played an important role when the islands were colonized. Waddington (1957) also suggested that occasional crashes in population size might potentially influence spot pattern through ‘intermittent drift’ and concluded that Ford’s data ‘certainly provide no compelling grounds for rejecting the possibility’ of drift (pages 86–87). Ford continued to dismiss the effects of drift, arguing that a severe drought in 1957 markedly altered both vegetation and wing-spot patterns on St Martin’s, whereas wing-spot patterns on parts of Tresco changed rapidly in response to environmental perturbation but subsequently returned to their historical patterns without any fluctuations in population size (Ford, 1975, pages 59–64).

While Ford believed that the islands were not connected by gene flow, despite the small geographic distances between some of them, to date there has been no genetic work to assess patterns of connectivity among the island populations. Ford’s M. jurtina work has become one of the historical cornerstones of ecological genetics, yet two fundamental tenets—long-term stability of island-specific wing-spot patterns and restricted migration—lack conclusive empirical support. We therefore assessed contemporary wing-spot patterns from five islands to provide a comparison with the historical data, and deployed amplified fragment length polymorphisms (AFLPs) to determine whether the mark-release-recapture estimates of migration rates between the islands are supported by measurements of population genetic structure.

Materials and methods

Specimen collection

A total of 235 M. jurtina butterflies (221 females and 14 males) were collected from three large islands, St Mary’s (n=54), St Martin’s (n=49) and Tresco (n=54), and two small islands, Tean (n=38) and St Helen’s (n=26). Head and abdomen tissues were removed and stored in DESS DNA preservation buffer (20% dimethyl sulphoxide, 0.25 M EDTA, salt). To minimize wing scale loss and damage through handling, the wings remained attached to thoraxes and were stored in individual envelopes for subsequent imaging.

Wing-spot scoring

Ford and colleagues measured wing-spot patterns by eye in the field, writing ‘A spot was regarded as absent if it could not be distinguished from a black scale which might have occurred anywhere on the wing’ (Dowdeswell and Ford, 1952). Later, Brakefield and Dowdeswell (1985) analysed 298 female meadow brown butterflies from England and independently scored wing-spot patterns using traditional field estimates and microscopy. They found that scores for wing spots ~0.43 mm or larger did not differ significantly between the two methods. Using this value as a guide to the presence/absence of a spot, we photographed and digitally measured wing spots, counting a spot as present if it had a minimum size of 0.43 mm on at least one of the hindwings. First, wing-spot numbers were visually inspected and counted by two independent observers (SWB and DJH). Digital images of 220 samples were then taken, each including a scale bar. Finally, wing-spot positions were numbered from one (anal) to six (costal) following Brakefield and Vannoordwijk (1985), and ImageJ v1.37 (Schneider et al., 2012) was used to calculate spot diameters on both wings.
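The presence/absence rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the scoring pipeline actually used: the function name, the dictionary layout and the example diameters are all hypothetical, and the rule is read here as "a spot counts if its diameter reaches 0.43 mm on the left or the right hindwing".

```python
# Minimal sketch of threshold-based spot counting (hypothetical helper).
# Positions follow the one (anal) to six (costal) numbering in the text.

THRESHOLD_MM = 0.43  # threshold taken from Brakefield and Dowdeswell (1985)


def count_spots(left, right, threshold=THRESHOLD_MM):
    """Count spot positions whose diameter meets the threshold on at least
    one hindwing.

    left, right: dicts mapping spot position (1-6) to diameter in mm.
    A position missing from a dict is treated as having no spot on that wing.
    """
    positions = set(left) | set(right)
    return sum(
        1
        for pos in positions
        if max(left.get(pos, 0.0), right.get(pos, 0.0)) >= threshold
    )


# Hypothetical measurements for one female (not data from this study).
example_left = {2: 0.61, 3: 0.40}
example_right = {2: 0.58, 3: 0.45, 5: 0.20}
example_count = count_spots(example_left, example_right)
```

In this made-up example, positions 2 and 3 pass the threshold on at least one wing, whereas position 5 does not, so the individual would be scored as a two-spot female.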

Genetic analysis

DNA was isolated from head and antennal tissue of 192 samples using the DNeasy 96 Blood & Tissue Kit (Qiagen, Manchester, UK). These comprised 37 samples from St Mary’s, 40 from St Martin’s, 48 from Tresco, 37 from Tean and 30 from St Helen’s. AFLP templates were prepared using the AFLP Core reagent kit (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s protocol, except we used half reaction volumes. Twelve selective AFLP primer combinations were analysed, comprising EcoRI-ACA paired with the following MseI combinations: CAG, CAT, CCT, CGC, CGG, CGT, CTG and CTT, and MseI-CCAT paired with the following EcoRI combinations: ACA, ACC, ACG and ACT. PCR amplification incorporated α-³³P-dCTP into amplicons and the products were resolved by electrophoresis on standard 6% polyacrylamide sequencing gels and detected by autoradiography. Exposed X-ray films were assessed and, if required, a second exposure was made for an adjusted time period. All bands in the approximate size range of 75–300 bp were scored manually by an experienced operator (JIH). Only clear bands with minimal size variation that could be scored in all individuals were included, these being recorded as 1=present and 0=absent. It was assumed that AFLP bands that were the same size across individuals represented homologous markers.

Summary statistics and isolation by distance

Genetic distance between populations was calculated using two programs. The first of these, AFLPSURV (http://www.ulb.ac.be/sciences/lagev/aflp-surv.html), follows an approach for calculating Wright’s Fst outlined by Lynch and Milligan (1994). The second program, Mcheza (Antao and Beaumont, 2011), calculates Fst using only presumed neutral loci, as outlined below. The R package ade4 was used to implement Mantel’s test to compare matrices of genetic (Fst) and geographic distance from all five islands. Geographic distance was calculated using both island mid-points and the latitude and longitude of central collection sites (Tresco 49.950, −6.338; St Mary’s 49.906, −6.302; St Martin’s 49.965, −6.298; Tean 49.967, −6.310; St Helen’s 49.973, −6.325). Analysis of molecular variance was implemented within Arlequin version 3.5.2.2 (Excoffier and Lischer, 2010) with 1000 permutations of the data set, and only individuals with a maximum of 5% missing data were included.
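To make the isolation-by-distance test concrete, the following Python sketch computes great-circle distances between the central collection sites listed above and runs a simple permutation-based Mantel test. It is an illustrative stand-in, not the ade4 implementation used in the study: the haversine formula, the Earth radius, the one-sided P-value and all function names are assumptions, and the Fst matrix is a hypothetical placeholder, not the values reported in Table 2.

```python
import math
import random

# Central collection sites (latitude, longitude) as given in the text.
SITES = {
    "Tresco": (49.950, -6.338),
    "St Mary's": (49.906, -6.302),
    "St Martin's": (49.965, -6.298),
    "Tean": (49.967, -6.310),
    "St Helen's": (49.973, -6.325),
}
EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def upper_triangle(mat, order):
    """Upper-triangle entries of mat after reordering rows and columns."""
    n = len(order)
    return [mat[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(gen, geo, n_perm=999, seed=0):
    """Mantel r and one-sided permutation P-value (r_perm >= r_obs)."""
    idx = list(range(len(gen)))
    geo_vec = upper_triangle(geo, idx)
    r_obs = pearson(upper_triangle(gen, idx), geo_vec)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        perm = idx[:]
        rng.shuffle(perm)  # permute islands jointly in rows and columns
        if pearson(upper_triangle(gen, perm), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


names = list(SITES)
geo = [[haversine_km(SITES[a], SITES[b]) for b in names] for a in names]

# HYPOTHETICAL pairwise Fst values for illustration only (not Table 2).
fst_demo = [
    [0.00, 0.02, 0.03, 0.03, 0.04],
    [0.02, 0.00, 0.02, 0.03, 0.03],
    [0.03, 0.02, 0.00, 0.01, 0.02],
    [0.03, 0.03, 0.01, 0.00, 0.02],
    [0.04, 0.03, 0.02, 0.02, 0.00],
]
r_demo, p_demo = mantel(fst_demo, geo)
```

With only five islands there are just 120 distinct permutations, so the resolution of any permutation P-value is necessarily coarse; an exhaustive enumeration would be a reasonable alternative to random shuffling.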

Before estimating inbreeding coefficients for the individuals, dominant AFLP loci potentially under directional or stabilizing selection (Fst outlier loci) were removed using Mcheza (Antao and Beaumont, 2011). The analysis was performed using 10⁶ permutations of the data set with a 95% confidence interval and 5% false discovery rate. The program I4A (Inbreeding for AFLPs) (Chybicki et al., 2011) was then used to estimate the inbreeding coefficient F for all samples and for samples pooled by island. This analysis was performed using 100 000 steps following 10 000 burn-in steps. As I4A requires the prior values of a β-distribution to be entered, three initial starting values of α=β were applied (0.1, 1.0 and 5.0) following Oleksa et al. (2013).

Bayesian analysis of population structure

To identify genetic populations without prior knowledge of the locations from which individual butterflies were sampled, we analysed the data set using Structure version 2.3.3 (Pritchard et al., 2000). This program uses a Bayesian model-based clustering approach to evaluate the most likely number of distinct genetic populations in the sample (K) as well as which individuals are most likely to belong to each of the populations. The membership of each individual to a given population is estimated as q, which varies between 0 and 1, the latter indicating full membership. We ran five runs for each value of K from 1 to 10 using 10⁶ Markov chain Monte Carlo iterations after a burn-in of 10⁵, the correlated allele frequencies model and assuming admixture. The most likely number of genetic populations was evaluated using both the maximal average value of Ln P(D), a model-choice criterion that estimates the posterior probability of the data, and ΔK, an ad hoc statistic based on the second order rate of change of the likelihood function with respect to K (Evanno et al., 2005). Eleven individuals with >10% missing data were excluded from this analysis.
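The ΔK statistic mentioned above can be illustrated with a short Python sketch. It follows one common reading of Evanno et al. (2005), in which the absolute second difference of the mean Ln P(D) across replicate runs is divided by the standard deviation of Ln P(D) at that K. The function name and the Ln P(D) values are hypothetical and do not come from the Structure runs reported here.

```python
from statistics import mean, stdev


def delta_k(lnp_runs):
    """Compute Delta K for each interior K.

    lnp_runs: dict mapping K to a list of Ln P(D) values, one per
    replicate run. Delta K is undefined for the smallest and largest K
    because the second difference needs both neighbours.
    """
    means = {k: mean(v) for k, v in lnp_runs.items()}
    result = {}
    for k in sorted(lnp_runs):
        if k - 1 not in means or k + 1 not in means:
            continue  # no second difference at the edges
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        # Sample standard deviation across replicates; zero spread
        # would make Delta K undefined (division by zero).
        result[k] = second_diff / stdev(lnp_runs[k])
    return result


# HYPOTHETICAL Ln P(D) values, three replicate runs per K.
lnp_demo = {
    1: [-5000, -5002, -4998],
    2: [-4800, -4803, -4797],
    3: [-4500, -4501, -4499],
    4: [-4490, -4495, -4485],
    5: [-4485, -4495, -4475],
}
dk_demo = delta_k(lnp_demo)
best_k_demo = max(dk_demo, key=dk_demo.get)
```

In this made-up data set the mean Ln P(D) rises steeply up to K=3 and then plateaus, so ΔK peaks at K=3. Because ΔK cannot be evaluated at K=1, it is usually read alongside the raw Ln P(D) values, as was done in this study.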

We also analysed our data using the LOCPRIOR model within Structure that uses knowledge of the sampling locations of the individuals to assist clustering (Hubisz et al., 2009). This model favours solutions that correlate with sampling location while ignoring the geographic information whenever this is uninformative about the ancestry of individuals. It tends to outperform the standard model, generating more accurate estimates of K and improved membership coefficients. We ran this analysis as described above, but classifying individuals according to the island they were sampled from and setting the option LOCPRIOR to 1.

Results

Wing-spot frequencies

Hindwing spot numbers varied between 0 and 4, with a total of 215 spots visually assessed on the left hindwings of 220 individuals (Figure 1). A single individual from St Mary’s was too damaged to phenotype. Photographic analysis indicated that 203 of these spots (94.4%) were above the 0.43 mm threshold established by Brakefield and Dowdeswell (1985) and only these spots were further analysed.

Figure 1

Meadow brown specimens were collected from the Isles of Scilly (centre map). The outer graphs show melanic hindwing spot frequencies of female meadow brown butterflies during the 1950s (orange) and 2008 (brown) from five islands. Minimal changes in spot variation were observed within St Mary’s (χ²=4.70, P=0.31), Tresco (χ²=2.39, P=0.664) and Tean (χ²=1.47, P=0.832), whereas the distributions for St Martin’s (χ²=18.32, P=0.001) and St Helen’s (χ²=16.00, P=0.003) differed significantly between the two periods. Examples of melanic spots (0, 1, 2 or 3) on female meadow brown hindwings are shown below, highlighted with arrows.

χ² contingency tables were used to compare wing-spot frequencies among the islands (Table 1). Significant differences in wing-spot patterns were observed between the large island of St Martin’s and each of Tresco (P=0.011), Tean (P=0.015) and St Helen’s (P=0.004). These differences appear to be largely attributable to the high frequency of individuals on St Martin’s without any spots (65.3%) and the low frequency of individuals with a single spot (8%). The wing-spot profile of St Helen’s was characterized by a lower than average frequency of individuals carrying two spots (Table 1). There was also a marginally nonsignificant tendency for the spot pattern to differ between St Helen’s and Tresco (P=0.08). No other comparisons were statistically significant.

Table 1 Pairwise comparisons of wing-spot frequencies among the five islands

Comparison with historical wing-spot patterns

As raw data for historical wing-spot frequencies (1946–1959) have not been collectively published, we estimated spot pattern frequencies from line graphs (Ford, 1960, 1964) to enable comparison between historical and contemporary data sets (Figure 1 and Supplementary Table 1). We found that contemporary wing-spot distributions did not differ significantly from historical records for three of the islands (Tresco, St Mary’s and Tean: all comparisons χ²<7.9; P>0.1). However, significant differences from pre-1960 wing-spot patterns were observed for St Helen’s (χ²=18.5; P=0.004) and St Martin’s (χ²=23.8; P<0.001).
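Comparisons of this kind rest on Pearson's χ² test applied to a contingency table of spot-class counts. The Python sketch below shows the calculation under stated assumptions: the counts are entirely hypothetical, the function names are invented for illustration, and the P-value uses the closed-form survival function that holds only for an even number of degrees of freedom. Four degrees of freedom (two samples by five spot classes) is an assumption made here for illustration, not a value stated in the text; under that assumption, χ²=2.39 and χ²=1.47 give P of roughly 0.664 and 0.832, in line with the values quoted for Tresco and Tean in the Figure 1 legend.

```python
import math


def chi2_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    table: list of rows of observed counts. Every row and column total
    must be non-zero, otherwise an expected count would be zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable for even df only.

    Uses exp(-x/2) * sum_{i < df/2} (x/2)^i / i!, which is exact when df
    is a positive even integer.
    """
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(df // 2)
    )


# HYPOTHETICAL counts of females with 0, 1, 2, 3 and 4 spots in a
# historical and a contemporary sample (not data from this study).
historical_demo = [30, 28, 32, 8, 2]
contemporary_demo = [16, 14, 17, 5, 2]
stat_demo, df_demo = chi2_statistic([historical_demo, contemporary_demo])
p_demo_chi2 = chi2_sf_even_df(stat_demo, df_demo)
```

With sparse classes such as three or four spots, expected counts can fall below conventional thresholds, in which case pooling classes or using an exact or permutation test may be preferable.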

Genetic analysis of population structure

To determine whether the Isles of Scilly comprise a single homogeneous population of meadow brown butterflies or several structured island populations, AFLP analysis was performed on 192 samples, generating 176 polymorphic bands. A permutation test for genetic differentiation among the five islands based on 10 000 randomizations of the data set indicated a strong deviation from the null hypothesis of no genetic structure (Fst=0.026, AFLPSURV, P<0.0001). Pairwise Fst values between the islands ranged between 0.01 and 0.04 and were all individually significant (Table 2). However, no correlation was observed between genetic and geographic distance, indicating the lack of an isolation-by-distance pattern (Mantel’s r=−0.059, P=0.561 using island mid-point; Mantel’s r=−0.051, P=0.514 using latitude and longitude of central collection sites).

Table 2 Pairwise Fst values (below diagonal) and associated P-values (above diagonal) calculated using AFLPSURV

To further test for population structure, we used analysis of molecular variance to determine the proportion of genetic variation attributable to among-population and within-population variance components. A significant proportion of the variance in the AFLP data (6%) was partitioned at the uppermost hierarchical level (Table 3), indicating clear evidence for genetic differences and thus restricted dispersal among the five islands.

Table 3 Analyses of molecular variance (AMOVA) results for AFLP loci with <5% missing data

To test whether genetic differentiation among the islands could be detected in the absence of prior data on the sampling locations of individuals, we also conducted a Bayesian analysis of the AFLP data set using the program Structure (Pritchard et al., 2000). Five replicate runs were conducted for each possible number of populations (K) ranging from 1, implying no population differentiation, through to 10. The highest Ln P(D) and ΔK values were associated with K=5 (Figure 2a), providing support for the presence of five distinct populations. Membership coefficients for these inferred populations are summarized in Figure 3a, in which each vertical bar represents a different individual and the relative proportions of the different colours indicate the probabilities of belonging to each population. This shows that when individuals are classified according to their sampling locations, most of the islands are reasonably well defined, although some admixture is present.

Figure 2

Estimating the number of likely populations of M. jurtina collected from five islands using Structure. Ln P(D) are represented as discrete points (with error bars based on five repetitions) and ΔK is shown as a continuous line graph. Results are shown for both the (a) standard and (b) LOCPRIOR models (see Materials and methods for details).

Figure 3

Group membership coefficients derived using the program Structure for 181 M. jurtina samples for (a) K=5 using the standard model; and (b) K=6 using the LOCPRIOR model that exploits prior sampling information. Each individual is represented by a vertical line partitioned into coloured segments, the lengths of which indicate the probability of membership in each group.

Reanalysing the data using the LOCPRIOR model within Structure, which makes use of the sampling locations of individuals to inform cluster assignments, Ln P(D) and ΔK peaked at K=6 (Figure 2b). The overall clustering result appears broadly similar to when the standard Structure model was used, although the level of admixture is reduced, making the populations more clearly defined (Figure 3b). The primary difference observed is that, under the standard model, Tean and St Helen’s appear genetically quite similar, whereas with the LOCPRIOR model, Tean appears more similar to St Martin’s.

Finally, we used the program I4A to estimate inbreeding coefficients based on putatively neutral AFLP loci. Eight bands with Fst values falling outside the distribution expected under neutrality were first removed, as they may represent outlier loci under strong directional or stabilizing selection. This generated a global Fst estimate of 0.0298 based on the remaining 168 loci, and this is very close to the value obtained using AFLPSURV. Neutral AFLP loci for all samples were then analysed within I4A using three different β-distribution priors (α=β, 0.1, 1.0, 5.0), all of which produced consistent estimates (Table 4). Inbreeding coefficients for each island were then determined for the same three β-distribution priors. Appreciable variation was observed among the five islands, with lower inbreeding estimates being obtained for the small islands of Tean and St Helen’s relative to the larger islands of St Mary’s and St Martin’s.

Table 4 Inbreeding coefficients estimated with neutral AFLP markers using I4A

Discussion

Documenting selection and local adaptation in the wild remains a major focus of evolutionary biology and the work of Ford did much to define and drive this agenda. Here, we revisited a classical model of ecological genetics used by Ford and his colleagues: spot patterns on the wings of female M. jurtina butterflies on the Isles of Scilly. We found that butterflies from some islands differed in their wing-spot patterns (principally St Martin’s from other islands) and that patterns on some of the islands matched those reported in 1964, whereas others did not. Finally, despite the islands being in close geographical proximity, all five of the putative butterfly populations were significantly differentiated from one another. Our results are consistent with Ford’s early observations and also provide the first empirical support for the claim that gene flow is restricted among the islands.

As with earlier work (reviewed in Ford, 1964; Handford, 1973), we found that wing-spot distribution patterns could remain stable within islands, yet vary among them. Island-specific wing-spot patterns have been reported for other Maniola populations (see, for example, Scali, 1972), suggesting that the Scillies are not unusual in this regard. Our results also indicate that wing-spot patterns have remained unchanged on three of the islands over the 44 years between the original reports (Ford, 1964) and 2008, when we resampled the same locations. The large islands of St Mary’s and Tresco have similar ‘flat top’ profiles to those described by Ford (1964), exhibiting temporally consistent ratios of 0, 1 and 2 spot phenotypes. The small island of Tean also showed a similar bimodal spot distribution to that originally reported by Ford (1964). Temporal stability in Maniola wing-spot patterns has been reported elsewhere, including throughout Western Europe (Dowdeswell and McWhirter, 1967). However, ours is the first study to show that wing-spot patterns can persist virtually unchanged for several decades.

In contrast to the stability of wing-spot patterns on three islands, significant temporal changes in wing-spot frequencies were observed on St Helen’s and St Martin’s. The reasons for this remain unclear, although previously changes in wing-spot frequencies at other sites have been attributed to ecological changes that took place at the same time. For example, Maniola wing-spot distribution patterns altered after grazing was stopped on Tean, but thereafter reached a new stable wing-spot pattern. Similarly, changes in wing-spot frequency have also been observed after particularly heavy storms (Ford, 1964). Historically, the cessation of grazing on Tean led to an increase in the frequency of two-spot females and a decrease in the frequency of females with no spots (Ford, 1964). However, we found that the cessation of grazing on St Helen’s since Ford’s initial studies was instead associated with an increase in the number of females with no spots. As we lack detailed ecological data for St Helen’s, it would clearly be desirable to conduct follow-up studies with larger numbers of individuals over consecutive years.

The relative importance of genetic drift and selection in shaping Maniola wing-spot patterns has been the subject of considerable debate. Dobzhansky and others argued that founding island populations with different gene frequencies could account for wing-spot differences between islands, or that genetic drift could occur when population sizes are low (Dobzhansky and Pavlovsky, 1957; Waddington, 1957). This could potentially account for changes in wing-spot frequencies associated with storms if the associated mortality was high enough to accelerate genetic drift. However, Ford argued that founder effects should not lead to a pattern in which the larger islands have similar wing-spot patterns (Ford, 1964). He also argued that if drift played a major role in shaping wing-spot patterns, these should be less stable over time on smaller islands than larger ones (Ford, 1964). We found no clear relationship between island size and long-term temporal stability in wing-spot patterning. Taken at face value, our results are not consistent with the hypothesis that drift is the main factor driving changes in wing-spot frequencies, as under such a scenario we would have expected to observe marked changes in wing-spot frequencies on the small but not the large islands. However, our study has the limitations that we could only sample from five islands, and we have little in the way of ecological data to explore the extent to which temporal changes are associated with ecological changes. We also lack data on the effective population sizes of the various populations, and elucidating the potential role of founder effects would require all of the potential source populations on the mainland to be sampled. This lies beyond the scope of the current study but provides a fertile avenue for future research.

Consistent with Ford’s assumption that migration was restricted among the Scilly Islands, we found clear evidence for population structure (global Fst=0.026 for all loci and 0.029 for putatively neutral loci). This was strong enough to be detected using Bayesian structure analysis without the inclusion of a priori information on the locations from which individuals were sampled, although, as expected, a somewhat cleaner clustering solution was obtained when the LOCPRIOR model was used. It is unclear why Structure favoured a best clustering solution of K=6 with the LOCPRIOR model given that there are only five island populations. However, this appears to be reflected in some degree of admixture within St Martin’s rather than by a distinct split within one of the islands, as would be the case if substructure was present. The admixture observed within St Martin’s would appear to reflect genetic similarity to the island of Tean, as these two islands also had one of the lowest observed pairwise Fst values (Table 2). This is consistent with the close geographical proximity of these islands (<300 m apart at their closest point) and the fact that prevailing winds commonly blow from a west-south-westerly direction during July (www.windfinder.com) when these butterflies are abundant. Minor levels of admixture were also observed between St Mary’s and the two other large islands, St Martin’s and Tresco (Figure 3b), which also have the potential for wind-assisted migration.

Given the small geographical distances among the islands of the Scilly archipelago, the discovery of population structure in an insect capable of flying short distances is remarkable, especially given that previous studies of other flying insects have found comparatively weaker population structure over much larger geographic scales (Demont et al., 2008). Our results also contrast somewhat with those of a previous study of M. jurtina that found relatively weak population structure across southern England, although in this particular case Fst was estimated from allozymes that tend to have lower levels of variability than AFLPs (Goulson, 1993). Regardless, Ford’s initial mark-release-recapture experiments on Tean showed that vegetation-free zones on this island acted as barriers to butterfly movement, implying that interisland aquatic barriers may be even stronger (Ford, 1964). Thus, our findings are consistent with previous work on the Scillies, and support the conclusion that even relatively narrow stretches of open sea can represent an effective barrier to gene flow in flying insects.

We also estimated inbreeding coefficients (F) to allow further comparisons to be made among the islands. The program we used typically yields slightly higher estimates of F from dominant AFLP markers than from microsatellites (Chybicki et al., 2011) but any such biases should apply equally to all five of the islands. Appreciable variation was found among the islands, with two larger islands tending to have greater average F values than the smaller islands. This contrasts with our initial expectation that levels of inbreeding should be higher on the smaller islands, and suggests unforeseen complexities. One possibility is that although larger islands theoretically support larger butterfly populations, urbanization and farming could have fragmented these populations into more localized patches within which inbreeding may be more prevalent. Alternatively, or in addition to this, we may have unintentionally sampled families or sub-populations on the larger islands. Regardless of the exact explanation, which would require far more exhaustive sampling and population genetic analysis within islands to determine, high levels of inbreeding often arise in isolated populations, lending further support to the notion that dispersal is highly restricted in this species on the Scilly Isles.

The exact nature of any selection that could be (directly or indirectly) targeting spot patterns in this species remains open to debate. Previously, it has been suggested that the wing spots themselves are unlikely to be under direct selection, and that more likely they may be indirectly associated with physiological responses to particular environments (Dowdeswell et al., 1957). However, Brakefield (1984) found that wing spottiness was associated with butterfly movement, suggesting either that predation could vary with the different phenotypes or that when individuals are blown out to sea, certain phenotypes may be better able to fly back to land (Bengtson, 1981). There have also been suggestions that susceptibility to parasitoids or pathogens could correlate with wing spottiness during female development (Dowdeswell, 1961, 1962; reviewed in Brakefield, 1984). This is clearly a complex, multidimensional problem that cannot be solved by a single study. However, our findings support many of Ford’s original contentions and we therefore hope to continue our work on these islands in order to more definitively support or refute the larger claims made by Ford and co-workers in this iconic system.

Data archiving

Data files of wing-spot sizes and AFLP genotypes are available from the Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.j7v42.