Introduction

Robust estimation of genetic structure and gene flow requires information from many loci, so that ‘badly behaved loci’ may be identified and excluded from the analysis (Slatkin, 1985; Goudet et al., 1994). It may also be necessary to use different classes of genetic marker. For example, Karl & Avise (1992) and Raybould et al. (1996a, 1997) have shown that single-copy nuclear restriction fragment length polymorphisms (SCN RFLPs) can give estimates of gene flow that are very different from those obtained from isozymes, possibly because of some form of selection at isozyme loci.

Molecular biology techniques provide plant population geneticists with an almost limitless source of highly polymorphic neutral markers with which to estimate gene flow. Since the mid-1980s, SCN RFLPs have proved very valuable to geneticists producing genetic maps of crop species (Helentjaris & Burr, 1989). Nevertheless, despite providing a potentially large number of very polymorphic codominant loci, they have been little used in the study of natural plant populations. Reasons for this may be connected with the availability of probes, problems of isolating good quality DNA from wild plants, the use of radioisotopes, high expense, time constraints and the relatively high degree of skill needed to use RFLPs successfully (e.g. Rafalski & Tingey, 1993). More recently, microsatellites (also called simple sequence repeats, SSRs) have been introduced into many areas of genetics that might benefit from the use of RFLPs (Jarne & Lagoda, 1996). They seem to have several advantages over RFLPs: shorter development and processing times, no need for radioisotopes or high-quality DNA, and generally much less complex experimental protocols. At the same time, they retain the potential of RFLPs to deliver a large number of highly polymorphic codominant loci. However, if primer sequences are not available, start-up costs can be high (Rafalski & Tingey, 1993).

Microsatellites are beginning to fulfil their potential in areas such as genome mapping (e.g. Szewc-McFadden et al., 1996) and the study of relatedness in social animals (e.g. Seppa & Gertsch, 1996). They should also prove very suitable for testing models of population genetic structure. A rigorous test of the suitability of microsatellite variation requires a system in which genetic structure has been described independently of microsatellite variation. The cliff-top populations of Beta vulgaris ssp. maritima (sea beet) in Dorset provide such a system. Gene flow (Nm) among these populations, derived from FST estimates, has been analysed using SCN RFLPs and isozymes (Raybould et al., 1996a, 1997). Regressions of log Nm on log distance between all pairs of populations (Slatkin, 1993) indicate that isolation by distance operates in these populations. In this paper, we examine whether microsatellite variation can also be used to detect isolation by distance in these populations.
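The regression of log Nm on log distance can be illustrated with a short Python sketch. It assumes the standard island-model conversion Nm = (1/FST − 1)/4, which the text above does not state explicitly, and it uses invented distances and FST values purely for illustration. The function names are hypothetical and this is not the software used in the studies cited.

```python
import math


def nm_from_fst(fst):
    """Convert FST to Nm, assuming the island-model relation Nm = (1/FST - 1)/4."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def log_nm_on_log_distance(pairs):
    """pairs: iterable of (distance, FST) for pairs of populations.

    Returns the slope of log10(Nm) on log10(distance); a negative slope is
    the pattern expected under isolation by distance.
    """
    log_d = [math.log10(d) for d, _ in pairs]
    log_nm = [math.log10(nm_from_fst(f)) for _, f in pairs]
    return ols_slope(log_d, log_nm)


# Hypothetical pairwise values (distance in arbitrary units, FST), not data from this study.
example_pairs = [(1.0, 0.05), (10.0, 0.10), (100.0, 0.20)]
slope = log_nm_on_log_distance(example_pairs)
```

Because pairwise comparisons are not independent, the significance of such a slope would normally be assessed by a matrix permutation test and not by ordinary regression statistics.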

There is much debate over how FST will perform as a measure of genetic distance at microsatellite loci, which tend to have high mutation rates (e.g. $10^{-3}$ per generation in humans; Weber & Wong, 1993) and usually mutate by gain or loss of one or two repeats per mutation event. Slatkin (1995) has shown that these properties fail to satisfy the assumptions that allow demographic parameters (such as Nm) to be inferred from FST. These assumptions are that mutation rates are low and that mutation obeys an infinite alleles model (or k-alleles model with k tending to infinity). If the stepwise mutation model with high mutation rates describes microsatellite variation better than the infinite alleles model with low mutation rates, parameters of genetic structure need to allow for the difference in allele size (number of repeats). Slatkin (1995) described such a parameter, RST, which is analogous to FST. RST is defined by variances of allele sizes such that

$$R_{ST} = \frac{\bar{S} - S_W}{\bar{S}}$$

where $S_W$ is twice the average of the estimated variances in allele size (number of repeats) between alleles within each subpopulation and $\bar{S}$ is twice the estimated variance in allele size in the population as a whole (Slatkin, 1995).
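As a rough illustration of Slatkin's (1995) definition, RST = (S̄ − Sw)/S̄, the following Python sketch computes a naive plug-in value from lists of allele sizes (repeat counts). It assumes one list of allele sizes per subpopulation, uses unweighted sample variances, and ignores the sample-size corrections of published estimators; it is not the Michalakis & Excoffier (1996) estimator used later in this paper. The function name and the example repeat counts are hypothetical.

```python
from statistics import variance


def naive_rst(subpops):
    """Plug-in RST = (S_bar - S_w) / S_bar from allele sizes (repeat counts).

    subpops: list of lists, one list of allele sizes per subpopulation,
    each with at least two alleles. S_w is twice the mean within-subpopulation
    sample variance; S_bar is twice the sample variance of all alleles pooled.
    """
    if len(subpops) < 2 or any(len(s) < 2 for s in subpops):
        raise ValueError("need at least two subpopulations with two or more alleles each")
    s_w = 2.0 * sum(variance(s) for s in subpops) / len(subpops)
    pooled = [size for s in subpops for size in s]
    s_bar = 2.0 * variance(pooled)
    if s_bar == 0.0:
        raise ValueError("no variation in allele size; RST is undefined")
    return (s_bar - s_w) / s_bar


# Hypothetical repeat counts: two subpopulations with well-separated allele sizes.
rst_divergent = naive_rst([[10, 10, 12, 12], [20, 20, 22, 22]])
# Hypothetical repeat counts: two subpopulations with identical allele sizes.
rst_identical = naive_rst([[10, 12, 14], [10, 12, 14]])
```

With identical subpopulations the plug-in value is slightly negative, a reminder that sample-based estimates of this kind are not bounded below by zero.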

The properties of parameters describing genetic variation based on the different mutation models have been examined by Slatkin (1995), Michalakis & Excoffier (1996) and Rousset (1996), and theoretical situations in which the estimators of the different parameters would be expected to characterize actual genetic distance best have been described. However, there is no consensus at present as to whether RST or similar estimators are more appropriate than FST for the analysis of microsatellite data from real populations (Jarne & Lagoda, 1996). Here, we use both Weir & Cockerham's (1984) FST estimator and Michalakis & Excoffier's (1996) estimator of RST in regressions of genetic and geographical distance to examine whether microsatellites can detect evidence of isolation by distance in populations in which this process is known to occur, and whether the different estimators vary in their ability to detect isolation by distance.

Materials and methods

Microsatellite analysis

Microsatellite analysis was carried out on plants from five cliff-top populations surveyed previously for isozyme and RFLP variation (50 plants per population; Raybould et al., 1996a, 1997). DNA was extracted from all remaining 230 tissue samples using the method of Edwards et al. (1991). Microsatellite sequences were amplified from 2.5 μL of (unquantified) DNA using the primers described by Mörchen et al. (1996). Amplifications were performed in 25 μL samples with one unit of Taq polymerase, 5 pmol of each primer, 5 μmol of each nucleotide and 1.2 per cent formamide. Denaturation was for 1.5 min at 94°C followed by 30 cycles of 2 min at the annealing temperature, 1.5 min at 72°C and 20 s at 94°C. The amplification was completed with 2 min at the annealing temperature and 5 min at 72°C. The annealing temperature was calculated as the melting temperature of the primer (Tm) minus 5°C, where Tm=2 (A+T bp)+4 (C+G bp). Amplification products were separated by electrophoresis on 10 per cent acrylamide gels followed by ethidium bromide staining (Mörchen et al., 1996).
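The annealing-temperature rule given above (Tm = 2 × (A+T) + 4 × (C+G), annealing at Tm − 5°C) can be written as a small Python helper. The function names are hypothetical, and the example primer is an invented sequence, not one of the primers of Mörchen et al. (1996).

```python
def melting_temperature(primer):
    """Tm in degrees C from base composition: 2 per A or T, 4 per C or G."""
    seq = primer.upper()
    if not seq or any(base not in "ACGT" for base in seq):
        raise ValueError("primer must be a non-empty string of A, C, G and T")
    at = seq.count("A") + seq.count("T")
    cg = seq.count("C") + seq.count("G")
    return 2 * at + 4 * cg


def annealing_temperature(primer, offset=5):
    """Annealing temperature taken as Tm minus a fixed offset (5 degrees C in the text)."""
    return melting_temperature(primer) - offset


# Invented 10-base sequence used only to show the arithmetic:
# 5 A/T bases and 5 C/G bases give Tm = 2*5 + 4*5 = 30, so annealing at 25.
example_primer = "ACGTACGTAC"
example_tm = melting_temperature(example_primer)
example_anneal = annealing_temperature(example_primer)
```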

Tests for isolation by distance

For the analysis of genetic structure, populations were subdivided into patches approximating to randomly mating units (Goudet et al., 1994; Raybould et al., 1996b). Because of lack of tissue and some failures in amplification (see below), complete microsatellite data were available for 204 plants only (AMOVA, the program for calculating RST estimates, requires plants to have data at all loci). For consistency, plants without complete microsatellite data were removed from the RFLP and isozyme data sets. Patches with fewer than five plants remaining were pooled with the closest neighbouring patch. This gave 21 patches. Genetic distances were then estimated between all pairs of patches. Genetic distances were calculated as both FST (for RFLPs, isozymes and microsatellites), using Weir & Cockerham's (1984) estimator as computed by the program FSTAT (Goudet, 1995) and Michalakis & Excoffier's (1996) estimator of RST (for microsatellites only) as computed by the program AMOVA (Excoffier et al., 1992). Pooling of patches may introduce biases caused by substructure within the new groupings, leading to underestimation of the absolute values of FST and RST, although this should have minimal effects on the detection of processes such as isolation by distance (Raybould et al., 1997).
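The pooling rule described above (patches with fewer than five plants joined to the closest neighbouring patch) might be sketched as follows. The sketch makes several assumptions that the text does not specify: each patch is represented by a single (x, y) position, "closest" means Euclidean distance between those positions, small patches are processed in name order, and a merged patch takes the plant-weighted mean position. All names and coordinates are hypothetical.

```python
import math


def pool_small_patches(patches, min_plants=5):
    """Merge patches with fewer than min_plants plants into their nearest neighbour.

    patches: dict mapping patch name -> (x, y, n_plants), with n_plants >= 1.
    Returns a dict mapping surviving patch name -> {"xy", "n", "members"}.
    """
    groups = {
        name: {"xy": (float(x), float(y)), "n": n, "members": [name]}
        for name, (x, y, n) in patches.items()
    }
    while len(groups) > 1:
        small = sorted(k for k, g in groups.items() if g["n"] < min_plants)
        if not small:
            break
        src = groups.pop(small[0])
        # Nearest remaining patch by straight-line distance (an assumption).
        nearest = min(groups, key=lambda k: math.dist(groups[k]["xy"], src["xy"]))
        dst = groups[nearest]
        total = dst["n"] + src["n"]
        # Plant-weighted mean position of the merged patch (an assumption).
        dst["xy"] = tuple(
            (a * dst["n"] + b * src["n"]) / total for a, b in zip(dst["xy"], src["xy"])
        )
        dst["n"] = total
        dst["members"] = dst["members"] + src["members"]
    return groups


# Hypothetical patches: A has only three plants and lies closest to B.
pooled = pool_small_patches({"A": (0, 0, 3), "B": (1, 0, 10), "C": (10, 0, 8)})
```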

The pairwise genetic distances, FST and RST, were classified into those between patches in the same population (coded 1 in a new variable, ‘population membership’, distinguishing intra- from interpopulation comparisons of patches) and those between patches in different populations (coded 0). The separate effects of population membership and distance were examined by carrying out partial regressions and partial matrix correspondence tests (Manly, 1991; Thorpe & Baez, 1993; Thorpe & Malhotra, 1996) of genetic distance on geographical distance, population membership and the interaction between distance and population membership (=distance multiplied by either 0 or 1), respectively. The interaction term tests for uniformity of the effect of distance within and between populations. The same procedures were adopted with log-transformed distances. For each group of regression coefficients (transformed/untransformed distances), significance levels were set by sequential Bonferroni procedures (Rice, 1989).
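The structure of this analysis can be sketched in Python: each pair of patches contributes one observation, with geographical distance, population membership (1 or 0) and their product as predictors, and significance is assessed by permuting patch labels in the genetic distance matrix. This is a minimal sketch under stated assumptions, not the software used here: the two-tailed comparison of absolute coefficients, the function names and the example data are all hypothetical.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients via the normal equations (rows include the intercept)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def partial_regression(gen, geo, pop, n_perm=999, seed=0):
    """Regress genetic distance on distance, membership and their interaction.

    gen, geo: symmetric matrices (lists of lists) indexed by patch.
    pop: population label of each patch.
    Returns (coefficients, permutation probabilities) for the three predictors.
    """
    n = len(pop)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    rows = []
    for i, j in pairs:
        same = 1.0 if pop[i] == pop[j] else 0.0
        rows.append([1.0, geo[i][j], same, geo[i][j] * same])

    def fit(order):
        y = [gen[order[i]][order[j]] for i, j in pairs]
        return ols(rows, y)[1:]  # drop the intercept

    observed = fit(list(range(n)))
    rng = random.Random(seed)
    exceed = [1, 1, 1]  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        permuted = fit(order)
        for k in range(3):
            if abs(permuted[k]) >= abs(observed[k]):
                exceed[k] += 1
    return observed, [e / (n_perm + 1) for e in exceed]


# Hypothetical example: six patches on a line, two populations,
# genetic distance exactly proportional to geographical distance.
positions = [0, 1, 2, 10, 11, 12]
labels = ["a", "a", "a", "b", "b", "b"]
geo = [[abs(p - q) for q in positions] for p in positions]
gen = [[0.01 * d for d in row] for row in geo]
coefs, probs = partial_regression(gen, geo, labels, n_perm=99)
```

In this constructed example the distance coefficient recovers the slope of 0.01, while the membership and interaction coefficients are essentially zero.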

Results

In the Dorset material, three of the four microsatellite primer pairs described by Mörchen et al. (1996) were found to give polymorphism interpretable as variation at single loci. Bvm1, which was monomorphic in French material, was polymorphic in the Dorset populations, but the variation could not be interpreted in any simple fashion. Table 1 shows the allele frequencies at the interpretable loci. The number of alleles per locus is lower in the Dorset sample than in the French sample of Mörchen et al. (1996), even though the Dorset sample has more plants, probably because it was collected over a smaller area. Table 1 also shows that fewer than 230 phenotypes were obtained for each microsatellite locus. We believe that failure of samples to amplify was caused by factors other than null alleles (Pemberton et al., 1995) because, in most cases, a sample amplified with either all or none of the primer pairs.

Table 1 Microsatellite allele frequencies in Dorset populations of sea beet

Table 2 summarizes the results of the partial regressions. The results show clearly that FST and RST estimators behave differently for microsatellites in the sea beet populations. There is a very strong population membership effect when genetic distance between patches is estimated as FST, whereas population membership is not a significant factor in the RST estimates between patches. The partial regressions of RST behave similarly to those of FST for isozymes and RFLPs. With log-transformed distances, the estimates of FST for microsatellites give the most negative partial regression with population membership, but no coefficient is significant.

Table 2 Partial regression coefficients and partial matrix correspondence test probabilities for partial regressions of genetic distance estimators and geographical distance, population membership and distance×membership interaction for approximately randomly mating patches of sea beet in five Dorset populations. Figures in bold are regression coefficients with untransformed distances; figures in parentheses are regression coefficients with log-transformed distances. Significance levels set by sequential Bonferroni methods (Rice, 1989) for transformed and untransformed estimates. NS, significance level >0.1

The tests for isolation by distance are more clear-cut with the log-transformed data. FST estimates at isozyme and RFLP loci show isolation by distance, confirming the results of Raybould et al. (1997), who used Slatkin's (1993) method of detecting isolation by distance. The partial regression coefficient of RST estimates on log distance is significant at the 5 per cent level for a single test and marginally significant (<10 per cent) after table-wide testing. FST at microsatellite loci, however, fails to detect isolation by distance in these populations.
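The distinction between single-test and table-wide significance follows from the sequential Bonferroni procedure (Rice, 1989): the probabilities in a table are ranked from smallest to largest, the smallest is compared with α/k, the next with α/(k − 1), and so on, stopping at the first test that fails. A minimal Python sketch is given below; the function name and the probabilities are hypothetical and are not values from Table 2.

```python
def sequential_bonferroni(p_values, alpha=0.05):
    """Return one boolean per test: True if significant at the table-wide level alpha."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    significant = [False] * k
    for rank, idx in enumerate(order):
        # The smallest p is tested against alpha/k, the next against alpha/(k-1), ...
        if p_values[idx] <= alpha / (k - rank):
            significant[idx] = True
        else:
            break  # once a test fails, all larger probabilities are non-significant
    return significant


# Hypothetical probabilities for a table of three tests.
partly = sequential_bonferroni([0.01, 0.04, 0.03])
fully = sequential_bonferroni([0.001, 0.02, 0.04])
```

In the first hypothetical table, a probability of 0.03 would be significant as a single test at the 5 per cent level but fails the table-wide threshold of 0.05/2 = 0.025.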

For all estimators, with both transformed and untransformed distances, no interaction term is significant. This suggests that the rate of change of genetic distance with geographical distance is the same within and between these populations.

Discussion

Our results show that isolation by distance can be detected in the sea beet populations by microsatellite loci, but that the choice of genetic distance estimator is important. When genetic distances at microsatellite loci are estimated by FST, patches of plants within the same population are significantly more similar to each other than those in different populations when the effect of distance is removed. However, there is no significant effect of population membership (with distance effects removed) for isozymes and RFLPs. This suggests that mutation rates at these microsatellites are higher than rates of gene flow, whereas gene flow rates are higher than mutation rates at isozyme and RFLP loci. This is consistent with observations of high mutation rates at microsatellite loci in humans (Weber & Wong, 1993; Dib et al., 1996) and mice (Dallas, 1992). The number of alleles per microsatellite locus (four to five) is higher than that found at RFLP and isozyme loci in the same populations (c. three; Raybould et al., 1996a), although the number of alleles is much smaller than that found by Mörchen et al. (1996).

If RST is used as the genetic distance estimator, a pattern of genetic structure emerges that is very similar to that found with isozymes and RFLPs. With RST, the significant population effect is removed. Using transformed distance data, isolation by distance can be detected with RST at a significance level of 7 per cent after table-wide testing. Thus, provided that a suitable estimator is chosen and suitable transformations of distance are carried out, a significant relationship between genetic distance and geographical distance can be found at three different sets of loci. This is a valuable result as it gives us confidence in both the robustness of the marker systems and the analytical procedures.

It should not be inferred from these data that a genetic distance estimator based on a stepwise mutation model will always be more appropriate than an estimator such as FST. For example, Perez Lezaun et al. (1997) have recently shown that neighbour-joining trees based on human microsatellite variation are different depending on the choice of genetic distance estimator. However, in this case, the trees based on estimates of FST are in closer agreement with archaeological and other genetic data than are trees based on estimates of RST. These results, and our results from sea beet, imply that, if possible, microsatellite data should be analysed using estimators of parameters based on both stepwise mutation and infinite alleles models. If significant differences arise as a result of these different forms of analysis, only the presence of independent information may allow a decision to be made as to which method is more appropriate. Rousset (1996) has pointed out that, for the mutation process to have a significant effect on F-statistics, the mutation rate must be high enough for two or more mutation events to occur in different populations since the time of common ancestry. When population studies are made using hypervariable loci, which may be assumed to have high mutation rates, theory predicts that populations separated by relatively large geographical distances (and having relatively low migration rates between them) will show significant effects of mutation, whereas populations with higher migration (geographically closer) will not. Such data sets may need to be analysed using FST for closer populations and using RST for those further apart in order to dissect the effects of isolation by distance from those of mutation.