Short Review

Indirect measures of gene flow and migration: FST≠1/(4Nm+1)

Heredity volume 82, pages 117–125 (1999)



The difficulty of directly measuring gene flow has led to the common use of indirect measures extrapolated from genetic frequency data. These measures are variants of FST, a standardized measure of the genetic variance among populations, and are used to solve for Nm, the number of migrants successfully entering a population per generation. Unfortunately, the mathematical model underlying this translation makes many biologically unrealistic assumptions; real populations are very likely to violate these assumptions, such that there is often limited quantitative information to be gained about dispersal from using gene frequency data. While studies of genetic structure per se are often worthwhile, and FST is an excellent measure of the extent of this population structure, it is rare that FST can be translated into an accurate estimate of Nm.


Everything should be made as simple as possible, but not simpler. – Albert Einstein.


The movement of individuals and genes in space affects many important ecological and evolutionary properties of populations (Hanski & Gilpin, 1997). For example, it is well known that the extent of gene flow affects species integrity, because gene flow counters divergence which can lead to the evolution of reproductive isolation. The rate of movement of genes from one population to another helps to determine the possibility of local adaptation and of adaptive evolution on complex landscapes. Furthermore, dispersal affects the persistence of local populations, species extinction rates, the evolution of species ranges, synchrony of population size changes, and many other important ecological properties. These genetic and ecological issues have taken new urgency in the wake of the rapid loss of biodiversity, since developing effective species conservation strategies depends on knowing the genetic and ecological relationships among populations. Population biologists would very much like to be able to measure the rate at which migration among populations occurs and have collectively devoted a great deal of effort towards measuring gene flow, migration, and their consequences in a large number of species.

Unfortunately, direct measures of migration are fraught with difficulty. Marking and following individual organisms is at the least very time-consuming and expensive, and often technically very difficult. Mark and recapture techniques are prone to biases: long-distance dispersal may be very hard to observe but very important biologically. Estimates of migration are limited in time and do not accurately reflect rare but important events, such as the dramatic gene flow which may accompany storms or climatological shifts. Finally, direct measures of dispersal do not necessarily reflect the movement of genes, because the migrant must reproduce effectively in the new location for gene flow to have occurred.

As a result of these problems, methods have been developed that attempt to use gene frequency data to infer the extent of gene flow in natural populations indirectly (Slatkin, 1985, 1987). Most famously, Sewall Wright's island model of population structure predicts that, if a long list of assumptions is true, the variance in gene frequencies among different populations should be related to the number of migrants which come into each population each generation. With the advent of molecular biology, it has become easy to measure the distribution of alleles within and among populations and therefore tempting to use these data to study gene flow. A number of recent papers have addressed the estimation of gene flow (Milligan et al., 1994; Neigel, 1997; Bossart & Prowell, 1998a), but there is controversy about the usefulness of these estimates (see Bohonak et al., 1998; Bossart & Prowell, 1998b).

These indirect estimates of gene flow have the advantage that the data necessary to make such estimates are relatively easy to gather. Further, such estimates reflect migration rates averaged among numerous populations through time. However, indirect estimates of gene flow are not without their own problems. In particular, since those estimates rely on a mathematical relationship between genetic structure and the rate of gene flow, such estimates implicitly assume that the ecological properties of the populations from which the genetic data are taken match the often unrealistic assumptions of the theoretical model upon which that mathematical relationship is based. Even when such an estimate is warranted, the estimate is subject to sampling error, which can be very large. The central theses of this paper are that these real deviations from the artificial assumptions of the models undermine the reliability of indirect measures of gene flow and that these measures have a high degree of statistical uncertainty. We suggest that, for many applications, measures of genetic structure are valuable in their own right, but that transformations of these measures to quantitative estimates of gene flow or dispersal are at best not needed and, at worst, misleading.

Underlying theory

Wright's F-statistics are a set of hierarchical measures of the correlations of alleles within individuals and within populations. The F-statistic most relevant to the study of gene flow is FST, which has various interpretations; most famously it is the variance in allele frequencies among populations, $\sigma_p^2$, standardized by the mean allele frequency ($\bar{p}$) at that locus:

$$F_{ST} = \frac{\sigma_p^2}{\bar{p}(1-\bar{p})}$$

See Slatkin (1985) for details concerning its derivation and Weir (1996) concerning its estimation. Wright (1931) introduced a simple model of population structure, called the island model, which predicts a simple relationship between the number of migrants a population receives per generation and FST (Fig. 1). Under the assumptions of the island model,

$$F_{ST} \approx \frac{1}{4Nm+1}$$

Figure 1

The island model. Each population receives and gives migrants to each of the other populations at the same rate m. Each population is also composed of the same number of individuals, N.

where N is the effective population size of each population and m is the migration rate between populations. Since FST can be estimated readily from data gathered with molecular techniques, we would seem to have a way to quickly measure the number of migrants coming into a population per generation, Nm. The promise of such easy information has led to a minor cottage industry of estimating Nm from FST. For example, 13 papers in this journal did so in 1997 alone. (Note that there are several methods for deriving a measure of differentiation from genetic data, such as GST, ΦST, AMOVA, private alleles, etc., but the estimates of gene flow derived from each of these make fundamentally the same assumptions as FST, and we will be referring to these measures collectively in the following section.)
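The two relationships above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not an estimator: the function names and the example allele frequencies are hypothetical, and the variance is taken over the listed demes as if they were the whole metapopulation (see Weir, 1996, for proper estimation).

```python
def fst_from_frequencies(freqs):
    """Parametric F_ST for one biallelic locus: var(p) / (p_bar * (1 - p_bar))."""
    k = len(freqs)
    p_bar = sum(freqs) / k
    var_p = sum((p - p_bar) ** 2 for p in freqs) / k  # variance among demes
    return var_p / (p_bar * (1.0 - p_bar))


def fst_from_nm(nm):
    """Island-model expectation: F_ST ~ 1 / (4Nm + 1)."""
    return 1.0 / (4.0 * nm + 1.0)


def nm_from_fst(fst):
    """Invert the island-model expectation: Nm ~ (1 / F_ST - 1) / 4."""
    return (1.0 / fst - 1.0) / 4.0


# Hypothetical allele frequencies in four demes (illustrative values only).
demes = [0.40, 0.50, 0.55, 0.60]
fst = fst_from_frequencies(demes)
print(f"F_ST = {fst:.4f}, implied Nm = {nm_from_fst(fst):.2f}")
```

The inversion in `nm_from_fst` is the step that the rest of this review questions: it is algebraically trivial, but it is only meaningful if every island-model assumption holds.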

The island model, however, makes a large number of simplifying assumptions. It assumes an infinite number of populations, each always with N diploid individuals, and that each of these populations gives and receives a fraction m of its individuals into and from a migrant pool each generation. The individuals which do migrate are randomized and dispersed back to the populations without respect to any geographical structure, such that all populations are equally likely to give and receive migrants from all other populations. Furthermore the island model assumes that there is no selection or mutation and that each population persists indefinitely and has reached an equilibrium between migration and drift. Each of these assumptions is unlikely to be true in any particular case; sometimes this will not matter very much at all with regard to estimating Nm, but in some cases it will matter tremendously. One intention of this review is to investigate the common ways in which natural systems violate the assumptions of the island model and to explore the effects these deviations from the simple model will have on the quantitative and qualitative conclusions from indirect studies of gene flow.

The Fantasy Island model: violating the unrealistic assumptions of the island model

Violation of each of the assumptions of the island model can significantly affect the interpretation of the results. In this section we will discuss the likelihood of various deviations from the island model and their implications. We have organized our discussion of these assumptions into five categories.

1. No selection

One significant assumption made by all models of population structure used to infer gene flow from genetic patterns is that the different alleles at the loci being measured are selectively neutral and that none are linked to selected loci. In fact there is much evidence that many loci are under selection, including many of the loci studied as markers of gene flow themselves. The topic of the neutrality of allozymes and various other markers is far too large to review here; we merely wish to add a reminder that this can be an important source of error in the interpretation of population structure statistics. Selection can either increase or decrease FST relative to the neutral case.

Several studies have demonstrated significant differences in the F-statistics estimated from genetic markers derived from coding vs. noncoding DNA (e.g. Karl & Avise, 1992; Pogson et al., 1995; Bossart & Prowell, 1998). The important implication of these studies is that there is strong selection in some species affecting the pattern of genetic differentiation, especially at allozyme markers.

Theoretically, overdominance, underdominance, and local adaptation can change the expected value of FST, even with the same value of Nm (Charlesworth et al., 1997; Slatkin & Barton, 1989). Underdominance and local adaptation serve to inflate genetic differentiation; overdominance and spatially uniform selection tend to reduce the genetic variance among populations. Frequency-dependent selection should decrease FST if there is a single internal equilibrium, but increase it if there are multiple equilibria. Particularly when the migration rate is small, selection can easily be strong enough to dominate the pattern of genetic differentiation.

Furthermore, selection acting at other loci can affect the distribution of marker alleles. Charlesworth et al. (1997) have demonstrated that linkage to locally selected alleles will substantially increase FST. They have also shown that background selection (the constant selection against deleterious mutations at many loci in the genome) can result in a substantial increase in FST. This background selection is likely to be extremely common.

Finally, selection caused by inbreeding depression can act to inflate the effects of migration (Ingvarsson & Whitlock unpublished; Berry et al., 1991). If local populations are inbred, then migrant individuals produce outbred offspring, which can have substantially higher fitness than individuals with two local parents. As a result, selection can substantially enhance the effective rate of gene flow. On the other hand, in cases where migrants come from a very long distance or a very distinct breeding pool, the offspring of migrants may suffer from outbreeding depression and therefore the effective migration rate would be diminished (Barton & Bengtsson, 1986).

2. No mutation

New mutation can also affect the pattern of genetic differentiation among populations, but unless the mutation rate is large relative to the migration rates, this will present little problem for interpreting genetic differentiation. With DNA-level genetic markers, such as microsatellites and mitochondrial DNA, the rates of mutation can be quite high relative to the migration rate, and merit special attention (see, e.g. Goldstein et al., 1995; Slatkin, 1995).

3. All populations are created equal, with a constant number of individuals and equal contributions to the migrant pool

This assumption is particularly unrealistic, since almost any species will have a great deal of variation in local population size and in immigration and emigration rates. In this section we will discuss the consequences of violating these assumptions.

If the migration rate is not very high (say, less than 10%), then the expected FST in an island model population depends approximately not on N and m separately, but only on their product Nm. As a result, even if N and m vary, there will be no effect on FST as long as Nm is constant. Often, however, the effect of variation in N is not counterbalanced by variation in m, and Nm is variable among demes.
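A rough numerical check of this claim is sketched below. The "exact" expression is an assumption not given in the text: it is the equilibrium of the standard island-model recursion F' = (1−m)²[1/(2N) + (1 − 1/(2N))F], ignoring mutation. The function names and the parameter pairs are hypothetical; every pair has Nm = 1.

```python
def fst_exact(n, m):
    """Equilibrium of F' = (1-m)^2 * [1/(2N) + (1 - 1/(2N)) * F] (assumed recursion)."""
    a = (1.0 - m) ** 2
    return a / (2.0 * n - (2.0 * n - 1.0) * a)


def fst_approx(n, m):
    """Low-migration approximation: F_ST ~ 1 / (4Nm + 1)."""
    return 1.0 / (4.0 * n * m + 1.0)


# All pairs share Nm = 1, so the approximation returns the same value each time.
for n, m in [(1000, 0.001), (100, 0.01), (10, 0.1), (2, 0.5)]:
    print(f"N={n:5d} m={m:5.3f} exact={fst_exact(n, m):.4f} approx={fst_approx(n, m):.4f}")
```

Under these assumptions the two expressions agree closely while m is small and drift apart as m approaches and exceeds about 10%, which is the sense in which FST depends "approximately" on the product Nm alone.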

It is easy to see why we must be concerned with spatial variation in migration rates and population sizes, especially if we consider a common variant of population structure, source-sink metapopulations (Harrison et al., 1988; Pulliam, 1988; Dias, 1996; Gaggiotti & Smouse, 1996; Whitlock & Ingvarsson, in prep.). Imagine the case where some populations (sinks) are not capable of sustaining themselves without immigration from other populations of higher reproductive capacity (sources). If these sinks are sufficiently poor that they produce almost no emigrants, they will contribute almost nothing to the evolutionary future of the species. Yet the differentiation among sink populations can be much greater than that among source populations, if there is a higher rate of population turnover among the sinks. In this case, the FST measured from the metapopulation without knowledge of whether a population is a source or sink would bias any estimate of the effective Nm of the metapopulation as a whole. Situations of asymmetric migration are also common, for example, in areas of varying habitat quality or with directional dispersal vectors, such as in an ocean or river current.

More generally, N and m vary across populations. Island biogeography theory predicts variance in migrant number for a variety of reasons (MacArthur & Wilson, 1967), as has often been observed in natural populations (Ebenhard, 1991). Furthermore, dispersal is often distance dependent, such that populations near many other populations receive a greater number of migrants, whereas more isolated populations receive fewer (Brown & Kodric-Brown, 1977). If N and m both vary, they could vary independently or they could covary. McCauley (1991) and Ingvarsson (personal communication) have found correlations between migration rate and population size in two species of insect. In most cases the net effect is that Nm (the number of individuals entering populations) varies among populations. In this case, the FST that we measure does not correspond to the average value of Nm in the metapopulation, but rather is extremely biased (see Whitlock, 1992b). When we measure FST by traditional techniques, we are measuring an average correlation of alleles within demes. The average of this correlation across demes is a nonlinear function of the Nm of a deme, so the ‘average’ Nm estimated from FST data becomes biased downwards. Figure 2 demonstrates this effect.

Figure 2

The effects of variable Nm on FST and estimating the average Nm. The thick line in this figure shows the equilibrium FST for different values of Nm. Because FST is a nonlinear function of Nm, the Nm value which would be estimated from an average FST is not the same as the average Nm value. In this extreme example, the metapopulation is composed of two subtypes in equal proportions, one with Nm=10, and one with Nm very small. The black dot shows the average Nm (≈5) and the overall FST. The Nm which would be calculated from this FST is, however, much smaller. Spatial heterogeneity in Nm translates into large underestimates of the average Nm.
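The bias described in this caption can be reproduced with a short calculation. This is a sketch under the island-model formula only; the value 0.01 is a hypothetical stand-in for "Nm very small", and the function names are illustrative.

```python
def fst_from_nm(nm):
    """Island-model expectation: F_ST ~ 1 / (4Nm + 1)."""
    return 1.0 / (4.0 * nm + 1.0)


def nm_from_fst(fst):
    """Inverse of the island-model expectation."""
    return (1.0 / fst - 1.0) / 4.0


# Two deme subtypes in equal proportions; 0.01 stands in for "very small" Nm.
nm_values = [10.0, 0.01]
mean_nm = sum(nm_values) / len(nm_values)
mean_fst = sum(fst_from_nm(x) for x in nm_values) / len(nm_values)
nm_estimate = nm_from_fst(mean_fst)

print(f"average Nm           = {mean_nm:.3f}")
print(f"average F_ST         = {mean_fst:.3f}")
print(f"Nm implied by F_ST   = {nm_estimate:.3f}")
```

Because the FST curve is convex in Nm, averaging FST across demes and then inverting always gives a value at or below the true average Nm; here the implied value is more than an order of magnitude too low.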

An extreme, yet common, form of variation in population size is recurrent local extinction and colonization of populations. In many species, the turnover of populations is high enough to substantially affect (usually increase) FST (Slatkin, 1977; Wade & McCauley, 1988; McCauley, 1989; Whitlock & McCauley, 1990; Whitlock, 1992a; McCauley et al., 1995; Giles & Goudet, 1997). This is because founding events usually involve far fewer individuals than a habitat patch can eventually sustain and because the finite life of individual populations limits the time over which subsequent gene flow can ameliorate the effects of the initial founding events. In many cases (Whitlock, 1992a; Ingvarsson et al., 1997), the FST of the population is substantially different from that predicted by using the island model with the direct measures of N and m alone, although the direct measures of demographic parameters predict FST very well when they include the effects of extinction and recolonization (Whitlock & McCauley, 1990). The effects of extinction and recolonization can be particularly pronounced for extranuclear genomes that are inherited uniparentally, especially in many angiosperms in which maternally inherited chloroplast and mitochondrial DNA can disperse only in seeds (McCauley, 1995).

4. There is NO spatial structure: migration is completely random

One of the most obvious deviations in natural populations from the assumptions of the island model is that migration rates are correlated with the distance between populations (Wright, 1943, 1946; Neigel, 1997). Populations which are farther apart tend to exchange fewer migrants. A recent review by Neigel (1997) summarizes recent advances in measuring various parameters important in determining the amount of genetic differentiation in a metapopulation with isolation by distance; here we will address the biases which arise from interpreting a system with distance-biased dispersal as if it were an island model with free migration.

Kimura & Weiss (1964), in their classic paper introducing the stepping stone model (where discrete demes are most likely to exchange migrants with adjacent demes), showed that the correlation among demes in allele frequencies drops with increasing distance, and that this would happen more rapidly in a one-dimensional system than in two dimensions. While this correlation is a function of the migration rate, FST in this kind of system does not behave as in the island model. The genetic differentiation of stepping stone systems is substantially greater for the same number of migrants coming into a deme per generation. The same value of FST is consistent with much or little migration, depending on the geometry of migration.

Obviously, measures of dispersal rates can be made only at the spatial scale at which the samples were taken. A population may have much dispersal locally, but none at a larger scale, or even vice versa. Thus, even if island model assumptions hold at the scale at which the sample is taken, they may not at a larger or smaller spatial scale, and therefore results from that scale will not extrapolate. This problem could be particularly important when Nm values are compared between species without acknowledging that each species may have been sampled at a different spatial scale.

One method has been proposed to measure the pattern of migration among specific pairs of populations (Slatkin, 1993). This formulation might sometimes be informative, but it is perhaps worthwhile to point out a potential misunderstanding in its use. Slatkin shows that the isolation by distance between populations can be estimated by calculations of FST for each pair of populations. He defines M̂ as the value of Nm that would give the pair-wise FST. It is important to realize, however, that this M̂ does not reflect the actual dispersal between two populations, but instead is another measure of differentiation. A pair of populations in the same metapopulation which receive migrants from the same sources will have a low FST even if they exchange no migrants at all (even indirectly).

Similarly, many geographical features can restrict gene flow between sets of populations, such as rivers, highways, mountain ranges, etc. In many circumstances, the equal migration assumption of the island model is clearly not true. Any particular geographical feature which could potentially be a barrier to gene flow can be examined by using hierarchical F-statistics (see Weir, 1996). Ignoring this hierarchical structure can result in substantial biases in estimates of gene flow (Husband & Barrett, 1994).

5. Everything is at equilibrium, nothing is changing

Another major assumption of methods of estimating migration rates from gene frequency data is that the whole population has reached an equilibrium between the forces of migration and genetic drift. Yet in many situations this is clearly not true. Many species are naturally in new ecological contexts, such as high-latitude tree populations which have been in situ for mere tens of generations in many cases. A recent range expansion can cause migration and drift to have insufficient time to reach equilibrium and therefore give migration estimates biased towards the previous conditions. As an extreme case, populations that have recently been completely isolated will not yet necessarily reflect the equilibrium predicted by current levels of gene flow. Low FST values do not imply current gene flow.

This is a particular problem for attempts at estimating gene flow among ‘species’ - the large population size and low migration rate expected among species means that the equilibrium FST will take an extremely long time to be reached, perhaps longer than the history of the speciation event. FST cannot be used to infer the rate of gene flow among species.

We live in a time of rapid anthropogenic change. Many species are fragmented into smaller, more distant subpopulations than the species previously experienced; similarly, species which deal well with human disturbance have increased in number and are perhaps more connected by migration than they were historically. As a result, many species, especially those for which conservation biologists have particular concern, are not expected to be at an equilibrium between migration and drift, and therefore indirect dispersal estimators based on FST are likely to be in significant error.

The time (in generations) required for a metapopulation to reach equilibrium is increased by low migration rates and large population size (Crow & Aoki, 1984; Whitlock, 1992b). The time it takes for FST to reach halfway from an old value to a new equilibrium is $\ln(1/2)/\ln[(1-m)^2(1-1/(2N))]$ (Whitlock, 1992b), which can be extremely long if population sizes are large and migration rates are low (Fig. 3).

Figure 3

The time for FST to approach equilibrium. From a panmictic population, the time taken for FST to reach 95% of equilibrium value is given in this graph. The time to equilibrium is larger when the deme population size is large and when the migration rates are small. In some circumstances, the amount of time required is much longer than the history of the species in its current state. Note the log scales.
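The half-time expression from Whitlock (1992b) quoted in the text is easy to evaluate directly. The sketch below also extends it to other fractions (such as the 95% used in Fig. 3) on the assumption that the distance from equilibrium shrinks by the same factor (1−m)²(1−1/(2N)) every generation; that extension, the function names, and the parameter values are illustrative and not taken from the original.

```python
import math


def decay_factor(n, m):
    """Per-generation factor by which the distance to equilibrium shrinks (assumed geometric)."""
    return (1.0 - m) ** 2 * (1.0 - 1.0 / (2.0 * n))


def generations_to_fraction(n, m, fraction=0.5):
    """Generations for F_ST to cover `fraction` of the way to its new equilibrium."""
    return math.log(1.0 - fraction) / math.log(decay_factor(n, m))


# Hypothetical deme sizes and migration rates.
for n, m in [(100, 0.01), (1000, 0.001), (10000, 0.0001)]:
    half = generations_to_fraction(n, m, 0.5)
    most = generations_to_fraction(n, m, 0.95)
    print(f"N={n:6d} m={m:7.4f} half-time={half:8.0f} time to 95%={most:8.0f} generations")
```

With fraction = 0.5 the function reduces to the expression in the text; for N = 1000 and m = 0.001 it gives a half-time of a few hundred generations.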

The differences between dispersal and gene flow

Throughout we have attempted to maintain a subtle distinction between dispersal and gene flow. The differences can be very important for interpreting the results of an indirect measure of gene flow. Of course, in order for a dispersal event to effect gene flow, the migrant individual must successfully mate and breed, and at least some of its offspring must grow to adulthood. Nm in the island model refers to an ‘effective’ number of individuals in a population (Ne) and the effective proportion of breeding individuals that are migrants (m). In many ecological contexts, this effective Nm is not what is being sought; often we would prefer to know the number of individuals moving to a patch, consuming its resources and interacting with its residents in a variety of ways. Migrants that are reproductively unsuccessful may nevertheless significantly affect the ecology of their new deme. A good example of this would be the dynamics of a host/parasite or disease system, which is significantly affected by spatial structure and dispersal patterns, but where migrants can introduce disease to a new patch even without reproducing.

There are many reasons why migrants may have different fitness from resident individuals. Dispersers are often of different age classes, physical condition, social status, or genotype from nondispersers (Chepko-Sade & Halpin, 1987; Roff & Simons, 1997). The offspring of migrants are more likely to be outbred, and migrant genes may spread faster because of heterosis or more slowly because of outbreeding depression. Furthermore, migration may occur nonrandomly with respect to the life-cycle. If young individuals move, then most of their reproductive effort will be in the new patch and the usual assumptions of the island model are met. However, if individuals move after some of their reproductive life, then their contribution to the new deme is less than a full individual (Endler, 1979; McCauley, 1983).

One particular problem of estimating dispersal rates from genetic data is that the ‘Nm’ value from FST is actually Nem. The effective population size of a deme will be much less than the actual population size. The best estimates of Ne/N average about 10%, but are extremely variable among species (see Frankham, 1995 for a nice review). As a result, the actual number of migrants into a deme is likely to be 10-fold or more higher than that estimated by FST for this reason alone! Because of the variation in Ne/N ratios, however, this means that it is almost impossible to translate an Nem estimate into an estimate of the actual number of migrant individuals. Estimating Ne is extremely difficult; many applications of the FST approach are attempting to measure m rather than Nem, and without an accurate measure of Ne this becomes impossible.
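The scale of this problem can be illustrated with simple arithmetic. The sketch assumes that m applies equally to effective and census individuals, so that the census number of migrants is Nem divided by Ne/N; the Nem value and the ratios below are hypothetical.

```python
def census_migrants(ne_m, ne_over_n):
    """Census migrants per generation implied by Ne*m, assuming a shared m."""
    return ne_m / ne_over_n


ne_m = 5.0  # hypothetical indirect estimate of Ne*m
for ratio in (0.01, 0.1, 0.5):  # hypothetical Ne/N ratios
    print(f"Ne/N = {ratio:4.2f} -> about {census_migrants(ne_m, ratio):6.1f} migrant individuals")
```

The same Nem is compatible with census migrant numbers that differ by well over an order of magnitude, depending on an Ne/N ratio that is rarely known.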

With seed plants (and some other sedentary organisms as well), the differences between dispersal of individuals and gene flow can be even greater, because gene flow by gametes (i.e. pollen) may greatly outstrip movement by seeds (Ennos, 1994). For most ecological purposes, the movement of individual diploid organisms is much more relevant than gene flow alone.

Statistical issues

Even when the estimation of Nm from FST can be interpreted biologically as an unbiased estimate of the number of individuals moving between populations, that estimate of Nm comes with a great deal of uncertainty. This is because, for logistical reasons, estimates of FST are usually based on a small number of loci scored for a limited number of individuals taken from a few populations. Because FST is essentially the ratio of two variances, it is difficult to measure accurately without a large data set. Worse, because FST is a nonlinear function of Nm, estimates of Nm from FST will be especially inaccurate. Small differences in FST can result in large differences in estimates of Nm (Fig. 2), so the error in estimating FST is amplified when estimating Nm. As a result, the confidence interval for estimates of Nm can be enormous, particularly for FST values less than 0.1, as is often the case in nature.
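The amplification can be seen by applying the same absolute error in FST at two different points on the curve. This is a sketch using the island-model inversion only; the error of ±0.003 and the two central values are hypothetical.

```python
def nm_from_fst(fst):
    """Island-model inversion: Nm ~ (1 / F_ST - 1) / 4."""
    return (1.0 / fst - 1.0) / 4.0


error = 0.003  # hypothetical absolute error in the F_ST estimate
for centre in (0.005, 0.100):
    low, high = centre - error, centre + error
    print(
        f"F_ST = {centre:.3f} +/- {error}: "
        f"Nm from {nm_from_fst(high):.2f} to {nm_from_fst(low):.2f}"
    )
```

Around FST = 0.1 the implied Nm barely moves, whereas around FST = 0.005 the same error shifts the implied Nm several-fold.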

To illustrate this we conducted a simple computer simulation in which FST estimates were created by randomly sampling five loci from each of 50 individuals from each of 10 demes (approximately the size of a typical data set used to estimate FST in nature) from an underlying set of island model demes with an FST of 0.005. This procedure was repeated 1000 times and the results are shown in Fig. 4. Estimates of FST ranged from slightly negative to greater than 0.02. The 95% confidence limits for values of Nm calculated from these FST estimates were 11–283 (Fig. 4), compared to the true value of 50. With a true value of FST=0.02, the confidence intervals are not as broad. These errors result not only from the sampling error caused by the finite samples of individuals and alleles from a finite number of subpopulations, but also from the random evolutionary history of any particular locus.

Figure 4

Error variance for FST and the effects on the estimate of Nm. This is the distribution of FST estimates obtained from looking at 5 loci, 10 demes, and 50 diploid individuals per deme. The true FST of these demes is 0.005, but the range of estimates is large. Nm was truly 50 in these simulations, but the estimates of Nm ranged from −11790 to 19149 out of 1000 replicates. 95% of the estimates of Nm fell between 11 and 283.
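A simulation in the same spirit can be sketched with the Python standard library. It is not the authors' procedure and is not expected to reproduce their numbers. The assumptions are: deme allele frequencies are drawn from a beta distribution whose variance matches the target FST, ancestral frequencies are uniform on 0.2 to 0.8, the 100 allele copies sampled per deme are treated as independent draws, and FST is estimated with a simple ratio-of-sums moment estimator. Only 200 replicates are run to keep it quick.

```python
import random
import statistics


def simulate_fst_estimate(rng, true_fst, n_loci=5, n_demes=10, n_diploid=50):
    """One replicate: sample biallelic loci from beta-distributed demes and estimate F_ST."""
    n = 2 * n_diploid  # allele copies sampled per deme
    scale = (1.0 - true_fst) / true_fst  # gives Var(p_deme) = F * p * (1 - p)
    num = den = 0.0
    for _ in range(n_loci):
        p = rng.uniform(0.2, 0.8)  # hypothetical ancestral allele frequency
        freqs = []
        for _ in range(n_demes):
            p_deme = rng.betavariate(p * scale, (1.0 - p) * scale)
            count = sum(rng.random() < p_deme for _ in range(n))  # binomial draw
            freqs.append(count / n)
        x_bar = sum(freqs) / n_demes
        # Mean squares among demes and among allele copies within demes.
        msp = n * sum((x - x_bar) ** 2 for x in freqs) / (n_demes - 1)
        msg = n * sum(x * (1.0 - x) for x in freqs) / (n_demes * (n - 1))
        num += msp - msg
        den += msp + (n - 1) * msg
    return num / den  # ratio of sums across loci


rng = random.Random(1)
estimates = sorted(simulate_fst_estimate(rng, 0.005) for _ in range(200))
lower, upper = estimates[5], estimates[-6]  # rough central 95% of 200 replicates

print(f"F_ST estimates: min {estimates[0]:.4f}, median "
      f"{statistics.median(estimates):.4f}, max {estimates[-1]:.4f}")
print(f"central 95% of F_ST estimates: {lower:.4f} to {upper:.4f}")
```

Converting `lower` and `upper` to Nm with (1/FST − 1)/4 shows how wide the implied range becomes; estimates at or below zero have no meaningful Nm at all.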

FST, as estimated from a typical size sample, is capable of estimating Nm only roughly, sometimes only within a couple of orders of magnitude. In many cases, this is a useful scale of resolution; more often, it is not (see below). In any event, estimates of Nm should always be accompanied by confidence limits. Recent advances in the sampling theory of F-statistics make this easier (Balding & Nichols, 1995).

Furthermore, as has been suggested by Jim Mallet (personal communication), errors in scoring of individual genotypes can potentially contribute strongly to the apparent FST measured in a population. These errors are often controlled for (by, say, running individuals from different populations at random on a particular gel), but when they are not, temporal variation in scoring of genotypes could easily contribute a strong systematic bias to the estimation of FST.

Non-diploid genetics

The expected FST for non-diploid genetic loci is of course much different from the FST for diploids. FST for haplodiploid or X-linked genes will depend critically on the sex ratio of dispersers and will in general be higher (Kimura, 1963; Whitlock, 1995). Uniparentally inherited loci such as mitochondrial DNA or chloroplast DNA will have differentiation patterns which depend only on the population size and migration patterns of the sex of individuals which transmit these genes; therefore the level of differentiation tends to be much higher (McCauley, 1995). For haploid genomes, of course there are effectively half as many allele copies as there are for a diploid genome, so that the effects of drift are higher and the differentiation among populations is increased, such that the expected FST under the island model is $\approx 1/(2N_{e,O}\,m_O+1)$, where $N_{e,O}$ is the effective size and $m_O$ the migration rate of the sex transmitting the genome in question.
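The contrast between the diploid and the uniparental haploid expectations can be made concrete with a small example. The numbers are hypothetical: a deme with an effective size of 100, half of which belongs to the transmitting sex, and the same migration rate for both sexes.

```python
def fst_diploid(ne, m):
    """Diploid nuclear expectation under the island model: 1 / (4 Ne m + 1)."""
    return 1.0 / (4.0 * ne * m + 1.0)


def fst_uniparental(ne_sex, m_sex):
    """Haploid, uniparentally inherited expectation: 1 / (2 Ne,O mO + 1)."""
    return 1.0 / (2.0 * ne_sex * m_sex + 1.0)


ne, m = 100, 0.01          # hypothetical whole-deme effective size and migration rate
ne_sex, m_sex = 50, 0.01   # hypothetical values for the transmitting sex only

print(f"nuclear diploid locus : F_ST ~ {fst_diploid(ne, m):.2f}")
print(f"uniparental haploid   : F_ST ~ {fst_uniparental(ne_sex, m_sex):.2f}")
```

Under these assumed values the organelle marker is expected to show more than twice the differentiation of a nuclear marker from the same populations, so FST values from the two kinds of loci should not be converted with the same formula.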


It is clear that the promise of easy estimates of dispersal by inference from genetic data must be viewed with caution. These methods make numerous assumptions about populations which are unlikely to be true, and there are other difficulties with the interpretation of these data. Although our central thesis is that great care should be taken in the interpretation of genetic data, we conclude with some more optimistic observations.

First, there is a great difference between studying genetic data to estimate dispersal and doing so to estimate genetic differentiation. For many purposes, FST is an excellent measure of the genetic differentiation among populations, and indeed studying the genetic structure of a population is essential to understanding its evolutionary properties. Most of the concerns we have brought up do not affect this conclusion. Often FST is truly intended to measure the genetic differences between populations, and in these cases we simply suggest not translating FST into a measure of Nm.

Second, the technology of direct estimation of dispersal has improved substantially. Smaller radio transmitters (Koenig et al., 1996), marker-assisted migration estimates (Devlin & Ellstrand, 1990; Broyles et al., 1994; Nason & Hamrick, 1997), and improved estimation techniques have allowed observation of dispersal events which were previously invisible to direct methods. These methods allow other important natural history information to be recorded, do not suffer most of the problems outlined above, and are not necessarily more expensive than indirect studies. A renewed emphasis on direct measures would spur even more development along these lines. Direct observations of dispersal through mark and recapture methods, or direct observations of gene flow through genetic methods, are labour intensive and are most useful when applied to a few focal populations. However, in addition to providing an estimate of gene flow and/or migration, they might provide some insight into the suitability of FST estimates for predicting Nm. If one cannot reconcile the direct and indirect estimates, it would seem profitable to explore which additional ecological features cause a departure from the assumptions of the island model (population turnover, recent range expansion, temporal variation in migration rates, etc.).

Third, recognition of the limitations of the island model may spur theoreticians to address these issues in more realistic models, which could allow genetic data to be interpreted more reliably. At this point, the issue most obviously being addressed in this way is the problem of isolation by distance, for which there is a growing literature dating back to Wright's isolation by distance papers (Wright, 1943, 1946; Slatkin, 1993; Neigel, 1997). Genetic data have also been used to infer the importance of extinction/colonization (Whitlock, 1992a; McCauley et al., 1995; Giles & Goudet, 1997; Ingvarsson et al., 1997), source-sink dynamics (Dias et al., 1996), and other processes, but more formal models are required. Furthermore, we need to use genetic data to investigate a broader spectrum of demographic processes. A transition to hypothesis testing, comparing the genetic structure of separate elements of a metapopulation (i.e. young populations vs. old, near vs. distant, mainland vs. island, etc.), can give us more insight into the importance of various factors in creating genetic patterns.

Fourth, FST measures may give reasonable estimates of Nem in cases where the spatial scale is small (so that migration may follow the island model and selection is less likely to cause strong patterns of genetic differences), migration rate is relatively high (so that equilibrium conditions are quickly reached), sample sizes and number of loci are large (to account for statistical issues), and when the biological questions are truly asking for an estimate of the effective rate of gene flow expressed as Nem. These conditions do not always hold, and it could be argued that if we knew these conditions to be the case we would already know more about the populations than we will learn from this indirect measure.
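
For reference, the conversion at issue is the inversion of the island-model relation FST = 1/(4Nm + 1), which gives Nm = (1/FST - 1)/4. The following is a minimal sketch in Python, assuming every condition listed above actually holds; the function name `nm_from_fst` and the FST values are hypothetical illustrations, not data from any study.

```python
def nm_from_fst(fst: float) -> float:
    """Invert the island-model equilibrium relation FST = 1 / (4*Nm + 1).

    Assumption: the population is at equilibrium under the island model,
    so that the relation applies at all.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must lie in (0, 1] for the inversion to be defined")
    return (1.0 / fst - 1.0) / 4.0


if __name__ == "__main__":
    # Illustrative FST values only; these are not estimates from real data.
    for fst in (0.01, 0.05, 0.20, 0.50):
        print(f"FST = {fst:.2f} -> Nm = {nm_from_fst(fst):.2f}")
```

Because Nm varies as 1/FST, the inversion is steepest at small FST: in this sketch, FST values of 0.01 and 0.05 map to Nm of about 24.75 and 4.75, so a modest error in a low FST produces a large error in the inferred Nm.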

Finally, a note of cautious (and perhaps foolish) optimism. For the reasons we have discussed throughout this review, estimates of gene flow based on FST are unlikely to be very reliable. However, these estimates are likely to be correct within a few orders of magnitude. Comparisons of large groups of species are likely to be more informative, as many of the differences may average out. Estimates of dispersal from FST should be undertaken with great caution, and only if the biological question behind the attempt at estimating dispersal depends on knowing migration rates within very large bounds.


  1. Hanski and Gilpin (1997). Metapopulation Biology: Ecology, Genetics and Evolution. Academic Press, New York.

  2. (1985). Gene flow in natural populations. Ann Rev Ecol Syst, 16: 393–430.

  3. (1987). Gene flow and the geographic structure of natural populations. Science, 236: 787–792.

  4. Neigel (1997). A comparison of alternative strategies for estimating gene flow from genetic markers. Ann Rev Ecol Syst, 28: 105–128.

  5. and (1998a). Genetic estimates of population structure and gene flow: limitations, lessons and new directions. Trends Ecol Evol, 13: 202–206.

  6. , and (1994). Conservation genetics: beyond the maintenance of marker diversity. Mol Ecol, 3: 423–435.

  7. , , and (1998). Is population genetics mired in the past? Trends Ecol Evol, 13: 360.

  8. and (1998b). Reply from Bossart and D. Pashley Prowell. Trends Ecol Evol, 13: 360.

  9. (1996). Genetic Data Analysis II. Sinauer Associates, Sunderland, MA.

  10. (1931). Evolution in Mendelian populations. Genetics, 16: 97–159.

  11. and (1992). Balancing selection at allozyme loci in oysters: implications from RFLPs. Science, 256: 100–102.

  12. , and (1995). Genetic population structure and gene flow in the Atlantic cod Gadus morhua: a comparison of allozyme and nuclear RFLP loci. Genetics, 139: 375–385.

  13. , and (1997). The effects of local selection, balanced polymorphism and background selection on equilibrium patterns of genetic diversity in subdivided populations. Genet Res, 70: 155–174.

  14. , , , and (1991). Hybridization and gene flow in house mice introduced into an existing population on an island. J Zool, 225: 615–632.

  15. and (1986). The barrier to genetic exchange between hybridizing populations. Heredity, 57: 357–376.

  16. (1995). A measure of population subdivision based on microsatellite allele frequencies. Genetics, 139: 457–462.

  17. , , and (1995). An evaluation of genetic distances for use with microsatellite loci. Genetics, 139: 463–471.

  18. (1988). Sources, sinks, and population regulation. Am Nat, 132: 652–661.

  19. , and (1988). Distribution of the bay checkerspot butterfly Euphydryas editha bayensis: evidence for a metapopulation model. Am Nat, 132: 360–382.

  20. (1996). Sources and sinks in population biology. Trends Ecol Evol, 11: 326–330.

  21. and (1996). Stochastic migration and maintenance of genetic variation in sink populations. Am Nat, 147: 919–945.

  22. and (1967). The Theory of Island Biogeography. Princeton University Press, New Jersey.

  23. (1991). Colonization in metapopulations - a review of theory and observations. Biol J Linn Soc, 42: 105–121.

  24. and (1977). Turnover rates in insular biogeography: effect of immigration and extinction. Ecology, 58: 445–449.

  25. (1991). The effect of host plant patch size variation on the population structure of a specialist herbivore insect, Tetraopes tetraophthalmus. Evolution, 45: 1675–1684.

  26. (1992b). Temporal fluctuations in demographic parameters and the genetic variance among populations. Evolution, 46: 608–615.

  27. (1977). Gene flow and genetic drift in a species subject to frequent local extinctions. Theor Pop Biol, 12: 253–262.

  28. and (1988). Extinction and recolonization: their effects on the genetic differentiation of local populations. Evolution, 42: 995–1005.

  29. and (1990). Some population genetic consequences of colony formation and extinction: Genetic correlations within founding groups. Evolution, 44: 1717–1724.

  30. (1989). Extinction, colonization, and population structure: a study of a milkweed beetle, Tetraopes tetraophthalmus. Am Nat, 134: 365–376.

  31. Whitlock (1992a). Nonequilibrium population structure in forked fungus beetles: Extinction, colonization, and the genetic variance among populations. Am Nat, 139: 952–970.

  32. McCauley et al. (1995). Local founding events as determinants of genetic structure in a plant metapopulation. Heredity, 75: 630–636.

  33. Giles and Goudet (1997). A case study of genetic structure in a metapopulation. In: Hanski, I. A. and Gilpin, M. E. (eds) Metapopulation Biology: Ecology, Genetics and Evolution, pp. 429–454. Academic Press, New York.

  34. Ingvarsson et al. (1997). Extinction-recolonization dynamics in the mycophagous beetle Phalacrus substriatus. Evolution, 51: 187–195.

  35. (1995). The use of chloroplast DNA polymorphism in studies of gene flow in plants. Trends Ecol Evol, 10: 198–202.

  36. Wright (1943). Isolation by distance. Genetics, 28: 114–138.

  37. Wright (1946). Isolation by distance under diverse systems of mating. Genetics, 31: 39–59.

  38. and (1964). The stepping stone model of population structure and the decrease of genetic correlation with distance. Genetics, 49: 561–576.

  39. Slatkin (1993). Isolation by distance in equilibrium and non-equilibrium populations. Evolution, 47: 264–279.

  40. and (1994). Estimates of gene flow in Eichhornia paniculata (Pontederiaceae): effects of range substructure. Heredity, 75: 549–560.

  41. and (1984). Group selection for a polygenic behavioural trait: Estimating the degree of population subdivisions. Proc Natl Acad Sci USA, 81: 6073–6077.

  42. , (eds) (1987). Mammalian Dispersal Patterns: the Effects of Social Structure on Population Genetics. University of Chicago Press.

  43. and (1997). The quantitative genetics of wing dimorphism under laboratory and ‘field’ conditions in the cricket Gryllus pennsylvanicus. Heredity, 78: 235–240.

  44. (1979). Gene flow and life history patterns. Genetics, 93: 263–284.

  45. (1983). Gene flow distances in natural populations of Tetraopes tetraophthalmus. Evolution, 37: 1239–1246.

  46. (1995). Effective population size/adult population size ratios in wildlife: a review. Genet Res, 66: 95–107.

  47. (1994). Estimating the relative rates of pollen and seed migration among plant populations. Heredity, 72: 250–259.

  48. and (1995). A method for quantifying differentiation between populations at multi-allelic loci and its implications for investigating identity and paternity. Genetica, 96: 3–12.

  49. (1963). A probability method for treating inbreeding systems especially with linked genes. Biometrics, 19: 1–17.

  50. (1995). Two-locus drift with sex chromosomes: the partitioning and conversion of variance in subdivided populations. Theor Pop Biol, 48: 44–64.

  51. Koenig et al. (1996). Detectability, philopatry, and the distribution of dispersal distances in vertebrates. Trends Ecol Evol, 11: 514–517.

  52. Devlin and Ellstrand (1990). The development and application of a refined method for estimating gene flow from angiosperm paternity analysis. Evolution, 44: 248–259.

  53. Broyles et al. (1994). Evidence for long-distance pollen dispersal in milkweeds (Asclepias exaltata). Evolution, 48: 1032–1040.

  54. Nason and Hamrick (1997). Reproductive and genetic consequences of forest fragmentation: two case studies of neotropical canopy trees. J Heredity, 88: 264–276.

  55. Dias et al. (1996). Source-sink populations in Mediterranean Blue tits: evidence using single-locus microsatellite probes. J Evol Biol, 9: 965–978.

  56. and (1989). A comparison of three indirect methods for estimating average levels of gene flow. Evolution, 43: 1349–1368.


Thanks to many people who have read previous versions of this manuscript and made many helpful suggestions: Sally Otto, Rick Taylor, Pelle Ingvarsson, Mike Stamford, Steve Latham, and Jim Leebens-Mack. In particular we would like to thank the reviewers of this manuscript for thoughtful and generous contributions which have greatly improved this paper: Nick Barton, Jim Mallet, and Joe Neigel. This work was supported by a Natural Sciences and Engineering Research Council (Canada) grant to M. C. W. and by National Science Foundation grant DEB-9610496 to D. E. M.

Author information


  1. Department of Zoology, University of British Columbia, Vancouver, BC V6T 1Z4 Canada

    • Michael C Whitlock
  2. Department of Biology, Vanderbilt University, Nashville, Tennessee 37235, USA

    • David E McCauley



Corresponding author

Correspondence to Michael C Whitlock.
