Introduction

Two measures that help us predict the influence of genetic drift on a population are the (effective) neighborhood number (Nn) and effective population size (Ne). However, the distinction between them is often misunderstood and here my aim is to clarify their relationship, emphasizing that these two measures are conceptually very different. Ne is the effective number of individuals within the whole population, defined as the size of an ideal population affected by random genetic sampling at the same rate as the population being studied (Wright, 1931, 1938), whereas Nn is the effective number of individuals within an area of a population (the neighborhood) defined such that the parents of a focal individual can be considered to be genetically representative of the neighborhood’s occupants (Wright, 1946). To avoid confusion, the notation Nn is used for the neighborhood number instead of Nb (Slatkin and Barton, 1989) as Nb has been used for the effective number of breeders in a given year in species with overlapping generations (Waples, 2005), and, more generally, for the effective number of parents producing the sample used in the estimation of Ne when there is temporal or spatial population structure (Neel et al., 2013).

Both Ne and Nn increase with the density of individuals within a population, and some of the species-specific factors that influence Ne by altering the variance in reproductive success (Wright, 1938) such as longevity and mating system (Nunney, 1993) can also influence Nn; however, the two measures differ in scale and, as a result, they differ in the nature of their influence on the genetic composition of a population. A major factor determining Ne is the size of the population (N), and that is largely dependent upon the area of available habitat. In contrast, Nn is independent of N as it is defined at a local scale by the dispersal biology of the species. Any population exhibiting spatial structure must encompass an area greater than a single neighborhood, as when this is not the case the population is panmictic.

Ne is expected to vary substantially across populations because of its strong dependence upon the area of available suitable habitat, an area that can vary enormously from population to population, whereas Nn is expected to vary little among populations of the same species given the relative constancy of species-specific dispersal patterns. Moreover, as Ne is affected by the spatial structure of a population (Whitlock and Barton, 1997; Nunney, 1999), Nn can directly affect Ne, whereas the reverse is not true.

The rate at which neutral genetic variation is lost from a population because of genetic drift is determined by the effective population size (Ne). Although this measure is primarily dependent upon N, it is modified by a variety of factors (see Hare et al., 2011). One such factor is nonrandom mating. Wright (1943) showed how two different sources of nonrandom mating affect Ne under otherwise ideal conditions. First, given local inbreeding (such as selfing or brother/sister mating) but no spatial structure, then Ne=N/(1+FIS); and second, given a spatial structure of semi-isolated island sub-populations (and random mating within each), Ne=N/(1−FST). In these formulae, FIS and FST are the hierarchical inbreeding coefficients (Wright, 1951). These two results can be combined into a single relationship (Nunney, 1999):

$$N_e = \frac{N}{(1+F_{IS})(1-F_{ST})} \qquad (1)$$

that illustrates how FIS, derived from local inbreeding, and FST, derived from the isolation of the island sub-populations, influence Ne in opposite ways.

This observation that FIS generated by a regular system of inbreeding acts to reduce Ne extends to nonideal populations (see Caballero and Hill, 1992; Wang, 1996; Yonezawa, 1997; Nunney, 1999); however, the effect of the larger scale inbreeding due to the island subdivisions (FST) depends on the model used. The influence of FST in the island model in increasing Ne (Equation (1)) is dependent upon the assumption that island productivity is locally regulated, such that all islands contribute equally to the dispersal pool (Whitlock and Barton, 1997). If, on the other hand, a demic model is used, where population regulation allows productivity differences among the demes (or islands) to be reflected in the pool of migrants, then these productivity differences generate interdemic genetic drift. The result is that the influences of FIS and FST become identical (Nunney, 1999):

$$N_e = \frac{N}{(1+F_{IS})(1+F_{ST})} \qquad (2)$$

This difference between the effective size of a metapopulation with equal dispersal per island (island model, Equation (1)) versus one with varying dispersal from each island (demic model, Equation (2)) is important in defining conservation strategies (see, for example, Gilpin, 1991; Hedrick and Gilpin, 1996; Whitlock and Barton, 1997; Nunney, 2000).
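The contrast between the two models can be made concrete with a short numerical sketch. It assumes that Equation (1) has the combined form N/[(1+FIS)(1−FST)], as implied by the two single-factor results quoted above, and that Equation (2) differs only in the sign attached to FST; the function names are illustrative and not taken from the original study.

```python
def ne_island(n, f_is, f_st):
    """Island model with local regulation (assumed form of Equation (1))."""
    return n / ((1.0 + f_is) * (1.0 - f_st))


def ne_demic(n, f_is, f_st):
    """Demic model with productivity differences among demes
    (assumed form of Equation (2))."""
    return n / ((1.0 + f_is) * (1.0 + f_st))


if __name__ == "__main__":
    n = 1000
    # With no local inbreeding, increasing FST raises Ne in the island
    # model and lowers it in the demic model.
    for f_st in (0.0, 0.1, 0.3):
        print(f_st, round(ne_island(n, 0.0, f_st), 1),
              round(ne_demic(n, 0.0, f_st), 1))
```

Under these assumptions the two models agree only when FST=0, and they diverge in opposite directions as subdivision increases.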

Equations (1) and (2) provide the basis for understanding how Ne is affected when a deviation from Hardy–Weinberg ratios occurs as a result of a regular system of close inbreeding combined with uniform dispersal (=FIS), and/or as a result of the population being subdivided into semi-isolated islands (=FST). The situation that these equations do not directly address, however, concerns how Ne is affected when genetic structure builds up as a result of ‘isolation-by-distance’. Wright (1943, 1946) suggested how this could be done.

Wright (1946) introduced the concept of a neighborhood as the unit of structure that arises from isolation by distance within a continuously distributed population. He showed that when both sexes exhibit normally distributed dispersal (of either individuals or gametes), then Nn is the number of individuals contained in a circle of radius 2σ, that is,

$$N_n = 4\pi\sigma^2 d \qquad (3)$$

where σ² is the variance of the distance between a central offspring and its parents, measured along a diameter (noting that the symmetry in the position of parents around the focal offspring defines a mean dispersal distance of zero), and where d is the density of individuals. Note that Equation (3) indicates that Nn will be relatively constant assuming that σ² is a property of the species, but it will vary somewhat if the density of individuals varies among different populations. Even the effect of varying density may be minimized if there is some negative correlation between dispersal and density (for example, if at very low density, individuals disperse further).
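As a worked illustration of the neighborhood definition, the following sketch computes the number of individuals in a circle of radius 2σ, that is, π(2σ)²d, and inverts the relationship to find the dispersal standard deviation corresponding to a target Nn. The function names and the example values are hypothetical.

```python
import math


def neighborhood_number(sigma, density):
    """Individuals within a circle of radius 2*sigma: pi * (2*sigma)^2 * d."""
    return 4.0 * math.pi * sigma ** 2 * density


def sigma_for_neighborhood(nn, density):
    """Dispersal standard deviation that yields a given neighborhood number."""
    return math.sqrt(nn / (4.0 * math.pi * density))


if __name__ == "__main__":
    # One individual per unit area and sigma = 1 gives Nn = 4*pi.
    print(neighborhood_number(1.0, 1.0))
    # Halving the density halves Nn unless dispersal distance compensates.
    print(neighborhood_number(1.0, 0.5))
    print(sigma_for_neighborhood(16.0, 1.0))
```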

The neighborhood number directly determines how random genetic drift affects genetic differentiation within a population, with a smaller neighborhood (because of a shorter dispersal distance) resulting in greater differentiation. Thus, within a large continuously distributed population, Nn provides information on the spatial distribution of genetic variation contained within a population. This genetic structure develops regardless of the total population size (assuming Ne ≫ Nn). The value of Nn affects the level of genetic variation within the total population indirectly via this genetic structure, which affects Ne by creating a pattern of nonrandom mating. As genetic structure created by limited dispersal is often a feature of natural populations, it is important to determine accurately what influence Nn has on the effective size of the total population.

Wright (1943) suggested that a random breeding unit in a continuous population (later called a neighborhood in Wright, 1946) was analogous to an island of the island model if neighborhoods were sampled at random. To formalize the analogy he showed that, if these units regulate their numbers locally, then:

$$N_e = \frac{N}{1-F_{ST}} \qquad (4)$$

where FST is the genetic differentiation among randomly sampled neighborhoods. Equation (4) predicts that as the genetic structure becomes more extreme (FST increases), then Ne increases.

Maruyama (1972) derived a more exact expression for the decline in heterozygosity given the isolation-by-distance model and using numerical examples of his formula concluded that:

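$$N_e \approx \begin{cases} N & \text{if } \sigma^2 d > 1 \\ N/(\sigma^2 d) & \text{if } \sigma^2 d < 1 \end{cases} \qquad (5)$$

From Equation (3), this result indicates that if Nn>4π (=12.6), then neighborhood size has no noticeable effect on the global effective population size, but if the neighborhood size is small then:

$$N_e \approx \frac{4\pi N}{N_n} \qquad (6)$$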

that is, Ne increases as Nn decreases, with Ne=4πN when Nn=1. This result is qualitatively (but not quantitatively) the same as Equation (4).
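The threshold behavior described here can be sketched numerically. The function below assumes the approximation takes the form Ne≈N when σ²d≥1 and Ne≈N/(σ²d) otherwise, with σ²d recovered from Nn through the circle-of-radius-2σ definition (σ²d=Nn/4π); this is a reading of the text above rather than a transcription of Maruyama's (1972) formula, and the function name is illustrative.

```python
import math


def ne_maruyama_approx(n, nn):
    """Approximate Ne under isolation by distance (assumed piecewise form).

    sigma^2 * d is recovered from Nn = 4 * pi * sigma^2 * d.
    """
    sigma2_d = nn / (4.0 * math.pi)
    if sigma2_d >= 1.0:
        return float(n)      # neighborhood size has no noticeable effect
    return n / sigma2_d      # equivalently 4 * pi * N / Nn


if __name__ == "__main__":
    for nn in (256, 16, 4, 1):
        print(nn, round(ne_maruyama_approx(1000, nn), 1))
```

With N=1000 the sketch returns N for Nn=256 and Nn=16, and 4πN for Nn=1, in line with the statement that Ne=4πN when Nn=1.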

To further examine the link between Nn and Ne, Kawata (1995) simulated a continuously distributed population where limited dispersal created genetic structure. During the initial nonequilibrium phase (he used the first 15 generations), while the genetic structure was being established, estimates of Ne (using the decline in heterozygosity) were closely related to Nn, but later, when the genetic structure of the population had equilibrated, the estimates of Ne were more influenced by N. However, Kawata (1995) found that at these late stages the estimated Ne was always less than or equal to the population size, a result conflicting with Equation (4).

Neel et al. (2013) also simulated a neighborhood-structured population, estimating Ne with a single-sample estimator (based on linkage disequilibrium). Like Kawata (1995), they found that under some conditions (in their case, when the genetic sampling was at the scale of a single neighborhood) the estimated Ne was close to Nn. Larger scale sampling encompassing several neighborhoods produced higher estimates of Ne, but they never approached the expected theoretical value defined by Equation (4). For example, given the largest neighborhood size tested (=84) and a sample area of 5% of a population of 90 000, the estimate of Ne was under 800.

The present paper was motivated by questions arising from these two previous simulation studies. First, was Wright (1943) correct in his contention that the equilibrium interrelationship of Ne, Nn and N is closely approximated by the results derived for the island model, that is, that the effect of neighborhood structure on Ne is analogous to the effect of island sub-populations on the effective size of the total population, as defined by Equation (4)? The results of Kawata (1995) were inconsistent with this contention. Second, the two previous studies showed that, under some conditions in populations with significant spatial genetic structure, traditional estimators of Ne can lead to substantial underestimates closer to Nn than the expected Ne. Furthermore, Neel et al. (2013) concluded that single-sample estimates of Ne will generally result in an underestimation of the true Ne, and raised the question of whether this bias also applies to the more traditional two-sample temporal method (see Waples, 1989). If the temporal method underestimates the true Ne to a degree similar to that revealed by the simulation estimates of Neel et al. (2013) using a single-sample linkage disequilibrium method, then this might account for many of the unusually low estimates of Ne/N reported in the literature.

These two questions were investigated using simulations of a spatially structured plant population. First, to examine the accuracy of Equation (4), the ‘true Ne’, that is, the value of Ne realized in a simulation over a period of 32 generations, was calculated from the gene frequency change occurring across 1000 single-nucleotide polymorphisms (SNPs) estimated from the total population. The reliability of this estimate was independently verified using FST values resulting from simulations of an island-structured metapopulation (of 1000 islands with 1 polymorphic locus), where each island showed internal genetic structure because of isolation by distance. Second, the bias in estimating Ne in a spatially structured population was evaluated using the temporal method under varying conditions of sample size, sampling method and time interval. Until recently, the temporal method was the main approach for estimating Ne using genetic data.

In summary, this paper addresses the theoretical problem of whether Equation (4) defines Ne, and the practical problem of whether the temporal method can accurately estimate Ne. Understanding the extent to which spatial genetic structure is expected to increase (or decrease) Ne is important in predicting the effect of factors such as habitat loss on the long-term genetic composition of populations using the theoretical links between ecological/demographic factors and Ne (Nunney and Elam, 1994). It is also important in evaluating genetic estimates of Ne, given the concern over substantial bias noted by Neel et al. (2013). Documenting this bias across the available genetic estimators is the first step in moving toward a resolution of the problem.

Materials and methods

Simulation model

The simulation model assumed a plant population of monoecious annuals. The habitat consisted of a rectangular array of regularly spaced sites, each of which always supported a single plant so that the population size N was constant. Pollen dispersal was normally distributed around each paternal plant and seed dispersal was zero; thus success through female function was fixed at one, whereas male success was approximately Poisson (although it was expected to be somewhat influenced by the neighborhood size). To avoid edge effects, it was assumed that locally dispersing pollen was reflected back from the boundaries of the population. The model was coded in PureBasic (Fantaisie Software, Fegersheim, France).
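The reproduction step described above can be sketched as follows. This is a minimal illustration and not the PureBasic code used in the study: it tracks a single locus, finds each father by a rounded Gaussian displacement from the maternal site (used here as a stand-in for pollen dispersing from the paternal plant), permits selfing when the displacement rounds to zero, and mirrors coordinates at the lattice edges. All names and these simplifications are assumptions.

```python
import random


def reflect(pos, size):
    """Fold an integer coordinate back into 0..size-1, mirroring at the edges."""
    pos %= 2 * size
    return pos if pos < size else 2 * size - 1 - pos


def next_generation(grid, sigma, rng):
    """Advance a lattice of diploid genotypes (allele pairs) by one generation.

    Each site is refilled by a seed from the plant already there (no seed
    dispersal), so female success is exactly one per plant. Male success
    varies because fathers are drawn by Gaussian displacement.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = []
    for r in range(rows):
        new_row = []
        for c in range(cols):
            # Locate the pollen donor, reflecting at the population boundary.
            fr = reflect(r + round(rng.gauss(0.0, sigma)), rows)
            fc = reflect(c + round(rng.gauss(0.0, sigma)), cols)
            egg = rng.choice(grid[r][c])        # maternal allele
            pollen = rng.choice(grid[fr][fc])   # paternal allele
            new_row.append((egg, pollen))
        new_grid.append(new_row)
    return new_grid


if __name__ == "__main__":
    rng = random.Random(42)
    side = 16  # 16 x 16 lattice, so N = 256
    grid = [[(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(side)]
            for _ in range(side)]
    for _ in range(32):
        grid = next_generation(grid, 1.0, rng)
    freq = sum(a + b for row in grid for a, b in row) / (2 * side * side)
    print(freq)
```

Because every site always receives exactly one offspring, the population size stays constant, which is the strong local regulation that the model relies on.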

The simulated system tracked 1000 independent biallelic loci (or SNPs) initiated with equal allele frequencies. As noted above, the true value of Ne determining drift over a period of 32 generations was calculated using the temporal method by genetically sampling every individual before and after the 32-generation interval. This value was compared with shorter sampling intervals of 1, 2, 4, 8 and 16 generations. To examine the effect of less than complete sampling, three different sampling strategies were employed using intervals of 1, 2, 4, 8, 16 and 32 generations. These strategies were: (1) random sampling over the whole population (without replacement), (2) sampling and resampling a single site and (3) sampling one site but resampling a different site. All sampling was nondestructive, and each sampling or resampling of a site included all individuals at that location. Sampling was initiated after an N-generation burn-in. The population sizes simulated were N=256, 1024 or 4096 and the fraction of the population sampled was 25, 10 or 2%. If either one of an initial/final pair of samples was not polymorphic at a locus, then that locus was omitted as the time period over which drift was acting was unknown.

To estimate Ne, the temporal method was applied using the statistic Fc and the appropriate sample size correction (Equations (8) and (12) from Waples, 1989). Each 1000-locus scenario of each of the three population sizes and each of the three sample sizes (that is, 9 cases) was replicated 5 times, with the three sampling methods being simultaneously implemented within each simulation.
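A minimal sketch of a temporal estimate of this kind is given below. It assumes the commonly used form of Fc for a biallelic locus, (x−y)²/[(x+y)/2−xy], averaged over loci, and a sampling correction of 1/(2S0)+1/(2St); whether these match Equations (8) and (12) of Waples (1989) term for term should be checked against that paper. The function names are illustrative. Loci that are not polymorphic in both samples are skipped, as described above, and a non-positive corrected Fc is reported as an infinite estimate.

```python
import math


def fc_locus(x, y):
    """Standardized allele frequency change at one biallelic locus."""
    return (x - y) ** 2 / ((x + y) / 2.0 - x * y)


def temporal_ne(freqs_0, freqs_t, t, s0, st):
    """Temporal-method estimate of Ne (assumed form of the estimator).

    freqs_0, freqs_t: per-locus allele frequencies in the two samples
    t: generations between samples; s0, st: individuals per sample
    """
    fcs = [fc_locus(x, y) for x, y in zip(freqs_0, freqs_t)
           if 0.0 < x < 1.0 and 0.0 < y < 1.0]   # omit non-polymorphic loci
    if not fcs:
        raise ValueError("no locus is polymorphic in both samples")
    f_mean = sum(fcs) / len(fcs)
    # Remove the contribution expected from sampling error alone.
    drift = f_mean - 1.0 / (2.0 * s0) - 1.0 / (2.0 * st)
    if drift <= 0.0:
        return math.inf   # change no larger than sampling noise
    return t / (2.0 * drift)


if __name__ == "__main__":
    print(temporal_ne([0.5, 0.4], [0.6, 0.35], t=4, s0=100, st=100))
```

The infinite branch corresponds to the situation reported in the Results, where short sampling intervals can leave the observed change no larger than that expected from sampling alone.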

The expected value of Ne given uniform pollen dispersal (that is, Nn=∞) was used as a reference value (Ne,ref) for testing the fit of the data to theory. Ne,ref was predicted to be 4N/3 based on Wright's (1938) classic result of Ne=4N/(2+V), noting that the reproductive variance V is made up of male plus female variance. Given the conditions of the simulation, Vf=0 whereas Vm≈1.
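The reference values follow directly from Ne=4N/(2+V) with V=Vm+Vf. A short check, using an illustrative function name, reproduces them for the three population sizes simulated:

```python
def ne_wright(n, v_male, v_female):
    """Ne = 4N / (2 + V), with V the sum of male and female variance."""
    return 4.0 * n / (2.0 + v_male + v_female)


if __name__ == "__main__":
    # Vf = 0 (one seed per plant) and Vm of about 1 (Poisson male success).
    for n in (256, 1024, 4096):
        print(n, round(ne_wright(n, 1.0, 0.0), 1))
```

With Vf=0 and Vm=1 the denominator is 3, giving 4N/3 in each case.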

To provide an independent estimate of Ne given the conditions of the simulation, Ne was also estimated from the equilibrium FST using simulations of 1000 neighborhood-structured island populations (each of N=256, 1024 or 4096) linked by a low level of migration (Nm=1) and segregating a single biallelic locus, so that Ne=(1−FST)/(4mFST). Simulations were run for 4N generations, and estimates of FST were based on the final N generations.
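The FST-based estimate can be illustrated by inverting the island-model relationship. The sketch below uses Ne=(1−FST)/(4mFST) as given above, together with its rearrangement FST=1/(1+4Nem); the function names and the example value of FST are hypothetical.

```python
def ne_from_fst(f_st, m):
    """Effective size of each island from equilibrium FST and migration rate m."""
    return (1.0 - f_st) / (4.0 * m * f_st)


def expected_fst(ne, m):
    """Rearrangement of the same relationship: FST = 1 / (1 + 4 * Ne * m)."""
    return 1.0 / (1.0 + 4.0 * ne * m)


if __name__ == "__main__":
    n = 256
    m = 1.0 / n   # Nm = 1 migrant per generation
    # If Ne equalled N, the expected equilibrium FST would be 0.2.
    print(expected_fst(n, m))
    # An observed FST below 0.2 therefore implies Ne greater than N.
    print(ne_from_fst(0.15, m))
```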

Results

The effect of Nn on Ne

When the neighborhood size was infinite (that is, uniform pollen dispersal), the effective size determined from gene frequency change occurring over 32 generations (Ne,32) was in close agreement with the expected value of 4N/3 (=Ne,ref): Ne,32=360±15 (mean±1 s.d.) with Ne,ref=341.3 (N=256); Ne,32=1396±78 with Ne,ref=1365 (N=1024); and Ne,32=5526±230 with Ne,ref=5461 (N=4096), where each value was based on 15 replicates. Under these conditions, genetic structure was minimal (FIS=0.00), and, as expected, when neighborhood size was reduced (using a series of fourfold reductions down to Nn=1), genetic structure increased, as measured by FIS (Table 1). Determining Ne from the gene frequency change occurring over 32 generations (Ne,32), it was found that reducing Nn increased Ne, although the effect was relatively minor until the neighborhood size was very small (roughly less than 16; Figure 1). This increase was compared with the expected effective size (Ne,exp) calculated as:

$$N_{e,exp} = \frac{N_{e,ref}}{1-F_{IS}} \qquad (7)$$

Table 1 Ne estimated for populations of varying size (N) and neighborhood size (Nn)
Figure 1

Ne of populations of varying size (N) and neighborhood size (Nn) compared with Wright’s expectation. The theoretical expectation of Ne (Ne,exp; see Equation (7)) is shown by the solid lines for N of 256 (squares), 1024 (triangles) and 4096 (diamonds). The dashed lines link the values of Ne realized in the simulations (Ne,32), estimated from the multilocus gene frequency changes (using 1000 SNPs) over 32 generations. Each point (±1 s.d.) was based on 15 replicate simulations where the whole population was sampled.

Equation (7) provides a practical test of the suggestion of Wright (1943) that the isolation-by-distance model should exhibit the same genetic structure as the island model among neighborhoods. As neighborhood dimensions cannot be easily identified in the field, an alternative to comparing neighborhoods is to quantify genetic structure using the population-wide FIS. The results showed that Equation (7) generally provides a good estimate of Ne (Figure 1). It did overestimate the effective size somewhat when Nn was very small (≤4). Thus for Nn=4, the bias was limited to 7%, but increased to 20% when Nn=1 (18% when N=256 and increasing to 23% when N=4096).

The values of Ne based on the population-wide gene frequency changes observed across 32 generations (Ne,32) were compared with independent estimates derived from a separate set of simulations in which 1000 replicate populations were linked by random dispersal (Nm=1). Ne was calculated from FST among the island populations. The results are shown in Table 1 and are very similar to the temporal method values, except when Nn is very small. For Nn≤4, the FST-based estimates were significantly lower, an effect apparently linked to reduced genetic structure within the populations (that is, reduced FIS; see Table 1) because of immigration.

Accuracy of Ne estimates

The temporal method (see Waples, 1989) uses drift-induced gene frequency change to estimate Ne. In this present simulation study, the two required samples were separated by T=1 to 32 generations in populations of N=256, 1024 and 4096 with 1000 biallelic SNPs, and the samples were based on 2, 10, 25 or 100% of the population.

The first question was to determine the accuracy of effective size estimates (Ne,est) when the whole population was sampled, but the time interval between samples was small. It was found that even a one-generation sampling interval (T=1) gave very good estimation relative to the T=32 value, only very slightly underestimating Ne when N was small (Table 1).

When 25% of the population was sampled (Figure 2), the accuracy of the estimates of Ne depended on the sampling strategy. The first strategy was to resample the same site. Using this approach, if the sampling interval was only one or two generations, then at least one of the five replicate estimates was infinite, regardless of Nn, and hence the average was infinite. For the smallest neighborhood size (Nn=1), this was also the case when the interval was four generations. In all other situations, the estimates were constrained to a more realistic range declining from infinite overestimates to underestimates as the sampling interval increased (Figure 2a). Given a low level of genetic structure (Nn=256; dotted lines, Figure 2a), in the smallest population simulated (N=256) Ne,est was fairly close to the true value (that is, less than 25% below) and was a good estimate when T was 32 generations, but for larger values of N, the underestimate increased to substantial levels (40–60%). As structure in the population was increased, estimates dropped from being unrealistically high (that is, infinity) down to underestimates of 50–75% (that is, Ne,est/Ne of 0.50 to 0.25) as the sampling interval was increased (Figure 2a).

Figure 2

The accuracy of temporal-method estimates of Ne based on sampling 25% of the population after various intervals (T). The estimate of Ne is shown relative to Ne,32, the value derived from sampling the whole population over a period of 32 generations (see Figure 1). Three sampling methods are illustrated: (a) resample all individuals in the same site; (b) sample all individuals in one site and then resample all individuals from a non-overlapping site; and (c) randomly sample individuals for both samples. The results are shown for different populations sizes (N=256, 1024 and 4096) and neighborhood sizes (Nn=1, 16 and 256). Five sets of simulations with 1000 loci were run for N generations. Points that included infinite estimates of Ne were omitted.

The second sampling technique was to take the first and second samples from different sites within the population (specifically, opposite corners). Using this method, no estimates were infinite; in fact, in stark contrast to same-site sampling, when T was one or two generations, Ne,est was dramatically underestimated (Figure 2b). Even when genetic structure was minimal (Nn=256; dotted lines, Figure 2b), the underestimate was such that Ne,est/Ne was 0.12 (N=256) to 0.008 (N=4096) when T=1. The underestimate became more extreme as the neighborhood size decreased. For Nn=1, the ratio of Ne,est/Ne was 0.001 (N=256) to 0.0001 (N=4096) given T=1, a situation that improved as T increased, but the underestimate remained extreme even for T=32 (when the equivalent ratios were 0.04 to 0.002).

The third sampling technique, random sampling across the whole population, was generally the most accurate of the three methods. This accuracy was notable when Nn=256 (dotted lines, Figure 2c); however, especially when the sampling interval was small (T=1 or 2), there was an increasing downward bias as genetic structure increased. Thus when Nn=1 (solid lines, Figure 2c), Ne,est was 5% of Ne when T=1 and 10% when T=2.

Reducing the sample size to 10 or 2% of the total population had qualitatively the same effect on the estimate of Ne as a 25% sample; however, quantitatively the estimates of Ne became smaller as the sample size decreased. Figure 3 shows the pattern for N=1024. The only exceptions to this pattern were estimates from smaller unstructured populations, some of which became infinite. For example, given same-site sampling, the estimates that were infinite given 25% sampling remained so given 10 and 2% sampling, but others were added under 2% sampling. Thus, when Nn=256, average estimates were infinite over a broader range of T: for T≤16 when N=256; for T≤8 when N=1024; and for T≤4 when N=4096. Under the same conditions (2% sampling and Nn=256), infinite estimates were also seen with random sampling (for T≤16 when N=256; for T≤8 when N=1024; and for T≤2 when N=4096), whereas different-site resampling only gave infinite estimates when N=256 (for T≤16).

Figure 3

The effect of sampling fraction on the accuracy of temporal-method estimates of Ne given the three different sampling strategies of (a) resampling the same site, (b) resampling a different site or (c) random sampling. The fraction of the population (N=1024) sampled was 25, 10 or 2%. Details are as described in Figure 2.

The degree of underestimate given 10 and 2% sampling is further quantified in Table 2 for Nn=16, a case of moderate spatial structure (0.1<FIS<0.2; see Table 1) that one would hope would not be too much of a challenge for estimating Ne. However, this was not the case. Although the estimates were consistent (that is, a low coefficient of variation), the accuracy was generally poor, and often very poor. Given different-site resampling (Figure 3b), the estimates were often more than two orders of magnitude in error, an effect that was most pronounced when the population was large (underlined values in Table 2b). The accuracy improved when the time period between samples was long (for example, T=32 generations), but the estimates were still very biased. Given same-site resampling, the underestimation was reduced relative to the case of different-site resampling (Table 2a); however, the underestimate was still 3–30-fold (dashed and solid lines, Figure 3a). As noted earlier, the pattern of underestimation was reversed if the sample interval was only one or two generations, when Ne was overestimated, generally as infinite (Figure 3a).

Table 2 The bias in effective size estimates (Ne,est) given a neighborhood size (Nn) of 16

In contrast, random sampling across the whole population gave markedly better estimates (Figure 3c). The accuracy was generally good across all sampling proportions when T=32 generations, except when the genetic structure was extreme (Nn=1; see triangles, Figure 3c). Accuracy deteriorated as the time interval between samples was shortened, but the underestimation was modest compared with the other methods (Table 2c), provided the neighborhood size was not too small (Figure 3c); however, as T was reduced, the estimator could flip from significant underestimation to infinite overestimation (Table 2c and Figure 2c).

Discussion

The work presented was designed to emphasize the important distinction between the effective size of a population (Ne) and its internal neighborhood size (Nn). Random genetic change at the population level is determined by Ne that is largely dependent upon N, the number of adults in the population, that in turn is largely determined by the area of the suitable habitat. It is Ne that determines the long-term effects of genetic drift on the genetic composition of a population. In contrast, Nn is primarily dependent upon the dispersal patterns of the species (see Equation (3)), and provides no direct information on the fate of genetic variation in the population as a whole.

In a population structured by the influence of isolation by distance, neighborhoods are, to some degree, genetically differentiated from each other; however, the genetic composition of each neighborhood also varies over time. In essence, given spatial structure, gene frequency contours drift over time, even in a large population where the overall gene frequency remains largely unchanged. This spatial drift has important consequences for the estimation of Ne (see below).

It is important that the distinction between these two measures is kept clear, but the terminology can sometimes be confusing. For example, Nn is sometimes called the neighborhood or local effective population size (see, for example, Eguiarte et al., 1993; Neel et al., 2013), and the neighborhood size is sometimes symbolized by Ne (Kawata, 1995). This can lead others to the incorrect assumption that Nn exhibits the properties of Ne that are important in maintaining genetic variation in a population. For example, Lode and Peltier (2005) in their study of mink concluded that their estimate of Nn (of 16–23) was far below values considered critical for long-term viability. This was not an appropriate conclusion; they were comparing estimates of Nn with a suggested theoretical minimum applying to Ne. Under most circumstances, and especially in the context of populations at risk for extinction, Ne is the critical parameter determining the long-term maintenance of genetic variation (see Nunney, 2000). The magnitude of Nn is only relevant in such studies to the extent that Nn influences Ne.

However, Nn does have an indirect effect on the level of genetic variation by its influence on Ne. Wright (1943) predicted that a continuous population structured by limited dispersal would behave much like a population consisting of a set of island sub-populations. The link between such ‘isolation-by-distance’ populations and a system with separate island sub-populations was supported by Slatkin and Barton (1989), who noted that the stepping stone and neighborhood models can be equated using Nn=2πnm, where n is the size of a sub-population and m is the migration rate among adjacent sub-populations in a stepping-stone model.
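The equivalence Nn=2πnm can be illustrated with a short calculation. The function name and the example values of n and m are hypothetical.

```python
import math


def stepping_stone_neighborhood(n, m):
    """Neighborhood number equivalent to a stepping-stone model: 2 * pi * n * m.

    n: size of each sub-population
    m: migration rate among adjacent sub-populations
    """
    return 2.0 * math.pi * n * m


if __name__ == "__main__":
    # Sub-populations of 50 exchanging 5% migrants with neighbors behave
    # like a continuous population with a neighborhood number near 16.
    print(round(stepping_stone_neighborhood(50, 0.05), 2))
```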

The simulations of annual plant populations presented here supported this view: neighborhood size (Nn) affected the effective size of the population (Ne) in a manner consistent with Equation (7), a simple reformulation of the island model Equation (4). The only notable deviation occurred when Nn was very small (=1; see Figure 1). In any event, as Nn decreased, causing genetic structure to become more pronounced, Ne increased. This is to be expected, because as neighborhoods become increasingly different from each other, local structure protects genetic variation from being lost from the population by drift, provided population regulation is local (see below). The effect of Nn on Ne is only apparent when the neighborhood size is <16 when N=4096 (when FIS=0.16; see Table 1), and when Nn is a little smaller for smaller N (Figure 1). This result is in general agreement with the conclusion of Maruyama (1972), who suggested that Nn had little effect on Ne if σ²d>1, which, as noted earlier, translates to Nn>12.6; however, the results did not fit well with the prediction of Maruyama (1972) (Equation (6)) for Ne when Nn is small. For example, when Nn=1, the prediction is that Ne is increased 16.7-fold by genetic structure, whereas the simulations gave much lower values, between 2.9 (N=256) and 3.7 (N=4096) (Table 1).

In contrast to these patterns, Kawata (1995) found, in a series of simulations, that Ne decreased when Nn was reduced. Although the reasons for this apparently contradictory result cannot be identified with certainty, the simulations are consistent with a decrease in Ne because of two factors: the mating system and global population regulation.

In his simulations, individuals were non-selfing hermaphrodites. Female parents were chosen randomly (with replacement) from across the population and for each a male parent was chosen within a circle of radius M. When M is large, this mating system approximates a random union of gametes model; however, when M is small the system becomes more complex. For example, some individuals have zero fitness because they do not have any potential mates within the specified area, thus increasing the overall variance in reproductive success and decreasing Ne (Kawata, 1995).

The effect of M on the mating system accounted for some of the reduction in Ne below N (see figure 6b of Kawata, 1995), but the effect was not strong enough to explain why reducing Nn showed no indication of driving the increase in Ne predicted by Equation (4). The primary reason why Nn reduced rather than increased Ne appears to be because of the nature of the population regulation.

Kawata (1995) implicitly assumed that population regulation acted globally, that is, limiting the whole population to size N, whereas the simulations presented here imposed strong local regulation (one individual at each of the N sites). Local regulation at or below the level of the neighborhood makes the isolation-by-distance model analogous to the island model (as pointed out by Wright, 1943); however, global regulation changes the model to one that is more analogous to a demic model (Nunney, 1999). In the island model, shifting from local (island level) to global (population level) regulation of dispersal reverses the effect of FST on Ne (compare Equations (1) and (2)). Similarly, given isolation by distance, global population regulation results in random neighborhood productivity differences driving increased genetic drift and hence lowering Ne.

Whether population regulation is local or global will depend on the specific factors acting; however, many density-dependent factors (notably intraspecific and interspecific competition) act locally, suggesting that regulation may generally act at the neighborhood level.

Estimates of Ne based on both the temporal method and using FST from independent simulations of a metapopulation of replicate populations (islands) linked by dispersal were highly concordant, except when Nn was very small (4). This divergence was most probably because of the effect of immigration in reducing the internal genetic structure (that is, reducing FIS, see Table 1). It is expected that reduced FIS would result in reduced Ne (see Equation (7)), and this is what is observed.

The presence of genetic spatial structure has been shown to strongly bias estimates based on the single-sample linkage disequilibrium method for the estimation of Ne. Neel et al. (2013) found that estimates of Ne were close to Nn when the sample area was small relative to the neighborhood, and increased only slowly as the sample area increased, never approaching the true value of Ne. In the present study, this pattern was only weakly supported given the comparable approach of same-site sampling, and required T to be large enough to avoid infinite estimates (see Figures 2a and 3a). For example, when T=32 and N=1024, the 2% sample size was well within a single neighborhood given Nn=64 and 256, but Ne,est was 201 and 578 respectively, values substantially larger than Nn, although substantially smaller than Ne (estimated at 1415 and 1365, respectively, based on sampling the entire population).

The general conclusion regarding the two-sample temporal method for estimating Ne must be that the results cannot be trusted. The only conditions yielding relatively accurate results were random sampling of a large fraction (25%) of a population with moderate to low structure (FIS<0.2) across an interval of 8 generations (Figure 2c). If the proportion of the population sampled drops and/or there is significant spatial structure then accuracy quickly declines, an effect exacerbated by a reduced interval between samples (Figure 3c and Table 2c). In general, the values obtained with random sampling were underestimates, but when Nn was large and the sample size was small, they were overestimates. For example, it can be seen in Figure 3c (where N=1024 throughout) that when the sample size was 2%, the estimates for Nn=256 were slight overestimates for T=16 or 32 generations, but for shorter intervals (T≤8) the estimates were infinite.

Same-site sampling yields dramatic (generally infinite) overestimates of Ne when T is small, and underestimates when T is larger. These underestimates can become substantial (that is, an order of magnitude or more) if the neighborhood size is small and the sample is taken from an area that represents a small fraction of the population (Figure 3a and Table 2a). It is important to understand the cause of this switch from extreme overestimation to underestimation as T is increased. When T is small, it is clear that limited (local) dispersal has the effect of buffering genetic change so that sampling the same site will result in less genetic change than is occurring in the population as a whole. This results in infinite estimates of Ne. However, as the time interval increases, the contours of gene frequency within the population shift in space, so that the change in gene frequency at a particular site is a combination of both population-wide drift and these local changes in gene frequency. The result is that Ne is underestimated.
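The way in which too little observed change produces an infinite estimate can be seen in a minimal sketch of a moment-based temporal estimator. This section does not specify the exact estimator used, so the form below (the standardized variance Fc averaged over biallelic loci, with the expected sampling contribution of the two samples subtracted) is an assumption, and the function and argument names are hypothetical.

```python
import math


def fc_biallelic(x, y):
    """Standardized variance in frequency change (Fc) for one biallelic locus.

    x and y are the frequencies of the same allele in the first and second
    sample. Returns None when the locus is uninformative (fixed in both).
    """
    denom = (x + y) / 2 - x * y
    if denom <= 0:
        return None
    return (x - y) ** 2 / denom


def temporal_ne(freqs_first, freqs_second, t_gens, s_first, s_second):
    """Temporal estimate of Ne from two samples taken t_gens generations apart.

    s_first and s_second are the numbers of diploid individuals sampled.
    Returns math.inf when the observed change does not exceed the change
    expected from sampling alone.
    """
    values = [fc_biallelic(x, y) for x, y in zip(freqs_first, freqs_second)]
    values = [v for v in values if v is not None]
    if not values:
        raise ValueError("no informative loci")
    f_mean = sum(values) / len(values)
    # Subtract the expected contribution of sampling a finite number of
    # individuals at each time point; what remains is attributed to drift.
    drift_signal = f_mean - 1 / (2 * s_first) - 1 / (2 * s_second)
    if drift_signal <= 0:
        return math.inf
    return t_gens / (2 * drift_signal)


if __name__ == "__main__":
    # No observed change at the sampled site: the estimate is infinite.
    print(temporal_ne([0.5, 0.3], [0.5, 0.3], 2, 50, 50))
    # Larger observed change over the same interval: a finite, lower estimate.
    print(temporal_ne([0.5, 0.3], [0.6, 0.4], 2, 50, 50))
```

Under this form, any process that damps the frequency change at the sampled site pushes the drift signal toward or below zero (an infinite estimate), whereas any process that adds change unrelated to population-wide drift lowers the estimate.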

The effect of internal genetic structure is even more apparent when different-site resampling is employed. It is clear that unless the time interval between samples is substantial, the gene frequency difference between the initial and final samples includes both the effect of population-wide drift and the effect of gene frequency differences between the two locations. This inevitably results in a very low estimate of Ne (see Figures 2b and 3b). It was shown that these low estimates were often two or more orders of magnitude below the correct value, especially when T was small and N large (Table 2b).
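A rough arithmetic sketch shows why this bias is most severe when T is small and the population is large. It assumes, purely for illustration, that the expected standardized change between samples is the sum of a drift term, T/(2Ne), and a fixed spatial term reflecting the gene frequency difference between the two locations; sampling error is ignored, and the function name and the value of the spatial term are hypothetical.

```python
def expected_temporal_estimate(true_ne, t_gens, f_space):
    """Approximate temporal estimate of Ne under different-site resampling.

    Assumption: observed change = drift term + spatial term (additive).
    true_ne : true effective population size
    t_gens  : generations between the two samples
    f_space : standardized frequency difference between the two locations
    """
    drift = t_gens / (2 * true_ne)
    return t_gens / (2 * (drift + f_space))


if __name__ == "__main__":
    f_space = 0.05  # hypothetical spatial differentiation between sites
    for true_ne in (1000, 10000):
        for t_gens in (2, 8, 32):
            estimate = expected_temporal_estimate(true_ne, t_gens, f_space)
            print(true_ne, t_gens, round(estimate, 1),
                  round(estimate / true_ne, 4))
```

Because the drift term shrinks as Ne grows or T falls while the spatial term does not, the spatial term dominates the observed change under exactly the conditions in which the underestimation was greatest.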

The results presented were from simulations with varying levels of pollen dispersal but zero seed dispersal. This scenario may create particular difficulties for the estimation methods; however, the general pattern of very limited gene flow in one sex is neither uncommon nor restricted to plants. In animal species, it is frequently the case that one sex is philopatric, whereas the other disperses. Furthermore, the enormous bias observed in the estimates of Ne was also apparent in the results of Neel et al. (2013) given two-sex dispersal. It is clear that a new approach to the estimation of Ne in spatially structured populations is needed.

An important question raised by Neel et al. (2013) was whether isolation by distance could contribute to the unexpectedly low Ne/N ratios sometimes observed. At first sight this may appear paradoxical given that, under conditions of local population regulation, the simulations showed how a small Nn increases the true Ne in accord with Equation (4) (see Figure 1); however, in contrast to the effect of Nn on the true Ne, it is apparent that a small Nn can lead to estimates of Ne that dramatically underestimate the true Ne (Table 2).

Theory suggests that in general Ne/N≥0.1 (Nunney and Campbell, 1993; Frankham, 1995; Vucetich et al., 1997), and although empirical evidence is broadly supportive of this conclusion (Palstra and Ruzzante, 2008), some exceptionally low ratios in the range 10^−3 to 10^−5 have been published, many of which are derived from marine populations (reviewed in Hare et al., 2011). These studies typically used samples stored by previous researchers and, as such, may well have involved resampling in a location that was different from the original. In addition, the populations are generally very large. It is perhaps notable that in the simulations these two factors resulted in the highest degree of underestimation (see Table 2).

The effective population size Ne is a very important population parameter for understanding long-term genetic change, whereas the neighborhood size Nn provides very different information concerning the degree to which the standing genetic variation becomes spatially structured. Ne is especially important in the context of predicting genetic loss from small populations of threatened species (Nunney, 2000), but it is also important for understanding genetic change in larger populations of commercially important species, notably fish (Hare et al., 2011). For these reasons accurate estimation of Ne is important.

The present work using the temporal method builds on the results of Neel et al. (2013) using the linkage disequilibrium method to demonstrate that current procedures for estimating Ne are woefully inaccurate in populations exhibiting spatial genetic structure. Strong biases were apparent even though the genetic data (based on 1000 SNPs) were extensive. Further theoretical work is urgently needed to resolve the confounding effect of spatial variation in the estimation of Ne.

Data archiving

Summary data from all simulation runs and the code used in this study are available from the Dryad Digital Repository: DOI:10.5061/dryad.qc1nc.