Introduction

Resampling techniques are becoming widely used to assess confidence in phylogenetic reconstructions as well as in population genetics (Crowley, 1992). Indeed, direct analytical derivations of the appropriate variances can be extremely complex, and resampling techniques then provide rapid assessments of the precision of the studied statistics.

In particular, a variety of resampling methods have been used to detect genetic differentiation among populations (Crowley, 1992). To test whether there is a significant genetic structure, permutation procedures can be used where individuals are shuffled at random among populations, while keeping the sample sizes the same as in the original analysis (Palumbi & Wilson, 1990; Excoffier et al., 1992; Hudson et al., 1992). These methods do not replace the need to evaluate the precision of the measures of differentiation, such as FST (Wright, 1951; Weir & Cockerham, 1984) or GST (Nei, 1973; Pons & Petit, 1995). For multilocus isozyme data, a confidence interval can be obtained by jackknifing or bootstrapping over loci, as suggested by Weir (1990). But Van Dongen (1995) recently concluded that in general, resampling over individuals should be preferred to resampling over loci, because the allele frequencies for the different loci are usually estimated from the same individuals and are therefore not independent. Resampling over individuals also has the advantage that the precision of the differentiation at each locus can be estimated, which allows for the possibility of detecting aberrant loci when several are available (McDonald, 1994) or of studying single-locus data, such as data based on mitochondrial or chloroplast DNA.
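The permutation procedure described above can be sketched as follows. This is only an illustrative sketch: the function names, the label encoding (alleles coded as integers 0, 1, ...) and the use of a simple plug-in GST as test statistic are assumptions made for the example, not the procedure of any of the cited papers. It also assumes the pooled sample is polymorphic, so that the total diversity is non-zero.

```python
import numpy as np

def plug_in_gst(pops, n_alleles):
    """Plug-in GST = 1 - hS/hT from per-population allele frequencies (illustrative)."""
    # Allele frequencies per population: shape (n_pops, n_alleles)
    x = np.array([np.bincount(p, minlength=n_alleles) / len(p) for p in pops])
    h_s = np.mean(1.0 - np.sum(x**2, axis=1))   # mean within-population diversity
    h_t = 1.0 - np.sum(x.mean(axis=0) ** 2)      # diversity of the mean frequencies
    return 1.0 - h_s / h_t

def permutation_test(pops, n_alleles, n_perm=999, seed=0):
    """Shuffle individuals among populations, keeping the original sample sizes."""
    rng = np.random.default_rng(seed)
    sizes = [len(p) for p in pops]
    pooled = np.concatenate(pops)
    split_points = np.cumsum(sizes)[:-1]
    observed = plug_in_gst(pops, n_alleles)
    count = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        perm_pops = np.split(shuffled, split_points)
        if plug_in_gst(perm_pops, n_alleles) >= observed:
            count += 1
    # Count the observed arrangement as one of the permutations
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value
```

Such a test only indicates whether differentiation is detectable; it gives no measure of the precision of the estimate, which is the question addressed in the rest of the paper.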

However, for estimators of diversity whose precision is affected by both sources of variation (sampling of individuals and of populations), it is unclear what sampling to use for the bootstrap. Several types of resampling have been used so far in the literature: for instance, Petit et al. (1993) used the two-stage bootstrap which mimics the original sampling, whereas Prout & Barker (1993) used a bootstrap with only the populations as units of resampling.

To decide between these alternative types of sampling, a satisfactory solution would be to compare the simulated bootstrap estimates with the analytically derived direct estimates. This seems important because, as outlined by Crowley (1992, p. 431), ‘bootstrapping may have been swept into the mainstream of ecological and particularly evolutionary research somewhat ahead of a full, balanced evaluation of its capabilities, and shortcomings’. This analytical treatment, which considers both the sampling of individuals and the sampling of populations, is now available for a haploid and a diploid locus as well as for ordered alleles (Pons & Chaouche, 1995; Pons & Petit, 1995, 1996). The usual but sometimes implicit assumption is made that the observed populations are independent, which is both a genetic and a sampling assumption (for further discussion on this topic, see Nei, 1986; Pons & Petit, 1996). The same assumption is required in the uniform resampling methods, such as the bootstrap discussed here. We also assume that the number of sampled populations is large but nevertheless much smaller than the total number of existing populations.

We will use these results to illustrate that, in complex situations, it is necessary to check whether bootstrap simulations actually yield the required estimates. Although two-stage bootstrap methods have been studied in the case of a finite number of finite populations drawn with replacement (Rao & Wu, 1988; Sitter, 1992), no results are known for a large number of populations and data sampled without replacement (in the original data sampling scheme). Here, three situations will be considered: sampling of individuals only, of populations only, and of both. We will present the exact or approximate bootstrap estimators and compare them with the estimators obtained analytically in Pons & Petit (1995), which will be referred to simply as the ‘direct estimators’.

These results will be illustrated using data on isozyme polymorphism in sessile oak (Zanetto & Kremer, 1995). The comparison of the bootstrap estimators with the corresponding bootstrap simulated values will provide an evaluation of the approximations we used in the analytical derivations of the variances.

Bootstrap estimation

We consider a total population subdivided into a large number of independent populations and in which I alleles are segregating at a haploid locus. For this situation, diversity and differentiation parameters were defined in Pons & Petit (1995). In particular, the diversity hk of the kth population is given in their eqn 1, the average within-population diversity hS in eqn 2, the total diversity hT in eqn 3 and the differentiation parameter GST in eqn 4. A two-stage random sampling is used to estimate these parameters: n independent populations are drawn with the same probability from the general population, then nk individuals are drawn independently and uniformly from the kth population. Within the kth population, the proportion xki of individuals having the ith allele is observed, corresponding to an unknown frequency pki. Estimators of the parameters are defined in Pons & Petit (1995) as ĥk, ĥS, ĥT and ĜST by their eqns 5, 6, 9 and 10.

Here, we study three different bootstrap sampling procedures for estimating the variance of ĥS, ĥT and ĜST. We also study the within- and between-population components of these variances, which will be useful for understanding what is estimated under each resampling scheme. The first bootstrap method is a resampling of the individuals in the observed populations: in the kth population, nk individuals are drawn uniformly and with replacement from the initial sample of the kth population. In the second resampling procedure, only the populations are drawn: a bootstrap sample is obtained by drawing uniformly n populations with replacement from the observed set of populations, then by taking the initial observed values for each sampled population. The third bootstrap procedure corresponds to a two-stage bootstrap resampling: n populations are sampled uniformly and with replacement from the observed set of populations and, whenever the kth population is selected, nk individuals are drawn uniformly and with replacement from its initial sample.
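The three resampling procedures can be written compactly in code. The sketch below assumes, for illustration only, that the data are held as a list of NumPy arrays, one per population, each containing the allele (or genotype) labels of the sampled individuals; the function names are hypothetical.

```python
import numpy as np

def resample_individuals(pops, rng):
    """Procedure 1: in each population, draw n_k individuals with replacement."""
    return [rng.choice(p, size=len(p), replace=True) for p in pops]

def resample_populations(pops, rng):
    """Procedure 2: draw n populations with replacement, keeping their observed samples."""
    idx = rng.integers(0, len(pops), size=len(pops))
    return [pops[k] for k in idx]

def resample_two_stage(pops, rng):
    """Procedure 3: draw populations, then redraw individuals within each selected one."""
    return resample_individuals(resample_populations(pops, rng), rng)

# Hypothetical toy data: three populations with unequal sample sizes
rng = np.random.default_rng(0)
pops = [np.array([0, 0, 1]), np.array([2, 2, 2, 3]), np.array([4, 5])]
boot1 = resample_individuals(pops, rng)
boot2 = resample_populations(pops, rng)
boot3 = resample_two_stage(pops, rng)
```

In all three cases a bootstrap sample has the same number of populations n as the original data, and each resampled population keeps the sample size nk of the population it was drawn from.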

For the vth bootstrap resampling procedure (v = 1, 2 or 3) and for a parameter θ, which will here be hS, hT or GST, the bootstrap estimator θ̂*(v) of θ is the mean, under the bootstrap resampling distribution and conditionally on the observed variables, of the variable θ*(v) defined in the same way as the direct estimator θ̂ but for the vth bootstrap variable (Efron & Tibshirani, 1993). The bootstrap estimator V̂ar*(v)(θ̂) of the variance of θ̂ is the variance of θ*(v), under the vth bootstrap resampling distribution and conditionally on the observed variables. Because the resampling distributions are multinomials with parameters depending on the observed variables, the bootstrap estimators θ̂*(v) and V̂ar*(v)(θ̂) are functions of the proportions of individuals having each allele in the different populations, when bootstrapping within the populations, and of the sample sizes nk and n. Here, we give the expressions of these bootstrap estimators without proofs, which may be obtained from the second author (Pons, 1997). We will need the following biased estimators of hS and hT, where xi = n⁻¹ Σk xki:

According to each multinomial bootstrap distribution, we get for hS the three bootstrap estimators:

For hT, we have

The three bootstrap estimators of GST are therefore biased. If the number n of sampled populations is large, as is recommended to reduce the total variance of the estimates (Pons & Petit, 1995), ĥS ≈ h̃S and ĥT ≈ h̃T, and the three procedures give similar results.
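In practice, the bootstrap estimator and the bootstrap variance defined above are often approximated by simulation rather than computed in closed form. The following minimal sketch does this for the bootstrap over populations only, taking as input an n × I matrix of observed allele proportions xki. The statistic used, 1 − Σi xi² with xi the mean of xki over populations, is an assumed plug-in form chosen for illustration; the function names and the default of 1000 replicates are likewise assumptions.

```python
import numpy as np

def plug_in_h_t(x):
    """Plug-in total diversity 1 - sum_i x_i^2, with x_i the mean of x_ki over populations."""
    x_bar = x.mean(axis=0)
    return 1.0 - np.sum(x_bar**2)

def bootstrap_over_populations(x, statistic, n_boot=1000, seed=0):
    """Monte Carlo approximation of the bootstrap mean and variance of a statistic."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    reps = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # n populations drawn with replacement
        reps[b] = statistic(x[idx])
    return reps.mean(), reps.var(ddof=1)
```

Comparing the returned mean with the statistic computed on the original data gives a simulated counterpart of the bias discussed above, subject to Monte Carlo error from the finite number of replicates.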

Before considering the bootstrap estimation of the variances, we define Ek and Vark as the expectation and the variance under the multinomial distribution of parameters nk and pki in the kth population. For hS, the three bootstrap variances are

In eqn (4), V̂ark(ĥk) is the estimated variance of ĥk within the kth population. It has the same form as Vark(ĥk) given by Pons & Petit (1995) but with xki instead of pki. Eqn (4) is therefore a biased estimator of the within-population variance of ĥS. Comparing eqn (5) to eqn 12 in Pons & Petit (1995), it appears that the second bootstrap variance is an estimator of the total variance of ĥS instead of an estimator of the between-population variance, as could have been expected when populations alone are drawn. In eqn (6), Êk(ĥk²) is an estimator of Ek(ĥk²) = Vark(ĥk) + hk² obtained by replacing pki with xki, i.e. Êk(ĥk²) = V̂ark(ĥk) + h̃k². This bootstrap variance is therefore a biased estimator of the sum of the within-population and total variances of ĥS.

Closed forms of the bootstrap variance of ĥT are more complicated and we use the same approximations as in Pons & Petit (1995) for large n. The three bootstrap variances of ĥT are then approximated, up to the order n⁻², as

and

For the first bootstrap variance of ĥT, the right-hand side of eqn (7) is an estimator of Varintra(ĥT), the within-population variance of ĥT given by eqn 14 in Pons & Petit (1995), except that nk − 1 in Varintra(ĥT) is replaced by nk. It is thus biased and may differ substantially from the direct estimator when the population sample sizes are small. If the sample sizes of the bootstrap populations are instead set to nk − 1 for the kth population, the bootstrap estimator of hT is still ĥT and its bootstrap variance estimates the within-population variance of ĥT. Comparison with our direct estimators of Var(ĥT) shows that the right-hand side of eqn (8) is an estimator of the total variance of ĥT; this is therefore also the case for the second bootstrap variance estimator. Finally, under the third procedure the estimated bootstrap variance is approximately the sum of the total and within-population bootstrap variances of ĥT.

We get similar results for the bootstrap estimators of the covariance between ĥS and ĥT, Ĉov*(v)(ĥS, ĥT). Thus, the bootstrap procedures provide estimators of the variance matrix of (ĥS, ĥT) when sampling the populations alone, and of its within-population variance when sampling only the individuals. Such results also hold for ĜST, and in particular the bootstrap estimator of its variance is approximately

If n is large, this expression is close to the direct estimator of Var(ĜST) defined by Pons & Petit (1995). However, for small n, the bias of this bootstrap estimator will become apparent and the direct estimation is preferable.

Numerical example

The data set originates from a large study of gene diversity of sessile oak (Quercus petraea (Matt.) Liebl.) in Europe using several isozyme markers (Zanetto & Kremer, 1995). A total of 81 populations were sampled over most of the European range of this species. We selected a single locus (acid phosphatase, EC 3.1.3.2). Sessile oak is a diploid species and a total of five alleles and 12 genotypes were detected at this locus in the survey. For the purposes of illustrating the approach described in this paper, all the analyses are made at the genotypic level, where each genotype is equivalent to an allele in a haploid locus. Alternatively, Hardy–Weinberg equilibrium could have been assumed, to consider the data as haploid. This has no consequence in regard to the question studied here.

The mean number of individuals per population is 114.6, for a total of 9281 individuals analysed. Alleles 2 and 4 are largely predominant, and genotypes 22, 24 and 44 make up 98 per cent of all genotypes found. In Table 1, simulated bootstrap estimates of the three parameters are compared with the direct estimates (Pons & Petit, 1995). In accordance with the theoretical results presented previously, bootstrapping over individuals only provides an unbiased estimate of hS but a biased one for hT, and bootstrapping over populations only provides a biased estimate of hS and an unbiased one for hT, whereas the two-stage bootstrap provides biased estimates for both hS and hT. All bootstrap estimates for GST are therefore biased.

Table 1 Direct estimates and bootstrap simulated estimates of the parameters (using 1000 bootstrap samples). The direct estimates are ĥS, ĥT and ĜST (Pons & Petit, 1995) and the biased estimates are h̃S, h̃T given by eqn (1) and G̃ST = 1 − h̃S/h̃T

In Table 2, we computed total, inter- and intrapopulation variances for the estimates of hS, hT and GST, following the method of Pons & Petit (1995). The estimators of hS and hT have similar variances but the estimate of GST (which directly derives from the other two parameters) is less precise. These estimates are then compared to those obtained by a bootstrap procedure, either empirically (1000 bootstrap simulations) or using eqns (4)–(9). The three types of bootstrap were considered, as discussed above. Overall, the bootstrap estimates are in excellent agreement with the results obtained in the simulations (Table 2). Hence, the approximations (7)–(9) are acceptable in this example. The same approximations led to the expressions of Varintra(ĥT), Varinter(ĥT), Covintra(ĥS, ĥT) and Covinter(ĥS, ĥT) in Pons & Petit (1995). By analogy, these expressions must also be sufficiently precise. The comparison of the variances obtained using either bootstrap (through simulations or estimations) or direct estimates clearly shows that a two-stage bootstrap yields the sum of the total and intrapopulation variances instead of the total variance, in agreement with the theory developed in the previous section. Moreover, the bootstrap over populations gives estimates of the total variances (and not of the interpopulation variances). Finally, the bootstrap over individuals does estimate the intrapopulation variance, though it appears to give an inflated estimate in the case of GST.
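A comparison of this kind can be reproduced in outline on synthetic data. The sketch below is not the analysis of the oak data set: the data are simulated, the statistic is an assumed plug-in mean within-population diversity rather than the direct estimator ĥS, and all names and parameter values are hypothetical. It simply returns the simulated bootstrap variance under each of the three procedures so that they can be set side by side.

```python
import numpy as np

def mean_within_diversity(pops, n_alleles):
    """Plug-in mean within-population diversity (assumed form, for illustration)."""
    x = np.array([np.bincount(p, minlength=n_alleles) / len(p) for p in pops])
    return float(np.mean(1.0 - np.sum(x**2, axis=1)))

def bootstrap_variances(pops, n_alleles, n_boot=1000, seed=0):
    """Simulated bootstrap variance of the statistic under the three procedures."""
    rng = np.random.default_rng(seed)
    n = len(pops)
    reps = {"individuals": [], "populations": [], "two_stage": []}
    for _ in range(n_boot):
        # Procedure 1: redraw individuals inside every observed population
        ind = [rng.choice(p, size=len(p)) for p in pops]
        # Procedure 2: redraw populations, keep their observed samples
        sel = [pops[k] for k in rng.integers(0, n, size=n)]
        # Procedure 3: redraw individuals inside the populations drawn above
        two = [rng.choice(p, size=len(p)) for p in sel]
        reps["individuals"].append(mean_within_diversity(ind, n_alleles))
        reps["populations"].append(mean_within_diversity(sel, n_alleles))
        reps["two_stage"].append(mean_within_diversity(two, n_alleles))
    return {name: float(np.var(vals, ddof=1)) for name, vals in reps.items()}

# Synthetic data: 30 populations, 3 alleles, 25 individuals each (all hypothetical)
rng = np.random.default_rng(1)
freqs = rng.dirichlet([2.0, 2.0, 1.0], size=30)
pops = [rng.choice(3, size=25, p=f) for f in freqs]
variances = bootstrap_variances(pops, n_alleles=3, n_boot=500)
```

With enough replicates, the two-stage value would be expected to lie close to the sum of the other two, in line with the result stated above, although simulation noise means the match is only approximate.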

Table 2 Direct and bootstrap variance estimates (×10⁴) of gene diversity estimates for the complete data set (using 1000 bootstrap samples)

Because the population sample sizes are large, the bootstrap over individuals is not greatly modified by the sampling of nk − 1 individuals instead of nk, as proposed above (results unchanged for Var(ĥT) for the bootstrap over individuals: 0.058 × 10⁻⁴ by both methods). Another example was studied using a subset of the complete data set where 20 individuals were selected at random and without replacement in each of the 81 populations. Direct and bootstrap variance estimates were then computed as before. The results indicate that, with this sample size, the procedure of bootstrapping over individuals no longer provides a good estimate of the intrapopulation variance of GST (Table 3). Hence, we recommend the use of the direct analytical estimates in these situations.
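The two sampling operations mentioned in this paragraph can be sketched as follows, again assuming (for illustration only) that each population is stored as a NumPy array of individual labels. The function names are hypothetical; the first function requires at least m individuals in every population and the second at least two.

```python
import numpy as np

def subsample_without_replacement(pops, m=20, seed=0):
    """Select m individuals at random, without replacement, in each population."""
    rng = np.random.default_rng(seed)
    return [rng.choice(p, size=m, replace=False) for p in pops]

def resample_individuals_reduced(pops, seed=0):
    """Bootstrap over individuals with n_k - 1 draws (with replacement) instead of n_k."""
    rng = np.random.default_rng(seed)
    return [rng.choice(p, size=len(p) - 1, replace=True) for p in pops]
```

The first function builds a reduced data set of the kind used for Table 3; the second gives one bootstrap replicate under the modified sample size nk − 1.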

Table 3 Direct and bootstrap variance estimates (×10⁴) of gene diversity estimates for a subset of 20 randomly sampled individuals in each of the 81 populations (using 1000 bootstrap samples)

Conclusion

Resampling procedures, as emphasized by Crowley (1992), are often used without being validated, especially in the field of population genetics. Although seemingly appealing, intuitive resampling procedures that mimic the sampling of individuals and populations may turn out to be misleading. Here we have shown that bootstrap estimators of the total variances of ĥS, ĥT and ĜST are obtained by resampling over populations only, rather than over both populations and individuals. Nevertheless, we recommend using the direct estimators (Pons & Petit, 1995), which we have indirectly validated here. These estimators do not require the computing time necessary for resampling procedures and they are unbiased.