Many species occur in several fully or at least partly isolated subpopulations that are adapted to local environmental conditions. These populations are influenced by a number of forces that can alter their genetic structure. In general, genetic drift and diversifying selection will differentiate the gene frequencies, whereas gene flow, mutation and unifying selection will counteract these forces. Estimation of genetic differentiation between populations using molecular markers is an important topic in many areas of evolutionary science (Avise, 1993; Goldstein and Schlötterer, 1999). Most studies have used statistical measures derived from Wright's F-statistics (Wright, 1951, 1965; see also Excoffier, 2001; Weir and Hill, 2002).

Given an island model, it has been shown that genetic differentiation between populations, expressed as Wright's FST, should be the same regardless of whether it is estimated from neutral single-locus markers or from a neutral quantitative trait with an additive genetic basis (Lande, 1992; Lynch, 1994). Whitlock (1999) showed that this result also applies to other types of population structure. Hence, the amount of differentiation in neutral molecular markers (FST) can be used as an expectation for the level of differentiation of the neutral additive genetic variance of a quantitative trait among populations (termed QST by Spitze, 1993). If the value of QST is larger than the corresponding value of FST, there is some evidence against the neutrality hypothesis and in favour of diversifying selection among the populations. Conversely, if the value of QST is lower than FST, unifying selection is likely to have been the prevalent force. This reasoning, of course, assumes that the mutation rates are the same at the loci involved. An increasing number of studies have compared the differentiation in molecular markers and quantitative traits, and it seems that quantitative traits are generally more differentiated than molecular markers (reviewed in Merilä and Crnokrak, 2001; Reed and Frankham, 2001; McKay and Latta, 2002). However, except for Palo et al (2003), QST and FST estimates have so far not been compared in a statistically rigorous way.

Bayesian inferential methods have recently started to emerge in different areas of evolutionary biology (Beaumont and Rannala, 2004). They provide a number of advantages compared to classical frequentist approaches. One such advantage is a probabilistic measure of uncertainty in the form of credible intervals (CI), which can be obtained for parameter estimates and even for functions of the parameters. By contrast, maximum likelihood methods only provide standard errors for the parameters in the model (from the Fisher information matrix; Lynch and Walsh, 1998), and confidence intervals for functions of these parameters must be obtained with approximate methods (eg Podolsky and Holtsford, 1995; Waldmann and Andersson, 1998). Another important advantage of Bayesian methods is that many complex hierarchical model structures that were earlier intractable can now be easily investigated (Robert, 2001; Gelman et al, 2004). A Bayesian method for estimation of population genetic structure and FST on the basis of multilocus molecular markers was recently presented (Corander et al, 2003, 2004). This method considers the number of populations as an unknown quantity and determines the posterior probabilities of the structure configurations. Moreover, rather than conditioning the FST estimate on a single structure, it provides a model-averaged (robust) estimate of FST, in which the individual FST estimates from different population structures are weighted according to the posterior probabilities of the corresponding structures. For other approaches to Bayesian FST estimation, see Holsinger (1999) and Balding (2003).
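The weighting behind a model-averaged FST estimate can be illustrated with a few lines of Python. This is only a sketch of the arithmetic described above, not the BAPS implementation; the function name `model_averaged_fst` and the numbers in the usage line are hypothetical.

```python
def model_averaged_fst(fst_by_structure, posterior_probs):
    """Weight structure-specific FST estimates by the posterior
    probabilities of the corresponding population structures."""
    if len(fst_by_structure) != len(posterior_probs):
        raise ValueError("one posterior probability is needed per structure")
    total = sum(posterior_probs)
    # Renormalise in case only the most probable structures are supplied.
    return sum(f * p for f, p in zip(fst_by_structure, posterior_probs)) / total


# Hypothetical example: two candidate structures with probabilities 0.75 and 0.25.
averaged = model_averaged_fst([0.03, 0.05], [0.75, 0.25])
```

When one structure carries almost all of the posterior probability, the model-averaged value is practically the estimate conditional on that structure.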

In this study, we develop a Bayesian method for estimation of the differentiation of the additive genetic variance between populations. The method is applicable to single quantitative traits that can be assumed to have an additive genetic basis. The method uses a Gibbs sampling approach (Gelfand et al, 1995) for estimation of posterior distributions, and CIs are therefore easily obtained for any function of the variance parameters. We first obtain the model-averaged neutral FST estimate (Corander et al, 2003, 2004) in Pinus sylvestris, then calculate the weighted QST estimate (according to the posterior probabilities of the hidden structures) and compare the two estimates. Based on the results, we also discuss situations where comparison of the FST and QST parameters might be problematic.
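As an illustration of why CIs are easy to obtain from MCMC output, the following Python sketch computes an equal-tailed credible interval from a list of posterior draws of any quantity (a variance, a ratio of variances, and so on). The function name `equal_tail_interval` and the nearest-rank rule are choices made for this example only.

```python
import math


def equal_tail_interval(draws, level=0.95):
    """Equal-tailed credible interval from posterior draws (nearest-rank rule)."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[math.floor(tail * n)]
    upper = ordered[math.ceil((1.0 - tail) * n) - 1]
    return lower, upper


# Toy "posterior sample": the integers 1..100.
interval = equal_tail_interval(range(1, 101))
```

The same function can be applied to draws of a derived quantity such as QST, because each MCMC iteration yields one draw of every function of the sampled parameters.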

Materials and methods

Estimation and comparison of FST and QST

The recently developed Bayesian method for estimation of population genetic structure from molecular markers (Corander et al, 2003, 2004) treats both the number of populations and the allele frequencies of the molecular markers of each population as random variables. The posterior distribution of the population structure is estimated from an expression in which all allele frequencies have been analytically integrated out. If needed, a posterior sample of the allele frequencies can be generated afterwards. The accompanying program (BAPS) performs an exact Bayesian analysis by enumerative calculation when the number of original populations is small (fewer than nine). A Markov chain Monte Carlo (MCMC) algorithm is used when there are nine or more original populations. Based on the posterior distribution of the structure parameters, a measure of uncertainty regarding the specified populations is obtained for all pairwise comparisons. BAPS can also generate MCMC samples of the allele frequencies and of the FST statistic, estimated under all population structures that are considered likely in the light of the data (model-averaged estimate). The empirical samples in this study come from five different locations (original populations), and enumerative calculation was therefore used for the structure parameter. The FST estimate in BAPS was based on an MCMC chain of 50 000 iterations (after a burn-in of 10 000 iterations had been discarded). For details of this method and the program BAPS, see Corander et al (2003, 2004).

Given a polygenic trait with an additive genetic basis and a neutral island model in which populations derive from a common ancestral population, it has been shown that the between-population variance component ($\sigma_b^2$) and the within-population variance component ($\sigma_w^2$) can be used to formulate a quantitative trait analog of the molecular FST statistic, termed QST (Prout and Barker, 1989; Spitze, 1993). The molecular FST estimate can then serve as a neutral drift expectation against which the QST estimate of a quantitative trait can be compared.

Estimation of the FST and QST statistics has so far been conducted within the frequentist framework, mainly by using standard methods such as ANOVA and REML (an exception is presented in Palo et al, 2003). Rigorous tests of the hypothesis that FST and QST are equal are difficult to formulate because these statistics are often estimated from separate analyses, conditionally on the original sampling design.

Up to now, the only practicable approach has been to check whether the confidence intervals around the two estimates overlap. Confidence intervals can readily be obtained from the ANOVA for balanced designs (Lynch and Walsh, 1998). However, designs are seldom balanced, and only approximate confidence intervals can therefore be constructed (eg with the Delta method; Podolsky and Holtsford, 1995; Waldmann and Andersson, 1998). Bootstrap methods can also be used, but the bootstrap can be difficult to implement in multilevel hierarchical designs.

A Bayesian Gibbs sampling approach can be formulated using the hierarchical centering parameterization of Gelfand et al (1995). Consider the following nested random effects linear model

$$y_{ijk} = \mu + p_i + f_{ij} + e_{ijk}$$

where $y_{ijk}$ is the observed quantitative trait measurement of individual k belonging to family j in population i, $\mu$ the overall mean, $p_i$ the effect of population i, $f_{ij}$ the effect of family j in population i, and $e_{ijk}$ the residual. These parameters are distributed as

$$p_i \sim N(0, \sigma_p^2), \qquad f_{ij} \sim N(0, \sigma_f^2), \qquad e_{ijk} \sim N(0, \sigma_e^2)$$

The hierarchical centering parameterization is obtained by replacing $p_i$ with $\gamma_i$ and $f_{ij}$ with $\delta_{ij}$, so that $\gamma_i = \mu + p_i$ and $\delta_{ij} = \mu + p_i + f_{ij}$. Hence, $\gamma_i$ is centered around $\mu$ and $\delta_{ij}$ is centered around $\gamma_i$. This centering has been found to provide good mixing and convergence properties of the MCMC algorithm (Gelfand et al, 1995). In order to estimate QST, $\sigma_b^2$ is obtained directly from the population variance ($\sigma_b^2 = \sigma_p^2$), whereas the family variance $\sigma_f^2$ has to be converted into $\sigma_w^2$ by multiplication by a coefficient (c) that depends on the relationship of individuals within families ($\sigma_w^2 = c\,\sigma_f^2$). For half-sibs, full-sibs and cloned individuals, c is 4, 2 and 1, respectively (under the assumption of no dominance and epistasis). The quantitative trait analog of the molecular FST statistic is then estimated as (Prout and Barker, 1989; Spitze, 1993):

$$Q_{ST} = \frac{\sigma_b^2}{\sigma_b^2 + 2\sigma_w^2}$$
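The conversion from the model's variance components to QST can be written as a small Python function. This is a minimal sketch that assumes the standard form QST = σb²/(σb² + 2σw²) of Spitze (1993); the function name `qst` and the input values in the usage line are hypothetical.

```python
def qst(sigma_p2, sigma_f2, c=4.0):
    """QST from the population variance and the family variance.

    c converts the family variance into the within-population additive
    variance: 4 for half-sibs, 2 for full-sibs, 1 for clones
    (assuming no dominance and no epistasis).
    """
    sigma_b2 = sigma_p2        # between-population variance
    sigma_w2 = c * sigma_f2    # within-population additive variance
    return sigma_b2 / (sigma_b2 + 2.0 * sigma_w2)


# Hypothetical half-sib example: 8 / (8 + 2 * 4 * 1) = 0.5
example = qst(sigma_p2=8.0, sigma_f2=1.0, c=4.0)
```

Applying the function to each MCMC draw of the two variances gives a posterior sample of QST directly.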

We implemented this model using WinBUGS 1.4 (Spiegelhalter et al, 2003) and the code is available from the authors. The prior for $\mu$ was taken to be a very flat normal distribution with zero mean and variance $1/10^{-6}$ (that is, $10^{6}$). Recently, it has been argued that the commonly used inverse Gamma prior is not always uninformative for variance parameters (Gelman et al, 2004). Hence, we performed two separate analyses with Gamma (0.001, 0.001) and uniform ($10^{-6}$, $10^{6}$) distributions as priors for the inverses of the variances ($1/\sigma_p^2$, $1/\sigma_f^2$ and $1/\sigma_e^2$).
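For readers without access to WinBUGS, the structure of such a sampler can be sketched in plain Python. The code below is not the authors' WinBUGS code: it is a minimal Gibbs sampler for the hierarchically centered model with Gamma (0.001, 0.001) priors on the precisions only, and the function name `gibbs_qst`, the starting values and the simulated data are all assumptions made for illustration.

```python
import random


def gibbs_qst(data, c=4.0, n_iter=2000, burn_in=500, a=0.001, b=0.001,
              mu_prior_precision=1e-6, seed=1):
    """Posterior draws of QST; data[i][j] lists the trait values of family j
    in population i."""
    rng = random.Random(seed)
    n_pop = len(data)
    n_fam = sum(len(pop) for pop in data)
    n_obs = sum(len(fam) for pop in data for fam in pop)
    # Start from empirical means; all precisions start at 1.
    delta = [[sum(fam) / len(fam) for fam in pop] for pop in data]
    gamma = [sum(row) / len(row) for row in delta]
    mu = sum(gamma) / n_pop
    tau_p = tau_f = tau_e = 1.0
    draws = []
    for it in range(n_iter):
        for i, pop in enumerate(data):
            # Family means delta_ij, centered on the population mean gamma_i.
            for j, fam in enumerate(pop):
                prec = len(fam) * tau_e + tau_f
                mean = (tau_e * sum(fam) + tau_f * gamma[i]) / prec
                delta[i][j] = rng.gauss(mean, prec ** -0.5)
            # Population mean gamma_i, centered on the overall mean mu.
            prec = len(pop) * tau_f + tau_p
            mean = (tau_f * sum(delta[i]) + tau_p * mu) / prec
            gamma[i] = rng.gauss(mean, prec ** -0.5)
        # Overall mean mu (normal prior with mean zero).
        prec = n_pop * tau_p + mu_prior_precision
        mu = rng.gauss(tau_p * sum(gamma) / prec, prec ** -0.5)
        # Conjugate Gamma updates of the precisions
        # (random.gammavariate takes a shape and a scale = 1 / rate).
        ss_e = sum((y - delta[i][j]) ** 2
                   for i, pop in enumerate(data)
                   for j, fam in enumerate(pop) for y in fam)
        ss_f = sum((d - gamma[i]) ** 2
                   for i, row in enumerate(delta) for d in row)
        ss_p = sum((g - mu) ** 2 for g in gamma)
        tau_e = rng.gammavariate(a + n_obs / 2.0, 1.0 / (b + ss_e / 2.0))
        tau_f = rng.gammavariate(a + n_fam / 2.0, 1.0 / (b + ss_f / 2.0))
        tau_p = rng.gammavariate(a + n_pop / 2.0, 1.0 / (b + ss_p / 2.0))
        if it >= burn_in:
            sigma_b2 = 1.0 / tau_p
            sigma_w2 = c / tau_f
            draws.append(sigma_b2 / (sigma_b2 + 2.0 * sigma_w2))
    return draws


# Small simulated data set (hypothetical values): 3 populations,
# 6 families per population, 5 individuals per family.
sim = random.Random(7)
data = []
for pop_mean in (90.0, 100.0, 110.0):
    families = []
    for _ in range(6):
        fam_mean = sim.gauss(pop_mean, 2.0)
        families.append([sim.gauss(fam_mean, 4.0) for _ in range(5)])
    data.append(families)
draws = gibbs_qst(data, n_iter=600, burn_in=100)
```

The sketch omits thinning, multiple chains and convergence diagnostics, all of which a real analysis would need; with few populations the posterior of the between-population variance also remains sensitive to the choice of prior.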

Quantitative and molecular data from P. sylvestris

Seedlings of P. sylvestris were grown in a common garden experiment as described by García-Gil et al (2003). The experiment consisted of five populations along a latitudinal cline (67–40°N). The timing of terminal budset was scored twice per week. A terminal bud was defined as the stage when the stipules of the foliage leaves cover the shoot apex and the youngest foliage leaf is offset from the central axis of the shoot apex. The date of budset was defined as the number of days from sowing to the formation of the bud. The number of families per population was: 22 (Valsaín), 20 (Kolari, Lapinjärvi and Lithuania) and 10 (Puebla de Lillo). In all, 20 individuals per family were scored for the quantitative traits. We assume that the individuals within each family are related as half-sibs (Muona and Harju, 1989; Yang et al, 1996). For the microsatellite work, two individuals per family per population from the common garden experiment were analysed. Ten nuclear microsatellite primers, developed for Pinus taeda, were used to genotype the individuals: PtTX3025, PtTX3013, PtTX2146, PtTX2123 primers (Elsik et al, 2000) and 8846, 4516, 4527, 4528 primers supplied by Dr Auli Karhu. The PtTX2146 primer amplified three different polymorphic microsatellite loci, named PtX2146A, PtX2146B and PtX2146C. Out of the ten primers, six microsatellite loci were polymorphic and were used to genotype the 180 individuals. DNA was extracted from needles using the Qiagen DNeasy plant kit. The PCR volume was 25 μl and consisted of 50 ng of genomic DNA template, 0.2 μM of each primer, 0.2 mM of each dNTP, 2.5 μl of 10 × Taq buffer (500 mM KCl, 100 mM Tris-HCl, 1% Triton X-100, Promega), 2 mM of MgCl2 (Promega) and 2 units of Taq polymerase (Promega). Amplifications were performed using a Robocycler gradient 96 (Stratagene). The amplification protocol was: 5 min at 94°C; followed by 35 cycles of 1 min at 94°C, 30 s at 50°C, 1 min at 72°C; and finally one cycle of 10 min at 72°C. PCR amplifications were resolved using an ABI 377 DNA sequencer and allele scoring was performed using the GeneScan 3.1 and Genotyper 2.5 software.


P. sylvestris data

The allele frequencies for the original populations are summarized in Table 1. Analysis with BAPS resulted in two clusters, {Kolari, Lapinjärvi and Lithuania} and {Valsaín and Puebla de Lillo}, with a posterior probability of 0.994. The pairwise posterior probabilities that two populations have the same allele frequencies are presented in Table 2. The posterior mean of the FST estimates between the two clusters varied between 0.00650 and 0.0727 across loci, the overall mean being 0.0316 (95% CI: 0.0193–0.0438). We also estimated the inbreeding coefficient (FIS) within each cluster with the Hickory 1.0 program (Holsinger and Lewis, 2003). The 95% CIs of the posteriors of FIS overlapped considerably (northern cluster: 0.0405–0.151 and southern cluster: 0.0346–0.200). Thus, the estimated level of inbreeding was low (corresponding well with the assumed half-sib relations within families) and did not differ between the two clusters.

Table 1 Microsatellite allele frequencies for the five original Pinus sylvestris populations
Table 2 Pairwise posterior probabilities for allele frequencies of two populations being the same

Given that the two-cluster configuration in the BAPS analysis had a very high posterior probability, the estimation of the QST differentiation of bud set date was carried out conditionally on this structure. Two parallel MCMC chains were run for 550 000 iterations for each type of prior (Gamma, uniform) for the variance parameters. The first 50 000 iterations were discarded from each chain as burn-in, and the chains were thinned by storing every tenth iteration, based on autocorrelation plots (plots not shown). The Gelman-Rubin convergence statistic in WinBUGS 1.4 strongly supported the conclusion that the chains had converged (R close to 1) for all variance parameters with both the Gamma and the uniform priors. The two priors produced practically identical results, and we will therefore only present the runs based on the uniform priors. The MCMC chains resulted in a posterior QST density that was bimodal, with peaks close to 0.2 and 1 (Figure 1). In order to investigate this further, we estimated the family variance $\sigma_f^2$ for each cluster (extracted from the full model). The northern cluster had a posterior mean of $\sigma_f^2$ of 125.9 (95% CI: 110.7–145.2), and the southern cluster a posterior mean of $\sigma_f^2$ of 62.93 (95% CI: 38.14–97.88). Hence, the within-cluster variation was very different for the two groups. The probability Pr($\sigma_{f,\mathrm{North}}^2 > \sigma_{f,\mathrm{South}}^2$) was estimated by dividing the number of iterations in which $\sigma_{f,\mathrm{North}}^2 > \sigma_{f,\mathrm{South}}^2$ by the total number of iterations, and was found to be 0.999. We also estimated QST based on the original population configuration for comparative purposes. The mean, mode and median were high (0.765, 0.817 and 0.779, respectively), but the CI of the posterior was wide (95% CI: 0.490–0.964).
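The two pieces of bookkeeping in this paragraph, the number of retained draws and the posterior probability that one variance exceeds the other, can be sketched in Python as follows. The function names and the toy draws are hypothetical; only the chain settings are taken from the text.

```python
def retained_draws(n_chains, n_iter, burn_in, thin):
    """Number of stored MCMC draws after burn-in and thinning."""
    return n_chains * ((n_iter - burn_in) // thin)


def prob_greater(draws_a, draws_b):
    """Monte Carlo estimate of Pr(a > b) from paired posterior draws:
    the fraction of iterations in which a exceeds b."""
    if len(draws_a) != len(draws_b):
        raise ValueError("draws must be paired by iteration")
    return sum(x > y for x, y in zip(draws_a, draws_b)) / len(draws_a)


# Chain settings from the text: 2 chains, 550 000 iterations,
# 50 000 burn-in, every tenth iteration stored.
total = retained_draws(n_chains=2, n_iter=550_000, burn_in=50_000, thin=10)

# Toy paired draws (hypothetical values): a exceeds b in 3 of 4 iterations.
p = prob_greater([120.0, 130.0, 60.0, 125.0], [70.0, 65.0, 90.0, 50.0])
```

With the settings above, `total` equals the sample size of 100 000 on which the posterior density of QST is based.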

Figure 1 The estimated posterior QST density of date of budset in Pinus sylvestris. The density is based on two parallel Gibbs chains with a burn-in of 50 000 and sampling of every 10th iteration, yielding a total sample size of 100 000.


It is shown here how the QST statistic, which is commonly used for estimation of differentiation in quantitative traits, can be formulated in a Bayesian framework. A Gibbs sampling approach with hierarchical centering was used in the estimation, because it has been shown to work well (Gelfand et al, 1995). When the molecular markers from P. sylvestris were analysed with the BAPS program, the estimated neutral population genetic structure consisted of two clusters that correspond very well to the north/south geographic distribution. However, when QST was estimated for the bud set data from P. sylvestris, the MCMC chains produced a bimodal posterior of QST. The evident reason for the bimodality is that the family variances differed considerably between the two clusters because of the strong cline within the northern cluster (García-Gil et al, 2003). An earlier study found that microsatellites and other molecular markers are hardly differentiated at all, whereas timing of bud set is very different between the original populations of the northern cluster (Karhu et al, 1996). Consequently, the assumption of variance homogeneity at the family level is violated, and no comparison between the QST and the FST estimates can therefore be made.

Statistical and evolutionary assumptions behind QST

One of the critical assumptions when estimating QST is that the populations are evolving at the same rate, that is, that genetic drift is converting the family variance within populations to the between-population component at the same rate. It is therefore of fundamental importance to investigate a priori whether the additive genetic variance (or the heritability) differs considerably between populations. Several of the studies included in recent review articles (Merilä and Crnokrak, 2001; McKay and Latta, 2002) have reported average heritabilities. Since relatively few authors have reported both QST and population-specific heritability estimates in their studies, it is difficult to evaluate if heterogeneity in within-population variance is a common problem.

Another assumption that is more challenging to verify is that the molecular markers and the genes of the quantitative traits should have the same mutation rates. Although it is inherently difficult to estimate proper mutation rates of both molecular markers and quantitative traits, it has been suggested that different marker types can have considerably dissimilar mutation rates (Balloux et al, 2000). Moreover, theoretical studies have shown that the FST statistic is very sensitive to variations in the mutation rate (Fu et al, 2003) and to unequal migration rates between populations (Wilkinson-Herbots and Ettridge, 2004).

The level of inbreeding was low and did not differ between the northern and southern clusters. Thus, the bias introduced by assuming no inbreeding when estimating QST should be small. Theoretically, it is possible to derive a Bayesian QST statistic that takes inbreeding into account. Prior information on the level of inbreeding could easily be obtained from molecular markers. Unfortunately, it is in practice much more difficult to specify a model for the dominance variance components that are introduced by inbreeding (De Boer and Hoeschele, 1993). Hence, we did not try to implement an inbreeding QST for this data set.

In Waldmann and Andersson (1998), heritability estimates varied considerably between populations for some traits (flowering date in Scabiosa canescens varied between 0.098 and 1.49), but less so for others (flowering date in S. columbaria varied between 0.198 and 0.450). However, the confidence intervals of the heritabilities were wide in that study and overlaps between populations were common. Large variation in population-specific heritability levels was also found for some traits in two rare plants, whereas other traits displayed very similar heritability levels between populations (Petit et al, 2001). A similar result was found by Widen et al (2002) for Brassica cretica. The within-population genetic variance varied between 0.111 and 11.0 for internode length, whereas that of node number only varied between 12.9 and 64.4 between populations of this species.

Moreover, it should also be noted that the (frequentist) ANOVA and REML methods produce point estimates for the variance components even in the presence of considerable variance heterogeneity. For example, an REML analysis of the bud set data in this study produced a QST estimate of 0.274 (when estimated conditional on the northern and southern cluster structure). An obvious indication of variance heterogeneity can be obtained by checking whether the residuals deviate from a straight line in the Normal Quantile–Quantile plot (Figure 2; see also Pinheiro and Bates, 2000).
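The coordinates of such a Normal Quantile–Quantile plot are simple to compute. The following Python sketch pairs standardized, sorted residuals with standard normal quantiles; the function name `normal_qq_points` and the plotting positions (i + 0.5)/n are choices made for this example, and the plotting itself is left to whatever graphics tool is at hand.

```python
from statistics import NormalDist, mean, pstdev


def normal_qq_points(residuals):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    n = len(residuals)
    centre, spread = mean(residuals), pstdev(residuals)
    # Standardize and sort the observed residuals.
    observed = sorted((r - centre) / spread for r in residuals)
    # Standard normal quantiles at plotting positions (i + 0.5) / n.
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, observed))


# Toy residuals (hypothetical values).
points = normal_qq_points([1.0, 2.0, 3.0])
```

Residuals from a homogeneous normal model should fall close to a single straight line; residuals pooled from groups with different variances tend to show separate linear trends with different slopes.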

Figure 2 Normal Quantile–Quantile plot of standardized residuals from an REML analysis of the budset data using the northern and southern clusters as population levels. Normally distributed residuals should display a straight line. This plot clearly shows two different linear trends.

Recently, a theoretical study by Lopez-Fanjul et al (2003) showed that QST can be severely biased if there is nonadditive gene action (dominant and/or epistatic loci contribute to the phenotype). Hence, comparison of QST and FST for inference of the relative importance of drift and selection in population differentiation is limited to purely additive traits. In this study, we have assumed that individuals within families are related as half-sibs, which seems reasonable considering the mating system of P. sylvestris (Muona and Harju, 1989; Yang et al, 1996) and the FIS result obtained with Hickory. However, it is possible that a small fraction of full-sibs is present and introduces a small amount of error due to dominance (Lynch and Walsh, 1998).

So far, no theoretical investigation has been undertaken on how different selection regimes within populations influence QST. It has been shown, however, that migration (or pollen dispersal) between local populations in which selection favours different trait values can maintain substantial amounts of genetic variation at the between-population level (Barton and Keightley, 2002). Regarding our data, one could suspect that divergent selection is more prevalent within the northern cluster, and that uniform selection (or drift) is the dominating force within the southern cluster, where the original populations occur in a rather similar environment (latitude). In fact, analyses of bud set date at the within-cluster level (using the original populations) with WinBUGS 1.4 revealed that mean QST was 0.728 for the northern cluster and 0.0979 for the southern cluster (these estimates should only be taken as very approximate because the CIs were very wide).

Differentiation in pine and other forest trees

Two recent studies have compared differentiation in molecular markers and quantitative traits in different Pinus species. Yang et al (1996) compared allozyme differentiation and quantitative genetic differentiation in P. contorta ssp. latifolia and found that specific gravity, stem diameter, stem height and branch length had significantly higher QST (between 0.133 and 0.195) than FST (0.019) values. In P. pinaster (Gonzalez-Martínez et al, 2002), the QST values were very high for stem form, total height growth and survival at 30 years of age (0.973, 0.791 and 0.732, respectively), and significantly higher than the allozyme FST (0.048). However, Gonzalez-Martínez et al (2002) also report considerably lower QST estimates (0.12 and 0.20) for height growth of two other pine species from the Mediterranean.

Strong diversifying selection is apparent in several widespread tree species, as QST values are often much higher than neutral FST values, especially for timing of bud burst (Le Corre and Kremer, 2003). However, the QST values are not always extremely high. Instead, it is the FST values that are very low. This is not surprising because many tree species are wind pollinated and distributed over large areas. For example, a recent study in Picea glauca revealed that QST estimates of 10 traits ranged between 0.035 and 0.246 (Jaramillo-Correa et al, 2001). Of those, only 8-year height (QST=0.082), 13-year height (QST=0.069), total wood density (QST=0.102) and date of budset (QST=0.246) were higher than the neutral differentiation of allozymes (GST=0.014) and ESTPs (GST=0.014). Unfortunately, no heritability estimates were reported for the separate populations.

In two other theoretical studies (Latta, 1998; Le Corre and Kremer, 2003), the differentiation at neutral molecular markers, at the QTL behind the trait and in the adaptive trait itself was compared under different selection regimes and with different levels of gene flow. Their general conclusion was that population differentiation more commonly shows patterns of genetic variability that differ between markers, QTL and the adaptive trait than patterns that are the same. Le Corre and Kremer (2003) found that the highest disparity between the three levels occurred under highly diversifying selection and high gene flow, a situation that corresponds very well to pine biology.

In conclusion, we have presented a Bayesian method for estimation of QST that can utilize information on the actual population structure estimated using neutral molecular markers. The method should work well when the quantitative traits of the populations differentiate at the same rate (ie have similar variances). However, when the heritability differs substantially between populations, the MCMC estimation may result in bimodal posteriors. This problem was illustrated by applying the method to a data set of date of bud set in P. sylvestris. We also recommend that future studies, in addition to presenting QST estimates, present between-family variance or heritability estimates for each trait and population. Moreover, when using ANOVA and REML methods, it is of particular importance to check that the residuals are normally distributed and do not follow any deviating trends. Finally, many evolutionary forces can potentially bias FST and QST comparisons, and such comparisons should therefore be interpreted with care.