Contents

1 Introduction

2 Population structure and genetic differentiation

3 Effective size

 3.1 Equal contribution from subpopulations

3.1.1 Island model

3.1.2 Stepping stone model

3.1.3 The neighbourhood model

3.1.4 The general model with an equal contribution from subpopulations

 3.2 Variable contributions from subpopulations

3.2.1 Effective size for monoecious species

3.2.2 Constant and equal subpopulation size

3.2.3 Extinction and recolonization

 3.3 The pattern of inheritance

3.3.1 Dioecious species

3.3.2 Sex-linked loci

3.3.3 Haploid species

3.3.4 Non-nuclear gene inheritance

4 The effects and implications of population subdivision

 4.1 The effects

 4.2 Implications

5 Some other considerations of subdivision

 5.1 Complete subdivision

 5.2 More than two levels of subdivision

 5.3 Realization of the asymptotic effective size

References

1 Introduction

Effective population size is a key parameter in population and quantitative genetics. It has important applications in evolution theory, domestic animal and plant breeding and conservation biology. As a measure of the strength of the stochastic process in a finite population, effective size determines the rate of decay of neutral genetic variation. It also affects the behaviour of genes under selection and other systematic forces, and thus influences the variance of selection response (Hill, 1985), the selection limits (Robertson, 1960) and the short- and long-term survival of populations under conservation (Lande & Barrowclough, 1987; Lynch et al., 1995).

The concept of effective size was introduced by Wright (1931) and developed subsequently by Crow and colleagues (see Crow & Kimura, 1970, pp. 345–364). They distinguished between a variance effective size, which predicts the variance of change in gene frequency resulting from one generation of genetic sampling, and an inbreeding effective size, which predicts the rate of decrease in heterozygosity. In simpler cases, the two effective sizes are the same but, for more complex situations, it is necessary to make a distinction (Kimura & Crow, 1963a). Since this pioneering work, there has been much progress in developing equations to predict this important parameter. Many factors, such as sex ratio, mating system, variance in reproductive success, fluctuation of population size over generations, selection, overlapping generations and pattern of inheritance (e.g. haploid, polyploid, sex-linked), have been identified, and their effects on effective size quantified. These developments were reviewed by Caballero (1994). However, an important factor, the subdivision of populations, was only included briefly in his review.

The importance of population subdivision is reflected in both its substantial impact on effective size (as shown below) and its ubiquity among organisms. Many species naturally form subpopulations in the form of herds, flocks, schools, colonies or other types of aggregations because of intrinsic factors, such as behavioural segregation. In addition, natural habitats are typically patchy, with favourable areas interspersed among unfavourable areas. Even in the ideal case of a uniformly favourable habitat, the population is still subdivided to some extent if the habitat is large, because the distance of individual migration is usually much smaller than the entire range of the habitat; individuals are more likely to reproduce locally, leading to the so-called ‘isolation by distance’ (Wright, 1943). Now, more and more habitats are fragmented by human activities, and many organisms live in more or less isolated islands.

Recognizing the prevalence of population subdivision and its effects on genetic variation, several authors have developed prediction equations for effective size of subdivided populations (also called metapopulations) in recent years. In this paper, we will review these equations and their inter-relations and implications for conservation biology, evolutionary theory and breeding applications. Derivations and extensions to previous developments are made for some cases. Particular attention is paid to keeping the symbols, parameters and equations for subdivided populations consistent with those reviewed previously (Caballero, 1994) for a single unsubdivided population.

2 Population structure and genetic differentiation

The effective size of a metapopulation is the size of a Wright–Fisher idealized population that would give rise to the variance of change in gene frequency or the rate of inbreeding observed in the actually subdivided population under consideration. An idealized population is a monoecious population with constant size over discrete generations, random mating including selfing in random amounts, an equal probability of contributing gametes to the next generation from different parents and with respect to autosomal loci without mutation and selection (Fisher, 1930; Wright, 1931; see Caballero, 1994). The main violation of the above ideal situation in a subdivided population is nonrandom mating; mating is more likely to occur within subpopulations rather than among subpopulations. This kind of nonrandom mating will inevitably lead to genetic differentiation (the difference in gene frequency) among subpopulations, which is reflected as the relative difference between the average observed heterozygosity over subpopulations and the theoretical heterozygosity assuming no subdivision (completely random mating). Genetic differentiation may result from natural selection favouring different genotypes in different subpopulations and may also result from random genetic drift acting independently within local demes.

Migration, the gene flow among subpopulations, tends to impede genetic differentiation. As a homogenizing factor, it offsets the differentiating forces so that the population may reach an equilibrium. With one generation of complete migration (random mating in the entire population), the differentiation among subpopulations disappears completely. With no migration, the subpopulations differentiate continuously until alleles become either lost or fixed in different subpopulations. We are generally interested in incomplete subdivision in which genetic differentiation among subpopulations is counterbalanced by a certain degree of gene flow.

As another homogenizing force, mutation has little effect on genetic differentiation compared with migration, because it occurs at a rate of the order of 10⁻⁵ or less. Throughout the paper, we concentrate on selectively neutral variation and consider genetic drift as the sole differentiating force and migration among subpopulations as the sole homogenizing force.

The interaction between genetic drift and migration, which determines the equilibrium differentiation, the steady-state rate of decay of genetic variation and effective size, is largely influenced by the hierarchy of population structure. A hierarchical structure means that individuals can be grouped into progressively inclusive (nested) levels such as sublines, lines and the whole population. The actual geographical structure of a natural population may be quite complicated. For mathematical tractability, some simplified (idealized) population structure models, such as the island model, stepping stone model and neighbourhood model, have been proposed. Results from these models are, however, still pertinent to the real world and have implications for evolutionary theory, plant and animal breeding and conservation biology.

Given the geographical structure and demographic properties of a population, its genetic differentiation will eventually reach an equilibrium after a sufficiently long time. Wright's F-statistics (Wright, 1951) are very useful in describing the genetic structure (or the partition of genetic variation) of a metapopulation irrespective of its geographical structure, the degree of isolation among subdivisions and without getting into details such as gene and genotypic frequencies. As will be shown below, the F-statistics can also be used to derive the effective size of metapopulations, making the expressions more concise and meaningful. F-statistics can be predicted using information on the geographical structure and demographic properties of the population and can also be estimated using gene and genotypic frequency data on allozyme and DNA polymorphisms (see Hartl & Clark, 1997).
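As a minimal illustration of how differentiation relates to heterozygosity, the following Python sketch computes FST for a single biallelic locus from subpopulation allele frequencies. It assumes equally sized subpopulations and random mating within each, so that the expected heterozygosity in subpopulation i is 2qi(1−qi). The function name and its input are illustrative only and are not taken from any of the cited works.

```python
def fst_from_frequencies(freqs):
    """Return FST = (HT - HS) / HT for one biallelic locus.

    freqs: allele frequencies q_i, one per subpopulation.
    Assumptions (not from the source): equal subpopulation sizes and
    random mating within subpopulations, so H_i = 2 * q_i * (1 - q_i).
    """
    n = len(freqs)
    if n == 0:
        raise ValueError("need at least one subpopulation")
    q_bar = sum(freqs) / n
    # Heterozygosity expected if the whole population mated at random.
    h_total = 2.0 * q_bar * (1.0 - q_bar)
    if h_total == 0.0:
        raise ValueError("locus is fixed in the whole population; FST is undefined")
    # Average heterozygosity expected within subpopulations.
    h_within = sum(2.0 * q * (1.0 - q) for q in freqs) / n
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    print(fst_from_frequencies([0.2, 0.8]))  # partial differentiation
    print(fst_from_frequencies([0.0, 1.0]))  # alternative alleles fixed
```

With identical frequencies in all subpopulations the function returns zero, and with alternative alleles fixed in different subpopulations it returns one, matching the two extremes of complete migration and no migration described above.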

3 Effective size

The variance and inbreeding effective sizes are not always the same in a metapopulation. However, if the total metapopulation is not completely subdivided, and its size and structure are constant over generations, then, after a sufficient number of generations, the rate of inbreeding and the variance of change in gene frequency will gradually reach their respective asymptotic values, which correspond to a unique effective size called eigenvalue effective size (Ewens, 1982) or asymptotic effective size (Chesser et al., 1993; Wang, 1997a,b). The long-term effect of drift and inbreeding in a metapopulation on neutral or weakly selected alleles can be predicted by the asymptotic effective size. In this part, we are concerned only with the asymptotic effective size, which will be called effective size (Ne) for simplicity hereafter.

3.1 Equal contribution from subpopulations

Most prediction equations for the effective size of a subdivided population assume, explicitly or implicitly, that all subpopulations contribute equally to the next generation and have an equal and constant size over generations. These assumptions simplify the prediction considerably, although they may not be true in natural populations.

3.1.1 Island model

Wright (1943) considered a population subdivided into n subpopulations, each being an idealized population of size N except for receiving a proportion m of immigrants taken randomly from the entire population per generation. This is the so-called island model. He derived, using the approach of the variance in gene frequency, an expression for effective size

which was also derived by Nei & Takahata (1993) using Slatkin's (1991) formula for the coalescence time of two randomly chosen homologous genes. Equation (1) indicates that subdivision always results in an increase in Ne with this model. When Nm is large, Ne ≈ nN, as expected for a panmictic population; but Ne can be much greater than the metapopulation size (nN) when Nm is small. If n ≫ 1, we have

approximately. When migration is low (4Nm ≪ 1), each subpopulation acts as a single individual, and Ne ≈ n/(4m) is mainly determined by the number of subpopulations and migration rate and can be much larger than the census size nN. However, a strong assumption made in deriving eqns (1) and (2) is that different subpopulations contribute equally to the next generation. As will be shown below, violation of this assumption may lead to the opposite result, a decrease in Ne with subdivision.

Natural populations are seldom so ideal; there may be nonrandom mating, non-Poisson variance in reproductive success and other complications within subpopulations as reviewed by Caballero (1994). Wang (1997a) considered a monoecious metapopulation with partial selfing at rate β, an arbitrary distribution of family size (with variance S²k) and both pollen and seed migration with rates dp and ds, respectively. The resulting equation for Ne is complicated, but a good approximation can be obtained from eqn (2) by replacing N with the effective size of a subpopulation (NeS), i.e.

where the total migration rate is m ≈ dp+ds. The effective size of a subpopulation is

where

(see Caballero, 1994). Equation (3) shows that any factor that acts to increase (or decrease) the effective size of subpopulations will also increase (or decrease) the effective size of the metapopulation. However, this is important only when migration is high (4NeSm ≫ 1).
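The limiting behaviour described for the island model can be checked numerically. The sketch below assumes that the large-n approximation has the form Ne ≈ nS[1 + 1/(4Sm)], where S stands for N in the ideal case or NeS otherwise; this form is an assumption chosen because it reproduces both limits quoted in the text (Ne ≈ nN for high migration and Ne ≈ n/(4m) for low migration). The function name is hypothetical.

```python
def island_ne(n, size, m):
    """Approximate effective size of an island-model metapopulation.

    n    : number of subpopulations (assumed large)
    size : subpopulation size N, or its effective size NeS
    m    : total migration rate per generation (m > 0)
    Assumed form (see lead-in): Ne ~ n * size * (1 + 1 / (4 * size * m)).
    """
    if m <= 0:
        raise ValueError("migration rate must be positive")
    return n * size * (1.0 + 1.0 / (4.0 * size * m))


if __name__ == "__main__":
    n = 10
    for m in (0.5, 1e-4):
        small, large = island_ne(n, 100, m), island_ne(n, 200, m)
        # Ratio near 2 means subpopulation size matters; near 1 means it does not.
        print(f"m={m}: Ne(100)={small:.0f}, Ne(200)={large:.0f}, ratio={large / small:.3f}")
```

Under these assumptions, doubling the subpopulation size roughly doubles Ne when migration is high, but changes it by only a few per cent when 4NeSm is much smaller than one, where Ne stays close to n/(4m).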

For a subdivided dioecious population with different numbers of males (Nm) and females (Nf) within subpopulations and different migration rates of males (dm) and females (df), predictions of Ne have been made by Chesser and colleagues for equal family size and Poisson distribution of family size (Chesser et al., 1993) and by Wang (1997b) for an arbitrary distribution of family size. To a good approximation, Wang's eqn (28) also reduces to eqn (3) above with m=½(dm+df) and NeS=16/ (Pm,mm+2Pm,mf+Pm,ff+Pf,mm+2Pf,mf+Pf,ff) (Nagylaki, 1995), where Pu,vw is the probability that two individuals of sexes v and w taken at random from the same subpopulation before migration come from the same parent of sex u (u, v, w=m for males, f for females). Pu,vw is given by

and

where σ²uv denotes the variance of the number of offspring of sex v from a parent of sex u, and σu,mf represents the covariance of the numbers of male and female offspring from a parent of sex u.

Equation (3) is only an approximate prediction. From exact solutions of recurrence equations for the probability of identity by descent (PIBD) or more accurate equations for Ne, it has been shown that male migration is more important than female migration when Nm<Nf (Wang, 1997b). This is because a male immigrant is expected to contribute a higher proportion of genes to the next generation than a female migrant when Nm<Nf. However, the difference between the effects of male and female migration is small.

3.1.2 Stepping stone model

In natural populations, individuals are often distributed more or less discontinuously to form colonies (subpopulations), and individuals or gametes are mainly exchanged between adjacent or nearby colonies. Kimura (1953) first proposed a model, called the ‘stepping stone model’, to analyse such a structure. Based on the spatial arrangements of the subpopulations, the model is classified into one, two, three or higher dimensions. Maruyama (1970) considered a monoecious population subdivided into an even number, n, of colonies, each of size N and arranged in a circular stepping stone model. Migration between adjacent colonies occurs at a rate of ½m in each generation (m≠0). By two different methods, Maruyama (1970) derived that

and

These equations show that, when the absolute number of migrant gametes from (or to) each subpopulation (2Nm) is small, Ne is approximately proportional to n2, i.e. the square of the number of subpopulations, as pointed out by Kimura & Crow (1963b). On the contrary, when 2Nm is large, there is little differentiation among subpopulations, and the metapopulation behaves as a large panmictic population. The equality 2Nm=n2 represents a turning point at which one of the equations becomes valid and the other one breaks down. At the turning point, however, both approximate equations give the same and the least level of accuracy, underpredicting Ne by a factor of 2 (Maruyama, 1970).

Comparing eqns (6) and (7) with eqn (2), we see that the stepping stone model is, in fact, similar to the island model, in that a turning point determined by the effective number of migrant gametes (2NeSm) and the number or squared number of subpopulations divides the population into two classes, effectively panmictic and subdivided populations. Extrapolating the expression of effective size for the stepping stone model from that for the island model, we obtain

Although eqn (8) is not derived analytically, it provides better predictions than eqns (6) and (7) across the whole range of all parameters (N, n and m), the greatest improvement occurring at the turning point, as expected. This can be verified by the numerical examples of Maruyama (1970; tables 1 and 3).

Prediction equations for Ne for other stepping stone models are not available, but the relation between Ne and n, N and m should, in principle, be similar to that shown above. When subpopulations are not ideal, NeS should be used instead of N in eqn (8).

3.1.3 The neighbourhood model

The island and stepping stone models assume discrete subpopulations, i.e. there is an area between colonies that is not inhabited. A natural population may have a continuous distribution of its individuals across the habitat, but it is not necessarily a random mating unit, because the distance of individual migration is usually much smaller than the entire distribution range of the population. Wright (1943, 1946) considered a model in which a population is distributed uniformly over a large territory, but parents of any given individual are drawn at random from a small surrounding region. The individuals in this region in a continuum constitute a random mating unit, called a ‘neighbourhood’ (Wright, 1946). The neighbourhood size is determined by the migration distance (or rather the variance in migration distance) and population density (number of individuals per unit area). For a diploid, monoecious population distributed uniformly along a circular (one-dimensional) habitat, Maruyama (1971) derived that

and

where σ² is the variance of dispersion distance, D is the population density, L is the length of habitat and NT=LD is the metapopulation size. When the population is distributed on a linear habitat with two ends, eqns (9) and (10) and the conditional inequalities hold if L is replaced by 2L (Maruyama, 1971). Therefore, when Dσ² is small, the effective size of a linear population is twice that of a circular population. When Dσ² is large, however, both populations behave as panmictic units and have the same effective size, NT. The turning point that determines roughly whether the population behaves like a panmictic one or not depends on both Dσ² (a local property) and L (a property of the metapopulation or habitat). This property is similar to that of the circular stepping stone model (see eqns 6 and 7). With the same arguments given for the stepping stone model, we immediately see that the equations before and after the turning point could be combined to give more general predictions,

for a circular structured population, or the same expression replacing L with 2L for a linear structured population.

For a population distributed uniformly and continuously on a habitat of a torus-like surface (two-dimensional) with size L×L and density D, numerical examples indicated that

and

(Maruyama, 1972). Comparing eqn (9) with eqn (12), we note that, for a given size NT and a given small Dσ², one-dimensional structures result in a much larger Ne than two-dimensional structures. When the habitat occupied by a population is rectangular, eqns (12) and (13) and the conditional inequalities are valid if Dσ² is replaced by ½Dσ² (Maruyama, 1972). Therefore, for small values of Dσ², a rectangular structure results in a larger Ne than the torus-like structure. By analogy with eqns (8) or (11), a combination of eqns (12) and (13) gives more general and accurate predictions.

In contrast to one-dimensional populations (eqns 9 and 10), the turning point for two-dimensional populations depends only on Dσ², independently of the habitat size. In this respect, two-dimensional, continuously distributed populations are similar to populations in the island model (see eqn 2).

3.1.4 The general model with an equal contribution from subpopulations

In the above, different geographical structures of a metapopulation lead to different prediction equations for effective size. The parameters in these equations, such as migration rate and variance in dispersion distance, are difficult to estimate in practice (Slatkin, 1985). Moreover, if the number of dimensions in the stepping stone model and neighbourhood model exceeds two, it is also difficult to derive an analytical expression for effective size. Furthermore, the structure of natural populations may be much more complicated than that described by the three models. Therefore, a general expression for effective size that is independent of the geographical structure of the population will be very useful.

F-statistics describe the genetic architecture of a subdivided population in any geographical structure, and they can be estimated using protein or DNA polymorphism data. If effective size can be expressed in terms of F-statistics, there will be no need to consider the geographical structure of the population and to estimate m and σ². Here, we derive such an equation for effective size, according to the approach of the variance in gene frequency (although a similar derivation can be made by the inbreeding approach).

Assume that a population is subdivided into n subpopulations with the same census size N and effective size NeS. Let qi and q′i be the frequencies of a given allele at an arbitrary neutral locus in the ith subpopulation in generations t and t+1, respectively. The change in gene frequency caused by genetic drift in the ith subpopulation is Δqi = q′i − qi. The variance of change in gene frequency can be derived using a procedure shown in Caballero (1994) as VΔqi = qi(1−qi)/(2NeS). The mean change in gene frequency for the metapopulation is

Ignoring the second-order terms introduced by correlations among the Δqi because of the finite number of subpopulations, we obtain

In the subdivided population, the average heterozygosity is

(Hartl & Clark, 1997), where

Inserting the relation into eqn (14) and equating VΔq̄ to q̄(1−q̄)/(2Ne) by definition, we obtain

Equation (15) is essentially the same as that derived by Wright (1943) except for NeS instead of N. Therefore, the effective size of a subdivided population turns out to be very simple in terms of F-statistics and effective subpopulation size. All the complications within a subpopulation (such as nonrandom mating, variance in family size, etc.) can be taken into account by using the effective subpopulation size, which has been reviewed by Caballero (1994). For a monoecious population with partial selfing rate β, for example, NeS is given by eqns (4) and (5), and the effective size of the metapopulation is
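The steps of this derivation can be summarized as follows. This is a sketch in our own notation, assuming equal subpopulation sizes, independent drift among subpopulations and the standard relation between the average within-subpopulation heterozygosity and FST; it is not quoted from the cited sources.

```latex
\begin{align}
V_{\Delta \bar q} &\approx \frac{1}{n^{2}} \sum_{i=1}^{n} V_{\Delta q_i}
  = \frac{1}{n^{2}} \sum_{i=1}^{n} \frac{q_i(1-q_i)}{2N_{eS}}, \\
\frac{1}{n} \sum_{i=1}^{n} 2q_i(1-q_i) &= 2\bar q(1-\bar q)(1-F_{ST}), \\
V_{\Delta \bar q} &\approx \frac{\bar q(1-\bar q)(1-F_{ST})}{2nN_{eS}}
  \equiv \frac{\bar q(1-\bar q)}{2N_e}
  \;\Longrightarrow\; N_e = \frac{nN_{eS}}{1-F_{ST}} .
\end{align}
```

In this form, Ne equals the total effective number nNeS when FST = 0 and increases without bound as FST approaches one, in line with the statement that equal contributions from subpopulations always increase Ne.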

3.2 Variable contributions from subpopulations

Most previous analyses on metapopulations have assumed that all subpopulations contribute an equal number of offspring to the next generation and have the same constant size over generations. Natural populations are seldom so ideal, and all of their demographic parameters may change in space and time. In the extreme case, a particular subpopulation may become extinct and be replaced by migrants from other subpopulations. A general model has been proposed and analysed by Whitlock & Barton (1997) recently. They found that, when subpopulations are allowed to contribute differentially, subdivision may lead to a great decrease in Ne. This was also found by Gilpin (1991) and Hedrick & Gilpin (1997) using simulation studies incorporating both realistic features of population ecology (e.g. extinction and recolonization) and population genetics.

3.2.1 Effective size of monoecious species

Whitlock & Barton (1997) considered a monoecious population subdivided into n subpopulations, whose sizes may differ and change over generations. Each subpopulation was assumed to be an ideal Wright–Fisher population but, as shown below, their model can easily be extended to non-ideal situations.

Consider a population subdivided into n subpopulations. In each discrete generation, reproduction is followed by migration. At generation t, the ith subpopulation has Ni individuals and contributes a total number of N′i progeny to the next generation. Therefore, the fitness of the ith subpopulation at generation t is wi = N′i/Ni, with

(the metapopulation size is constant). Let Fij be the probability of identity by descent (PIBD) of two alleles chosen at random from subpopulations i and j, respectively, at generation t. After reproduction but before migration, the average PIBD for alleles taken from the ith subpopulation is increased to

where NeSi is the effective size of subpopulation i. The PIBD for alleles from different subpopulations remains the same,

The overall average PIBD for alleles taken at random from the entire metapopulation is

for generation t and

for generation t+1, where N̄ is the mean number of individuals per subpopulation. Inserting eqns (17) and (18) into eqn (20) yields

Migration changes the PIBD for alleles within and among subpopulations, but the overall mean, F̄′, remains the same. Therefore, migration need not be considered as far as F̄′ is concerned. Noting that the overall mean PIBD for alleles in generation t can be expressed as

we obtain the rate of change in the overall mean PIBD from eqns (19) and (21),

where FST,ij = (Fij − F̄)/(1 − F̄) is a generalization of Wright's F-statistic, and FST,ii is the FST for the ith subpopulation. From the definition of F̄, it is easy to see that

Equating eqn (22) to 1/(2Ne) yields the asymptotic effective size of a metapopulation

When each subpopulation is an ideal Wright–Fisher population, NeSi=Ni, and eqn (23) reduces to Whitlock & Barton's eqn (9). In principle, all the parameters in eqn (23) are measurable or predictable in a single generation. The FST,ij could be estimated from the same genetic data as FST, using an analysis of covariance. The reproductive values (wi) are for a single generation's transition (for more details, see Whitlock & Barton, 1997).

In the derivation of eqn (23), no assumption was made about the geographical structure or migration pattern of the population. The variation in fitness among subpopulations may come from individual variation of contributions and temporary or permanent quality differences among the patches of a habitat occupied by different subpopulations. Therefore, eqn (23) can be used to estimate Ne for a natural population if enough demographic data and genetic data are available for a single generation. Two special and important cases of eqn (23) are discussed next.

3.2.2 Constant and equal subpopulation size

Consider the case in which each subpopulation has the same constant size (Ni=N̄=N) but contributes unequally to the next generation via migration. If wiwj is not correlated with FST,ij, we have FST,ij=−FST/(n−1) for i≠j (Whitlock & Barton, 1997) where

The effective subpopulation size is

(Wang, 1996), where S²ki is the variance in the number of gametes contributed per individual in subpopulation i. Inserting the above relations into eqn (23) yields

where V is the variance in fitness among subpopulations (variance of wi),

and

With random mating and Poisson distribution of family size within subpopulations, we have FIS=0 and

eqn (24) reduces to Whitlock & Barton's equation

From eqn (25), it can be seen that, compared with an unsubdivided population of the same size nN, subdivision may increase or decrease Ne, depending simply on the variance in fitness among subpopulations. If subpopulations contribute equally to the next generation (V=0), eqn (25) reduces to eqn (15) of the previous section with N replacing NeS, and Ne always being increased by subdivision. On the contrary, if V≥1/(2N−1) approximately, subdivision will result in a decrease in Ne. A variance of V≈1/(2N−1) refers to the situation in which there are no restrictions to the reproductive output of each subpopulation and no quality differences (temporary or permanent) among the patches of a habitat occupied by the metapopulation, and the distribution of reproductive success of individuals is Poisson.

Whereas the direction of the effect of subdivision on Ne depends on V, the magnitude of the effect relies on the genetic differentiation, FST. If FST→0, Ne ≈ nN/(1+V) ≈ nN because V is of the order 1/(2N); if FST→1, eqn (25) reduces to Ne=(n−1)/(2V), which is equivalent to the Ne of an unsubdivided haploid population (Caballero, 1994, eqn 27) with the number of subpopulations taking the place of the number of individuals and the variance in the number of gametes being 2V instead of V. Because with FST=1, each subpopulation is completely homozygous for a particular allele at a locus, it behaves as a haploid individual, in that no genetic drift results from Mendelian segregation, and as a diploid individual, in that the average contribution of gametes is two for stable population size. If each subpopulation contributes equally to the next generation, there is no genetic drift and the effective size is infinite.
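The limits discussed here can be verified numerically. The sketch below assumes that eqn (25) has the form commonly attributed to Whitlock & Barton (1997), Ne = nN/[(1+V)(1−FST) + 2N·FST·V·n/(n−1)]; this form is an assumption on our part, adopted because it reproduces the three limits stated in the text (V=0, FST→0 and FST→1). The function name is hypothetical.

```python
def ne_variable_contribution(n, N, fst, V):
    """Effective size with equal subpopulation size N but variance V in
    fitness among the n subpopulations (assumed form; see lead-in).

    Ne = n*N / [(1 + V)*(1 - fst) + 2*N*fst*V*n/(n - 1)]
    """
    if n < 2:
        raise ValueError("need at least two subpopulations")
    denom = (1.0 + V) * (1.0 - fst) + 2.0 * N * fst * V * n / (n - 1.0)
    if denom == 0.0:
        # Complete differentiation with equal contributions: no drift.
        return float("inf")
    return n * N / denom


if __name__ == "__main__":
    n, N, fst = 10, 50, 0.2
    for V in (0.0, 0.001, 1.0 / (2 * N - 1)):
        ne = ne_variable_contribution(n, N, fst, V)
        print(f"V={V:.4f}: Ne={ne:.1f} (census size {n * N})")
```

With these illustrative values, Ne exceeds the census size nN for V=0 and for small V, but falls below it once V reaches about 1/(2N−1), which is the behaviour described for eqn (25).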

For equal contributions from different individuals within a subpopulation (S²k=0) and from different subpopulations (V=0), eqn (24) reduces to

indicating that inbreeding at any level (FST or FIS) has an equal effect on increasing Ne. If, on the other hand, S²k=2+2β (Poisson distribution of the reproductive success per individual when there is a proportion β of selfing) and V=1/N (i.e. twice the accumulated Poisson distribution variance), using eqn (5), eqn (24) reduces to

approximately, indicating that inbreeding at any level has an equal effect on decreasing Ne. If V=0 and S²k=2+2β, eqn (24) reduces to

approximately. Equations (27) and (28) were also derived by Nunney (1998) for a dioecious population using another inbreeding approach.

3.2.3 Extinction and recolonization

The effective size of a subdivided population subject to extinction and recolonization was also derived by Whitlock & Barton (1997). The model assumes that a population is subdivided into n demes, each being an ideal population with N diploid individuals. The migration rate among demes is m, and each deme becomes extinct with probability e per generation. Each extinct deme is recolonized immediately by 2k gametes in which each pair has a probability φ of having come from the same deme. The newly colonized deme grows to N individuals before reproduction in the next generation (Slatkin, 1977; Whitlock & McCauley, 1990). With φ=0, the contribution of a deme to the next generation (N′) is either 0 with probability e or N/(1−e) with probability 1−e. Therefore, the variance in fitness among demes is V=e/(1−e), and from eqns (23) or (25) we obtain that
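The variance in fitness among demes quoted above follows directly from the two-point distribution of deme contributions. As a small check, the sketch below computes the mean and variance of relative fitness w = N′/N with exact rational arithmetic, assuming φ=0 as in the text; the function name is hypothetical.

```python
from fractions import Fraction


def deme_fitness_moments(e):
    """Mean and variance of deme fitness w = N'/N under extinction.

    A deme contributes 0 with probability e and N/(1 - e) with
    probability 1 - e, so w is 0 or 1/(1 - e).
    """
    e = Fraction(e)
    if not 0 <= e < 1:
        raise ValueError("extinction probability must lie in [0, 1)")
    w_survivor = 1 / (1 - e)
    mean = (1 - e) * w_survivor            # always equals 1
    second_moment = (1 - e) * w_survivor ** 2
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    e = Fraction(1, 10)
    mean, variance = deme_fitness_moments(e)
    print(mean, variance, e / (1 - e))
```

The mean fitness is one, so the metapopulation size stays constant, and the variance equals e/(1−e) for any extinction probability below one.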

Whitlock & Barton (1997) obtained from eqn (29) that

approximately, which is also true relaxing the assumption about φ. For φ=1 and k=N, eqn (30) reduces to the equation given by Maruyama & Kimura (1980) without mutation. With no extinction (e=0), eqn (30) reduces to eqn (15) with N taking the place of NeS, as expected.

Although extinction and recolonization may increase or decrease FST (Whitlock & McCauley, 1990), they always decrease Ne. If k ≪ N and extinction is not rare relative to migration, Ne can be reduced substantially (Whitlock & Barton, 1997). For example, if e=m=0.1, N=1000 and k=1, then the effective size is 0.0062 of what it would have been without local extinction.

3.3 The pattern of inheritance

3.3.1 Dioecious species

The effective size of a subdivided population with separate sexes was derived by Nunney (1998) using an inbreeding approach. He assumed an equal size of subpopulations but with unequal contributions to the next generation. The subpopulation was allowed to have different numbers of males and females with a constant sex ratio over generations, partial sib-mating and arbitrary distributions of family size. A more general equation can be obtained using a derivation similar to the above monoecious case. The expression derived is essentially similar to eqn (23). When each subpopulation consists of N individuals (half in each sex) and the fitness of subpopulations is not correlated with F-statistics, the expression reduces to

When the subpopulations have an equal size and are ideal populations except for unequal numbers of males (Nm) and females (Nf) and variable contributions to the next generation, the general expression is then simplified to

where N=Nm+Nf and NeS=4NmNf/N. Compared with an unsubdivided population of the same size, subdivision will decrease the effective size when V≥1/(2NeS−1).
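For illustration, the effective subpopulation size under unequal sex numbers and the critical variance in fitness quoted above can be computed as follows. The function names are hypothetical, and the example values are arbitrary rather than taken from any data set.

```python
def dioecious_subpop_ne(n_males, n_females):
    """Effective subpopulation size NeS = 4*Nm*Nf / (Nm + Nf)."""
    if n_males <= 0 or n_females <= 0:
        raise ValueError("both sexes must be present")
    return 4.0 * n_males * n_females / (n_males + n_females)


def critical_fitness_variance(ne_s):
    """Variance V at or above which subdivision decreases Ne: 1/(2*NeS - 1)."""
    return 1.0 / (2.0 * ne_s - 1.0)


if __name__ == "__main__":
    for males, females in ((25, 25), (10, 40)):
        ne_s = dioecious_subpop_ne(males, females)
        print(males, females, ne_s, critical_fitness_variance(ne_s))
```

A skewed sex ratio lowers NeS below the census size N and therefore raises the critical variance, so a slightly larger variance in fitness among subpopulations is needed before subdivision reduces Ne.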

3.3.2 Sex-linked loci

Similar to the case of a single unsubdivided population (Caballero, 1994), the effective size of a metapopulation for sex-linked loci or haplo-diploid species relates to the inbreeding rate of the homogametic sex or to the variance of gene frequency at a neutral sex-linked locus.

Assuming all subpopulations have an equal and constant size in the island model, Wang (1998) derived exact recurrence equations for the PIBD of two homologous genes from a female (the homogametic sex) and from separate females within and among subpopulations. Different numbers and separate migration rates of males (dm) and females (df) were incorporated, with random mating within subpopulations. The derived expression for effective size is complicated, but to a good approximation it reduces to eqn (3) using an appropriate effective subpopulation size NeS (see eqn 30 in Caballero, 1994) and the total migration rate m=⅓(dm+2df). Female migration rate has a larger effect on Ne than male migration rate, because a female carries two genes and a male carries only one gene at a sex-linked locus.
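The different weighting of male and female migration at autosomal and sex-linked loci can be made concrete with a short sketch. It assumes the total migration rates m=½(dm+df) for autosomal loci and m=⅓(dm+2df) for sex-linked loci, as given in the text; the function names are hypothetical.

```python
def total_migration_autosomal(d_m, d_f):
    """Total migration rate for autosomal loci: both sexes weighted equally."""
    return 0.5 * (d_m + d_f)


def total_migration_sex_linked(d_m, d_f):
    """Total migration rate for sex-linked loci: females (the homogametic
    sex) carry two gene copies and males one, hence the weights 2/3 and 1/3."""
    return (d_m + 2.0 * d_f) / 3.0


if __name__ == "__main__":
    # The same two rates assigned to opposite sexes.
    print(total_migration_sex_linked(0.3, 0.0), total_migration_sex_linked(0.0, 0.3))
    print(total_migration_autosomal(0.3, 0.0), total_migration_autosomal(0.0, 0.3))
```

Assigning a given rate to females rather than to males doubles its contribution to the total migration rate at a sex-linked locus, whereas the autosomal total is unchanged.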

When there are fitness differences among subpopulations, it can be shown that eqn (23) also applies for sex-linked loci, with N and FST referring to the homogametic sex only. Therefore, results and discussions above about autosomal loci are applicable to the sex-linked case.

3.3.3 Haploid species

For haploid species, genetic drift in gene frequency or an increase in PIBD comes only from differential reproductive successes of individuals and different fitnesses among subpopulations. In the extreme case of equal contributions from different individuals and different subpopulations, the gene frequency remains the same, and the effective size is infinite.

Using a procedure similar to the monoecious case, an equation identical to eqn (23), except for the factor 2 of the second term in the denominator being dropped, can be derived. If subpopulations have an equal size but contribute unequally to the next generation via migration, the equation reduces to

where S2k is the within-subpopulation variance in the number of gametes contributed per individual averaged over subpopulations. The effective size for a haploid species depends on the variance in genetic contributions at both the individual and the subpopulation levels. The effect of the subpopulation level variation is amplified by the differentiation among subpopulations. If the differentiation is complete (FST=1), each subpopulation behaves as a single individual, and the effective size is determined by the number of subpopulations and their differential contributions [Ne=(n−1)/V]. If the subpopulation level variance comes only from the individual level (V=S2k/N), then subdivision has little effect on the effective size of a metapopulation.
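The two limiting statements above can be combined in a short numerical sketch. The function names and parameter values are hypothetical, and S2k=1 is an assumption (Poisson reproductive success in a haploid population with a mean of one gamete per individual).

```python
def haploid_ne_complete_differentiation(n_subpops: int, v: float) -> float:
    """Ne = (n - 1) / V for a haploid species when FST = 1."""
    if v <= 0:
        raise ValueError("V must be positive for a finite effective size")
    return (n_subpops - 1) / v


def individual_level_v(s2k: float, n_per_subpop: int) -> float:
    """V = S2k / N when subpopulation variance arises only from individuals."""
    return s2k / n_per_subpop


n_subpops, n_per_subpop, s2k = 20, 50, 1.0  # s2k = 1 is the assumed Poisson value
v = individual_level_v(s2k, n_per_subpop)
ne = haploid_ne_complete_differentiation(n_subpops, v)
census = n_subpops * n_per_subpop
print(ne, census)
```

With these values, Ne is about 950 against a census size of 1000, the effective size of an unsubdivided haploid Wright-Fisher population of the same size. This agrees with the statement that subdivision has little effect when V arises only at the individual level.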

3.3.4 Non-nuclear gene inheritance

Birky et al. (1989) investigated organelle gene diversity in subdivided populations, and Chesser & Baker (1996) considered the asymptotic effective size of a subdivided population for non-nuclear gene inheritance. If organelle genes (e.g. mtDNA) are inherited uniparentally and are always homoplasmic within individuals, then they are similar to haploid inheritance, and the effective size for organelle genes can be predicted from the corresponding haploid equations (e.g. eqn 33), considering only the relevant sex.

Although homoplasmy for organelles is considered to be near ubiquitous in higher vertebrates (Avise, 1991), an increasing number of species are found to be heteroplasmic (Hartl & Clark, 1997). In this case, the genetic drift theory for organelles becomes more complex than haploid or diploid nuclear inheritance. An individual cell may have different organelles that are partitioned among daughter cells. Thus, the equations for either diploid or haploid populations are not applicable to the organelle genes.

4 The effects and implications of population subdivision

4.1 The effects

The effect of subdivision is similar to nonrandom mating in a single population, with a subpopulation corresponding to a family and FST to FIS. The effective size for a monoecious population with partial selfing and a dioecious population with partial full-sib mating can be summarized as

[Caballero, 1994, eqns (11) and (20), with S2k/4 replaced by V], where NT is the total population size, N is the number of individuals per family (N=1 and 2 for selfing and full-sib mating, respectively) and V is the variance in fitness among families. For a subdivided population with random mating within subpopulations (FIS=0), eqns (24) or (31) reduce to

approximately. Note that, to be comparable with eqn (34), in which individuals within a family contribute equally to the next generation, the individual variance S2k should be set to zero. Thus, eqn (35) reduces to eqn (34) with FST taking the place of FIS.

Knowing the similarity between subdivision and partial inbreeding, we can understand the effect of subdivision on effective size from the point of view of genetic drift. The primary effect of subdivision is nonrandom mating; reproduction occurs mainly within subpopulations, resulting in a decrease in heterozygosity. Genetic drift in gene frequency can come from two sources in a finite population: one is the differential reproduction among subpopulations and among individuals within subpopulations; and the other is the Mendelian segregation of heterozygotes. A decrease in heterozygosity from subdivision necessarily leads to a decline in genetic drift resulting from segregation. With a given gene frequency, however, a decrease in heterozygosity also implies an increase in the variance of gene frequencies among subpopulations and among individuals within subpopulations, and therefore an increase in genetic drift because of differential reproduction. The net effect of a decrease in heterozygosity on genetic drift depends on the relative magnitudes of the counteracting effects from the two sources. When the variance in reproductive success and, thus, its effect on drift are small, segregation is the leading source of genetic drift and, therefore, subdivision results in an increase in effective size. In the extreme case of equal reproductive success (V=S2k=0), genetic drift comes solely from segregation and is therefore minimized. Thus, subdivision gives the greatest increase in effective size, Ne=2NT/(1−FST) from eqn (35), and Ne→∞ when FST→1. Conversely, when the variance in reproductive success is large [V>1/(4N−2)], genetic drift mainly comes from differential reproduction among subpopulations and, therefore, subdivision always leads to a decrease in Ne. There exists, in fact, a balance (equilibrium) between the counteracting effects from the two sources, at which a change in heterozygosity resulting from subdivision has no influence on effective size. The condition for the balance can be found from eqn (35) as V=1/(4N−2). With this value of V, the effects of segregation and differential reproduction cancel out and Ne=n(2N−1) from eqn (35), irrespective of the nonrandom mating (FST).
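The special cases quoted in this paragraph can be collected in a short sketch. The function names are hypothetical, and only the limiting results stated in the text are encoded (the balance value V=1/(4N−2), Ne=n(2N−1) at the balance, and Ne=2NT/(1−FST) for V=S2k=0), not the general expression.

```python
import math


def balance_variance(n_per_subpop: int) -> float:
    """V at which segregation and differential reproduction cancel: 1/(4N - 2)."""
    return 1.0 / (4 * n_per_subpop - 2)


def ne_at_balance(n_subpops: int, n_per_subpop: int) -> int:
    """Ne = n(2N - 1) at the balance, irrespective of FST."""
    return n_subpops * (2 * n_per_subpop - 1)


def ne_equal_success(n_subpops: int, n_per_subpop: int, fst: float) -> float:
    """Ne = 2*NT/(1 - FST) when V = S2k = 0; infinite as FST reaches 1."""
    if fst >= 1.0:
        return math.inf
    return 2.0 * n_subpops * n_per_subpop / (1.0 - fst)


def effect_of_subdivision(v: float, n_per_subpop: int) -> str:
    """Direction of the effect on Ne, judged against the balance value of V."""
    threshold = balance_variance(n_per_subpop)
    if v > threshold:
        return "decrease"
    if v < threshold:
        return "increase"
    return "none"


print(balance_variance(10), ne_at_balance(5, 10), ne_equal_success(5, 10, 0.5))
```

For N=10 the balance lies at V=1/38, so even a modest variance in subpopulation contributions is enough for subdivision to reduce Ne.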

When ecological factors are considered, subdivision may well enlarge the variance in reproductive output among demes (V) in natural populations. Without local density regulation, V comes from three sources. First, individual variance in offspring number contributes an amount of S2k/(4N) to V. This is the variance in reproductive output of arbitrarily defined groups of individuals. Secondly, temporary environmental differences among demes contribute to V. These two components of V do not cause correlation in the change of gene frequency over generations. Thirdly, there may be permanent quality differences among demes. If a subpopulation fortunately (unfortunately) inhabits an area of high (low) quality, it will contribute more (fewer) offspring consistently over generations. Unless migration is complete (m=1), the permanent difference in habitat quality that influences reproductive success will tend to increase the long-term genetic drift and inbreeding.

The permanent difference in fitness among subpopulations is similar to the inherited variation under selection (Santiago & Caballero, 1995). They both result in correlated changes in gene frequency or gene identity over generations and, therefore, Ne is greatly reduced. The patterns of correlated changes in fitness associated with an allele across generations are different, however, for inherited fitness and for permanent habitat difference. In the former case, the correlated change between two successive generations is gradually dissipated by recombination, whereas in the latter case, the correlated change is kept completely as long as the allele remains in the same subpopulation and is totally lost once it moves to a new subpopulation. Recombination plays a role similar to migration in reducing the correlated changes, the magnitude of effect lying somewhere between the island model and the stepping stone model (Whitlock & Barton, 1997). Subdivision with permanent difference in fitness among subpopulations is also analogous to hitchhiking or background selection, in which a neutral allele finds itself associated with one of the genetic backgrounds under directional selection.

Although subdivision usually tends to increase variance in genetic contributions among demes in natural populations, it might also be used to decrease the variance in long-term contribution from different individuals in controlled populations. This is best illustrated by the following example. To minimize the inbreeding and genetic drift in a small population with Nm males and Nf females (sex ratio r=Nf/Nm≥2), Gowe et al. (1959) proposed a selection scheme in which each male has one son and r daughters, and each female has one daughter and a probability of 1/r of contributing one son. The selection scheme combined with random mating (each male mates at random with r females) is known as minimal inbreeding (Falconer & Mackay, 1996, p. 69) and is considered ideal for both control populations and conserved populations. Under minimal inbreeding, the variance in paternal contribution to the next generation is always zero. However, the paternal contribution to the gene pool after two or more generations is still variable. For example, males may have different numbers of grandsons resulting from the differential contributions of their daughters. If the population is subdivided into Nm herds, each consisting of r females and one male, and males migrate randomly among herds and females do not migrate, then the variance in paternal long-term contribution is also zero under Gowe et al.'s (1959) selection scheme (Wang, 1997c). Therefore, the subdivision scheme can increase the effective size by as much as 12.5% for autosomal loci and 50% for sex-linked loci, compared with random mating (Wang, 1997c). The subdivision scheme is very efficient in minimizing inbreeding, not only in the long term but also in the short term (especially for large r), in contrast to the usual subdivision with equal sizes and contributions of all subpopulations (section 3.1) or circular pair mating (Kimura & Crow, 1963b), which results in an increased Ne with a sacrifice of a higher rate of inbreeding at initial generations.
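The family-size rules of the selection scheme of Gowe et al. (1959) can be sketched for a single generation. This is an illustration under stated assumptions, not the scheme of Wang (1997c): the function name is hypothetical, each male is assumed to be mated to exactly r females, and exactly one of them, chosen at random, supplies his single son.

```python
import random
from statistics import pvariance


def gowe_offspring_counts(n_males: int, r: int, seed: int = 0):
    """Return (sons, daughters) tuples for every sire and every dam.

    Assumption: each sire has r mates, and one mate picked at random
    contributes his only son, so a dam has a son with probability 1/r.
    """
    rng = random.Random(seed)
    sire_counts = [(1, r) for _ in range(n_males)]
    dam_counts = []
    for _ in range(n_males):
        son_dam = rng.randrange(r)  # index of the mate that supplies the son
        for j in range(r):
            dam_counts.append((1 if j == son_dam else 0, 1))
    return sire_counts, dam_counts


sires, dams = gowe_offspring_counts(n_males=5, r=4)
sire_variance = pvariance([s + d for s, d in sires])  # zero by construction
sons_per_dam = sum(s for s, _ in dams) / len(dams)     # equals 1/r
print(sire_variance, sons_per_dam)
```

The sketch confirms only the single-generation property (zero variance in paternal contribution). The variance in long-term contribution discussed above would require following descendants over several generations.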

4.2 Implications

The above results imply that subdivision in natural populations is much more likely to result in a decrease rather than an increase in Ne. Classical models assume a constant size and an equal contribution of different subpopulations, resulting in an increase in Ne because of subdivision, particularly at low levels of gene flow (section 3.1). The assumption is, however, unlikely to be realistic in natural populations in which subpopulations may contribute unequally to the next generation. There are generally quality differences across the habitat that increase the variation in subpopulation productivity, and permanent quality difference or extinction and recolonization can substantially amplify the reproductive variance among subpopulations and thus decrease Ne enormously. The implication is that Ne/N of natural populations should be smaller than one, and the lower the level of gene flow in a species, the smaller the estimate of its Ne/N. A review of 192 estimates of Ne/N from 102 species of plants and animals (Frankham, 1995) shows that the estimates vary dramatically, ranging from 10⁻⁶ in the Pacific oyster (but see Nunney, 1996) to 0.99 in humans, with an overall average of only 0.1. These estimates, however, do not account for subdivision and gene flow, and some of them may therefore be biased.

In conservation, if the habitat of a species is permanently fragmented and migration among subpopulations is not possible except by carefully planned management, Ne of the species could be increased compared with a panmictic population of the same size. In this case, effort should be made to ensure that each subpopulation does not go extinct and that a certain level of artificially aided migration (Mills & Allendorf, 1996) is carried out to avoid inbreeding depression. The results reviewed in this paper suggest that subdivision and thus differentiation at different levels have a similar effect on Ne, the direction of the effect depending on the variance in reproductive success at the corresponding levels. To increase Ne, gene flow at a particular level of subdivision should be low if reproductive variation at this level can be minimized by management, but gene flow should be high if reproductive variation cannot be minimized. However, even if effective size can be increased by restricted migration, it should be noted that this is realized only after many generations (see below), during which the inbreeding rate might be increased compared with panmixia.

At the planning stage of a conservation programme, whether a population should be subdivided or not depends on many factors. If the reproduction and migration of the population and the ecological factors could be managed intensively, then subdivision may be beneficial for conserving the genetic variation for a given population size. This is the case for domesticated species, in which a large number of local breeds are endangered by the worldwide spread of a few highly productive commercial breeds and their crosses. For wild species in a more or less natural habitat, it is difficult to practise intensive management. Therefore, it is generally safer to conserve the species in a single large population rather than in a number of subpopulations. Because of its small size, a subpopulation without careful management is more likely to become extinct because of demographic and environmental stochasticity (Lande, 1988) and inbreeding depression (Saccheri et al., 1998), resulting in a drastic decrease in Ne of the metapopulation.

5 Some other considerations of subdivision

5.1 Complete subdivision

In the above, each subpopulation is not completely isolated from the rest; it receives immigrants from and contributes emigrants to the entire population. If a subpopulation only receives immigrants from but does not contribute emigrants to the metapopulation, the rate of inbreeding or genetic drift in this subpopulation will be smaller than that of the rest of the metapopulation. If, on the contrary, a subpopulation only contributes emigrants to but does not receive immigrants from the rest of the metapopulation, its effective size is determined by itself as a single population.

If all the subpopulations are isolated from each other, no asymptotic effective size for the entire metapopulation exists. In this case, the rate of inbreeding or genetic drift in each subpopulation will reach an asymptotic value, corresponding to the eigenvalue effective size of the subpopulation. The inbreeding effective size of the metapopulation (NeI), signifying the rate of decrease in the average heterozygosity of individuals in the whole population, is a weighted average of the effective subpopulation sizes. The equilibrium variance effective size of the metapopulation (NeV) is different from NeI and is not reached until the differentiation is complete (FST=1). With FST=1, different alleles are fixed in different subpopulations, and the genetic drift is determined only by the differential contributions from subpopulations. For the case of an equal size and unequal contributions of subpopulations, we have NeV=(n−1)/(2V) for diploid or NeV= (n−1)/V for haploid species, and NeV is infinite if subpopulations contribute equally to the next generation (V=0).
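The equilibrium variance effective size under complete differentiation is simple enough to tabulate. The sketch below uses a hypothetical function name and encodes only NeV=(n−1)/(2V) for diploid and NeV=(n−1)/V for haploid species, with an infinite value when V=0.

```python
import math


def nev_complete_subdivision(n_subpops: int, v: float, haploid: bool = False) -> float:
    """Equilibrium variance effective size when FST = 1.

    Diploid: (n - 1) / (2V); haploid: (n - 1) / V; infinite when V = 0.
    """
    if v < 0:
        raise ValueError("a variance cannot be negative")
    if v == 0:
        return math.inf
    ploidy_factor = 1.0 if haploid else 2.0
    return (n_subpops - 1) / (ploidy_factor * v)


for v in (0.0, 0.1, 0.5):
    print(v, nev_complete_subdivision(11, v), nev_complete_subdivision(11, v, haploid=True))
```

In this limit the size of each subpopulation does not enter; only the number of subpopulations and the variance of their contributions matter.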

5.2 More than two levels of subdivision

A population may be subdivided at more than two levels in a hierarchy. For simplicity, let us consider a monoecious population of constant size subdivided into s lines, each line subdivided into n sublines, and each subline containing N individuals. Different lines (sublines) have an equal size but may contribute unequally to the next generation, migration among sublines within lines being more frequent than that among lines. No line or subline is completely isolated from the rest of the metapopulation. It can easily be shown that the F-statistics satisfy the relation (1−FIT)=(1−FLT) (1−FSL)(1−FIS), where FIS, FSL, FLT and FIT are similarly defined as Wright's F-statistics with subscripts I, S, L and T referring to individuals, sublines, lines and the total metapopulation, respectively. The effective size of the ith line can be obtained as

using a procedure similar to the derivation of eqn (24), where μi and VSi are the average and variance of fitness of sublines within line i, FIS,i and FSL,i are the average FIS and FSL over sublines in line i, S2ki is the variance in the number of gametes contributed per individual averaged over sublines within the ith line. The effective size of the metapopulation can be derived as

where VL is the variance in fitness among lines, FLT, FSL, FIS, S2k and VS are averages of FLT,i, FSL,i, FIS,i, S2ki and VSi over lines. If there is no variation in reproductive success at any level of subdivision (S2k=VS=VL=0), eqn (37) reduces to

If individual reproductive success follows a Poisson distribution [S2k=2+2β, where β is the selfing proportion], VS and VL are twice the accumulated Poisson distribution variance (VS=1/N, VL=1/nN), and eqn (37) reduces to

approximately. In both cases, the effective size of the metapopulation depends on the total size (snN) and the total differentiation (FIT); the number of levels of subdivision in the hierarchy is irrelevant. More generally, however, the importance of each level of subdivision in determining Ne depends on the differentiation and variance in fitness at the level, as can be seen from eqn (37).
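The multiplicative relation among the hierarchical F-statistics and the Poisson values of the variances quoted in this section can be written out as a short sketch. Function names and example values are hypothetical; the expressions are those stated in the text.

```python
def total_inbreeding_coefficient(f_lt: float, f_sl: float, f_is: float) -> float:
    """FIT from (1 - FIT) = (1 - FLT)(1 - FSL)(1 - FIS)."""
    return 1.0 - (1.0 - f_lt) * (1.0 - f_sl) * (1.0 - f_is)


def poisson_variances(n_per_subline: int, n_sublines: int, beta: float):
    """S2k, VS and VL when individual reproductive success is Poisson.

    S2k = 2 + 2*beta (beta is the selfing proportion), VS = 1/N, VL = 1/(nN).
    """
    s2k = 2.0 + 2.0 * beta
    v_s = 1.0 / n_per_subline
    v_l = 1.0 / (n_sublines * n_per_subline)
    return s2k, v_s, v_l


# Two hierarchies with the same total differentiation FIT.
two_levels = total_inbreeding_coefficient(f_lt=0.1, f_sl=0.2, f_is=0.0)
one_level = total_inbreeding_coefficient(f_lt=0.28, f_sl=0.0, f_is=0.0)
print(two_levels, one_level, poisson_variances(10, 5, 0.0))
```

The two example hierarchies share the same FIT, which is the situation in which the number of levels is stated to be irrelevant to the effective size of the metapopulation.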

5.3 Realization of the asymptotic effective size

In a Wright–Fisher ideal population with completely random union of gametes, the equilibrium rate of inbreeding and genetic drift or the asymptotic effective size is attained immediately (in one generation). Any form of nonrandom mating will retard the realization of the asymptotic effective size, inbreeding lagging some generations behind genetic drift when close inbreeding is avoided or vice versa (Caballero, 1994). In both cases, inbreeding and variance effective sizes are different and variable over the first few generations before they converge gradually to the same asymptotic value. The larger the departure from a random union of gametes, the greater the number of generations necessary to realize it. In the initial generations, neither inbreeding nor genetic drift can be predicted accurately by the asymptotic effective size. Other forms of nonrandom mating, such as age structure (Choy & Weir, 1978) and population subdivision (Chesser et al., 1993; Wang, 1997a,b), and factors other than nonrandom mating, such as artificial selection, hitchhiking and permanent difference in fitness among subpopulations, also retard the realization of asymptotic effective size. Here, we consider population subdivision as a form of nonrandom mating and its effect on the time required to realize the equilibrium F-statistics and effective size.

For simplicity, let us consider Wright's ideal infinite island model. The genetic differentiation among subpopulations will increase over generations until it reaches the equilibrium value (FST) asymptotically. The number of generations for the instantaneous differentiation (FST,t) to reach FST,t=λFST (where 0<λ<1) is

when N≫1 and m≪1 (Whitlock, 1992). The length of the retardation is proportional to the subpopulation size and the equilibrium amount of differentiation. Equation (40) is also valid for a metapopulation with non-ideal subpopulations when the census size of a subpopulation is replaced by its effective size (NeS). The term 2NeSFST (that we may call the ‘effective genetic differentiation’) determines the length of time to reach the equilibrium rate of inbreeding and genetic drift. With 2NeSFST=50 for example, 35 and 150 generations are required for the subdivided population to realize 50% and 95% of its equilibrium FST, respectively.
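The numerical example can be checked with a short sketch. It assumes the approximation t ≈ −2NeSFST ln(1−λ), a form chosen here because it is proportional to the effective genetic differentiation and reproduces the 35 and 150 generations quoted above. The function name is hypothetical, and λ is the fraction of the equilibrium FST to be realized.

```python
import math


def generations_to_fraction(effective_differentiation: float, fraction: float) -> float:
    """Assumed form: t = -(2*NeS*FST) * ln(1 - fraction), with 0 < fraction < 1."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    return -effective_differentiation * math.log(1.0 - fraction)


for fraction in (0.5, 0.95):
    print(fraction, round(generations_to_fraction(50.0, fraction)))
```

Under this assumed form, doubling the effective genetic differentiation doubles the waiting time for any given fraction.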

The effective size depends on demographic parameters and F-statistics, and its asymptotic value is realized once all the F-statistics reach their equilibrium values. Compared with FST, FIS reaches its equilibrium value much faster and is little affected by factors other than nonrandom union of gametes within subpopulations (Wang, 1997a). Therefore, the number of generations required to reach the asymptotic effective size is essentially determined by FST. If there are more levels of subdivision in a hierarchy, it is the effective differentiation at the highest level of subdivision (2NeLFLT for the case described in section 5.2) that determines the time. At initial generations, the variance effective size is generally larger than the inbreeding effective size. The former decreases steadily, and the latter generally increases with possible fluctuations over generations, eventually converging to the same asymptotic value (Wang, 1997a,b).

With subdivision, mating is more likely between individuals within subpopulations than among subpopulations. This is somewhat like mating between individuals of higher than average co-ancestry in a single population. Compared with completely random union of gametes, subdivision results in a higher initial rate of inbreeding, which decreases over generations. Therefore, the fixation probability of favourable genes and the rate of shifts between adaptive peaks (Barton & Rouhani, 1991) depend on population structure differently from neutral variability. For species with less frequent migration among subpopulations, a large number of generations is necessary to reach its asymptotic effective size. A highly differentiated population may never attain its asymptotic effective size, because it is highly unlikely that both its demographic and geographical properties remain constant for many generations.