Introduction

The effectiveness of seed and pollen flow as agents of interpopulation gene flow depends upon the mode of gene inheritance (Petit et al, 1993; Ennos, 1994). For nuclear and paternally inherited organelle genes, gene flow occurs through both pollen and seed, whereas for maternally inherited organelle genes, gene flow is mediated by seed only. Consequently, different levels of population differentiation at drift-migration equilibrium are expected for genes differing in mode of inheritance under different models of population genetic structure (Petit et al, 1993; Ennos, 1994; Hu and Ennos, 1997, 1999; Hu, 2000a). The relative rates of pollen and seed flow can be estimated using Wright's F-statistics and other genetic statistics (Ennos, 1994; Hu and Ennos, 1997; Hu, 2000b). However, those theoretical results are appropriate only for selectively neutral genes.

One important type of genetic structure observed in natural populations is clinal variation, in which gene frequency changes along a gradient (increasing or decreasing) with geographical distance. Natural selection is often involved in cline formation (see Endler, 1977). Cline discordance (or concordance) between cytoplasmic and nuclear genes can be defined generally as frequency clines that differ (or agree) in slope (the rate of change of frequency with distance) or in position (the geographical location at which a given allele frequency occurs), and it has been observed in practice (eg, Young, 1996). In theory, the impacts of seed and pollen flow, together with the effect of natural selection, were considered recently in a clinal situation (Nagylaki, 1997). However, haploid genes with paternal and maternal inheritance were not included. The relative roles of seed and pollen flow in bringing about cline discordance/concordance between cytoplasmic and nuclear genes remain to be addressed. Furthermore, the disequilibria between cytoplasmic and nuclear genes generated by seed and pollen flow are likely to be maintained, as was shown in theory for a continent-island model of population structure (Asmussen and Schnabel, 1991; Schnabel and Asmussen, 1992). Li and Nei (1973) also showed that stable linkage disequilibrium between nuclear loci without epistasis can be maintained in subdivided populations under certain conditions. Slatkin (1975) further demonstrated that, when the recombination fraction between nuclear loci is of the same order as the selection coefficient or smaller, a substantial amount of linkage disequilibrium can be present in a cline.

Unlike linked nuclear loci, organelle genomes are physically unlinked to the nuclear genome, so the recombination fraction is ½, which is greater than the weak selection coefficients assumed to be involved in maintaining clines. The pre-existing theoretical results are therefore not appropriate for elucidating the impacts of seed and pollen flow on cline discordance/concordance between cytoplasmic and nuclear genes. The purpose of this article is to develop the further theory required for understanding and interpreting clinal variation of nuclear, paternally and maternally inherited organelle genes, and to formulate the relationships between seed and pollen flow and cline discordance/concordance between cytoplasmic and nuclear genes, incorporating the disequilibria between cytoplasmic and nuclear genes generated by seed and pollen flow. For simplicity, the term ‘cytonuclear’ is used when cytoplasmic and nuclear genes interact and are considered jointly. It should be noted in advance that the cline discordance/concordance analyzed in this article refers to a specific case, in which the same spatial selection pattern acts on cytoplasmic and nuclear genes and the selection boundary points are assumed to lie at the same position. Three different combinations of cytoplasmic and nuclear genes are considered separately: nuclear vs paternally inherited organelle genes; nuclear vs maternally inherited organelle genes; and nuclear vs paternally and maternally inherited organelle genes. By analyzing these combinations, we look at the implications for the behaviour of nuclear, paternally, and maternally inherited organelle genes, and explore the inferences that can be drawn about the roles of seed and pollen flow in cline discordance/concordance from data on these differently inherited genes.

General assumptions

A single locus with two alleles is considered for each of the three modes of gene inheritance. The two alleles are denoted by A1 and A2 for biparentally inherited diploid nuclear genes, B1 and B2 for paternally inherited haploid organelle genes, and M1 and M2 for maternally inherited haploid organelle genes. A hermaphrodite plant species is distributed in an infinite chain of equally spaced populations. Population size is assumed to be so large that the influence of genetic drift can be ignored. A symmetrical migration rate is assumed between any two populations for both seed and pollen flow. The life cycle within a short time interval, Δt, follows a sequence of events: pollen flow, random mating, seed flow, and selection. The population distribution is assumed to be uniform after selection, and hence the migration rate between any two populations represents the probability of migration from one population to another within the time Δt. Density-independent selection on the offspring takes place independently in each population after seed flow.

Nuclear vs paternally inherited organelle genes

General case

Let p(i,t) be the frequency of allele A1 in population i at time t and q(i,t) that of allele A2 (p(i,t) + q(i,t) = 1), and let u(i,t) and v(i,t) (u(i,t) + v(i,t) = 1) be the frequencies of alleles B1 and B2, respectively. Define D1(i,t) as freq.(A1B1) − p(i,t)u(i,t), the gametic (allelic) disequilibrium between A1 and B1 in population i. Following Asmussen et al (1987), the gametic disequilibria between any other paternally inherited organelle and nuclear alleles are summarized in Table 1. Use Wrightian fitnesses 1 + s1Δtg1(i), 1 + hs1Δtg1(i), and 1 − s1Δtg1(i) for genotypes A1A1, A1A2, and A2A2, respectively, where h is the degree of dominance and g1(i) describes how the selection coefficient changes with geographical distance (Nagylaki, 1975). Similarly, define 1 + s2Δtg2(i)/2 and 1 − s2Δtg2(i)/2 as the fitnesses of alleles B1 and B2, respectively. The multiplicative viability model, omitting the product term in the selection coefficients (selection epistasis), is used so that the fitness of any combination of paternal organelle and nuclear genotypes can be calculated simply (Clark, 1984).

Table 1 Gametic disequilibria between nuclear (A1, A2) and paternally inherited organelle (B1, B2) alleles in population i

Scale the population location by x = iε, where ε is the spacing between colonies. Use the notations ḟ = ∂f/∂t, f′ = ∂f/∂x, and f″ = ∂²f/∂x² for a function f. Following the sequence of events described in the general assumptions, we can derive by diffusion approximation:

where p, q, u, v and D1 refer to p(x,t), q(x,t), u(x,t), v(x,t) and D1(x,t), respectively, and σ1² = σS² + σP²/2, σ2² = σS² + σP². Here σS² and σP² are the dispersal variances of seed and pollen, respectively.
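As a small numerical illustration of these two definitions, the following Python sketch computes the effective dispersal variances of nuclear and paternally inherited organelle genes from the seed and pollen variances. The function name is hypothetical; the input values are the parameter settings quoted in the figure captions of this article.

```python
def effective_dispersal_variances(sigma2_seed, sigma2_pollen):
    """Return (sigma_1^2, sigma_2^2) for nuclear and paternal organelle genes.

    Hypothetical helper written for illustration only.
    """
    # Nuclear genes: seed variance plus half of the pollen variance.
    sigma2_nuclear = sigma2_seed + sigma2_pollen / 2
    # Paternally inherited organelle genes: seed variance plus the full pollen variance.
    sigma2_paternal = sigma2_seed + sigma2_pollen
    return sigma2_nuclear, sigma2_paternal


sigma2_1, sigma2_2 = effective_dispersal_variances(sigma2_seed=0.05, sigma2_pollen=0.07)
print(sigma2_1, sigma2_2)
```

For these settings σ1² is close to 0.085 and σ2² is close to 0.12, so the paternal organelle genes experience the larger effective dispersal.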

The change for the gametic disequilibrium is given by:

where D1 refers to D1(x,t). Equation (3) is comparable with (7) of Kruuk et al (1999). It can be viewed from (3) that there are three components underlying the change of D1. The first component is induced by the system of random mating that reduces D1 in half per time interval, Δt. The second component is the change due to migration, and the third is associated with selection.

Letting h = 0 and considering that the second term on the right-hand side of (3) is negligible (see also Kruuk et al, 1999), we obtain

If 2σ1²p′u′ ≤ s1 or 2σ1²p′u′ ≤ s2, which is likely to occur when clines are very smooth, D1 is of the same order as the selection coefficients and the D1 terms in (1) and (2) can be ignored. If 2σ1²p′u′ > s1 and 2σ1²p′u′ > s2, which may occur when clines are very steep, then D1 is of the order of 2σ1²p′u′ and its effect is likely to be significant. In this case, it is not reasonable to remove the D1 terms from (1) and (2).
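The two regimes can be distinguished with a short Python sketch that compares 2σ1²p′u′ with the two selection coefficients. The function name, argument names and return labels are hypothetical, and the gradients p′ and u′ are assumed to have been estimated beforehand at the location of interest.

```python
def d1_regime(sigma2_1, p_grad, u_grad, s1, s2):
    """Classify the cytonuclear disequilibrium D1 as 'significant' or 'negligible'.

    Illustrative sketch only: names and labels are assumptions.
    """
    # Migration-driven scale of D1: 2 * sigma_1^2 * p' * u'.
    scale = 2.0 * sigma2_1 * p_grad * u_grad
    # D1 matters only when this scale exceeds both selection coefficients.
    if scale > s1 and scale > s2:
        return "significant"
    # Otherwise the scale is <= s1 or <= s2 and the D1 terms can be dropped.
    return "negligible"


print(d1_regime(sigma2_1=0.085, p_grad=0.1, u_grad=0.1, s1=0.03, s2=0.06))
print(d1_regime(sigma2_1=0.085, p_grad=1.0, u_grad=1.0, s1=0.03, s2=0.06))
```

The two conditions in the text are complementary, so a single comparison against both coefficients is enough to assign every case to one regime.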

D1 with the order similar to selection coefficient

In this case, the clines for nuclear and paternally inherited organelle genes can be approached independently, and analytic solutions for p and u at steady state can be solved separately using the method introduced by Haldane (1948). According to Slatkin's (1973) result, the characteristic length is defined as for nuclear genes and for paternally inherited organelle genes. Thus, the relationship between their characteristic lengths is obtained by:

where r = σP²/σS², the ratio of pollen to seed flow. When s1 = s2, a large value of the ratio r can enlarge cline discordance.

D1 with significant effect

In this case, nuclear and paternally inherited organelle genes can be considered jointly as a whole system. Assume that the same spatial selection pattern acts on nuclear and paternally inherited organelle genes, ie g1(x) = g2(x) = g(x), where g(x) = 1 when x < 0, and g(x) = −β² when x > 0 (Haldane, 1948). Combining equations (1) and (2) at steady state and using (4), we obtain a two-dimensional nonlinear differential equation,

When complete concordance occurs, let p′ = u′, p = u and χ = (p′)², so that p″ = ½ ∂χ/∂p. We can derive:

when x > 0, where c1 = −4β²(2s1 + s2)σ1²/(σ1² + σ2²) and c2 = −4β²(s1 + s2)/(σ1² + σ2²);

when x < 0, where c1 = 4(2s1 + s2)σ1²/(σ1² + σ2²) and c2 = 4(s1 + s2)/(σ1² + σ2²). The characteristic length for the cytonuclear genes as a whole system under concordance, denoted by w12D, is thus equal to σ1√[(2s1 + s2)/(s1 + s2)]. It can be seen that w12D < w1 and w12D < w2. With the same order of selection coefficient between cytoplasmic and nuclear genes, the cline for the whole system under concordance is distributed within the range of the clines for its two components that evolve independently (Figure 1).
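The coefficients c1 and c2 on either side of the selection boundary can be tabulated with a short Python sketch. The function name and signature are hypothetical, and the numerical inputs are the parameter settings quoted for Figure 1. The ratio c1/c2 does not depend on the sign of x, because the factor g(x) cancels.

```python
def concordance_coefficients(x, s1, s2, sigma2_1, sigma2_2, beta2):
    """Return (c1, c2) for the concordant nuclear/paternal system at position x.

    Hypothetical helper for illustration; the coefficients are stated only for x != 0.
    """
    if x == 0:
        raise ValueError("c1 and c2 are given only for x < 0 and x > 0")
    # Spatial selection pattern: g = 1 for x < 0 and g = -beta^2 for x > 0.
    g = 1.0 if x < 0 else -beta2
    total = sigma2_1 + sigma2_2
    c1 = 4.0 * g * (2.0 * s1 + s2) * sigma2_1 / total
    c2 = 4.0 * g * (s1 + s2) / total
    return c1, c2


sigma2_1 = 0.05 + 0.07 / 2  # nuclear genes: seed variance + half the pollen variance
sigma2_2 = 0.05 + 0.07      # paternal organelle genes: seed + pollen variance
left = concordance_coefficients(-1.0, 0.03, 0.06, sigma2_1, sigma2_2, beta2=1.0)
right = concordance_coefficients(1.0, 0.03, 0.06, sigma2_1, sigma2_2, beta2=1.0)
print(left, right)
```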

Figure 1 The cline for the combination between nuclear and paternal organelle genes under concordance is compared with those of its two components that evolve independently in a given frequency interval [0.1, 0.9]. Parameter settings are h = 0.0, β² = 1.0, σP² = 0.07, σS² = 0.05, s1 = 0.03, s2 = 0.06.

Assume that g(0) exists. According to Slatkin (1973), the characteristic length under the impacts of cytonuclear disequilibria can also be calculated by w1D = (p′|p=0.5)⁻¹ and w2D = (u′|u=0.5)⁻¹. Thus, the value of D1 at cline centre (maximum point) is:

If h = 0, (8) reduces to the result obtained by Barton and Gale (1993).

Barton (personal communication) indicated that a small linear perturbation analysis can be used as an approach to cytonuclear combination when D1 is significant. Let λ = (p + u)/2, s = (s1 + s2)/2, and σ² = (σ1² + σ2²)/2. We can approximate an average equation even if there are some differences between cytonuclear genes, . Firstly, consider the case without the D effect, ie . At steady state, it becomes

λ′ε can be solved analytically. Next consider the impacts of small linear perturbations, γ, and let λ = λε + γλ′ε. Following the same method used by Barton (1979), we can obtain

where s′ = 0 for either x > 0 or x < 0. Assume that the D effect is equivalent to the influence of a linear perturbation that eventually induces a non-zero steady flux into the system described by (9). Let 3sD/2 = λ − λε = γλ′ε. Since |D| ≤ 1/4 according to Asmussen and Basten (1996), then |3sD/2| = |γλ′ε| ≤ 3s/8, which limits the arbitrary choice of the value γ. An integral that is invariant during cline shift due to perturbation can be configured, I = ∫_{−∞}^{+∞} gλ′εγ dx. We obtain, . I is invariant for all possible γ when g = λ′ε. Thus, the eventual movement of the cline due to linkage disequilibria can be approached by −(∫_{−∞}^{+∞} gλ′εγ dx)/(∫_{−∞}^{+∞} gλ′ε dx) (Barton, 1979).

Nuclear vs maternally inherited organelle genes

Let y(i,t) be the frequency of allele M1 in population i at time t, and z(i,t) (y(i,t) + z(i,t) = 1) that of allele M2. Define D2(i,t) as freq.(A1M1) − p(i,t)y(i,t), the gametic disequilibrium between A1 and M1 in population i. Similarly, denote the fitness by 1 + s3Δtg3(i)/2 for M1 and 1 − s3Δtg3(i)/2 for M2. The partial differential equations are readily obtained by replacing D1, s2, g2(x), σ2², u and v in (1) and (2) with D2, s3, g3(x), σS², y and z, respectively. The relationship between their relative characteristic lengths is given by

The change for D2 is given by

The magnitude of D2 at centre is

where w3D is the characteristic length for maternally inherited organelle genes. Under concordance, the solution for the first order p′ can be obtained immediately by substituting c1 = 4g(x)(2s1 + s3)σS²/(σ1² + σS²) and c2 = 4g(x)(s1 + s3)/(σ1² + σS²) into (7a, b). It can be viewed that w13D < w1 and w13D < w3, where . With the same order of selection coefficient, the cline is wider for the whole cytonuclear system under concordance than for maternally inherited organelle genes but narrower than for nuclear genes (Figure 2).

Figure 2 The cline for the combination between nuclear and maternal organelle genes under concordance is compared with those of its two components that evolve independently in a given frequency interval [0.1, 0.9]. Parameter settings are h = 0.0, β² = 1.0, σP² = 0.07, σS² = 0.05, s1 = 0.03, s2 = 0.06.

Paternally vs maternally inherited organelle genes

According to the general assumptions, the two haploid organelle alleles B and M can be shown to be independent. The relationship between their characteristic lengths is given by

Nuclear vs paternally and maternally inherited organelle genes

Similarly, define D(i,t) as freq.(A1B1M1) − p(i,t) u(i,t) y(i,t), the gametic disequilibrium among alleles A1, B1, and M1 in population i. Let D(i,t) = D1(i,t)y(i,t) + D2(i,t)u(i,t). The gametic disequilibria for any other combinations among three genomes are summarized in Table 2. The previous definitions for D1 and D2 still hold, and so do the partial differential equations for either paternally or maternally inherited organelle genes.

Table 2 Gametic disequilibria among nuclear (A1, A2) and paternally (B1, B2) and maternally inherited organelle (M1, M2) alleles in population i

The partial differential equation for the frequency p is given by

It has been shown that the partial differential equations for D1 and D2 remain unaltered, whereas the result for the change of D (= DA1B1M1) becomes more complicated,

where D represents D(x,t). It is further derived that the following relationship holds,

Thus, according to equations (15) and (16a,b) a partial differential equation for any other linkage disequilibrium among the three genomes can be readily worked out.

Under cline concordance among the three genomes, the solution for the first order of cline gradient can be obtained by substituting:

in (7a) when x > 0 and (7b) when x < 0. Define w123D as the characteristic length for the cline of three genomes under concordance. It can be seen that the relationship w123D² < w12D² + w13D² holds. The cline for the three genomes under concordance is distributed within the range of the clines for the two genomes under concordance, if the selection coefficients are of the same order for each genome (Figure 3).

Figure 3 Under concordance the cline for the combination of three genomes is compared with those of two other combinations (nuclear vs paternal, nuclear vs maternal) in a given frequency interval [0.1, 0.9]. Parameter settings are h = 0.0, β² = 1.0, σP² = 0.07, σS² = 0.05, s1 = 0.03, s2 = 0.06.

Discussion

This paper has explicitly formulated the relationships between seed and pollen flow and cline discordance/concordance between cytoplasmic and nuclear genes. It can be concluded that the integrated effects of gene flow (seed and pollen flow) and selection shape cline discordance/concordance. The effect of seed flow is not the same as that of pollen flow, because the three plant genomes, with their contrasting modes of inheritance, experience asymmetric migration rates. Previous studies show that, for selectively neutral markers, a large migration rate of seeds can reduce the differences in population differentiation among these three genomes, whereas a large migration rate of pollen grains can increase the divergence in population differentiation between maternally inherited organelle genes and the other two counterparts (Ennos, 1994; Hu and Ennos, 1997, 1999). Results similar to the neutral case are shown here to hold in the clinal situation if the weak selection coefficients are of the same order among the genomes. Large seed flow can reduce the clinal differences among these three genomes, and hence enhance the level of cline concordance. A large migration rate of pollen grains can increase cline concordance between nuclear and paternally inherited organelle genes, but reduce cline concordance between the maternally inherited organelle genes and the other two counterparts.

The cline concordance/discordance between cytoplasmic and nuclear genes should be different from that between linked nuclear genes. Since the cytoplasmic genomes are not physically linked to the nuclear genome, the differential in gene introgression will be greater between cytoplasmic and nuclear genes than between many linked nuclear genes under comparable selection intensity. Moreover, compared with the selection acting on those linked nuclear genes, the selection acting on cytoplasmic and nuclear genes with disequilibria is usually weak (Barton and Hewitt, 1985, 1989). Thus, it is expected that cline discordance between cytoplasmic and nuclear genes occurs more often than that between linked nuclear genes. Complete cline concordance between cytoplasmic and nuclear genes cannot occur frequently, and hence the concordance case generally reflects a limiting situation.

The mechanism responsible for the discrepancy in the speed of gene introgression between cytoplasmic and nuclear genes is difficult to address comprehensively. Of many factors, such as asymmetric mating and founder effects, the difference in dispersal between male and female parents is probably primary. Owing to the different vectors by which migration of cytoplasmic and nuclear genes is realized, the impact of the difference in gene dispersal should be substantial, and genes with localized introgression will lag behind those with dispersed introgression. Several plant species display a large ratio of pollen to seed flow (Ennos, 1994; Ennos et al, 1999), implying that the speed of gene introgression is slower for maternally inherited organelle genes than for nuclear and paternally inherited organelle genes. Observations on the differential patterns in introgression between cytoplasmic and nuclear genes have been recorded in the literature (eg Harrison, 1989).

In addition to the asymmetric migration among nuclear and uniparentally inherited organelle genes, selection, as another force operating in cline formation, also deserves attention. It can act against nuclear genes but not against cytoplasmic genes, or against cytoplasmic genes but not against nuclear genes. Several reports of roughly equivalent frequencies for nuclear and cytoplasmic gene flow indirectly indicate that diverse natural selection intensities can act on them (Rieseberg and Wendel, 1993). Those nuclear genes that affect reproductive isolation between hybridizing populations or taxa will most likely lag behind those organelle genes that do not seriously affect the survival of plants. Thus, the differential in gene introgression between cytoplasmic and nuclear genes is also attributable to diverse selection intensities.

In theory, the validity of equations (5) and (11) indirectly indicates that no significant cytonuclear disequilibria are involved in the formation of cline discordance/concordance. Once all genetic marker data in clinal situations are available, the disequilibria between cytoplasmic and nuclear genes can, in fact, be tested using the methods introduced by Basten and Asmussen (1997). Characteristic length can also be estimated, as demonstrated in one typical dispersal-selection cline recently reported in Castanea sativa Mill (Villani et al, 1999). If the ratio of pollen to seed flow can be approximately estimated by using neutral markers (Ennos, 1994), equation (5) or (11) provides a convenient way to estimate the relative selection coefficients between cytoplasmic and nuclear genes. If the disequilibria between cytoplasmic and nuclear genes are significant, use of the integrated cytonuclear data is recommended. In this case, the characteristic length for the combined data can be used to roughly estimate the relative contributions to selection, ie s1/(s1 + s2) = w12D²/σ1² − 1, or s1/(s1 + s3) = w13D²/σS² − 1. Application of these techniques to real populations awaits accumulation of appropriate data collections.
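The last relation can be sketched in Python as follows: given an estimate of the squared characteristic length for the combined cytonuclear data and the matching dispersal variance (σ1² for the nuclear vs paternal combination, σS² for the nuclear vs maternal combination), the nuclear share of selection is the ratio minus one. The function name is hypothetical, and the inputs below are synthetic values built from the parameter settings of the figures, not data.

```python
def relative_selection_share(w_squared, sigma2):
    """Return the nuclear share of selection, s1/(s1 + s_organelle).

    Illustrative sketch of the relation share = w^2 / sigma^2 - 1;
    the name and signature are assumptions.
    """
    share = w_squared / sigma2 - 1.0
    # A share outside [0, 1] suggests the estimates do not fit the concordance model.
    if not 0.0 <= share <= 1.0:
        raise ValueError("estimated share lies outside [0, 1]")
    return share


# Synthetic check: build w^2 from s1 = 0.03, s2 = 0.06 and sigma_1^2 = 0.085.
s1, s2, sigma2_1 = 0.03, 0.06, 0.085
w_squared = sigma2_1 * (2 * s1 + s2) / (s1 + s2)
print(relative_selection_share(w_squared, sigma2_1))
```

The synthetic example recovers a share close to one third, which matches s1/(s1 + s2) for the chosen coefficients.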