Introduction

The influence of restricted gene flow and natural selection may vary at different spatial scales in patchily distributed species because both the rate of gene flow and differences in natural selection may change nonlinearly with geographical distance. The scale at which genetic spatial patterns develop depends on a complex interaction between gene flow, genetic drift and selection. Studies of population genetic structure at more than one spatial scale provide insight into the balance between these evolutionary forces (Gehring and Delph, 1999).

Microgeographic isolation-by-distance structure may be generated in a population after a few generations as a consequence of fine-scale genetic processes. Two different processes may be in the background of local spatial structure in selectionally neutral characters (Ennos, 2001; Kalisz et al, 2001). First, if gene dispersal by pollen and seeds is restricted within a continuous plant population, inbreeding and genetic substructuring become established by genetic drift in the absence of selection. As a result of this isolation-by-distance process, the relatedness of individuals increases with their spatial proximity. If loss of genetic variation from the finite population is balanced by mutation or long-distance gene flow, a quasi-equilibrium between these processes becomes established within 20–30 generations. At this quasi-equilibrium, the population is genetically structured and comprises patches or neighbourhoods of related genotypes. Second, spatial genetic structure may be generated in plant populations also as a consequence of sampling events that occur when the population is founded or regenerated. Spatial genetic structure generated in this fashion will be greatest where regeneration sites are colonized from a limited number of maternal/paternal seed parents. A key difference between this form of genetic structure and that generated purely by restricted gene flow is that it represents a nonequilibrium situation (Ennos, 2001; Kalisz et al, 2001). Thus, if founder events are behind the spatial genetic structure, it should diminish or disappear with increasing stand age. In some plant species, it has been shown that younger populations may show spatial genetic structuring, which is not evident in older ones (eg Miyamoto et al, 2002).

Silene tatarica is an endangered perennial plant that typically grows in long, narrow patches on the banks and shores of two rivers in Finland. In our earlier study, we investigated the distribution of genetic variation within and between seven subpopulations of S. tatarica using amplified fragment length polymorphism (AFLP) markers. The overall pattern suggested a ‘classical’ metapopulation structure with rare migration events between the subpopulations (Tero et al, 2003). However, different migration patterns may operate at local and regional scales in S. tatarica, as in related Silene species (McCauley, 1997; Giles et al, 1998; Gehring and Delph, 1999; Richards et al, 1999).

Several statistical methods have been used to describe local spatial structure (see Escudero et al (2003) and Manel et al (2003) for recent reviews). Most often, spatial genetic structure has been described by means of spatial autocorrelation. Spatial autocorrelation analysis tests whether the observed value of a variable at one locality is independent of the values of the variable at neighbouring localities. Results are usually shown as correlograms, graphic displays in which values of autocorrelation coefficients are plotted against distance classes (Escudero et al, 2003). Population geneticists have often implemented this technique using Moran's I statistic or relatedness coefficients (Rousset, 2000; Hardy, 2003).

‘Patch width’ is an often-used measure of spatial structure within a population in the context of spatial autocorrelation. It has usually been estimated from the first x-intercept of the spatial autocorrelation correlogram. Within this patch, individuals are assumed to be more related than the population average, and the estimate can be used as a guideline for defining meaningful conservation units (eg Diniz-Filho and Telles, 2002). However, as shown recently by Fenster et al (2003) and Vekemans and Hardy (2004), patch width is not necessarily characteristic of the populations studied, as it seems to depend strongly on the sampling scheme. Patch width could be used to define a conservation unit only in cases in which each individual in a population is genotyped.

Spatial genetic structure has also often been quantified as ‘neighbourhood size’ (Nb). Neighbourhood size is a concept originally introduced by Wright (1969) and usually defined as Nb=4πσ2D, in which σ2 is the axial dispersal variance and D is the density of the population. Neighbourhood size was originally intended to approximate the effective size of the local random-mating units within a continuously distributed population. However, it has been shown that this definition does not have a rigorous basis. The analytical model of Rousset (1997, 2000) has shown that σ2D alone does not define the genetic differentiation. Even though neighbourhood size may not have a rigorous theoretical basis, it is often a convenient synthetic way to express the balance between local genetic drift and gene dispersal within continuous populations (Vekemans and Hardy, 2004).

Neighbourhood size is usually estimated using direct demographic estimates of dispersal variances and density (eg Crawford, 1984). However, it may also be estimated indirectly from the slope of the regression between genetic relatedness and geographic distance (Rousset, 1997; Hardy, 2003). Sometimes Nb has also been estimated by comparing Moran's I values from spatial autocorrelation analysis with theoretical values obtained in simulation studies (Epperson and Li, 1997; Epperson et al, 1999). As indirect estimation of Nb seems to depend on certain conditions, Vekemans and Hardy (2004) have very recently suggested a new ‘Sp’ statistic to quantify spatial genetic structure. This statistic depends primarily on the rate of decrease of pairwise kinship coefficients between individuals with the logarithm of distance in two dimensions, and it often seems to be more useful than neighbourhood size for comparing populations or species.

In this study, we extended our former macrospatial structure analyses (Aspi et al, 2003; Tero et al, 2003) to describe the small-scale genetic structure in subpopulations of S. tatarica and to investigate the processes behind the structure. Our hypothesis is that if spatial genetic structure is a consequence of founding events, it should diminish or disappear with increasing stand age. On the other hand, if the local structure is a result of an isolation-by-distance process, the possible differences between subpopulations should not be related to stand age. As S. tatarica is an endangered species, we also wanted to find out whether possible spatial structure has conservation implications for the survival of the species.

Materials and methods

Study species and plant material

The main distribution of S. tatarica is on the Russian steppes, with disjunct occurrences in Hungary, Germany, Lithuania, and NW-Russia (Ulvinen, 1997). The northwestern edge of the distribution is in northern Finland, where the species has invaded the riverside habitats of the Oulankajoki and Kitinen Rivers after the last ice age. In Finland, S. tatarica is an endangered species with spatially structured subpopulations widely scattered in more or less continuous areas of suitable habitat along the riverside. The plants grow in habitats ranging from open sand and gravel shores and erosion banks to more densely vegetated shores and riverbanks (Aspi et al, 2003). The seeds are presumably dispersed by gravity on the slopes, by wind and water, and probably by animals and man as well (Tero et al, 2003; Jäkäläniemi et al, 2004). Both the establishment of new patches and the expansion of old patches occur exclusively by seed dispersal. Colonization and extinction rates of subpopulations in the Oulankajoki River seem to vary between years. In a study period between 1999 and 2003, the colonization rate varied between 0.002 and 0.117, and the extinction rate between 0.007 and 0.05. The projected finite rate of increase in the number of patches over the 5-year study period (the arithmetic mean) was 1.03 (Jäkäläniemi et al, 2005).

The study area is situated in the Oulankajoki River valley (66°N and 29°E) in northern Finland. The microspatial structure of S. tatarica was investigated from three subpopulations (see Table 1, Figure 1) used in the earlier studies (Aspi et al, 2003; Tero et al, 2003). Jäkälämutka (henceforth JAK) is a young subpopulation and with respect to habitat openness it represents open/intermediate habitat type (see Aspi et al, 2003). Nurmisaari (NUR) and Purkuputaanniemi (PUR) are old subpopulations and represent open-habitat type and degenerating subpopulation in a closed habitat, respectively. We sampled and mapped a representative number (ie 20–30; see eg Diniz-Filho and Telles, 2002) of individuals along a random transect from these subpopulations (Table 1, Figure 1) so that the mean distance between individuals was rather similar in each subpopulation. The distance of the sample of individuals to the nearest fertile individual has been estimated during the former studies (Aspi et al, 2003), and the density in each subpopulation was estimated as

where n is the sample size and ri is the distance from random individual i to nearest neighbour (Byth and Ripley, 1980).
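As an illustration of a nearest-neighbour density estimate, the following minimal Python sketch assumes the maximum-likelihood form for a spatially random (Poisson) pattern, D = n/(π Σ r_i²). This form, the function name and the example distances are assumptions made for illustration and are not taken from the study.

```python
import math


def nearest_neighbour_density(distances_m):
    """Estimate density (individuals per m^2) from nearest-neighbour distances.

    Assumed estimator: D = n / (pi * sum(r_i^2)), i.e. the maximum-likelihood
    form for a spatially random (Poisson) point pattern.
    """
    n = len(distances_m)
    sum_sq = sum(r * r for r in distances_m)
    if n == 0 or sum_sq <= 0:
        raise ValueError("need at least one positive distance")
    return n / (math.pi * sum_sq)


# Hypothetical nearest-neighbour distances in metres (not data from the study).
example_distances = [1.0, 2.0, 2.0, 1.0]
density = nearest_neighbour_density(example_distances)
print(f"estimated density: {density:.4f} individuals per m^2")
```

Because the squared distances are summed, a few large nearest-neighbour distances lower the estimate strongly, which is one reason to sample along the whole transect.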

Table 1 Description and local structure of the populations used in the study
Figure 1 Schematic map of the sample sites. The size of the circle symbolizes the subpopulation size, and the arrow indicates the direction of flow of the river. The subpopulation name abbreviations refer to Table 1.

For this study, we conducted a more detailed microspatial analysis of a new subpopulation, Valtikkasaari (VAL). This subpopulation is a recently colonized, isolated island subpopulation from which we collected leaf samples and mapped all individuals.

The average subpopulation ages have been estimated mainly on the basis of the relative age of the largest individuals in the population. The exact age of an individual S. tatarica plant could be estimated on the basis of annual rings formed in the main root of the plant (Jäkäläniemi et al, 2004). However, because this is very tedious at the subpopulation level, and because the number of buds and, correspondingly, the number of stems increase as the plant ages (Jäkäläniemi et al, 2004), we used the number of stems to estimate the relative age of a subpopulation. The mean number of stems of the 10 largest individuals was clearly larger in the NUR (45.9; SD=7.9; the largest individual with 103 stems) and PUR subpopulations (38.9; SD=21.9; largest with 93 stems) compared to JAK (21.7; SD=12.3; largest with 59 stems) and VAL (25.5; SD=8.3; largest with 49 stems) subpopulations, confirming that the former ones represent older and the latter ones younger subpopulations.

For parentage analysis, we collected leaf material and capsules from all fertile individuals of the VAL subpopulation. The seeds from each capsule were sown in a greenhouse in a turf–sand mixture (pH 6; conductivity 1.5 mS/cm; nutrient content: N 120, P 40, K 200 mg/l). The light–dark cycle during the first 2 months was 10/14 and after that 16/8. We counted the number of offspring produced per plant after 3 months.

DNA extraction and AFLP analysis

Genomic DNA was isolated from leaves of individual plants using a slightly modified CTAB method (Rogers and Bendich, 1985). The starting material was 0.1 g of frozen leaves, and DNA was isolated from all samples of the VAL subpopulation. The AFLP marker system, the reactions, and the analysis of AFLP electropherograms are described in Tero et al (2003). The primer pairs were the same as in the earlier study of JAK, NUR and PUR. In the analysis of the VAL subpopulation we did not use the E-ACA/M-CTT primer pair because it was not informative in the preliminary analysis. AFLP genotypes were also scored for 214 offspring from 19 mothers collected from the VAL subpopulation.

The use of dominant markers such as AFLP in population genetic analysis may sometimes be more limiting than the use of single-locus codominant markers (eg Hollingsworth and Ennos, 2004). However, we used AFLP for the analysis of spatial genetic structure and for parentage analysis. For these purposes, AFLP markers seem to be as good as or even more efficient than codominant markers, given that a sufficiently large number of loci are scored (eg Krauss and Peakall, 1998; Gerber et al, 2000; Hardy, 2003; Nybom, 2004). According to a recent review by Nybom (2004), AFLP markers are as effective as codominant markers in estimating intraspecific genetic diversity in plants.

Statistical methods

Expected genetic diversity in different populations was estimated using POPGEN software and assuming Hardy–Weinberg equilibrium in subpopulations (see Tero et al, 2003). Linkage equilibrium analysis was conducted using LIAN software ver 3.1 (Haubold and Hudson, 2000). LIAN tests for independent assortment by first computing the number of loci at which each pair of taxa differs. From the distribution of mismatch values, a variance (VD) is calculated. This is compared to the variance expected (VE) for linkage equilibrium. The significance of the ratio was tested by a Monte Carlo simulation with 10 000 random resamplings (see Haubold and Hudson, 2000; Haubold, 2001). LIAN software was also used to estimate the standardized index of association for each population as

$$I_A^S = \frac{1}{l-1}\left(\frac{V_D}{V_E} - 1\right)$$

where l is the number of loci analysed. This index is comparable between studies as long as it can be assumed that the neutral mutation parameter is constant.
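The mismatch-variance comparison described above can be sketched in Python as follows. This is a minimal illustration, not LIAN's implementation. It assumes band profiles coded as 0/1, takes VE as the sum over loci of h_j(1−h_j), where h_j is the probability that two different individuals differ at locus j, and standardizes the ratio VD/VE as (VD/VE − 1)/(l − 1). The function name and the toy profiles are hypothetical.

```python
from itertools import combinations
from statistics import pvariance


def standardized_index_of_association(profiles):
    """Standardized index of association from 0/1 band profiles.

    profiles: list of equal-length sequences of 0/1, one per individual.
    Assumption: V_E = sum_j h_j * (1 - h_j), with h_j the probability that
    two different individuals differ at locus j.
    """
    n = len(profiles)
    n_loci = len(profiles[0])
    # Observed variance (V_D) of the number of mismatching loci per pair.
    mismatches = [
        sum(a != b for a, b in zip(p, q)) for p, q in combinations(profiles, 2)
    ]
    v_d = pvariance(mismatches)
    # Expected variance (V_E) if loci were independent.
    v_e = 0.0
    for j in range(n_loci):
        freq = sum(p[j] for p in profiles) / n
        h = (n / (n - 1)) * (1.0 - freq ** 2 - (1.0 - freq) ** 2)
        v_e += h * (1.0 - h)
    if v_e == 0:
        raise ValueError("no polymorphic loci")
    return (v_d / v_e - 1.0) / (n_loci - 1)


# Toy example: two completely linked loci give the maximum value of 1.
linked = [(0, 0), (0, 0), (1, 1), (1, 1)]
ias_linked = standardized_index_of_association(linked)
print(round(ias_linked, 6))
```

Dividing by l − 1 is what makes values comparable between data sets scored for different numbers of loci, such as the 242 and 361 loci mentioned in the Results.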

The spatial autocorrelation analysis was conducted in each subpopulation using Hardy's kinship coefficient between individuals plotted against distance on a logarithmic scale (Hardy, 2003) with SPAGeDi ver. 1.1b (Hardy and Vekemans, 2002, 2003). When estimating the kinship coefficients between individuals, we assumed that the inbreeding coefficient was 0. This is not necessarily true in S. tatarica. Although there is no direct evidence, there is some indirect evidence that the level of inbreeding in this species is probably low (see Results and Tero et al, 2003). Moreover, the estimate of kinship is fairly robust to errors in the assumed inbreeding level (Hardy, 2003). As there is no consensus regarding the way to generate distance classes, we used the equal frequency method (Escudero et al, 2003), that is, uneven lags that comprise a constant number of pairs of individuals. As the number of individuals analysed was larger in the VAL subpopulation, we used 10 distance classes in the VAL subpopulation and six in the others to ensure that each distance class included an approximately similar number of pairs (c. 50–70) in each subpopulation. A jackknife procedure (over loci) was used to estimate standard errors for each distance class, and 10 000 randomizations of individual spatial locations were performed to test for the overall spatial structure (Hardy and Vekemans, 2002, 2003).
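The equal frequency method can be illustrated with a short Python sketch that sorts all pairwise distances and cuts them into classes holding nearly the same number of pairs. It is an illustrative assumption of how such classes can be built, not the SPAGeDi procedure, and the coordinates below are hypothetical. With 38 mapped individuals there are 703 pairs, so 10 classes hold about 70 pairs each.

```python
import math
from itertools import combinations


def equal_frequency_classes(coords, n_classes):
    """Split pairwise distances into classes with nearly equal pair counts.

    Returns (upper_bounds, pair_counts), one entry per distance class.
    Tied distances are not treated specially in this sketch.
    """
    dists = sorted(math.dist(a, b) for a, b in combinations(coords, 2))
    n_pairs = len(dists)
    if n_classes < 1 or n_classes > n_pairs:
        raise ValueError("n_classes must be between 1 and the number of pairs")
    bounds, counts = [], []
    start = 0
    for c in range(1, n_classes + 1):
        end = round(c * n_pairs / n_classes)
        counts.append(end - start)
        bounds.append(dists[end - 1])  # largest distance in this class
        start = end
    return bounds, counts


# Hypothetical coordinates (m) for 38 individuals along a narrow strip.
coords = [(i * 1.5, (i * 7 % 5) * 0.8) for i in range(38)]
bounds, counts = equal_frequency_classes(coords, 10)
print(counts)
```

The class bounds become wider at large distances, which is the "uneven lags" property of this binning.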

We used the first x-intercept of the spatial autocorrelation correlogram to estimate the patch width for the VAL subpopulation. We did not estimate this quantity for the other subpopulations, because we did not sample every individual in those subpopulations.
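One simple way to read a first x-intercept from a correlogram is linear interpolation between the last positive and the first non-positive mean kinship value. The Python sketch below assumes interpolation on the untransformed distance axis, which is an assumption because the text does not state how the intercept was read. The distance and kinship values are hypothetical.

```python
def first_x_intercept(distances, kinship):
    """Distance at which the correlogram first crosses zero from above.

    distances: representative distance of each class, in increasing order.
    kinship: mean kinship coefficient of each class.
    Returns None when the curve never crosses zero from a positive value.
    """
    for i in range(len(kinship) - 1):
        f0, f1 = kinship[i], kinship[i + 1]
        if f0 > 0 >= f1:
            d0, d1 = distances[i], distances[i + 1]
            # Linear interpolation between the two neighbouring classes.
            return d0 + (d1 - d0) * f0 / (f0 - f1)
    return None


# Hypothetical correlogram (not values from Figure 2).
class_distances = [2.0, 5.0, 10.0, 20.0]
mean_kinship = [0.06, 0.02, -0.02, -0.03]
patch_width = first_x_intercept(class_distances, mean_kinship)
print(patch_width)
```

Because the result depends on where the class midpoints fall, it inherits the sensitivity to sampling scheme noted in the Introduction.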

To characterize the spatial genetic pattern of subpopulations, we obtained an indirect estimate of neighbourhood size (Nb) and the Sp statistic for each subpopulation on the basis of spatial autocorrelation. The neighbourhood size was estimated as −(1−F1)/b and the Sp statistic as −b/(1−F1), where b is the regression slope and F1 is the average Fij (kinship) estimate for adjacent i,j individuals (Hardy, 2003; Vekemans and Hardy, 2004). With these definitions, Nb is the reciprocal of Sp, and a negative slope gives positive values of both.
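The two quantities can be computed directly from the regression slope and the first-class kinship. The Python sketch below takes Sp as −b/(1−F1), so that a negative slope gives a positive Sp, and Nb as its reciprocal. The input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def sp_statistic(slope_b, f1):
    """Sp = -b / (1 - F1); positive when kinship declines with distance."""
    return -slope_b / (1.0 - f1)


def indirect_neighbourhood_size(slope_b, f1):
    """Nb = -(1 - F1) / b, the reciprocal of Sp."""
    return -(1.0 - f1) / slope_b


# Hypothetical slope and first-class kinship (not values from Table 1).
b_example, f1_example = -0.03, 0.04
sp_example = sp_statistic(b_example, f1_example)
nb_example = indirect_neighbourhood_size(b_example, f1_example)
print(sp_example, nb_example)
```

The reciprocal relationship can be checked against the reported values: an Sp of 0.0311 corresponds to an Nb of about 32.1.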

We also estimated the neighbourhood size using Wright's equation (Nb=4πσ2D) and direct demographic estimates of seed and pollen dispersal distances from the VAL subpopulation. We used the AFLP genotypes of the offspring, their known mothers, and potential fathers to obtain direct unidirectional demographic estimates of pollen dispersal. We estimated the demographic parameters only in the VAL subpopulation because it is an isolated island population in which pollen flow from outside the population, which may confuse the paternity analysis, is improbable. In addition, we had some estimates of pollinator flight distances for this subpopulation.

The results of parentage analysis were also used to estimate male reproductive success and the amount of selfing in the VAL subpopulation. For these purposes, we used Famoz parentage analysis software (Gerber et al, 2003) to estimate the exclusion probabilities for each offspring–parent pair. The formulas of the probabilities for dominant markers are given in Gerber et al (2000). A categorical allocation approach (Jones and Arden, 2003) was then used to select the most likely parent from the pool of nonexcluded parents. The most probable single father for each offspring was identified on the basis of likelihood ratio scores (LOD). These ratios compare the likelihood of an individual being the father of a given offspring with the likelihood of these individuals being unrelated (Gerber et al, 2000; Jones and Arden, 2003). The threshold for the significance of a LOD score in the Famoz program was estimated using simulations. We ran the program with different departures from HW and error rates (0.1, 0.5, and 0.01) to see whether changes in these values would affect LOD scores and the most probable father. Varying the error rates did not change the order of most probable fathers in any case, even though there were differences in LOD scores. The threshold for LOD scores was based on the distribution of simulated LOD scores inside the stand (Gerber et al, 2000). In the analysis, we assumed that the distance between the known mother and the most probable father represented pollen dispersal (eg Ennos, 2001).

We used parentage analysis to detect possible mother–daughter pairs in the population to allow estimation of direct seed dispersal distances. For each plant, we examined whether a probable parent pair could be found among the other members of the subpopulation. We used Famoz software for parentage analysis as described for the paternity analysis. We assumed that the nearer individual of the most probable parental pair was the mother, and that the distance between the mother and a given individual represented seed dispersal (cf Ennos, 2001).

We used the estimated direct pollen and seed dispersal distances to estimate the axial variances of these dispersal measures following Crawford (1984). Both seed and pollen dispersal axial variances were estimated as one-half of the variance of absolute dispersal, and the total parent–offspring variance was estimated as

$$\sigma^2 = \frac{1}{2}\delta_p^2 + \delta_s^2$$

in which $\delta_p^2$ and $\delta_s^2$ are the estimates of pollen and seed dispersal variances (see above). When estimating effective density for the direct estimate of neighbourhood size, the surveyed population size was not used, because family sizes in the VAL subpopulation show greater variation than expected for an idealized population (see Results). Therefore, we used the effective size when estimating the density. For this purpose, we first estimated the variance in the number of offspring produced by each individual in the subpopulation. The male reproductive success for each plant was estimated as the number of offspring sired (see below). The number of seeded capsules per plant counted in the wild was assumed to represent female reproductive success (it is highly correlated with seed production; Aspi et al, 2003), and this number was scaled relative to the total number of offspring produced. The total reproductive success for each individual was estimated as the mean of male and female reproductive success. The effective population size for the VAL subpopulation was estimated using the formula (eg Frankham et al, 2002, p 245):

where N is the number of individuals in the previous generation, and k and Vk are the mean number and variance of offspring produced, respectively. Subpopulation density was deduced by dividing the number of effective individuals by the studied area. The ratio of effective size to surveyed population size was estimated as Ne/N=k/[k−1+(Vk/k)] (eg Frankham et al, 2002).
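The ratio Ne/N = k/[k−1+(Vk/k)] is easy to evaluate once the mean and variance of offspring number are known. The Python sketch below uses the mean (k=5.36) and total variance (Vk=77.9) reported in the Results for the VAL subpopulation. The function name is hypothetical, and multiplying the ratio by N gives only an approximation of Ne, which may differ slightly from the value obtained with the full formula.

```python
def ne_over_n(mean_offspring, var_offspring):
    """Ratio of effective to surveyed size: k / (k - 1 + Vk / k)."""
    k, vk = mean_offspring, var_offspring
    return k / (k - 1.0 + vk / k)


# Values reported for the VAL subpopulation in the Results.
ratio_val = ne_over_n(5.36, 77.9)
approx_ne = ratio_val * 38  # approximate effective size for N = 38
print(round(ratio_val, 3), round(approx_ne, 1))
```

When the variance in offspring number equals the mean (Vk = k), the ratio approaches one for large k, so any value well below one reflects unequal reproductive success.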

Results

Polymorphism and gene diversity

In the VAL subpopulation, we scored the dominant AFLP genotypes of all 38 individuals and 214 offspring with respect to 242 loci. The number of seedlings produced was very uneven between mother plants (from zero to over hundred). The AFLP genotypes were scored at a maximum for 20 offspring/mother. The level of genetic variation appeared to be in the lower part of the diversity range formerly observed among S. tatarica populations. The percentage of polymorphic loci was 22.3% (54) and Nei's gene diversity 0.076 (SD=0.156). The mean percentage of polymorphic loci among 361 scored loci formerly found (Tero et al, 2003) in the other subpopulations was 40.0% (range 25.9–54.9%) and Nei's gene diversity 0.127 (range 0.075–0.176). However, among the greenhouse-grown offspring, the genetic polymorphism appeared to be higher than among parental population. The proportion of polymorphic loci among offspring was 74.4% (180) and Nei's gene diversity 0.217 (SD=0.173).

Linkage disequilibrium

Linkage disequilibrium has formerly been detected using other methods in all subpopulations except VAL (Tero et al, 2003). The results of LIAN confirmed the former results and showed that there was linkage disequilibrium also in VAL. The linkage disequilibrium was significant in each subpopulation (P<0.001). The standardized index of association was rather similar in three subpopulations: 0.022 in VAL, 0.044 in NUR, and 0.058 in PUR. However, in JAK, the index of association appeared to be clearly higher (0.192) than in the other subpopulations. The high value in the JAK subpopulation may be due to the fact that formerly there were two spatially separated patches in the subpopulation, even though these patches are now interconnected. When we analysed the former patches separately, we found that in the larger group the index of association (0.030) was at the same level as in the other subpopulations. In the smaller group, the index of association was 0.310, which caused the high index of association at the level of the whole subpopulation.

Spatial genetic structure

The spatial autocorrelation analysis suggested spatial genetic structure in each subpopulation. The slope of the regression between kinship coefficient and distance between individuals was significant at least at level P<0.05 and at similar level in each subpopulation (Table 1). Values of the regression slopes (b) were negative, indicating that on average individuals spatially close to each other were more likely to be genetically related than individuals which were separated by larger distances. There was significant deviation from the population mean kinship estimate at least in one distance class in each subpopulation (Figure 2). Positive values of Fij were found at short distances, meaning that neighbour individuals had a higher genetic relatedness than random pairs of individuals, whereas negative values of Fij occurred at larger distances, indicating isolation-by-distance within a population. In the NUR subpopulation the Fij values decreased steadily in the first four, and in the PUR subpopulation only in first two distance classes, showing no further trend.

Figure 2 Kinship coefficient versus logarithmic distance between individuals in different subpopulations. The subpopulation name abbreviations refer to Table 1.

The estimate of the patch width in VAL subpopulation using the first x-intercept of the correlogram was c. 7.8 m (Figure 2), suggesting that within this patch the individuals are more related than the average in the subpopulation.

Mean Sp statistics among the subpopulations was 0.0364 and there was only slight variation between the subpopulations, from 0.0234 to 0.0632 (Table 1). However, the stabilizing profile of correlograms in NUR and PUR subpopulations suggested that the Sp values for these subpopulations may have been estimated for a nonrelevant range of distances, and we estimated the Sp statistics values also for only the monotonically decreasing distance classes (ie for the first four distance classes in NUR and first two classes in the PUR subpopulation). This analysis gave Sp statistics estimates of 0.0911 and 0.0644 in PUR and NUR subpopulations, respectively. Correspondingly, the corrected mean Sp among the subpopulations was 0.0525.

The Sp statistics values were not lower in the old subpopulations as expected if the spatial genetic structure is caused by founder effects. On the contrary, the corrected Sp values tended to be higher in the older (0.0910 and 0.0644) than in the young subpopulations (0.0311 and 0.0234). There appeared to be a negative association (r=0.954; P=0.046) between density and Sp estimates when the latter one was estimated for a restricted range of distances.

Indirect estimates of Nb values

Indirect estimation of Nb assumes that spatial genetic structure results solely from isotropic limited gene dispersal, that the structure has reached a stationary phase, that the sampling scale is adequate with respect to σ2, and that the geometry of the population is also adequate (Vekemans and Hardy, 2004). In S. tatarica these conditions seem to be fulfilled, and the slope of the regressions was used to estimate gene dispersal distances in terms of a product between population density (D) and mean squared distance of axial gene movements (σ2). The estimated neighbourhood sizes were relatively similar among the studied subpopulations (Table 1) and very much smaller than the surveyed subpopulation sizes, except in the VAL subpopulation, in which the Nb was about 85% of the survey size. When the Nb values for PUR and NUR subpopulations were estimated for only the monotonically decreasing distance classes, the estimate of Nb for NUR subpopulation was 11.0 and in PUR subpopulation 15.5.

Parentage analysis and direct estimates of Nb values

Paternity analysis in the VAL subpopulation revealed that the subpopulation was predominantly outcrossing. There was only one case out of 214 (0.47%) in which the mother of an individual offspring was also its most probable father. Assuming random mating, we would expect about 10-fold larger proportion of selfed (5.26%) individuals. This difference was statistically significant (χ2=9.87; df=1; P<0.01), suggesting that in S. tatarica there is active avoidance of selfing.
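The chi-square comparison of observed and expected selfing can be reproduced with a few lines of Python. The sketch below assumes that the expected selfing rate of 5.26% corresponds to 1/19, which is an inference from the 19 sampled mothers and not something the text states. The P value uses the closed form for one degree of freedom, P = erfc(√(χ²/2)).

```python
import math


def selfing_chi_square(n_offspring, n_selfed, expected_rate):
    """Chi-square goodness-of-fit test (df = 1) for the number of selfed offspring."""
    exp_selfed = n_offspring * expected_rate
    exp_outcrossed = n_offspring - exp_selfed
    obs_outcrossed = n_offspring - n_selfed
    chi2 = ((n_selfed - exp_selfed) ** 2 / exp_selfed
            + (obs_outcrossed - exp_outcrossed) ** 2 / exp_outcrossed)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# 1 selfed offspring out of 214; expected rate assumed to be 1/19 (5.26%).
chi2_val, p_val = selfing_chi_square(214, 1, 1 / 19)
print(round(chi2_val, 2), p_val < 0.01)
```

Under these assumptions the statistic agrees with the reported χ2=9.87, and about 11 selfed offspring would be expected where only one was observed.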

The distribution of unidirectional pollen dispersal distances is presented in Figure 3 together with the distribution expected under random mating of individuals. The mean pollen dispersal distance (24.10 m; SD=10.46) was significantly longer (Mann–Whitney U=51832.0; P<0.001) than the random expectation. Variance in male reproductive success was relatively high (mean=5.36; Var=193.0).

Figure 3 Comparison of seed- and pollen-flow distributions with those expected under random mating of individuals in VAL subpopulation.

In the parentage analysis, we found 23 probable mother–offspring pairs in the population, but no probable mother was found for 15 individuals. The most likely reasons for the lack of a probable mother are that (i) the mother has already died, (ii) the mother originated from another subpopulation, or (iii) the resolution of our parentage analysis was not adequate in these cases. The mean distance between these pairs, that is, the assumed seed dispersal distance (9.07 m; SD=9.23), was significantly shorter (Mann–Whitney U=3971.5; P<0.001) than the mean distance between individuals in the subpopulation (19.20 m; SD=13.80), suggesting restricted seed dispersal (Figure 3). Variance in female reproductive success was lower (mean k=5.36; Var=36.1) than in male reproductive success. However, the difference in reproductive variance between sexes was not statistically significant (Levene statistics: 1.86, df=1, df=2; P=0.177). Total variance in reproductive success was 77.9. This estimate of variation in offspring number would give an effective population size of 10.9 in the VAL subpopulation. The ratio of effective size to surveyed population size (38) was 0.284. This Ne/N ratio seems to be quite normal for plant populations. The Ne/N ratios given in Silvertown and Charlesworth (2001) were between 0.15 and 0.68 for various annuals, and in two perennials the ratios were 0.07 (Papaver dubium) and 0.11 (Eichhornia paniculata).

The axial variances of seed and pollen dispersal were 41.0 and 172.2, respectively, and the total parent–offspring variance was 127.1. In our VAL subpopulation, there were 38 individuals in the area of 1612 m2 giving a density estimate of 0.0236 ind/m2, and on the basis of population density and pollen and seed dispersal estimates this would give an estimate of neighbourhood size of Nb=37.6 for VAL subpopulation. This is very similar to the indirect genetic estimate of Nb=32.1 (Table 1). Given that the effective population size due to variation in offspring number was only 10.9 individuals, this would correspond to effective density of 0.0067 ind/m2 and the direct demographic estimate is then only Nb=10.8.
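The arithmetic behind these direct estimates can be followed in a short Python sketch. It combines the axial variances as one-half of the pollen variance plus the seed variance, which reproduces the reported total of 127.1, and then applies Nb=4πσ2D with the surveyed and the effective density. The function names are hypothetical; the numbers are those given above.

```python
import math


def total_axial_variance(pollen_axial_var, seed_axial_var):
    """Parent-offspring axial variance: half the pollen variance plus the seed variance."""
    return 0.5 * pollen_axial_var + seed_axial_var


def wright_neighbourhood_size(axial_var, density):
    """Wright's neighbourhood size, Nb = 4 * pi * sigma^2 * D."""
    return 4.0 * math.pi * axial_var * density


area_m2 = 1612.0
sigma2 = total_axial_variance(172.2, 41.0)               # about 127.1 m^2
nb_surveyed = wright_neighbourhood_size(sigma2, 38 / area_m2)
nb_effective = wright_neighbourhood_size(sigma2, 10.9 / area_m2)
print(round(sigma2, 1), round(nb_surveyed, 2), round(nb_effective, 2))
```

With unrounded densities the surveyed-size estimate comes out at roughly 37.65, in line with the reported 37.6, and the effective-size estimate at about 10.8. Because Nb is linear in D, replacing the surveyed size by the effective size scales the estimate by the Ne/N ratio.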

Discussion

Spatial genetic structure

In this study, we found microspatial genetic structure in each of the four S. tatarica subpopulations studied. Spatial autocorrelation analysis revealed significant fine-scale structure, which, if anything, increased in magnitude with the age of the subpopulation. The high level of linkage disequilibrium within subpopulations in S. tatarica also indicated that the subpopulations consist of patches of related individuals. Our field observations suggest that new subpopulations in S. tatarica are founded by a few individuals (Jäkäläniemi et al, 2005), and this founder effect is the probable reason for the linkage disequilibrium observed within young subpopulations. A relatively high amount of linkage disequilibrium at the subpopulation level may indicate recent population bottlenecks or admixture (McVean, 2002). Linkage disequilibrium values tended to be higher in older than in younger subpopulations, probably because the continuous isolation-by-distance process could have enhanced linkage disequilibrium in the older ones. This is supported by the fact that the Sp statistic values did not decrease with age, as would be expected if the spatial genetic structure were caused by founder effects.

Vekemans and Hardy (2004) have recently reanalysed data from mostly published studies and assessed the Sp statistic for 47 plant species. The Sp statistic varied between 0.006 and 0.263, and it was found to be significantly related to the mating system (higher in selfing species) and to the life form (higher in herbs than trees). The average Sp for herbaceous species, 0.0459 (Vekemans and Hardy, 2004), was between the original average (0.0364) and the average calculated for the restricted range of distances (0.0525) in S. tatarica. Both averages were close to the value for mixed mating species (0.0372) in the review.

We found negative association between the corrected Sp statistics and density within S. tatarica subpopulations. The Sp statistics appeared to be higher (and correspondingly the neighbourhood size smaller) in older and denser populations.

Indirect and direct estimates of neighbourhood sizes

In this study, indirect genetic estimates of Nb varied from 15.8 to 42.7 (Table 1). The Nb values in the subpopulations were about one-tenth to one-fifth of the entire subpopulation sizes, except in VAL subpopulation in which the Nb was about 85% of the total size.

Our direct demographic estimate of Nb for the VAL subpopulation was 37.6. This is very similar to the indirect genetic estimate of Nb=32.1 (Table 1). When we used the estimate of effective population size (10.9) instead of the surveyed size (38), the Nb estimate was only 10.8. However, given that there appears to be active inbreeding avoidance in S. tatarica, the estimate of Ne based on variance in offspring production may overstate the real decrease in polymorphism due to drift. If unrelated individuals produce highly polymorphic offspring, the level of genetic variation does not decrease as dramatically as could be expected from the Ne value alone.
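The drop from 37.6 to 10.8 can be reproduced with simple arithmetic. The sketch below assumes that the direct estimate scales linearly with the number (density) of individuals, as in Wright's formulation of neighbourhood size, and that the dispersal variance is left unchanged; the function name is hypothetical and is not taken from the analysis itself.

```python
def rescale_neighbourhood_size(nb_census, n_census, n_effective):
    """Rescale a direct Nb estimate from census size to effective size.

    Assumes Nb is proportional to density, so replacing the surveyed
    number of individuals with Ne multiplies Nb by Ne / N.
    """
    return nb_census * n_effective / n_census


# Values reported for the VAL subpopulation in the text.
nb_effective = rescale_neighbourhood_size(nb_census=37.6, n_census=38, n_effective=10.9)
print(round(nb_effective, 1))  # 10.8
```

Under these assumptions the rescaled value agrees with the reported estimate of 10.8.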

Both indirect and direct estimates of neighbourhood size have been obtained for the same species in only a few cases. These studies have been reviewed by Rousset (2001). In two of these 10 studies, major discrepancies between the findings of the two approaches have been reported, while in the others there was relative agreement. However, even in the investigations with similar results, the direct and indirect estimates of Nb typically differed by a factor of 2 (Rousset, 2001; Sumner et al, 2001). On the other hand, in a recent study by Fenster et al (2003), direct and indirect estimates of Nb matched very closely in an annual legume, Chamaecrista fasciculata.

The range of neighbourhood sizes observed in S. tatarica seems to be typical for perennial herbs: the estimate of Nb obtained by Chung et al (1999) for an endangered herb, Lycoris sanguinea var koreana, was about 25, and in a hybrid between Aconitum japonicum and A. jaluense the value of Nb was also about 25 (Chung and Park, 2000). Chung and Epperson (1999) have reported an Nb value of about 50 in Adenophora grandiflora, while a considerably lower Nb range (1.66–5.53) has been reported in Mimulus ringens (Karron et al, 1995). In Vekemans and Hardy's (2004) review, the mean neighbourhood size for herbaceous species was 21.8.
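The review's mean neighbourhood size of 21.8 is consistent with taking the reciprocal of the mean Sp for herbaceous species (0.0459) quoted earlier. The sketch below assumes the approximation Nb ≈ 1/Sp, which applies under two-dimensional isolation by distance at drift-dispersal equilibrium; it is an illustration of that relation, not necessarily the procedure used to obtain the estimates in Table 1.

```python
def neighbourhood_size_from_sp(sp):
    """Approximate neighbourhood size as the reciprocal of the Sp statistic.

    Assumption: Nb ~ 1/Sp (two-dimensional isolation by distance
    at drift-dispersal equilibrium).
    """
    if sp <= 0:
        raise ValueError("Sp must be positive for Nb to be defined")
    return 1.0 / sp


# Mean Sp for herbaceous species reported in the review.
nb_herbs = neighbourhood_size_from_sp(0.0459)
print(round(nb_herbs, 1))  # 21.8
```

Applied to the two S. tatarica averages (0.0364 and 0.0525), the same relation gives values of roughly 27 and 19, which fall inside the reported range of indirect estimates (15.8 to 42.7).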

Dispersal distances

The estimate of mean pollen dispersal distance was rather high (24.10 m) in S. tatarica and significantly greater than the random expectation, indicating that pollen disperses throughout the population. In an investigation of direct pollen dispersal in S. tatarica (Siikamäki et al, unpublished), UV-coated dust was found to move only 6.9 m on average (SD=9.9; N=540). However, these direct estimates of pollen dispersal do not represent successful fertilization events, as the father–offspring distances do.

The estimated pollen dispersal distances seem to be typical for perennial herbs. Pollen dispersal curves for both insect- and wind-pollinated species tend to be highly skewed, with most movement being relatively local and only occasional movement over very long distances. Most pollinations of the orchid Aerangis ellisii by hawkmoths occurred between plants no more than 5 m apart (Silvertown and Charlesworth, 2001). Godt and Hamrick (1993) have reported a range of 11.6–14.8 m for mean pollen flow in several populations of Lathyrus latifolius. In a Chamaelirium luteum population, the mean pollen dispersal distance was 10.4 m (Meagher and Thompson, 1987), and in Heloniopsis orientalis it was about 5.0 m (Miyazaki and Isagi, 2000). In a related species, Silene alba, over 85% of pollinations occurred within the first 40 m from the pollen source (Richards et al, 1999), which appears similar to S. tatarica (Figure 3).

Our estimate of the mean seed dispersal distance in the VAL subpopulation (9.07 m) indicated restricted seed dispersal. The estimate nevertheless appeared to be rather high for a species whose seeds have no special dispersal agent, even though seed dispersal distances have been estimated using parentage analysis in only a few cases. In Chamaelirium luteum, the reported mean seed dispersal distance (10.4 m; Meagher and Thompson, 1987) seems to be rather similar to that in S. tatarica.

Inbreeding

Our parentage-analysis results indicate that S. tatarica is predominantly outcrossing, with a minimal amount (0.47%) of selfing. This result does not exclude the possibility that there could be some biparental inbreeding in the subpopulations. However, assuming random mating in the VAL subpopulation, we would expect a greater than 10-fold increase in the proportion of selfed offspring. The most likely reason for the deficiency is that S. tatarica flowers seem to be protandrous, that is, the stamens develop before the pistils. The discrete male and female phases of flowers may thus prevent selfing. The loss of inbred offspring may also indicate that, in S. tatarica, there is selection against pollen involved in short-distance dispersal events, because individuals near each other are more likely to be related. In fact, the observed distribution of pollen dispersal distances included only a few short-distance dispersals (Figure 3). Avoidance of mating between closely related individuals was also supported by the fact that the directly observed dispersal distances of UV-coated dust were clearly shorter than the estimated father–offspring distances. A further fact supporting inbreeding avoidance was that Nei's genetic diversity was much higher among the offspring (D=0.217) than in the parental generation (D=0.076).

Mechanism of spatial genetic structure in S. tatarica

Genetic substructuring in plant populations may evolve as a consequence of sampling events that occur when the population is founded or regenerated, or when gene dispersal by pollen and seeds is restricted within a population. Our results suggested that the local spatial genetic structure in S. tatarica was attributable mainly to the isolation-by-distance process rather than to founder effects, even though we could not exclude the possibility that the latter mechanism may also be important in recently established subpopulations.

As we found spatial genetic structure in old subpopulations, there appear to be one or more mechanisms that maintain the observed spatial genetic structure in these subpopulations. Our parentage-analysis results suggested that, in S. tatarica, spatial genetic structure developed as a result of restricted seed dispersal, because pollen appeared to disperse throughout the VAL subpopulation. However, because Sp appeared to increase and Nb to decrease in older subpopulations of S. tatarica, pollen and seed dispersal and also recruitment of seedlings may be more restricted in older and denser subpopulations than in younger ones. High population density tends to reduce neighbourhood area (and increase the Sp statistic) in insect-pollinated species, because flight distances and thus pollen dispersal distances are shorter in denser populations (Silvertown and Charlesworth, 2001). At the Oulankajoki River, flood water is the main dispersal agent of seeds within and between subpopulations of S. tatarica (Tero et al, 2003; Jäkäläniemi et al, 2004), and the effectiveness of this agent may vary between old and young subpopulations. Flood water often covers the subpopulations in spring and would then be able to spread seeds over longer distances in the relatively open habitat of young subpopulations than in the closed vegetation of older ones. In addition to seed dispersal, seedling establishment is highly dependent on the amount of suitable habitat and safe microsites for seedlings. High S. tatarica density in the closed habitat of the older subpopulations seems to efficiently prevent the establishment of new recruits (Siikamäki et al, unpublished). Open young subpopulations are also more likely than older ones to receive seeds from other subpopulations (Jäkäläniemi et al, 2005).

In many of the cases where the mechanism behind the spatial genetic structure has been revealed in perennial herb species (eg Caujape-Castells and Pedrola-Monfort, 1997; Kang and Chung, 1997; Chung et al, 1998, 1999; Chung and Epperson, 1999; Chung and Park, 2000; Hardy et al, 2000; Miyazaki and Isagi, 2000; Stehlik and Holderegger, 2000; Williams et al, 2001), it appears to be an isolation-by-distance process, as in S. tatarica. Pollen and seed dispersal patterns similar to those of S. tatarica have been obtained in other herbaceous species. In an evergreen perennial herb, Heloniopsis orientalis (Miyazaki and Isagi, 2000), spatial population structure is maintained by very limited seed dispersal (>60 cm), even though pollen disperses over a wide range. Mean dispersal distances for all seeds (dispersed and undispersed) in Trillium grandiflorum were 0.446 and 0.169 m, and seed dispersal was highly localized relative to pollen dispersal (Kalisz et al, 2001).

Conservation implications

Subpopulation size, neighbourhood size, and migration between subpopulations are all important factors for the long-term survival of endangered species (Kawata, 1997, 2001). The mean number of individuals in S. tatarica subpopulations appeared to be reasonably large (Table 1). However, the ratio of effective size to surveyed size (Ne/N=0.284) in the VAL subpopulation was rather low, indicating that the effective size at the total population level may be much lower than the surveyed size. On the other hand, the observed inbreeding avoidance in S. tatarica may reduce the loss of genetic variation, so that the effect is not as strong as the estimated Ne/N ratio suggests.

In addition to population size, neighbourhood size may also be important for the survival of endangered species. Recently, Kawata (2001) has shown that neighbourhood size and total population size both independently affect the average number of homozygous deleterious loci per individual. If we apply the regression equation of Kawata (2001; Equation (2)) for the number of homozygous deleterious loci per individual, Y=59.36X1+21.68X2−0.01 (where Y is the number of deleterious alleles, X1=1/N, and X2=1/Nb), the number of deleterious alleles per individual in the larger subpopulations is reasonably low (c. 0.75 in the JAK and 0.90 in the PUR subpopulation). However, in the smaller subpopulations, the number of deleterious alleles is slightly larger (2.24 in the VAL and 1.76 in the NUR subpopulation). Given the reasonably large total population size and the observed migration (Nm=0.556) between subpopulations (Tero et al, 2003), and also the avoidance of inbreeding observed in this study, the accumulation of deleterious alleles and the loss of evolutionary potential in S. tatarica seem to be minor at the total population level.
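As an illustration of how the regression is applied, the following sketch evaluates the equation for the VAL subpopulation using the surveyed size (N=38) and the indirect neighbourhood size (Nb=32.1) given in the text. The function name is hypothetical, and the choice of these two inputs is an assumption about which values entered the calculation.

```python
def deleterious_loci_per_individual(n_total, nb):
    """Evaluate the regression Y = 59.36*X1 + 21.68*X2 - 0.01.

    X1 = 1/N (total population size) and X2 = 1/Nb (neighbourhood size),
    with the coefficients as quoted in the text from Kawata (2001).
    """
    x1 = 1.0 / n_total
    x2 = 1.0 / nb
    return 59.36 * x1 + 21.68 * x2 - 0.01


# VAL subpopulation: surveyed size and indirect Nb estimate from the text.
y_val = deleterious_loci_per_individual(n_total=38, nb=32.1)
print(round(y_val, 2))
```

With these inputs the equation gives about 2.23, close to the 2.24 reported for VAL; the small difference presumably reflects rounding of the inputs. The other subpopulations cannot be checked in the same way here, because their sizes are listed only in Table 1.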

In the Red Book of Finland (Rassi et al, 2000), S. tatarica is classified as an endangered (vulnerable) species. However, the demographic analysis of Jäkäläniemi et al (2005) suggested that the species is able to maintain viable populations, and our present analysis confirms that there are no special genetic threats to the survival of this species at the Oulankajoki River. On the other hand, the situation may be totally different in the other main distribution area in Finland. At the Kitinen River, the main threat to the species is the recent construction of hydroelectric power plants and water-level regulation, which prevent natural river dynamics and the creation of new vegetation-free habitat for the poorly competitive S. tatarica (Ulvinen, 1997; Rassi et al, 2000). At some point, survival of the species in the Kitinen area may require active translocation of individuals from the Oulanka area to supplement or reinforce the existing population. Natural migration between these metapopulations is not probable, because the rivers are over 100 km apart and separated by a watershed.

Our former study of the regional structure and the present results provide a guideline for the selection of individuals for transplantation purposes and define a meaningful within-population conservation unit. In this study, we found local spatial structure within subpopulations, and our estimate of patch width in the VAL subpopulation suggests that we should avoid collecting individuals less than 7.8 m apart for transplantation purposes. As our patch width estimate suggests, individuals within this distance are more related than the average in the subpopulation, and collecting individuals within this distance may result in a low level of genetic diversity among the sampled individuals.
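The spacing guideline could be applied in the field as in the sketch below, which greedily retains mapped plants that lie at least 7.8 m from every plant already retained. The plant labels and coordinates are hypothetical, and the greedy rule is only one simple way to implement the guideline; it does not necessarily return the largest possible sample.

```python
import math

MIN_SPACING_M = 7.8  # patch width estimated for the VAL subpopulation


def select_spaced_plants(plants, min_spacing=MIN_SPACING_M):
    """Greedily keep plants at least min_spacing metres from all kept plants.

    plants is a list of (label, x, y) tuples with coordinates in metres.
    """
    chosen = []
    for label, x, y in plants:
        far_enough = all(
            math.hypot(x - cx, y - cy) >= min_spacing for _, cx, cy in chosen
        )
        if far_enough:
            chosen.append((label, x, y))
    return chosen


# Hypothetical map of five plants (coordinates in metres).
plants = [
    ("a", 0.0, 0.0),
    ("b", 3.0, 4.0),   # 5 m from a: too close
    ("c", 8.0, 0.0),   # 8 m from a: kept
    ("d", 8.0, 9.0),   # 9 m from c: kept
    ("e", 12.0, 3.0),  # 5 m from c: too close
]
selected = select_spaced_plants(plants)
print([label for label, _, _ in selected])  # ['a', 'c', 'd']
```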