Introduction

Endangered species, whose populations are in many cases small and isolated, are susceptible to loss of within-population genetic variation by genetic drift, which results in random fluctuations of allele frequencies. Loss of genetic variation reduces the evolutionary potential of populations to adapt to changing environments and increases the likelihood of biparental inbreeding, which decreases the fitness of progenies (Ellstrand and Elam, 1993; Frankham, 2005; Leim et al., 2006; Ouborg et al., 2006). Consequently, affected populations tend to decline and can fall into an ‘extinction vortex’ (Gilpine and Soule, 1986). Thus, detailed information on within- and among-population genetic variation over species' ranges and the factors influencing their levels of within-population genetic variation are needed to conserve such populations of endangered species.

Two factors that are believed to strongly influence the level of genetic variation within populations are the effective population size and degree of population isolation (Cruzan, 2001; Silvertown and Charlesworth, 2001; Ouborg et al., 2006). Thus, it is important to consider the effects of these variables in analyses of genetic variation, but choosing the best measures for them is not straightforward. Many authors have used the actual number of individuals within investigated populations as a measure of population size, assuming that it is correlated with their effective population size, and found stronger relationships between this variable and genetic variation (for example, Young et al., 1999; Galeuchet et al., 2005; Gao, 2005; Leim et al., 2006). The degree of population isolation can be quantified by measuring positional relationships between a focal population and its surrounding populations. The simplest way of doing this is to measure the distance to the nearest population. This approach has been used by various authors (for example, Llorens et al., 2004; Galeuchet et al., 2005; Lu et al., 2005), but few have found significant relationships between this parameter and genetic variation (Lu et al., 2005), probably because it provides a too simplistic indication of isolation (Moilanen and Nieminen, 2002). The cited authors studied possible connectivity measures in spatial ecology and recommended use of a ‘buffer measure’ based on the number of populations within a given radius around the focal population rather than simply the distance to the nearest population.

The species studied here, Magnolia stellata (Sieb. et Zucc.) Maxim., is a threatened tree species endemic to the Tokai region of central Japan (Environment Agency of Japan, 2000). It occurs in swampy places such as small rivers and marshes (Ueda, 1988) and at 40–700 m (mainly 200–500 m) elevations (Japan Association for Shidekobushi Conservation, 1996), and forms hierarchical metapopulation structures with sets of adjacent local populations (Setsuko et al., 2007). About 75% of these local populations are comprised of less than 100 individuals (Japan Association for Shidekobushi Conservation, 1996). Most populations are intermittently distributed from southern Gifu to central Aichi Prefectures, but a small number of populations are isolated on the Atsumi Peninsula of southern Aichi Prefecture and in northern Mie Prefecture (Figure 1; Japan Association for Shidekobushi Conservation 1996). Since habitat destruction and vegetation succession have decreased both the numbers of populations and numbers of individuals within each population, the species is now listed as ‘vulnerable’ in the Japanese Red Data Book (Environment Agency of Japan, 2000). However, only a few populations are currently protected, including several designated as natural monuments by the national government, or authorities in Prefectures or cities.

Figure 1
figure 1

Locations of the 20 sampled Magnolia stellata populations (black dots) and the species' range (grey areas). Populations are represented by the codes listed in Table 1.

The genetic variation and differentiation in M. stellata populations have been investigated in three previous studies. First, Kawahara and Yoshimaru (1995) investigated nine populations using 15 allozyme markers, and found that the level of within-population genetic variation was lower in the isolated populations, while there was a high level of genetic differentiation among all populations (GST=0.254). In the cited study, however, few populations were sampled from the central part of the species' range. Second, Nakashima and Sakai (2003) investigated 14 populations sampled solely from the Tono region (southeastern Gifu Prefecture) using 10 allozyme markers, but did not identify the factors influencing the level of within-population genetic variation. Third, Ueno et al. (2005) investigated 11 populations, including the nine examined by Kawahara and Yoshimaru (1995), using four microsatellite markers developed in M. obovata, and also failed to identify factors influencing the level of within-population genetic variation, although they did detect some regional genetic structure. Contributory reasons for the failure of these three studies, and various studies of other species, to detect the factors influencing the level of within-population genetic variation were probably the regionally biased sampling of small numbers of populations and low resolution of the genetic markers used (allozymes and small numbers of microsatellites). Microsatellite makers, especially those developed in target species, generally have far larger numbers of alleles, including many rare alleles, than allozyme markers. Since the number of alleles is susceptible to random genetic drift, microsatellite markers are also suitable for detecting reductions in within-population genetic variation associated with random genetic drift. Thirty microsatellite markers were recently developed in M. stellata (Setsuko et al., 2005). We hypothesized that use of these markers to investigate a large number of populations extensively sampled from the species' range would allow us to identify factors influencing their levels of within-population genetic variation, which could not be detected in the previous studies.

The objectives of this study were (i) to estimate within- and among-population genetic variation in 20 M. stellata populations distributed across most of the species' range using 10 microsatellite markers, nine of which were developed in M. stellata (Setsuko et al., 2005) and one developed in the related species M. obovata (Isagi et al., 1999); (ii) to identify the factors influencing the level of within-population genetic variation, focusing on the effects of variations in the number of reproductive individuals within each population and the degree of population isolation represented by the number of surrounding populations and (iii) to consider possible conservation strategies for this species based on the findings.

Materials and methods

Study species and population sampling

M. stellata is a deciduous broad-leaved tree that grows up to 10 m in height and 20 cm in diameter at breast height. It reproduces both by seeds and by clonal growth (sprouts and layering) (Gotoh et al., 1998; Setsuko et al., 2004). Flowers are bisexual, protogynous and pollinated by insects, and mixed mating occurs (Hirayama et al., 2005). Fruits are aggregated, with at most two red seeds per follicle that may be dispersed by birds (Callaway, 1994).

The natural range of M. stellata was roughly divided into three regions: southern Gifu to central Aichi Prefectures (region A); the Atsumi Peninsula in southern Aichi Prefecture (region B) and northern Mie Prefecture (region C) (Figure 1). Twenty populations were randomly selected across its natural range as study populations (Figure1 and Table 1). A population was defined as an aggregation of individuals along streams or in marshes. We counted the number of reproductive individuals in flowering seasons in each population and defined this parameter as the population size. For DNA analysis, leaf samples were collected from 27 to 77 reproductive individuals in each population during 2004–2006 and stored at −30 °C until DNA extraction.

Table 1 Location, sample size and population size of the 20 investigated Magnolia stellata populations

DNA extraction, PCR amplification and microsatellite analysis

Genomic DNA was extracted from the sampled leaves using the hexadecyltrimethylammonium bromide method (Murray and Thompson, 1980) with minor modifications. Genotypes were determined using 10 microsatellite markers: nine (stm0148, stm0184, stm0191, stm0222, stm0223, stm0251, stm0334, stm0353 and stm0423) developed for M. stellata (Setsuko et al., 2005), and one (M6D8) developed for M. obovata (Isagi et al., 1999). PCR fragments were separated using a 3100 Genetic Analyzer and GeneScan software (Applied Biosystems) (see Setsuko et al., 2005, for details).

Data analysis

The following parameters were calculated at each locus and over all loci in each population using FSTAT ver. 2.9.3.2 (Goudet, 2001): allelic richness (AR), calculated with a fixed sample size of 27 (54 gene copies) (El Mousadik and Petit, 1996), gene diversity (HE; Nei, 1987) and inbreeding coefficient (FIS=1−HO/HE). Departures from Hardy–Weinberg equilibrium (HWE) at each locus and linkage disequilibrium between loci were tested in each population by an exact test using the Markov chain method implemented in GENEPOP ver. 3.4 (Raymond and Rousset, 1995), with sequential Bonferroni corrections (Rice, 1989). Assuming that the microsatellite loci used in this study were randomly sampled from an infinite number of loci, bootstrap resamplings among loci (20 000 samples) were used to create 95% bootstrap confidence intervals for AR, HE and FIS for each population. Deviations of FIS from zero over all loci in each population were tested using bootstrap confidence intervals of various percentages generated by the same bootstrap resampling method, with sequential Bonferroni corrections (Rice, 1989).

The total number of alleles detected (A), allelic richness (AR), observed heterozygosity (HO), average gene diversity within populations (HS), gene diversity in the total populations (HT), inbreeding coefficient (FIS) and the coefficient of genetic differentiation among populations defined under the infinite allele model (FST; Weir and Cockerham, 1984) and the stepwise mutation model (RST; Slatkin, 1995; Rousset, 1996) were calculated across all populations at each locus and over all loci using FSTAT. Overall tests for departures from HWE across all over populations at each locus and over all loci were performed by Fisher's procedure (Fisher, 1970). The significance of population differentiation at each locus was tested by the log-likelihood (G)-based exact test (Goudet et al., 1996), with sequential Bonferroni corrections (Rice, 1989), using the Markov chain method implemented in GENEPOP, and its significance over all loci was tested by Fisher's procedure (Fisher, 1970). To test the presence of isolation-by-distance patterns in population differentiation, the Mantel test between population-pairwise geographic distances and genetic distances (DA distances; Nei et al., 1983) was applied. A dendrogram was constructed to interpret the relationships among populations by the neighbor-joining method (Saitou and Nei, 1987) based on DA distances between populations using DISPAN (Ota, 1993). The node significances of the dendrogram were evaluated using bootstrap probabilities based on 10 000 replicates.

Linear mixed-effect models were used to investigate the relationships between the level of genetic variation within each population and both its population size and degree of population isolation. We defined the size of each population as the number of reproductive individuals (s) within it, and its degree of population isolation as the number of populations within radii of 0.5–10.0 km with 0.5 km intervals (20 distance classes; n0.5n10.0). We set the distance interval to 0.5 km since this was the minimum distance that significantly exceeded the measurement errors regarding the positions of populations. The number of surrounding populations was obtained from detailed distributional information of M. stellata populations published by the Japan Association for Shidekobushi Conservation (1996) together with our field observations. The population size and degree of population isolation were treated as fixed effects and the differences between loci as random effects. Two different response variables were used in the models: allelic richness (AR) and gene diversity (HE). Data for HE were arcsine-transformed to obtain closer approximations to normality. Forty-two candidate models with 1–3 fixed explanatory variables were constructed for each response variable. All models had fixed and random intercepts (the latter arose from the differences between loci). We estimated parameters in each model by the maximum likelihood method using R 2.3.1 (R Development Core Team, 2006).

To evaluate the candidate models, we calculated Akaike's information criterion (AIC; Akaike, 1973) values for each of them using the equation, −2 × (log-likelihood)+2 × (the number of estimated parameters). Models with low AIC values have higher likelihoods and contain smaller number of variables than models with higher values. A substantial advantage of using AIC, for the purposes of this study, is that it is valid for comparing non-nested models (Burnham and Anderson, 2002; Jhonson and Omland, 2004). The differences between the AIC values obtained for each of the models and the minimum AIC value (ΔAIC) were calculated, and each model with a ΔAIC value 2 was considered to be one of the most parsimonious, and thus selected as one of the best models (Burnham and Anderson, 2002). We also calculated Akaike weights, from exp[−0.5 × (ΔAIC of the focal model)]/Σexp[−0.5 × (ΔAIC)], to assess the relative support for each model (Burnham and Anderson, 2002). Akaike weights range from zero to one and are normalized so that they sum to one over all candidate models. Akaike weights are convenient for comparing the relative likelihoods of models in a set of candidate models.

Results

Within-population genetic variation

Two hundred tests for deviations from HWE at each locus in each population were applied. Significant deviations were detected by 16 of these tests. There were no significant deviations at any locus across more than 3 populations, and 6 of the 16 significant deviations detected were at 6 loci in the Asahidani population. In addition, 900 tests for linkage disequilibrium between pairs of loci in each population were applied, 20 of which detected significant deviation from equilibrium, but the deviation was not significant for any pairs of loci across more than 3 populations. The loci used in this study exhibited high polymorphism (Table 2). The total number of alleles detected (A) and average gene diversity within populations (HS) at each locus across all populations were, on average, 22.1 and 0.719, respectively. FIS values across all populations were positive at nine loci and over all loci, and departures from HWE were significant at the same nine loci and over all loci. The allelic richness (AR) and gene diversity (HE) of the populations ranged from 2.99 to 11.64 and from 0.517 to 0.863, with mean values of 7.45 and 0.719, respectively (Figures 2a and b). Levels of within-population genetic variation were generally higher in populations in region A (although they were substantially lower in three populations in this region: Iji, Obora and Kotohira) than in populations in regions B and C, except for the Tabika population. Unexpectedly, the Tabika population showed the highest genetic variation (HE=0.863 and AR=11.64) of all the populations, even though it was highly isolated from other populations (Figure 1) and its population size was modest (Table 1). FIS values of the populations ranged from −0.126 to 0.297, and averaged 0.073 over all the populations (Figure 2c). Significant positive deviations of FIS values from zero were detected in three populations (Chikusui-ike, Kotohira and Asahidani).

Table 2 Within- and among-population genetic variation in the 20 investigated Magnolia stellata populations
Figure 2
figure 2

Allelic richness (a), gene diversity (b) and inbreeding coefficient (c) in the 20 investigated Magnolia stellata populations. Dots and bars show values calculated using the original data set and their 95% bootstrap confidence intervals were estimated by 20 000 bootstrap resamplings among loci, respectively. Deviations of FIS from zero at each population were tested using bootstrap confidence intervals of various percentages generated by the same bootstrap resampling method, with sequential Bonferroni corrections (Rice, 1989). **P<0.01; ***P<0.001. Populations are represented by the codes listed in Table 1.

Among-population genetic variation

All the loci used in this study showed significant population differentiation (P<0.001) and the values of FST and RST over all loci were 0.185 and 0.274, respectively (Table 2). There was also a significant correlation between pairwise-population geographic distances and DA distances (R2=0.579, P<0.001, Mantel test; Figure 3). The neighbor-joining tree based on DA distances between populations seems to reflect the populations' geographical distribution (Figures 1 and 4). Four nodes of this tree were supported with high bootstrap probabilities (more than 88%). These nodes correspond to geographical locations at the edge of the eastern Gifu Prefecture (98%), the root of the Atsumi Peninsula (90%) and two within the peninsula (88 and 100%).

Figure 3
figure 3

DA distances between the 20 investigated populations of Magnolia stellata versus geographical distances between them.

Figure 4
figure 4

Neighbor-joining tree constructed using DA distances between the 20 investigated Magnolia stellata populations. The numbers shown are bootstrap probabilities exceeding 50% based on 10 000 replicates. The length of the bar is equal to the genetic distance of 0.1. Populations are represented by the codes listed in Table 1.

Factors influencing the level of within-population genetic variation

All regression coefficients of fixed explanatory variables (c, s, n0.5–n10.0) against the two response variables (AR and HE) estimated by the maximum likelihood method were positive. The models, providing the best explanations for AR and HE, had the same parameters (s and n0.5) with ΔAIC2 (Table 3). Akaike weights of these models were 0.611 and 0.237 for AR and HE, respectively. The models had two fixed explanatory variables: the population size (regression coefficients±95% confidence intervals: 0.0106±0.0047 for AR; 0.0004±0.0003 for HE) and the number of surrounding populations within a radius of 0.5 km around the focal population (0.3657±0.2080 for AR; 0.0230±0.0143 for HE) (Table 4).

Table 3 The best models (in bold), and four next best models, explaining the levels of within-population genetic variation of the 20 investigated Magnolia stellata populations, selected according to Akaike's information criterion (AIC)
Table 4 Fixed-effect explanatory variables of the linear mixed-effect models (random effect: locus) that best explained the level of within-population genetic variation of the 20 investigated Magnolia stellata populations, selected according to Akaike's information criterion (AIC)

Discussion

Within- and among-population genetic variation

The level of within-population genetic variation tended to be higher for populations in region A than for populations in regions B and C, in accordance with the results of previous studies (Kawahara and Yoshimaru, 1995; Ueno et al., 2005). This trend may be partly due to region A being central to the species' range, while regions B and C are marginal, and partly due to the populations in regions B and C being more isolated from one another than those in region A. A finding that was not observed in the previous studies was that some populations in region A (Iji, Obora and Kotohira) showed substantially lower variation than the others in this region, even lower in some respects than those in regions B and C (Figure 2). One of these populations is located at an unusually high altitude (the Iji population, located 700 m above sea level, while most populations are located 200–500 m above sea level) and the other two (Obora and Kotohira) at the edges of the species' range. Thus, these populations are in isolated environments, like those in regions B and C, and this may account in large part for their low levels of variation. We discuss the influence of isolation on the level of within-population genetic variation in the following section.

Interestingly, although the Tabika population was highly isolated and did not have a large population size, it showed the highest within-population genetic variation. High genetic variation in this population has also been revealed by nuclear microsatellite analyses in a previous study (in which it was called the Komono population; Ueno et al., 2005). Ueno et al. (2005) also examined cpDNA variation in 11 populations, and found that the level of variation was unusually high in the Tabika population. The unusually high genetic variation in the population could not be explained by contemporary gene flow due to its isolated distribution and its modest current size. Therefore, the reasons for it may be related to historical dynamics of the population and surrounding populations in region C.

Significant positive values of FIS estimated across all populations at nine loci (as indicated by their significant departures from HWE), and those over all loci in three populations, indicate an excess of homozygotes compared to expectations under HWE, main causes of which include inbreeding, population substructure and the presence of null alleles. M. stellata produces protogynous, hermaphrodite flowers that prevent autogamous selfing, but its asynchronous flowering within individuals leads to geitonogamous selfing (Hirayama et al., 2005). Selfing rates of 26.1–38.8% at the mature seed stage (Hirayama et al., 2007) and 5.7% at the germinated seedling stage (Setsuko et al., 2007) have been reported. In addition, the Asahidani population showed significant deviations from HWE at six of the loci used in this study and the highest value of FIS among the studied populations. These features were probably not due to locus characteristics such as the presence of null alleles, but to population characteristics such as inbreeding and population substructure. However, genotype frequencies at the locus stm0353 significantly deviated from HWE in three adjacent populations (Sue, Shizuno and Chikusui-ike), suggesting the presence of null alleles at the locus that are regionally distributed over the three populations.

Population differentiation among the 20 populations was significant (FST=0.185 and RST=0.274), and a significant isolation-by-distance pattern was also detected. The value of FST obtained in this study is very similar to that obtained by Ueno et al. (2005) using nuclear microsatellites (ΦSC, an analog to FST=0.18), but lower than that found by Kawahara and Yoshimaru (1995) using allozymes (GST, an analog to FST=0.254). Values of FST and GST cannot be simply compared between loci if the loci to be compared exhibit different levels of within-population genetic variation (Hedrick, 2005), as in the material examined here, since there were considerably higher values of HS at microsatellite loci (0.719, this study; 0.58, Ueno et al., 2005) than at allozyme loci (0.092, Kawahara and Yoshimaru, 1995). Hedrick (2005) addressed this problem, and proposed a standardized measure (GST), which allows more appropriate comparisons between loci that have different levels of genetic variation. Using the microsatellite data in this study and the previous allozyme data (Kawahara and Yoshimaru, 1995), GST values were estimated to be 0.683 and 0.283, respectively. Genetic differentiation for microsatellite loci was found to be more than twice that for allozyme loci.

The separation of populations in region B (the Atsumi Peninsula) from those in the other regions was supported by a bootstrap probability of 90%, indicating that the differentiation of populations in region B from the populations in the other regions is due to their geographical isolation. Furthermore, there was significant support, with a bootstrap probability of 100%, for the separation of populations (Kurogawa and Toshichibara) at the root of the Astumi Peninsula from even the populations (Ikawazu and Nagusa) at the tip of the same peninsula, indicating that there was little gene flow between populations at the tip and root of the peninsula, although they were located close to each other (about 5 km apart).

Factors influencing the level of within-population genetic variation

This study demonstrated significant relationships of genetic variation with population size and the degree of population isolation in M. stellata populations using linear mixed models. The model with population size and degree of population isolation (the number of surrounding populations in a radius of 0.5 km around the focal population) as fixed explanatory variables was selected as the best for explaining the level of within-population genetic variation in both response variables, AR and HE. The regression coefficients of the best models are all positive, implying that populations with large sizes and high numbers of surrounding populations within a radius of 0.5 km should have high levels of genetic variation. The best models predict that a decrease in population size, s (which ranged from 29 to 339 individuals among the studied populations), of 100 individuals corresponds to reductions in AR and HE of 1.060 and 0.039, respectively, and a decrease in the number of populations within a radius of 0.5 km, n0.5 (which ranges from 0 to 7 among the populations), of five populations corresponds to reductions in AR and HE of 1.829 and 0.113, respectively.

Many studies have reported significant relationships between population size and genetic variation (for example, Young et al., 1999; Galeuchet et al., 2005; Gao, 2005; Leim et al., 2006). However, a previous study on M. stellata published by Nakashima and Sakai (2003) did not detect such a relationship, possibly because the measure of population size they used was not the number of reproductive individuals but the number of individuals within populations. In contrast to the multitude of reported relationships between population size and genetic variation, few studies have detected significant relationships between the degree of population isolation and genetic variation (Cruzan, 2001; Lu et al., 2005). Since it is more difficult to measure the degree of population isolation than population size, good measures for quantifying the degree of population isolation have not yet been established. However, we detected significant relationships between the degree of population isolation and genetic variation, using the number of surrounding populations as the measure of isolation, whereas the distance from the nearest population was used in many previous studies (for example, Llorens et al., 2004; Galeuchet et al., 2005; Lu et al., 2005). Thus, the detection of a significant relationship in this study, and the failure of many of the cited studies to detect such relationship, supports the assertion by Moilanen and Nieminen (2002) that the number of surrounding populations is a much better indicator of isolation than the distance from the nearest population. In addition, Cruzan (2001) reported significant relationships between the degree of population isolation and genetic variation, using the number of surrounding populations. Important components affecting the degree of population isolation are likely to be the number and distribution of population in proximity to the focal population. However, the most important of these components appears to be the number of populations in the proximity of the focal population, especially for species that form hierarchical metapopulation structure, such as M. stellata.

The area encompassed by a radius of 0.5 km around each population, selected in the best model in this study, may cover the range of metapopulations in M. stellata because the number of populations within the area had significant effects on within-population genetic variation. This is consistent with observations published by Setsuko et al. (2007), who found the maximum distance that pollen was directly transferred across local populations within a metapopulation of the species to be ca. 420 m, although long-distance gene flow may occur over 500 m, at least through seed dispersal by birds. However, gene flow between populations within radii of 0.5 km in metapopulations may principally influence the level of genetic variation within populations in the species.

Conservation implications

The populations distributed in region A have been shown to contain higher genetic variation than the populations distributed in regions B and C, but even in region A populations located at an unusually high altitude (Iji) or at the western and southern edges of the range (Obora and Kotohira) showed similar levels of genetic variation to those in the populations distributed in regions B and C. These populations with low genetic variation (especially the Obora population) have small population sizes and high degrees of population isolation, and may be highly prone to further reductions in genetic variation through random genetic drift (Frankham et al., 2002; Primack, 2002; Hartl and Clark, 2006). Therefore, further reductions in the numbers of individuals in these populations should be avoided to prevent their extinction.

The neighbor-joining tree indicates significant differentiation between populations at the edge of the eastern Gifu Prefecture and the other regions, the Atsumi Peninsula and the other regions, and between the tip and root in the peninsula. If trees are to be transplanted from other populations to these populations to counter population decline in the future, use of trees from populations outside the regions should be avoided. In addition, even within the Atsumi peninsula, use of trees from neighboring populations would be most appropriate for transplanting.

In attempts to conserve rare species, only specific, large populations are often designated as conservation targets, although the surrounding populations may be important as sources of pollens and seeds with different genetic variations (Setsuko et al., 2007). We have found that not only the number of reproductive individuals within the focal population but also the number of surrounding populations within a radius of 0.5 km around it influences its level of within-population genetic variation. These findings support the notion that the metapopulation structure of M. stellata should be retained in attempts to conserve the species, and suggest that an appropriate conservation strategy could include the designation of areas encompassed by circles with a radius of 0.5 km as conservation units. Certain M. stellata populations have been designated as natural monuments for conservation purposes, which we support, but such measures only protect specific local populations that may not be sustainable in isolation. Thus, to ensure the long-term conservation of populations of M. stellata, the importance of surrounding populations around them must be considered.