Introduction

In forest trees, 90% or more of the total genetic diversity of a species is often found within individual populations1. Understanding the spatial genetic structure (SGS) in a population is important because spatial structure influences genetic parameters including mating system, selection, and genetic drift2. Investigating and comparing the SGS through generations offers important clues about historical and demographic forces that have affected the population3,4. Spatial autocorrelation analysis offers information about the SGS by comparing the structure of a population to a random arrangement and has been applied to various studies5. Bayesian clustering models describe multi-locus genotypes as clines or clusters to infer the SGS or recently created genetic barrier6,7. Comparing multiple Bayesian clustering models, including both spatial and non-spatial, is informative with additional independent inference methods, such as principal component analysis (PCA) or principal coordinate analysis (PCoA)8. For the investigation of genetic variation, microsatellite markers are commonly used on trees. Microsatellites, also known as simple sequence repeats (SSRs), are highly reproducible, polymorphic, codominant markers9. Microsatellite markers are useful to estimate the relationship or relatedness among individuals of unknown ancestry10 and have an advantage of cost-effectiveness11.

Integration of genetics into conservation is getting more important as genetic diversity critically contributes to the survival of a species12,13. Ex-situ conservation has a critical role in providing materials for restoration and preventing extinction, so it gets highly important when additional threats can lead the species to an extinction vortex14. The materials for ex-situ conservation need to be collected from distanced trees to avoid inbreeding and encompass the genetic variation that target populations have15. Typically, the goal of gene conservation is to capture at least one copy of 95% of the common alleles that occur in target populations at frequencies over 0.0515, and it is impractical to set the goal to include whole alleles because it requires a much larger sampling size1. The sampling size to reach this goal differs by species because of the distinct characteristics of each species, including the mating and pollination systems1. Therefore, to establish an appropriate sampling strategy for the target species, we need to understand the pattern and extent of the genetic variation among and within its populations.

For the restoration and conservation of tree species, seeds are the primary source of propagation because most tree species rarely propagate asexually in nature16,17. Seed characteristics need to be understood to efficiently collect the germplasm for conservation. Seed quality is a major factor that affects the seed sampling strategy17. The important indicators of seed quality include purity, germination percentage, and viability. While germination percentage is more easily tested, some researchers18 argue that viability is more important. They noted that germination percentage can provide imprecise information because we lack knowledge on dormancy and germination. Genetic diversity or genetic structure of the population is another important factor for the establishment of a good seed sampling strategy. An increase in homozygosity leads to embryo death, more empty seeds, and lower viability in various species, including Abies19. Therefore, genetic diversity of the population is at the core of sustainable reproduction and regeneration.

Abies nephrolepis (Trautv. ex Maxim.) Maxim. is a wind-pollinating conifer belonging to the section Balsamea of Abies20. It is naturally distributed in the subalpine or boreal forests of eastern Russia, northeast China, and the Korean peninsula21,21. It was assessed as Least Concern (LC) in 2010 (published in 2013) according to IUCN Redlist22. The main threat to the species is known to be logging21, but recently, the populations in South Korea have been affected by climate change. The Korea Forest Service noted that A. nephrolepis showed a decline rate of 31% in 2019–2020, which was estimated to be caused by higher temperature and droughts23 It is expected that the potential habitats of A. nephrolepis on the Korean peninsula will decrease to 36.4% by 2070–2099, and most regions in South Korea were expected to be inhabitable24. Populations of A. nephrolepis in South Korea had a low level of genetic differentiation and only about 3–8% of total genetic variation resulted from the variation among populations25,26,27. These studies suggest that we could make the conservation more efficient by focusing on several populations.

Our goal was to produce insights that lead to the establishment of an effective conservation plan in a southernmost population of A. nephrolepis. For this, this study had two main objectives: (1) to understand the within-population genetic variation in A. nephrolepis by investigating the genetic diversity and SGS, and (2) to identify the characteristics of seeds in a population. The study was conducted at Mt. Hambaeksan, South Korea, which was the southernmost population having dominant mitochondrial DNA haplotypes of A. nephrolepis28. We will discuss the relationships among genetic variation, seed quality, and spatial process.

Results

Genetic diversity and sampling simulation

Genetic diversity was quite high in the population of A. nephrolepis at Mt. Hambaeksan (Table 1). Among the nine microsatellite markers, two markers, AK171 and AK252, were estimated to have the null alleles (estimated frequency over 0.05). However, null alleles were uncommon with frequency under 0.2, so we considered it acceptable for further analysis29. All amplified loci were polymorphic having over three effective alleles. All individuals were distinguished by their genotypes using any combination of three markers. A Mantel test showed that there was no significant isolation by distance (IBD) (R2 = 0.0002NS, NS: not significant). Genetic diversity indices were also calculated on two different diameter classes (Table 2). The mean number of effective alleles and the mean expected heterozygosity of each class were similar. The group of bigger trees (BT) showed a relatively lower mean number of alleles, observed heterozygosity, and fixation index, but within the error range.

Table 1 Genetic diversity indices estimated in the population of A. nephrolepis at Mt. Hambaeksan, South Korea over nine microsatellite loci.
Table 2 Genetic diversity indices of different diameter class of A. nephrolepis at Mt. Hambaeksan, South Korea.

According to the sampling simulation study, sampling 20 individuals ensured capturing over 95% of the total common alleles on average (Supplementary Fig. S1). When we checked the success rate of capturing sufficient common alleles (over 95%) at every locus, sampling individuals only showed 31.1% of the success rate (Supplementary Fig. S1). At least 25 individuals were needed to make over 50% of the samplings to be successful, and 40 individuals were required to exceed 90% (or 95%).

Spatial genetic structure

We found a weak SGS within the study population of A. nephrolepis at Mt. Hambaeksan. Significantly positive SGS was found up to 30 m (Fig. 1a). Individuals were in genetic randomness from the distance interval of 30–40 m, and until 400 m which was approximately the farthest distance between the trees (Supplementary Fig. S2). When we investigated SGS by diameter class, the group of bigger trees (BT) showed a similar result to that of all individuals (Fig. 1b). They had significantly related individuals until 30 m and distributed in genetic randomness from the distance interval of 30–40 m. In contrast, the group of smaller trees (ST) did not show significant SGS in all distance classes (Fig. 1c).

Figure 1
figure 1

Correlogram of the population of A. nephrolepis at Mt. Hambaeksan in 10 m distance intervals. The blue line represents values of correlation coefficient (r) against the geographical distance between individuals with black error bars of 95% confidence interval. The red dotted line indicates the upper and lower limits of the null hypothesis with a 95% confidence interval. (a) Correlogram of all individuals, (b) correlogram of the bigger trees (BT), and (c) correlogram of the smaller tree (ST).

The Sp value obtained from all individuals was 0.0036 (F1 = 0.0103***, bF = − 0.0036***, ***p < 0.001). When the values were calculated by the diameter group, BT had Sp = 0.0081 (F1 = 0.0270***, bF = − 0.0079***), which was higher than the value obtained from all individuals. However, ST had a lower value of Sp = 0.0021 (F1 = 0.0030NS, bF = − 0.0021NS), and it was not statistically significant. It indicated that the bigger trees had a greater level of SGS. Overall, SGS found in all individuals could have resulted from the SGS in the bigger trees.

The optimal number of clusters (K) inferred by two Bayesian clustering models, GENELAND and STRUCTURE, was inconsistent. In 20 independent runs in GENELAND, all runs produced consistent results, K = 2 (Fig. 2a, b). However, the two clusters had an FST of 0.009 (p = 0.001), which was a very low level of differentiation. Principal coordinate analysis (PCoA) failed to demonstrate a clear clustering that coincides with the GENELAND result (Supplementary Fig. S3). The best K inferred by STRUCTURE was different depending on the methods. Pritchard’s method presented K = 1 as the best K, but Evanno’s ΔK method presented K = 2 (Fig. 2c). PCoA did not demonstrate a clear clustering that coincides with the result of STRUCTURE of K = 2 (Supplementary Fig. S3). In this study, K = 1 by Pritchard’s method was chosen as the optimal number of clusters estimated by STRUCTURE.

Figure 2
figure 2

Results of Bayesian clustering models of the population of A. nephrolepis at Mt. Hambaeksan, South Korea. (a) The optimal number of clusters estimated from GENELAND, (b) map of posterior probability estimated from GENELAND, and (c) the optimal number of clusters estimated from STRUCTURE. X and y coordinates are the geographical coordinates of the individuals in Universal Transverse Mercator (UTM52N).

Cone and seed characteristics

The 31 mother trees included all common alleles and 76.61% of total alleles. On average, the mother trees had a diameter at breast height (DBH) of 14.55 cm (± 3.86 cm) and the smallest tree had a DBH of 5.2 cm. The mother trees produced 55.77 (± 67.36) cones on average, and there was a huge variance among trees. The least productive tree only produced five cones, while the most productive one produced about 270 cones. Overall, the top 20% of the most productive mother trees produced 61% of the total cones, and 81% of the cones were produced by the top 42% of the mother trees.

The cone and seed characteristics of the sampled mother trees (Table 3) were similar to a previous study that analyzed 5 different populations in South Korea30. However, the purity was approximately 10% lower than the forest seed standard quality investigated by the Forestry Research Institute of South Korea31, which is now known as the National Institute of Forest Science.

Table 3 Cone and seed characteristics resulted from the cone analysis in the population of A. nephrolepis at Mt. Hambaeksan, South Korea.

Benefiting from the long pre-chilling period, seeds germinated early (Supplementary Fig. S4). A seed was first germinated only a single day after the germination test began. The mean germination time was 7.4 days and the germination energy on day 6 was 15.4%. In the end, the germination percentage was 32.2% and the viability was 34.2%. The survivorship of 480 seedlings after 2 months was 77.5% and the survivorship after 6 months dropped to 64.2%.

Correlation among seed characteristics

There were significant relationships between cone characteristics (Fig. 3). The cone length, cone width, dry weight of a cone, 100-seed weight, and seed potential had positive relationships with each other. All of them also had a positive relationship with purity. However, when it comes to the germination percentage, only 100-seed weight had a significant relationship.

Figure 3
figure 3

Results of correlation analysis on cone characteristics in the population of A. nephrolepis at Mt. Hambaeksan, South Korea. CL, CW, DW, W100, SP, GP_sqrt, P_cube each represent cone length, cone width, dry weight 100 seed weight, seed potential, germination percentage after square root transformation, and purity after cube transformation. ***p < 0.001.

There were also several significant relationships between the mother tree characteristics. The DBH and cone abundance had a positive relationship (r = 0.6053***), and the geographic position was also related to seed quality and quantity (Supplementary Fig. S5). Mother trees at higher altitudes produced more cones (r = 0.4037*, *p < 0.05), but they had lower purity (r = − 0.3905*). We observed a weak tendency between genetic distance and seed quality (Supplementary Fig. S6), however, correlation tests failed to discover a significant relationship. Neither genetic distance (r = 0.3145NS, p = 0.08) nor the number of trees within the distance of spatial autocorrelation (30 m) (r = − 0.1900NS, p = 0.31) was not significantly related to the seed quality.

Discussion

Fine-scale spatial genetic structure

In the result of STRUCTURE, the optimal number of clusters estimated by Pirtchard’s method and Evanno’s ΔK method were different from each other (K = 1 and K = 2, respectively). Although Evanno’s ΔK method was developed to improve the detection of the optimal number, it cannot calculate the value at K = 1, therefore Evanno’s ΔK method selects K = 2 more frequently32. Some researchers also evaluated that Evanno’s ΔK method only brings little improvement compared to Pritchard’s method33. The two Bayesian clustering models, GENELAND and STRUCTURE, also failed to achieve consensus in estimating the optimal number of K within the population. There were several studies reviewing multiple Bayesian clustering models together34,35, and they reported that a low level of genetic differentiation can increase the error rate in the Bayesian clustering model. The two clusters found with GENELAND in A. nephrolepis at Hambaeksan had a very low level of differentiation, and this could have increased the error rate. Also, although the correlated frequency model is better for the populations with low differentiation, researchers noted the problem of poorer accuracy35,36, and previous studies reported overestimation of K under the correlated frequency model in GENELAND37,38. Also, the results of PCoA supported the absence of clustering within the population, and there was no apparent abiotic barrier that can explain the result of GENELAND. For these reasons, we concluded that the correlated frequency model of GENELAND overestimated K, and the genetic clusters had not yet developed within the population.

SGS in the population of A. nephrolepis at Mt. Hambaeksan was not significant and the estimated Sp value (0.0036) was relatively weak compared to the average of outcrossing species (0.0126) or wind-dispersed species (0.0064)39. However, it was akin to Sp values of other species in Abies40,41,42. While there was no significant SGS in ST, BT had a stronger SGS. Previous studies found that younger generations of outcrossing species can have a weaker SGS43,44,45. According to these studies, an increase in genetic structure towards older life-history stages can result from local selection, historical events, and demographic factors. In this study, BT showed a higher level of inbreeding coefficient (F = 0.036) compared to ST (F = − 0.011). Considering the existence of a transmitting station right near the study area, populations of A. nephrolepis in the study area could have experienced a bottleneck event caused by the anthropogenic disturbance, then slowly recovered the genetic diversity by active gene flow in younger generations. However, we could not observe a clear decrease in the number of alleles in BT (Table 2) which is the major evidence of bottleneck events45,46. Therefore, it was hard to conclude what caused the difference in SGS between the diameter class of A. nephrolepis at Mt. Hambaeksan. Moreover, although DBH is commonly used as an indirect indicator of age, it is hard to compare the diameter class directly to the age structure. Among the sampled individuals in this study, we extracted cores from the two trees that belonged to different diameter classes, but they were estimated to have similar ages (Supplementary Table S3). To study the change in SGS through generations, we should sample individuals in various life-history stages, including seedlings, saplings, and juveniles, or investigate the age structure more directly using cores.

Seed quality

The seeds and cones of A. nephrolepis at Mt. Hambaeksan mostly had similar characteristics to other populations, but the purity was lower than the standard quality. Considering that the standard quality was published in 1994, changes in climate conditions could have resulted in poorer purity. According to the data from the nearest weather station47, the sampling year had a higher temperature and less precipitation compared to the normal year (past 30 years): the average annual temperature was about 0.5℃ higher, and annual precipitation was about 100 mm less. There were several studies on the relationship between climate change and seed quality. They found that the increased temperature and moisture stress can reduce the seed yield and quality, causing shortened duration of stigma receptivity48. Temperature irregularity affects the flowering timing, which causes asynchronous flowering within a population49. Therefore, the increased temperature might have affected pollen production or pollination of A. nephrolepis. Especially, the genus Abies lacks pollination drop and uses rainwater as an alternative19. Therefore, decreased precipitation could have decreased pollination success.

The correlation analysis between seed characteristics found that more cones existed in trees with bigger DBH. This result corresponds to the previous research that noted, for some species in Pinaceae, older trees produced more cones while younger trees produced more pollen50. The amount of pollen of A. nephrolepis was not measured in this study, therefore, further studies might offer interesting information about the relationship between age and strobili production. There was a positive correlation between seed weight and seed quality. With this result, prioritizing heavier seeds in propagation seems efficient. For sampling, direct evaluation of seed weight is impossible because it needs post-collection treatment, such as drying or cleaning. Cone weight can be used as a good alternative because it had the strongest correlation with the seed weight and also had a positive relationship with the purity.

Among the geographic factors, altitude showed a relationship with purity and cone abundance. The decrease in purity along the altitude could have been caused by the decrease in pollen pool. As the study plot was located right below a transmitting station on the summit, the individuals at the high altitude were on margin, therefore had a decreased pollen pool, which leads to poor reproductive success. This leaves important implications in seed production and regeneration of A. nephrolepis under climate change. Several studies expected that the lower limit of subalpine or alpine conifers will retreat upward due to climate change and have a decrease in its central area. The lowest limit of regeneration of Abies georgei Orr had retreated upslope 31 m per decade51, and A. nephrolepis was expected to lose its lowlands habitats52. Considering a decrease in purity at higher altitudes of Mt.Hambaeksan, loss and upward-retreat of habitable areas will result in poor regeneration of A. nephrolepis in South Korea. Therefore, the need for ex-situ conservation gets more important to adapt to climate change.

The correlation analysis failed to find a significant relationship between genetic relatedness and seed quality. Because the genus Abies has developed a post-zygotic barrier, it was expected that mother trees having more genetically close pollen trees would have poorer seed quality than the opposite. The insignificance could be because of the sufficient gene exchange among the mother trees. A. nephrolepis was a dominant species in the study plot and occupied the upper canopy. As A. nephrolepis is a predominantly outcrossing wind-dispersed conifer, occupying the upper canopy allows active pollen dispersal.

Strategy for ex-situ conservation

With this study, we could establish several sampling and management strategies for A. nephrolepis at Mt. Hambaeksan. For the ex-situ conservation of A. nephrolepis at Mt. Hambaeksan, materials need to be collected from distant individuals that are at least 30 m away to avoid genetic relatedness. To encompass sufficient genetic diversity, at least 20 individuals have to be sampled within the study population. However, to secure the common alleles at every locus, much more individuals need to be sampled. Keeping sampling individuals distant, over 30 m away, will help reduce the optimal number of sampling. In this study, the 31 mother trees, which were 35 m apart from each other on average, included all common alleles.

We could also establish some strategies for seed collection. Although trees with bigger DBH had more cones in a tree, it did not ensure seed quality or genetic diversity. Therefore, cones must be collected from various distant trees, regardless of size. However, the geographical position of the mother trees should be considered. It is better to avoid mother trees on the margin because they could have less gene exchange and lower purity. Within a tree, collecting heavier cones would be effective, as heavier cones have a higher probability of good quality. After the seed cleaning, heavier seeds should be chosen, as they are more likely to germinate.

Ex-situ conservation is more emphasized and important for the species under threat, however, for the effective conservation of A. nephroelpis, in-situ conservation should be carried out together. A previous study on the conservation of Taxus cuspidata Sieb. et Zuuc., a species with a similar genetic and geographical distribution pattern to A. nephrolepis, pointed that ex-situ gene conservation might reduce overall genetic diversity and population size53.

Conclusions

This study aimed to understand the genetic variation and seed characteristics in a southernmost population of A. nephrolepis. The flowering individuals had a positive genetic relationship until 30 m, so we concluded 30 m is the optimal sampling distance of A. nephrolepis at Mt. Hambaeksan. The overall extent of the SGS was weak, but it was stronger among individuals in a bigger diameter class. There was no cluster found within the study population, therefore we considered sampling 20 individuals over the site would be enough to secure the genetic diversity of the population. The cones and seeds had similar characteristics to other populations in South Korea. Correlation analysis revealed that the DBH was positively related to the cone abundance and the seed weight was the most effective indicator of the seed quality. The purity was relatively low, and it got lower as the altitude got higher. Mother trees that were genetically closer to other individuals seemed to have poorer seed quality, but it was insignificant.

This study investigated SGS and seed quality together in a population of A. nephrolepis. This study contributes to the establishment of an effective ex-situ conservation of A. nephrolepis in a southernmost population in South Korea. Further studies including demographic and historical factors, mating systems and pollen dispersal need to be continued for a better understanding of the species. The impact of climate change on the regeneration of A. nephrolepis needs to be studied, as we inferred that higher temperature and upward-shifting reduction of habitats could disturb the healthy seed production.

Materials and methods

Sampling

Sampling was carried out in a natural population of A. nephrolepis at Mt. Hambaeksan, South Korea. The study area was about 5 ha, located on the northern aspect near the mountaintop (latitude: 37.161° N–37.164° N, longitude: 128.918° E–128.924 ˚E, altitude: 1482–1545 m). Within the plot, all mature flowering individuals were sampled. A. nephrolepis forms strobili and pollinates in April to May, and its seeds mature in September to October54. We defined a mature flowering tree as a tree with cones (strobili) at the time of sampling. We collected fresh needles from 165 individuals (Fig. 4) in late September 2021, and measured their height and diameter at breast height (DBH). Cones were sampled from 31 mother trees within the same plot in late September 2021 (Fig. 4). The average neighboring distance of the mother trees was 35 m. We collected at least three cones from each mother tree and a total of 141 cones were collected. Geographic positions (latitude, longitude, and altitude) were recorded in coordinates within a 3–4 m error range using Garmin GPSMAP 64 s (Garmin, Olathe, KS, US) and drawn on the map using QGIS 3.2855. All methods were carried out in accordance with the IUCN policy statement on research involving species at risk of extinction. Sampling was carried out with the permission of the Korea National Park Service. Our research team and Dr. Suh G.U. from the Korea National Arboretum undertook the formal identification of the plant material used in our study. The voucher specimen of the studied material has been deposited in the herbarium of the Korea National Arboretum (the deposition number: KNKA201106145001, KNKA201106145006, KNKA201106145007).

Figure 4
figure 4

Geographic positions of the sampled A. nephrolepis at Mt. Hambaeksan. Dots indicate the location of sampled trees. Mother trees were marked by red dots with bold letters. Gray ray italic letters indicate altitude (m). Map of contour line was created by National Geographic Information Institute56.

DNA extraction and PCR amplification

DNA was extracted from the fresh needles using QIAGEN DNeasy Plant mini kit (QIAGEN, Hilden, Germany) and GeneAll Exgene Plant SV mini kit (GeneAll, Seoul, Korea). We followed the protocol of each kit with slight modifications. The modifications included increasing the time of 65 ℃ incubation and ice incubation to 30 min. The concentration and quality of the genomic DNA were measured with NanoVue Plus (Biochrom, Holliston, MA, USA).

For genotyping, nine microsatellite markers, AK87, AK171, AK173, AK176, AK246, AK247, AK25257, As13, and As2058 (Supplementary Table S4), were selected based on their polymorphism and amplification pattern. PCR amplifications were performed with a total volume of 12 μl containing 20 ng DNA, 1 X A-Star Taq Reaction Buffer, 2.5 mM MgCl2, 0.2 mM dNTP Mix, 0.5 U A-Star Taq DNA polymerase (BioFact, Daejeon, Korea), 0.04 μM HEX/FAM M13(-19) primer, and 0.2 μM of each reverse and forward primer. PCR programs were performed as described in each reference.

Fragment analysis was performed with PCR products by the National Instrumentation Center for Environmental Management (NICEM) in Seoul National University, with GeneScan 500 ROX size standard and ABI 3730 XL DNA Analyzer (Applied Biosystems, Waltham, MA, USA). Genotypes were scored using Microsatellite 1.4.7 plugin in Geneious Prime 2022.1.159. Large allele dropout, stutter allele, and null allele were checked and adjusted using MICRO-CHECKER60 with the Oosterhout algorithm, 1000 permutations.

Genetic diversity and sampling simulation

Genetic diversity indices were calculated using GenAlEx 6.50361. The number of alleles (A), number of effective alleles (AE), observed heterozygosity (HO), expected heterozygosity (HE), and fixation index (F) were calculated. To identify the degree of isolation by distance (IBD), we performed a Mantel test with 999 permutations between genetic distance and geographic distance of individuals.

A sampling simulation study was performed to find the minimal sampling size that can reach the genetic diversity. Here, the simulation goal was to secure over 95% of common alleles, which appear at over 0.05 frequency in the populations. If the simulated sampling reached the goal, we defined it as “success”. The sampling simulation study was carried out in Python 3.10.662. Individuals were randomly sampled with a sampling size of 5–50 and checked whether they confirmed the goal (“success”). The random sampling was permuted 1000 times at each sampling size, and we counted the number of successes during 1000 permutations.

Spatial genetic structure

SGS was investigated by three methods: spatial autocorrelation analysis, Sp statistic and Bayesian clustering model. This study only sampled adult trees, but we checked whether SGS differs by diameter class within the same life-history stage. To group the individuals into two diameter classes, the median (15.3 cm) of the DBH was used as a criterion. Individuals bigger than the median were grouped into the bigger trees (BT; 84 individuals with an average of 19.2 cm), and the others were grouped into the smaller trees (ST; 81 individuals with an average of 12.4 cm).

For spatial autocorrelation analysis, the correlation coefficient (r) of Smouse and Peakall63 was used. The correlation coefficient was calculated using GenAlEx 6.50361. The analysis was performed on even distance classes (10 m, 20 m, 40 m). Correlograms were plotted at the end point of the distance interval. Statistical significance of r values and correlograms were determined using 999 permutations and 1000 bootstraps, respectively.

Sp statistic, developed by Vekemans and Hardy39, enables the comparison of the extent of SGS among different populations or species. Sp statistic estimates SGS with regression slope of kinship coefficient (Fij) against spatial distance: \({S}_{p} = -{b}_{F} / (1-{F}_{1})\), where bF is the regression slope of kinship coefficient on the log of distance and F1 is the mean of Fij between individuals belonging to a first distance interval. SPAGeDi 1.564 was used to calculate the Sp statistic. Fij of Loiselle et al.65 was used, with an even distance class (20 m). Statistical significance of bF and F1 was determined with 10,000 permutations.

To identify the genetic clusters within the study population, two different Bayesian clustering methods were used: GENELAND66 and STRUCTURE67. GENELAND 4.9.2 was performed in R 4.2.0 environment68. Twenty independent Markov chain Monte Carlo (MCMC) runs were performed with 500,000 iterations and 500 thinning intervals. Spatial model and null allele model were assumed under the possibility of having 1–10 populations (K). Correlated frequency model was selected, as it is more appropriate to detect subtle structures with a low differentiation35. There was a burn-in of 200 saved iterations. Among the 20 independent runs, a run with the highest log posterior probability was selected for further analysis. STRUCTURE 2.3.4. was run with 20 independent simulations for each value of K (1–10). The length of the burn-in was set to 100,000, and 100,000 MCMC iterations were run after the burn-in period. Admixture model and correlated frequency model were assumed. With the results, major modes on each K were obtained using CLUMPAK69. To determine the optimal number of K, Pritchard’s method using LnP(K)67 and Evanno’s ΔK method70 were both considered. STRUCTURE HARVESTER71 was used to obtain the results of the two methods. If clusters were detected, FST between the clusters was calculated using the analysis of molecular variance (AMOVA) in GenAlEx 6.50361, and statistical significance was determined using 999 permutations. Principal coordinate analysis (PCoA) was performed as an additional tool to validate the K value. PCoA was also conducted in GenAlEx 6.50361.

Cone analysis and germination test

We selected three undamaged mature cones of each mother and measured their width and length. The cones were dried until the scales were naturally separated from the cone axis. The dried cones were weighed using a precision balance (QUINTIX 213-1SKR, Sartorius, Gottingen, Germany), then seeds were extracted and cleaned from the dried cones. After the cleaning, seed potential, percent developed seeds, percent damaged seeds, purity, and 100-seed weight were investigated72,73,74 (Supplementary Table S4).

All pure seeds were stored in a 3 ℃ refrigerator for 90 days during the winter for pre-germination chilling (pre-chilling). The seeds were placed on 90 mm Petri dishes with two layers of No.2 filter paper (Hyundai Micro, Anseong, Gyeonggi, Korea). After the pre-chilling, a germination test was conducted in March. Seeds were placed on two layers of No.2 filter paper in a Petri dish (Hyundai Micro, Anseong, Gyeonggi, Korea). The germination test was carried out for 28 days in a growth chamber (BF-600THG, BioFree, Seoul, Korea) in March 2022. The growth chamber was maintained at 20 °C temperature and 50% of humidity. Eleven hours of dark and 13 h of light were provided every day. Seeds were counted as germinated when a radicle or cotyledon was developed over 1 mm. After the test, germination percentage, mean germination time, and germination energy was calculated (Supplementary Table S2). The germination energy was based on day 6, which was the day when the most seeds germinated. Seeds that failed to germinate were cut to investigate the existence of the embryo viability18,75,76. Sixteen seedlings of each mother tree were transplanted to the soil and tested for survivorship in April. Thirty mother trees were used except mother tree M12 (Fig. 4), which lacked germinated seedlings. A total of 480 seedlings were transplanted into pots of 3 cm width and 5 cm depth. Soil was sterilized with an autoclave before use. The seedlings were stored together in the laboratory and greenhouse at 20–30 °C. The survivorship was counted at 2 and 6 months after the transplantation. Seedlings were considered dead when the whole leaves turned brown.

Statistical analysis

Statistical analysis was implemented to find the relationship among genetic, spatial, and seed characteristics in R 4.2.068. Skewed data sets were normalized by log, cube, square, or square root transformation. The normality and equal-variance of the data sets were checked before the correlation test. Pearson’s correlation test was conducted to identify the relationship.

The analysis was conducted on two levels: (1) on cone characteristics and (2) on mother tree characteristics. On both levels, we aimed to find which observable characteristics are related to the seed quality and quantity. We used purity and viability to quantify seed quality: \(seed \;quality = normalized \;purity \times viability\), and considered this value as a comparative indicator. On mother trees, we aimed to find whether genetic or spatial factors also affect the seed quality. To quantify genetic relatedness, two methods were considered in this study: (1) genetic distance and (2) the number of neighboring trees. The genetic distance62 between individuals was calculated using GenAlEx 6.50360. With the result, the sum of genetic distance to all potential pollen donors was computed on each mother tree. If the value was higher than the others, we considered that tree was more related to overall individuals. The number of neighboring trees was counted based on the result of spatial autocorrelation analysis. Trees within the distance of genetic relatedness were counted on each mother tree.