Conservation recommendations for Oryza rufipogon Griff. in China based on genetic diversity analysis

Over the past 30 years, human disturbance and habitat fragmentation have severely endangered the survival of common wild rice (Oryza rufipogon Griff.) in China. A better understanding of the genetic structure of O. rufipogon populations will therefore be useful for the development of conservation strategies. We examined the diversity and genetic structure of natural O. rufipogon populations at the national, provincial, and local levels using simple sequence repeat (SSR) markers. Twenty representative populations from sites across China showed high levels of genetic variability, and approximately 44% of the total genetic variation was among populations. At the local level, we studied fourteen populations in Guangxi Province and four populations in Jiangxi Province. Populations from similar ecosystems showed less genetic differentiation, and local environmental conditions rather than geographic distance appeared to have influenced gene flow during population genetic evolution. We identified a triangular area, including northern Hainan, southern Guangdong, and southwestern Guangxi, as the genetic diversity center of O. rufipogon in China, and we proposed that this area should be given priority during the development of ex situ and in situ conservation strategies. Populations from less common ecosystem types should also be given priority for in situ conservation.

Scientific Reports | (2020) 10:14375 | https://doi.org/10.1038/s41598-020-70989-w www.nature.com/scientificreports/

scientific expertise or local government recommendations, although it was still necessary to justify their value and rationale. Detailed information on the population genetic structure of O. rufipogon is therefore useful to guide the selection of future sites. Information generated using molecular methods has direct and indirect consequences for the practical management and conservation of germplasm. Genetic diversity data can be useful for understanding the taxonomy and evolution of crop species, and this basic knowledge supports their conservation 8. More directly, genetic diversity studies can help us to adjust our strategies for collection, evaluation, and breeding. Although some studies have recently documented the population genetic structure of O. rufipogon in China 9-13, few have focused on the development of conservation strategies based on this population genetic structure. Our study used SSR markers to examine the genetic diversity and population genetic structure of natural populations at three different levels: national (China), provincial (Guangxi Province), and local (Dongxiang population in Jiangxi Province).

Results
Genetic diversity of O. rufipogon populations. Twenty-four SSR primer pairs from previous studies 14 with polymorphisms and a uniform distribution among chromosomes were selected for use in the analysis of population genetic diversity and genetic structure (Supplementary Table S1). All loci were found to be in Hardy-Weinberg equilibrium. High levels of genetic variability at 24 loci were detected in 628 individuals from 20 populations sampled across China (Table 1; Fig. 1a). A total of 340 alleles were detected across the loci, ranging from 23 alleles at RM253 to seven alleles at RM244 and RM345 (Supplementary Table S2). The average number of alleles per locus was 14.17. The overall means of A_E (the effective number of alleles), H_O (the observed heterozygosity), H_E (the expected heterozygosity), and I (the Shannon-Weaver information index) across all loci were 6.97, 0.58, 0.83, and 2.08, respectively. The values varied widely among loci: A_E ranged from 2.80 (RM244) to 15.41 (RM336); H_O ranged from 0.15 (RM244) to 0.88 (RM336); H_E ranged from 0.64 (RM244) to 0.93 (RM253); and I ranged from 1.13 (RM244) to 2.85 (RM336).
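The per-locus parameters reported above can be illustrated with a minimal Python sketch. It assumes diploid genotypes coded as allele pairs, uses the uncorrected estimator H_E = 1 − Σp², and takes the natural logarithm for the Shannon index; the function name `locus_diversity` and the input layout are hypothetical, and the software used in the study may apply sample-size corrections that this sketch omits.

```python
import math
from collections import Counter

def locus_diversity(genotypes):
    """Diversity statistics for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual
    (hypothetical input layout chosen for illustration).
    """
    n = len(genotypes)
    # Count every allele copy (two per individual).
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    freqs = [c / total for c in counts.values()]
    homozygosity = sum(p * p for p in freqs)  # expected homozygosity, sum of p_i^2
    return {
        "A": len(freqs),                                    # observed number of alleles
        "A_E": 1.0 / homozygosity,                          # effective number of alleles
        "H_O": sum(1 for a, b in genotypes if a != b) / n,  # observed heterozygosity
        "H_E": 1.0 - homozygosity,                          # expected heterozygosity (uncorrected)
        "I": -sum(p * math.log(p) for p in freqs),          # Shannon information index (natural log)
    }

# Toy data: two alleles at equal frequency, half of the individuals heterozygous.
stats = locus_diversity([(1, 1), (1, 2), (2, 2), (1, 2)])
```

A large gap between H_E and H_O at a locus, as seen for RM244 (0.64 versus 0.15), is the pattern this calculation would flag as a heterozygote deficit.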
The genetic structure of populations across China was analyzed. The mean value of F_ST was 0.44, and it varied from 0.25 to 0.59, indicating that there was substantial genetic variation among populations (Supplementary Table S2). Genetic differentiation among populations was also measured by AMOVA (Table 2) and pairwise population differentiation (Supplementary Table S3). AMOVA showed that 41.2% of the variation occurred among populations. The significant differentiation among populations (P < 0.001 for F_ST) also reflected large differences among populations across the whole country (Table 2). F_IS ranged from −0.49 to 0.44 with

Genetic structure of O. rufipogon populations in China. The genetic relationships among populations in China were analyzed with a population structure analysis in STRUCTURE, principal component analysis (PCA), and construction of a UPGMA tree (Fig. 2). In the STRUCTURE analysis, the log-likelihood ln(P(D)) was largest when the number of populations, K, was equal to four (Fig. 2b), and four groups were therefore identified. Most populations from Hainan were placed in Group 1, together with two southern boundary populations, N_GX2 and N_GD1. The N_HN3 population was placed by itself in Group 2. Two northern boundary populations from Guangdong and Guangxi were placed in Group 3 with N_HuN1 and N_JX1. Most populations from Guangdong and Guangxi were placed in Group 4, together with the northernmost population, N_HN5. Results of the PCA were consistent with those of STRUCTURE (Fig. 2c). The UPGMA dendrogram showed that the 20 populations formed three main clusters with a genetic similarity of approximately 0.27 (Fig. 2d). Three populations (N_HN1, N_HN3, and N_HN4) from Hainan were grouped into Cluster 1. N_HN2 from Hainan, N_GD1 and N_GD5 from Guangdong, and N_GX2 from Guangxi were grouped into Cluster 2.
Group 3 from the STRUCTURE analysis corresponded to Cluster 4a, and Group 4 corresponded to Cluster 3 and Cluster 4b. Surprisingly, the two Hainan populations (N_HN2 and N_HN5) were placed in different clusters rather than forming a separate cluster. N_HN2 was placed in Cluster 2 with populations from Guangdong and Guangxi, and N_HN5 was separated from the others. Monmonier's maximum difference algorithm was used to perform a genetic barrier prediction analysis with all populations included (Fig. 3). The first predicted barrier separated N_HN1 and N_HN2 from all other populations. The second barrier separated population N_FJ1, which was located in the easternmost sampled area. The third predicted barrier separated N_HN4, and the fourth barrier separated N_HN3 and N_HN5. Cluster 3b (Group 3) was separated by the fifth barrier, and N_GD1 by the sixth barrier.
The populations with higher genetic diversity formed a triangular area (Fig. 1a) that reached from 19° N to 23° N and included northern Hainan, southern Guangdong, and southwestern Guangxi. It included the N_GX1, N_GX3, and N_GX2 populations from Guangxi, the N_GD1, N_GD2, N_GD5, and N_GD3 populations from Guangdong, and the N_HN5 population from Hainan. As shown in Table 1, the averages of the diversity parameters in this triangular area were significantly higher than those of other populations and higher than the overall population averages (Fig. 4). The higher diversity and greater number of private alleles indicate that this triangular area may be the genetic diversity center of O. rufipogon in China.

Population structure and differentiation of O. rufipogon in local regions.
To understand the history of wild rice populations, we conducted an in-depth study of the largest (Guangxi) ( Table 4) Table S4).
We also used Monmonier's maximum difference algorithm to perform a genetic barrier prediction analysis for 14 populations from Guangxi, revealing five predicted barriers to gene flow (Supplementary Fig. S1). The You River-Yu River-Xun River formed a primary barrier that divided the populations into two parts. Five populations (R_GX1, R_GX2, R_GX3, R_GX4, and R_GX5) were from the Nanliu River (Fig. 1b) but were a long distance apart. They were isolated from the other populations by the Darong and Liuwan Mountains. R_GX6, located in Fangchenggang, was isolated by the Gutong Mountain. R_GX9 was located in Baise City and was distributed in small areas along the You River. R_GX13 was located in the northernmost area of rice distribution and was isolated by Shanzhao Ling and the Liu River.
Dongxiang County is recognized as the northernmost habitat of O. rufipogon, and Anjiashan and Shuitaoshu are the only two sites in Dongxiang where O. rufipogon is found. Four populations were surveyed to gain a more complete understanding of the genetic structure of O. rufipogon populations in Dongxiang. At the Anjiashan site, the primary in situ conservation site for O. rufipogon in China, a relatively large population is divided by a concrete wall that was constructed in the 1980s. DXP1, DXP2 and DXP3 are separate populations that are located close together and have been isolated by the concrete wall from the outside. DXP2 is located in the southeast, DXP3 in the northwest, and DXP1 in the middle. DXP4 is from Shuitaoshu and is further isolated by a hill (Fig. 1c).
We first investigated the genetic structures of the three natural populations from Anjiashan (DXP1, DXP2, and DXP3). The mean H_E estimates for DXP1, DXP2, and DXP3 were 0.47, 0.39, and 0.41, respectively, indicating that DXP1 had the highest genetic variation. However, the values of H_T, D_ST, and G_ST were 0.47, 0.050, and 0.098, respectively, suggesting that there was little differentiation among the three populations after more than 20 years of isolation by the concrete wall.
Next, DXP4 was included in the genetic structure analysis, together with the other three populations. The mean D_ST was 0.06, indicating that there was little genetic variation among the populations. The genetic differentiation over loci assessed by G_ST (0.12) was slightly higher than D_ST, but it nonetheless indicated that there was minimal differentiation among the Dongxiang populations. The genetic similarity of all individuals from the four populations was 0.74, highlighting their close genetic relationship. Based on these analyses, the four Dongxiang populations can be considered a single population when collecting samples for ex situ conservation and the establishment of in situ conservation sites.
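The relationship among H_T, D_ST, and G_ST used above follows Nei's gene diversity partition, in which D_ST = H_T − H_S and G_ST = D_ST / H_T, where H_S is the mean within-population gene diversity. The sketch below shows this calculation for a single locus under simplifying assumptions (equal population weights, no sample-size correction); the function name `nei_gst` and the input layout are hypothetical and are not taken from the software used in the study.

```python
def nei_gst(pop_freqs):
    """Nei's gene diversity partition for one locus.

    pop_freqs: list of dicts mapping allele -> frequency, one dict per population
    (hypothetical input layout; populations are weighted equally).
    Returns (H_T, H_S, D_ST, G_ST).
    """
    k = len(pop_freqs)
    alleles = set().union(*pop_freqs)  # every allele seen in any population
    # H_S: average expected heterozygosity within populations.
    h_s = sum(1.0 - sum(p * p for p in f.values()) for f in pop_freqs) / k
    # H_T: expected heterozygosity computed from the mean allele frequencies.
    mean_freq = {a: sum(f.get(a, 0.0) for f in pop_freqs) / k for a in alleles}
    h_t = 1.0 - sum(p * p for p in mean_freq.values())
    d_st = h_t - h_s                          # diversity among populations
    g_st = d_st / h_t if h_t > 0 else 0.0     # proportion of diversity among populations
    return h_t, h_s, d_st, g_st

# Toy example: two populations fixed for different alleles are fully differentiated.
h_t, h_s, d_st, g_st = nei_gst([{"a": 1.0}, {"b": 1.0}])
```

As a rough check, the ratio of the reported means for the three Anjiashan populations, 0.050 / 0.47 ≈ 0.106, is close to the reported G_ST of 0.098; the small difference would be expected if G_ST were averaged across loci rather than computed from the mean values, although the text does not state which was done.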

Discussion
We examined O. rufipogon population differentiation at the national, provincial, and local levels using SSR markers. F_ST calculations showed that almost half of the total variation occurred among populations. Previously, Zhou 12 used SSR markers to investigate twelve Chinese wild rice populations from four provinces and found high genetic differentiation among them (R_ST = 0.52). Zheng 16 analyzed the sequences of seven chloroplast and nuclear loci and found that pairwise F_ST values between O. rufipogon populations at the nuclear loci ranged from 0.3175 to 0.5748. The AMOVA and pairwise F_ST results in the present study provide further evidence for relatively high genetic differentiation and corroborate previous results. Zhou 12 concluded that population isolation caused by habitat fragmentation increased genetic differentiation by increasing the frequency of inbreeding and clonal growth. Previous studies have reported indica-like and japonica-like differentiation in the O. rufipogon population 12,16,17. Wang 18 also suggested that spatial or physical isolation and local adaptation may contribute to population differentiation within this species. The F_IS of O. granulata was 0.402, suggesting that most populations deviated from Hardy-Weinberg expectations and were deficient in heterozygotes. The F_ST of O. granulata was 0.859, indicating that 85.9% of the total genetic variation existed among populations 19  rufipogon. In outbreeding species, a decrease in recombination rates is observed in certain regions of the genome, especially around centromeres 24. In contrast, in species with a high level of inbreeding, the rarity of double heterozygotes results in lowered effective recombination rates across the whole genome. Therefore, it is expected that both hitch-hiking and background selection will strongly affect genetic variability in inbreeding species 24.
Here, we identified a triangular area, including northern Hainan, southern Guangdong, and southwestern Guangxi, as the genetic diversity center of O. rufipogon in China. Previously, a genetic diversity center for O. rufipogon in south China, including Guangdong and Guangxi provinces, was proposed based on random amplified polymorphic DNA analysis 13, allozyme analysis 9, and SSR data 12. However, our samples were collected according to a more systematic sampling strategy 25 in which sampled individuals were at least 12 m apart, and approximately 30 individuals were sampled from each population to encompass at least 95% of its genetic diversity. Our genetic diversity center included one population from Hainan that was not included in the previous diversity center. Gao 11 found that, like Guangdong and Guangxi, Hainan also maintained higher levels of microsatellite diversity. Wang 18 also found that Hainan ranked first in China with respect to its gene diversity index and gene richness. It is reasonable that the genetic diversity center includes Hainan because it has appropriate annual temperatures (16-23 °C) and precipitation (approximately 1,400 mm), as well as higher levels of outbreeding and diversity of ecological habitats 26. In addition, populations in our proposed diversity center had more private alleles than did populations from other areas. Although the diversity of phenotypes in populations within and outside the proposed diversity center should be documented 27,28 and compared with molecular data, our SSR data strongly suggest that the proposed triangular area is the diversity center of O. rufipogon in China. Southeastern China has generally enjoyed relative tectonic stability since the late Tertiary 29, with perhaps the single exception of its two large islands.
The high percentage of endemics in this region 30 suggests that central and south China has played a significant role both as a center of survival and as a center of plant differentiation and evolution during the Quaternary 31. This may explain why the center of O. rufipogon genetic diversity is located in Southeastern China. O. rufipogon is the most important genetic resource for rice breeding and the most endangered wild rice species in China; its collection and conservation are therefore increasingly important. At the national level, a region in southern China whose populations have higher genetic variation and more private alleles is likely to be the genetic diversity center of O. rufipogon. More valuable genes may exist in populations from this area, and its gene pool may be more useful for future variety improvement and biotechnology applications. Therefore, attention should be focused on O. rufipogon from this area for both the construction of in situ conservation sites and the collection of ex situ samples. However, populations outside the genetic diversity center are also important for conservation: almost half of the genetic diversity and 32 out of 76 private alleles existed in these populations. Indeed, populations with relatively low genetic diversity may contain unique alleles that are absent from the diversity center (Table 5). The finding that more than 40% of the variation occurred among populations also supports the conclusion that populations outside the diversity center should receive attention. Based on our analysis of populations across the country, populations in the genetic diversity center should be given first priority when developing national strategies for O. rufipogon conservation. Nonetheless, populations in regions with special ecological conditions, such as unique soils, climates, or valley locations, should also be considered.
Based on the genetic structure analysis of wild rice in Guangxi and Jiangxi, local environmental conditions appear to have influenced gene flow to a greater extent than geographic distance during population genetic evolution. The region in which R_GX9 was located has a unique geography, consisting of valleys surrounded by mountains, and the spread of wild rice has therefore been curtailed. In the northernmost area of rice distribution, where R_GX13 was located, the annual minimum temperature is often below 0 °C. The unique local climate and geography have shaped the distinct characteristics of wild rice from the northern mountains. Moreover, the finding that populations from the lower and middle regions of a river contained more genetic variation than those from the upstream regions suggests that conservation efforts should be focused on the downstream populations. Similar genetic diversity results have been reported for plants growing in several important watersheds in China, including Myricaria laxiflora from the Changjiang River in the Three Gorges Region 32 and Sophora moorcroftiana along the Yarlung Zhangbo River 33.
Smaller population sizes that result from habitat fragmentation may lead to a loss of genetic variation through genetic drift, thereby increasing population differentiation 34. However, we found that population divergence was more strongly correlated with environmental conditions than with geographic location and isolation by distance. Populations from similar ecosystems showed less genetic differentiation, and local environmental conditions rather than geographic distance appeared to have influenced gene flow during population genetic evolution. These results are consistent with the recently developed maximum genetic diversity (MGD) theory of molecular evolution 35,36, which predicts that similar environments will select for similar genetic variants, regardless of geographic distance 37. Environmental factors such as historical habitat fragmentation and local adaptation can cause divergence 38, and adaptation to local conditions rather than simple geographic isolation appears to have driven O. rufipogon population differentiation. Our results suggest that ex situ sampling of multiple populations from similar ecosystems should not be a priority because such populations tend to be genetically similar even when they are separated by large distances. The number of populations sampled from similar ecosystems for ex situ collections can therefore be reduced to avoid duplication, regardless of how far apart the populations are.

(Table 1). At each latitude, one to four representative populations were selected for analysis. Samples

Table 5. Summary of the private SSR alleles detected inside and outside the genetic diversity center. Column headings: Group; No. of genotypes. ***P < 0.001.

(Table 4). These samples included individuals from the most northern (R_GX14), southern (R_GX3, R_GX4, R_GX5), and western populations (R_GX7) in the province (Fig. 1b). Within each population, individuals were randomly collected at a distance of at least 5 m from one another to avoid collecting samples from a single genet.
DNA isolation and polymerase chain reaction. Genomic DNA was extracted using the CTAB method according to the protocol of Edwards 39. The quality and quantity of DNA were assessed on 0.8% agarose gels. DNA concentrations were determined using an ultraviolet spectrophotometer, and the solutions were then diluted to 20 ng/μL with a Tris-EDTA buffer. PCR amplifications were performed with a 5700 thermocycler (PE Applied Biosystems, USA). The PCR reaction in a total volume of 20 μL consisted of 100 mmol/L Tris-HCl, 1 U Taq polymerase, 2.5 mmol/L MgCl2, 2.5 mmol/L dNTPs, 4 μmol/L forward and reverse SSR primers, and 100 ng DNA. The PCR program was 5 min at 94 °C, followed by 35 48,49. Deviation from Hardy-Weinberg equilibrium and population differentiation were assessed at each locus across all populations using F statistics, including the fixation index within populations (F_IS), the fixation index across all populations (F_IT), and the gene differentiation index (F_ST) 50. STRUCTURE was used to infer genetic clusters (K) with the model-based clustering method 51. We assessed K values from 2 to 9 by performing ten independent runs for each K value, and the model was run with a burn-in period of 10,000 iterations followed by 100,000 Markov chain Monte Carlo (MCMC) repetitions. CLUMPP version 1.1 52 was used to obtain the optimal clusters for each K. The relationships between populations were assessed by Nei's 53 standard genetic distance using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA) in PowerMarker 54. Principal component analysis (PCA) was performed using Tassel 3.0 (https://www.maizegenetics.net/tassel) to summarize the major patterns of variation. Analysis of molecular variance (AMOVA) was performed using Arlequin 3.11 55, and Mantel 2.0 56 was used to assess whether the data fit the hypothesis of isolation by distance, which predicts a significant relationship between geographic distance and genetic distance.
A genetic barrier analysis was performed to suggest historical barriers to gene flow among or between collection sites using BARRIER 57 (version 2.2, Syracuse University, USA) with Monmonier's maximum difference algorithm, which takes the geographic coordinates and genetic distance (GD) of each population as inputs.
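The isolation-by-distance test mentioned above compares a geographic distance matrix with a genetic distance matrix. A minimal permutation-based Mantel test is sketched below to illustrate the idea; it is not the Mantel 2.0 program used in the study, and the function names, the one-sided p-value, and the default of 999 permutations are assumptions made for illustration.

```python
import random

def _pearson(x, y):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mantel(geo, gen, permutations=999, seed=0):
    """One-sided Mantel test between two symmetric distance matrices.

    geo, gen: square lists of lists indexed by population (hypothetical layout).
    Returns (r_obs, p_value), where p_value is the share of permutations whose
    correlation is at least as large as the observed one.
    """
    n = len(geo)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]  # upper triangle
    x = [geo[i][j] for i, j in pairs]
    r_obs = _pearson(x, [gen[i][j] for i, j in pairs])
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel populations in the genetic matrix only
        y = [gen[order[i]][order[j]] for i, j in pairs]
        if _pearson(x, y) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)

# Toy example: four sites on a line, genetic distance proportional to geographic distance.
sites = [0.0, 1.0, 3.0, 7.0]
geo = [[abs(a - b) for b in sites] for a in sites]
gen = [[2.0 * d for d in row] for row in geo]
r, p = mantel(geo, gen)
```

Rows and columns of the genetic matrix are permuted together so that the dependence among pairwise distances sharing a population is preserved under the null hypothesis. A non-significant result from such a test would be in line with the conclusion that geographic distance alone does not explain the observed differentiation.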
The presence of private alleles in each population and in groups within and outside the genetic diversity center was assessed, and the richness of private SSR alleles was defined as the average number of private alleles per genotype for each population.
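The private allele definition above can be sketched as follows, assuming that an allele is private when it is observed in exactly one population (or group) and that richness is the number of private alleles divided by the number of genotypes sampled. The function names, the input layout, and the locus labels in the toy example are hypothetical.

```python
def private_alleles(pop_alleles):
    """Return the alleles found in one population only.

    pop_alleles: dict mapping population name -> set of (locus, allele) pairs
    observed in that population (hypothetical input layout).
    """
    result = {}
    for pop, alleles in pop_alleles.items():
        # Pool the alleles of every other population.
        others = set()
        for other, other_alleles in pop_alleles.items():
            if other != pop:
                others |= other_alleles
        result[pop] = alleles - others
    return result

def private_allele_richness(pop_alleles, n_genotypes):
    """Average number of private alleles per genotype for each population.

    n_genotypes: dict mapping population name -> number of genotypes sampled.
    """
    private = private_alleles(pop_alleles)
    return {pop: len(private[pop]) / n_genotypes[pop] for pop in pop_alleles}

# Toy example with made-up locus labels and allele sizes.
observed = {
    "P1": {("locus1", 100), ("locus1", 102)},
    "P2": {("locus1", 102), ("locus1", 104)},
}
richness = private_allele_richness(observed, {"P1": 2, "P2": 4})
```

The same functions apply to the inside-versus-outside comparison if the keys are the two groups rather than individual populations.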