Genome scanning for detecting adaptive genes along environmental gradients in the Japanese conifer, Cryptomeria japonica

Tsumura, Y; Uchiyama, K; Moriguchi, Y; Ueno, S; Ihara-Ujino, T

doi:10.1038/hdy.2012.50

Download PDF

Original Article
Published: 29 August 2012

Genome scanning for detecting adaptive genes along environmental gradients in the Japanese conifer, Cryptomeria japonica

Y Tsumura¹,
K Uchiyama¹,
Y Moriguchi¹,
S Ueno¹ &
…
T Ihara-Ujino¹

Heredity volume 109, pages 349–360 (2012)Cite this article

2813 Accesses
52 Citations
3 Altmetric
Metrics details

Subjects

Abstract

Local adaptation is important in evolutionary processes and speciation. We used multiple tests to identify several candidate genes that may be involved in local adaptation from 1026 loci in 14 natural populations of Cryptomeria japonica, the most economically important forestry tree in Japan. We also studied the relationships between genotypes and environmental variables to obtain information on the selective pressures acting on individual populations. Outlier loci were mapped onto a linkage map, and the positions of loci associated with specific environmental variables are considered. The outlier loci were not randomly distributed on the linkage map; linkage group 11 was identified as a genomic island of divergence. Three loci in this region were also associated with environmental variables such as mean annual temperature, daily maximum temperature, maximum snow depth, and so on. Outlier loci identified with high significance levels will be essential for conservation purposes and for future work on molecular breeding.

Physical geography, isolation by distance and environmental variables shape genomic variation of wild barley (Hordeum vulgare L. ssp. spontaneum) in the Southern Levant

Article Open access 11 January 2022

Population structure and adaptive variation of Helichrysum italicum (Roth) G. Don along eastern Adriatic temperature and precipitation gradient

Article Open access 21 December 2021

Large haploblocks underlie rapid adaptation in the invasive weed Ambrosia artemisiifolia

Article Open access 27 March 2023

Introduction

Local adaptation is important in evolutionary processes and speciation (Hoffmann and Willi, 2008; Stapley et al., 2010). It occurs over relatively long periods of time in plant populations with long generation times, such as forest tree species. The process of local adaptation involves the genetic divergence of specific loci in populations inhabiting different environments. Once identified, these loci can be used for conservational purposes, for future forestation, and to study the mechanisms by which trees adapt to specific environments. Forest tree species have not been as extensively domesticated as crop species, and natural populations are still abundant. By identifying and studying adaptive genes in these natural populations, it is possible to gain new insights into the adaptive mechanisms that allow them to flourish in the wild (Neale and Savolainen, 2004; González-Martínez et al., 2006; Neale, 2007; Savolainen and Pyhäjärvi, 2007; Savolainen et al., 2007; Neale and Ingvarsson, 2008; Grattapaglia et al., 2009). This is facilitated by certain properties of forest tree species: they typically have relatively low genetic differentiation between populations on average, are widely distributed in different environments and have relatively large population sizes. Consequently, the selective pressures acting on a population in a given environment tend to result in adaptation by selecting for a relatively small number of genotype changes at a few specific loci (Howe et al., 2003; Holliday et al., 2010; Pelgas et al., 2011).

Genome scanning is an effective and useful method for detecting genetic differentiation and candidate genes associated with specific phenotypes or environmental variables in forest tree species (Eveno et al., 2008; Namroud et al., 2008; Eckert et al., 2010; Holliday et al., 2010; Neale and Kremer, 2011; Prunier et al., 2011). Recently developed multiple genotyping methods have greatly reduced the cost of single-nucleotide polymorphism (SNP) genotyping. Thus, multiple-marker-based neutrality tests are frequently used for detecting functional DNA variations. The most widely used ‘outlier tests’ are multiple-population tests such as the F_ST-based test (Beaumont and Nichols, 1996), the coalescent-based simulation approach (Vitalis et al., 2001) and the lnRv test based on the Gaussian null distribution (Kauer et al., 2003). The Bayesian method implemented in BayeScan has the advantage that it estimates population-specific F_ST coefficients, thereby accounting for different demographic histories and different amounts of genetic drift between populations (Foll and Gaggiotti, 2008). When identifying outlier loci, it is generally necessary to use multiple tests in order to minimize the false-positive rate; however, false positives remain common even when using multiple methods and multitest correction (Pérez-Figueroa et al., 2010). It is therefore important to validate the detected outlier loci in multiple ways to determine whether or not they genuinely are adaptive. Luikart et al. (2003) suggested several ways to confirm outlier behavior and selection, including considering their genomic location, quantitative trait locus mapping and testing for genotyping errors.

Cryptomeria japonica is an allogamous coniferous species that relies on wind-mediated pollen and seed dispersal. Modern natural forests of the species are distributed across various different environments in the Japanese Archipelago, from Aomori Prefecture (40° 42′ N) to Yakushima Island (30° 15′ N) (Hayashi, 1960). However, its distribution is discontinuous and scattered across small, restricted areas as a result of its extensive exploitation over the last thousand years (Ohba, 1993). The geographical variation between natural forests of C. japonica has been investigated, focusing on both morphological traits (needle length, needle curvature and other features; Murai, 1947) and diterpene components (Yasue et al., 1987). The results of these studies suggest that there are two main lines: ura-sugi (C. japonica var. radicans, found near the Sea of Japan) and omote-sugi (C. japonica, located near the Pacific Ocean). The ura-sugi variety has slender branchlets with soft leaves, while the omote-sugi variety has rough branchlets with hard leaves (Yamazaki, 1995). A previous study focusing on 148 cleaved amplified polymorphic sequence loci revealed genetic differentiation between the two varieties, and four outlier loci were identified as potential local adaptation genes (Tsumura et al., 2007) that may be associated with the genetic differentiation between the varieties.

In the study reported herein, we identified potential local adaptation genes at 1026 loci using two tests and also studied the relationships between specific SNPs and environmental variables such as climate data to investigate the selective pressures acting on the studied populations. The linkage map positions of outlier loci that were associated with specific environmental variables are also discussed.

Materials and methods

Investigated populations

We examined 14 populations in this study with 186 individuals in total (the average number of individuals per population was 13.29): seven from the Japan Sea side of Japan and seven from the Pacific Ocean side (Table 1). All trees sampled were growing in national forests that were candidates for in situ gene-conservation programs. The locations of the sampled populations covered most of the natural distribution of C. japonica (Figure 1, Tsumura et al., 2007).

Table 1 Locations and environment variables for 14 investigated natural populations in C. japonica

Full size table

SNP genotyping

We selected one SNP each from 1536 unique expressed sequence tag contigs in C. japonica. This assumes implicitly that SNPs within different expressed sequence tag contigs are independent (that is, unlinked). The identification of SNPs was carried out previously and was accomplished through the resequencing of 5170 unique expressed sequence tag contigs in a discovery panel of four C. japonica individuals collected from a range-wide sample of trees (Uchiyama et al., 2012). Multiplexed genotyping of SNP markers for natural populations was carried out using Illumina’s 1536-plex GoldenGate array (Illumina Inc., San Diego, CA, USA) according to the protocol recommended by the manufacturer. A total of 0.5–1.0 μg of genomic DNA per sample (at a rate of 100–200 ng μl⁻¹) was used in each GoldenGate assay. Only SNPs with Illumina design scores above 0.6 were used. When multiple SNPs were available within the same sequence, a single highly polymorphic SNP was selected as the target for that sequence. The GoldenGate assay uses highly multiplexed allele-specific extension methods and universal PCR amplification reactions. The PCR products, which were fluorescently labeled by using the 5′-labeled primers P1 (Cy3) and P2 (Cy5), were hybridized to capture probes on the beads in the array. The ratio of the fluorescent signals from two allele-specific ligation products was used to determine the genotype. Signal intensities were quantified and matched to specific alleles, using the Genome Studio v2010.2 software package in Illumina BeadArray Reader (Illumina Inc.). The quality of the GoldenGate genotype scores for individual SNPs was assessed based on their GenTrain cluster and GenCall genotype scores in GenomeStudio (Illumina Inc.). A minimum GenCall50 (GC₅₀) score of 0.25 was chosen as the threshold for inclusion of SNP loci in the final data set, and genotypic clusters were edited manually when necessary. In the present study, this threshold corresponded to SNPs with accurate scoring for at least 95% of the individuals, with most successful SNPs scored for over 99% of the individuals analyzed.

Genetic structure

To evaluate the within-population variation, we used the proportion of polymorphic loci (Pl) at the 95% probability level, the unbiased heterozygosity (H_e), (Nei, 1977) and allelic richness (R_s) calculated from the allele frequencies of all loci analyzed. Allelic richness was measured by the method of El Mousadik and Petit (1996) with a reference sample size of 6. The fixation indices, F_IS=1−H_e/H_o, for polymorphic loci and their averages over all loci were determined to compare the observed genotype frequencies with expectations based on the Hardy–Weinberg equilibrium (Wright, 1922; Nei, 1977; Nei and Chesser, 1983). Deviations from such expectations were analyzed using Fisher’s exact test. Coefficients of gene differentiation, F_ST, among populations, were calculated to determine how gene diversity was partitioned at each level (Wright, 1978). These analyses were done using the FSTAT software package (Goudet, 2000) and GenAlEx (Peakall and Smouse, 2006). To detect population structure and infer the most appropriate number of subpopulations (K) for interpreting the data without prior information on the number of locations at which the populations were sampled, we used the F-model of Bayesian clustering approach proposed by Pritchard et al. (2000). For this analysis, we excluded 184 loci with high linkage disequilibrium (LD) values (χ² test, significant at 99% level after Bonferroni correction), 141 outlier loci that were detected by Fdist2 analysis and a locus that was distorted from the Hardy–Weinberg equilibrium. Ten independent runs with K values ranging from 1 to 10 were performed using 2 × 10⁶ MCMC (Marcov chain Monte Carlo) sampling after a burn-in period of 50 000 iterations. The posterior probability was then calculated for each value of K using the estimated log likelihood of K to identify the optimal K value. The optimal number of K was defined as the one at which the log likelihood of the data, ln P(X|K) (Pritchard et al., 2000) or ΔK, the rate of change of ln P(X|K) between successive K values (Evanno et al., 2005), was maximal. To examine the genetic differentiation between two groups representing the two varieties, C. japonica and C. japonica var. radicance, we performed a hierarchical analysis of molecular variance (Excoffier et al., 1992) in which the significance levels for variance components were tested using permutations.

Methods for detecting candidate loci for divergence

The following methods were used to detect signs of natural selection and to tentatively identify the environmental parameters associated with the corresponding selective pressures acting on C. japonica populations. We also compared the distributions of the F_ST values over all loci to their expected distributions under the assumption of neutrality. Beaumont and Nichols (1996) have shown that the distribution of F_ST as a function of heterozygosity in the context of an island model is quite robust, that is, insensitive to variations in factors such as population structure, demographic structure and mutation level. We therefore used this method to identify markers deviating from the null hypothesis of neutral evolution. All calculations for identifying potential non-neutral genes in the studied populations were performed using the Fdist2 (Beaumont and Nichols, 1996) and Arlequin programs (Excoffier and Lischer, 2010). Arlequin program allows us the hierarchical analysis using two genetic lines such as ura-sugi and omote-sugi varieties. In the analysis using Fdist2, we identified the outlier loci using not only all 14 populations but also each variety populations such as seven ura-sugi populations and seven omote-sugi populations.

An alternative method for identifying candidate loci is BayeScan, which uses a Bayesian method to directly estimate a posterior probability for each locus and a reversible-jump MCMC approach to selection (Foll and Gaggiotti, 2008). The main advantage of BayeScan is that it estimates population-specific F_ST coefficients and therefore accommodates differences in demographic history and the extent of genetic drift between populations. This method is an extension of that proposed by Beaumont and Balding (2004), and it is based on a logistic regression model that decomposes genetic variation into population- and locus-specific effects. Preliminary tests were conducted using a burn-in of 10 000 iterations, a thinning interval of 50 and a sample size of 10 000 (Foll and Gaggiotti, 2008). Four independent runs were performed for each of the two data sets to account for the consistency of the detected outliers. The loci were ranked according to their estimated posterior probability and all loci with a value over 0.990 were retained as outliers. This corresponds to a log₁₀ Bayes factor of more than 2, making it possible to be confident in the validity of the model.

The gathering of environmental data for each population and the identification of correlations with the genetic data

Environmental information on the area of origin of each population, including longitude, latitude, altitude, mean annual temperature, mean annual precipitation, annual sunshine hours, amount of global solar radiation and maximum snow depth over 30 years (1971–2000) was obtained from the Japan Meteorological Agency (2002), in the form of detailed 1-km mesh data. We used information from the 1-km mesh data to estimate the mean annual temperature, precipitation and snowfall, and so on, for each site using the program in Mesh climatic data of Japan of the Japan Meteorological Agency (2002).

To calculate correlations between SNP allele frequencies and climate variables, we used the Bayesian linear model method of Coop et al. (2010), which controls for population history by incorporating a covariance matrix of populations and that accounts for differences in sample size among populations. Using the full set of SNPs, we estimated a covariance matrix of allele frequencies across populations. This covariance matrix was used as the basis of the null model for the transformed allele frequencies at each SNP to be tested. For each tested SNP, the method generates a Bayes factor as a measure of the support for the alternative model relative to the null model, in which the transformed population allele frequency distribution is dependent on population structure alone. The significance threshold for ranked Bayes factors was set to 2.0, and the environmental variables considered were latitude, longitude and ten climate variables. All climate variables were standardized before analysis. In this analysis, we included the geographical information such as latitude and longitude to find out the false positives because they are sometimes correlated with some environmental variables.

Putative functions for the detected outlier loci were identified by performing BLASTx searches of the NCBI database using Blast2GO (Conesa et al., 2005, Supplementary Table 1).

Mapping the outlier loci on the linkage map, and LD

The YI pedigree used to determine the positions of the outlier loci of C. japonica had previously been used to construct a linkage map, in which a total of 438 markers were assigned to 11 large linkage groups (LGs) and some small or nonintegrated LGs, giving a total map length of 1372.2 cM (Tani et al., 2003). The mapped population was genotyped using Illumina’s 1536-plex GoldenGate array, as discussed above. All linkage analyses and map estimations were performed using the JoinMap v3.0 software package (Van Ooijen and Voorrips, 2001). During map construction, markers were assigned to tentative LGs by comparing the linkages formed at logarithm of odds thresholds ranging from 3.0 to 9.0, increasing in increments of 1.0. Finally, the markers were ordered at a logarithm of odds threshold of 8.0. Map distances were calculated using the Kosambi mapping function (Kosambi, 1944).

Coefficients of LD were calculated as described by Weir (1996), using squared allele–frequency correlations (r²) for pairs of loci on the basis of genotype data from the investigated populations. The differences from equilibrium were verified by the χ² tests (Weir, 1996). These analyses were performed using the GDA software package (Lewis and Zaykin, 2002).

Assuming a random distribution of markers, if the genome were divided into N intervals, the number of markers per interval would follow a Poisson distribution. To determine whether outlier loci were randomly distributed, every LG was divided into 15, 20, 25 and 50-cM intervals. We used the χ² test to compare the actual distribution of outlier loci to that expected for a Poisson distribution, as described by Kang et al. (2010).

Results

SNP genotyping

Of the 1536 SNPs, 1031 (67.1%) yielded data that met our quality thresholds according to the GoldenGate genotyping system (Uchiyama et al., 2012). The median GC₅₀ score across all usable SNPs was 0.81, with an average call rate of 98.6%. A minimum GenCall50 (GC50) score of 0.30 was chosen for inclusion of SNP loci in the final data set and genotypic clusters were edited manually as needed. Of those 1031 SNPs, 5 were monomorphic and were therefore discarded. The remaining 1026 SNPs were used in subsequent analyses.

Genetic structure

The Pl values for the SNPs ranged from 0.786 to 0.951 with an average of 0.901. The R_s values also varied, from 1.569 to 1.794 with an average of 1.721, and the H_e values ranged from 0.292 to 0.320 with an average of 0.311 (Table 2). In contrast to findings from previous studies, these parameters do not reveal any clear geographical trends (Takahashi et al., 2005; Tsumura et al., 2007). The average F_IS value for all loci was not significantly different from expectations under the Hardy–Weinberg equilibrium except for one locus, SNPg04215.

Table 2 Genetic diveristy of 14 investigated populations in C. japonica based on 1023 SNPs

Full size table

The overall genetic differentiation among populations at the 1026 loci was low (F_ST=0.0391). We also conducted an analysis of molecular variance to determine the variation within and among groups (the two varieties) and populations, and to test the significance of the among-population variation. The variation among populations was 7.72% and was highly significant (P<0.001), while the proportions of variance among varieties and among populations within varieties were 2.74% and 4.98%, respectively.

Bayesian clustering of the information from the 743 loci after excluding the outlier loci demonstrated that the models with K=2 and K=7 both provided satisfactory explanations of the observed data, as judged by the delta K and the highest log-likelihood values, respectively (Supplementary Figure 1). Both of the bar plots for K=2 and K=7 revealed clear differences between populations of the two varieties with the exception of the Oninome population (Figure 2). In addition, the bar plot for K=7 separated the populations for each variety. The ura-sugi variety populations were divided into two groups, with the first consisting of two northern populations and the second containing all the others. Conversely, the omote-sugi populations were divided into three major groups. The first comprised four northern populations, the southernmost population from Yakushima and two others. The two other omote-sugi population groups carried a higher proportion of ura-sugi-type alleles.

Outlier loci

Using the F_ST-based method (Fdist2), we identified 63 loci lying outside the 99% confidence interval (upper), and an additional 40 loci that were also outside (below) the CI (Figure 3). The method of considering the hierarchical structure, such as ura-sugi and omote-sugi varieties, using Arlequin was identified for yet another 34 loci (Table 3). BLASTx searches identified putative functions for 41 of these loci (65%; Table 3).

Table 3 Outlier loci detected by Fdist2 (P<0.99), Arlequin (P<0.99) and BayeScan (Bayes factor >2) analyses, and their linkage group and putative function by BLASTx search (E-value cutoff ⩽1 × 10⁻³)

Full size table

Using the Bayesian method implemented in BayeScan, we identified 52 loci as outliers with a log₁₀ Bayes factor above 1, all of which were also detected by the F_ST-based method except for SNPg05191. Of these outlier loci, 20 had log₁₀ Bayes factor values above 2 (Figure 4); 19 of these 20 were also identified as outliers by the F_ST-based method (Table 3). Five of the 20 loci had log₁₀ Bayes factors above 3, corresponding to a posterior probability of locus effects above 0.999; four had a log₁₀ Bayes factor of 5, corresponding to a posterior probability of 1.000. BLASTx searches identified putative functions for 11 of the 20 loci with log₁₀ Bayes factors above 2 (55%; Table 3).

Using within-variety F_ST analysis, 54 outlier loci were identified. All but three of the identified outlier loci were detected only in either omote-sugi or ura-sugi variety populations; the three exceptions were SNPg03850, SNPg01839 and SNPg00140 (Table 3). The 25 outlier loci were also detected when all populations were included in the analysis. Using the Bayesian method, only four loci were identified in either the omote-sugi or ura-sugi populations, two of which were present in all populations.

Relationships between specific loci and environmental variables

Eighteen loci were associated with environmental variables (Table 4). The environmental variable with the greatest number of associations was the daily maximum temperature, which was related to six loci; the maximum snow depth and latitude were associated with five and three, respectively. All environmental variables were associated with at least one locus. The 12 out of 18 loci found to be related to an environmental variable were also identified as outliers in either the Fdist2, Arlequin or BayeScan analyses. BLASTx searches identified putative functions for 10 of the 18 environment-associated loci.

Table 4 Significant associations between loci and environmnental variables (P<0.01), and their linkage group and the putative function by BLAST search

Full size table

Distribution of outlying loci on the linkage map, and LD

In total, 100 loci were identified as outlier loci or loci associated with environmental variables for all populations. Sixty-four of these loci were successfully mapped on the C. japonica linkage map, and their distribution was studied (Figure 5, Moriguchi et al., 2012). The numbers of mapped outlier loci in each LG varied substantially. For example, LG 4 contained only three loci, while LG2 contained 11. The observed distribution of loci deviated significantly from the Poisson distribution for all marker intervals considered (15, 20, 25 and 50 cM) and was therefore not random (P<0.001). Eight of the 14 loci that were detected by all of the Fdist2, Arlequin and BayeScan methods were mapped on the linkage map; 4 on LG11, and one on each of LG2, LG6, LG7 and LG10. The outlier loci thus formed clumps on specific LGs, particularly LG2 and LG11 (Figure 5). The outlier distribution was related to the F_ST distribution; regions with high F_ST values had high numbers of outlier loci.

LD was detected for only five of the 2145 possible combinations of the 66 loci. Three pairs of loci were closely linked to one another on LG2, LG5 and LG11, respectively, two of which corresponded to the clumped regions of outlier loci discussed above. Locus SNPg04215 was mapped to LG7 but exhibited high LD (more than r²=0.270) with SNPg00066 and SNPg02495 on LG2.

Discussion

Genetic structure

The genetic differentiation between the studied C. japonica populations was low (F_ST=0.0391). Hierarchical analysis indicated that genetic variation among varieties and populations accounted for 2.74% and 4.98% of the total variation, respectively. These results are similar to those from our previous study (Tsumura et al., 2007). While there was little genetic differentiation between variety groups, what differentiation there was pertained to genes related for local adaptation. A K value of 7 yielded the highest log-likelihood value in the structure analysis and provided a much clearer picture of the species’ genetic structure than had been obtained in our previous study (Tsumura et al., 2007). Although the extent of the genetic differentiation between varieties was relatively modest, they were clearly distinguished from one another in the structure analysis excluding ONN population in omote-sugi populations. This population might be established immigration from Chugoku populations such as Azouji and others after extinction of natural forest in Kyushu Island (Takahara, 1998). We therefore examined outlier loci both for all populations and within populations of individual varieties after excluding ONN population from omote-sugi variety populations. However, the obtained result is not largely different in case of inclusion of ONN population.

Detection of outlier loci

The number of outlier loci identified with Fdist2 and Arlequin was roughly three times greater than the number identified with BayeScan for all populations, even when using the same significance threshold (0.99) in both cases and considering the hierarchical structure. Fdist2 may be more prone to detecting false positives than other methods (Pérez-Figueroa et al., 2010), but all of the loci identified by BayeScan were also detected by Fdist2 with the exception of six loci (Table 3). This illustrates the necessity of performing multiple tests to detect outlier loci. It may be that BayeScan is more efficient than Fdist2 at identifying true positives. Pérez-Figueroa et al. (2010) conducted a simulation to compare the efficiency of three different programs (DFDIST, DETSELD and BayeScan) for detecting loci under directional selection using dominant markers. They concluded that BayeScan is more efficient than the alternatives because it automatically optimizes the values of its parameters (Pérez-Figueroa et al., 2010). However, some false positives are inevitable even when multitest corrections and multiple methods are used.

Tests for detecting outlier loci that deviate from neutral expectations are unable to identify false positives (type I errors). We therefore conducted all Fdist2, Alrequin and BayeScan tests, imposing a 99% P lower limit for the identification of genes under selective pressure; the expected number of false positives was thus 1026 × 0.01=10.26. As 14 outlier loci were identified by both tests, it appears that at least some are unlikely to be false positives (owing to type I errors). The 14 loci identified by all Fdist2, Alrequin and BayeScan are thus strong candidates as genes involved in adaptation to local environments.

1.4% of the 1026 loci investigated were identified as candidate loci. This proportion of candidate loci is similar to that found in a previous study (2.7%) in which 148 cleaved amplified polymorphic sequence loci were examined (Tsumura et al., 2007). Eckert et al. (2010) identified 22 candidate loci (BF log₁₀>2.0) associated with climate variables from 1730 loci in 682 loblolly pine tree samples covering the full range of the species. Prunier et al. (2011) also identified 10 candidate loci above 99% confidence interval in black spruce. The proportion of detected loci (1.3% and 1.7%, respectively) associated to the climate variables was very similar with our results. The average F_ST value of the 14 candidate loci was 0.1943 (the values for individual candidate loci ranged from 0.1210 to 0.5006), which is substantially higher than the average F_ST value for all loci (0.0391). This means that different populations have substantially different allelic compositions at these 14 loci, presumably because of the different selective pressures acting upon them.

When we examined the status of outlier loci within groups representing the two varieties, two-thirds of the loci detected in all populations were not identified as outliers in either variety group by Fdist2. With BayeScan, only three of the loci detected in all populations were identified as outliers in one of the variety groups or the other. These genes might be important in the two varieties’ adaptation to their different environments. In Japan, the coast of the Japan Sea experiences heavy snowfall but the Pacific Ocean side is quite dry in the winter. The ura-sugi variety is found on the Japan Sea side and has short needles with narrow angles, which are thought to confer selective advantages in areas with heavy snowfall in the winter. The outlier genes were detected in all populations and also in each variety group, and are therefore likely to be important in local adaptation.

Relationships with environmental variables

Correlation analyses were performed for several environmental variables, bearing in mind the possibility that multiple loci might be associated with any one of these variables. Some of the environmental variables were correlated with one another—specifically, the latitude and the longitude (r=0.81), and the latitude and the annual precipitation (r=−0.86). This reflects the fact that the Japanese archipelago extends in a rough line from the southwest to the northeast and the environmental gradient examined in this study mirrors that of the archipelago; the natural C. japonica forest has a wide-ranging distribution, from 30°N, 130°E to 40°N, 140°E (Table 1, Figure 1). There is heavy snowfall on the Japan Sea side in the winter season but the Pacific Ocean side is quite dry in the winter, and the annual precipitation is much higher in the southern part of Japan than elsewhere in the country. There are thus significant differences in the environmental factors affecting the different regions examined in this work, especially in terms of the annual precipitation and the maximum depth of fallen snow (Table 1). There is a cline of environmental variables such as precipitation, temperature and snowfall along the Japanese archipelago from the northeast to southwest. The formation of the Japanese archipelago began around 17 million years ago, with the earliest incarnation of the Japan Sea forming around 7 million years ago (Taira, 2001). Ever since then, there have been significant climatic differences between the Japan Sea side and Pacific Ocean sides of Japan. A C. anglica fossil dated to the later phase of the Miocene has been found in Japan, and is a likely ancestor of C. japonica. Moreover, C. japonica fossils have been found that date back to the Pleiocene, and so post-date the formation of the Japanese archipelago (Uemura, 1981). C. japonica has thus adapted climatic differences over extended periods of time while expanding its distribution, in the face of global climate changes including several glacial and interglacial periods.

Eight of the 14 candidate genes identified by all Fdist2, Arlequin and BayeScan were associated with at least one of the latitude, the mean annual temperature, the daily minimum or maximum temperature, the maximum snow depth and the annual sunshine hours (Table 3). Three of them were associated with two or more environmental variables that correlated with one another, suggesting that the genes might have responded to the same selection pressures.

Putative functions for four of these eight candidate genes were revealed by BLASTx searches. For example, SNPg04024 is a putative MYBPA1 protein and would thus activate the promoters of several of the general flavonoid pathway genes (Bogs et al., 2007). This kind of function is important for defense against diseases and predators, and for the survival of the individual and/or population. Individuals with genotypes that confer advantages in their specific environments would be selected for, changing the allele frequency at the candidate loci in that population. However, it is important to properly account for hitchhiking effects when making suggestions of this kind, because it is possible that the genes identified may simply be closely linked to the true adaptive gene.

In addition to the location and climate data, other environmental variables such as soil type and the species composition of the forest will also have important effects on selection. Phenotypic differences of the two varieties such as needle, branchlets and wood characteristics that are the result of selection will also be important for association study.

Such information should be collected and used in future association analyses.

The clumpy distribution of the outlier loci, and LD

Significant deviations from the Poisson distribution of outlier loci were observed for all marker intervals (P<0.001), indicating that the 64 mapped outlier loci and loci associated with environmental variables were not randomly distributed on the linkage map; instead, they tended to cluster, especially on LG2 and LG11 (Figure 5), in a manner reminiscent of the ‘genomic islands of divergence’ discussed by Nosil et al. (2009). These two genomic regions, which are likely to have a major role in local adaptation, cover around 1.7 cM on LG2 and about 2.1 cM on LG11, corresponding to about 10.2 and 12.6 Mb, respectively, in this species. Three of the outlier loci in the LG11 region (SNPg01330, SNPg02685 and SNPg04024) were associated with some environment variables. Two of the loci in the LG11 region had Bayes factors above 3 (log₁₀), which is generally interpreted as a ‘decisive’ level of significance. The SNPg04024 locus has a Bayes factor of 5, giving a posterior probability of 1. We are thus convinced that the LG11 region is affected by directional selection along the environmental gradient, presumably by factors such as mean annual temperature, daily maximum temperature, hours of sunshine and/or maximum snow depth. Scotti-Saintagne et al. (2004) identified some important regions of their genomes associated with adaptation using both methods of outlier detection and quantitative trait locus mapping.

Surprisingly, the LD among the loci in each of the region is also high, affecting one pair in each of LG2 and LG11. Although LD in forest trees decays rapidly within several thousand basepairs (Neale and Savolainen, 2004, Eckert et al. (2010) have observed high LD in the loblolly pine, affecting the set of five SNPs located on LG 8. The LD of forest trees has to date only been estimated in coding regions; it is possible that the LD values of non-coding regions might be higher than those for coding regions, as is the case in angiosperms (Gaut et al., 2007; Moritsuka et al., 2012). The SNPg04215 locus on LG7 exhibited substantial LD with two loci on LG2, which is unusual; the high LD might reflect the effects of directional selection, resulting in the same selection pressure. Evolutionary mechanisms such as co-adaptation of gene complexes might be affected by this LD (Dobzhansky, 1970).

The outlier loci detected in the other regions with high significance levels are also strong candidates for involvement in local adaptation. It will be necessary to confirm whether these loci correspond to genuine adaptive genes, to compare the sequences of the full-length regions containing these genes in populations from different environments, and to determine the LD associated with these loci. The identification of outlier loci with high significance levels is essential for conservation efforts and will be required for future work on molecular breeding.

Data Archiving

Data have been deposited at Dryad: doi:10.5061/dryad.gt178.

References

Beaumont MA, Balding DJ (2004). Identifying adaptive genetic divergence among populations from genome scans. Mol Ecol 13: 969–980.
Article CAS PubMed Google Scholar
Beaumont MA, Nichols RA (1996). Evaluating loci for use in the genetic analysis of population structure. Proc R Soc Lond Ser B 263: 1619–1626.
Article Google Scholar
Bogs J, Jaffé FW, Takos AM, Walker AR, Robinson SP (2007). The grapevine transcription factor VvMYBPA1 regulates proanthocyanidin synthesis during fruit development. Plant Physiol 143: 1347–1361.
Article CAS PubMed PubMed Central Google Scholar
Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M (2005). Blast2GO: a universal tool for annotation, visualization and analysis infunctional genomics research. Bioinformatics 21: 3674.
Article CAS PubMed Google Scholar
Coop G, Witonsky D, Di Rienzo A, Pritchard JK (2010). Using environmental correlations to identify loci underlying local adaptation. Genetics 185: 1411–1423.
Article CAS PubMed PubMed Central Google Scholar
Dobzhansky T (1970) Genetics of the Evolutionary Process. Columbia University Press: New York.
Google Scholar
Eckert AJ, Bower AD, González-Martínez SC, Wegrzyn JL, Coop G, Neale DB (2010). Back to nature: ecological genomics of loblolly pine (Pinus taeda, Pinaceae). Mol Ecol 19: 3789–3805.
Article CAS PubMed Google Scholar
El Mousadik A, Petit RJ (1996). High level of genetic differentiation for allelic richness among populations of the argan tree [Argania spinosa (L.) Skeels] endemic to Morocco. Theor Appl Genet 92: 832–839.
Article CAS PubMed Google Scholar
Evanno G, Regnaut S, Goudet J (2005). Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol 14: 2611–2620.
Article CAS PubMed Google Scholar
Eveno E, Collada C, Guevara MA, Léger V, Soto A, Díaz L et al (2008). Contrasting patterns of selection at Pinus pinaster Ait. drought stress candidate genes as revealed by genetic differentiation analyses. Mol Biol Evol 25: 417–437.
Article CAS PubMed Google Scholar
Excoffier L, Lischer HEL (2010). Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour 10: 564–567.
Article PubMed Google Scholar
Excoffier L, Smouse PE, Quattro JM (1992). Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131: 479–491.
CAS PubMed PubMed Central Google Scholar
Foll M, Gaggiotti O (2008). A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics 180: 977–995.
Article PubMed PubMed Central Google Scholar
Gaut BS, Wright SI, Rizzon C, Dvorak J, Anderson LK (2007). Recombination: an underappreciated factor in the evolution of plant genomes. Nat Rev Genet 8: 77–84.
Article CAS PubMed Google Scholar
González-Martínez SC, Krutovsky KV, Neale DB (2006). Forest tree population genomics and adaptive evolution. New Phytol 170: 227–238.
Article PubMed Google Scholar
Goudet J (2000). FSTAT: a program to estimate and test gene diversities and fixation indices. Ver. 2.9.1. Available at http://www2.unil.ch/popgen/softwares/fstat.htm.
Grattapaglia D, Plomion C, Kirst M, Sederoff RR (2009). Genomics of growth traits in forest trees. Curr Opin Plant Biol 12: 148–156.
Article CAS PubMed Google Scholar
Hayashi Y (1960). Taxonomical and Phytogeographical Study of Japanese Conifers. Norin-Shuppan: Tokyo, in Japanese.
Hoffmann AA, Willi Y (2008). Detecting genetic responses to environmental change. Nat Rev Genet 9: 421–432.
Article CAS PubMed Google Scholar
Holliday JA, Ritland K, Aitken SN (2010). Widespread, ecologically relevant genetic markers developed from association mapping of climate-related traits in Sitka spruce (Picea sitchensis). New Phytol 188: 501–514.
Article PubMed Google Scholar
Howe GT, Aitken SN, Neale DB, Jermstad KD, Wheeler NC, Chen THH (2003). From genotype to phenotype: unraveling the complexities of cold adaptation in forest trees. Can J Bot 81: 1247–1266.
Article CAS Google Scholar
Japan Meteorological Agency (2002) Mesh Climatic Data of Japan. Japan Meteorological Agency: Japan Meteorological Business Support Center, Tokyo.
Kang BY, Mann IK, Major JE, Rajora OP (2010). Near-saturated and complete genetic linkage map of black spruce (Picea mariana). BMC Genomics 11: 515.
Article PubMed PubMed Central Google Scholar
Kauer M, Dieringer D, Schlötterer C (2003). A microsatellite variability screen for positive selection associated with the ‘out of Africa’ habitat expansion of Drosophila melanogaster. Genetics 165: 1137–1148.
CAS PubMed PubMed Central Google Scholar
Kosambi DD (1944). The estimation of map distances from recombination values. Ann Eugen 12: 172–175.
Article Google Scholar
Lewis PO, Zaykin D (2002). GDA. Available via http://hydrodictyon.eeb.uconn.edu/people/plewis/software.php.
Luikart G, England PR, Tallmon D, Jordan S, Taberlet P (2003). The power and promise of population genomics: from genotyping to genome typing. Nat Rev Genet 4: 981–994.
Article CAS PubMed Google Scholar
Moriguchi Y, Ujino-Ihara T, Uchiyama K, Futamura N, Saito M, Ueno S et al (2012). The construction of a high-density linkage map for identifying SNP markers that are tightly linked to a nuclear-recessive major gene for male sterility in Cryptomeria japonica D. Don. BMC Genomics 13: 95.
Article CAS PubMed PubMed Central Google Scholar
Moritsuka E, Hisataka Y, Tamura M, Uchiyama K, Watanabe A, Tsumura Y et al (2012). Extended linkage disequilibrium in non-coding regions in a conifer, Cryptomeria japonica. Genetics 190: 1145–1148.
Article CAS PubMed PubMed Central Google Scholar
Murai S (1947). Major forestry tree species in the Tohoku region and their varietal problems. In: Kokudo Saiken Zourin Gijutsu Kouenshu, Aomori-rinyukai (eds). pp 131–151. in Japanese.
Neale DB (2007). Genomics to tree breeding and forest health. Curr Opin Genet Dev 17: 1–6.
Article Google Scholar
Neale DB, Ingvarsson PK (2008). Population, quantitative and comparative genomics of adaptation in forest trees. Curr Opin Plant Biol 11: 1–7.
Article Google Scholar
Neale DB, Kremer A (2011). Forest tree genomics: growing resources and applications. Nat Rev Genet 12: 111–122.
Article CAS PubMed Google Scholar
Neale DB, Savolainen O (2004). Association genetics of complex traits in conifers. Trends Plant Sci 9: 325–330.
Article CAS PubMed Google Scholar
Nei M (1977). F-statistics and analysis of gene diversity in subdivided populations. Ann Hum Genet 41: 225–233.
Article CAS PubMed Google Scholar
Nei M, Chesser RK (1983). Estimation of fixation indices and gene diversities. Ann Hum Genet 47: 253–259.
Article CAS PubMed Google Scholar
Namroud MC, Beaulieu J, Juge N, Laroche J, Bousquet J (2008). Scanning the genome for gene single nucleotide polymorphisms involved in adaptive population differentiation in white spruce. Mol Ecol 17: 3599–3616.
Article CAS PubMed PubMed Central Google Scholar
Nosil P, Funk DJ, Ortiz-Barrientos D (2009). Divergent selection and heterogeneous genomic divergence. Mol Ecol 18: 375–402.
Article PubMed Google Scholar
Ohba K (1993). Clonal forestry with sugi (Cryptomeria japonica). In: Ahuja MR, Libby WJ, (eds). Clonal Forestry II. Conservation and Application. Springer-Verlag: Berlin, pp. 66–90.
Google Scholar
Peakall R, Smouse PE (2006). GENALEX 6: genetic analysis in Excel. Population genetic software for teaching and research. Mol Ecol Notes 6: 288–295.
Article Google Scholar
Pelgas B, Bousquet J, Meirmans PG, Ritland K, Isabel N (2011). QTL mapping in white spruce: gene maps and genomic regions underlying adaptive traits across pedigrees, years and environments. BMC Genomics 12: 145.
Article PubMed PubMed Central Google Scholar
Pérez-Figueroa A, García-Pereira MJ, Saura M, Rolán-Alvarez E, Caballero A (2010). Comparing three different methods to detect selective loci using dominant markers. J Evol Biol 23: 2267–2276.
Article PubMed Google Scholar
Pritchard JK, Stephens M, Donnelly P (2000). Inference of population structure using multilocus genotype data. Genetics 155: 945–959.
CAS PubMed PubMed Central Google Scholar
Prunier J, Laroche J, Beaulieu J, Bousquet J (2011). Scanning the genome for gene SNPs related to climate adaptation and estimating selection at the molecular level in boreal black spruce. Mol Ecol 20: 1702–1716.
Article CAS PubMed Google Scholar
Savolainen O, Pyhäjärvi T (2007). Genomic diversity in forest trees. Curr Opinion Plant Biol 10: 162–167.
Article CAS Google Scholar
Savolainen O, Pyhäjärvi T, Knürr T (2007). Gene flow and local adaptation in forest trees. Ann Rev Ecol Evol Syst 38: 595–619.
Article Google Scholar
Scotti-Saintagne C, Mariette S, Porth I, Goicoechea PG, Barreneche T, Bodénès C et al (2004). Genome scanning for interspecific differentiation between two closely related oak species Quercus robur L. & Q. petraea (Matt.) Liebl. Genetics 168: 1615–1626.
Article CAS PubMed PubMed Central Google Scholar
Stapley J, Reger J, Feulner PGD, Smadja C, Galindo J, Ekblom R et al (2010). Adaptation genomics: the next generation. Trends Ecol Evol 25: 705–712.
Article PubMed Google Scholar
Taira A (2001). Tectonic evolution of the Japanese island arc system. Ann Rev Earth Planet Sci 29: 109–134.
Article CAS Google Scholar
Takahashi T, Tani N, Taira H, Tsumura Y (2005). Microsatellite markers reveal high allelic variation in natural populations of Cryptomeria japonica near refugial areas of the last glacial period. J Plant Res 118: 83–90.
Article CAS PubMed Google Scholar
Takahara H (1998). Distribution history of Cryptomeria forest. Pp 207–223. In Yasuda Y, Miyoushi N, (eds) Vegetation history of the Japanese Archipelago. Asakura-Shoten: Tokyo, pp. 207–223. in Japanese.
Google Scholar
Tani N, Takahashi T, Iwata H, Yuzuru M, Ujino-Ihara T, Matsumoto A et al (2003). A consensus linkage map for sugi (Cryptomeria japonica) from two pedigrees, based on microsatellites and expressed sequence taqs. Genetics 165: 1551–1568.
CAS PubMed PubMed Central Google Scholar
Tsukada M (1986). Altitudinal and latitudinal migration of Cryptomeria japonica for the past 20,000 years in Japan. Quat Res 26: 135–152.
Article Google Scholar
Tsumura Y, Kado T, Takahashi T, Tani N, Ujino-Ihara T, Iwata H (2007). Genome-scan to detect genetic structure and adaptive genes of natural populations of Cryptomeria japonica. Genetics 176: 2393–2403.
Article CAS PubMed PubMed Central Google Scholar
Uchiyama K, Ujino-Ihara T, Ueno S, Taguchi Y, Futamura N, Shinohara K et al (2012). Single nucleotide polymorphisms in Cryptomeria japonica: their discovery and validation for genome mapping and diversity studies. Tree Genet Genome e-pub ahead of print 4 May 2012; doi:10.1007/s11295-012-0508-5.
Article Google Scholar
Uemura K (1981). Ancestor and change of distribution in Cryptomeria japonica. Iden 35: 74–79. in Japanese.
Google Scholar
Van Ooijen JW, Voorrips RE (2001) JoinMap: version 3.0, software for the calculation of genetic linkage maps Plant Research International: Wageningen, The Netherlands.
Vasemägi A, Primmer CR (2005). Challenges for identifying functionally important genetic variation: the promise of combining complementary research strategies. Mol Ecol 14: 3623–3642.
Article PubMed Google Scholar
Vitalis R, Dawson K, Boursot P (2001). Interpretation of variation across marker loci as evidence of selection. Genetics 158: 1811–1823.
CAS PubMed PubMed Central Google Scholar
Vitalis R, Dawson K, Boursot P, Belkhir K (2003). DetSel 1.0: a computer program to detect markers responding to selection. J Hered 94: 429–431.
Article CAS PubMed Google Scholar
Voorrips RE (2002). MapChart: software for the graphical presentation of linkage maps and QTLs. J Hered 93: 77–78.
Article CAS PubMed Google Scholar
Weir BS (1996) Genetic Data Analysis II. Sinauer Associates: Sunderland, MA.
Google Scholar
Wright S (1922). Coefficients of inbreeding and relationship. Am Nat 56: 330–338.
Article Google Scholar
Wright S (1978) Evolution and the Genetics of Populations. Variability within and among natural populations vol. 4, The University of Chicago Press: Chicago.
Google Scholar
Yamazaki T (1995). Cryptomeriaceae. In: Iwatsuki K, Yamazaki T, Boufford DE, Ohba H, (eds). Flora of Japan. Vol I. Pteridophyta and Gymnospermae. Kodansha: Tokyo, p. 264.
Google Scholar
Yasue M, Ogiyama K, Suto S, Tsukahara H, Miyahara F, Ohba K (1987). Geographical differentiation of natural Cryptomeria stands analyzed by diterpene hydrocarbon constituents of individual trees. J Jpn For Soc 69: 152–156.
Google Scholar

Download references

Acknowledgements

We thank H Tachida for insightful comments and discussions concerning earlier versions of this manuscript. We also thank Y Taguchi for excellent technical assistance. This study was supported by the Program for the Promotion of Basic and Applied Research for Innovations in Bio-oriented Industry.

Author information

Authors and Affiliations

Department of Forest Genetics, Forestry and Forest Products Research Institute, Tsukuba, Ibaraki, Japan
Y Tsumura, K Uchiyama, Y Moriguchi, S Ueno & T Ihara-Ujino

Authors

Y Tsumura
View author publications
You can also search for this author in PubMed Google Scholar
K Uchiyama
View author publications
You can also search for this author in PubMed Google Scholar
Y Moriguchi
View author publications
You can also search for this author in PubMed Google Scholar
S Ueno
View author publications
You can also search for this author in PubMed Google Scholar
T Ihara-Ujino
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Y Tsumura.

Ethics declarations

Competing interests

The authors declare no conflict of interest.

Additional information

Supplementary Information accompanies the paper on Heredity website

Supplementary information

Supplementary Figure 1 (PPT 93 kb)

Supplementary Figure legend (DOC 34 kb)

Supplementary Table 1 (XLS 43 kb)

Rights and permissions

Reprints and permissions

About this article

Cite this article

Tsumura, Y., Uchiyama, K., Moriguchi, Y. et al. Genome scanning for detecting adaptive genes along environmental gradients in the Japanese conifer, Cryptomeria japonica. Heredity 109, 349–360 (2012). https://doi.org/10.1038/hdy.2012.50

Download citation

Received: 11 January 2012
Revised: 10 May 2012
Accepted: 16 July 2012
Published: 29 August 2012
Issue Date: December 2012
DOI: https://doi.org/10.1038/hdy.2012.50

Keywords

This article is cited by

Genetic diversity, genetic differentiation and demographic history of Cryptomeria (Cupressaceae), a tertiary relict plant in East Asia based on RAD sequencing
- Xinyu Li
- Mengying Cai
- Yoshihiko Tsumura
European Journal of Forest Research (2024)
Key triggers of adaptive genetic variability of sessile oak [Q. petraea (Matt.) Liebl.] from the Balkan refugia: outlier detection and association of SNP loci from ddRAD-seq data
- Endre Gy. Tóth
- Klára Cseke
- Zoltán A. Köbölkuti
Heredity (2023)
Genetic consequence of widespread plantations of Cryptomeria japonica var. sinensis in Southern China: implications for afforestation strategies under climate change
- Mengying Cai
- Kentaro Uchiyama
- Yoshihiko Tsumura
Tree Genetics & Genomes (2023)
Effects of the last glacial period on genetic diversity and genetic differentiation in Cryptomeria japonica in East Asia
- Yoshihiko Tsumura
- Megumi Kimura
- Wenjun Han
Tree Genetics & Genomes (2020)
Inferring the demographic history of Japanese cedar, Cryptomeria japonica, using amplicon sequencing
- Natsuki Moriguchi
- Kentaro Uchiyama
- Junko Kusumi
Heredity (2019)

Subjects

Abstract

Similar content being viewed by others

Introduction

Materials and methods

Investigated populations

SNP genotyping

Genetic structure

Methods for detecting candidate loci for divergence

The gathering of environmental data for each population and the identification of correlations with the genetic data

Mapping the outlier loci on the linkage map, and LD

Results

SNP genotyping

Genetic structure

Outlier loci

Relationships between specific loci and environmental variables

Distribution of outlying loci on the linkage map, and LD

Discussion

Genetic structure

Detection of outlier loci

Relationships with environmental variables

The clumpy distribution of the outlier loci, and LD

Data Archiving

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Competing interests

Additional information

Supplementary information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

This article is cited by

Search

Quick links