A candidate gene association analysis identifies SNPs potentially involved in drought tolerance in European beech (Fagus sylvatica L.)

Studies of genetic variation underlying traits related to drought tolerance in forest trees are of great importance for understanding their adaptive potential under a climate change scenario. In this study, using a candidate gene approach, associations between SNPs and drought related traits were assessed in saplings of European beech (Fagus sylvatica L.) representing trees growing along steep precipitation gradients. The saplings were subjected to experimentally controlled drought treatments. Response of the saplings was assessed by the evaluation of stem diameter growth (SDG) and the chlorophyll fluorescence parameters FV/FM, PIabs, and PItot. The evaluation showed that saplings from xeric sites were less affected by the drought treatment. Five SNPs (7.14%) in three candidate genes were significantly associated with the evaluated traits; saplings with particular genotypes at these SNPs showed better performance under the drought treatment. The SNPs were located in the cytosolic class I small heat-shock protein, CTR/DRE binding transcription factor, and isocitrate dehydrogenase genes and explained 5.8–13.4% of the phenotypic variance. These findings provide insight into the genetic basis of traits related to drought tolerance in European beech and could support the development of forest conservation management strategies under future climatic conditions.

www.nature.com/scientificreports/ metabolic mechanisms. This osmotic adjustment is especially important in roots, allowing their growth under drought 16,17 . It is well-known from the current literature that varying soil characteristics, such as texture, acidity, and nutrient availability, may greatly affect tree growth and its allocation to leaves, stem, and root organs [23][24][25][26][27] . Some studies have shown distinct soil effects on the seasonal timing of leaf development and the metabolism of carbon and nitrogen [28][29][30] . All this may in turn affect the responsiveness of trees to drought through interacting effects on mechanisms controlling tree water balances and mediating physiological and molecular stress tolerance. Indeed, previous studies have shown such interacting effects of drought and soil on annual shoot growth, root development, nitrogen and carbon metabolism, leaf water status, synthesis of anti-oxidative compounds and osmotic stress defense 26,[31][32][33][34] . This demonstrates that the soil may play a significant role in the function of trees under water limited conditions and thus deserves further attention in experimental studies elaborating the drought tolerance of trees.
Common garden experiments and provenance trials have shown that local adaptation is common in forest trees being reflected in differences in adaptive phenotypic traits that match local environmental conditions [1][2][3][4][5] . Indeed, morphological and physiological data suggest that despite its presumed susceptibility to drought, populations of European beech from dry sites are more tolerant to drought than populations from wet sites [35][36][37][38] . For example, by monitoring physiological parameters such as leaf water potential and carbon isotopic discrimination during a 3 year period including the year 2003, one of the driest years in Europe, it was found that beech trees growing in a xeric Mediterranean environment did not demonstrate substantial signs of drought stress compared to beech trees in Central Europe 39 . These findings were further supported by dendroecological data 40,41 and drought experiments with seedlings. When exposed to drought, not only water potential, transpiration rates and growth are less affected in seedlings from xeric sites, but also the root/shoot ratio is higher, which facilitates access to soil water 34,42,43 . Additionally, studies based on genetic markers such as AFLPs and SNPs have shown differences in genetic variation between beech populations growing in environments with different moisture availability 44,45 . Thus, although these results suggest that there is local adaptation to drought in European beech, little is still known about the genetic variation underlying drought tolerance in this species.
The genetic variation underlying many traits in forest trees is likely to be polygenic, controlled by a large number of loci with small and moderate effects [46][47][48] . A common approach for the identification of genetic polymorphisms underlying adaptive traits is to test for associations between phenotypes and genetic variation 49,50 . Association studies can be classified as Genome-Wide Association Studies (GWAS) and candidate gene-based association studies 51 . GWAS allow searching for genetic associations with adaptive variation throughout the genome; however, it is still rather expensive and requires numerous genome-wide markers that often are noncoding or have unknown gene function 52,53 . In contrast, candidate gene-based association studies use genetic variation in genes that are likely directly involved in genetic control of adaptive traits of interest 49,51,54 . Several studies using this approach in plants have been able to successfully identify associations between SNPs in candidate genes and phenotypic traits such as disease resistance 55 , cold-hardiness 56 , bud and flower phenology 57,58 , and wood property traits 59 .
Among different genetic markers, single nucleotide polymorphisms (SNPs) are the preferred choice of molecular genetic markers for association studies, since they are the most common polymorphisms in the genome. They are found in both genic and intergenic regions and can represent mutations that might lead to changes in phenotype [60][61][62] . Recently, the development of SNP markers in candidate genes that are potentially involved in traits related to climate has been reported for European beech [63][64][65] . Not only associations between some of these SNPs and important traits, such as bud burst, bud set, growth rate, chlorophyll fluorescence, and transpiration have been detected 66,67 , but also their association with different geographic and environmental variables, such as elevation, temperature, precipitation, and aridity 45,68,69 , suggesting their relevance for the study of genetic variation important for adaptation to climate change in European beech.
Precipitation gradients might reflect differences in water availability that could promote local adaptation. Therefore, their study represents a good environmental approach to search for genetic variation underlying drought tolerance related traits. In this study, saplings of European beech from populations occurring along steep precipitation gradients in Switzerland were collected. A previous study showed signatures of natural selection in these saplings 45 indicating the importance of further exploring genetic variation that could be involved in drought tolerance in these populations. In this study, the saplings from twelve beech populations along steep precipitation gradients in the upper Rhine and Rhône valleys in Switzerland were selected and subjected to simulated summer drought conditions. The drought simulation experiment with the selected saplings was conducted on two contrasting types of natural forest soil, with acidic or calcareous characteristic, respectively, to account for potentially interfering soil effects. Photosynthesis and growth are critical processes affected by drought 18 , therefore, they were used to evaluate the performance of the saplings under such conditions. Photosynthetic performance was measured using the following three parameters: maximum quantum efficiency of PSII (F V / F M ), performance index of PSII on absorption basis (PI abs ), and total performance index of PSII (PI tot ). Growth was evaluated as increment in stem diameter growth (SDG). The association between these traits and 70 SNPs in 23 candidate genes potentially involved in drought response was tested.
ANOVA analysis showed that there were significant differences in SGD among populations under the drought treatment (SDG 2013: F (11,694) = 4.217, P < 0.001; SDG 2014: F (11,631) = 4.677, P < 0.001; SDG 2013-2014: F (11,641) = 6.561, P < 0.001). The t-test for equality of means revealed that there were significant mean differences between Saxon, one of the most xeric populations, and other four populations with higher precipitation. In all cases, saplings from Saxon showed higher SDG (Table 1). Association analysis. Seventy SNPs were used for the association analyses because six of the 76 initially selected SNPs turned out to be monomorphic (APX1_2, PhyB, 50_320, 52_1_249, 92_166, and 110_1_111).  ). In addition, although relatedness between pairs of individuals was weak (the r QG coefficients varied mostly between 0.0 and 0.1), saplings collected underneath the same adult tree were more related having a higher proportion of the r QG coefficients above 0.1 ( Supplementary Fig. S3). This indicates that accounting for relatedness was important for reducing false positives in the association analysis. Thus, the results of the MLM for the association analysis are also presented.  www.nature.com/scientificreports/ The association analysis using MLM revealed that among the 70 SNPs analyzed, five SNPs (7.14%) showed significant association with the PI abs , PI tot , and SDG traits, while two SNPs (PP2C_315 and 7_520) showed a trend toward significant association with the F V /F M index ( Table 2). The phenotypic variation explained by the SNPs (R 2 ) with significant association was relatively high, between 5.8% and 13.4% (Table 2), and three of them were associated with more than one trait (50_39, IDH_1, and IDH_4). They were located in coding regions: one of them representing a non-synonymous substitution (50_39), and the rest, synonymous substitutions (110_1_293, IDH_1, IDH_4, and 50_232).
Interestingly, each significant association was found only in particular conditions, i.e., the association between SNP and trait was significant only in specific experimental and soil conditions ( Table 2). In all cases the significant associations were observed under drought conditions, except the SNP 110_1_293 in the Cytosolic class I small heat-shock protein gene, which was associated with the PI abs index for chlorophyll fluorescence under control rather than drought conditions. Homozygous individuals TT for this SNP showed a mean PI abs by 9% higher than homozygotes AA and heterozygotes TA, even though the differences were not significant (Fig. 3a). The PI tot index was significantly associated with the SNP 50_39 in the CTR/DRE binding factor gene under drought/ acidic soil conditions. Heterozygotes CA at this SNP showed significantly higher PI tot (by 31% on average) than homozygotes AA (P < 0.001, Fig. 3b). In contrast, no significant differences were found between heterozygotes CA and homozygotes CC, although heterozygotes CA showed higher PI tot (Fig. 3b). The SNP 50_39 also showed significant association with SDG 2013 under drought/acidic soil conditions; heterozygous saplings showed significantly higher SDG (by 38% on average) than AA (P < 0.001) and CC (P < 0.05) saplings (Fig. 4a). Another SNP from the same gene, 50_232, was associated with SDG 2013-2014 also under drought/acidic soil conditions. Homozygotes GG for this SNP had significantly lower SDG (by 28.3% on average) than homozygotes AA (P < 0.001, Fig. 4b) and heterozygotes GA (P < 0.01, Fig. 4b).
Spearmans' correlation analysis showed that correlation between individual heterozygosity and the traits was not significant in most comparisons, except F V /F M and SDG that showed significant association with individual heterozygosity under drought but not under control conditions (Supplementary Table S9).

Discussion
Photosynthesis is a key physiological process in plants. It depends on environmental factors such as light, temperature, and water availability, therefore, its evaluation can provide information on the response of plants to different environmental stresses, indicating whether the plant is being affected. Since chlorophyll fluorescence is correlated with photosynthetic rates, the evaluation of chlorophyll fluorescence is a widely used approach to assess the impact of stress on photosynthesis 70,71 . Table 2. SNPs showing significant or close to significant association with chlorophyll fluorescence traits (FV/ FM, PI abs , and PI tot ) and stem diameter growth (SDG) in different experimental conditions and for the pooled individuals (all saplings). P*-adjusted P value; R 2 -percent of phenotypic variation explained by the SNP; ND-not determined; + -close to significance. In bold are P values that were still significant after applying the Bonferroni correction (P ≤ 0.0014) or after adjusting for a FDR < 0.1. www.nature.com/scientificreports/ Three different parameters based on chlorophyll fluorescence, which provide information on distinct phases of the photochemical process 72 were assessed in our study. It was found that F V /F M was not affected by the drought treatment, since no significant differences were found between saplings under control vs. drought conditions. It has been reported that F V /F M has low responsiveness to drought stress and is not a good indicator to evaluate drought stress tolerance [73][74][75][76][77][78] . In contrast, PI abs and PI tot were better indicators of tolerance to drought stress. Significant differences in PI abs and PI tot were found between saplings under control vs. drought treatments (Fig. 1). Saplings under drought treatment demonstrated lower values of PI abs and PI tot indicating that the biochemical performance of PSII was negatively affected by drought conditions. Similar negative impact on PSII measured by chlorophyll fluorescence has been found in other studies on European beech saplings under drought conditions 73,76,78 .
Although there were no significant differences among populations in chlorophyll fluorescence traits, there is evidence suggesting that saplings from the studied xeric sites recover faster after drought 76 . Other studies also suggest that beech trees from dry habitats are less affected by drought based on the assessment of such parameters as carbon isotopic composition, transpiration rate, and water potential 39,42 . Further exploration by including more individuals and populations from precipitation gradients could help to find actual differences in chlorophyll fluorescence related traits.
Growth is another important trait to evaluate plant responses to different stresses, as it is also affected by environmental conditions. It was observed that under the drought treatment, SDG of all saplings was negatively affected (Fig. 2). Other studies have also reported reductions in diameter increment, ring width, and height growth in beech under a soil water deficit 29,34,79,80 . This reduction in SDG could be explained not only by a decrease in carbon assimilation under drought conditions 81,82 and a prioritization of carbon storage over growth 29 , but also by changes in biomass allocation priorities resulting in an increase in the root/shoot ratio to facilitate access to soil water 8,43 . Interestingly, in 2014, a higher reduction in SDG was observed in saplings subjected to the drought treatment compared to 2013 (Fig. 2). In fact, European beech has a delayed growth response to drought that results in a more marked reduction the next year after the drought occurs 83 , which could explain the higher reduction in SDG observed in 2014.
Additionally, the type of soil also influenced SDG. Indeed, neighboring populations with very similar climate but growing on different soil types can show marked differences in growth 84 . It is known that although European beech is able to grow on different soil types, humid calcareous soils are the most favorable for its optimum growth 85 . In fact, saplings growing on calcareous soil showed higher SDG than saplings growing on acidic soil, especially in 2014 and the overall SDG 2013-2014 (Fig. 2). This effect of soil type on SDG is expected since soil characteristics influence water availability and nutrient uptake, and consequently affect growth 9,29,86 . Furthermore, it was found that the effect of the drought treatment was enhanced by the soil type: saplings growing on acidic soil and under drought conditions showed the lowest SDG (Fig. 2b,c). Similar results were reported by Thiel et al. 34 , who found that under drought conditions a sandy soil with lower water storage capacity and lower nutrient availability had a stronger negative impact on the performance of beech compared to a loamy soil.
Interestingly, although the drought treatment had a negative impact on SDG, comparison among saplings under the drought treatment showed that Saxon, one of the most xeric populations, showed higher SDG than saplings from populations with higher precipitation regimes, indicating that saplings from Saxon were less  (Table 1). These findings are in agreement with other studies reporting that populations of beech from xeric sites are less sensitive to drought 34,40,43 , indicating adaptation to such conditions. Potentially, adaptive genetic variation can be detected by associating variation of adaptive phenotypic traits with allelic variation 50 . Nevertheless, population structure developed due to selectively neutral genetic processes such as genetic drift and relatedness can cause false associations 53 . Thus, it is necessary to account for them in association analyses. In the present study, the inclusion of both population structure and relatedness inferred using the selectively neutral microsatellite markers reduced the chance of false associations. This was demonstrated by a similar distribution between the expected and the observed P-values in the quantile-quantile plots ( Supplementary Figs S1 and S2). Furthermore, although there was weak genetic relatedness between pairs of saplings, the saplings collected underneath the same adult tree were more related (Supplementary Fig. S3) confirming that accounting for relatedness in addition to accounting for population structure was also important for the reduction of false positive associations. Thus, the SNPs deviating significantly from the expected P-values are likely to be truly associated with the studied phenotypic traits.
In total, five SNPs (7.14% of all SNPs) demonstrated significant association with the studied traits ( Table 2). The phenotypic variation explained by these SNPs was relatively high (5.8-13.4%), compared to the results of other studies reporting values between 2.1 and 6.9% in beech and other species 59,66,87,88 . Several reasons could explain these findings. First, controlled experimental conditions could heighten differences in response to drought and, thus, increase the proportion of phenotypic variation explained by the SNPs. Second, candidate genes that participate in stress response were specifically selected for this study, increasing the chances of finding SNPs that could have an effect on the phenotypic traits. However, it cannot be ruled out that sample size could have an effect on the percentage of phenotypic variation explained by the SNPs. Studies with small population sizes could be affected by the Beavis effect that refers to the statistical effect of overestimation of marker-trait association due to small sample sizes in QTL or association studies 89 . It is estimated that a sample size of 1000 individuals would be needed to avoid such overestimation 48 . In our study, a total of 755 saplings and only 187-193 saplings per experimental conditions (treatment/soil) were used (Supplementary Table S10). Thus, an overestimation of R 2 due to low sample size cannot be ruled out. Furthermore, the presence of linkage disequilibrium (LD) could also inflate R 2 of an individual marker, since it could jointly contribute to the trait together with other markers 48 . Among the significant SNPs, LD was found between pairs of SNPs in the same gene: between IDH_1 and IDH_4, and between 50_39 and 50_232 45 . Thus, an inflation of R 2 due to LD also cannot be ruled out. In addition, it is difficult to identify the likely causal SNP when it is in LD with other SNPs that are also significantly associated with a trait.
Noteworthy, significant associations between the SNPs and the phenotypic traits were found only in particular experimental and soil conditions (Table 2). Indeed, the adaptive value of genetic variation and its influence on fitness are highly dependent on the context 90 . Thus, under control/acidic soil conditions, the 110_1_293 SNP in the Cytosolic class I small heat-shock protein gene was significantly associated with the chlorophyll fluorescence index PI abs . Nevertheless, the two alleles at this SNP seem to have low differences in their phenotypic effect, since the differences in the performance of the different genotypes were not significant (Fig. 3a). In contrast, the 50_39 SNP in the CTR/DRE binding factor gene demonstrated association under drought/acidic soil conditions with both PI tot and SDG 2013; apparently, heterozygotes had a better performance under such conditions (Figs. 3b and 4a), indicating overdominance. Under the same conditions, another SNP from the same gene, 50_232, showed association with SDG 2013-2014 (Table 2). Allele A at this SNP was strongly associated with faster SDG under such conditions (Fig. 4b) indicating a dominant mode of action.
Similarly, the IDH_1 and IDH_4 SNPs in the Isocitrate dehydrogenase gene also showed a dominant mode of action: alleles C in IDH_1 and G in IDH_4 were associated with faster SDG under drought/calcareous conditions (Fig. 4). Noteworthy, IDH_1 and IDH_4 were also found to be associated with environmental variables in the original sites such as temperature, precipitation, and humidity 45 indicating that they are very likely to be involved in adaptation of the studied populations. Indeed, previous studies in other tree species using isoenzymes also identified the IDH gene to be significantly involved in tolerance to environmental stresses such as drought and temperature 91,92 . Kim et al. 93 showed increase of the IDH gene expression under drought conditions in maize.
All the SNPs showing significant association with the studied traits were located in coding regions, but only 50_39 represented a non-synonymous substitution, while 110_1_293, IDH_1, IDH_4, and 50_232-synonymous substitutions. Non-synonymous SNPs are considered the most likely target of natural selection because they cause changes in protein sequence, and thus, might directly affect the phenotype 61 . Thus, the synonymous SNPs showing significant association in the present study could be tightly linked to non-synonymous SNPs that were not genotyped. However, synonymous SNPs could also be under direct selection and cause changes in phenotypic traits, since they can affect mRNA splicing, mRNA stability, and translation kinetics [94][95][96] or affect translation and protein synthesis due to tRNA bias 97,98 . This highlights the importance of considering not only non-synonymous SNPs but also synonymous and non-coding SNPs, since they could have a key role in gene expression and posttranscriptional regulation and, thus, influence important adaptive phenotypic traits 47,62,99,100 .
The 7_520 and PP2C_315 SNPs showed close to significant association with the chlorophyll fluorescence index F V /F M . One of the requirements to evaluate association is that the phenotypic trait should be variable enough 53 . It was not the case for F V /F M and that could limit the detection of associations for this trait. However, low sample sizes could also limit the detection of associations. Since complex traits in forest trees are usually controlled by many loci with small to moderate phenotypic effect, large sample sizes are required to detect their effect 53,101,102 . Indeed, it is estimated that population sizes in the range of hundreds or even thousands would be needed 48 . Thus, further exploration with larger sample size and a better coverage of European beech distribution area would be needed not only to confirm association of the 7_520 and PP2C_315 SNPs with F V /F M , but also to discover new associations important for adaptation to drought 103  www.nature.com/scientificreports/ It has been also estimated that the number of causative loci underlying polygenic traits in plants could be in the order of several hundred 48 . Although candidate gene-based association studies are less demanding regarding the number of markers required because they provide a direct link to genes that supposedly affect the trait of interest, they might miss important genetic variation in other genes not included in the study that could be relevant, but that have not been identified yet 51,53,54 . With the development of recent technologies such as Genotype by Sequencing (GBS) the discovery of SNPs throughout the genome of non-model species is now more feasible 104,105 , making possible the implementation of GWAS. This, together with the recent publication of a reference genome of European beech 106 , will help to get more insight into the genetic basis of drought tolerance in this species.
Heterozygosity has usually been considered as advantageous for fitness. For example, highly heterozygous individuals in crops and domesticated animals can express heterosis-better vigor and yields 107 . Heterozygosity was positively associated with growth, stress tolerance, and survival in different pine species 91,107,108 . However, heterozygosity did not always strongly correlate with fitness 109,110 . Indeed, our results show that only heterozygous individuals at 50_39 SNP in the CTR/DRE binding factor gene showed better performance over homozygotes (Figs. 3b and 4a), and, in most of the cases, no significant correlation between total multi-locus individual heterozygosity and the traits was found (Supplementary Table S9). Only F V /F M and SDG 2013 showed significant association with individual heterozygosity under drought conditions. However, the magnitude of the correlation coefficients (0.09-0.17) indicated a relatively weak correlation. Therefore, the better performance of some saplings cannot be attributed to the general level of heterozygosity in the studied SNPs. Further exploration including additional SNPs that provide a better coverage of the genome could lead to a better estimation of the whole-genome heterozygosity and shed light on its association with the performance of the saplings.
In addition to genetic variation, epigenetic variation could also be responsible for differences in phenotypic traits. Epigenetic marks could also explain some of the phenotypic variation that is usually attributed to genetic variation. Thus, if it has an effect on adaptive phenotypic traits and is stable enough for selection to act, epigenetic variation could provide long lasting selective advantage and be important for local adaptation 46,111 . During stress conditions, epigenetic changes play an important role leading to acclimation responses via phenotypic plasticity 101,111 . Moreover, some epigenetic changes can be very stable and inherited from one generation to the next, and thus, facilitate adaptation in important traits such as drought tolerance 112,113 . Even though our results showed that some genotypes confer a better performance under drought conditions, which suggest genetic adaptation, it cannot be ruled out that acclimation, and thus, phenotypic plasticity, could have also played a significant role in the performance of the saplings. Indeed, the tested SNPs could represent adaptive genetic variation and also be involved in acclimation, since their genes participate in stress response. Although it is difficult to disentangle the relative contributions that phenotypic plasticity and genetic variation have on the phenotype 114 , future research should take into consideration both genetic and epigenetic variation, as both contribute to adaptation to environmental changing conditions.

Material and methods
Plant material and experimental design. Twelve beech populations along steep precipitation gradients in the upper Rhine and Rhône valleys in Switzerland were selected (Table 3). From each population, 16-31 adult trees about 50 m apart from each other were selected. Underneath each tree, 2-4 saplings approximately 20 cm tall and 3-5 years old were excavated, for a total of 60-64 saplings per population. In total, 755 saplings were transplanted in spring 2011 in a randomized design to the model ecosystem facility MODOEK of the Swiss Federal Institute for Forest, Snow and Landscape Research (WSL) in Birmensdorf, Switzerland to be subjected to simulated summer drought conditions. The MODOEK facility is composed of 16 chambers equipped with an automated irrigation system for the control of water supply. Each chamber has a sliding roof and is split below ground in two concrete-walled lysimeters with a depth of 150 cm, each containing acidic (Haplic Alisol) or calcareous (Fluvisol) forest soil of similar texture with a pH of 4.0 and 6.9, respectively 26 . In a randomized design, two saplings per population were transplanted on each type of soil in each chamber. Two growing seasons (2011 and 2012) were used for saplings acclimatization. To exclude natural precipitation and control water irrigation, the sliding roofs of the chambers were closed from May to October. During that period, the saplings were irrigated every 2 or 3 days with 50 l of water per m −2 resembling ambient rainfall composition 26 . During hot periods, the amount and frequency of irrigation were incremented to keep soil water content above 0.15 m 3 m −3 at 10 cm soil depth and counterbalance high evapotranspiration.
Drought conditions were imposed in eight of the chambers in 2013 and 2014 by excluding water irrigation from May to August. In particularly hot days, short irrigation pulses were applied to avoid too fast soil drying and irreversible damage of the saplings. The remaining eight chambers were used as control by not applying drought conditions and maintaining irrigation every 2 or 3 days as described above. A summary on the number of saplings from each population under each treatment (control and drought), type of soil (acidic and calcareous), and chamber is provided in Supplementary Table S11.
Water relations of both soil and saplings was monitored to assess soil drought and physiological stress intensities. In each lysimeter, volumetric soil water content was measured at 10 cm soil depth using PC-controlled moisture probes and 5TM soil moisture and temperature sensor (Decagon Devices, Inc., Pullman, WA, USA). Predawn leaf water potential was measured across a subset of 58 saplings on acidic soil (mesic population Mastrils and xeric population Saxon) using a Scholander pressure chamber M 600 (Mosler Tech Support, Berlin, Germany). DNA extraction. Leaves from the saplings were collected, dehydrated with silica gel and stored at room temperature until DNA extraction. DNA extraction was performed with the DNeasy 96 Plant Kit (Qiagen, Hilden, Germany). To examine the amount and quality of the extracted DNA, electrophoresis in 1% agarose gel using 1X TAE as running buffer was performed. DNA was stained and visualized with Roti-Safe GelStain (Roth, Karlsruhe, Germany) under UV illumination, and compared with a Lambda DNA size ladder (Roche, Mannheim, Germany).

Microsatellite amplification.
To establish the genetic relatedness among pairs of individuals, saplings were genotyped at 13 highly polymorphic microsatellite markers. Three of them were EST-linked markers originally developed for Quercus robur-GOT066, FIR065 and FIR004 116 . The rest of the markers were random genomic microsatellites developed for F. crenata (sfc0018, sfc0161, sfc1063) 117 and F. sylvatica (FS3-04, msf11,  csolfagus_06, csolfagus_19, Fagsyl_002929, Fagsyl_003994) [118][119][120][121] . The markers were pooled in four multiplexes for the PCR amplification: all EST markers were included in the first multiplex; all sfc markers in the second multiplex; FS3-04 and msf11 markers in the third multiplex; and csolfagus and Fagsyl markers in the fourth multiplex. Primers for the GOT066, FIR065, sfc0018, sfc1143, FS3-04, and Fagsyl_002929 markers were labeled with 6-hexachlorofluorescein (HEX) fluorescent dye, whereas primers for the FIR004, sfc0161, sfc1063, mfs11, csolfa-gus_06, csolfagus_19, and  www.nature.com/scientificreports/ Selection of candidate genes and SNPs. To assess genetic variation possibly underlying the evaluated traits, candidate genes were selected as described in Cuervo-Alarcon et al. 45 . In brief, based on previous studies on SNP discovery in European beech [63][64][65] , candidate genes that very likely participate in stress response according to the UniProt (www.unipr ot.org) and the Arabidopsis Information Resource (TAIR) (www.arabi dopsi s.org) databases 122,123 were selected. SNPs for genotyping in those genes were chosen based on their identification as haplotype tag SNPs by the software htSNPer 1.0 124 or because they already showed signs of natural selection in previous studies 66,68 . For primer design, sequences surrounding the SNPs were sent to LGC Genomics Ltd. where SNP genotyping was performed using the PCR-based KASP genotyping assay (Hoddesdon, UK). In total, 70 polymorphic SNPs in 23 candidate genes were genotyped in the saplings and represented 19 non-synonymous, 26 synonymous and 25 non-coding SNPs (Supplementary Table S12).
Statistical analysis of phenotypic traits. Normality and homogeneity of variances for each trait was tested using Shapiro-Wilk test and Levene's test, respectively, using the IBM SPSS statistics software (https :// www.ibm.com/analy tics/spss-stati stics -softw are). F V /F M was not normally distributed for most of the groups (Supplementary Table S13); in contrast, data for the rest of the traits was normally distributed either for the majority or for all the groups (Supplementary Tables S14-S18). As mixed-effects models are robust to departures from normality, they were used to test for statistically significant differences in both chlorophyll fluorescence and SDG traits using the Minitab Statistical Software (https ://www.minit ab.com).
For the chlorophyll fluorescence traits F V /F M , PI abs , and PI tot , the effects of population, treatment, and chamber were tested for significant differences, while soil was not accounted for because chlorophyll fluorescence was assessed only in saplings growing on acidic soil. For the mixed-effects model, population was considered as a random factor, whereas treatment and chamber were considered fixed factors. In addition, a t-test for equality of means was performed to compare saplings under control vs. drought treatment, and also one way-ANOVA was carried out to test for differences among populations under drought treatment. Both t-test and ANOVA were performed using the IBM SPSS statistics software (https ://www.ibm.com/analy tics/spss-stati stics -softw are).
For the SDG traits, effects of population, treatment, soil and chamber were tested for significant differences. For the mixed-effects model, population was considered a random factor, whereas treatment, chamber, and soil were considered fixed factors. Besides, t-tests for equality of means to compare saplings under control vs. drought treatment, saplings on acidic vs. calcareous soil and populations under drought treatment were carried out using the IBM SPSS statistics software (https ://www.ibm.com/analy tics/spss-stati stics -softw are).

Association analysis.
Associations between SNPs and the phenotypic traits were tested using the TASSEL 5.0 software 125 . For the analyses, individuals were grouped according to the experimental conditions: treatment and treatment/soil. Besides, the analysis for all saplings together regardless of experimental conditions was also performed. In this case, phenotypic data were normalized by dividing the data by the mean of each experimental condition. Association analyses were performed using the general linear model (GLM) and the mixed linear model (MLM). The GLM took into account population structure that potentially could be developed due to selectively neutral genetic processes such as genetic drift as a confounding factor (Q), where the Q-matrix was calculated using the tentatively selectively neutral microsatellite markers and the STRU CTU RE 2.3.4 software 126 as described in Cuervo-Alarcon et al. 45 . The MLM took into account both Q and kinship (K) as confounding factors. The K-matrix was calculated also based on the microsatellite genotypes as the pairwise relatedness coefficient r QG estimated according to Queller and Goodnight 127 using the GenAlEx software 128,129 ; negative values of r QG were set to zero. Two different methods were used to account for multiple testing in GLM and MLM analyses: Bonferroni correction using P-value cut-off equaled 0.00143 and adjustment of P-values using the false discovery rate 130 (FDR equal to 0.1) implemented in the R function "p.adjust" 131 . Only SNPs with a minimum allele frequency (MAF) higher than 0.05 were used for the association analysis.
Furthermore, relationships between heterozygosity and the measured traits were evaluated. Individual heterozygosity (proportion of loci that are heterozygous across an individual) was calculated using the GenAlEx 6.5 software 128,129 . Spearman's correlation analyses between individual heterozygosity and the traits were performed using the R 3.3.1 software 131 .

Data availability
The data on candidate genes, polymorphic SNPs, respective nucleotide sequences, their PCR primers and the NCBI GeneBank and the EMBL nucleotide sequence database accession numbers used in this study are publicly available and provided in Table S12 and references cited there.