Date: 2020-10-08

Sample and SNP data

We used the SNP dataset of Pfeiffer et al.30, which was produced by genotyping-by-sequencing43 and consists of 1709 SNPs filtered and genotyped for 158 specimens of the plant C. gayana (Cyperaceae) distributed among 17 high Andean wetlands in Chile's Norte Chico region (latitudinal range: 26°S–32°S; altitudinal range: 2852–4307 m a.s.l.). This region is characterized by a large precipitation gradient increasing with latitude, with climatic conditions varying from Mediterranean to hyperarid. Sample size per wetland ranged from 4 to 10 C. gayana individuals. Detailed information regarding the study area, study sites, species biology and sampling, DNA extraction and SNP data production can be found elsewhere30,42,44. A map of the study area is provided in Supplementary Fig. S5.

Detection of SNPs associated with climatic variables and SNP dataset generation

Statistical analyses for detection of candidate climate-selected loci were performed in the R environment (https://cran.r-project.org). We performed redundancy analyses (RDA) to detect putatively selected loci based on the correlation between individual genotypes and climatic variables potentially acting as selective factors. Prior to this analysis, we used a clustering of variables around latent variables (CLV)45 to reduce the initial set of 21 predictor variables, composed of longitude, latitude and the 19 standard WorldClim bioclimatic variables (BIO1-19) at 30 arc-second spatial resolution, downloaded from WorldClim version 246 (available at https://worldclim.org/version2). We followed the recommendations of Vigneau et al.47 in determining the number of clusters, then selected the variable most representative of each cluster (Supplementary Fig. S1). Details of the environmental and spatial data for each wetland are reported in the Supplementary Note on variable selection and detection of candidate climate-selected loci. RDA was then carried out as described in Forester et al.31 using the package vegan48. We calculated variance inflation factors (VIF) to check for multicollinearity, taking 10 as the maximum acceptable value. The significance of the RDA model and of each individual RDA axis was tested using ANOVA-like permutation tests with 9999 randomizations. We identified outlier loci on each significant axis with a cut-off of 2.5 SD from the mean. Based on these results, we separated the SNPs into candidate climate-selected and non-candidate loci datasets, composed of outlier loci associated with climate variables (n = 90) and loci not identified as outliers by any of the detection methods (n = 1421), respectively.
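The outlier rule described above can be illustrated with a minimal Python sketch. It is not the code used in the study, which relied on R and vegan. The function names, the use of the sample standard deviation and the loading values are assumptions made for illustration only.

```python
from statistics import mean, stdev

def outlier_loci(loadings, z=2.5):
    """Indices of loci whose loading lies more than z sample SDs from the mean."""
    m = mean(loadings)
    s = stdev(loadings)
    return [i for i, x in enumerate(loadings) if abs(x - m) > z * s]

def candidates_across_axes(loadings_by_axis, z=2.5):
    """Union of the outlier loci found on each significant axis."""
    found = set()
    for loadings in loadings_by_axis:
        found.update(outlier_loci(loadings, z))
    return sorted(found)

# Hypothetical loadings of 12 loci on two significant RDA axes (made-up values).
axis1 = [0.01, -0.02, 0.00, 0.03, -0.01, 0.02, -0.03, 0.01, 0.00, -0.02, 0.02, 0.90]
axis2 = [0.01, -0.02, 0.00, -0.90, -0.01, 0.02, -0.03, 0.01, 0.00, -0.02, 0.02, 0.03]

candidates = candidates_across_axes([axis1, axis2])
```

In this toy example, one locus stands out on each axis and the candidate set is the union of the two.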
To remove as many loci with potentially large effects as possible, the non-candidate loci dataset was constructed by eliminating from the 1709 original loci the 229 outliers detected by Pfeiffer et al.30, 17 monomorphic loci, and the 42 of the 90 climate-selected loci that were unique to the present study. However, to ensure an adequate comparison of variation patterns between candidate climate-selected and non-candidate loci using a matching number of loci, we created a size-matched non-candidate dataset by randomly selecting 90 loci out of the 1421 non-candidate loci. This dataset was used in all subsequent analyses. All analyses were also performed using the full non-candidate loci dataset; patterns detected with the 90 and the 1421 non-candidate loci were always consistent (Supplementary Fig. S7).
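The locus accounting and the random subsampling can be checked with a short Python sketch. The locus identifiers, the function name and the seed are hypothetical placeholders; the original subsample was not necessarily drawn this way.

```python
import random

TOTAL_LOCI = 1709            # SNPs in the original dataset
PREVIOUS_OUTLIERS = 229      # outliers detected by Pfeiffer et al.
MONOMORPHIC = 17             # monomorphic loci
NEW_CLIMATE_OUTLIERS = 42    # climate-selected loci unique to the present study
N_CANDIDATES = 90            # size of the candidate climate-selected dataset

n_non_candidate = (TOTAL_LOCI - PREVIOUS_OUTLIERS
                   - MONOMORPHIC - NEW_CLIMATE_OUTLIERS)

def subsample_loci(locus_ids, k, seed):
    """Draw k loci without replacement; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    return sorted(rng.sample(locus_ids, k))

# Placeholder identifiers standing in for the real non-candidate locus names.
non_candidate_ids = list(range(n_non_candidate))
matched_subset = subsample_loci(non_candidate_ids, N_CANDIDATES, seed=1)
```

The subtraction gives 1421 non-candidate loci, from which a matched subset of 90 is drawn.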

Genetic structure of the candidate climate-selected and non-candidate loci

Genetic differentiation between wetlands was assessed through various traditional approaches (Supplementary Note on population genetic structure). In addition, we investigated the genetic structure common to both the non-candidate and candidate climate-selected SNP data by carrying out co-inertia analyses (CoIA)32. CoIA is a symmetric canonical analysis33 that ordinates matching objects (e.g. individuals) from two data matrices, in this instance the non-candidate and candidate climate-selected loci datasets, along successive canonical axes. The CoIA axes are calculated such that the congruence between the two tables, approximated by their squared covariance (i.e. inertia), is maximized. CoIA first requires separate ordinations of the two matrices, which we performed by subjecting the non-candidate and candidate climate-selected datasets to principal component analyses. For each dataset, we retained the number of principal components needed to account for 90% of the total variation. To test the significance of the correlation between the genetic variation of the non-candidate and candidate climate-selected loci datasets, we calculated the RV coefficient, a multivariate generalization of the Pearson correlation coefficient33, and tested its significance based on 9999 permutations. Both the principal component and CoIA analyses were performed using the ade4 package49. The projection of the genetic profiles of the individuals onto the co-inertia space allows common genetic patterns between the non-candidate and the candidate climate-selected data to be visualized. The distance between the non-candidate and candidate climate-selected datasets in the CoIA space reflects the effects of processes causing the two datasets to diverge. Thus, to quantify the genome-wide signature of these processes, we computed, for each individual, the Euclidean distance between its CoIA scores in the two datasets on the first seven canonical axes, which together represented 94.7% of the co-inertia.
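For readers unfamiliar with the RV coefficient, the following Python sketch shows one common formulation and a row-permutation test. It is an illustration only and does not reproduce the ade4 routines used in the study. The function names are hypothetical, and the p-value convention (hits + 1) / (permutations + 1) is an assumption.

```python
import random
from math import sqrt

def center(rows):
    """Subtract the column means from a table stored as a list of rows."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def gram(rows):
    """Cross-product matrix between rows (individuals by individuals)."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in rows] for r1 in rows]

def inner(a, b):
    """Sum of element-wise products, i.e. trace(A B) for symmetric matrices."""
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def rv_coefficient(x, y):
    """RV coefficient between two tables that share the same individuals."""
    wx, wy = gram(center(x)), gram(center(y))
    return inner(wx, wy) / sqrt(inner(wx, wx) * inner(wy, wy))

def rv_permutation_test(x, y, n_perm=9999, seed=1):
    """Permute the rows of y to build a null distribution for RV."""
    rng = random.Random(seed)
    observed = rv_coefficient(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if rv_coefficient(x, y_perm) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

RV ranges from 0 to 1 and equals 1 when one table is a rescaled copy of the other, which makes it a convenient summary of the overall agreement between the two ordinations.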
We analyzed whether the divergence levels varied geographically by applying a linear model to test for the effects of basins and sites nested within basins.
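The per-individual divergence measure described above can be sketched in Python as follows. This is a minimal illustration that assumes the CoIA scores of each dataset are available as one row per individual, in the same order in both tables. The function name and the toy scores are hypothetical.

```python
from math import sqrt

def individual_divergence(scores_a, scores_b, n_axes=7):
    """Euclidean distance between paired score vectors on the first n_axes axes."""
    return [
        sqrt(sum((p - q) ** 2 for p, q in zip(a[:n_axes], b[:n_axes])))
        for a, b in zip(scores_a, scores_b)
    ]

# Toy scores for two individuals on two axes (made-up values).
non_candidate_scores = [[0.0, 0.0], [1.0, 1.0]]
candidate_scores = [[3.0, 4.0], [1.0, 1.0]]

divergence = individual_divergence(non_candidate_scores, candidate_scores, n_axes=2)
```

An individual whose two profiles coincide in the co-inertia space has a divergence of zero; larger values indicate stronger disagreement between the two sets of loci.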

Factors linked to divergence between non-candidate and candidate climate-selected loci datasets

We searched for factors linked to population divergence between the non-candidate and candidate climate-selected loci datasets using linear models. To assess divergence between the two datasets at the population level, we averaged the divergence estimates of all individuals belonging to the same wetland. The set of explanatory variables included climatic variables potentially acting as selective factors and the genetic diversity of candidate climate-selected and non-candidate loci, estimated using gstudio50 as the expected heterozygosity (He) of each SNP dataset (Supplementary Table S5) over all loci with a minimum of four genotyped individuals per population30. We used the leaps package51 to identify the optimal subset of bioclimatic variables for use in detecting candidate climate-selected loci. We considered that these variables can have both linear and quadratic effects. The best model was selected based on a performance analysis of the models returned by leaps, carried out using the AICcmodavg package52. To estimate the contribution of each of the predictors found to be significant, we performed a variation partition using vegan’s varpart function48. Before running these analyses, we ensured that there was no effect of the number of genotyped individuals, as this may act as a confounding factor.
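Two of the quantities used above, the wetland-level mean divergence and the expected heterozygosity, can be sketched in Python. This is an illustration under stated assumptions, not the gstudio implementation: genotypes are assumed to be biallelic and coded as counts of one allele (0, 1 or 2, with None for missing calls), He is computed as 2pq without small-sample correction, and all function names are hypothetical.

```python
from statistics import mean

def expected_heterozygosity(genotypes, min_n=4):
    """He = 2pq at one biallelic locus; None if fewer than min_n genotyped."""
    called = [g for g in genotypes if g is not None]
    if len(called) < min_n:
        return None
    p = sum(called) / (2 * len(called))
    return 2 * p * (1 - p)

def mean_he(genotypes_by_locus, min_n=4):
    """Average He over the loci that meet the minimum sample size."""
    values = [expected_heterozygosity(g, min_n) for g in genotypes_by_locus]
    values = [v for v in values if v is not None]
    return mean(values) if values else None

def wetland_means(divergence, wetland_labels):
    """Average the individual divergence estimates within each wetland."""
    groups = {}
    for d, w in zip(divergence, wetland_labels):
        groups.setdefault(w, []).append(d)
    return {w: mean(v) for w, v in groups.items()}
```

For example, a locus with genotypes 0, 1, 2 and 1 in four individuals has an allele frequency of 0.5 and therefore He = 0.5 under this formula.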