## Introduction

In populations, natural selection targets genomic regions with evolutionarily conserved functions or with genetic variants contributing to adaptation to changing environments. Patterns of DNA sequence polymorphisms in these regions are expected to bear the signature of directional or balancing, positive selection on adaptive mutations, or of negative selection against deleterious mutations (Bamshad and Wooding 2003; Nielsen 2005; Vitti et al. 2013). Identifying genes showing such selection signatures has been a major goal of population genetics over the last decades. Many statistical methods have been developed, accounting for the neutral evolution expected for molecular polymorphisms in populations with varying degrees of genetic structure or with particular demographic histories (Pavlidis and Alachiotis 2017; Vitti et al. 2013; Weigand and Leese 2018). Thanks to high-throughput sequencing technologies, these methods can now be used to perform Genome-Wide Scans for Selection (GWSS) using single nucleotide polymorphism (SNP) datasets (Ahrens et al. 2018; Haasl and Payseur 2016; Oleksyk et al. 2010). Although GWSS have identified cohorts of genes associated with past or ongoing selective processes, they are not designed to identify gene coadaptation, resulting from epistatic selection on interacting genes (Otto and Whitlock 2009).

Few studies have examined the impact of epistatic selection models in population samples using DNA polymorphisms. Simulations of two-locus epistatic models with different degrees of recombination (i.e., 0 < c < 0.5) in a panmictic population have shown that the efficiency of epistatic selection and its statistical detection are improved if standing genetic variation already exists, rather than if selection operates on de novo mutations (Takahasi 2009; Takahasi and Tajima 2005). Simulations of two-locus coadaptation in subdivided populations have also shown that the fixation probability of the coadapted haplotype across all subpopulations increases upon moderate migration and isolation (Takahasi 2007).

Adaptive epistatic interactions between alleles at two independent loci are expected to generate linkage disequilibrium (LD). It has been shown that the correlation coefficient r, an LD measure related to r2, can be used to detect epistatic selection between two bi-allelic loci in a population because it is a directional measure which can indicate an excess of ancestral and derived allelic associations, relative to recombinant allelic associations (Takahasi and Innan 2008). This will detect epistatic selection either in a coadaptation model where two derived alleles can form a coadapted allelic combination, or in a compensatory model where the two derived alleles are individually deleterious but compensate when combined (Piskol and Stephan 2008). More recently, a simulation study of epistatic selection in structured populations has suggested the use of the DIS2 measure of LD (Ohta 1982a; 1982b), which quantifies how the frequencies of the different two-locus haplotypes in a subpopulation depart from the average frequencies across all subpopulations (Id-Lahoucine et al. 2019). However, high DIS2 values were found in simulated models of two-locus epistatic selection but also of single-locus selection at two independent loci. This suggests that DIS2 cannot always distinguish between these two models. In addition, population structure, genetic drift, and relatedness among individuals due to nonrandom mating also act as confounders because they increase genome-wide levels of LD and generate long-distance LD (Glémin et al. 2006; Mangin et al. 2012; Nordborg 2000; Slatkin 2008), which can falsely be interpreted as signatures of epistatic selection (Zhang et al. 2004).

A significant improvement towards capturing the LD due to physical linkage in a structured population with various degrees of relatedness among individuals was the introduction of r2v, an r2 measure which includes the kinship matrix into the calculation in order to penalize correlated two-locus genotypic data arising from high levels of relatedness (Mangin et al. 2012). Subsequently, the r|a measure of LD was proposed to identify interspecific genetic incompatibilities corresponding to pairs of loci showing an excess of ancestral haplotype combinations in admixed populations (Schumer and Brandvain 2016). The r|a measure is a partial correlation coefficient between genotypic data at two loci given the genome-wide ancestry proportion, a, between two species. Although such type of measure represented an improvement, authors acknowledged that genetic relationships in populations with complex demographic histories and genetic structures were not fully taken into account (Schumer and Brandvain 2016).

In this study, we propose a statistical test to detect epistatic selection in heterogeneously structured populations (i) between two bi-allelic SNPs by using the rv measure, or (ii) between two genomic regions including each multiple SNPs, by using the corPC1v measure. This latter measure captures the quantitative correlation between the first principal component (PC1) summarizing the multi-SNP genotypes for each genomic region. Using simulations of genome-wide SNP data in structured diploid populations with random to self-mating processes under two epistatic selection models, we show that, compared with r and corPC1, rv, and corPC1v (i) drastically reduced the background LD generated by population structure and relatedness between individuals; (ii) showed an equivalent or a lower power to detect epistatic selection, depending on the mating process, on the dominance of selected mutations and on the selection model; and (iii) T statistics ($$T = \sqrt {n - 2} \frac{r}{{\sqrt {1 - r^2} }}$$) derived from rv or corPC1v (i.e., $$T_{r_v}$$ and $$T_{corPC1_v}$$) fitted a Student distribution τ(n−2) under the null hypothesis of independence between the two tested loci. Hence, unlike T statistics derived from r or corPC1 (i.e., Tr and TcorPC1), $$T_{r_v}$$ and $$T_{corPC1_v}$$ can be used for statistical testing of the correlation coefficient between two loci, while accounting for population structure and heterogeneous relatedness between individuals.

Empirical detections of epistatic selection on SNP data are scarce in the literature (Brachi et al. 2015; Caicedo et al. 2004; Csilléry et al. 2014; Grzeskowiak et al. 2014; Hu and Hu 2015; Pool 2015). We assayed our statistical test in the frame of Genome-wide Epistatic Selection Scans (GWESS), with genomic SNP data from two different model organisms: the legume plant Medicago truncatula and humans. As a proof of concept, we first described in M. truncatula the detection of epistatic selection between the Super Numeric Nodule MtSUNN gene, encoding a receptor which is central for the negative regulation of symbiotic root nodulation, and the CLAVATA3-like (CLE) signaling peptide MtCLE02. Accordingly, an ectopic expression of the MtCLE02 gene in M. truncatula wild-type (WT) and sunn mutant roots experimentally demonstrated a MtSUNN-dependent negative role of the MtCLE02 gene on nodulation, hence validating functionally the genetic interaction between these two genes. In humans, we illustrated the usefulness of the approach by identifying a significant epistatic or coselection signal in Central South Asian populations between SLC24A5 and EDAR genes, encoding, respectively, a cation exchanger affecting pigmentation in zebrafish and human (Lamason et al. 2005) and a receptor involved in the development of hair follicles, teeth, and sweat glands (Botchkarev and Fessing 2005; Sadier et al. 2014). Together with the fact that SLC24A5 and EDAR were previously shown to be under strong positive selection in Europe and East Asian populations, respectively (Bryk et al. 2008; Sabeti et al. 2007; Speidel et al. 2019), our results highlight the role of epistatic selection or coselection in shaping gene coadaptation during the evolution of populations.

## Materials and methods

### Genetic models of epistatic selection

We follow fitness genotype formalization under epistatic selection models as in (Takahasi and Innan 2008; Takahasi and Tajima 2005). Two independent bi-allelic loci A and B were considered, with ancestral alleles A and B, and derived alleles a and b, in a haploid population. The coadaptation model consists in positively selecting the two-locus ab combination. The compensatory model consists in selecting against the Ab and aB two-locus combinations, but not against AB and ab. The coefficient s is used to positively or negatively select two-locus genotypes (Table 1), and in the neutral model all fitness values are set up to 1. In a diploid population, the two-locus fitness expression is more complex because it depends on the level of dominance of the derived alleles (Supplementary Table S1).

### SNP-based and window-based LD measures of epistatic selection

In a diploid organism, at a given bi-allelic SNP with alleles coded 0 and 1, the three possible genotypes are (00, 01, 11), which can be coded as the allelic dose of allele 1 (0, 1, 2). The measure on unphased genotypes between two bi-allelic loci is defined by the correlation coefficient r between vectors of genotypes at the SNPs l and m, Xl and Xm (Hill and Robertson 1968; Rogers and Huff 2009; Weir 1979):

$$r = \frac{{\mathop {\sum }\nolimits_{i = 1}^n \left( {X_i^l - \overline {X^l} } \right)\left( {X_i^m - \overline {X^m} } \right)}}{{\sqrt {\mathop {\sum }\nolimits_{i = 1}^n \left( {X_i^l - \overline {X^l} } \right)^2} \sqrt {\mathop {\sum }\nolimits_{i = 1}^n \left( {X_i^m - \overline {X^m} } \right)^2} }}.$$

In the case where 0 and 1 are the ancestral and derived alleles, respectively, a positive sign of r indicates that combinations of ancestral and derived alleles (i.e., 00 and 11) preferentially segregate in individuals at the two SNPs, compared with alternative combinations (i.e., 01 and 10). At two physically unlinked loci in a panmictic population, this measure allows to detect fitness interactions between two new mutations under the coadaptation or the compensatory model (Takahasi and Innan 2008). In the context of GWESS with high-density SNP data, we propose to use the corPC1 measure of LD between two genomic regions containing each multiple SNPs. The first principal component PC1l is used to summarize quantitatively the multi-SNP genotypes of the genomic region l (see (McVean 2009)). Then, corPC1 is the correlation coefficient between vectors of summarized multi-SNP genotypes of the two genomic regions l and m, PC1l and PC1m:

$$cor_{PC1} = \frac{{\mathop {\sum }\nolimits_{i = 1}^n \left( {PC1_i^l - \overline {PC1^l} } \right)\left( {PC1_i^m - \overline {PC1^m} } \right)}}{{\sqrt {\mathop {\sum }\nolimits_{i = 1}^n \left( {PC1_i^l - \overline {PC1^l} } \right)^2} \sqrt {\mathop {\sum }\nolimits_{i = 1}^n \left( {PC1_i^m - \overline {PC1^m} } \right)^2} }}.$$

However, as mentioned in (Mangin et al. 2012), population structure and relatedness among individuals generate nonindependence between individuals and tend to bias upwardly the LD values. This is particularly the case in highly inbred or predominantly selfing species (Glémin et al. 2006). At a given locus, Mangin and collaborators proposed to weight the observations by multiplying the vector of genotypes by $$V^{ - \frac{1}{2}}$$, where V is the kinship (or relatedness) matrix among individuals. V is built with the Vij covariance for all pairs (i,j) of individuals. Vij is the average number of identical genotypes between individuals i and j over all SNPs, in a genome-wide SNP dataset. This measure of Identity In State is a good proxy of Identity By Descent as SNP markers are likely to be accurately modeled by an infinite site mutation model. Consequently, since r is the Pearson correlation coefficient, rv can be computed as $$cor(V^{ - \frac{1}{2}}X^l,V^{ - \frac{1}{2}}X^m)$$ (Mangin et al. 2012) and equivalently corPC1v as $$cor(V^{ - \frac{1}{2}}PC1^l,V^{ - \frac{1}{2}}PC1^m)$$.

### Statistical test of epistatic selection based on LD

Under the hypothesis that observations within Xl and Xm (respectively within PC1l and PC1m) are independent, then r (respectively corPC1) can be used to obtain the T statistics:

$$T = \sqrt {n - 2} \frac{r}{{\sqrt {1 - r^2} }},$$

which follows a Student distribution τ(n−2). However, in the case were observations are not independent, i.e., when the genotypes at a given locus are correlated within the population due to nonrandom mating and/or between populations due to structure, we then expect that only the T statistics obtained using rv or corPC1v follow the τ(n−2) distribution. In the case where the ancestral/derived allele status is known at the SNPs, a positive sign of r (or rv) strictly reflects the coadaptation and compensatory epistatic models (see the previous section), and a unilateral test can be performed with alternative hypothesis being “r (or rv) > 0”. If the ancestral/derived allele status is not known, the sign of r is not interpretable, but the p value of the test can be computed on either side of the null distribution. Likewise, whether the ancestral/derived allele status of the SNPs is known or not, the sign of corPC1 (or corPC1v) is not interpretable since PC1 or PC1 with opposite signs imply an identical ranking of individuals genotypes (or relatedness) in a given genomic region (see for instance (Li and Ralph 2019)).

### Simulation and LD-based detection of epistatic selection in structured population

Simulations of neutral evolution at two independent bi-allelic loci were carried out in order to evaluate the distribution of Tr, TcorPC1, $$T_{r_v}$$, and $$T_{corPC1_v}$$ statistics under the null hypothesis and their fit to a Student distribution τ(n−2). In addition, two-locus epistatic selection was simulated in the same framework to evaluate the statistical power (1-β) of these measures to detect two-locus epistatic selection given a type I error (α) for the null hypothesis. Details of our simulation procedure are provided in the Supplementary Method File 1, along with the python code used to run the simulations. Briefly, genome-wide (four chromosomes) SNP data (~15,000 SNPs per chromosome) in a two-population split model with 250 diploid individuals per population during 300 generations (the ancestral population before the split was generated by coalescent simulations) were replicated 1000 times for all combinations of the following parameter settings: (i) selection regimes as neutrality, coadapted (COAD) and compensatory (COMP) two-locus epistatic selection, or additive (ADD) two-locus selection (all selection models starting 100 generations after the split time), (ii) random or self-mating (95% selfing rate) since the initial generation, and (iii) complete recessivity, codominance or dominance of the mutations under selection.

### GWESS with SNP data in M. truncatula and human

GWESS was performed in M. truncatula using a raw dataset of 22,079,533 SNP markers identified on the eight chromosomes of the species by the Medicago HapMap Project on a collection of 262 accessions (see http://www.medicagohapmap.org/downloads/mt40). The collection has already been screened for GWAS for different traits (Bonhomme et al. 2014; 2019; Burgarella et al. 2016; Kang et al. 2015; Le Signor et al. 2017; Rey et al. 2017; Stanton-Geddes et al. 2013; Yoder et al. 2014) but also for GWSS (Bonhomme et al. 2015; Branca et al. 2011; Paape et al. 2013). This highly self-mating species (95% selfing rate), originating from the Mediterranean basin, is structured in two major subpopulations, the Far-West population (FW) concentrated on the West part under Atlantic influence, and the Circum population (C) that spreads over the rest of the Mediterranean basin (Bonhomme et al. 2014; Burgarella et al. 2016; De Mita et al. 2011; Ronfort et al. 2006). Samples from the FW subpopulation and from the C subpopulation consist of 80 and 182 accessions, respectively. We used a bait approach for GWESS, in which TcorPC1 and $$T_{corPC1_v}$$ were calculated for a given candidate gene, here MtSUNN, with the Medtr4g070970 gene identifier in the genome version 4.0—http://www.medicagogenome.org/—(Tang et al. 2014) or MtrunA17Chr4g0035451 in the v5 version—https://medicago.toulouse.inra.fr/MtrunA17r5.0-ANR/—(Pecrix et al. 2018), against each of the remaining 48,339 genes of the genome. The TcorPC1 and $$T_{corPC1_v}$$ statistics were calculated based on PC1 values from SNP data located in 10 kbp windows spanning each M. truncatula gene, and PC1 values from MtSUNN. P values were then obtained from the τ(n−2) null distribution. For the PC1 calculation on each gene, imputed SNP data were required. Gene-based imputation was performed using the Tassel software (Bradbury et al. 2007), where each missing base was imputed with the accession that shares the longest haplotype surrounding the base, on a window of 30 SNPs maximum (Bonhomme et al. 2014). For the calculation of $$T_{corPC1_v}$$, the kinship matrix V of the 262 individuals was estimated based on 7,252,792 SNPs with a 5% Minor Allele Frequency (MAF).

In human, GWESS was performed by using the dataset of 644,257 SNPs (431,951 SNPs with a 5% MAF) with no missing data from the HGDP-CEPH Human Genome Diversity Panel on a world-wide sample (America, Asia, Europe, Middle East, North Africa, Sub-Saharan Africa and Oceania) of 940 individuals belonging to 57 populations from 23 countries (Cann et al. 2002; Li et al. 2008). The genome version (i.e., the gene positions) B36 was used for this analysis in order to fit with SNP positions in the HGDP-CEPH dataset, similarly to (Daub et al. 2013). We used a bait approach in which Tr and $$T_{r_v}$$ were calculated for SNPs located in or near SLC24A5 and EDAR genes (chromosome 15 and 2, respectively), against each of the remaining SNPs of the genome.

### Functional genetic validation of the relationship between MtCLE02 and MtSUNN in M. truncatula

The M. truncatula sunn mutant, the MtCLE02 cloning strategy for overexpression, the root transformation protocol, and the conditions for phenotyping the nodulation are described in the Supplementary Method File 2.

## Results

### Quality control of simulations

To ensure that simulations produced consistent between-population structure and within-population inbreeding levels, the FST and FIS parameters were calculated (Supplementary Fig. S1). At the outcome of the simulations (i.e., generation 300), the average FIS in self-mating and panmictic populations was equal to 0.92 and 0.07, respectively, while the average FST was equal to 0.19 and 0.10, respectively. In order to quantify the fixation of coselected alleles, we tracked down the evolution of the frequency of the derived alleles a and b at the two SNPs A and B intended to be targeted by selection, located on two different chromosomes. Selection efficiency was measured by the co-fixation rate of a and b at each generation in each selection model (Supplementary Fig. S2). The first observation was that the COAD epistatic model generally induced a higher speed of co-fixation than the ADD positive selection model, while, as expected, the COMP model tended to maintain a higher polymorphism due to selection of both AB and ab combinations at the two selected loci. The second observation was that co-fixation rates of the derived alleles in self-mating populations reached more rapidly an equilibrium value than in panmictic populations, but more importantly that in self-mating populations the dominance level of the selected mutations had few effect on the co-fixation dynamics because of the very low heterozygosity level (FIS = 0.92 at the onset of selection). A similar result was observed in a simulation study of selective sweeps in self-mating populations (Hartfield and Bataillon 2020). On the other hand, the dominance level in panmictic populations strongly impacted co-fixation dynamics due to more complex fitness patterns in the presence of heterozygotes (Supplementary Table S1). However, despite starting from standing variation at SNPs under selection, values of the co-fixation rates were moderate, and this result must be interpreted in light of the small size of the simulated population (N = 250 in each population), as selection efficiency increases with population size according to a factor Ns (Glémin 2007).

### Two-locus LD under epistatic selection models

We focused hereafter on the evolution of the two-locus average LD across simulations, in self-mating and panmictic populations under selection models with codominance. Codominance of the selected mutations indeed produced the highest co-fixation rates in panmictic populations, while the dominance level had no effect in self-mating populations (Supplementary Fig. S2). Dominance or recessivity of the selected mutations mainly impacted negatively the two-locus average LD under the COMP epistatic model in panmictic populations (Supplementary Fig. S3). In the codominant mutations model, we first observed that under the neutral model, the population structure with or without nonrandom mating generated LD between two independent loci, as measured using r or corPC1, that could reach 0.25–0.5 at the final generation (Fig. 1). This background LD was lowered to zero or close to zero on average, when correcting these statistics by the V matrix, as measured using rv or corPC1v.

Second, we observed that selection models tended to generate more LD than the neutral model, as measured using rv or corPC1v. For instance, in self-mating species, the COAD, COMP, and ADD selection models all tended to generate more LD than the neutral model, with COAD and COMP generating more LD than the ADD model (Fig. 1). In panmictic populations, however, only the COMP model generated a consistent LD, compared with the COAD and ADD models which generated low LD. Nevertheless, despite correcting for population structure, it remained difficult to distinguish epistatic selection from additive selection in self-mating populations at the haplotype level, as different haplotypes were under selection in each subpopulation. This artifact was less prominent when focusing on the SNPs targeted by selection.

Finally, SNP-based LD measures (r/rv) were more efficient than haplotype-based LD measures (corPC1/corPC1v) to detect epistatic selection (Fig. 1). However, these measures could not capture any signal once allele fixation at one SNP or co-fixation at the two SNPs occurred. On the other hand, corPC1/corPC1v relied on SNP polymorphisms in the genomic region surrounding SNPs under selection, so that they could benefit from the hitch-hiking effect even after allele fixation at the selected SNPs.

### False positive control and power of two-locus correlation statistics

On the assumption that values of the correlation coefficient r follow a Student distribution τ(n−2) under the null hypothesis of independence between the two variables tested, we examined the fit of the statistics Tr, TcorPC1, $$T_{r_v}$$, and $$T_{corPC1_v}$$ to such a distribution. False positive (FP) proportions of these statistics are given for different rejection quantiles of the Student distribution τ(n−2) in Table 2. Two time-points were considered in neutral simulations, at generation 140 in the midst of the time course, and at the last generation 300. At generation 140, in both the self-mating and random-mating models, Tr and TcorPC1 showed excessively large FP proportions. For instance, FP proportions ranging from 55 to 81% were observed for a 1% type I error, while $$T_{r_v}$$ and $$T_{corPC1_v}$$ showed adequate, conservative FP proportions ranging from 0.1 to 3% for the same 1% type I error (Table 2). At generation 300, a similar behavior was observed, with FP proportions ranging from 74 to 91% for a 1% type I error, while $$T_{r_v}$$ and $$T_{corPC1_v}$$ showed adequate—though less conservative in the case of the self-mating model—FP proportions ranging from 1.1 to 22% for the same 1% type I error. This indicates that corrections for population structure and heterogeneous relatedness are necessary in order to perform statistical tests of the neutral hypothesis for a null correlation between two independent loci (two SNPs or two genomic regions), accounting for “noisy” neutral processes.

A power analysis of the Tr, TcorPC1, $$T_{r_v}$$, and $$T_{corPC1_v}$$ statistics was then performed by using simulated data (i) under the null hypothesis of neutrality and independence between loci and (ii) under each of the selection models and independence between loci. At both time-points (generations 140 and 300), a general trend was that the detection power with r/rv and corPC1/corPC1v was higher for the COMP model than for the COAD or the ADD models (i.e., 25–50%, 10–65%, and 10–30%, respectively, for α = 5% with rv or corPC1v statistics), especially when considering random mating (Fig. 2). In addition, the correction of LD-based measures by the kinship matrix (rv/corPC1v) did not increase the detection power of epistatic selection; rather, it tended to reduce power, especially in the COMP model, but not in the COAD model. This is due to the fact that the fixation of the AB allelic combination was more frequent in subpopulations than in the whole population in the COMP model (see Supplementary Fig. S2)—a consequence of unequal initial frequencies of the ancestral/derived alleles in the simulations—, leading to high LD values when population structure was not taken into account. Finally, corPC1v tended to show less power than rv because of haplotype heterogeneity, namely when the same selected allele was associated to different haplotypes within a subpopulation (Fig. 2a–d).

### Detection of two-locus coadaptation in the M. truncatula plant

To illustrate the statistical testing of the correlation coefficient between two loci using TcorPCI or $$T_{corPC1_v}$$, a one-dimension GWESS was performed using a bait approach with the MtSUNN gene, which is a key regulator of nodulation in legumes, against the 48,339 other genes of the M. truncatula genome. Two scans were implemented, including SNP data from either the whole M. truncatula collection—n = 262 individuals—or from the FW subpopulation—n = 80 individuals—(Fig. 3a–d; respectively). A clear inflation towards small p values could be observed for scans based on TcorPCI (Fig. 3a, b) compared with scans based on $$T_{corPC1_v}$$ (Fig. 3c, d), and this inflation was higher with data from the whole collection that showed a higher degree of population structure. In the FW subpopulation scan, a sharp peak was observed using $$T_{corPC1_v}$$ on the chromosome 6 corresponding to the MtCLE02 (Medtr6g009390) gene on top of the peak (Fig. 3d, p value = 1.7 × 10−8). MtCLE02 corresponded to the top candidate gene showing an epistatic selection signal outside of the chromosome 4 where MtSUNN was located. Whereas MtCLE02 was also highly correlated with MtSUNN when considering TcorPCI (Fig. 3b, p value = 2.74 × 10−13), several other genomic regions displayed similar or even more significant signals, which may indicate spurious genome-wide correlations. Interestingly, scans based on SNP data from the whole M. truncatula population did not reveal such strong signal in the genomic region containing MtCLE02 (Fig. 3a, c; p value = 0.077 and 0.006 for TcorPCI and $$T_{corPC1_v}$$, respectively), indicating that in this specific case, epistatic selection may have occured at the subpopulation level.

### Experimental evidence for the genetic relationship between MtSUNN and MtCLE02 in M. truncatula

The MtSUNN gene encodes a leucine-rich repeats-receptor like kinase, whereas the MtCLE02 genes encodes a CLAVATA-like secreted signaling peptide. The SUNN receptor function, which is crucial in the systemic negative regulation of nodulation, was previously associated to other CLE secreted signaling peptide encoding genes, MtCLE12 and MtCLE13 (Mortier et al. 2012; 2010), but not with MtCLE02. Whereas their expression was induced by the inoculation with symbiotic rhizobia bacteria initiating nodulation, this was not the case for MtCLE02 (Supplementary Fig. S4A, B). These previously documented CLE/SUNN relationships pointed us to test for a putative functional interaction between CLE02 signaling peptides and the SUNN receptor. As previously performed for MtCLE12 or MtCLE13 (Mortier et al. 2012; 2010), we used a genetic approach consisting in overexpressing comparatively the MtCLE02 gene in M. truncatula WT and sunn mutant roots (Fig. 4). First, a quantification of the nodule number in WT versus sunn mutant roots highlighted the well-known supernodulation phenotype of the sunn mutant (Mann and Whitney–Wilcoxon test, p value = 2 × 10−8). Second, the nodule number was significantly decreased when MtCLE02 was overexpressed in WT roots, as validated by real time RT-PCR (Supplementary Fig. S4C), indicating a negative role of MtCLE02 on nodulation (Fig. 4, Mann and Whitney–Wilcoxon test, p value = 2 × 10−6). Third, MtCLE02 overexpression in sunn mutant roots did not impact the nodule number (Mann and Whitney–Wilcoxon test, p value = 0.66), in contrast to what was observed in the WT, indicating that the negative role of CLE02 on nodulation relies on the SUNN receptor.

### Detection of two-locus coadaptation in human populations

In human, two GWESS were performed on the world-wide sample of 940 individuals with SNP data from SLC24A5 and EDAR genes, two major drivers of the external appearance which have been subjected to strong positive selection in human populations according to different studies (Basu Mallick et al. 2013; Beleza et al. 2013; Bryk et al. 2008; Sabeti et al. 2007; Speidel et al. 2019). Two bait SNPs, 15_46172199 (rs2250072) and 2_108973688 (rs6749207) located in SLC24A5 and EDAR genes, respectively, were chosen for the GWESS. For each bait SNP, SNP-based statistics Tr and $$T_{r_v}$$ were calculated for 431,950 genome-wide SNPs. Scans implemented in the world-wide population with SNPs 15_46172199 and 2_108973688, each as bait, were inflated towards small p values when using the Tr statistic (Fig. 5a, b), compared with scans implemented with the $$T_{r_v}$$ statistic (Fig. 5c, d). Using the SNP 15_46172199 as bait for SLC24A5 gene, a peak corresponding to the EDAR gene was detected, with the SNP 2_108946170 as the top significant SNP (Fig. 5a, c; $$T_{r_v}$$-based p value = 2.29 × 10−9). Conversely, when the scan was performed with the SNP 2_108973688 as bait from the EDAR gene, a peak corresponding to the SLC24A5 gene was detected, with the SNP 15_46179457 as the top significant SNP (Fig. 5b, d; $$T_{r_v}$$-based p value = 1.2 × 10−12). Genome-wide LD distributions between each bait SNP and all other SNPs (Fig. 5c, d; top left of each panel) showed very high LD values between SNPs from EDAR and SLC24A5, thus indicating extremely significant signals (EDAR SNPs ranked among the top 0.01–0.18% SNPs in LD with SLC24A5; and SLC24A5 SNPs ranked among the top 0.006–0.03% SNPs in LD with EDAR). The world-wide geographic distribution of genotypes at SNPs 15_46172199—SLC24A5—and 2_108973688—EDAR—(Fig. 6c) correlated substantially with the global human population structure, as depicted by a phylogenetic tree based on the kinship matrix among individuals (Fig. 6a, b). Indeed, the derived allele at SNP 15_46172199, associated with the light skin allele at the SLC24A5 gene, was present in Europe, North Africa, Middle East and Central South Asia; and the derived allele at SNP 2_108973688, associated with the thick hair allele at the EDAR gene, was present in East-Asia, America, and Oceania (Fig. 6c). The strong LD signature observed in the world-wide samples between SLC24A5 and EDAR, as measured with Tr, therefore reflected the selection of derived alleles in different geographic regions, and thus a correlation with the global population structure. However, the LD measured with $$T_{r_v}$$ was still highly significant between SLC24A5 and EDAR, indicating that epistatic selection may have occurred between both genes at the level of geographic sub-regions.

In order to localize the geographic origin of such selection signature, GWESSs were performed within six geographic regions of the world-wide sample: Central South Asia, East Asia, Sub-Saharan Africa, Middle East, Europe, and America (Supplementary Figs. S5S10). Only the GWESS performed in Central South Asia indicated a significant LD between SNPs at the SLC24A5 and EDAR genes (Supplementary Fig. S5C, $$T_{r_v}$$-based p value = 6.7 × 10−6 at SNP 2_108973688; Supplementary Fig. S5D, $$T_{r_v}$$-based p value = 2.8 × 10−6 at SNP 15_46174380).

Human population samples from the HGDP-CEPH dataset in Central South Asia are composed of eight different ethnic groups from Pakistan. To search for local signals, LD tests were performed with $$T_{r_v}$$ between two candidate SNPs within SLC24A5 (15_46179457 (rs1834640) and 15_46172199), and three candidates SNPs within EDAR (2_108962124 (rs260607), 2_108982808 (rs17034770), and 2_108973688), for the 50 ethnic groups or populations distributed within eight geographic regions and showing polymorphism at all five SNPs. Average and standard deviation of −log10(p value) across six pairwise SNP comparisons (Fig. 6d) strongly supported a high LD between SLC24A5 and EDAR in the Burusho ethnic group from Pakistan (3.2 and 0.36, respectively), as also highlighted by genotypes of Burusho individuals (Fig. 6c, rv = 0.63 for genotypes between the two dotted lines). This pattern of high LD between SLC24A5 and EDAR in Burusho did not seem to be generated by any population substructure in this ethnic group, since LD tests performed with Tr resulted in a similar average (3.18) and standard deviation (0.36) of -log10(p value) (Supplementary Fig. S11).

## Discussion

We introduced a statistical method which can detect the signature of epistatic selection using LD between two loci. SNP-based ($$T_{r_v}$$) and window-based ($$T_{corPC1_v}$$) statistics, which take into account the underlying population structure and relatedness among individuals, are shown to fit a Student distribution τ(n−2), allowing to easily and quickly test for significance of correlation coefficients in the frame of GWESS using either a candidate SNP, a gene, or a short genomic region as bait. Simulations have shown that $$T_{r_v}$$ and $$T_{corPC1_v}$$ showed equivalent or less power than Tr or TcorPC1 to detect epistatic selection occurring simultaneously in all subpopulations, ranging from 10 to 65% (assuming a 5% type I error) depending on the epistatic selection model and mating process. Thus, selection signals in local populations could be more difficult to detect with $$T_{r_v}$$ and $$T_{corPC1_v}$$ because in these cases, selection can be correlated with population structure. In addition, the impact of the kinship matrix on LD correction changes depending on the scale of the sampling, with a stronger impact for large scales of geographic population structure than for smaller, less structured, and less heterogeneous geographic scales. These features suggest that GWESS should be performed not only on a global sample comprising individuals from different populations, but also on samples from different subpopulations in order to search for more population-specific patterns of epistatic selection. Despite the power was not increased when using $$T_{r_v}$$ and $$T_{corPC1_v}$$, simulations as well as analyses performed both in M. truncatula and human genomes strongly supported their use, in order to efficiently control for FP. Interestingly, although SNP-based statistics (Tr,$$T_{r_v}$$) may tend to show an increased power than window-based statistics (TcorPC1, $$T_{corPC1_v}$$) because they are not sensitive to haplotype heterogeneity and because mutations under epistatic selection at both loci can be directly tested by SNP-based statistics, window-based statistics also show several advantages. First, window-based statistics are faster to implement at the genome scale, notably in two-dimensional GWESS. Second, the window size can be fixed at a value that fits best the average LD decay in the species studied, even though a standard 10 kbp size can be used by default. Finally, window-based statistics allow detecting coevolving genes even after the putative fixation of coselected SNPs, because surrounding SNPs within genes or windows also carry a selection signal by hitchhiking.

The applications of our method to SNP data from human populations or from populations of the model plant M. truncatula allowed in both cases to identify a couple of genes most probably under epistatic selection, or at least under coselection. In M. truncatula, additional experiments revealed a genetic interaction likely shaped by epistatic selection between the MtSUNN and MtCLE02 genes. The polymorphism at the MtSUNN gene could be driven by balancing selection at the local level because the H statistic (Fay and Wu, 2000) is 1.45 in the FW population and ranks among the highest 8.41% in the entire genome. On the other hand, H = 0.52 based on the whole species, ranking among the highest 39.5%, according to a set of 47,875 genes. The polymorphism at the MtCLE02 gene seems more affected by an ongoing soft sweep that can be detected at the level of the whole species (H = −2.25, ranking among the lowest 8.98% of the genome). Still, both genes maintain polymorphisms, and epistatic selection could drive this pattern. As a proof of concept, a genetic approach was used to demonstrate the functional interaction between the CLE02 signaling peptide and the SUNN receptor in the context of symbiotic root nodulation. Indeed, the CLE02 signaling peptide negatively affected the number of nodule organs on the plant root system depending on the SUNN receptor, as previously shown for other CLE peptide encoding genes, MtCLE12 and MtCLE13 (Gautrat et al. 2019; Mortier et al. 2012; 2010). Interestingly, it should be noted that the MtCLE02 gene is, in contrast to MtCLE12 and MtCLE13, not regulated by symbiotic nodulation conditions and not phylogenetically closely related to previously characterized CLE peptide encoding genes shown to have a related negative impact on nodule number (Hastwell et al. 2017). The identification of a novel CLE peptide acting in this genetic pathway highlights the discovery power of our method to functionally associate gene pairs independently of their expression pattern or of a coregulation pattern criterion.

In human, the SLC24A5 gene, a major driver of variation in skin pigmentation, has been shown to be under positive selection in the European population (Deng and Xu 2018; Izagirre et al. 2006; Sabeti et al. 2007). The causal mutation for the light skin phenotype was not present in HGDP data (SNP rs1426654, position: 15_46213776), but SNPs used (i.e., 15_46179457 and 15_46172199) were located on the same SLC24A5 haplotype background that was previously characterized (Basu Mallick et al. 2013; Beleza et al. 2013; Crawford et al. 2017). In addition, the V370A mutation in the EDAR gene coding for a receptor related to TNFα receptors and involved in driving hair structure, as well as teeth and sweat glands development, was shown to be under positive selection in East Asia and in native Americans, and to increase hair thickness (Bryk et al. 2008; Sadier et al. 2014; Speidel et al. 2019). However, just as for SLC24A5, the causal mutation in the EDAR gene was not present in HGDP data, but SNPs used (i.e., 2_108962124, 2_108973688, and 2_108982808) were located within the genomic sequence of EDAR. The geographic distribution of genotypes at SNPs 15_46172199 and 2_108973688 strongly correlated with the world-wide human population structure, which explained the high LD observed at this level in the scans implemented with Tr. However, scans implemented with $$T_{r_v}$$ indicated a persistence of top SNPs in LD between SLC24A5 and EDAR in the world-wide sample, which might be due to epistatic selection or coselection. We identified such selection signature in Central South Asia, with the Burusho ethnic group from Pakistan as being one possible geographic origin. The GWESS performed in Central South Asia, and subsequent LD tests performed between SLC24A5 and EDAR in subpopulations with Tr or $$T_{r_v}$$ statistics showed similar results, indicating a weak effect of population structure, as previously observed within this geographic region and in India (Rosenberg et al. 2006). The Burusho showed a predominant association between the derived alleles at SLC24A5 and the ancestral alleles at EDAR, which is indicative of the persistence of a typical European light skin and thin hair structure in this ethnic group. To date, no functional link is yet demonstrated between the critical skin pigmentation-related SLC24A5 gene and the ectodysplasin pathway to which EDAR belongs. While the long-distance LD between EDAR and SLC24A5 largely reflects strong positive selection acting independently on these two genes in different geographic regions, our results also suggest that coselection of these genes may have locally contributed to the phenotypic make-up of human populations.

Our method was fast to implement on a computer equipped with an Intel Xeon E5-2640 v4 processor (10 Cores and 2.4–3.4 GHz performance, supplied by 256 Go memory), as only few minutes were needed to estimate the kinship matrix, depending on the sample size (e.g., 30 s and ~3 min for Medicago and human, respectively), and to implement one GWESS (e.g., 2 and 12 min for 48,339 Medicago genes with $$T_{corPC1_v}$$ and 431,950 human SNPs with $$T_{r_v}$$, respectively). Although functional analyses tools for genetic interaction are still limited to a few model species, GWESS can easily be performed on non-model organisms, as long as sufficiently dense SNP data are available, to identify candidate genes under epistatic selection that may be relevant to study in related model species for instance. If GWESS is not directly applicable, possible relationships between candidate genes could be directly tested using dedicated SNP markers, but one should be cautious about the lack of benchmark with other genes. A natural extension of the one-dimension use of $$T_{r_v}$$- or $$T_{corPC1_v}$$-based tests is the implementation of two-dimensional GWESS in which the correlation of each polymorphic locus (SNP, gene or genomic region) in the genome would be tested against all remaining polymorphic loci, thanks to parallel computations on bioinformatics platforms. We anticipate that such an approach will open the way towards exploring evolutionary coadapted gene networks.