Admixture has obscured signals of historical hard sweeps in humans

The role of natural selection in shaping biological diversity is an area of intense interest in modern biology. To date, studies of positive selection have primarily relied on genomic datasets from contemporary populations, which are susceptible to confounding factors associated with complex and often unknown aspects of population history. In particular, admixture between diverged populations can distort or hide prior selection events in modern genomes, though this process is not explicitly accounted for in most selection studies despite its apparent ubiquity in humans and other species. Through analyses of ancient and modern human genomes, we show that previously reported Holocene-era admixture has masked more than 50 historic hard sweeps in modern European genomes. Our results imply that this canonical mode of selection has probably been underappreciated in the evolutionary history of humans and suggest that our current understanding of the tempo and mode of selection in natural populations may be inaccurate.

set of samples). We then applied our analytical pipeline to each artificial genomic dataset: that is, we arranged the ~19k annotated genes used in the empirical analyses according to their linear position within the simulated genomic dataset and generated individual gene scores (i.e. a standardised CLR score corrected for gene length; see Methods). Next, outlier genes were characterized as those with q < 0.10 and merged into sweeps by combining outlier genes less than 1Mb apart, with the final set of candidate sweeps being those that had at least one outlier gene with q < 0.01. This process was repeated 30  . The largest mean FDR was ~11% across the six simulated populations using the same q value threshold as our empirical analyses (i.e. q < 0.01), which is taken as a conservative estimate of the study-wide FDR (Fig. 2B).
To test the power of our approach to detect selective sweeps in European history, we simulated positive selection under three different selection coefficients (s = 1%, 2%, and 10%) that covered the estimated range for our candidate sweeps, and three distinct onset times (55, 44, 36 ka; i.e. ranging from the early phases of the putative movement of AMH into Eurasia to shortly after the estimated split between the Main Eurasian lineage and WHG populations ~38 ka 10 ( Supplementary Fig. 19). Two selection scenarios were modelled -both had a uniform selection pressure acting on all descendant population branches, with the first model assuming that For each simulated selection scenario, 200 selected gene scores were generated by assigning the maximum score from each simulated selected 5Mb segment to a randomly chosen gene. These scores were then corrected for gene length using the distribution of 'neutral' gene scores taken from a genomic scaffold generated following the same procedures used in the FDR estimation (i.e. concatenating 565 simulated 5Mb segments generated under a neutral demography). This process was repeated 50 times, using a different randomly generated genomic scaffold to serve as the neutral genetic background in each case. Finally, we determined the upper 1% of gene scores Our power analyses revealed that sweep detection power was drastically reduced in the simulated WHG population relative to other Eurasian populations -e.g. being 20% lower on average in the WHG population compared to the two 'large' ancient farming populations ( Fig. 5 and Supplementary Fig. 20). Demographic parameters inferred by Kamm et al. used in our simulations suggest that the WHG population experienced a prolonged reduction in effective population size (N e ) compared to populations on the Main Eurasian branch ( Fig. 2A and Supplementary Fig. 19). The subsequent inflation in genetic drift caused by the lower N e in WHG lineages appears to have led to more rapid erasure of fixed sweep signals and a subsequent reduction of detection power relative to Main Eurasian populations.

Testing the impact of recombination rate on sweep detection
While SweepFinder2 is fairly robust to recombination rate variation 11 , we performed separate msms coalescent simulations using the demography shown in Supplementary Fig. 19 -but without any admixture or selection events and omitting the CHG/ANE and EHG branchesunder three different recombination rates: i) an 'average' rate of 1.45 cM/Mb (estimated by averaging across 1,000 randomly placed 1Mb windows using data from ref. 12 and also ii) 2-fold and iii) 10-fold lower rates (0.72 cM/Mb and 0.145 cM/Mb, respectively). The latter two rates were included to specifically test if the p values associated with gene scores were systematically elevated in genomic regions with relatively low recombination rates, which could lead to an inflation of false positives at these regions. We performed 1000 neutral simulations for each recombination rate using the following msms command: msms -ms 19 1 -oFP 0.00000000 -t 3000 -r $scaledRecRate -I 3 2 17 0 0 -n 1 0.95 -n 2 0.66 -n 3 0.11 -ej 0.018 3 2 -en 0.024 2 0.13 -ej 0.045 1 2 -en 0.045001 2 1.6 -en 0.330001 2 1 For each simulation, we generated multiple 5Mb sequences for four separate populations -Anatolia EF, WHG, UK LNBA, and Western Europe LNBA -replicating the missingness and sample sizes of each population following the same approach used for all other simulations (see details in Methods). We then generated 10 genomes for each population and used these to calculate the false positive rate following the same steps outlined in Supplementary Methods: Analytical pipeline FDR and power estimation. The resulting distribution of p values for simulated genes show a reasonable match to the expectations of the standard Gaussian distribution for all three recombination rates, with no systematic inflation of low p values for lower recombination rates across the four tested populations ( Supplementary Fig. 7). These results indicate that genomic regions with low recombination were not more prone to false positives in our study.

Estimating sweep detection power relative to selection start time and beneficial allele frequency at time of sampling
To determine how SF2 detection power varied with onset time of selection and the frequency of the sweep haplotype at the time of sampling, we used the coalescent simulator msms 8 to simulate 10,000 selected allele frequency trajectories along the Main Eurasian branch of our demographic model ( Fig. 2A and Supplementary Fig. 19), ignoring admixture events. Beneficial mutations arose at random within the past 200,000 years -initially along the branch that predates the separation of Eurasian and African populations, and subsequently along the Main Eurasian branch -with the population mutation rate assumed to be proportional to the current effective population size ( Fig. 2A and Supplementary Fig. 19). For each selected locus, we also simulated linked neutral genetic variants across a 5Mb region for the hypothetical Early European Farmer population from 8 ka (i.e. LBK branch in ref. 10 ) and ran these sequences through SweepFinder2.
To ensure that the simulated data matched the statistical properties of the ancient data used in the present study, we replicated the empirical SNP ascertainment scheme by conditioning on SNP presence in a simulated African population and also generated the same number of simulated 5Mb sequences as observed for the Neolithic Anatolian population in the present study (see Methods for more details on the ascertainment and missing data approaches in our simulations).
Similarly, selection coefficients were sampled from an exponential distribution with a mean selection coefficient of s = 1%, which leads to selection coefficients that broadly match those estimated from our sweeps (s ranging between ~1 to~10%; see Methods and Supplementary Table 2). For the subsequent analysis, we removed all simulations with s ≤ 1%.
To measure detection power based on the starting time of the sweep, we computed the percentage of simulations that had a SweepFinder2 CLR value larger than the 0.1% empirical false positive rate (FPR) threshold, conditioned on trajectories where the beneficial allele had reached a frequency ≥ 80%. The FPR was calculated as a function of selection starting time by using the default general additive model (GAM) in the R package ggplot2 13 smoothing function geom_smooth to calculate the best fit of a binary response (i.e. sweep CLR signals either significant or not) relative to selection start times for each range of selection coefficients ( Supplementary Fig. 15A).
To determine the relationship between detection power and the beneficial allele frequency when simulated 5Mb sequences were sampled at 8 ka, we considered all final allele frequencies but only focused on sweeps starting within the past 20 ka with s > 1% ( Supplementary Fig. 15B).
Here, we were primarily interested in the detection power of SweepFinder2 for incomplete sweeps that had insufficient time to reach fixation. Power was again computed using the geom_smooth GAM function, in this case computing the optimal fit of a binary response variable (i.e. significant CLR statistic based on a 0.1% FPR) relative to the final frequency reached by the beneficial allele at the time of sampling.

Testing the accuracy of SF2-based estimates of the selection coefficient
We used msms to simulate selection along the Main Eurasian branch of the demographic model

Impact of sample sizes on sweep detection in modern populations
Modern populations tended to have systematically fewer sweeps than ancient populations in this study, a result that did not change when altering the q value threshold used to determine outlier genes that define the sweeps (Supplementary Figs. 8 and 11). To test if smaller sample sizes were more likely to produce more false positives (for example, by inflating the number of sweep haplotypes in a partial sweep by chance), we resampled one modern (FIN) and one ancient population with large sample numbers (Central Europe LNBA) to lower sample sizes (25, 10 and 5) and reran the complete detection pipeline to see if there was a systematic increase in the number of detected sweeps. We used an ancient population in addition to a modern population to take into account the potential impact of missing data and pseudo-haploidy on sweep detection.
Notably, reduction in sample size did occasionally result in an increase in the number of detected sweeps for both ancient and modern populations; however, this effect was not systematic and in most cases the number of sweeps decreased in number -particularly at n eff < 10 ( Supplementary   Fig. 12). Further, the number of sweeps detected in the modern population was always lower than that detected for the ancient population at comparable effective sample sizes. Our results indicate that while there is some stochasticity in the number of reported sweeps, the overall pattern of decreased sweeps in modern populations is unlikely to be an artefact of their larger sample sizes.

Impact of population sample composition on sweep detection power
To estimate the impact of our population assignment on the robustness of our candidate sweep detection, we reran the sweep detection pipeline on alternate sample groupings for both the WHG and Steppe populations and compared these to the original sample groupings. The original sample assignments for each population were governed by minimizing the temporal and spatial variability within each population while maintaining a homogeneous archaeological context. The alternate population groupings preserve the archaeological context but use a coarser geographical and temporal context, and/or include samples with additional ancestry components not present in the original sample groupings (Supplementary Table 1). Specifically, for the WHG populationwhich initially comprised 44 samples largely derived from the Balkan region -we added 12 samples mostly sourced from the Gravettian culture from western Europe (sourced from France, Germany, and Luxembourg) that were ~1 ka older on average (mean sample ages: 10 ka vs 9 ka for original samples). The 75 samples from the Steppe population were supplemented by 15 samples of similar antiquity and provenance (Maykop Culture samples; mean age ~4.5 ka for both groupings), but which draw ~4% of their ancestry from Siberian hunter-gatherers (i.e. Eastern Hunter-Gatherers with Siberian genetic affinity) that is absent in the samples in the original grouping 14,15 .
The two sample groupings had significantly correlated gene scores for each population (Pearson's r = 0.82 and 0.64 for WHG and Steppe, respectively; Supplementary Fig. 13).
Notably, whilst changing the sample composition of the WHG and Steppe populations resulted in some different sweeps being detected for each (Supplementary Figs. 13 and 14), 19 of the 27 sweeps (4/7 and 15/20 for WHG-and Steppe-specific sweeps, respectively) were observed regardless of which samples were used. Importantly, amongst these 19 retained sweeps, seven were observed in a population other than the alternate WHG or Steppe population (e.g. sweep 4:71.5-72.4 is significant in only one of the two Steppe sample groupings, however it is also significant in Anatolia_EF and CentralEurope LNBA; Supplementary Fig. 14). Overall, 22 of the 57 candidate sweeps appear in two or more ancient Eurasian populations, which increases to 49 out of 57 candidate sweeps when sweeps are defined by containing at least one gene with a minimum q < 0.05 (instead of the q < 0.01). Our results imply that the shared genetic history and large number of ancient populations used in this study improve the chance that a sweep will be observed and thereby impart a degree of robustness to the sweep detection process.

Testing if sweeps are artefacts of low genome complexity
To test for potential false-positive sweep signals arising in genomic regions of low sequence complexity, we partitioned the genome into areas of high and low complexity based on CRG100 mappability scores 16 downloaded from the UCSC Genome Browser 17 and checked if the proportion of mappable SNPs was systematically reduced in any of the 57 candidate sweep regions. Mappability is a metric that indicates how readily reads can be aligned to a particular genomic region, which has a positive association with sequence complexity. For each of the 57 candidate sweeps, we calculated the proportion of ~1240k SNPs found within each sweep that also lie within high complexity regions (i.e. falling in windows with a mean mappability value > 0.9). Notably, the SNP probes were designed to target genome regions with high mappability 18 , and our results confirm this with at least 96.7% of the SNPs in each sweep lying in high mappability regions (Supplementary Table 3), indicating that our candidate sweeps were unlikely to include artefacts caused by low genome complexity.

Sweep haplotype detection
For all 57 sweeps, we qualitatively checked the occurrence of the sweep haplotypes in five Upper Paleolithic Eurasian human samples by comparing their haplotype plots with the dominant haplotype within the populations with significant sweep signals (haploimages provided as Supplementary Data 58-114 19 ; https://doi.org/10.25909/631595df4f1b2). To test the robustness of our qualitative method, we used the following quantitative approach to infer the most likely selected haplotype. To reconstruct the sweep haplotype, we reasoned that the alleles carried on the selected haplotype would be at high frequencies in the ancient populations where the sweep was significant. Accordingly, for each sweep, we identified a subset of alleles that are strongly associated with the sweep, using the following two statistics: Here, p is the sample allele frequency, and q = 1-p, for a specific SNP in population i. For each sweep, equation (1) was evaluated for every SNP lying in the sweep region to identify a subset of SNPs that act as reliable markers for the sweep. Equation (1) provides a weighted score for each SNP, which is the product of the log-transformed SweepFinder2 CLR score (i.e. the CLR score overlapping the SNP in question) and the standardised allele frequency, which is scaled to fall between 0 (when both alleles are at 50%) and 1 (when either allele is fixed). This value is calculated for each of the 18 ancient populations used in the study (indexed by i) and then summed together. Accordingly, equation (1) returns large weighted scores for SNPs where one allele is consistently near fixation in populations where the sweep also exhibits evidence for selection.
Once a subset of marker SNPs were identified for each sweep, equation (2) was used to evaluate the allele that is most strongly associated with the selected haplotype. Following the logic of equation (1) haplotype presence based on either 90% or 95% of the top 50 or 100 marker SNPs).
Additionally, we recalculated r after removing 12 sweeps that had complex SF2 CLR signals that may have concatenated two or more sweep regions (Supplementary Table 2  The classical definition of a hard sweep is based on the fixation of de novo beneficial mutations and linked neutral variants 33 . However, hard sweep signals can still be created when the beneficial variant is rare, or the beneficial haplotype is near to fixation, so long as the underlying genealogy is sufficiently star-like 34 . In the present study, a hard sweep signal is created by a specific distortion of the site-frequency spectrum emblematic of a star-like genealogy (i.e. the elevation of rare and nearly fixed variants), which also encompasses selection from rare standing variation and partial sweeps near to fixation.

Supplementary Text 3: HLA locus sweep in Anatolian Neolithic samples
The most striking SweepFinder2 signal in our data is a ~1. To estimate nucleotide diversity from our data, we calculated the expected heterozygosity for each SNP across the combined Anatolian Neolithic samples. While our data is limited to a set of 1.1 million SNPs and thus does not allow estimation of the true nucleotide diversity, our estimates are still useful for investigating local reductions in diversity resulting from a selective sweep. Accordingly, we calculated the average diversity in non-overlapping 20kb windows that span the sweep and flanking regions (from 23.5Mb to 35Mb on chromosome 6). Because our raw diversity measure has no obvious interpretation, we rescaled our estimates by dividing them by the 90% diversity quantile across this region.
We observe depressed levels of nucleotide diversity in a 1.5 Mb region centred directly beneath the SweepFinder2 signal peak, with diversity increasing sharply in the regions flanking each side of the sweep. Notably, this region of low nucleotide diversity also features relatively low recombination rates, with the transition to higher diversity being demarcated by two recombination hotspots situated at 31.57 Mb and 32.19 Mb (Fig. 3). This close correlation between regional recombination rate and diversity suggests that the depletion in diversity underlying the sweep signal is not a bioinformatic artifact related to read alignment problems or aDNA-related issues (e.g. data missingness), since these factors are unrelated to recombination rate and should not produce such a correlation. Rather, this relationship is consistent with the interplay between positive selection and recombination -i.e. neutral variants sitting near to the beneficial mutation remain in tight linkage while more distant variants lying beyond the two hotspots have become uncoupled by recombination during selection.
Notably, background selection could also lead to a reduction in local diversity in genomic regions marked by low recombination rate and a high density of functional elements. However, in this case we would expect that similar patterns would be observed in all populations examined in our study. Instead, most other populations exhibit weak SweepFinder2 signals and local allelic diversity consistent with neighbouring neutral regions ( Fig. 3 and Supplementary Fig. 16)though the two European Early Farmer populations show pronounced (albeit insignificant) SF2 signals and some local diversity depletion that is consistent with having a high proportion of Anatolian Early Farmer ancestry along with some Western Hunter Gather ancestry.
Finally, we used the least squares method to fit a model predicting the reduction in diversity following the fixation of a selective sweep to the Anatolian Early Farmer samples. Following ref. 35 , the predicted reduction in diversity relative to the genomic background was modelled as In contrast with the AWE sweeps, the four sweeps that first appear at ~30 ka (i.e. initially present in Vestonice) are common in WHG (~75%), but are less frequent in Steppe herders (~25%) and absent in Anatolia EF, and are also missing from all admixed ancient and modern European populations (Fig. 4B). These patterns suggest that these four sweeps most likely arose on the ancestral branch that links the Vestonice individual with WHG groups, following the separation of these lineages from the ancestors of Anatolian EF around 35-40 ka 10  Where λ SGV and λ New measure the occurrence rate of mutants destined for selective fixation for SGVs and de novo mutants, respectively. These rates were obtained from equations 8 and 11 in ref. 40 , resulting in the following: where Θ is the population mutation parameter, h is the dominance, and R is the composite parameter, 2h b α b / (2h d α d + 1), α is the population selection parameter 2N e s (with s designating the selection coefficient), and g is the number of generations rescaled in units of 2N e . The d and b indices denote whether the parameters occur before or after the environmental change that triggers the onset of selection, with b indicating that the beneficial phase of the allele following the environmental change, and d indicating the preceding deleterious phase (noting that the allele could also be strictly neutral during this phase, in which case s d = 0). Importantly, this equation allows for different effective population sizes before and after the onset of the selection pressure, enabling bottleneck effects to be incorporated. We assume that the onset of selection coincided with the founding of the AMH migrant population after separating from ancestral African AMH, After obtaining , we calculated P SGV following equation 14 in ref. 40 : noting that Θ is now calculated as 4N e μM, i.e. the population mutation rate is rescaled according to the mutational target size calculated in the prior equation. All other parameters are recycled from the previous calculation.
Using these two equations, we estimated E[M|G] and P SGV using standard estimates for μ (1.5 x 10 -8 ) 41 and generation time (i.e. g = 29 years, noting that G = t f /g) 42 , setting dominance to be strictly additive (i.e. h d = h b = 0.5), and taking a range of values for other parameters that were not known a priori (see Supplementary Table 4). The range of beneficial selection coefficients replicating estimates from the present study (see Methods and Supplementary Table 2), with the ancestral N e set to 20,000 and Eurasian N e varying between 2,000 to 20,000 to model differential bottleneck effects, aligning with values in our demographic model ( Fig. 2A and Supplementary   Fig. 19). Our results indicate that sweeps from SGVs and de novo mutations both were quite likely in Eurasian history ( Fig. 6 and Supplementary Fig. 21), with de novo mutations dominating when purifying selection was strong (s d = 0.01) prior to the environmental change, whereas SGVs were more likely when purifying selection was weaker (s d ≤ 0.001). Fixation from SGVs was particularly likely (P SGV > 0.75) when the bottleneck intensity was strong (10% of ancestral N e , i.e. 2,000), the fixation interval was constrained to ≤20kyrs, and selection strength during the beneficial phase was ~0.01, all of which are highly plausible for the observed Eurasian sweeps.
Next, we calculated the probability that selection from SGV resulted in the fixation of loci descending from a single variant at the time of the environmental shift, P single , over the same set of parameters, obtaining an estimate of the proportion of fixed SGVs that guarantee a hard sweep signal. We calculate P single as 1 -P mult , where P mult is the probability that multiple copies are fixed as defined by equation 18 in ref. 40 . As expected, P single exhibits a reverse dependency on the strength of purifying selection than P SGV , directly reflecting the greater expected abundance of SGV copies available at the time of environmental change when purifying selection is weak ( Supplementary Fig. 21). Nonetheless, the probabilities of a single copy fixing following the environmental change were non-negligible across the full parameter space explored, even when mutations were strictly neutral prior to the environmental change (minimum P single~1 0%).
Finally, we calculated the compound function P hard = P SGV P single + 1 -P SGV = 1 -P SGV P mult , which estimates the probability that a fixed variant arising from either a SGV or a de novo mutation produces a hard sweep (i.e. the product P SGV P single estimating the probability of a hard sweep from SGVs and 1 -P SGV estimating the probability of a hard sweep from de novo mutations, assuming that all fixed de novo variants produce a hard sweep). The inverse dependency between P SGV and P single on the strength of purifying selection meant that hard sweeps were highly likely when causal mutations experienced some degree of purifying selection prior to the environmental shift (minimum P hard~5 0% when s d > 0), with soft sweep signals becoming particularly dominant when causal mutations were previously strictly neutral and the fixation time intervals were tightly constrained ( Supplementary Fig. 21).
Notably, the P single calculation only considers scenarios where all beneficial mutations arise in a single locus, whereby we expect that the probability of a hard sweep signal from SGV could be higher if mutations occur across multiple independent loci. While a recent study suggests that selection on variants distributed across multiple loci tend to result in polygenic signals when Θ > 0.1, this work assumed that the fixation of a single beneficial mutation was sufficient to arrive at the new fitness optimum 38 . Accordingly, scenarios where the fitness optimum is more distant could support outcomes where one or more loci reach fixation when Θ exceeds 0.1. However, little a priori information is available to inform the parameterization of appropriate polygenic selection models for Eurasian history, including the distance to new fitness optima, mutational target size, epistasis between beneficial loci, and the strength of purifying selection prior to the environmental change, amongst others. Nonetheless, our results suggest that beneficial variants fixing in Eurasian history were likely to leave a hard sweep signal, lending further credibility to the 57 hard sweeps observed in this study.

Supplementary Text 6: Selection was sufficiently strong to result in several sweeps during the Eurasian Pleistocene
Using the five Late Pleistocene individuals to time the candidate sweeps suggests that the majority of candidate sweep haplotypes were at high frequencies by 35 ka, suggesting frequent episodes of surprisingly strong selection prior to the early Pleistocene period. Our estimates of s indicate that selection was reasonably strong, with 53 sweeps having s > 0.5% and 41 having s > 1%, with the largest value of s nearing 10% (see Supplementary Table 2). The selection coefficients we observe are strong enough to drive new or initially rare mutations to a detectable frequency (i.e. >90%) in only 3-4 thousand years for the strongest selection coefficients observed in our data, and within 10-20 thousand years for sweeps with the weakest selection coefficients ( Supplementary Fig. 18). Importantly, these estimates are likely lower than their true s valuesthe SFS-based approach of SweepFinder2 tends to underestimate s by about 20-30% 43 , and selection from standing variation can lead to substantially narrower sweep regions, also resulting in underestimation of s. Indeed, when comparing the inferred selection coefficients against the simulated values for the WHG and Anatolian_EF populations, s was consistently underestimated across the different selection regimes, particularly when acting upon more recent standing variation ( Supplementary Fig. 18). Thus, our data are consistent with multiple episodes of moderate to strong selection driving beneficial variants to high frequency prior to 35kya, with the underlying selection pressures potentially arising from the new environmental conditions encountered by AMH following their arrival in Eurasia. on the raw SweepFinder2 CLR test statistics to infer p values. The top panel shows CLR scores from two Eurasian populations simulated under a neutral demography (see Fig. 2 and Supplementary Fig. 19), following the assignment of a single CLR score to each gene and subsequent correction to account for the positive correlation between gene size and CLR score  Figs. 6 and 7), whereby only ancient Eurasian populations with n eff ≥ 10 (and the three modern European populations) were used to determine the 57 sweeps identified in this study. Figure 8. The frequency of sweeps across populations according to different q value thresholds and outlier gene aggregation distances. Sweeps were defined by grouping together outlier genes within 250kb, 500kb, or 1Mb (i.e. Join width) of each other that fell under a specific q value threshold (see key). Populations were split into two groups based on whether they had sufficient data to make robust inferences (n eff ≥ 10 = 'Large') or not (n eff < 10 = 'Small'). The sweep binning criterion used in the present study defined outlier genes at q < 0.10 and used a join-width of 1Mb to aggregate into sweeps; however, only sweeps with at least one outlier gene with q < 0.01 were retained as candidate sweeps, and only data from 'Large' ancient European populations and three modern populations were used to determine sweeps (see Methods). These sweeps are indicated by the 'x' symbol in the plot. Figure 9. The Z score for each outlier gene observed amongst the 57 candidate sweeps (y-axis), for all modern and ancient populations (x-axis). Populations are organized chronologically according to broad archaeological categorizations, with populations that had insufficient data for robust inferences labelled as 'Small' (i.e. n eff < 10; see Methods). Sweeps and genes are organized by chromosomal position. Outlier genes are indicated with a specific symbol according to their q value (0.05 ≤ q < 0.10 = *; 0.01 ≤ q < 0.05 = **; q < 0.01 = !). European hunter-gatherers, Steppe, Basal Eurasian or Neanderthal populations was excluded from the demography to explicitly quantify power in the absence of admixture. In (A), only sweeps with a final frequency of more than 80% were considered, whereas (B) includes incomplete sweeps that started less than 20kyrs prior to sampling. Power falls below 50% when selection started prior to ~80 ka (i.e. ~70 ka before sampling occurred), or when the final haplotype frequency is below 85% at the time of sampling. All plotted values are best fitting LOESS estimates, with 95% confidence intervals shown as grey ribbons. percentages indicating the proportion of ancestry contributed by the incoming admixture branch).

Supplementary
Model parameters are taken from one of three studies, as denoted by the associated superscript (1 = ref. 10 ; 2 = ref. 44 ; 3 = ref. 45 ). Hard sweeps were simulated on the Main Eurasian branch at three different times (red shapes and text) and were inherited by all descendant populations. The yellow text boxes provide information on the samples that were used in simulation analyses (haploid sample sizes reported), which occur before and after major admixture events.
combinations of selection strength (row panels, see outer labels) and starting time (row panels, see inner labels). Each violin is coloured according to the number of significant tests, with mean and standard deviations indicated by the black horizontal and vertical lines, respectively.
Selection was either modelled as acting uniformly across the entire population history (x-axis, post admixture selection = No), or ceasing at 8 ka following the admixture of WHG and Main Eurasian branch (post admixture selection = Yes). Panels with no information represent populations that split from the Main Eurasian branch before the introduction of the beneficial allele that therefore did not inherit the selected locus. Figure 21. The probability that a sweep was caused by a standing genetic variant (SGV) relative to a de novo mutation conditional on a sweep of either type occurring within a prescribed time interval (left panels). Symbol sizes show the expected mutational target sizes needed to ensure fixation of the beneficial allele within a specific time interval (indicated on the x-axis). SGVs are assumed to have been under some degree of purifying selection (denoted by different coloured symbols) prior to the environmental shift that initiates the beneficial selection phase (symbol shapes), with Eurasian populations experiencing a range of bottleneck intensities (Eurasian N e relative to ancestral N e indicated in row panels). The probability that all fixed SGVs descended from a single copy at the time of the environmental shift (central panels) was strongly dependent on the strength of purifying selection, reversing the direction of this dependency observed for the SGV fixation probabilities. Combining these two probabilities resulted in hard sweep patterns from either SGVs or de novo variants being highly likely (>50%) when causal mutations were at least weakly deleterious prior to the environmental change (right panels; see Supplementary Text 5). All equations come from ref. 40 .

Captions for Supplementary Tables
Supplementary Table 1. Full set of metadata for all ancient western Eurasian samples considered in this study. Data includes sample ages, geographic locations, as well as associated cultural and historical affinities for each individual. This information is combined with genetic relationships between samples in order to categorize the populations that were used for selection analyses. Table 2 Table 3. The number and proportion of mappable SNPs falling within each of the 57 candidate sweep windows relative to other regions of the genome, based on the CRG100 mappability scores from ref. 16 . All sweeps contain at least 96.7% mappable SNPs, indicating that they are unlikely to be alignment artefacts. Table 4. Parameter values used to calculate the probability of fixation from standing genetic variation and de novo mutations, based on equations from ref. 40 .

Captions for Supplementary Data
Supplementary Data 1-57. Composite likelihood ratio (CLR) scores (grey line) from SweepFinder2 relative to genomic coordinates (x-axis) for the top 57 candidate sweeps 20  The images depict the pseudo-haplotype calls at each locus in the sweep window and flanking region, for each population used in this study. Majority alleles, i.e. those alleles most likely to be associated with the sweep, are shown in yellow, and minority alleles in black, with missing data being white. Only individuals with no more than 50% missing data are shown. In addition to the populations scanned for sweeps using SweepFinder2, haplo-images for five Late Pleistocene Eurasian AMH samples (Ust'-Ishim, Kostenki14, GoyetQ-116, and El Mirón) were also included to help determine the earliest timing of the selection pressure for each sweep. Note that only Ust'-Ishim had sufficient coverage to make diploid calls (i.e. both alleles are present at each locus), whereas the four remaining samples use pseudo-haploid calls. All figures are available at https://doi.org/10.25909/631595df4f1b2.