Upper Palaeolithic genomes reveal deep roots of modern Eurasians

We extend the scope of European palaeogenomics by sequencing the genomes of Late Upper Palaeolithic (13,300 years old, 1.4-fold coverage) and Mesolithic (9,700 years old, 15.4-fold) males from western Georgia in the Caucasus and a Late Upper Palaeolithic (13,700 years old, 9.5-fold) male from Switzerland. While we detect Late Palaeolithic–Mesolithic genomic continuity in both regions, we find that Caucasus hunter-gatherers (CHG) belong to a distinct ancient clade that split from western hunter-gatherers ∼45 kya, shortly after the expansion of anatomically modern humans into Europe and from the ancestors of Neolithic farmers ∼25 kya, around the Last Glacial Maximum. CHG genomes significantly contributed to the Yamnaya steppe herders who migrated into Europe ∼3,000 BC, supporting a formative Caucasus influence on this important Early Bronze age culture. CHG left their imprint on modern populations from the Caucasus and also central and south Asia possibly marking the arrival of Indo-Aryan languages.


Supplementary
OA, other ancient sample. Significant statistics are highlighted in bold. "Test 1" and "test 2" were performed as in 11 . The number of primary alleles (p) and secondary alleles (s) in ancient male samples at X chromosome sites found to be polymorphic in European populations and adjacent sites 4 bases upstream and downstream are reported. Estimates are based on "test 1" and "test 2" 11 . Associated p-values for contingency tables used in this analysis were calculated using

Supplementary
Fisher's exact test.

Genotype Satsurblia
Coverage Satsurblia   Layer A2 contained a distinct lithic industry with transverse, trapezoid arrowheads. Special forms of denticulates known as "Lekalo" or "Kmlo" tools [16][17][18] , often retouched on the ventral face, were represented in this assemblage. Flake scrapers, including the thumb-nail type, were also found. Blade cores and some production waste (debitage) reflect on-site blade reduction for secondary shaping of tools. A similar industry was reported in Neolithic sites of the region such as the Darkveti rockshelter, ca. 5 km away in the Kvirila river gorge 19 , and Paluri, which is situated further away on the Enguri river 17 . Charcoal from this layer was dated to 7,430 ± 40 uncal. BP (OS-63263).
Layer A2 contained the grave of a complete skeleton from a young adult male (ca. 30-35 years old) whom we have called "Kotias". The grave was dug from this layer into layer B. The skeleton was laid in supine position and stones were placed over certain parts of the skeleton which crushed the skull and covered the knees and lower limb bones. The skull was angled with the right side facing upwards. Both the right and left hands, which lay along the side of the skeleton, covered the groin area. A 15 cm long bone splinter was found in the chest cavity of the specimen below the cervical vertebrae and above the clavicle on the right side. It cannot be affirmed, however, that this splinter was the cause of death. It appears that there were some pathological lesions on the right first rib. Other traumatic and degenerative pathological conditions were evident on the left calcaneus and talus. Attrition of the teeth was intense, especially when taking the relatively young age of the specimen into consideration. Two superimposed shallow hearths were exposed some 20 cm above the skeleton's legs probably either indicating the sealing of the grave or a later occupation. A sample from one of the hearths was dated to 8,370 ± 55 uncal. BP (RTT 5220). A tibia from the skeleton was directly dated to

Grotte du Bichon
The small cave "Grotte du Bichon" is situated in the Swiss Jura Mountains, a few kilometres north of the city of La Chaux-de-Fonds (canton Neuchâtel), at an altitude of 845 m a.s.l.
( Supplementary Fig. 6). During its first excavation, undertaken by speleologists in 1956, bones of a young man were discovered intermingled with the remains of a female brown bear (Ursus arctos) and nine flint projectile points, apparently stemming from the hunter's weapons.

Estimation of molecular damage and sequence length
Ancient DNA molecules are subject to post-mortem degradation typified by short sequence length and an over-representation of nucleotide misincorporation sites at the ends of reads 29,30 .
Using a subsample of 500,000 reads per sample, which had been processed as described in the methods section with the SeqTK step omitted, we examined the sequence length distribution of our reads (Supplementary Fig. 7) and patterns of molecular damage using mapDamage 2.0 31 ( Supplementary Fig. 8). Only bases with a minimum quality of 30 were considered when running mapDamage.
Endogenous DNA sequences from ancient samples tend to have an average sequence length of less than 100 bp 32  For each sample a clear increase in C to T and G to A transitions can be seen at the 5' and 3' ends of molecules respectively ( Supplementary Fig. 8), a typical hallmark of aDNA 29,30 .
Misincorporation frequencies at the start and end of reads (>20% for the Georgian samples and >7% for Bichon) are consistent with levels present in other similarly aged specimens 1,2,6,8 .
Bichon may have less damage at the 5' ends of molecules (the 3' ends were not always completely sequenced) than CHG because DNA fragments retrieved from Bichon tended to be longer ( Supplementary Fig. 7) and thus better preserved.

Mitochondrial DNA contamination
Mitochondrial DNA contamination was evaluated by assessing the proportion of secondary (non-  Table 13).

X chromosome contamination
As all our samples were male (Supplementary Table 17) we were able to assess the level of X chromosome contamination in our samples as described in 2 which was based on 11 . The X chromosome is found in single copy in males and therefore two or more different alleles found at a given site along this chromosome may be due to contamination, damage, sequence error or mismapping. Assuming a similar background site error rate, it is expected that contamination will be more conspicuous at polymorphic sites than at neighbouring monomorphic sites due to the increased propensity for allelic diversity at these sites in contaminating populations. We examined discordance in the rate of heterozygous calls between known polymorphic sites on the non-pseudoautosomal region of the X chromosome reported in the 1,000 Genomes Project 36 and their adjacent sites. Genotypes in the 1,000 Genomes dataset and ancient samples were called and filtered according to 2 Fig. 2). This is consistent with ADMIXTURE analysis and the geographic range of these groups -CHG were separated from these North Eurasian hunter-gatherers by the Caucasus mountain range.

Bichon shares close affinity to other western hunter-gatherers
We also explored the relationships between ancient samples by performing outgroup f 3 -statistics of the form f 3 (X, OA; Yoruba) where we let X be Kotias, Satsurblia and Bichon in turn and OA be all other ancient groups in the dataset (Supplementary Fig. 1). These statistics are informative as their magnitude is proportional to the amount of shared genetic history between the ancient individuals (X and OA) since they diverged from an African (in this case Yoruban) outgroup.
We found that CHG share the most drift with each other and the least drift with the Pleistocene sample Ust'-Ishim ( Supplementary Fig. 1A&B). Other ancient samples share an intermediate amount of drift with no obvious pattern to the distribution of allele sharing. Bichon shares the most genetic drift with other western hunter-gatherers, followed by Scandinavian and eastern hunter-gatherers ( Supplementary Fig. 1C). The fact that Bichon is closest to other WHG, and not equally close to SHG and EHG, suggests that there may have already been sub-structure between these hunter-gatherer groups 13,700 years ago when Bichon was alive.

Supplementary Note 4 Caucasus hunter-gatherers and early farmers are sister groups with an earlier divergence for western hunter-gatherers
To explore the topology between CHG, WHG and early farmers (EF) we used available high coverage data and performed f 3 -statistics (see methods), attempting all possible triplet combinations for these three groups (Figure 2A; Supplementary Table 4). When we did this we presumed that two samples form a clade and the other sample is the outgroup to this clade. For the correct topology we would expect f 3 > 0, as the two correctly grouped samples will have shared drift since they diverged from the outgroup. For incorrect topologies we would expect f 3 = 0 as the incorrectly grouped samples will not have shared drift exclusive to themselves. We  Table 8) and ADMIXTURE analysis ( Figure   1B)). As the signal for EF and CHG forming a clade is much stronger than for the other two topologies we consider the most parsimonious scenario to be that farmers and CHG are sister groups that diverged from each other after splitting from WHG.
Unfortunately we did not have a high coverage diploid sample representing ANE to include in these analyses. Analyses using D-statistics (Supplementary Table 6) revealed however that ANE and WHG group together to the exclusion of CHG. It therefore seems likely that an ancient south (Neolithic farmers and CHG) divergence from the ancient North (WHG and ANE) was the earliest split for these groups. This is shown in Supplementary Fig. 2

Supplementary Note 5
Dating the split among Caucasus hunter-gatherers, western hunter-gatherers and early farmers We used G-PhoCS 54 to reconstruct the joint demographic history of western and Caucasus hunter -gatherers (WHG and CHG respectively) and early farmers (EF). This analysis requires (1) the topology of the underlying population tree; (2) sequence data from short, homologous windows; and (3) specified directional gene flow between branches (migration bands).

G-PhoCS represents the demographic history of a collection of samples by a (binary) tree in
which each branch is a population, with each sample belonging to a different leaf branch and interior branches corresponding to ancestral populations. To find the most likely topology of this tree, we used f 3 analysis to determine the most likely ordering of the population splits (see Figure 2B for a graphical representation). For the G-PhoCS analyses, we considered both a tree with only the ancient genomes, and a tree with an African San Pygmy 55 as the outgroup.

Genome-wide windows of high sample coverage for demographic analyses
Since this analysis requires sequence data from all genomes in short (1 kilobase (kb)) homologous windows, we chose high-coverage genomes to represent each group (Bichon and Loschbour to represent WHG, Kotias to represent CHG, and either Stuttgart or NE1 to represent EF). In addition, we used a high-quality San Pygmy genome 55 as an outgroup. To find the best set of windows, we first generated all-sites coverage information for chromosomes 1 to 22, restricted to regions classified as "neutral" according to the filters in 54 (using UCSC liftOver tool to translate coordinates from hg18 to hg19), and extracted the depth information using a program written in C, again filtering for sites with read depth between 10 and 35 (we avoid sites with very low and very high coverage because alignment and genotyping is problematic (for more details see 37 ): samtools mpileup -C50 -uDI -f <reference.fa> -r <chromosome> \ -l <bedfile with accepted regions> <bamfiles> | bcftools view -gc -\ | get_depth_intervals minCover=10 maxCover=35 interval_file=<chromosome> We then scanned each chromosome for good windows, using a simple heuristic to maximise the sample coverage. We start by finding the first 1 kb window with at least 80% coverage. We then search locally for a window within the next 10 kb for the 1 kb window with the highest coverage.
Finally, we jump 5 kb forward from the chosen location and repeat the process until we reach the end of the chromosome. For the whole genome, this search yielded a total of 152,883 highquality windows. We then used SAMtools/BCFtools 56 (using flags as above) and custom programs written in C and MATLAB to extract genotypes for the windows and converted the genotypes into fasta files for G-PhoCS. To deal with DNA damage in ancient samples, we "in vitro" deaminated all our sequences, as already done for previous analyses of aDNA 57 .

Directional gene flow between branches
Because our WHG samples predate the arrival of farming to central and northern Europe 58 , any gene flow creating shared drift between EF and WHG must be from WHG to EF. Ideally, we would like our model to only allow gene flow between WHG and EF after the arrival of farming to the WHG locations. However, G-PhoCS requires migration to start or stop at time where populations split. Fortunately, our analysis puts the split between the two WHG, Bichon and Loschbour, at around 14k years ago, just a few thousand years prior to farming. We therefore allow gene flow between WHG and EF only after this split. Because Loschbour is temporally and geographically closer than Bichon to the EF, we allow only gene flow from Loschbour to the EF.

G-PhoCS set up
Gamma distributed priors were used for all observables. The shape parameter was set to an With regards to migration from Loschbour to Stuttgart, we explored three different settings: no migration and migrations bands with either a strong or a weak prior. In the case of a weak prior, the shape parameter was set to 0.002 and the mean to 200, to allow exploration of the whole space. With such a broad distribution, the value of the mean hardly influenced our results, based on exploratory runs. As for the strong prior, the shape parameter was set to 1 and the mean to 20,000, corresponding to ~18% of the genealogies sampled as Stuttgart actually originating from Loschbour, with a Loschbour-Bichon split time of 6,666 years (the prior mean). Migration had a negligible effect on split times. The only exception was the split between EF and Caucasus Hunter Gatherers, which was approximately 4-10 thousand year younger without migration; however, the confidence interval for this split was similar, and very broad, for levels of migration, suggesting that this split is difficult to date with the data we have. In this supplementary, we present the results with the strong prior, since previous studies 1,7,46 have pointed to mixing between hunter-gatherers and early farmers. We also explored models with migration going in the opposite direction or bidirectional, but this did not affect the results (as already noted in the original paper describing G-Phocs 54 , the direction of migration tends not to be captured by this method).
We used G-PhoCS's automatic feature to set step sizes (finetune parameters) of the Markov chain for each parameter, which aims for intermediate acceptance ratios. During this procedure, the first 10,000 steps were used to find appropriate step sizes, by updating the parameters every 100 MCMC steps and performing 100 updates. After the step sizes were set, 3,000,000 MCMC steps were performed, with the first 100,000 steps discarded as burn-in. After observing traces of observables, we found that our demographic parameters of interest converged well before the end of the burn-in period. We started two independent chains for each setting in order to assess appropriate mixing of the chain, and observed no problems in any setting.

Converting dates from ancient genomes
G-PhoCS assumes samples to be contemporaneous. The ages of our ancient genomes all fell within a range of ~6k years (~7 kya for the youngest, EF, to ~13 kya for the oldest, Bichon). This discrepancy is relatively small compared to the ages of the splits of interest, and will not affect estimates in a qualitative way (especially given the size of the confidence interval of this type of analysis). To convert split times for a given node as computed by G-PhoCS into calendar dates, we added the mean of the ages of the samples that defined that node. The only modern genome is the San, which is only used as an outgroup; as such, the age of that split between the San and the ancient genomes is not of interest (and given how old that split is, a difference of 10k years in age of the genomes has negligible consequences on the estimates).
Split times estimates from G-PhoCS have to be converted into calendar years based on a mutation rate. Recent work on the high quality genome from Ust'Ishim 3 provides a mutation rate calibrated on ancient DNA, (0.5 × 10 −9 per site per year) which is also in line with estimates from high quality modern genomes 59 . We converted this mutation rate into an appropriate substitution rate for our in vitro deaminated sequences.

G-PhoCS Results
If we consider the model with Stuttgart to represent EF, and San as an outgroup, we find that the split between WHG and the population ancestral to CHG and EF is dated at around ~46 kya, implying an early divergence at the time of, or shortly after, the colonisation of Europe. On the WHG branch, the split between Bichon and Loschbour is dated to ~18 kya (just older than the age of Bichon), implying continuity in western Europe. The split between CHG and EF is dated at ~24 kya, thus suggesting a possible link with the LGM, although the broad confidence intervals require some caution with this interpretation.
Our conclusions are qualitatively similar for other models, irrespective of which genome (Stuttgart or NE1) was used to represent EF, nor whether we used an outgroup (San) or not. In the main text and in Figure 2B, we report the dates from the model including Stuttgart and San, but details of the other models are available in Supplementary Table 5.

Supplementary Note 6 Phenotypes of interest
To get a picture of the phenotypic characteristics of our samples, we examined genes which have been associated with particular phenotypes in modern populations, including some loci which have been subject to selection in European populations. To investigate skin tone in our samples we began by using the 8-plex prediction model 66

Runs of homozygosity
To gain an insight into past population structure we examined runs of homozygosity (ROH) in our ancient samples. ROH occur when identical extended regions of the genome are inherited from both parents and their distribution can be informative about past population demography 2,60-62 . Long homozygous genomic stretches provide evidence for recent endogamy because recombination has not acted to break down these long tracts which are identical by descent. In contrast short runs can be indicative of an ancient population bottleneck. After such a constrictive event a population will experience a period of increased inbreeding creating long homozygous haplotypes but these segments can be broken up by recombination over time as the population expands, creating short homozygous runs.
Examination of ROH requires dense diploid genotypes. We used imputation to maximise the information content of our most ancient sample, Satsurblia, which was sequenced to 1.44x.
Imputation allows the inference of missing genotypes by comparing surrounding haplotypes in the sample to those found in a phased reference panel and has been shown to be a valid method for leveraging palaeogenomic data 2 . We were concerned that the haplotypes present in Satsurblia may not be well represented by haplogroups in our modern dataset. To test if CHG genotypes could be accurately imputed we down-sampled our high coverage genome Kotias to ~1x and compared 546,625 imputed genotypes which overlapped with our confidently called high coverage genotypes for the filtered Human Origins dataset ( Supplementary Fig. 10). When we imposed a genotype probability of 0.99 we found that 85% of loci were retained and of those 99.41% (97.91% of heterozygotes) matched our high coverage calls ( Supplementary Fig. 8).
This high concordance rate supports the use of imputed CHG data in our analyses.
ROH analysis was carried out as described in the methods section. When we plotted short ROH (<1.6 Mb) against long ROH (≥ 1.6 Mb) ( Figure 3A)  We also placed our ROH into size bins 60 ( Figure 2B)

Supplementary Note 9 ADMIXTURE analysis
The admixture proportions are shown on Supplementary Fig. 5 for all samples and in Figure 1 for ancient individuals at K=17 (the minimal cross validation error; Supplementary Fig. 9). In the South Asian "dark purple" and the Near Eastern "magenta" are present, they mainly consist of those two components, with a small fraction of Western hunter-gatherer related ancestry.
Modern populations from the Caucasus and West Asia harbour the same components, but with an increased fraction of the Near Eastern "magenta" component. From K=15 onwards, a new cluster nearly completely describing CHG ("lime green") appears. Out of the two CHG samples, the older Satsurblia is fully assigned to the "lime green" cluster, whereas the later sample Kotias also features minor (<10%) Near Eastern ancestry. This new CHG-related component also appears in modern populations in west Eurasia, but no modern population belongs purely to this cluster. Even modern populations from the Caucasus continue to harbour a large amount (close to 50%) of the Near Eastern "magenta" component. The components characteristic of Native Americans populations ("dark green", "light brown", "dark gray" and "dark blue") are also worth noting. These components are present in MA1 and eastern hunter-gatherers, in agreement with previous studies 5,7 , but is at very low levels in the  Table   1) from 2 using default parameters.

Swedish Samples, Loschbour and Stuttgart
BAM files with genome-level coverage (Supplementary Table 1) from 8 and 1 were realigned to the human genome as outlined in the methods section. MapDamage rescaling 31 was not performed on data from 1 as these samples had been molecularly treated to remove deaminated cytosines, reducing DNA damage associated errors.

La Braña and Mal'ta (MA1)
FASTQ files from 6 and 5 were aligned to the GRCh37 build of the human genome with the mitochondrial sequence removed. Further processing was carried out as described in the methods section. For MA1, mapDamage rescaling 31 was omitted due to the low deamination rates found in this sample 5 .

Ust'-Ishim
A BAM file containing sequences from the Ust'-Ishim genome 3 was filtered to remove reads with mapping quality of less than 30. This sample had been molecularly treated to remove deaminated cytosines so mapDamage rescaling 31 was not performed.

Kostenki
A BAM file containing sequences from the Kostenki genome was downloaded from 4 and duplicate reads were removed. Sequences had already been filtered to have mapping quality > 30 and the first and last 5 bp of all reads had been soft-clipped to a base quality of zero. As terminal bases had been soft-clipped and the sample had been molecularly treated to remove deaminated cytosines, mapDamage rescaling 31 was not carried out.

Otzi the Tyrolean Iceman
Forward FASTQ files were downloaded from 9 and realigned to the GRCh37 build of the human genome with the mitochondrial sequence removed. Sequences were further processed as described in the methods section however reads were not rescaled using mapDamage as the substitution frequencies were outside the bounds of the posterior predictive distribution intervals set by the mapDamage model 31 .

Haak ancient data
Genotypes for Eurasian ancient samples (Supplementary Table 1

Allentoft ancient data
Genotypes for Bronze and Iron Age ancient samples published in 10 were kindly provided by GeoGenetics, Copenhagen. Only samples with at least 200,000 called genotypes were used (Supplementary Table 1).

Soft-clipping ancient data
Due to the SeqTK step carried out prior to alignment (see methods), Kotias, Satsurblia, and Bichon had the first and last two bases of all reads removed. To mirror this step in published ancient data which had not been trimmed in this way, the first and last 2 bp of all sequences were soft-clipped to a base quality of two before genotypes were called (with the exception of data from 7 and 10 for which genotypes were already called).