Initial Upper Palaeolithic humans in Europe had recent Neanderthal ancestry

Modern humans appeared in Europe by at least 45,000 years ago1–5, but the extent of their interactions with Neanderthals, who disappeared by about 40,000 years ago6, and their relationship to the broader expansion of modern humans outside Africa are poorly understood. Here we present genome-wide data from three individuals dated to between 45,930 and 42,580 years ago from Bacho Kiro Cave, Bulgaria1,2. They are the earliest Late Pleistocene modern humans known to have been recovered in Europe so far, and were found in association with an Initial Upper Palaeolithic artefact assemblage. Unlike two previously studied individuals of similar ages from Romania7 and Siberia8 who did not contribute detectably to later populations, these individuals are more closely related to present-day and ancient populations in East Asia and the Americas than to later west Eurasian populations. This indicates that they belonged to a modern human migration into Europe that was not previously known from the genetic record, and provides evidence that there was at least some continuity between the earliest modern humans in Europe and later people in Eurasia. Moreover, we find that all three individuals had Neanderthal ancestors a few generations back in their family history, confirming that the first European modern humans mixed with Neanderthals and suggesting that such mixing could have been common.

The transition between the Middle and Upper Palaeolithic periods in Europe, which started about 47,000 years before present (47 kyr bp) 1,2 , overlapped with the spread of modern humans and the disappearance of Neanderthals, which occurred by about 40 kyr bp 6 . Analyses of the genomes of Neanderthals and modern humans have shown that gene flow occurred between the two hominin groups approximately 60-50 kyr bp [8][9][10][11] , probably in southwestern Asia. However, owing to the scarcity of modern human remains from Eurasia that are older than 40 kyr [1][2][3][4][5]12 , genome-wide data are available for only three individuals of this age 7,8,13 (Fig. 1). Little is therefore known about the genetics of the earliest modern humans in Eurasia, the extent to which they interacted with archaic humans and their contribution to later populations. For example, whereas the roughly 42,000 to 37,000-year-old 'Oase1' individual from Romania 7,14 and the roughly 45,000-year-old 'Ust'Ishim' individual from Siberia 8 do not show specific genetic relationships to subsequent Eurasian populations, the approximately 40,000-year-old 'Tianyuan' individual from China contributed to the genetic ancestry of ancient and present-day East Asian populations 13 . Another open question is the extent to which modern humans mixed with Neanderthals when they spread across Europe and Asia. Direct evidence of local interbreeding exists only for the Oase1 individual, who had a recent Neanderthal ancestor 7 in his family history.
Here, we analyse genome-wide data from human specimens found in direct association with an Initial Upper Palaeolithic (IUP) assemblage of artefacts in Bacho Kiro Cave, Bulgaria 1 (Fig. 1), as well as from two more recent specimens from the same site ( Supplementary Information 1). The IUP groups together assemblages that fall chronologically between the last Middle Palaeolithic assemblages and the first bladelet industries of the Upper Palaeolithic. The IUP spans a broad geographical area 15 , from southwest Asia, central and eastern Europe to Mongolia 16 (Fig. 1, https://doi.org/ 10.1038/s41586-021-03335-3

Article
Supplementary Information 1). Although there are reasons to group these assemblages on the basis of their lithic technology, the IUP also shows great regional variability. Therefore, it is debated whether the IUP represents a dispersal of modern humans across middle-latitude Eurasia, the diffusion of certain technological ideas, instances of independent invention, or a combination of some or all of these 15 . The IUP is contemporaneous with late Neanderthal sites in central and western Europe 6 and precedes later Upper Palaeolithic techno-complexes in Europe, such as the Protoaurignacian and the Aurignacian, by several thousand years 5 .
Five human specimens were recovered from Bacho Kiro Cave in recent excavations. They consist of a lower molar (F6-620) found in the upper part of Layer J in the Main Sector, and four bone fragments (AA7-738, BB7-240, CC7-2289 and CC7-335) from Layer I in Niche 1. They have been directly radiocarbon-dated to between 45,930 and 42,580 calibrated years before present (cal. bp) 1,2 , and their mitochondrial genomes are of the modern human type, suggesting that they are the oldest Upper Palaeolithic modern humans that have been recovered in Europe 1 . One bone fragment was found in Layer B in the Main Sector (F6-597) and another one was among the finds from excavations in the 1970s, when it was retrieved in a position corresponding to the interface of Layers B and C (BK1653). The two latter bone fragments were directly dated to 36,320-35,600 cal. bp and 35,290-34,610 cal. bp 1,2 , respectively. Although the lithic assemblages from the later layers are sparse, they are likely to be Aurignacian 1,2 . We also produced additional data from a mandible 7,14 that was found outside any archaeological context in Peștera cu Oase, Romania (referred to as 'Oase1') 14 . The mandible was directly dated to about 42-37 kyr bp 14 , although this may be an underestimate as the dating was performed before recent technical improvements.
We extracted DNA from between 29.3 mg and 64.7 mg of tooth or bone powder from the specimens as described 1 . We also treated 15 mg of bone powder from the Oase1 mandible with 0.5% hypochlorite solution to reduce bacterial and human contamination before DNA extraction 17  For the six specimens, between 57,293 and 3,272,827 of the targeted SNPs were covered by at least one DNA fragment (Extended Data Table 1). Of these, between 11,655 and 2,290,237 SNPs were covered by at least one fragment showing C-to-T substitutions in the first three and/or the last three positions from the ends, suggesting the presence of deaminated cytosine bases, which are typical of ancient DNA 20 (Extended Data Table 1, Extended Data Fig. 1). On the basis of the numbers of putatively deaminated fragments aligning to the X chromosome and the autosomes 21 ( Supplementary Information 4), we conclude that specimens F6-620, AA7-738, BB7-240 and CC7-335 belonged to males, whereas BK1653 and CC7-2289 belonged to females, although the low amount of data makes this conclusion tentative for CC7-2289 (Extended Data Fig. 2a).
Using an approach that makes use of DNA deamination patterns 22 , we estimated that the overall nuclear DNA contamination was between 2.2% ± 0.5% (F6-620) and 42.4% ± 0.6% (CC7-2289). In the male specimens, we estimated contamination from polymorphisms on the X chromosome 23 to between 1.6% ± 0.1% and 3.4% ± 0.5% (Supplementary Information 2). Owing to the presence of present-day human contamination, we restricted all downstream analyses to putatively deaminated fragments for all specimens except F6-620 (for which contamination was so low that we used all fragments). This left between 11,655 and 3,272,827 SNPs per specimen to be used for the subsequent analyses ( Supplementary Information 2).
The molar F6-620 and the bone fragment AA7-738 have identical mitochondrial genome sequences 1 and both come from males. The pairwise mismatch rate between the two specimens at the SNPs 24 is 0.13, similar to the mismatch rate between libraries from the same specimen (Extended Data Fig. 2b). By contrast, this number is 0.23 (interquartile range: 0.22-0.25) for the other Bacho Kiro Cave specimens, similar to unrelated ancient individuals from other studies (Extended Data Fig. 2b). Thus, we conclude that specimens F6-620 and AA7-738 belonged to the same individual or to identical twins, which is much less likely.
We enriched the libraries from the male individuals using probes that targeted about 6.9 Mb of the Y chromosome 25 (Supplementary Information 4) and arrived at 15.2-fold coverage for F6-620, 2.5-fold for BB7-240 and 1.5-fold for CC7-335. F6-620 carries a basal lineage of the Y chromosome haplogroup F (F-M89), whereas BB7-240 and CC7-335 carry haplogroup C1 (C-F3393). Although haplogroup C is common among males from East Asia and Oceania, both haplogroups F and C1 are rare in present-day humans and are found only at low frequencies in mainland Southeast Asia and Japan 26, 27 . 19  We estimated the extent of genetic similarity among the Bacho Kiro Cave individuals and other early modern humans using outgroup f 3 -statistics 28 . The three roughly 45,000-year-old IUP individuals are more similar to one another than to any other ancient individual (Extended Data Fig. 3a). By contrast, BK1653, which is about 35,000 years old, is more similar to later Upper Palaeolithic individuals from Europe who are around 38,000 years old or younger 29,30  When comparing the Bacho Kiro Cave individuals to present-day populations 31 , we found that the IUP individuals share more alleles (that is, more genetic variants) with present-day populations from East Asia, Central Asia and the Americas than with populations from western Eurasia (Fig. 2a, Supplementary Information 5), whereas the later BK1653 individual shares more alleles with present-day western Eurasian populations (Extended Data Figs. 3b, 4a).
We next investigated whether these observations could be due to the fact that present-day populations in western Eurasia derive part of their ancestry from 'Basal Eurasians' 32,33 , an inferred population that diverged early from other non-African populations and may have 'diluted' allele sharing between western Eurasian populations and IUP individuals. To do this, we compared the Ust'Ishim, Oase1 and IUP Bacho Kiro Cave individuals to western Eurasian individuals such as the approximately 38,000-year-old 'Kostenki14' individual from Russia 29,30 , which pre-dates the introduction of 'Basal Eurasian' ancestry to Europe around 8,000 cal. bp 32 . We found that the Ust'Ishim and Oase1 individuals showed no more affinity to western than to eastern Eurasian populations, suggesting that they did not contribute ancestry to later Eurasian populations, as previously shown 7 . Notably, the GoyetQ116-1 individual has previously been shown to share more alleles with early East Asians than other individuals of a similar age in Europe 13 .
When we explored models of population history that are compatible with the observations above using admixture graphs 28 , we found that the IUP Bacho Kiro Cave individuals were related to populations that contributed ancestry to the Tianyuan individual in China as well as, to a lesser extent, to the GoyetQ116-1 and Ust'Ishim individuals (all |Z| < 3; Fig. 2d, Supplementary Information 6). This resolves the previously unclear relationship between the GoyetQ116-1 and Tianyuan individuals 13 without the need for gene flow between these two geographically distant individuals. The models also suggest that the later BK1653 Article individual belonged to a population that was related, but not identical, to that of the GoyetQ116-1 individual (Fig. 2d, Extended Data Fig. 4, Supplementary Information 6) and that the Věstonice cluster, whose members were found in association with Gravettian assemblages 29 , derived part of their ancestry from such a population and the rest from populations related to the roughly 34,000-year-old 'Sunghir' individuals 40 from Russia (Fig. 2d, Supplementary Information 6).
To study the spatial distribution of Neanderthal ancestry in the genomes of the Bacho Kiro Cave individuals, we used around 1.7 million SNPs at which Neanderthal 9 and/or Denisovan 43 genomes differ from African genomes 7 and an approach 44 that detects tracts of archaic DNA in ancient genomes. We found a total of 279.6 centiMorgans (cM) of Neanderthal DNA in F6-620, 251.6 cM in CC7-335 and 220.9 cM in BB7-240, and these individuals carried seven, six and nine Neanderthal DNA segments longer than 5 cM, respectively (Fig. 3, Extended Data Fig. 7a, Supplementary Information 8). The longest introgressed Neanderthal segment in F6-620 encompassed 54.3 cM, and the longest segments in CC7-335 and BB7-240 were 25.6 cM and 17.4 cM, respectively (Fig. 3, Extended Data Fig. 7a). By contrast, the total amount of Neanderthal DNA in the BK1653 genome was 121.7 cM and the longest Neanderthal segment was 2.5 cM (Fig. 3, Extended Data Fig. 7a).
On the basis of the distribution of the long Neanderthal segments, we estimate that individual F6-620 had a Neanderthal ancestor less than six generations back in his family tree (Extended Data Table 2, Supplementary Information 8). Both the CC7-335 and BB7-240 individuals had Neanderthal ancestors about seven generations back in their families, with upper confidence intervals of ten and seventeen generations, respectively (Extended Data To further explore the extent to which the Bacho Kiro Cave individuals contributed ancestry to later populations in Eurasia, we investigated whether the Neanderthal DNA segments in Bacho Kiro Cave genomes overlapped with Neanderthal segments in present-day populations more than expected by chance. We found more overlapping of segments between present-day East Asian populations and the IUP Bacho Kiro Cave individuals (lowest correlation coefficient of 0.09, 95% CI: 0.08-0.1) than with the BK1653 individual (P = 0.02, Wilcoxon test). By contrast, the BK1653 individual shows more overlapping of Neanderthal segments with present-day western Eurasian populations (a correlation coefficient of 0.11, 95% CI: 0.1-0.12) than do the IUP Bacho Kiro Cave individuals (P < 1 × 10 −18 , Wilcoxon test). This is compatible with the observation that the IUP Bacho Kiro Cave population contributed more ancestry to later populations in Asia and the Americas, whereas the BK1653 individual contributed more ancestry to populations in western Eurasia.
We next looked for overlap between parts of the human genome that carry little or no Neanderthal ancestry (Neanderthal 'deserts'), which are thought to have been caused by purifying selection against  Neanderthal DNA shortly after introgression 45,46 . We find almost no introgressed Neanderthal DNA in the previously described deserts in the IUP Bacho Kiro Cave and Oase1 individuals (249 kb out of 898 Mb of introgressed sequence; P = 0.0079, permutation P value). When we restricted these comparisons to the more recent Neanderthal contributions (that is, segments longer than 5 cM), we similarly found no overlap (0 Mb out of 415 Mb, P = 0.15, permutation P value), suggesting that selection against Neanderthal DNA variants occurred within a few generations, although additional individuals with recent Neanderthal ancestry will be needed to fully resolve this question.
In conclusion, the Bacho Kiro Cave genomes show that several distinct modern human populations existed during the early Upper Palaeolithic in Eurasia. Some of these populations, represented by the Oase1 and Ust'Ishim individuals, show no detectable affinities to later populations, whereas groups related to the IUP Bacho Kiro Cave individuals contributed to later populations with Asian ancestry as well as some western Eurasian humans such as the GoyetQ116-1 individual in Belgium. This is consistent with the fact that IUP archaeological assemblages are found from central and eastern Europe to present-day Mongolia 5,15,16 (Fig. 1), and a putative IUP dispersal that reached from eastern Europe to East Asia. Eventually populations related to the IUP Bacho Kiro Cave individuals disappeared in western Eurasia without leaving a detectable genetic contribution to later populations, as indicated by the fact that later individuals, including BK1653 at Bacho Kiro Cave, were closer to present-day European populations than to present-day Asian populations 29, 30 . In Europe, the notion of successive population replacements is also consistent with the archaeological record, where the IUP is clearly intrusive against the Middle Palaeolithic background and where, apart from the common focus on blades, there are no clear technological connections between the IUP and the subsequent Aurignacian technologies. Finally, it is striking that all four of the European individuals who overlapped in time with late Neanderthals 7 and from whom genome-wide data have been retrieved had close Neanderthal relatives in their family histories (Fig. 3, Extended Data Figs. 7,8). This suggests that mixing between Neanderthals and the first modern humans that arrived into Europe was perhaps more common than is often assumed.
Note added in proof: A companion paper 47 describes an individual from the Czechia who-based on genetic analyses-may be of similar age to the IUP Bacho Kiro Cave individuals and who carries a proportion of Neanderthal ancestry similar to later Upper Palaeolithic humans.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-021-03335-3.

Ethics declaration
All approvals for specimen handling have been obtained from the relevant institutions. For the Oase1 specimen, the permission was granted to S.P. by the Emil Racovita Institute of Speleology, as the national authority in caves study. For the Bacho Kiro Cave specimens, the permission was granted by the Bulgarian Ministry of Culture and the National Museum of Natural History (Sofia, Bulgaria).

DNA extraction and library preparation
Data generation for the seven Bacho Kiro Cave specimens (specimen IDs: F6-620, AA7-738, BB7-240, CC7-2289, CC7-335, F6-597 and BK1653) was based on DNA libraries prepared and described previously 1 . To obtain additional data from the Oase1 individual, we extracted DNA from 15 mg of bone powder from the specimen 7, 14 . As it was previously found to be highly contaminated with microbial and present-day human DNA, the bone powder was treated with 0.5% hypochlorite solution before DNA extraction 17 . Four single-stranded DNA libraries were prepared from the resulting extract and two additional libraries were prepared, each using 5 μl of the two DNA extracts generated previously 7 as described 49 . The pools of libraries were then sequenced directly on Illumina MiSeq and HiSeq 2500 platforms in a double index configuration (2 × 76 cycles) 50 and base calling was done using Bustard (Illumina).

DNA captures
We enriched the selected amplified libraries for about 3.7 million SNPs across the genome described in supplementary data 2 of ref. 19 (SNP Panel 1 or 390k array), and supplementary data 1-3 of ref. 7 (SNP Panels 2, 3 and 4, or 840k, 1000k and Archaic admixture arrays, respectively). For the male individuals (Bacho Kiro Cave F6-620, BB7-240 and CC7-335), an aliquot of each library was additionally enriched for about 6.9 Mb of the Y chromosome 25 . All of the enriched libraries were sequenced on the Illumina HiSeq 2500 platforms in a double index configuration (2 × 76 cycles) 50 and base calling was done using Bustard (Illumina).

Sequencing of capture products and data processing
For all sequencing runs we trimmed the adapters and merged overlapping forward and reverse reads into single sequences using leeHom 51 (version: https://bioinf.eva.mpg.de/leehom/). The Burrows-Wheeler Aligner 52 (BWA, version: 0.5.10-evan.9-1-g44db244; https://github. com/mpieva/network-aware-bwa) with the parameters adjusted for ancient DNA ("-n 0.01 -o 2 -l 16500") 43 was used to align the data from all sequencing runs to the human reference genome (GRCh37/1000 Genomes release; ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/). Only reads that showed perfect matches to the expected index combinations were used for all downstream analyses. PCR duplicates were removed using bam-rmdup (version: 0. 6.3; https://github.com/mpieva/ biohazard-tools) and SAMtools (version: 1.3.1) 53 was used to filter for fragments that were at least 35 bp long and that had a mapping quality equal to or greater than 25. BAM files of the libraries enriched for the specific subset of the nuclear genome were further intersected with the BED files containing target SNP positions (390k, 840k, 1000k, Archaic admixture, a merged set of SNP Panels 1 and 2 or 1240k, and a merged set of SNP Panels 1, 2 and 3 or 2200k) and regions (Y chromosome) using BEDtools 54 (version: 2. 24.0). In order to filter for endogenous ancient DNA or putatively deaminated fragments, we used elevated C-to-T substitutions relative to the reference genome at the first three and/or last three positions of the alignment ends 20 . We merged libraries originating from the same specimen using samtools merge 53 to produce the final datasets for downstream analyses (Extended Data Table 1, Supplementary Information 2).

Merging of the Bacho Kiro Cave and Oase1 data with other genomes
We performed random read sampling using bam-caller (https://github. com/bodkan/bam-caller, version: 0.1) by picking a base with a base quality of at least 30 at each position in the 1240k and 2200k SNP Panels that was covered by at least one fragment longer than 35 bp with a mapping quality equal to or higher than 25 (L ≥ 35 bp, MQ ≥ 25, BQ ≥ 30).
To mitigate the effect of deamination-derived substitutions on downstream analyses, we did not sample any Ts on the forward strands (in the orientation as sequenced) or any As on the reverse strands in the first five and/or last five positions from the alignment ends. Owing to the haploid nature of the Y chromosome, we called genotypes across the approximately 6.9 Mb of the Y chromosome for the enriched libraries of male individuals by calling a consensus allele at each position by majority call requiring a minimum coverage of 3 for specimens F6-620 and BB7-240 and of 2 for specimen CC7-335 using using bam-caller (https://github.com/bodkan/bam-caller, version: 0.1) ( Supplementary  Information 2).

Population genetic analyses
To determine the relationship of the Bacho Kiro Cave and Oase1 individuals to other modern and archaic humans we used a range of f-statistics from ADMIXTOOLS 28 (version: v5.1) and as implemented in the R package admixr 62 (version: 0.7.1; Supplementary Information 4). We used qpGraph program (Admixture Graph) from ADMIXTOOLS 28 (version: v5.1) to test models of the relationship among Initial Upper Palaeolithic Bacho Kiro Cave individuals, the roughly 35,000-year-old Bacho Kiro Cave individual BK1653 and other ancient modern humans from Eurasia older than 30,000 years bp (Fig. 2d, Supplementary Information 6).

Neanderthal ancestry
We estimated the proportion of Neanderthal DNA in the genomes of present-day and ancient modern humans by computing a direct f 4 ratio 28 that takes advantage of the two high-quality Neanderthal genomes 9,10,41 (Extended Data Fig. 6, Supplementary Information 7). We used admixfrog 44 (version: 0. 5.6, https://github.com/BenjaminPeter/ admixfrog/) to detect archaic introgressed segments in the genomes of the Bacho Kiro Cave and Oase1 individuals, as well as in other ancient modern humans and 254 present-day non-African individuals from the SGDP 31 as a direct comparison ( Supplementary Information 8). We used these introgressed segments to estimate the number of generations since the most recent Neanderthal ancestor of the IUP Bacho Kiro Cave and Oase1 individuals ( Supplementary Information 8), to investigate the overlap of Neanderthal segments in Bacho Kiro Cave individuals with those detected in present-day and ancient modern humans (Supplementary Information 9), and to investigate the overlap of Neanderthal segments in the IUP Bacho Kiro Cave and Oase1 individuals with parts of the human genome devoid of Neanderthal ancestry 45,46 (Neanderthal deserts; Supplementary Information 10).

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this paper.