Modern humans appeared in Europe by at least 45,000 years ago1,2,3,4,5, but the extent of their interactions with Neanderthals, who disappeared by about 40,000 years ago6, and their relationship to the broader expansion of modern humans outside Africa are poorly understood. Here we present genome-wide data from three individuals dated to between 45,930 and 42,580 years ago from Bacho Kiro Cave, Bulgaria1,2. They are the earliest Late Pleistocene modern humans known to have been recovered in Europe so far, and were found in association with an Initial Upper Palaeolithic artefact assemblage. Unlike two previously studied individuals of similar ages from Romania7 and Siberia8 who did not contribute detectably to later populations, these individuals are more closely related to present-day and ancient populations in East Asia and the Americas than to later west Eurasian populations. This indicates that they belonged to a modern human migration into Europe that was not previously known from the genetic record, and provides evidence that there was at least some continuity between the earliest modern humans in Europe and later people in Eurasia. Moreover, we find that all three individuals had Neanderthal ancestors a few generations back in their family history, confirming that the first European modern humans mixed with Neanderthals and suggesting that such mixing could have been common.
The transition between the Middle and Upper Palaeolithic periods in Europe, which started about 47,000 years before present (47 kyr bp)1,2, overlapped with the spread of modern humans and the disappearance of Neanderthals, which occurred by about 40 kyr bp6. Analyses of the genomes of Neanderthals and modern humans have shown that gene flow occurred between the two hominin groups approximately 60–50 kyr bp8,9,10,11, probably in southwestern Asia. However, owing to the scarcity of modern human remains from Eurasia that are older than 40 kyr1,2,3,4,5,12, genome-wide data are available for only three individuals of this age7,8,13 (Fig. 1). Little is therefore known about the genetics of the earliest modern humans in Eurasia, the extent to which they interacted with archaic humans and their contribution to later populations. For example, whereas the roughly 42,000 to 37,000-year-old ‘Oase1’ individual from Romania7,14 and the roughly 45,000-year-old ‘Ust’Ishim’ individual from Siberia8 do not show specific genetic relationships to subsequent Eurasian populations, the approximately 40,000-year-old ‘Tianyuan’ individual from China contributed to the genetic ancestry of ancient and present-day East Asian populations13. Another open question is the extent to which modern humans mixed with Neanderthals when they spread across Europe and Asia. Direct evidence of local interbreeding exists only for the Oase1 individual, who had a recent Neanderthal ancestor7 in his family history.
Here, we analyse genome-wide data from human specimens found in direct association with an Initial Upper Palaeolithic (IUP) assemblage of artefacts in Bacho Kiro Cave, Bulgaria1 (Fig. 1), as well as from two more recent specimens from the same site (Supplementary Information 1). The IUP groups together assemblages that fall chronologically between the last Middle Palaeolithic assemblages and the first bladelet industries of the Upper Palaeolithic. The IUP spans a broad geographical area15, from southwest Asia, central and eastern Europe to Mongolia16 (Fig. 1, Supplementary Information 1). Although there are reasons to group these assemblages on the basis of their lithic technology, the IUP also shows great regional variability. Therefore, it is debated whether the IUP represents a dispersal of modern humans across middle-latitude Eurasia, the diffusion of certain technological ideas, instances of independent invention, or a combination of some or all of these15. The IUP is contemporaneous with late Neanderthal sites in central and western Europe6 and precedes later Upper Palaeolithic techno-complexes in Europe, such as the Protoaurignacian and the Aurignacian, by several thousand years5.
Five human specimens were recovered from Bacho Kiro Cave in recent excavations. They consist of a lower molar (F6-620) found in the upper part of Layer J in the Main Sector, and four bone fragments (AA7-738, BB7-240, CC7-2289 and CC7-335) from Layer I in Niche 1. They have been directly radiocarbon-dated to between 45,930 and 42,580 calibrated years before present (cal. bp)1,2, and their mitochondrial genomes are of the modern human type, suggesting that they are the oldest Upper Palaeolithic modern humans that have been recovered in Europe1. One bone fragment was found in Layer B in the Main Sector (F6-597) and another one was among the finds from excavations in the 1970s, when it was retrieved in a position corresponding to the interface of Layers B and C (BK1653). The two latter bone fragments were directly dated to 36,320–35,600 cal. bp and 35,290–34,610 cal. bp1,2, respectively. Although the lithic assemblages from the later layers are sparse, they are likely to be Aurignacian1,2. We also produced additional data from a mandible7,14 that was found outside any archaeological context in Peștera cu Oase, Romania (referred to as ‘Oase1’)14. The mandible was directly dated to about 42–37 kyr bp14, although this may be an underestimate as the dating was performed before recent technical improvements.
We extracted DNA from between 29.3 mg and 64.7 mg of tooth or bone powder from the specimens as described1. We also treated 15 mg of bone powder from the Oase1 mandible with 0.5% hypochlorite solution to reduce bacterial and human contamination before DNA extraction17. Among DNA fragments sequenced from the DNA libraries constructed from the Bacho Kiro Cave and Oase1 extracts, between 0.003% and 1.8% could be mapped to the human genome (Supplementary Information 2). Owing to the low fraction of hominin DNA, we used in-solution hybridization capture18 to enrich the libraries for about 3.8 million single-nucleotide polymorphisms (SNPs) that are informative about modern human variation and archaic admixture7,19 (excluding F6-597, which contained very little if any endogenous DNA; Supplementary Information 2).
For the six specimens, between 57,293 and 3,272,827 of the targeted SNPs were covered by at least one DNA fragment (Extended Data Table 1). Of these, between 11,655 and 2,290,237 SNPs were covered by at least one fragment showing C-to-T substitutions in the first three and/or the last three positions from the ends, suggesting the presence of deaminated cytosine bases, which are typical of ancient DNA20 (Extended Data Table 1, Extended Data Fig. 1). On the basis of the numbers of putatively deaminated fragments aligning to the X chromosome and the autosomes21 (Supplementary Information 4), we conclude that specimens F6-620, AA7-738, BB7-240 and CC7-335 belonged to males, whereas BK1653 and CC7-2289 belonged to females, although the low amount of data makes this conclusion tentative for CC7-2289 (Extended Data Fig. 2a).
Using an approach that makes use of DNA deamination patterns22, we estimated that the overall nuclear DNA contamination was between 2.2% ± 0.5% (F6-620) and 42.4% ± 0.6% (CC7-2289). In the male specimens, we estimated contamination from polymorphisms on the X chromosome23 to between 1.6% ± 0.1% and 3.4% ± 0.5% (Supplementary Information 2). Owing to the presence of present-day human contamination, we restricted all downstream analyses to putatively deaminated fragments for all specimens except F6-620 (for which contamination was so low that we used all fragments). This left between 11,655 and 3,272,827 SNPs per specimen to be used for the subsequent analyses (Supplementary Information 2).
The molar F6-620 and the bone fragment AA7-738 have identical mitochondrial genome sequences1 and both come from males. The pairwise mismatch rate between the two specimens at the SNPs24 is 0.13, similar to the mismatch rate between libraries from the same specimen (Extended Data Fig. 2b). By contrast, this number is 0.23 (interquartile range: 0.22–0.25) for the other Bacho Kiro Cave specimens, similar to unrelated ancient individuals from other studies (Extended Data Fig. 2b). Thus, we conclude that specimens F6-620 and AA7-738 belonged to the same individual or to identical twins, which is much less likely.
We enriched the libraries from the male individuals using probes that targeted about 6.9 Mb of the Y chromosome25 (Supplementary Information 4) and arrived at 15.2-fold coverage for F6-620, 2.5-fold for BB7-240 and 1.5-fold for CC7-335. F6-620 carries a basal lineage of the Y chromosome haplogroup F (F-M89), whereas BB7-240 and CC7-335 carry haplogroup C1 (C-F3393). Although haplogroup C is common among males from East Asia and Oceania, both haplogroups F and C1 are rare in present-day humans and are found only at low frequencies in mainland Southeast Asia and Japan26,27.
We estimated the extent of genetic similarity among the Bacho Kiro Cave individuals and other early modern humans using outgroup f3-statistics28. The three roughly 45,000-year-old IUP individuals are more similar to one another than to any other ancient individual (Extended Data Fig. 3a). By contrast, BK1653, which is about 35,000 years old, is more similar to later Upper Palaeolithic individuals from Europe who are around 38,000 years old or younger29,30 (3.0 ≤ |Z| ≤ 17.4; Extended Data Fig. 4, Supplementary Information 5); for example, to the roughly 35,000-year-old ‘GoyetQ116-1’ individual from Belgium and members of the ‘Věstonice’ genetic cluster, who are associated with later Gravettian assemblages29 (Extended Data Figs. 3a, 4b, c).
When comparing the Bacho Kiro Cave individuals to present-day populations31, we found that the IUP individuals share more alleles (that is, more genetic variants) with present-day populations from East Asia, Central Asia and the Americas than with populations from western Eurasia (Fig. 2a, Supplementary Information 5), whereas the later BK1653 individual shares more alleles with present-day western Eurasian populations (Extended Data Figs. 3b, 4a).
We next investigated whether these observations could be due to the fact that present-day populations in western Eurasia derive part of their ancestry from ‘Basal Eurasians’32,33, an inferred population that diverged early from other non-African populations and may have ‘diluted’ allele sharing between western Eurasian populations and IUP individuals. To do this, we compared the Ust’Ishim, Oase1 and IUP Bacho Kiro Cave individuals to western Eurasian individuals such as the approximately 38,000-year-old ‘Kostenki14’ individual from Russia29,30, which pre-dates the introduction of ‘Basal Eurasian’ ancestry to Europe around 8,000 cal. bp32. We found that the Ust’Ishim and Oase1 individuals showed no more affinity to western than to eastern Eurasian populations, suggesting that they did not contribute ancestry to later Eurasian populations, as previously shown7,8 (Supplementary Information 5, Extended Data Fig. 5). By contrast, the IUP Bacho Kiro Cave individuals shared more alleles with the roughly 40,000-year-old Tianyuan individual13 from China (Fig. 2b) and other ancient Siberians34,35 and Native Americans36,37,38,39 (Fig. 2c) than with the Kostenki14 individual (3.6 ≤ |Z| ≤ 5.3). Among other western Eurasian Upper Palaeolithic humans, the IUP Bacho Kiro Cave individuals shared more alleles with the Oase1 (3.6 ≤ |Z| ≤ 4.3) and roughly 35,000-year-old GoyetQ116-129 individuals than with the Kostenki14 individual (3.2 ≤ |Z| ≤ 4.3; Fig. 2c, Supplementary Information 5). Notably, the GoyetQ116-1 individual has previously been shown to share more alleles with early East Asians than other individuals of a similar age in Europe13.
When we explored models of population history that are compatible with the observations above using admixture graphs28, we found that the IUP Bacho Kiro Cave individuals were related to populations that contributed ancestry to the Tianyuan individual in China as well as, to a lesser extent, to the GoyetQ116-1 and Ust’Ishim individuals (all |Z| < 3; Fig. 2d, Supplementary Information 6). This resolves the previously unclear relationship between the GoyetQ116-1 and Tianyuan individuals13 without the need for gene flow between these two geographically distant individuals. The models also suggest that the later BK1653 individual belonged to a population that was related, but not identical, to that of the GoyetQ116-1 individual (Fig. 2d, Extended Data Fig. 4, Supplementary Information 6) and that the Věstonice cluster, whose members were found in association with Gravettian assemblages29, derived part of their ancestry from such a population and the rest from populations related to the roughly 34,000-year-old ‘Sunghir’ individuals40 from Russia (Fig. 2d, Supplementary Information 6).
As the IUP Bacho Kiro Cave individuals lived at the same time as some of the last Neanderthals in Europe6, we estimated the proportion of Neanderthal DNA in their genomes by taking advantage of two high-quality Neanderthal genomes9,10,41. We found that the IUP individuals F6-620, BB7-240 and CC7-335 carried 3.8% (95% confidence interval (CI): 3.3–4.4%), 3.0% (95% CI: 2.4–3.6%) and 3.4% (95% CI: 2.8–4.0%) Neanderthal DNA, respectively. This is more than the average of 1.9% (95% CI: 1.5–2.4%) found in other ancient or present-day humans, except for the Oase1 individual, who had a close Neanderthal relative (6.4% (95% CI: 5.7–7.1%); Extended Data Fig. 6, Supplementary Information 7). By contrast, the more recent BK1653 individual carried only 1.9% (95% CI: 1.4–2.4%) Neanderthal DNA, similar to other ancient and present-day humans10,41 (Extended Data Fig. 6). As has been the case for all humans studied so far, the Neanderthal DNA in BK1653 and the IUP Bacho Kiro Cave individuals was more similar to the Vindija33.1910 and Chagyrskaya842 Neanderthals than to the Altai Neanderthal9 (2.8 ≤ |Z| ≤ 5.1; Supplementary Information 7).
To study the spatial distribution of Neanderthal ancestry in the genomes of the Bacho Kiro Cave individuals, we used around 1.7 million SNPs at which Neanderthal9 and/or Denisovan43 genomes differ from African genomes7 and an approach44 that detects tracts of archaic DNA in ancient genomes. We found a total of 279.6 centiMorgans (cM) of Neanderthal DNA in F6-620, 251.6 cM in CC7-335 and 220.9 cM in BB7-240, and these individuals carried seven, six and nine Neanderthal DNA segments longer than 5 cM, respectively (Fig. 3, Extended Data Fig. 7a, Supplementary Information 8). The longest introgressed Neanderthal segment in F6-620 encompassed 54.3 cM, and the longest segments in CC7-335 and BB7-240 were 25.6 cM and 17.4 cM, respectively (Fig. 3, Extended Data Fig. 7a). By contrast, the total amount of Neanderthal DNA in the BK1653 genome was 121.7 cM and the longest Neanderthal segment was 2.5 cM (Fig. 3, Extended Data Fig. 7a).
On the basis of the distribution of the long Neanderthal segments, we estimate that individual F6-620 had a Neanderthal ancestor less than six generations back in his family tree (Extended Data Table 2, Supplementary Information 8). Both the CC7-335 and BB7-240 individuals had Neanderthal ancestors about seven generations back in their families, with upper confidence intervals of ten and seventeen generations, respectively (Extended Data Table 2, Extended Data Fig. 7b, Supplementary Information 8). Thus, all IUP Bacho Kiro Cave individuals had recent Neanderthal ancestors in their immediate family histories.
To further explore the extent to which the Bacho Kiro Cave individuals contributed ancestry to later populations in Eurasia, we investigated whether the Neanderthal DNA segments in Bacho Kiro Cave genomes overlapped with Neanderthal segments in present-day populations more than expected by chance. We found more overlapping of segments between present-day East Asian populations and the IUP Bacho Kiro Cave individuals (lowest correlation coefficient of 0.09, 95% CI: 0.08–0.1) than with the BK1653 individual (P = 0.02, Wilcoxon test). By contrast, the BK1653 individual shows more overlapping of Neanderthal segments with present-day western Eurasian populations (a correlation coefficient of 0.11, 95% CI: 0.1–0.12) than do the IUP Bacho Kiro Cave individuals (P < 1 × 10−18, Wilcoxon test). This is compatible with the observation that the IUP Bacho Kiro Cave population contributed more ancestry to later populations in Asia and the Americas, whereas the BK1653 individual contributed more ancestry to populations in western Eurasia.
We next looked for overlap between parts of the human genome that carry little or no Neanderthal ancestry (Neanderthal ‘deserts’), which are thought to have been caused by purifying selection against Neanderthal DNA shortly after introgression45,46. We find almost no introgressed Neanderthal DNA in the previously described deserts in the IUP Bacho Kiro Cave and Oase1 individuals (249 kb out of 898 Mb of introgressed sequence; P = 0.0079, permutation P value). When we restricted these comparisons to the more recent Neanderthal contributions (that is, segments longer than 5 cM), we similarly found no overlap (0 Mb out of 415 Mb, P = 0.15, permutation P value), suggesting that selection against Neanderthal DNA variants occurred within a few generations, although additional individuals with recent Neanderthal ancestry will be needed to fully resolve this question.
In conclusion, the Bacho Kiro Cave genomes show that several distinct modern human populations existed during the early Upper Palaeolithic in Eurasia. Some of these populations, represented by the Oase1 and Ust’Ishim individuals, show no detectable affinities to later populations, whereas groups related to the IUP Bacho Kiro Cave individuals contributed to later populations with Asian ancestry as well as some western Eurasian humans such as the GoyetQ116-1 individual in Belgium. This is consistent with the fact that IUP archaeological assemblages are found from central and eastern Europe to present-day Mongolia5,15,16 (Fig. 1), and a putative IUP dispersal that reached from eastern Europe to East Asia. Eventually populations related to the IUP Bacho Kiro Cave individuals disappeared in western Eurasia without leaving a detectable genetic contribution to later populations, as indicated by the fact that later individuals, including BK1653 at Bacho Kiro Cave, were closer to present-day European populations than to present-day Asian populations29,30. In Europe, the notion of successive population replacements is also consistent with the archaeological record, where the IUP is clearly intrusive against the Middle Palaeolithic background and where, apart from the common focus on blades, there are no clear technological connections between the IUP and the subsequent Aurignacian technologies. Finally, it is striking that all four of the European individuals who overlapped in time with late Neanderthals7 and from whom genome-wide data have been retrieved had close Neanderthal relatives in their family histories (Fig. 3, Extended Data Figs. 7, 8). This suggests that mixing between Neanderthals and the first modern humans that arrived into Europe was perhaps more common than is often assumed.
Note added in proof: A companion paper47 describes an individual from the Czechia who—based on genetic analyses—may be of similar age to the IUP Bacho Kiro Cave individuals and who carries a proportion of Neanderthal ancestry similar to later Upper Palaeolithic humans.
All approvals for specimen handling have been obtained from the relevant institutions. For the Oase1 specimen, the permission was granted to S.P. by the Emil Racovita Institute of Speleology, as the national authority in caves study. For the Bacho Kiro Cave specimens, the permission was granted by the Bulgarian Ministry of Culture and the National Museum of Natural History (Sofia, Bulgaria).
DNA extraction and library preparation
Data generation for the seven Bacho Kiro Cave specimens (specimen IDs: F6-620, AA7-738, BB7-240, CC7-2289, CC7-335, F6-597 and BK1653) was based on DNA libraries prepared and described previously1. To obtain additional data from the Oase1 individual, we extracted DNA from 15 mg of bone powder from the specimen7,14. As it was previously found to be highly contaminated with microbial and present-day human DNA, the bone powder was treated with 0.5% hypochlorite solution before DNA extraction17. Four single-stranded DNA libraries were prepared from the resulting extract and two additional libraries were prepared, each using 5 μl of the two DNA extracts generated previously7 as described49. The pools of libraries were then sequenced directly on Illumina MiSeq and HiSeq 2500 platforms in a double index configuration (2 × 76 cycles)50 and base calling was done using Bustard (Illumina).
We enriched the selected amplified libraries for about 3.7 million SNPs across the genome described in supplementary data 2 of ref. 19 (SNP Panel 1 or 390k array), and supplementary data 1–3 of ref. 7 (SNP Panels 2, 3 and 4, or 840k, 1000k and Archaic admixture arrays, respectively). For the male individuals (Bacho Kiro Cave F6-620, BB7-240 and CC7-335), an aliquot of each library was additionally enriched for about 6.9 Mb of the Y chromosome25. All of the enriched libraries were sequenced on the Illumina HiSeq 2500 platforms in a double index configuration (2 × 76 cycles)50 and base calling was done using Bustard (Illumina).
Sequencing of capture products and data processing
For all sequencing runs we trimmed the adapters and merged overlapping forward and reverse reads into single sequences using leeHom51 (version: https://bioinf.eva.mpg.de/leehom/). The Burrows-Wheeler Aligner52 (BWA, version: 0.5.10-evan.9-1-g44db244; https://github.com/mpieva/network-aware-bwa) with the parameters adjusted for ancient DNA (“-n 0.01 –o 2 –l 16500”)43 was used to align the data from all sequencing runs to the human reference genome (GRCh37/1000 Genomes release; ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/). Only reads that showed perfect matches to the expected index combinations were used for all downstream analyses. PCR duplicates were removed using bam-rmdup (version: 0.6.3; https://github.com/mpieva/biohazard-tools) and SAMtools (version: 1.3.1)53 was used to filter for fragments that were at least 35 bp long and that had a mapping quality equal to or greater than 25. BAM files of the libraries enriched for the specific subset of the nuclear genome were further intersected with the BED files containing target SNP positions (390k, 840k, 1000k, Archaic admixture, a merged set of SNP Panels 1 and 2 or 1240k, and a merged set of SNP Panels 1, 2 and 3 or 2200k) and regions (Y chromosome) using BEDtools54 (version: 2.24.0). In order to filter for endogenous ancient DNA or putatively deaminated fragments, we used elevated C-to-T substitutions relative to the reference genome at the first three and/or last three positions of the alignment ends20. We merged libraries originating from the same specimen using samtools merge53 to produce the final datasets for downstream analyses (Extended Data Table 1, Supplementary Information 2).
Merging of the Bacho Kiro Cave and Oase1 data with other genomes
We performed random read sampling using bam-caller (https://github.com/bodkan/bam-caller, version: 0.1) by picking a base with a base quality of at least 30 at each position in the 1240k and 2200k SNP Panels that was covered by at least one fragment longer than 35 bp with a mapping quality equal to or higher than 25 (L ≥ 35 bp, MQ ≥ 25, BQ ≥ 30). To mitigate the effect of deamination-derived substitutions on downstream analyses, we did not sample any Ts on the forward strands (in the orientation as sequenced) or any As on the reverse strands in the first five and/or last five positions from the alignment ends. Owing to the haploid nature of the Y chromosome, we called genotypes across the approximately 6.9 Mb of the Y chromosome for the enriched libraries of male individuals by calling a consensus allele at each position by majority call requiring a minimum coverage of 3 for specimens F6-620 and BB7-240 and of 2 for specimen CC7-335 using using bam-caller (https://github.com/bodkan/bam-caller, version: 0.1) (Supplementary Information 2).
We merged the data from the newly sequenced specimens with datasets of previously published ancient and present-day humans, as well as archaic humans, for three SNP panels (1240k, 2240k and Archaic admixture; Supplementary Information 3). Data from the 1240k panel include genotypes of 2,109 ancient and 2,974 present-day individuals compiled from published studies and available in the EIGENSTRAT format28 at https://reich.hms.harvard.edu/allen-ancient-dna- resource-aadr-downloadable-genotypes- present-day-and-ancient-dna-data (version 37.2, released 22 February 2019). Data from the 2240k panel include published genetic data of ancient modern humans obtained through hybridization captures7,13,29 and a range of present-day9,31 and ancient modern humans8,30,32,33,34,35,36,37,55,56,57,58,59,60, as well as the archaics9,10,42,43,61, for which whole-genome shotgun data of varying coverage are available (Supplementary Information 3). The Archaic admixture panel data include 21 ancient modern humans directly enriched for these sites7,13,29, as well as the genotypes of present-day9,31 and ancient modern humans8,30,32,33,34,35,36,37,55,56,57,58,59,60, as well as the archaics9,10,42,43,61, for which whole-genome shotgun data are available (Supplementary Information 3) and that were intersected with about 1.7 million SNPs of the Archaic admixture panel using BEDTools54 (version: 2.24.0).
Population genetic analyses
To determine the relationship of the Bacho Kiro Cave and Oase1 individuals to other modern and archaic humans we used a range of f-statistics from ADMIXTOOLS28 (version: v5.1) and as implemented in the R package admixr62 (version: 0.7.1; Supplementary Information 4). We used qpGraph program (Admixture Graph) from ADMIXTOOLS28 (version: v5.1) to test models of the relationship among Initial Upper Palaeolithic Bacho Kiro Cave individuals, the roughly 35,000-year-old Bacho Kiro Cave individual BK1653 and other ancient modern humans from Eurasia older than 30,000 years bp (Fig. 2d, Supplementary Information 6).
We estimated the proportion of Neanderthal DNA in the genomes of present-day and ancient modern humans by computing a direct f4 ratio28 that takes advantage of the two high-quality Neanderthal genomes9,10,41 (Extended Data Fig. 6, Supplementary Information 7). We used admixfrog44 (version: 0.5.6, https://github.com/BenjaminPeter/admixfrog/) to detect archaic introgressed segments in the genomes of the Bacho Kiro Cave and Oase1 individuals, as well as in other ancient modern humans and 254 present-day non-African individuals from the SGDP31 as a direct comparison (Supplementary Information 8). We used these introgressed segments to estimate the number of generations since the most recent Neanderthal ancestor of the IUP Bacho Kiro Cave and Oase1 individuals (Supplementary Information 8), to investigate the overlap of Neanderthal segments in Bacho Kiro Cave individuals with those detected in present-day and ancient modern humans (Supplementary Information 9), and to investigate the overlap of Neanderthal segments in the IUP Bacho Kiro Cave and Oase1 individuals with parts of the human genome devoid of Neanderthal ancestry45,46 (Neanderthal deserts; Supplementary Information 10).
Further information on research design is available in the Nature Research Reporting Summary linked to this paper.
The aligned sequences of the Bacho Kiro Cave and Oase1 individuals have been deposited in the European Nucleotide Archive under accession number PRJEB39134. Comparative data of present-day human genomes from the SGDP that were used in this study are available at https://www.simonsfoundation.org/simons-genome-diversity-project/. Comparative data used in this study, which include genotypes of 2,109 ancient and 2,974 present-day individuals compiled from published studies, are available in the EIGENSTRAT file format at https://reich.hms.harvard.edu/allen-ancient-dna-resource- aadr-downloadable-genotypes- present-day-and-ancient-dna-data (version 37.2, released 22 February 2019). To determine the Y chromosome haplogroups of male individuals in this study, we used the Y-haplogroup tree from the International Society of Genetic Genealogy (ISOGG, available at http://www.isogg.org, version: 13.38).
Hublin, J.-J. et al. Initial Upper Palaeolithic Homo sapiens from Bacho Kiro Cave, Bulgaria. Nature 581, 299–302 (2020).
Fewlass, H. et al. A 14C chronology for the Middle to Upper Palaeolithic transition at Bacho Kiro Cave, Bulgaria. Nat. Ecol. Evol. 4, 794–801 (2020).
Higham, T. et al. The earliest evidence for anatomically modern humans in northwestern Europe. Nature 479, 521–524 (2011).
Benazzi, S. et al. Early dispersal of modern humans in Europe and implications for Neanderthal behaviour. Nature 479, 525–528 (2011).
Hublin, J.-J. The modern human colonization of western Eurasia: when and where? Quat. Sci. Rev. 118, 194–210 (2015).
Higham, T. et al. The timing and spatiotemporal patterning of Neanderthal disappearance. Nature 512, 306–309 (2014).
Fu, Q. et al. An early modern human from Romania with a recent Neanderthal ancestor. Nature 524, 216–219 (2015).
Fu, Q. et al. Genome sequence of a 45,000-year-old modern human from western Siberia. Nature 514, 445–449 (2014).
Prüfer, K. et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505, 43–49 (2014).
Prüfer, K. et al. A high-coverage Neandertal genome from Vindija Cave in Croatia. Science 358, 655–658 (2017).
Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328, 710–722 (2010).
Benazzi, S. et al. The makers of the Protoaurignacian and implications for Neandertal extinction. Science 348, 793–796 (2015).
Yang, M. A. et al. 40,000-year-old individual from Asia provides insight into early population structure in Eurasia. Curr. Biol. 27, 3202–3208.e9 (2017).
Trinkaus, E. et al. An early modern human from the Peştera cu Oase, Romania. Proc. Natl Acad. Sci. USA 100, 11231–11236 (2003).
Kuhn, S. L. & Zwyns, N. Rethinking the initial Upper Paleolithic. Quat. Int. 347, 29–38 (2014).
Zwyns, N. et al. The northern route for human dispersal in central and northeast Asia: new evidence from the site of Tolbor-16, Mongolia. Sci. Rep. 9, 11759 (2019).
Korlević, P. et al. Reducing microbial and human contamination in DNA extractions from ancient bones and teeth. Biotechniques 59, 87–93 (2015).
Fu, Q. et al. DNA analysis of an early modern human from Tianyuan Cave, China. Proc. Natl Acad. Sci. USA 110, 2223–2227 (2013).
Haak, W. et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211 (2015).
Briggs, A. W. et al. Patterns of damage in genomic DNA sequences from a Neandertal. Proc. Natl Acad. Sci. USA 104, 14616–14621 (2007).
Meyer, M. et al. Nuclear DNA sequences from the Middle Pleistocene Sima de los Huesos hominins. Nature 531, 504–507 (2016).
Peyrégne, S. & Peter, B. M. AuthentiCT: a model of ancient DNA damage to estimate the proportion of present-day DNA contamination. Genome Biol. 21, 246 (2020).
Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: analysis of next generation sequencing data. BMC Bioinformatics 15, 356 (2014).
Mittnik, A. et al. Kinship-based social inequality in Bronze Age Europe. Science 366, 731–734 (2019).
Petr, M. et al. The evolutionary history of Neanderthal and Denisovan Y chromosomes. Science 369, 1653–1656 (2020).
Poznik, G. D. et al. Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences. Nat. Genet. 48, 593–599 (2016).
Kutanan, W. et al. Contrasting paternal and maternal genetic histories of Thai and Lao populations. Mol. Biol. Evol. 36, 1490–1506 (2019).
Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
Fu, Q. et al. The genetic history of Ice Age Europe. Nature 534, 200–205 (2016).
Seguin-Orlando, A. et al. Genomic structure in Europeans dating back at least 36,200 years. Science 346, 1113–1118 (2014).
Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).
Lazaridis, I. et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513, 409–413 (2014).
Lazaridis, I. et al. Genomic insights into the origin of farming in the ancient Near East. Nature 536, 419–424 (2016).
Sikora, M. et al. The population history of northeastern Siberia since the Pleistocene. Nature 570, 182–188 (2019).
Rasmussen, M. et al. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature 463, 757–762 (2010).
Moreno-Mayar, J. V. et al. Terminal Pleistocene Alaskan genome reveals first founding population of Native Americans. Nature 553, 203–207 (2018).
Moreno-Mayar, J. V. et al. Early human dispersals within the Americas. Science 362, eaav2621 (2018).
Rasmussen, M. et al. The genome of a Late Pleistocene human from a Clovis burial site in western Montana. Nature 506, 225–229 (2014).
Rasmussen, M. et al. The ancestry and affiliations of Kennewick Man. Nature 523, 455–458 (2015).
Sikora, M. et al. Ancient genomes show social and reproductive behavior of early Upper Paleolithic foragers. Science 358, 659–662 (2017).
Petr, M., Pääbo, S., Kelso, J. & Vernot, B. Limits of long-term selection against Neandertal introgression. Proc. Natl Acad. Sci. USA 116, 1639–1644 (2019).
Mafessoni, F. et al. A high-coverage Neandertal genome from Chagyrskaya Cave. Proc. Natl Acad. Sci. USA 117, 15132–15136 (2020).
Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012).
Peter, B. M. 100,000 years of gene flow between Neandertals and Denisovans in the Altai mountains. Preprint at https://doi.org/10.1101/2020.03.13.990523 (2020).
Vernot, B. et al. Excavating Neandertal and Denisovan DNA from the genomes of Melanesian individuals. Science 352, 235–239 (2016).
Sankararaman, S., Mallick, S., Patterson, N. & Reich, D. The combined landscape of Denisovan and Neanderthal ancestry in present-day humans. Curr. Biol. 26, 1241–1247 (2016).
Prüfer, K. et al. A genome sequence from a modern human skull over 45,000 years old from Zlatý kůň in Czechia. Nat. Ecol. Evol. https://doi.org/10.1038/s41559-021-01443-x (2021).
R Core Team. R: A Language and Environment for Statistical Computing http://www.R-project.org/ (R Foundation for Statistical Computing, 2013).
Gansauge, M.-T., Aximu-Petri, A., Nagel, S. & Meyer, M. Manual and automated preparation of single-stranded DNA libraries for the sequencing of DNA from ancient biological remains and other sources of highly degraded DNA. Nat. Protocols 15, 2279–2300 (2020).
Kircher, M., Sawyer, S. & Meyer, M. Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 40, e3 (2012).
Renaud, G., Stenzel, U. & Kelso, J. leeHom: adaptor trimming and merging for Illumina sequencing reads. Nucleic Acids Res. 42, e141 (2014).
Li, H. & Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 26, 589–595 (2010).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Raghavan, M. et al. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature 505, 87–91 (2014).
Olalde, I. et al. Derived immune and ancestral pigmentation alleles in a 7,000-year-old Mesolithic European. Nature 507, 225–228 (2014).
Gallego Llorente, M. et al. Ancient Ethiopian genome reveals extensive Eurasian admixture throughout the African continent. Science 350, 820–822 (2015).
Keller, A. et al. New insights into the Tyrolean Iceman’s origin and phenotype as inferred by whole-genome sequencing. Nat. Commun. 3, 698 (2012).
Gamba, C. et al. Genome flux and stasis in a five millennium transect of European prehistory. Nat. Commun. 5, 5257 (2014).
Jones, E. R. et al. Upper Palaeolithic genomes reveal deep roots of modern Eurasians. Nat. Commun. 6, 8912 (2015).
Hajdinjak, M. et al. Reconstructing the genetic history of late Neanderthals. Nature 555, 652–656 (2018).
Petr, M., Vernot, B. & Kelso, J. admixr-R package for reproducible analyses using ADMIXTOOLS. Bioinformatics 35, 3194–3195 (2019).
We thank A. Weihmann and B. Schellbach for their help with DNA sequencing; R. Barr, P. Korlević and S. Tüpke for help with graphics; D. Reich and M. Slatkin for discussions and input; the Tourist Association STD “Bacho Kiro” in Dryanovo; the History Museum in Dryanovo; the Regional History Museum in Gabrovo; the National Museum of Natural History (NMNH) in Sofia; and N. Spassov. M.H. is supported by a Marie Skłodowska Curie Individual Fellowship (no. 844014). Q.F. was supported by the Strategic Priority Research Program (B) (XDB26000000) of CAS, NSFC (41925009, 41630102, 41672021). O.T.M. and S.C. were supported by a grant from the Ministry of Research and Innovation, CNCS - UEFISCDI, project number PN-III-P4-ID-PCCF-2016-0016, within PNCDI III and the EEA Grants 2014-2021, under Project contract no. 3/2019. F.W. received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 948365). P.S. was supported by the Vallee Foundation, the European Research Council (grant no. 852558), the Wellcome Trust (217223/Z/19/Z) and Francis Crick Institute core funding (FC001595) from Cancer Research UK, the UK Medical Research Council and the Wellcome Trust. This study was funded by the Max Planck Society and the European Research Council (grant agreement no. 694707 to S.P.).
Open access funding provided by Max Planck Society.
The authors declare no competing interests.
Peer review information Nature thanks Carles Lalueza-Fox, Marie Soressi and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Reviewer reports are available.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 C-to-T substitution frequencies at the beginning and end of nuclear alignments for the merged libraries of the Bacho Kiro Cave and Oase1 specimens.
Only fragments of at least 35 bp that mapped to the human reference genome with a mapping quality of at least 25 (MQ ≥ 25) were used for this analysis. Solid lines depict all fragments and dashed lines the fragments that have a C-to-T substitution at the opposing end (conditional substitutions).
Extended Data Fig. 2 Sex determination, pairwise mismatch rate between specimens and principal component analyses (PCAs).
a, Sex determination for Bacho Kiro Cave specimens. Only fragments that showed C-to-T substitutions in the first three and/or last three positions and overlapping 2200k Panel SNPs were used for this analysis (for the number of deaminated fragments per specimen, see Supplementary Table 2.8). The expected ratios of X to (X + autosomal) fragments for a female and a male individual are depicted as dashed lines, and circles correspond to the calculated values for each of the Bacho Kiro Cave specimens. Whiskers correspond to 95% binomial confidence intervals. b, Pairwise mismatch rate between different libraries from the same specimen (intra-specimen), between different Bacho Kiro Cave specimens (inter-specimen) and between other ancient modern humans older than 30,000 cal. bp. The boxplots were drawn using the summary statistics geom_stat from the R-package ggplot; lower and upper hinges, first and third quartiles; whiskers, maximum value of 1.5× the interquartile range; centre line, median. SNPs across all autosomes of the 2200k Panel were used for the calculations (number of SNPs (nsnps) = 2,056,573). c, A PCA of 2,970 present-day humans genotyped on 597,573 SNPs with 22 ancient individuals older than 30,000 cal. bp projected onto the plane. d, A PCA of 1,444 present-day Eurasian and Native American individuals genotyped on 597,573 SNPs with 22 ancient individuals older than 30,000 cal. bp projected onto the plane. c, d, Grey dots denote present-day human genomes.
Extended Data Fig. 3 Heatmaps of outgroup f3-statistics corresponding to the amount of shared genetic drift between individuals and/or populations.
a, Genetic clustering of ancient individuals, including the IUP Bacho Kiro Cave, BK1653 and Oase1 individuals based on the amount of shared genetic drift and calculated as f3(ancient1, ancient2; Mbuti). Lighter colours in this panel indicate higher f3 values and correspond to higher shared genetic drift (nsnps = 2,056,573). b, c, Shared genetic drift between the approximately 35,000-year-old BK1653 (b; nsnps = 825,379) or approximately 38,000-year-old Kostenki14 individuals29,30 (c; nsnps = 1,676,430) and present-day human populations from the SGDP31 calculated as f3(Bacho Kiro BK1653/Kostenki14, present-day humans; Mbuti). Three Mbuti individuals from the same panel31 were used as an outgroup. Higher f3 values47 are indicated with warmer colours and correspond to higher shared genetic drift. Plotted f3 values were calculated using ADMIXTOOLS28 as implemented in admixr61. Coordinates for present-day humans were previously published31. The heatmap scale is consistent with those in Fig. 2a, Supplementary Figs. 5.1, 5.2.
a, In contrast to the IUP Bacho Kiro Cave individuals, individual BK1653 is significantly closer to the approximately 38,000-year-old Kostenki14 individual29,30 than to present-day non-African populations from Central Asia and Siberia, East Asia, South Asia, Oceania or the Americas, as calculated by D(Kostenki14, present-day humans; BachoKiro BK1653, Mbuti). D values for each comparison, plotted as barplots, were calculated using ADMIXTOOLS28 as implemented in admixr61. Present-day human genomes from the SGDP31 were used in these statistics, and three Mbuti individuals from the same panel were used as an outgroup. **|Z| ≥ 3, *|Z| ≥ 2. b, c, BK1653 shares significantly more alleles with the approximately 35,000-year-old GoyetQ116-129 (b) and approximately 31,000-year-old Vestonice1629 individuals (c) than with most other ancient modern humans. D values calculated as in a. Filled circles correspond to |Z| ≥ 3, and open circles indicate a |Z|<3 (not significant). Error bars in all panels show s.e. calculated using a weighted block jackknife28 across all autosomes on the 2200k Panel (nsnps (Bacho Kiro BK1653) = 825,379) and a block size of 5 Mb.
Extended Data Fig. 5 D(Kostenki14, X; IUP Bacho Kiro Cave individuals/Ust’Ishim/Oase1, Mbuti) where X is a present-day non-African population from Central Asia and Siberia, East Asia, South Asia, Oceania or the Americas.
D values for each comparison, plotted as barplots, were calculated using ADMIXTOOLS28 as implemented in admixr61. Present-day human genomes from the SGDP31 were used in these statistics, and three Mbuti individuals from the same panel were used as an outgroup. **|Z| ≥ 3, *|Z| ≥ 2. Error bars denote s.e. calculated using a weighted block jackknife28 across all autosomes on the 2200k Panel and a block size of 5 Mb. a, A pool of three IUP Bacho Kiro Cave individuals (nsnps = 1,813,821). b, Ust’Ishim (nsnps = 1,951,462). c, Oase1 (nsnps = 402,526).
a, The proportion of Neanderthal ancestry in Bacho Kiro Cave individuals and other ancient and present-day modern humans calculated with a direct f4 ratio that takes advantage of the two high-coverage Neanderthal genomes9,10,41. f4 ratio (alpha) values calculated using ADMIXTOOLS28 as implemented in admixr61. b, c, Neanderthals9,10,42 share significantly more derived alleles with the IUP Bacho Kiro Cave individuals than with most present-day31 (b) or ancient modern humans (c). D values calculated using ADMIXTOOLS28 as implemented in admixr61. Filled circles correspond to |Z| ≥ 3; open circles indicate |Z| < 3 (not significant). Error bars in all panels show s.e. calculated using a weighted block jackknife28 across all autosomes on the 2200k Panel (nsnps = 2,056,573) and a block size of 5 Mb.
Extended Data Fig. 7 Segments of Neanderthal ancestry and estimates of the number of generations since the most recent Neanderthal ancestor.
a, A combined plot of the inferred Neanderthal segments in the genomes of the IUP Bacho Kiro Cave, BK1653 and Oase1 individuals, including chromosome X, using a hidden Markov model approach (admixfrog)44. b, Maximum likelihood estimates (dashed red lines) of the number of generations since a recent additional Neanderthal introgression into the IUP Bacho Kiro Cave and Oase1 individuals. Dashed black lines show 95% confidence intervals.
Extended Data Fig. 8 Spatial distribution of Neanderthal DNA in the Ust’Ishim, Tianyuan and Kostenki14 genomes.
Segments corresponding to Neanderthal ancestry inferred using a hidden Markov model approach (admixfrog)44 longer than 0.2 cM are indicated in blue. Centromeres are indicated in black. Pie charts indicate the total amount of Neanderthal DNA identified in each genome.
This file contains Supplementary Information Sections 1-10 with Supplementary Tables S1.1-S1.2, S2.1-S2.10, S4.1, S5.1-S5.15, S7.1, S8.1, S9.1, Supplementary Figures S2.1-S2.2, S5.1-S5.14, S6.1-S6.8, S7.1-S7.17, S8.1-S8.6, S9.1-S9.4 and Supplementary References for each section. See contents page of the Supplementary Information file for more details.
About this article
Cite this article
Hajdinjak, M., Mafessoni, F., Skov, L. et al. Initial Upper Palaeolithic humans in Europe had recent Neanderthal ancestry. Nature 592, 253–257 (2021). https://doi.org/10.1038/s41586-021-03335-3
Nature Ecology & Evolution (2021)
Nature Ecology & Evolution (2021)