Grey wolf genomic history reveals a dual ancestry of dogs

Bergström, Anders; Stanton, David W. G.; Taron, Ulrike H.; Frantz, Laurent; Sinding, Mikkel-Holger S.; Ersmark, Erik; Pfrengle, Saskia; Cassatt-Johnstone, Molly; Lebrasseur, Ophélie; Girdland-Flink, Linus; Fernandes, Daniel M.; Ollivier, Morgane; Speidel, Leo; Gopalakrishnan, Shyam; Westbury, Michael V.; Ramos-Madrigal, Jazmin; Feuerborn, Tatiana R.; Reiter, Ella; Gretzinger, Joscha; Münzel, Susanne C.; Swali, Pooja; Conard, Nicholas J.; Carøe, Christian; Haile, James; Linderholm, Anna; Androsov, Semyon; Barnes, Ian; Baumann, Chris; Benecke, Norbert; Bocherens, Hervé; Brace, Selina; Carden, Ruth F.; Drucker, Dorothée G.; Fedorov, Sergey; Gasparik, Mihály; Germonpré, Mietje; Grigoriev, Semyon; Groves, Pam; Hertwig, Stefan T.; Ivanova, Varvara V.; Janssens, Luc; Jennings, Richard P.; Kasparov, Aleksei K.; Kirillova, Irina V.; Kurmaniyazov, Islam; Kuzmin, Yaroslav V.; Kosintsev, Pavel A.; Lázničková-Galetová, Martina; Leduc, Charlotte; Nikolskiy, Pavel; Nussbaumer, Marc; O’Drisceoil, Cóilín; Orlando, Ludovic; Outram, Alan; Pavlova, Elena Y.; Perri, Angela R.; Pilot, Małgorzata; Pitulko, Vladimir V.; Plotnikov, Valerii V.; Protopopov, Albert V.; Rehazek, André; Sablin, Mikhail; Seguin-Orlando, Andaine; Storå, Jan; Verjux, Christian; Zaibert, Victor F.; Zazula, Grant; Crombé, Philippe; Hansen, Anders J.; Willerslev, Eske; Leonard, Jennifer A.; Götherström, Anders; Pinhasi, Ron; Schuenemann, Verena J.; Hofreiter, Michael; Gilbert, M. Thomas P.; Shapiro, Beth; Larson, Greger; Krause, Johannes; Dalén, Love; Skoglund, Pontus

doi:10.1038/s41586-022-04824-9

Article
Open access
Published: 29 June 2022

Nature volume 607, pages 313–320 (2022)

Abstract

The grey wolf (Canis lupus) was the first species to give rise to a domestic population, and they remained widespread throughout the last Ice Age when many other large mammal species went extinct. Little is known, however, about the history and possible extinction of past wolf populations or when and where the wolf progenitors of the present-day dog lineage (Canis familiaris) lived^{1,2,3,4,5,6,7,8}. Here we analysed 72 ancient wolf genomes spanning the last 100,000 years from Europe, Siberia and North America. We found that wolf populations were highly connected throughout the Late Pleistocene, with levels of differentiation an order of magnitude lower than they are today. This population connectivity allowed us to detect natural selection across the time series, including rapid fixation of mutations in the gene IFT88 40,000–30,000 years ago. We show that dogs are overall more closely related to ancient wolves from eastern Eurasia than to those from western Eurasia, suggesting a domestication process in the east. However, we also found that dogs in the Near East and Africa derive up to half of their ancestry from a distinct population related to modern southwest Eurasian wolves, reflecting either an independent domestication process or admixture from local wolves. None of the analysed ancient wolf genomes is a direct match for either of these dog ancestries, meaning that the exact progenitor populations remain to be located.

Main

The grey wolf (Canis lupus) has been present across most of the northern hemisphere for the last few hundred thousand years and, unlike many other large mammals, did not go extinct in the Late Pleistocene. Studies of present-day genomes have found that current population structure formed mostly in the last ~30,000–20,000 years^9,10,11, or roughly since the Last Glacial Maximum (LGM; ~28–23 thousand years ago (ka)¹²). Siberian wolves predating the LGM have ancestries that are largely basal to present-day diversity, which has led to suggestions that many pre-LGM wolf lineages went extinct^13,14. Among the central questions is thus to what extent the global wolf population was subject to extinction processes or responded to climate change with new adaptations.

While it is clear that grey wolves gave rise to dogs, there is no consensus regarding when, where and how this happened^{1,2,3,4,5,6,7,8}. Skeletal remains attributable to the present-day dog lineage appear archaeologically by 14 ka¹⁵, and genetic estimates of when the ancestors of dogs and modern wolves diverged range from 40–14 ka^9,13,16. However, genetic data from modern and ancient dogs coupled with modern wolves, to which previous studies were largely restricted, may not be able to resolve the origin of dogs. Genetic diversity within dogs is affected by their dynamic history and is unable to confidently pinpoint an origin. Relationships to modern wolves can likewise be affected by local extinction and gene flow since domestication^6,9. Regions where early dogs have been found do not necessarily imply places of origin either, as the existence of earlier dogs elsewhere cannot be excluded. Instead, the origin of dogs could be resolved if wolf genetic diversity across space and time was exhaustively characterized and it could be determined which populations were closest to the ancestors of dogs.

Wolf genomes spanning 100,000 years

We sequenced 66 new ancient wolf genomes from Europe, Siberia and north-western North America to a median of 1× coverage (range, 0.02–13×) (Fig. 1a,b), incorporated five previously sequenced ancient wolf genomes^14,17 and increased coverage for one¹³. We also sequenced an ancient dhole genome from the Caucasus, contextually dated to >70 ka, to serve as an outgroup. Fractions of X-chromosome DNA showed that 69% of the wolves were male (95% confidence interval (CI), 57–80%; P = 0.0013, binomial test), mirroring male over-representation among ancient genomes from woolly mammoths¹⁸, bison¹⁹, brown bears¹⁹ and domestic dogs⁸. For wolves without dates or with dates beyond the radiocarbon limit of ~50 ka, we estimated ages through mitochondrial tip dating²⁰ and obtained an average 95% CI of 21,573 years and an average prediction error of 5,133 years (Supplementary Figs. 1 and 2). We merged single-nucleotide polymorphism (SNP) genotypes called from these genomes with those from worldwide modern wolves (n = 68), modern (n = 369) and ancient (n = 33) dogs, and other canid species (Methods). The total dataset spans the last 100,000 years (Fig. 1b).
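
The sex-ratio test reported above can be illustrated with a minimal sketch of an exact two-sided binomial test against an expected male fraction of 0.5. The counts used here (50 males among 72 wolves) are hypothetical values chosen only to give roughly 69%; the actual number of sexed individuals is not stated in this passage, and the function name is illustrative.

```python
from math import comb

def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial P value: the summed probability of every
    outcome that is no more likely than the observed count k."""
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= threshold))

# Hypothetical counts (assumption): 50 males among 72 wolves, about 69%.
n_wolves = 72
n_males = 50
p_value = binomial_two_sided_p(n_males, n_wolves)
print(f"male fraction = {n_males / n_wolves:.2f}, two-sided P = {p_value:.4f}")
```

Summing over all outcomes at most as probable as the observed one is one common definition of a two-sided exact test; other definitions (for example, doubling the one-sided tail) give the same answer when the null proportion is 0.5.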

**Fig. 1: Seventy-two ancient wolf genomes.**

In a principal component analysis (PCA) on a matrix of shared genetic drift, the ancient wolves clustered strongly by age and not by geography (Pearson’s r_{PC1,sample age} = 0.85, P = 5 × 10⁻²¹) (Fig. 1c). Similarly, ancient wolves share more drift with modern wolves the younger they are (Extended Data Fig. 1a and Supplementary Fig. 3). Previous studies have suggested an LGM ancestry turnover^13,14,21, and, indeed, we found that all individuals younger than the LGM (that is, postdating 23 ka) were more similar to each other than to wolves predating ~28 ka (Extended Data Fig. 1b). However, the same pattern is also visible when contrasting affinities to younger versus older wolves at any point during the last 100,000 years (Supplementary Fig. 4). Using simulations, we confirmed that the observed temporal relationships are largely similar to what would be expected in a panmictic population (Supplementary Fig. 5). A long-standing process of ancestry homogenization due to connectivity thus seems to have driven Pleistocene wolf relationships. The changes during the LGM therefore represent not a shift in long-term population dynamics, but the most recent manifestation of this process.
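
As an illustration of the correlation between the first principal component and sample age, the following sketch computes Pearson's r from paired values. The PC1 coordinates and ages are made-up numbers, not data from the study, and the function name is illustrative.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)

# Hypothetical values (assumption): sample ages in ka and PC1 coordinates.
sample_age_ka = [95, 70, 50, 35, 20, 10, 0]
pc1 = [0.31, 0.22, 0.15, 0.05, -0.10, -0.18, -0.30]
r = pearson_r(pc1, sample_age_ka)
print(f"r(PC1, sample age) = {r:.2f}")
```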

Siberia as a source of global gene flow

We next tested for directionality in the gene flow that connected wolf ancestry over time. Analyses using f₄-statistics showed that all wolves postdating 23 ka are more similar to Siberian wolves than to European or Central Asian wolves from ~30 ka (Extended Data Fig. 1c and Supplementary Fig. 6). This suggests that Siberian-related ancestry expanded into Europe, in line with mitochondrial evidence²¹. The same dynamic of Siberian gene flow into Europe unfolded between 50 and 35 ka (Supplementary Fig. 6). We found that an admixture graph model with recurrent, unidirectional gene flow from Siberia into Europe could explain these relationships (Fig. 2a and Supplementary Fig. 8). Although we could not distinguish pulse-like from continuous gene flow, our results suggest that Siberia acted as a source and Europe as a sink for migration throughout the Late Pleistocene and show no evidence of gene flow in the other direction (Extended Data Fig. 1d and Supplementary Fig. 7).
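
For readers unfamiliar with f₄-statistics, the sketch below shows the basic quantity: the average over SNPs of the product of allele frequency differences, (a − b)(c − d), for four populations A, B, C and D. A value near zero is consistent with A and B forming a clade relative to C and D. The standard error here comes from a simple delete-one block jackknife over equal-sized contiguous SNP blocks; this is a simplification, as published analyses typically use weighted blocks defined by genomic distance. All frequencies are hypothetical, and the function names are illustrative rather than those of any software used in the study.

```python
from math import sqrt

def f4_per_snp(pa, pb, pc, pd):
    """Per-SNP terms (a - b) * (c - d) from four allele-frequency lists."""
    return [(a - b) * (c - d) for a, b, c, d in zip(pa, pb, pc, pd)]

def f4_with_jackknife(pa, pb, pc, pd, n_blocks=5):
    """Return (f4 estimate, jackknife standard error).

    SNPs are split into n_blocks contiguous blocks of equal size;
    leftover SNPs at the end are dropped to keep the blocks equal.
    """
    terms = f4_per_snp(pa, pb, pc, pd)
    block_size = len(terms) // n_blocks
    if block_size == 0:
        raise ValueError("fewer SNPs than blocks")
    terms = terms[: block_size * n_blocks]
    n = len(terms)
    total = sum(terms)
    estimate = total / n
    # Leave-one-block-out estimates.
    loo = []
    for b in range(n_blocks):
        block_sum = sum(terms[b * block_size : (b + 1) * block_size])
        loo.append((total - block_sum) / (n - block_size))
    mean_loo = sum(loo) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((v - mean_loo) ** 2 for v in loo)
    return estimate, sqrt(variance)

# Hypothetical allele frequencies at 10 SNPs (assumption, not study data).
outgroup = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0]
test_wolf = [0.2, 0.8, 0.1, 0.4, 0.9, 0.3, 0.7, 0.1, 0.5, 0.6]
siberia = [0.3, 0.7, 0.2, 0.5, 0.8, 0.4, 0.6, 0.2, 0.6, 0.5]
europe = [0.1, 0.9, 0.4, 0.2, 0.6, 0.1, 0.9, 0.4, 0.3, 0.8]
estimate, se = f4_with_jackknife(outgroup, test_wolf, europe, siberia)
print(f"f4 = {estimate:.4f}, SE = {se:.4f}")
```

In practice the estimate is divided by its standard error to give a Z score, and the sign is interpreted according to the order in which the four populations are given.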

**Fig. 2: One hundred thousand years of wolf population history.**

While these results demonstrate pervasive gene flow, they also show that the ancestry replacements were incomplete and that minority fractions of deep European ancestry have persisted until the present day (Fig. 2a,b). Most analysed modern Eurasian wolves probably retain local Pleistocene ancestry, as they are best modelled by qpAdm as having 10–40% ancestry that is more divergent than the oldest Siberian wolves in this study at ~100 ka (Supplementary Figs. 11 and 12). In addition to local grey wolf ancestry not represented among our ancient genomes, this may include African golden wolf-related ancestry in the Near East and South Asia²² and ancestry of unknown canid origin in Tibet²³ (Supplementary Fig. 10). While all Eurasian wolves today share the majority of their ancestry within the last 25,000 years, the persistence of deep local ancestries provides evidence against widespread local extinction in Late Pleistocene Eurasia and suggests that the species as a whole, unlike many other megafauna, did not come close to extinction.

Many modern and ancient North American wolves show evidence of coyote (Canis latrans) admixture^24,25 (Extended Data Fig. 1e), which explains why some of them do not cluster with wolves of similar age in the PCA (Fig. 1c). On the basis of coalescence rates²⁶ between male X chromosomes, which have perfect haplotype phase, we estimated that wolves and coyotes began diverging ~700 ka (Supplementary Fig. 14), broadly in line with a fossil divergence of ~1 million years ago²⁷. Our data show that coyote admixture has occurred at least since 100–80 ka, and two analysed Pleistocene wolves from the Yukon also carried coyote mitochondrial lineages. These findings imply that either the Pleistocene range of coyotes extended further north than currently thought or that admixture occurring further south propagated northwards through the wolf population. In our Eurasian wolves, no influx of coyote ancestry is observed over time (Extended Data Fig. 1e). We found a slight west–east gradient of increasing coyote affinity among Eurasian wolves, but this pattern probably reflects admixture into coyotes from North American wolves (which are related to wolves in eastern Siberia) (Supplementary Fig. 9).

After accounting for coyote admixture, we found that wolf ancestry in Alaska and the Yukon was highly connected to Siberia over time (Fig. 2a). This mirrors European wolf history, but, while some deep local European ancestry persists, no deep North American ancestry appears to persist to the present. The Bering land bridge probably allowed for an influx of Siberian wolves into Alaska intermittently between 70 and 11 ka^28,29, but we found no evidence of gene flow in the other direction. All present-day North American wolves can be modelled as having 10–20% coyote ancestry and the remaining ancestry from Siberian wolves younger than ~23 ka, with no contribution from earlier North American wolves (Fig. 2b). We found that red and Algonquin wolves similarly fit as shifted towards coyotes along this two-source admixture cline^11,25, but we cannot rule out greater complexity in their history. While genomic data alone cannot establish an absence of grey wolves at any particular time, our results are consistent with local extinction in North America, for example during the LGM when ice sheets covered the northern half of the continent³⁰, or, alternatively, an absence of grey wolves south of the ice sheets until after the ice retreated.

High connectivity in the Pleistocene

To understand how differentiated past wolf populations were, we calculated the proportion of genetic variation between rather than within (pairwise F_ST; ref. ³¹) sets of wolves grouped in space and time. Before the LGM, differentiation even between distant regions was low (F_ST < 3%) (Fig. 2c). Early European and North American populations were thus neither very different from each other nor from the Siberian-related wolves that over time replaced much of their ancestry. We also estimated X-chromosome coalescence rates²⁶, which suggested that any two Pleistocene wolves shared ancestry within ~10,000 years of the date of the older wolf (Fig. 2d and Supplementary Fig. 15). Pervasive gene flow thus prevented deep divergences among wolf populations in the Late Pleistocene.
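
To make the F_ST comparison concrete, the sketch below computes pairwise F_ST from population allele frequencies using Hudson's estimator in its ratio-of-averages form. This choice is an assumption: the estimator cited in the text (ref. 31) may differ, and the finite-sample correction terms are omitted here for brevity. The allele frequencies are invented to contrast a weakly and a strongly differentiated pair of populations.

```python
def hudson_fst(freqs_1, freqs_2):
    """Ratio-of-averages Hudson F_ST from two lists of allele frequencies.

    Numerator per SNP:   (p1 - p2)^2
    Denominator per SNP: p1 * (1 - p2) + p2 * (1 - p1)
    The finite-sample correction is deliberately left out.
    """
    numerator = sum((p1 - p2) ** 2 for p1, p2 in zip(freqs_1, freqs_2))
    denominator = sum(p1 * (1 - p2) + p2 * (1 - p1)
                      for p1, p2 in zip(freqs_1, freqs_2))
    if denominator == 0:
        raise ValueError("no variable sites: F_ST is undefined")
    return numerator / denominator

# Hypothetical frequencies (assumption): two similar populations ...
pop_a_old = [0.50, 0.30, 0.80, 0.10]
pop_b_old = [0.52, 0.28, 0.77, 0.12]
# ... and two strongly differentiated populations.
pop_a_new = [0.90, 0.10, 0.60, 0.20]
pop_b_new = [0.30, 0.70, 0.10, 0.80]

fst_low = hudson_fst(pop_a_old, pop_b_old)
fst_high = hudson_fst(pop_a_new, pop_b_new)
print(f"low differentiation:  F_ST = {fst_low:.3f}")
print(f"high differentiation: F_ST = {fst_high:.3f}")
```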

In the last ~10,000 years (the Holocene), population dynamics were different from those in the Pleistocene, with no evidence for further Siberian gene flow into Europe; instead, European-related ancestry spread eastwards and contributed to modern wolves in China and Siberia (Fig. 2b). Higher levels of differentiation today (F_ST of ~10–60%) probably largely reflect population bottlenecks following habitat encroachment and persecution by humans in the last few centuries^32,33, although there is some evidence for increasing differentiation already during the last 20,000 years (Fig. 2c). MSMC2 estimates from present-day genomes suggest widespread effective population size declines in this period (Supplementary Fig. 13), but we found no concurrent decline in individual heterozygosity (Fig. 1d). Combined, this evidence suggests that an overall reduction in gene flow, as shown by the F_ST results, rather than a species-wide population decline²¹ might have resulted in lower local effective population sizes.

Natural selection over 100,000 years

The strong connectivity observed among Late Pleistocene wolves raises the possibility of species-wide adaptation. Natural selection is typically inferred indirectly from present-day genetic variation, but our 100,000-year (~30,000 generations) dataset enables direct detection of selected alleles. Testing each variant for an association between allele frequency and time across 72 ancient and 68 modern wolves, and applying genomic control³⁴ to correct for allele frequency variance caused by genetic drift, we found 24 genomic regions with evidence for selection (Fig. 3a and Extended Data Table 1). We confirmed the robustness of our method to demographic history by applying it to data simulated in the absence of selection, finding no false positives (Fig. 3b and Supplementary Fig. 17).
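
The general idea of a time-series scan with genomic control can be sketched as follows. For each variant, a 1-degree-of-freedom trend statistic (here n·r², with r the correlation between allele state and sample age) measures association with time; the inflation factor λ is then estimated as the median statistic divided by the median of a χ² distribution with 1 degree of freedom, and all statistics are divided by λ. This is an assumed, simplified stand-in: the exact test used in the study is described in its Methods and may differ. Flooring λ at 1 is a common convention adopted here as an assumption, and the input data are invented.

```python
from math import erfc, sqrt
from statistics import median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 d.f.

def trend_chi2(alleles, ages):
    """n * r^2 trend statistic between allele state (0/1) and sample age."""
    n = len(alleles)
    mean_a = sum(alleles) / n
    mean_t = sum(ages) / n
    sxy = sum((a - mean_a) * (t - mean_t) for a, t in zip(alleles, ages))
    sxx = sum((a - mean_a) ** 2 for a in alleles)
    syy = sum((t - mean_t) ** 2 for t in ages)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic variant or identical ages: no signal
    return n * sxy * sxy / (sxx * syy)

def genomic_control(chi2_values):
    """Return (lambda, adjusted statistics, P values) after genomic control."""
    lam = max(1.0, median(chi2_values) / CHI2_1DF_MEDIAN)
    adjusted = [c / lam for c in chi2_values]
    # Upper tail of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    p_values = [erfc(sqrt(c / 2)) for c in adjusted]
    return lam, adjusted, p_values

# Hypothetical data (assumption): ages in ka and one allele per sample.
ages_ka = [90, 70, 50, 40, 30, 20, 10, 0]
variants = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # allele rises from absent to fixed over time
    [0, 1, 0, 1, 0, 1, 0, 1],  # no clear trend
    [1, 0, 1, 1, 0, 0, 1, 0],  # no clear trend
]
stats = [trend_chi2(v, ages_ka) for v in variants]
lam, adjusted, p_values = genomic_control(stats)
print(f"lambda = {lam:.2f}")
for s, p in zip(adjusted, p_values):
    print(f"adjusted chi2 = {s:.2f}, P = {p:.3g}")
```

Because genetic drift inflates the statistics across the whole genome, dividing by λ rescales them so that the bulk of variants behave as expected under the null, leaving only strong outliers as candidates for selection.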

**Fig. 3: Natural selection in the ancient wolf time series.**

The strongest signal was observed on chromosome 25, where variants closely overlapping the gene IFT88 rose rapidly from close to 0% to 100% in frequency 40–30 ka and are still fixed in wolves and dogs today (Fig. 3c). Genealogical inference on modern wolves^35,36 further showed that IFT88 had the youngest time to the most recent common ancestor (TMRCA) (~70,000 years) in the genome (Fig. 3d). Disruption of IFT88 leads to craniofacial development defects in mice and to cleft lip and palate in humans³⁷. If future fossil studies reveal rapid craniodental change in this time period, this could implicate the IFT88 sweep as a driver, potentially in response to prey availability changes. But it is also possible that selection targeted unknown non-skeletal traits associated with IFT88 variation. The second strongest signal in the genome was 2.5 Mb downstream of IFT88, where allele frequencies shifted in a similar timeframe 40–20 ka (Fig. 3c), but it is not clear whether this region could be involved in long-range regulation of IFT88.

Three regions with evidence for selection overlap olfactory receptor genes, with variants on chromosome 15 increasing in frequency from close to 0% to 100% 45–25 ka (Fig. 3c), suggesting that olfaction was a recurrent target of adaptation in wolves. Most of the detected selection episodes occurred before the divergence of dogs, and dogs share the selected alleles (Supplementary Fig. 18). However, variants in YME1L1 increased in frequency from <5% to 50–70% in wolves from 20–0 ka but are not observed in dogs. A region on chromosome 10, where variation among dogs is associated with body size, drop ears and other traits^38,39,40, was under recent selection in specific dog breeds⁴¹, and we found that it was also selected in wolves in the last 20,000 years. Although it was not detected in our selection scan, the K^B deletion that underlies black fur⁴² was identified in a 14,000-year-old wolf from Tumat, Siberia (Supplementary Fig. 19). This deletion probably introgressed into wolves from dogs in the Holocene⁴², but our result also raises the possibility that its ultimate origin could have been in wild Pleistocene wolves.

Dog ancestry has eastern wolf affinities

We found that dogs share more genetic drift with wolves that lived after 28 ka than with those that lived before this time, which implies that the progenitors of dogs were genetically connected to other wolves at least until 28 ka (Fig. 1c and Extended Data Fig. 1b). A divergence around this time is also consistent with our MSMC2 analyses of X chromosomes (Supplementary Fig. 16). However, until the nature of the divergence process is better understood, it cannot be ruled out that domestication had started before this point.

The geographical origin of the present-day dog lineage Canis familiaris has remained controversial. Genetic studies have argued that wolves in East Asia^1,2, Central Asia⁴, the Middle East⁶, Europe⁵, Siberia¹⁶, or both eastern and western Eurasia independently³, contributed ancestry to early dogs, whereas others have been consistent with a single, but geographically unknown, progenitor population^8,9. Given our finding that part of wolf population structure is older than the likely time of dog domestication, we can expect dogs to be genetically closer to some ancient wolves than to others. To reduce the effects of gene flow since the emergence of dogs, we performed a PCA on wolves and dogs from the last 25,000 years, based on f₄-statistics quantifying their relationships only to wolves living before 28 ka (that is, before the LGM), and found that dogs showed relationship profiles similar to those of Siberian wolves from 23–13 ka (Fig. 4a, Extended Data Fig. 2 and Methods). Direct f₄-tests also showed that dogs are closer to Siberian than to European wolves from this period (Fig. 4b and Extended Data Fig. 3). European wolves postdating 28 ka have an affinity to pre-LGM European wolves, reflecting the persistence of deep west Eurasian wolf ancestry (Fig. 2a). The absence of such western affinities in dogs suggests that they did not originate from the European wolf populations sampled here.

While the north-eastern Siberian wolves from 23–13 ka display the greatest overall affinity to dogs, we found that they were not the immediate ancestors of dogs. When a broad set of ancient wolves were tested as candidate sources using qpWave/qpAdm⁴³, all single-source models, including one using an 18,000-year-old Siberian wolf, were strongly rejected for all dogs studied (P < 1 × 10⁻⁶) (Methods and Fig. 4c). However, a model featuring the Siberian wolf and 10–20% ancestry from a component approximated by the outgroup dhole fit dogs such as the 9,500-year-old Siberian Zhokhov¹⁷ individual (P = 0.29) (Fig. 4c). Although it uses an outgroup species, this two-source model does not necessarily imply admixture from two distinct populations or species. Instead, it could reflect dogs being derived from some local wolf ancestry that is unsampled and to some extent divergent from the available ancient wolves (Extended Data Fig. 4). Validating this interpretation, we found that recent European wolves, which have a small degree of deep, local European ancestry (Fig. 2a), obtain results very similar to those for dogs, requiring 10–20% unsampled ancestry, if only Siberian wolves were available as sources (Supplementary Fig. 11 and Supplementary Information). We therefore interpret the results for dogs as similarly reflecting some unsampled wolf ancestry that is not fully represented by the ancient Siberian wolves sampled here. This unsampled ancestry appears to have retained a partial degree of differentiation from the sampled ancient wolves since before 100 ka (Supplementary Fig. 12), and our results imply that it probably lived outside the regions of Europe, north-eastern Siberia and North America sampled here.

The results obtained for the Zhokhov dog also applied to ancient dogs from Lake Baikal, North America and north-eastern Europe (a 10,900-year-old Karelian dog) and to modern New Guinea singing dogs. As a group, qpWave could fit these dogs as having originated from a single ‘stream’ of ancient wolf diversity, in an approach not requiring a proximate source (Extended Data Table 2). This result shows that ancient wolf genomes can circumvent the complexities of more recent processes, as the same models were rejected when modern wolves were used as sources instead (Extended Data Table 2), probably owing to gene flow from dogs into wolves⁸.

Recent admixture and population changes thus complicate analyses of modern wolves. Even so, if wolf population structure has not been completely reshaped since the time of dog domestication, it is possible that part of the ancestry of the dog progenitors could still be represented and detectable among wolves today, even though the past geographical location of that ancestry would be unknown. We tested this in two ways. First, we projected dogs onto a PCA plot constructed using modern wolf genotypes, and found that they projected closer to wolves from China, Mongolia and the Altai than to wolves from Yakutia (Extended Data Fig. 5). Second, we extended our qpAdm analyses to modern wolf sources, and found that some Chinese wolves provided better fits than the 18,000-year-old Siberian wolf and could serve as single sources of Zhokhov dog ancestry without the need for an unsampled ancestry component (Extended Data Fig. 6). These results could be taken to support an eastern or central Eurasian dog origin outside of north-eastern Siberia, but we cannot draw firm geographical conclusions in the absence of ancient wolf genomes from these and other candidate regions.

A second source for western dog ancestry

We extended our analyses to a global set of ancient and modern dogs, to test for any ancestry contributions from additional, genetically distinct wolf progenitors. The strongest evidence for multiple progenitors would be if some dogs had different affinities to wolves that predate domestication, as such wolves cannot be affected by dog gene flow. Applying this rationale, we found that ancient Near Eastern and present-day African dogs, and to a lesser degree European dogs, are shifted towards western Eurasian wolves in the f₄-statistics PCA based on relationships to wolves that predate the LGM (Fig. 4a). This cline recapitulates the primary axis of population structure within dogs (between ancient Near Eastern and eastern Eurasian dogs⁸) (Fig. 4b), even when wolves from the last 28,000 years are excluded (Supplementary Fig. 20). The dog ancestry cline thus at least in part reflects wolf ancestry differences that predate the likely domestication timeframe. Testing the PCA observations explicitly, qpWave strongly rejected a single wolf progenitor when including Near Eastern dogs (P < 10⁻⁴) (Extended Data Table 2). The best-fitting qpAdm models for these dogs instead involved a source related to ancient European wolves, in addition to the ancestry found in the Zhokhov dog (Fig. 4c).

To test whether the sampled ancient European wolves could be the actual source of this second component of dog ancestry, we tested qpAdm models featuring the Siberian Zhokhov dog as one source—representing the eastern-related dog ancestry—and an ancient European wolf as a second source. These models did not fit Near Eastern and African dogs unless a third, outgroup component was also included to represent unsampled, divergent ancestry (Supplementary Fig. 21), meaning that European wolves are not a match for the missing ancestry. Expanding to all post-LGM and present-day wolves, only present-day wolves from Syria, Israel, Iran and India achieved good fits (Extended Data Fig. 7). In line with a source from this part of the world, when projected onto present-day wolf structure, Near Eastern and African dogs are shifted towards Caucasian and Near Eastern rather than European wolves (Extended Data Fig. 5). Using a present-day Syrian wolf as a source, we estimated 56% (standard error, 10%) Near Eastern-related wolf ancestry in the earliest available dog (7.2 ka) from the Levant, 37% (standard error, 3.5%) in the African Basenji breed and 5–25% in Neolithic and later European dogs (Fig. 4d). While the evidence of dual ancestry is based on ancient wolves that predate domestication and are thus unaffected by potential later gene flow, these exact estimates could be inflated if there is dog admixture in the Syrian wolf.

Next, we exhaustively tested admixture graph models of dog relationships, allowing up to two admixture events among four dog populations and the Syrian wolf. We obtained results consistent with the qpAdm inferences, as a single graph featuring Syrian wolf admixture into early Near Eastern dogs fit the data (Fig. 4f), with a separate dog lineage giving rise to early Karelian and eastern dogs. In this graph, the Karelian dog is most closely related to the ‘eastern’ source that also contributed ancestry to the early Near Eastern dog.

The widespread ancestry asymmetries observed between wolves and dogs today have been interpreted as reflecting recent, local admixture^8,9. Our finding that dogs have variable proportions of two distinct components of wolf ancestry may provide a unifying explanation for many of these asymmetries. For example, previous studies have explained an affinity between Pleistocene Siberian wolves and Arctic dogs by suggesting admixture in the latter^13,17. The dual ancestry model can probably explain this asymmetry without such admixture, with the Arctic dogs instead having less of the western component (Supplementary Fig. 22). Conversely, higher levels of the western component in Near Eastern and African dogs probably explain at least part of their previously observed affinity to Near Eastern wolves^8,9,10. An observation that wolves in Xinjiang, central Asia, display no asymmetries to different dogs was interpreted as suggesting that other asymmetries are primarily due to dog-to-wolf gene flow⁸. Our results instead suggest that a balance of eastern and western wolf ancestries in central Asia (Fig. 2b) causes relative symmetry to the eastern and western dog ancestries. The Xinjiang wolves are thus not evidence against the dual ancestry model.

Conclusion

We show that wolf populations were genetically connected throughout the Late Pleistocene, probably because of the high mobility of wolves in an open landscape⁴⁴. The LGM did not necessarily correspond to an unprecedented time of change for the interconnected population of wolves, which might provide a clue to their perseverance when other northern Eurasian carnivores became extinct. Furthermore, the reason Pleistocene wolves appear basal to present-day diversity is not that they went extinct^13,14, but that continued gene flow homogenized later ancestry. Our finding that several selected alleles quickly reached fixation shows that adaptations spread to the whole population of Pleistocene wolves, a process that might have contributed to the survival of the species. At the same time, our results show that such rapid species-wide selective sweeps occurred only a few times over the last ~100,000 years.

Our results also provide insights into long-standing questions on the origin of dogs. First, dogs and present-day Eurasian wolves have been thought to be reciprocally monophyletic lineages⁹. We find that, overall, dogs are closer to eastern Eurasian wolves. Second, because no modern wolves are a good match for dog ancestry, the source population has been assumed to be extinct. Our results imply that this is not necessarily the case, as continued homogenization of wolf ancestry could have obscured earlier relationships to dogs. Third, it has been unclear whether more than one wolf population contributed to early and present-day dogs^3,7,8,9. We find that an eastern Eurasian-related source, ‘eastern dog progenitor’, appears to have contributed ~100% of the ancestry of early dogs in Siberia, the Americas, East Asia and north-eastern Europe. On top of this, a western Eurasian-related source, ‘western dog progenitor’, contributed 20–60% of the ancestry of early Near Eastern and African dogs and 5–25% of the ancestry of Neolithic and later European dogs. The western ancestry subsequently spread worldwide with, for example, the prehistoric expansion of agriculture in western Eurasia⁸ and the colonial era expansion of European dogs.

A previous study proposed that the earlier archaeological appearance of dogs in western and eastern Eurasia than in central Eurasia was due to independent domestication of western and eastern wolves, but that ancestry from the former was extinct or nearly extinct in present-day dogs³. Our results support the notion of two distinct ancestors of dogs but differ from this previous hypothesis. First, we demonstrate that ancestry from at least two wolf populations is extant and ubiquitous in modern dogs, and is the major determinant of dog population structure today. Second, we are able to reject Pleistocene European wolves related to those sampled here as a source for the C. familiaris lineage. Third, the previous study suggested that an Irish Neolithic dog had more ancestry from the western domestication than later dogs³, whereas we find that this dog had less ancestry from the western progenitor identified here than present-day European dogs (Fig. 4d). The lack of genomes from the earliest dogs in Europe, however, means that future studies may reveal them to have arisen from an independent domestication process that did not contribute substantially to later populations^3,45,46.

Our results are consistent with two scenarios: (1) independent domestication of the eastern and western progenitors that later merged in the west or (2) single domestication of the eastern progenitor, followed by admixture from western wolves as dogs arrived into southwestern Eurasia. Our results cannot distinguish between these scenarios, but, in either case, the merging or admixture must have occurred before 7.2 ka, the age of the oldest available Near Eastern dog genome⁸. A single domestication of the western progenitor followed by admixture from eastern wolves does not seem compatible with our results, as it would require replacement of 100% of the ancestry of eastern dogs. If dogs of 100% western progenitor ancestry were discovered, for example, in the earliest Near Eastern⁴⁷ or European¹⁵ contexts, this would imply independent domestication. Alternatively, the first dogs in the west could be of eastern progenitor ancestry, similar to the Karelian dog from 10.9 ka, in line with a single domestication process. Additional ancient wolf genomes, including from outside the regions covered here, where DNA often preserves less well, will also be necessary to further identify the wolf progenitors of dogs.

Methods

Sampling, DNA preparation and sequencing

Stockholm

Samples LOW002, LOW003, LOW006, LOW007, LOW008 and PON012 were processed at the Archaeological Research Laboratory at Stockholm University, Sweden, following methods previously described⁸. In brief, this involved extracting DNA by incubating the bone powder for 24 h at 37 °C in 1.5 ml of digestion buffer (0.45 M EDTA (pH 8.0) and 0.25 mg ml⁻¹ proteinase K), concentrating supernatant on Amicon Ultra-4 (30-kDa molecular weight cut-off (MWCO)) filter columns (MerckMillipore) and purifying on Qiagen MinElute columns. Double-stranded Illumina libraries were prepared using the protocol outlined in ref. ⁴⁸, with the inclusion of USER enzyme and the modifications described in ref. ⁴⁹.

Samples 367, PDM100, Taimyr-1 and Yana-1 were processed at the Swedish Museum of Natural History in Stockholm, Sweden, following previously described methods⁸. In brief, this involved extracting DNA using a silica-based method with concentration on Vivaspin filters (Sartorius), according to a protocol optimized for recovery of ancient DNA⁵⁰. Double-stranded Illumina libraries were prepared using the protocol outlined in ref. ⁴⁸, with the inclusion of USER enzyme.

Samples ALAS_024, VAL_033, ALAS_016, VAL_008, HMNH_007, HMNH_011, VAL_050, VAL_005, DS04, VAL_037, VAL_012, VAL_011, VAL_18A, IN18_016 and IN18_005 were processed at the Swedish Museum of Natural History in Stockholm, Sweden, following previously described methods for permafrost bone and tooth samples⁵¹. In brief, this involved DNA extraction using the methodology of ref. ⁵² and double-stranded Illumina library preparation as described in ref. ⁴⁸, with dual unique indexes and the inclusion of USER enzyme. Between eight and ten separate PCR reactions with unique indexes were carried out for each sample to maximize library complexity. The libraries were sequenced alongside samples HOV4, AL2242, AL2370, AL2893, AL3272 and AL3284 across three Illumina NovaSeq 6000 lanes with an S4 100-bp paired-end set-up at SciLifeLab in Stockholm.

Potsdam

Samples JAL48, JAL65, JAL69, JAL358, AH574, AH575 and AH577 were processed at the University of Potsdam. Pre-amplification steps (DNA extraction and library preparation) were conducted in separated laboratory rooms specially equipped for the processing of ancient DNA. Amplification and post-amplification steps were performed in different laboratory rooms. DNA was extracted from bone powder (29–54 mg) following a protocol specially adapted to recover short DNA fragments⁵². Single-stranded double-indexed libraries were built from 20 µl of DNA extract according to the protocol in ref. ⁵³. The libraries were sequenced on a HiSeq X platform at SciLifeLab in Stockholm.

Tübingen/Jena

Samples JK2174, JK2175, JK2179, JK2181, JK2183, TU144, TU148, TU839 and TU840 were processed at the University of Tübingen, with DNA extraction and pre-amplification steps undertaken in clean room facilities and post-amplification steps performed in a separate DNA laboratory. Both laboratories fulfil standards for work with ancient DNA^{54,55}. All surfaces of tooth and bone samples were initially UV irradiated for 30 min, to minimize the potential risk of modern DNA contamination. Subsequently, DNA was extracted by applying a well-established guanidine silica-based protocol for ancient samples⁵². Illumina sequencing libraries were prepared by using 20 µl of DNA extract per library⁴⁸; afterwards, dual barcodes (indexes) were chemically added to the prime ends of the libraries⁵⁶. For the samples from Auneau (TU839 and TU840), five sequencing libraries each were prepared; for all other samples processed in Tübingen, three sequencing libraries each were prepared. To detect potential contamination of the chemicals, negative controls were conducted for extraction and library preparation. After preparation of the sequencing libraries, DNA concentration was measured with qPCR (Roche LightCycler) using corresponding primers⁴⁸. The DNA concentration was given by the copy number of the DNA fragments in 1 µl of the sample.

Amplification of the indexed sequencing libraries was performed using Herculase II Fusion under the following conditions: 1× Herculase II buffer, 0.4 µM IS5 primer and 0.4 µM IS6 primer⁴⁸, Herculase II Fusion DNA polymerase (Agilent Technologies), 0.25 mM dNTPs (100 mM; 25 mM each dNTP) and 0.5–4 µl barcoded library as template in a total reaction volume of 100 µl. The applied amplification thermal profile was processed as follows: initial denaturation for 2 min at 95 °C; denaturation for 30 s at 95 °C, annealing for 30 s at 60 °C and elongation for 30 s at 72 °C for 3 to 20 cycles; and a final elongation step for 5 min at 72 °C. Thereafter, the amplified DNA was purified using a MinElute purification step and DNA was eluted in 20 µl TET. The concentration of the amplified DNA sequencing libraries was measured using a Bioanalyzer (Agilent Technologies) and a DNA1000 lab chip from Agilent Technologies.

The sequencing libraries were sequenced on an Illumina HiSeq 4000 platform at the Max Planck Institute for Science of Human History in Jena. The samples from Auneau (TU839 and TU840) were paired-end sequenced applying 2 × 50 + 8 + 8 cycles. All other libraries prepared in Tübingen were single-end sequenced using 75 + 8 + 8 cycles.

Oxford

Samples AL2657, AL2541, AL2741, AL2744, AL3185, AL2350, CH1109, AL2370, AL3272 and AL3284 were processed at the dedicated ancient DNA facility at the PalaeoBARN laboratory at the University of Oxford, following methods described previously⁸. In brief, double-stranded libraries were constructed following the protocol in ref. ⁴⁸. These libraries were sequenced on a HiSeq 2500 (AL2657, AL2541, AL2741, AL2744) or a HiSeq 4000 (AL3185, AL2350, CH1109) instrument at the Danish National Sequencing Center or on a NextSeq 550 instrument (AL2741) at the Natural History Museum of London. For samples AL2370, AL3272 and AL3284, between six and eight separate PCR reactions with unique indexes were carried out on their libraries and they were sequenced alongside samples HOV4, VAL_18A and IN18_016 on an Illumina NovaSeq 6000 lane with an S4 100-bp paired-end set-up at SciLifeLab in Stockholm.

Copenhagen

Samples CGG13, CGG17, CGG19, CGG20, CGG21, CGG25, CGG26, CGG27, CGG28, CGG34, Tumat1 and IRK were processed at the GLOBE Institute, University of Copenhagen. All pre-PCR work was performed in ancient DNA facilities following ancient DNA guidelines⁵⁷. The details of extraction, library construction and sequencing for the samples with CGG codes are described in ref. ²¹, in relation to the publication of mitochondrial data from these specimens. The Tumat1 sample was processed following the exact same protocol. In brief, DNA extraction was performed using a buffer containing urea, EDTA and proteinase K⁵⁰, double-stranded libraries were prepared with NEBNext DNA Sample Prep Master MixSet 2 (E6070S, New England Biolabs) and Illumina-specific adaptors⁴⁸, and sequencing was performed on an Illumina HiSeq 2500 platform using 100-bp single-read chemistry. For the IRK sample, DNA was extracted from three subsamples and purified as described in ref. ²¹. The three DNA extracts and the purified pre-digest of one subsample were incorporated into double-stranded libraries following the BEST protocol⁵⁸, with the modifications described in ref. ⁵⁹, and sequenced on a BGISEQ-500 platform using 100-bp single-read chemistry.

Santa Cruz

Samples SC19.MCJ017, SC19.MCJ015, SC19.MCJ010 and SC19.MCJ014 were processed at the UCSC Paleogenomics Lab and were provided by the Yukon Government Paleontology program. All pre-PCR work was performed in a dedicated ancient DNA facility at the University of California, Santa Cruz, following standard ancient DNA methods⁶⁰. Subsamples (250–350 mg) were sent to the UCI KECK AMS facility for radiocarbon dating, and the remaining amounts were powdered in a Retsch MM400 for extraction. For each sample, ~100 mg of powder was treated with a 0.5% sodium hypochlorite solution before extraction to remove surface contaminants⁶¹ and then combined with 1 ml lysis buffer for extraction, following the protocol in ref. ⁵². Samples were processed in parallel with a negative control. We quantified the extracts using a Qubit 1× dsDNA HS Assay kit (Q33231) before preparing libraries. We prepared single-stranded libraries following the protocol in ref. ⁶² and amplified the libraries for 9–16 cycles as informed by qPCR. After amplification, we cleaned the libraries using a 1.2× SPRI bead solution and pooled them to an equimolar ratio for in-house shallow quality-control sequencing on a NextSeq 550 paired-end 75-bp run. We then sent the libraries to Fulgent Genetics for deeper sequencing on two paired-end 150-bp lanes on a HiSeq X instrument.

Vienna

Sample HOV4 was processed at the Department of Anthropology, University of Vienna. The sample is a canine tooth, which after sequencing was determined to derive from a dhole (Cuon alpinus). DNA was extracted from its cementum using the methods described in ref. ⁶³ with a modified incubation time of ~18 h. The library was prepared according to the protocol in ref. ⁴⁸ with the modifications from ref. ⁶⁴. Five separate PCR reactions with unique indexes were carried out on the library and were sequenced alongside samples VAL_18A, IN18_016, AL2242, AL2370, AL2893, AL3272 and AL3284 on an Illumina NovaSeq 6000 lane with an S4 100-bp paired-end set-up at SciLifeLab in Stockholm.

An overview of all samples and their associated metadata is available in Supplementary Data 1.

Genome sequence data processing

For paired-end data, read pairs were merged and adaptors were trimmed using SeqPrep (https://github.com/jstjohn/SeqPrep), discarding reads that could not be successfully merged. Reads were mapped to the dog reference genome canFam3.1 using BWA aln (v.0.7.17)⁶⁵ with permissive parameters, including a disabled seed (-l 16500 -n 0.01 -o 2). Duplicates were removed by keeping only one read from any set of reads that had the same orientation, length and start and end coordinates. For sample Taimyr-1, previously published data¹³ were merged with newly generated data. Data from samples processed in Copenhagen were processed as described previously⁶⁶ except that they were also mapped to canFam3.1. Post-mortem damage was quantified using PMDtools (v0.60)⁶⁷ with the ‘--first’ and ‘--CpG’ arguments.
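As an illustration only, the duplicate-removal rule described above (same orientation, length and start and end coordinates) can be sketched in Python as follows. The Read record, its field names and the choice to keep the first read encountered are assumptions made for this sketch and are not taken from the pipeline used here.

```python
from typing import NamedTuple


class Read(NamedTuple):
    # Hypothetical minimal record for one mapped read (field names are illustrative).
    name: str
    chrom: str
    start: int     # leftmost aligned reference coordinate
    end: int       # rightmost aligned reference coordinate
    length: int    # read length in bases
    reverse: bool  # True if the read maps to the reverse strand


def remove_duplicates(reads):
    """Keep one read (here: the first seen) per (chrom, orientation, length, start, end)."""
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.reverse, read.length, read.start, read.end)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


reads = [
    Read("a", "chr1", 100, 149, 50, False),
    Read("b", "chr1", 100, 149, 50, False),  # duplicate of "a"
    Read("c", "chr1", 100, 149, 50, True),   # same span, opposite strand
    Read("d", "chr1", 100, 159, 60, False),  # same start, different end
]
print([r.name for r in remove_duplicates(reads)])
```

In this toy input only read "b" is discarded, because it shares every component of the key with read "a".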

Genotyping and integration with previously published genomes

To construct a comparative dataset for population genetic analyses, we started from a published variant call set compiling 722 modern dog, wolf and other canid genomes from multiple previous studies (NCBI BioProject accession PRJNA448733)⁴⁰. To this, we added additional modern whole genomes from other studies: 4 African golden wolves and 15 Nigerian village dogs (Genome Sequence Archive (http://gsa.big.ac.cn/), accession PRJCA000335)⁶⁸, 12 Scandinavian wolves (European Nucleotide Archive accession PRJEB20635)⁶⁹, 9 North American wolves and coyotes (European Nucleotide Archive accession PRJNA496590)²⁵ and 8 other canids (African hunting dog, dhole, Ethiopian wolf, golden jackal, Middle Eastern grey wolves) (European Nucleotide Archive accession PRJNA494815)²². Reads from these genomes were mapped to the dog reference genome using bwa mem (version 0.7.15)⁷⁰, marked for duplicates using Picard Tools (v2.21.4) (http://broadinstitute.github.io/picard), genotyped at the sites present in the above dataset using GATK HaplotypeCaller (v3.6)⁷¹ with the ‘-gt_mode GENOTYPE_GIVEN_ALLELES’ argument and then merged into the dataset using bcftools merge (http://www.htslib.org/). The following filters were then applied to sites and genotypes across the full dataset: sites with excess heterozygosity (bcftools fill-tags ‘ExcHet’ P value < 1 × 10⁻⁶) were removed; indel alleles were removed by setting the genotype of any individual carrying such an allele to missing; genotypes at sites with a depth (taken as the sum of the ‘AD’ VCF fields) less than a third of or more than twice the genome-wide average for the given genome or lower than 5 were set to missing; genotypes containing any allele other than the two highest-frequency alleles at the site were set to missing; allele representation was normalized using bcftools norm; and, finally, sites at which 130 or more individuals had a missing genotype were removed. This resulted in a final dataset of 67.8 million biallelic SNPs. 
In ancestry analyses (that is, those involving f-statistics), modern wolves were treated as individuals while for modern dogs up to four individuals with the highest sequencing coverage from a given breed were used and combined into populations. A list of the modern genomes used in analyses and their associated metadata is included in Supplementary Data 2.
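The per-genotype depth filter described above can be written compactly. The following Python sketch is illustrative only: the function name and arguments are hypothetical, and it covers only the depth criterion (not the heterozygosity, indel or missingness filters).

```python
def passes_depth_filter(allele_depths, genome_mean_depth, min_depth=5):
    """Return True if a genotype passes the depth criteria described in the text.

    allele_depths: per-allele read counts for one genotype (the VCF 'AD' values).
    genome_mean_depth: genome-wide average depth for the same genome.
    """
    depth = sum(allele_depths)            # depth is taken as the sum of the 'AD' fields
    if depth < min_depth:                 # lower than 5
        return False
    if depth < genome_mean_depth / 3:     # less than a third of the genome-wide average
        return False
    if depth > 2 * genome_mean_depth:     # more than twice the genome-wide average
        return False
    return True


print(passes_depth_filter([6, 6], genome_mean_depth=12.0))
print(passes_depth_filter([20, 10], genome_mean_depth=12.0))
```

A genotype that fails this check would be set to missing rather than the whole site being removed.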

All ancient genomes were assigned pseudo-haploid genotypes on the variant sites in the above dataset using htsbox pileup r345 (https://github.com/lh3/htsbox), requiring a minimum read length of 35 bp (‘-l 35’), mapping quality of 20 (‘-q 20’) and base quality of 30 (‘-Q 30’). If an ancient genome carried an allele not present in the dataset, its genotype was set to missing. Previously generated ancient and historical wolf and dog genomes mapped to the dog reference were obtained from the respective publications^{3,7,8,13,17,66,72,73} (European Nucleotide Archive study accessions PRJEB7788, PRJEB13070, PRJNA319283, PRJEB22026, PRJNA608847, PRJEB38079, PRJEB39580, PRJEB41490) and genotyped in the same way. A list of the ancient genomes used in analyses and their associated metadata is included in Supplementary Data 2.

Mitochondrial genome phylogenetic analysis and evolutionary dating

We extracted reads mapped to the mitochondrial genome for the ancient wolf samples using samtools (v1.9)⁷⁴. We called consensus sequences using a 75% threshold, calling any sites with coverage less than 3 as ‘N’, using Geneious (v9.0.5) and removed any samples with greater than 10% missing data. We included a set of previously published mitochondrial genomes from ancient and modern wolves^{5,9,13,21,75,76,77,78,79,80}, which led to a final dataset of 183 individuals (62 ¹⁴C-dated ancient individuals, 24 undated ancient individuals, 7 ancient individuals with infinite ¹⁴C dates and 90 modern individuals). We also included three coyote-like sequences as outgroups (from one modern coyote and two ancient wolves with coyote-like mitochondrial sequences: SC19.MCJ015, ¹⁴C dated, and SC19.MCJ017, with an infinite ¹⁴C date). We aligned all sequences using Clustal Omega (v1.2.4)⁸¹. A Bayesian phylogeny was constructed using BEAST (v1.10.1)⁸², with an HKY + I + G substitution model chosen by JModelTest2 (v2.1.10)⁸³, uncorrelated relaxed log-normal clock and coalescent constant size tree prior. We combined 20 MCMC chains (each run for 200 million iterations), after excluding the first 25% of values as a burn-in. For ¹⁴C-dated samples, we included tip date priors that corresponded to a normal distribution with the same mean and 95% confidence distribution as the ¹⁴C dates. We estimated the ages of undated samples from a prior distribution as follows: (1) for the n = 24 ancient samples with no ¹⁴C information, we used a uniform prior of 0 to 1,000,000 years before the present (bp); (2) for the n = 7 ancient samples with infinite ¹⁴C dates, we used a uniform prior as in (1), but with the lower limit as the minimum date given by the radiocarbon dating; (3) all n = 90 modern samples had already been published previously²¹, and the tip date priors for these samples were the same as the uniform priors used in the earlier study (either 0 to 100 or 0 to 500 bp).
The mitochondrial consensus sequences for the wolf samples newly reported here (excluding those that were removed because they had too much missing data) are available as Supplementary Data 4.
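To make the tip date priors for ¹⁴C-dated samples concrete, the following Python sketch converts a calibrated date and its 95% interval into the parameters of a normal distribution. It assumes a symmetric interval around the mean, and the function name and example values are hypothetical; it does not reproduce the BEAST configuration itself.

```python
from statistics import NormalDist


def tip_date_prior(mean_age, lower95, upper95):
    """Normal prior whose central 95% mass spans [lower95, upper95] (symmetric case)."""
    z = NormalDist().inv_cdf(0.975)        # about 1.96
    sigma = (upper95 - lower95) / (2 * z)  # half-width of the interval divided by z
    return NormalDist(mu=mean_age, sigma=sigma)


# Hypothetical sample dated to 32,000 years bp with a 95% interval of 31,500-32,500.
prior = tip_date_prior(32_000, 31_500, 32_500)
print(round(prior.stdev, 1))
```

For an asymmetric calibrated interval, a different parameterization would be needed; this sketch does not attempt that case.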

f-statistics and admixture graphs

f₃- and f₄-statistics were calculated with ADMIXTOOLS (v5.0)⁸⁴, using only transversion sites and with the ‘numchrom: 38’ argument. To overcome memory limitations when calculating large numbers of f₄-statistics, block jackknifing was performed external to ADMIXTOOLS across 225 blocks of 10 Mb in size. Admixture graphs were fit using qpGraph, with arguments ‘outpop: NULL’, ‘useallsnps: NO’, ‘blgsize: 0.05’, ‘forcezmode: YES’, ‘lsqmode: NO’, ‘diag: 0.0001’, ‘bigiter: 6’, ‘hires: YES’ and ‘lambdascale: 1’. Outgroup f₃-statistics were calculated using only sites ascertained to be heterozygous in the CoyoteCalifornia individual.
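For readers unfamiliar with block jackknifing, the following Python sketch shows a simple, unweighted delete-one-block jackknife for a statistic that is an average over sites, such as an f₄-statistic. It is an assumption-laden illustration: the input layout (per-block sums and site counts) and the function name are hypothetical, and ADMIXTOOLS may weight blocks differently.

```python
from math import sqrt


def block_jackknife(block_sums, block_counts):
    """Delete-one-block jackknife for a per-site average.

    block_sums[i]: sum over sites in block i of the per-site statistic,
                   e.g. (pA - pB) * (pC - pD) for f4(A,B;C,D).
    block_counts[i]: number of sites in block i.
    Returns (estimate, standard_error).
    """
    total_sum = sum(block_sums)
    total_n = sum(block_counts)
    estimate = total_sum / total_n
    # Recompute the statistic with each block left out in turn.
    leave_one_out = [
        (total_sum - s) / (total_n - c) for s, c in zip(block_sums, block_counts)
    ]
    n = len(leave_one_out)
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return estimate, sqrt(variance)


estimate, se = block_jackknife([1.0, 2.0, 3.0, 4.0], [10, 10, 10, 10])
print(estimate, se, estimate / se)  # the last value is a Z-score
```

With equally sized blocks, this standard error coincides with the usual standard error of the mean of the per-block values.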

PCA was performed on outgroup f₃-statistics by transforming the values to distances by taking 1 – f₃ and then running the prcomp R function on the resulting distance matrix. Only ancient wolves were included in the calculation of PCs; present-day wolves and ancient and present-day dogs were then individually projected onto the PCs by re-running the analysis once for each of these individuals independently with that single individual added in and saving its coordinates. To avoid overloading the plot with dogs, only the following dogs were included: Basenji, Boxer, BullTerrier, NewGuineaSingingDog, SiberianHusky, Germany.HXH (7,000 bp), Germany.CTC (4,700 bp), Ireland.Newgrange (4,800 bp), Israel.THRZ02 (7,200 bp), Baikal.OL4223 (6,900 bp), Zhokhov.CGG6 (9,500 bp) and PortauChoix.AL3194 (4,000 bp).

PCA was performed on f₄-statistics by transforming the values to pairwise distances by taking \(\sqrt{2\times (1-r)}\), where r is the Pearson correlation for a given pair of individuals, and then running the ppca function from the pcaMethods (v1.74.0) R package on the resulting distance matrix. For the ‘pre-LGM PCA’ (Fig. 4a and Extended Data Fig. 2), only all possible f₄-statistics of the form f₄(X,A;B,C) were included, where X was the post-25 ka and present-day individuals included in the plot and A, B and C were drawn from a reference set of ancient wolves that lived before 28 ka. For each X, the input was thus a vector of f₄-statistics that quantified its relationships to pre-LGM wolves. Only wolves (post-25 ka and present day) were included in the calculation of PCs, and ancient and present-day dogs were then individually projected onto the PCs as described above.
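The distance transformation used before the PCA on f₄-statistics can be sketched as follows. This Python example is illustrative only: the vectors are made-up stand-ins for per-individual f₄ profiles, the function names are hypothetical, and the subsequent ppca step is not reproduced.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_distance(x, y):
    """Distance sqrt(2 * (1 - r)), where r is the Pearson correlation of x and y."""
    r = pearson(x, y)
    return sqrt(max(0.0, 2 * (1 - r)))  # clamp tiny negative values from rounding


# Hypothetical f4 profiles for three individuals.
profiles = {"X1": [1.0, 2.0, 3.0], "X2": [2.0, 4.0, 6.0], "X3": [3.0, 2.0, 1.0]}
print(correlation_distance(profiles["X1"], profiles["X2"]))
print(correlation_distance(profiles["X1"], profiles["X3"]))
```

Perfectly correlated profiles give a distance of 0 and perfectly anticorrelated profiles give a distance of 2, so the transformation maps r in [−1, 1] onto [0, 2].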

Heterozygosity and F_ST estimates

Conditional heterozygosity was estimated at 1,250,173 transversion sites ascertained to be heterozygous in the CoyoteCalifornia individual, chosen because it is largely an outgroup to wolf diversity. For each individual, exactly two reads were sampled at each of these sites (if available), and the fraction of sites where these two reads displayed different alleles was calculated (alleles other than the two observed in the coyote were ignored). Standard errors were obtained by block jackknifing across the 38 chromosomes.
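The read-sampling estimator described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the input layout and function name are hypothetical, and we assume that reads carrying alleles not observed in the coyote are dropped before the two reads are sampled.

```python
import random


def conditional_heterozygosity(site_reads, coyote_alleles, rng):
    """Fraction of sites at which two randomly sampled reads carry different alleles.

    site_reads: for each ascertained site, the list of alleles observed in the reads.
    coyote_alleles: for each site, the pair of alleles observed in the coyote.
    rng: a random.Random instance, used for sampling without replacement.
    """
    n_sites = 0
    n_different = 0
    for reads, alleles in zip(site_reads, coyote_alleles):
        usable = [base for base in reads if base in alleles]  # ignore other alleles
        if len(usable) < 2:
            continue  # fewer than two reads available at this site
        first, second = rng.sample(usable, 2)
        n_sites += 1
        n_different += int(first != second)
    return n_different / n_sites if n_sites else float("nan")


site_reads = [["A", "A", "A"], ["C", "G"], ["T"], ["A", "T", "G"]]
coyote_alleles = [("A", "C"), ("C", "G"), ("T", "C"), ("A", "G")]
h = conditional_heterozygosity(site_reads, coyote_alleles, random.Random(0))
print(h)
```

Sampling exactly two reads per site keeps the estimate comparable across genomes of different coverage, because deeper genomes do not get more chances to reveal a second allele.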

F_ST was calculated with smartpca from the EIGENSOFT (v7.2.1) package⁸⁵, using the ‘inbreed: YES’ option to account for the pseudohaploid genotypes of the ancient genomes (this option was also applied to present-day diploid genomes). F_ST was calculated pairwise for pools of at least two genomes, formed from individuals selected for being close in time and space (Supplementary Table 1). A few pairs of individuals showed high similarity indicating possible relatedness, as assessed by comparing read mismatch rates across versus within individuals, and one individual from each of these pairs was excluded from these analyses (JK2174 was excluded because of high similarity to JK2183, TU839 because of high similarity to TU840, and CGG17 because of high similarity to Yana-1). F_ST values for pairs of pools with age midpoints separated by less than 12,500 years were included in the plot.

Divergence time and effective population size analyses with MSMC2

We used MSMC2 (v2.1.2)²⁶ to infer population divergence times and effective population size histories. Input genotypes for this were called using GATK HaplotypeCaller (v3.6)⁸⁶ on ancient and modern genomes with sequencing coverage >5.8×. For divergence time analyses, haploid X chromosomes from two different male genomes were combined and the point at which the inferred effective population size for this ‘pseudodiploid’ chromosome increased sharply was taken to correspond to a population divergence. Results were scaled using a mutation rate of 0.4 × 10⁻⁸ mutations per site per generation^{13,87} (with a 25% lower rate for X-chromosome analyses) and a mean generational interval of 3 years¹³. For effective population size inferences, transition variants were ignored and results were scaled using a transversions-only mutation rate inferred from results on modern genomes. For more details on the MSMC2 analyses, see Supplementary Information section 3.
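As a worked illustration of the scaling step, the following Python sketch converts mutation-scaled MSMC2 output into years and effective population sizes. It assumes the commonly documented convention that scaled time is divided by the mutation rate and multiplied by the generation interval, and that effective size is the inverse of the scaled coalescence rate divided by twice the mutation rate. The function names and example inputs are hypothetical.

```python
MU_AUTOSOMAL = 0.4e-8            # mutations per site per generation (from the text)
MU_X = MU_AUTOSOMAL * 0.75       # 25% lower rate for X-chromosome analyses
GENERATION_YEARS = 3             # mean generational interval (from the text)


def scaled_time_to_years(scaled_time, mu, generation_years=GENERATION_YEARS):
    """Convert a mutation-scaled time boundary into years."""
    return scaled_time / mu * generation_years


def scaled_rate_to_ne(coalescence_rate, mu):
    """Convert a mutation-scaled coalescence rate (lambda) into an effective size."""
    return 1.0 / (2.0 * mu * coalescence_rate)


# Hypothetical values standing in for one row of MSMC2 output.
years = scaled_time_to_years(1e-4, MU_X)
ne = scaled_rate_to_ne(1250.0, MU_AUTOSOMAL)
print(round(years), round(ne))
```

The transversions-only mutation rate used for the effective population size inferences is not stated here, so the sketch takes the rate as a parameter rather than assuming a value.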

Selection analyses

Selection analysis was performed using PLINK (v1.90b5.2)⁸⁸. This analysis used the 72 ancient wolf genomes and 68 modern wolf genomes (with the latter including a historical Japanese wolf genome⁷³ treated as ancient for analysis purposes, with its age set to 200 bp). A list of the genomes used for this analysis is available in Supplementary Data 2 (“Used for selection scan” column). All SNPs, not only transversions, were used for this analysis. The age of each wolf was set as the phenotype, with values of 0 for modern wolves, and the ‘--linear’ argument was used to test for an association between SNP genotypes and age, also applying the ‘--adjust’ argument to correct P values using genomic control. The application of genomic control³⁴ here aimed to use the magnitude of temporal allele frequency variance observed across the genome to account for what would be expected from genetic drift alone given wolf demographic history. Only results for the following sets of sites were retained and included in the Manhattan plot: sites where at least 40 ancient genomes had a genotype call, sites with a minor allele frequency among the ancient wolves of ≥5% and sites that had at least 7 neighbouring sites within a 50-kb window with a P value that was at least 90% as large (on a log₁₀ scale) as the P value of the site itself. The last ‘neighbourhood filter’ aimed to reduce false positives by requiring similar evidence across multiple nearby sites. As a P-value significance cut-off to correct for the genome-wide testing, we used 5 × 10⁻⁸, which is commonly used in genome-wide association studies in humans and also in dogs⁸⁹. We excluded 15 regions where only a single variant reached significance. A detailed table with the 24 detected regions is available in Supplementary Data 3.
To test the robustness of this analysis to false positives arising from genetic drift alone, we applied the same analysis to data from neutral coalescent simulations generated using ms⁹⁰ and found no false positives. For more details, see Supplementary Information section 4.
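The ‘neighbourhood filter’ used in the selection scan can be sketched as follows. This Python example reflects one possible reading of the description: the 50-kb window is assumed to be centred on the focal site, and ‘90% as large on a log₁₀ scale’ is taken to mean that a neighbour's −log₁₀(P) is at least 0.9 times that of the focal site. The function name and toy data are hypothetical.

```python
from math import log10


def neighbourhood_filter(positions, pvalues, window=50_000,
                         min_neighbours=7, fraction=0.9):
    """Return one boolean per site on a single chromosome: True if the site is retained."""
    scores = [-log10(p) for p in pvalues]
    half_window = window / 2  # assumption: window centred on the focal site
    keep = []
    for i, (pos, score) in enumerate(zip(positions, scores)):
        supporting = sum(
            1
            for j, (other_pos, other_score) in enumerate(zip(positions, scores))
            if j != i
            and abs(other_pos - pos) <= half_window
            and other_score >= fraction * score
        )
        keep.append(supporting >= min_neighbours)
    return keep


# Eight clustered sites with similar evidence, plus one isolated strong site.
positions = [1_000 * i for i in range(8)] + [1_000_000]
pvalues = [1e-10] * 8 + [1e-12]
keep = neighbourhood_filter(positions, pvalues)
print(keep)
```

The isolated site is dropped despite having the smallest P value, which is the intended behaviour of requiring similar evidence across multiple nearby sites. The quadratic loop is kept for clarity; a sliding window would be preferable for genome-scale data.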

Ancestry modelling with qpAdm and qpWave

We used the qpAdm and qpWave methods⁴³ from ADMIXTOOLS (v5.0)⁸⁴ to test ancestry models for wolf and dog targets postdating 23 ka. For the primary analyses, we used the following set of candidate source populations (age estimate in brackets, years bp): Armenia_Hovk1.HOV4 (ancient dhole), Siberia_UlakhanSular.LOW008 (70,772), Germany_Aufhausener.AH575 (57,233), Siberia_BungeToll.CGG29 (48,210), Germany_HohleFels.JK2183 (32,366), Siberia_BelayaGora.IN18_016 (32,020), Yukon_QuartzCreek.SC19.MCJ010 (29,943), Altai_Razboinichya.AL2744 (28,345), Siberia_BelayaGora.IN18_005 (18,148) and Germany_HohleFels.JK2179 (13,229). We used a rotating approach in which, for each target, we tested all possible one-, two- and three-source models that could be enumerated from the above set. Individuals from the set that were not used as a source in a given model served as the reference set (or the ‘right’ population in the qpAdm framework). This means that, in every model, each of the above individuals was always either in the source list or in the reference list. We ranked models on the basis of their P values, but prioritized models with fewer sources using a P-value threshold of 0.01: if a simpler model (meaning a model with fewer sources) had a P value above this threshold, it ranked above a more complex model (meaning a model with more sources) regardless of the P value of the latter. We also failed models with inferred ancestry proportions larger than 1.1 or smaller than −0.1. For single-source models, qpWave was run instead of qpAdm. Both programs were run with the ‘allsnps: YES’ option (without this option, there was very little power to reject models). We describe ancestry assigned to the ancient dhole source (Armenia_Hovk1.HOV4) as ‘unsampled’ ancestry; note that this does not imply that such ancestry is of non-wolf origin, only that it is not represented by (that is, diverged early from and lacks shared genetic drift with) the ancient wolf genomes in the reference set.
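The rotating scheme and the ranking rule can be sketched as follows. This Python example only enumerates and ranks models; it does not run qpAdm or qpWave. The result dictionaries and function names are hypothetical, and the sort key is one possible reading of the ranking rule described above.

```python
from itertools import combinations

CANDIDATES = [
    "Armenia_Hovk1.HOV4", "Siberia_UlakhanSular.LOW008", "Germany_Aufhausener.AH575",
    "Siberia_BungeToll.CGG29", "Germany_HohleFels.JK2183", "Siberia_BelayaGora.IN18_016",
    "Yukon_QuartzCreek.SC19.MCJ010", "Altai_Razboinichya.AL2744",
    "Siberia_BelayaGora.IN18_005", "Germany_HohleFels.JK2179",
]


def rotating_models(candidates, max_sources=3):
    """Yield (sources, reference) pairs; non-source candidates form the reference set."""
    for k in range(1, max_sources + 1):
        for sources in combinations(candidates, k):
            reference = [c for c in candidates if c not in sources]
            yield sources, reference


def rank_models(results, threshold=0.01):
    """Rank fitted models; each result has 'sources', 'p' and 'proportions'."""
    # Fail models with ancestry proportions outside [-0.1, 1.1].
    valid = [m for m in results if all(-0.1 <= w <= 1.1 for w in m["proportions"])]

    def key(model):
        if model["p"] > threshold:
            # Passing models: fewer sources first, then higher P value.
            return (0, len(model["sources"]), -model["p"])
        return (1, 0, -model["p"])  # remaining models: by P value only

    return sorted(valid, key=key)


models = list(rotating_models(CANDIDATES))
print(len(models))  # 10 + 45 + 120 one-, two- and three-source models

# Hypothetical fitted results for a single target.
results = [
    {"sources": ("A",), "p": 0.02, "proportions": (1.0,)},
    {"sources": ("A", "B"), "p": 0.9, "proportions": (0.6, 0.4)},
    {"sources": ("C",), "p": 0.001, "proportions": (1.0,)},
    {"sources": ("A", "C"), "p": 0.95, "proportions": (1.3, -0.3)},
]
ranked = rank_models(results)
print([m["sources"] for m in ranked])
```

In the toy results, the single-source model with P = 0.02 ranks above the two-source model with P = 0.9 because it passes the threshold with fewer sources, and the model with proportions 1.3 and −0.3 is failed outright.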

To test whether any post-23 ka or modern wolf genome available might be a good proxy for the western Eurasian wolf-related ancestry identified in Near Eastern and African dogs, we added the 9,500-year-old Zhokhov dog¹⁷ to the rotating set of candidate source populations. Chosen for its high coverage, early date and easterly location, this makes the assumption that the Zhokhov dog is a good representative for the eastern dog ancestry component. Using the African Basenji dog as a target, models involving the Zhokhov dog plus another given wolf thus allowed us to test whether that wolf was a good match for the additional component of ancestry. For more details on the qpAdm and qpWave analyses, see Supplementary Information sections 2 (wolf targets) and 5 (dog targets).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.

Data availability

The generated DNA sequencing data are available in the European Nucleotide Archive (ENA) under study accession PRJEB42199. Previously published genomic data analysed here are available under accession numbers PRJNA448733, PRJCA000335, PRJEB20635, PRJNA496590, PRJNA494815, PRJEB7788, PRJEB13070, PRJNA319283, PRJEB22026, PRJNA608847, PRJEB38079, PRJEB39580 and PRJEB41490, with individual genomes used listed in Supplementary Data 2. The canFam3.1 reference genome is available under NCBI assembly accession GCF_000002285.3.

References

Savolainen, P., Zhang, Y.-P., Luo, J., Lundeberg, J. & Leitner, T. Genetic evidence for an East Asian origin of domestic dogs. Science 298, 1610–1613 (2002).
Wang, G.-D. et al. Out of southern East Asia: the natural history of domestic dogs across the world. Cell Res. 26, 21–33 (2016).
Frantz, L. A. F. et al. Genomic and archaeological evidence suggest a dual origin of domestic dogs. Science 352, 1228–1231 (2016).
Shannon, L. M. et al. Genetic structure in village dogs reveals a Central Asian domestication origin. Proc. Natl Acad. Sci. USA 112, 13639–13644 (2015).
Thalmann, O. et al. Complete mitochondrial genomes of ancient canids suggest a European origin of domestic dogs. Science 342, 871–874 (2013).
Vonholdt, B. M. et al. Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication. Nature 464, 898–902 (2010).
Botigué, L. R. et al. Ancient European dog genomes reveal continuity since the Early Neolithic. Nat. Commun. 8, 16082 (2017).
Bergström, A. et al. Origins and genetic legacy of prehistoric dogs. Science 370, 557–564 (2020).
Tian, H. et al. Intraflagellar transport 88 (IFT88) is crucial for craniofacial development in mice and is a candidate gene for human cleft lip and palate. Hum. Mol. Genet. 26, 860–872 (2017).
Fan, Z. et al. Worldwide patterns of genomic variation and admixture in gray wolves. Genome Res. 26, 163–173 (2016).
vonHoldt, B. M. et al. Whole-genome sequence analysis shows that two endemic species of North American wolf are admixtures of the coyote and gray wolf. Sci. Adv. 2, e1501714 (2016).
Hughes, P. D. & Gibbard, P. L. A stratigraphical basis for the Last Glacial Maximum (LGM). Quat. Int. 383, 174–185 (2015).
Skoglund, P., Ersmark, E., Palkopoulou, E. & Dalén, L. Ancient wolf genome reveals an early divergence of domestic dog ancestors and admixture into high-latitude breeds. Curr. Biol. 25, 1515–1519 (2015).
Ramos-Madrigal, J. et al. Genomes of Pleistocene Siberian wolves uncover multiple extinct wolf lineages. Curr. Biol. 31, 198–206 (2020).
Janssens, L. et al. A new look at an old dog: Bonn-Oberkassel reconsidered. J. Archaeol. Sci. 92, 126–138 (2018).
Perri, A. R. et al. Dog domestication and the dual dispersal of people and dogs into the Americas. Proc. Natl Acad. Sci. USA 118, e2010083118 (2021).
Sinding, M.-H. S. et al. Arctic-adapted dogs emerged at the Pleistocene–Holocene transition. Science 368, 1495–1499 (2020).
Pečnerová, P. et al. Genome-based sexing provides clues about behavior and social structure in the woolly mammoth. Curr. Biol. 27, 3505–3510 (2017).
Gower, G. et al. Widespread male sex bias in mammal fossil and museum collections. Proc. Natl Acad. Sci. USA 116, 19019–19024 (2019).
Drummond, A. J. & Rambaut, A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214 (2007).
Loog, L. et al. Ancient DNA suggests modern wolves trace their origin to a Late Pleistocene expansion from Beringia. Mol. Ecol. 29, 1596–1610 (2020).
Gopalakrishnan, S. et al. Interspecific gene flow shaped the evolution of the genus Canis. Curr. Biol. 28, 3441–3449 (2018).
Wang, M.-S. et al. Ancient hybridization with an unknown population facilitated high-altitude adaptation of canids. Mol. Biol. Evol. 37, 2616–2629 (2020).
vonHoldt, B. M. et al. A genome-wide perspective on the evolutionary history of enigmatic wolf-like canids. Genome Res. 21, 1294–1305 (2011).
Sinding, M.-H. S. et al. Population genomics of grey wolves and wolf-like canids in North America. PLoS Genet. 14, e1007745 (2018).
Wang, K., Mathieson, I., O’Connell, J. & Schiffels, S. Tracking human population structure through time from whole genome sequences. PLoS Genet. 16, e1008552 (2020).
Kurtén, B. & Anderson, E. Pleistocene Mammals of North America (Columbia University Press, 1980).
Hu, A. et al. Influence of Bering Strait flow and North Atlantic circulation on glacial sea-level changes. Nat. Geosci. 3, 118–121 (2010).
Vershinina, A. O. et al. Ancient horse genomes reveal the timing and extent of dispersals across the Bering Land Bridge. Mol. Ecol. 30, 6144–6161 (2021).
Leonard, J. A. et al. Megafaunal extinctions and the disappearance of a specialized wolf ecomorph. Curr. Biol. 17, 1146–1150 (2007).
Hudson, R. R., Slatkin, M. & Maddison, W. P. Estimation of levels of gene flow from DNA sequence data. Genetics 132, 583–589 (1992).
Pilot, M. et al. Genome-wide signatures of population bottlenecks and diversifying selection in European wolves. Heredity 112, 428–442 (2014).
Dufresnes, C. et al. Howling from the past: historical phylogeography and diversity losses in European grey wolves. Proc. Biol. Sci. 285, 20181148 (2018).
Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997–1004 (1999).
Speidel, L., Forest, M., Shi, S. & Myers, S. R. A method for genome-wide genealogy estimation for thousands of samples. Nat. Genet. 51, 1321–1329 (2019).
Stern, A. J., Wilton, P. R. & Nielsen, R. An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data. PLoS Genet. 15, e1008384 (2019).
Freedman, A. H. et al. Genome sequencing highlights the dynamic early history of dogs. PLoS Genet. 10, e1004016 (2014).
Rimbault, M. et al. Derived variants at six genes explain nearly half of size reduction in dog breeds. Genome Res. 23, 1985–1995 (2013).
Webster, M. T. et al. Linked genetic variants on chromosome 10 control ear morphology and body mass among dog breeds. BMC Genomics 16, 474 (2015).
Plassais, J. et al. Whole genome sequencing of canids reveals genomic regions under selection and variants influencing morphology. Nat. Commun. 10, 1489 (2019).
Boyko, A. R. et al. A simple genetic architecture underlies morphological variation in dogs. PLoS Biol. 8, e1000451 (2010).
Anderson, T. M. et al. Molecular and evolutionary history of melanism in North American gray wolves. Science 323, 1339–1343 (2009).
Haak, W. et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211 (2015).
Mech, L. D. Unexplained patterns of grey wolf Canis lupus natal dispersal. Mamm. Rev. 50, 314–323 (2020).
Baumann, C. et al. A refined proposal for the origin of dogs: the case study of Gnirshöhle, a Magdalenian cave site. Sci. Rep. 11, 5137 (2021).
Germonpré, M. et al. Fossil dogs and wolves from Palaeolithic sites in Belgium, the Ukraine and Russia: osteometry, ancient DNA and stable isotopes. J. Archaeol. Sci. 36, 473–490 (2009).
Davis, S. J. M. & Valla, F. R. Evidence for domestication of the dog 12,000 years ago in the Natufian of Israel. Nature 276, 608–610 (1978).
Meyer, M. & Kircher, M. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb. Protoc. 2010, db.prot5448 (2010).
Rodríguez-Varela, R. et al. Genomic analyses of pre-European Conquest human remains from the Canary Islands reveal close affinity to modern North Africans. Curr. Biol. 27, 3396–3402 (2017).
Ersmark, E. et al. Population demography and genetic diversity in the Pleistocene cave lion. Open Quatern., https://doi.org/10.5334/oq.aa (2015).
Stanton, D. W. G. et al. Early Pleistocene origin and extensive intra-species diversity of the extinct cave lion. Sci. Rep. 10, 12621 (2020).
Dabney, J. et al. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments. Proc. Natl Acad. Sci. USA 110, 15758–15763 (2013).
Gansauge, M.-T. & Meyer, M. Single-stranded DNA library preparation for the sequencing of ancient or damaged DNA. Nat. Protoc. 8, 737–748 (2013).
Poinar, H. N. & Cooper, A. Ancient DNA: do it right or not at all. Science 5482, 416 (2000).
Knapp, M. & Hofreiter, M. Next generation sequencing of ancient DNA: requirements, strategies and perspectives. Genes 1, 227–243 (2010).
Kircher, M. Analysis of high-throughput ancient DNA sequencing data. Methods Mol. Biol. 840, 197–228 (2012).
Orlando, L. et al. Recalibrating Equus evolution using the genome sequence of an early Middle Pleistocene horse. Nature 499, 74–78 (2013).
Carøe, C. et al. Single‐tube library preparation for degraded DNA. Methods Ecol. Evol. 9, 410–419 (2018).
Mak, S. S. T. et al. Comparative performance of the BGISEQ-500 vs Illumina HiSeq2500 sequencing platforms for palaeogenomic sequencing. Gigascience 6, 1–13 (2017).
Fulton, T. L. & Shapiro, B. Setting up an ancient DNA laboratory. Methods Mol. Biol. 1963, 1–13 (2019).
Korlević, P. & Meyer, M. Pretreatment: removing DNA contamination from ancient bones and teeth using sodium hypochlorite and phosphate. Methods Mol. Biol. 1963, 15–19 (2019).
Kapp, J. D., Green, R. E. & Shapiro, B. A fast and efficient single-stranded genomic library preparation method optimized for ancient DNA. J. Hered. 112, 241–249 (2021).
Harney, É. et al. A minimally destructive protocol for DNA extraction from ancient teeth. Genome Res. 31, 472–483 (2021).
Gamba, C. et al. Genome flux and stasis in a five millennium transect of European prehistory. Nat. Commun. 5, 5257 (2014).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Ramos Madrigal, J. et al. Genomes of extinct Pleistocene Siberian wolves provide insights into the origin of present-day wolves. Curr. Biol. 31, 199–206 (2021).
Skoglund, P., Northoff, B. H. & Shunkov, M. V. Separating endogenous ancient DNA from modern day contamination in a Siberian Neandertal. Proc. Natl Acad. Sci. USA 111, 2229–2234 (2014).
Liu, Y.-H. et al. Whole-genome sequencing of African dogs provides insights into adaptations against tropical parasites. Mol. Biol. Evol. 35, 287–298 (2018).
Kardos, M. et al. Genomic consequences of intensive inbreeding in an isolated wolf population. Nat. Ecol. Evol. 2, 124–131 (2018).
Li, H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Preprint at https://arxiv.org/abs/1303.3997 (2013).
McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
Ní Leathlobhair, M. et al. The evolutionary history of dogs in the Americas. Science 361, 81–85 (2018).
Niemann, J. et al. Extended survival of Pleistocene Siberian wolves into the early 20th century on the island of Honshū. iScience 24, 101904 (2021).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Arnason, U., Gullberg, A., Janke, A. & Kullberg, M. Mitogenomic analyses of caniform relationships. Mol. Phylogenet. Evol. 45, 863–874 (2007).
Björnerfeldt, S., Webster, M. T. & Vilà, C. Relaxation of selective constraint on dog mitochondrial DNA following domestication. Genome Res. 16, 990–994 (2006).
Matsumura, S., Inoshima, Y. & Ishiguro, N. Reconstructing the colonization history of lost wolf lineages by the analysis of the mitochondrial genome. Mol. Phylogenet. Evol. 80, 105–112 (2014).
Meng, C., Zhang, H. & Meng, Q. Mitochondrial genome of the Tibetan wolf. Mitochondrial DNA 20, 61–63 (2009).
Pang, J.-F. et al. mtDNA data indicate a single origin for dogs south of Yangtze River, less than 16,300 years ago, from numerous wolves. Mol. Biol. Evol. 26, 2849–2864 (2009).
Zhang, H. et al. Complete mitochondrial genome of Canis lupus campestris. Mitochondrial DNA 26, 255–256 (2015).
Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539 (2011).
Suchard, M. A. et al. Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10. Virus Evol. 4, vey016 (2018).
Darriba, D., Taboada, G. L., Doallo, R. & Posada, D. jModelTest 2: more models, new heuristics and parallel computing. Nat. Methods 9, 772 (2012).
Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).
Poplin, R. et al. Scaling accurate genetic variant discovery to tens of thousands of samples. Preprint at bioRxiv https://doi.org/10.1101/201178 (2018).
Koch, E. et al. De novo mutation rate estimation in wolves of known pedigree. Mol. Biol. Evol. 36, 2536–2547 (2019).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, 7 (2015).
Deane-Coe, P. E., Chu, E. T., Slavney, A., Boyko, A. R. & Sams, A. J. Direct-to-consumer DNA testing of 6,000 dogs reveals 98.6-kb duplication associated with blue eyes and heterochromia in Siberian Huskies. PLoS Genet. 14, e1007648 (2018).
Hudson, R. R. Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics 18, 337–338 (2002).

Acknowledgements

This work was supported by grants to P. Skoglund from the European Research Council (grant no. 852558), the Erik Philip Sörensen Foundation and the Science for Life Laboratory, Swedish Biodiversity Program, made available by support from the Knut and Alice Wallenberg Foundation. A.B., L.S., P. Swali and P. Skoglund were supported by Francis Crick Institute core funding (FC001595) from Cancer Research UK, the UK Medical Research Council and the Wellcome Trust. P. Skoglund was also supported by the Vallee Foundation, the European Molecular Biology Organisation and the Wellcome Trust (217223/Z/19/Z). Computations were supported by SNIC-UPPMAX. We also acknowledge support from Science for Life Laboratory, the Knut and Alice Wallenberg Foundation, the National Genomics Infrastructure funded by the Swedish Research Council and the Uppsala Multidisciplinary Center for Advanced Computational Science for assistance with massively parallel sequencing and access to the UPPMAX computational infrastructure. We thank the Yukon gold mining community and First Nations, including the Tr’ondëk Hwëch’in, for continued support of our palaeontology research in the Yukon Territories, Canada. We thank the Danish National High-Throughput Sequencing Centre and BGI-Europe for assistance in sequencing data generation and the Danish National Supercomputer for Life Sciences–Computerome (https://computerome.dtu.dk) for computational resources. We thank National Museum Wales for continued sampling support. M. Germonpré acknowledges support from the Brain.be 2.0 ICHIE project (BELSPO B2/191/P2/ICHIE). M.T.P.G. was supported by the European Research Council (grant no. 681396). M.-H.S.S. was supported by the Velux Foundations through the Qimmeq Project, the Aage og Johanne Louis-Hansens Fond and the Independent Research Fund Denmark (8028-00005B). L.D. acknowledges support from FORMAS (2018-01640). D.W.G.S. received funding for this project from the European Union’s Horizon 2020 research and innovation programme under Marie Skłodowska-Curie grant agreement no. 796877. M.P. was supported by the Polish National Agency for Academic Exchange–NAWA (grant no. PPN/PPO/2018/1/00037). V.J.S. was supported by the University of Zurich’s University Research Priority Program ‘Evolution in Action: From Genomes to Ecosystems’. This research was done with the participation of ZIN RAS (grant no. 075-15-2021-1069). We are grateful to the museum of the Institute of Plant and Animal Ecology UB RAS (Ekaterinburg, Russia) for provision of samples. R.P.J. and C.O’D. were supported by the Standing Committee for Archaeology of the Royal Irish Academy through the Archaeological Excavation Research Grant Scheme. E.Y.P., P.N. and V.V.P. are supported by the Russian Science Foundation (grant nos. 16-18-10265-RNF and 21-18-00457-RNF). Y.V.K. was supported by the Russian Science Foundation (grant no. 20-17-00033). M.H. was supported by the European Research Council (consolidator grant GeneFlow no. 310763). M.L.-G. was supported by the Czech Science Foundation GAČR (grant no. 15-06446S) and institutional financing of the Moravian Museum from the Czech Ministry of Culture (IP DKRVO 2019-2023, MK000094862). L.S. is supported by the Sir Henry Wellcome fellowship (220457/Z/20/Z). We thank Staatliches Museum für Naturkunde Stuttgart for sample access. L.F. and G.L. were supported by European Research Council grants (ERC-2013-StG-337574-UNDEAD and ERC-2019-StG-853272-PALAEOFARM) and Natural Environment Research Council grants (NE/K005243/1, NE/K003259/1, NE/S007067/1 and NE/S00078X/1). L.F. was also supported by the Wellcome Trust (210119/Z/18/Z). This research was funded in whole, or in part, by the Wellcome Trust (FC001595). For the purpose of open access, the author has applied a CC-BY public copyright licence to any author accepted manuscript version arising from this submission.

Author information

These authors contributed equally: Anders Bergström, David W. G. Stanton, Ulrike H. Taron
Deceased: Semyon Grigoriev

Authors and Affiliations

Ancient Genomics Laboratory, The Francis Crick Institute, London, UK
Anders Bergström, Leo Speidel, Pooja Swali & Pontus Skoglund
Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
David W. G. Stanton, Erik Ersmark & Love Dalén
Centre for Palaeogenetics, Stockholm, Sweden
David W. G. Stanton, Erik Ersmark, Anna Linderholm, Anders Götherström & Love Dalén
School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
David W. G. Stanton & Laurent Frantz
Evolutionary Adaptive Genomics, Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
Ulrike H. Taron, Michael V. Westbury & Michael Hofreiter
Palaeogenomics Group, Department of Veterinary Sciences, Ludwig Maximilian University, Munich, Germany
Laurent Frantz
The GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
Mikkel-Holger S. Sinding, Shyam Gopalakrishnan, Michael V. Westbury, Jazmin Ramos-Madrigal, Tatiana R. Feuerborn, Christian Carøe, Anders J. Hansen, Eske Willerslev & M. Thomas P. Gilbert
Smurfit Institute of Genetics, Trinity College Dublin, Dublin, Ireland
Mikkel-Holger S. Sinding
The Qimmeq Project, University of Greenland, Nuuk, Greenland
Mikkel-Holger S. Sinding & Tatiana R. Feuerborn
Greenland Institute of Natural Resources, Nuuk, Greenland
Mikkel-Holger S. Sinding
Institute for Archaeological Sciences, University of Tübingen, Tübingen, Germany
Saskia Pfrengle, Tatiana R. Feuerborn, Ella Reiter, Joscha Gretzinger, Susanne C. Münzel & Verena J. Schuenemann
Institute of Evolutionary Medicine, University of Zurich, Zurich, Switzerland
Saskia Pfrengle & Verena J. Schuenemann
Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA, USA
Molly Cassatt-Johnstone & Beth Shapiro
The Palaeogenomics & Bio-Archaeology Research Network, Research Laboratory for Archaeology and History of Art, University of Oxford, Oxford, UK
Ophélie Lebrasseur, James Haile, Anna Linderholm & Greger Larson
Department of Archaeology, School of Geosciences, University of Aberdeen, Aberdeen, UK
Linus Girdland-Flink
School of Biological and Environmental Sciences, Liverpool John Moores University, Liverpool, UK
Linus Girdland-Flink & Richard P. Jennings
Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria
Daniel M. Fernandes, Ron Pinhasi & Verena J. Schuenemann
CIAS, Department of Life Sciences, University of Coimbra, Coimbra, Portugal
Daniel M. Fernandes
University of Rennes, CNRS, ECOBIO (Ecosystèmes, biodiversité, évolution)–UMR 6553, Rennes, France
Morgane Ollivier
Genetics Institute, University College London, London, UK
Leo Speidel
Max Planck Institute for the Science of Human History, Jena, Germany
Joscha Gretzinger
Department of Early Prehistory and Quaternary Ecology, University of Tübingen, Tübingen, Germany
Nicholas J. Conard
Senckenberg Centre for Human Evolution and Palaeoenvironment, University of Tübingen, Tübingen, Germany
Nicholas J. Conard, Chris Baumann, Hervé Bocherens & Dorothée G. Drucker
Texas A&M University, College Station, TX, USA
Anna Linderholm
Department of Geological Sciences, Stockholm University, Stockholm, Sweden
Anna Linderholm
Museum ‘Severnyi Mir’, Yakutsk, Russian Federation
Semyon Androsov
Department of Earth Sciences, Natural History Museum, London, UK
Ian Barnes & Selina Brace
Department of Geosciences and Geography, Faculty of Science, University of Helsinki, Helsinki, Finland
Chris Baumann
German Archaeological Institute, Berlin, Germany
Norbert Benecke
Biogeology, Department of Geosciences, University of Tübingen, Tübingen, Germany
Hervé Bocherens
School of Archaeology, University College Dublin, Dublin, Ireland
Ruth F. Carden
North-Eastern Federal University, Yakutsk, Russian Federation
Sergey Fedorov & Semyon Grigoriev
Hungarian Natural History Museum, Budapest, Hungary
Mihály Gasparik
Royal Belgian Institute of Natural Sciences, Brussels, Belgium
Mietje Germonpré
University of Alaska, Fairbanks, AK, USA
Pam Groves
Naturhistorisches Museum Bern, Bern, Switzerland
Stefan T. Hertwig, Marc Nussbaumer & André Rehazek
Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
Stefan T. Hertwig
VNIIOkeangeologiya, St Petersburg, Russian Federation
Varvara V. Ivanova
University of Leiden, Leiden, the Netherlands
Luc Janssens
Institute for the History of Material Culture, Russian Academy of Sciences, St Petersburg, Russian Federation
Aleksei K. Kasparov & Vladimir V. Pitulko
Ice Age Museum, Shidlovskiy National Alliance ‘Ice Age’, Moscow, Russian Federation
Irina V. Kirillova
Department of Archaeology, Ethnology and Museology, Al-Farabi Kazakh State University, Almaty, Kazakhstan
Islam Kurmaniyazov
Sobolev Institute of Geology and Mineralogy, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russian Federation
Yaroslav V. Kuzmin
Ural Federal University, Yekaterinburg, Russian Federation
Pavel A. Kosintsev
Moravian Museum, Brno, Czech Republic
Martina Lázničková-Galetová
INRAP, Metz, France
Charlotte Leduc
Geological Institute, Russian Academy of Sciences, Moscow, Russian Federation
Pavel Nikolskiy
National Monuments Service, Department of Housing, Local Government and Heritage, Dublin, Ireland
Cóilín O’Drisceoil
Centre d’Anthropobiologie et de Génomique de Toulouse UMR 5288, CNRS, Faculté de Médecine Purpan, Université Paul Sabatier, Toulouse, France
Ludovic Orlando & Andaine Seguin-Orlando
Department of Archaeology, University of Exeter, Exeter, UK
Alan Outram
Arctic & Antarctic Research Institute, St Petersburg, Russian Federation
Elena Y. Pavlova
PaleoWest, Henderson, NV, USA
Angela R. Perri
Department of Anthropology, University of Nevada, Las Vegas, Las Vegas, NV, USA
Angela R. Perri
Museum & Institute of Zoology, Polish Academy of Sciences, Gdańsk, Poland
Małgorzata Pilot
Academy of Sciences of Sakha Republic, Yakutsk, Russian Federation
Valerii V. Plotnikov & Albert V. Protopopov
Zoological Institute of the Russian Academy of Sciences, St. Petersburg, Russian Federation
Mikhail Sablin
Stockholm University, Stockholm, Sweden
Jan Storå & Anders Götherström
Service Régional de l’Archéologie, Orléans, France
Christian Verjux
Institute of Archaeology and Steppe Civilizations, Al-Farabi Kazakh National University, Almaty, Kazakhstan
Victor F. Zaibert
Yukon Palaeontology Program, Whitehorse, Yukon Territories, Canada
Grant Zazula
Collections and Research, Canadian Museum of Nature, Ottawa, Ontario, Canada
Grant Zazula
Department of Archaeology, Ghent University, Ghent, Belgium
Philippe Crombé
Department of Zoology, University of Cambridge, Cambridge, UK
Eske Willerslev
Estación Biológica de Doñana (EBD-CSIC), Sevilla, Spain
Jennifer A. Leonard
University Museum, NTNU, Trondheim, Norway
M. Thomas P. Gilbert
Howard Hughes Medical Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
Beth Shapiro
Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
Johannes Krause
Human Evolution and Archaeological Sciences, University of Vienna, Vienna, Austria
Ron Pinhasi

Authors

Anders Bergström
David W. G. Stanton
Ulrike H. Taron
Laurent Frantz
Mikkel-Holger S. Sinding
Erik Ersmark
Saskia Pfrengle
Molly Cassatt-Johnstone
Ophélie Lebrasseur
Linus Girdland-Flink
Daniel M. Fernandes
Morgane Ollivier
Leo Speidel
Shyam Gopalakrishnan
Michael V. Westbury
Jazmin Ramos-Madrigal
Tatiana R. Feuerborn
Ella Reiter
Joscha Gretzinger
Susanne C. Münzel
Pooja Swali
Nicholas J. Conard
Christian Carøe
James Haile
Anna Linderholm
Semyon Androsov
Ian Barnes
Chris Baumann
Norbert Benecke
Hervé Bocherens
Selina Brace
Ruth F. Carden
Dorothée G. Drucker
Sergey Fedorov
Mihály Gasparik
Mietje Germonpré
Semyon Grigoriev
Pam Groves
Stefan T. Hertwig
Varvara V. Ivanova
Luc Janssens
Richard P. Jennings
Aleksei K. Kasparov
Irina V. Kirillova
Islam Kurmaniyazov
Yaroslav V. Kuzmin
Pavel A. Kosintsev
Martina Lázničková-Galetová
Charlotte Leduc
Pavel Nikolskiy
Marc Nussbaumer
Cóilín O’Drisceoil
Ludovic Orlando
Alan Outram
Elena Y. Pavlova
Angela R. Perri
Małgorzata Pilot
Vladimir V. Pitulko
Valerii V. Plotnikov
Albert V. Protopopov
André Rehazek
Mikhail Sablin
Andaine Seguin-Orlando
Jan Storå
Christian Verjux
Victor F. Zaibert
Grant Zazula
Philippe Crombé
Anders J. Hansen
Eske Willerslev
Jennifer A. Leonard
Anders Götherström
Ron Pinhasi
Verena J. Schuenemann
Michael Hofreiter
M. Thomas P. Gilbert
Beth Shapiro
Greger Larson
Johannes Krause
Love Dalén
Pontus Skoglund

Contributions

A.J.H., E.W., J.A.L., A.G., R.P., V.J.S., M.H., M.T.P.G., B.S., G.L., J.K., L.D. and P. Skoglund supervised the study. S.A., N.B., H.B., R.F.C., D.G.D., S.F., M. Gasparik, M. Germonpré, S. Grigoriev, P.G., S.T.H., V.V.I., L.J., R.P.J., A.K.K., I.V.K., I.K., Y.V.K., P.A.K., M.L.-G., C.L., P.N., M.N., C.O’D., A.O., E.Y.P., V.V. Pitulko, V.V. Plotnikov, A.V.P., A.R., M.S., J.S., C.V., V.F.Z., G.Z. and P.C. excavated or curated samples. D.W.G.S., U.H.T., M.-H.S.S., E.E., S.P., M.C.-J., O.L., L.G.-F., D.M.F., M.O., M.V.W., T.R.F., E.R., J.G., S.C.M., N.J.C., C.C., J.H., A.L., I.B., C.B., S.B., L.O., A.R.P. and A.S.-O. generated data through sample preparation and/or laboratory work. A.B., D.W.G.S., U.H.T., L.F., M.-H.S.S., L.S., S.G., J.R.-M., P. Swali, M.P. and P. Skoglund analysed and/or curated genomic data. A.B. and P. Skoglund wrote the paper with input from all authors.

Corresponding authors

Correspondence to Anders Bergström or Pontus Skoglund.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Kieren Mitchell, Claire Wade and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 f-statistics informing on wolf population history.

Bars denote ±1.96 standard errors for f₃-statistics, and ±3 standard errors for f₄-statistics, estimated from a block jackknife. a) Outgroup f₃-statistics quantifying shared genetic drift with a present-day wolf (Fig. S3). b) f₄-statistics contrasting affinities to a pre-LGM and a post-LGM Siberian wolf (Fig. S4). c) f₄-statistics contrasting affinities to a Siberian and a European pre-LGM wolf (Fig. S6). d) f₄-statistics quantifying whether a ~60 ky old Siberian wolf is closer to a contemporaneous European wolf or other individuals (Fig. S7). e) f₄-statistics quantifying whether a coyote is closer to a ~100 ky old Siberian wolf or later individuals.
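
For readers unfamiliar with how such error bars arise, the following is a minimal sketch of an f₄-statistic with a delete-one block jackknife standard error. It is an illustration only, not the pipeline used in this study: the function name `block_jackknife_f4` is hypothetical, the input is assumed to be per-SNP allele frequencies for four populations, and blocks are taken as equal-sized runs of consecutive SNPs, whereas dedicated tools typically define blocks by genomic distance and weight them by SNP count.

```python
import math
from typing import List, Tuple

Freqs = Tuple[float, float, float, float]  # allele frequencies in A, B, C, D at one SNP


def block_jackknife_f4(freqs: List[Freqs], n_blocks: int) -> Tuple[float, float]:
    """Return (f4 estimate, jackknife standard error) for f4(A,B;C,D).

    Simplifying assumption: blocks are contiguous runs of SNPs in input
    order and are treated with equal weight.
    """
    n = len(freqs)
    if n_blocks < 2 or n_blocks > n:
        raise ValueError("need 2 <= n_blocks <= number of SNPs")

    # Per-SNP contribution to f4: (a - b) * (c - d)
    terms = [(a - b) * (c - d) for a, b, c, d in freqs]
    total = sum(terms)
    estimate = total / n

    # Block boundaries by SNP index; every block holds at least one SNP
    bounds = [i * n // n_blocks for i in range(n_blocks + 1)]

    # Recompute the statistic with each block left out in turn
    leave_one_out = []
    for i in range(n_blocks):
        block = terms[bounds[i]:bounds[i + 1]]
        leave_one_out.append((total - sum(block)) / (n - len(block)))

    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out
    )
    return estimate, math.sqrt(variance)


if __name__ == "__main__":
    toy = [(1.0, 0.0, 1.0, 0.0)] * 2 + [(0.0, 0.0, 0.0, 0.0)] * 2
    est, se = block_jackknife_f4(toy, n_blocks=2)
    print(est, se)
```

Dividing the estimate by the standard error gives a Z-score, so bars of ±3 standard errors that exclude zero correspond to |Z| > 3.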

Extended Data Fig. 2 Placing dogs into wolf diversity in a ‘pre-LGM f₄ PCA’.

PCA on wolves that lived after 25 ka (including present-day wolves), based only on profiles of f₄-statistics of the form f₄(X,A;B,C), where A, B, and C are wolves that lived prior to 28 ka. Dogs are projected. Dogs are coloured according to the f₄-statistic f₄(AndeanFox,X;Zhokhov dog 9.5ka,Tel Hreiz dog 7.2ka), with negative values going towards blue and positive values towards red. A few wolves (in colour) and dogs (in black) of particular interest are indicated with text labels. a) PC1 vs PC2 with the full set of wolves. b) PC3 vs PC4 with the full set of wolves. c) PC1 vs PC2 with western Chinese and North American outlier wolves removed. d) PC3 vs PC4 with western Chinese and North American outlier wolves removed.
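The idea of fitting a PCA on one set of individuals and projecting another set onto it can be sketched as follows. This is only an illustration under stated assumptions: each row is taken to be one individual's profile of f₄-statistics, the axes are found by a singular value decomposition of the mean-centred wolf matrix without column scaling, and the random matrices stand in for real profiles. The function names are hypothetical.

```python
import numpy as np

def fit_pca(reference, n_components=2):
    """Fit principal axes on the reference individuals only."""
    mean = reference.mean(axis=0)
    _, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
    return mean, vt[:n_components]  # rows of vt are orthonormal axes

def project(profiles, mean, axes):
    """Project profiles onto axes that were fitted without them."""
    return (profiles - mean) @ axes.T

# Placeholder profiles: rows are individuals, columns are f4(X,A;B,C) values.
rng = np.random.default_rng(1)
wolf_profiles = rng.normal(size=(30, 12))  # individuals used to fit the PCA
dog_profiles = rng.normal(size=(5, 12))    # individuals that are only projected

mean, axes = fit_pca(wolf_profiles)
wolf_scores = project(wolf_profiles, mean, axes)
dog_scores = project(dog_profiles, mean, axes)
```

Because the projected individuals do not contribute to the axes, their own drift cannot dominate the components, which is the usual reason for projecting rather than including them in the fit.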

Extended Data Fig. 3 Affinities of dogs to ancient wolves.

a) f₄-statistics of the form f₄(AndeanFox,X;wolf A,wolf B), quantifying for all individuals X whether they share more drift with wolf A or wolf B. The ages of A and B are indicated with dashed lines, with positive values indicating affinity to the upper individual and negative values indicating affinity to the lower individual. Bars denote ±3 standard errors estimated from a block jackknife.

Extended Data Fig. 4 A schematic model of how deep population structure could explain why dogs require ancestry from an outgroup population in qpAdm analyses.

Under this model, there is deep population structure between different wolf populations, including the wolf population that becomes the progenitor of dogs. High rates of gene flow over time largely homogenise the ancestry of all populations, but they do not completely erase the deep structure. If the true dog progenitor population is not sampled, a single-source qpAdm model involving one of the sampled wolf populations will not fit dog ancestry, because dogs do not share all of the genetic drift that has occurred in the history of the sampled population. But if an outgroup population is included as a source in qpAdm, this can account for the ‘missing’ deep ancestry in dogs, and therefore result in a model that fits dog ancestry.

Extended Data Fig. 5 Projecting dogs onto present-day wolf population structure.

Principal components analyses performed only on modern wolves, with modern dogs projected.

Extended Data Fig. 6 "Ocean plot" searching for the best available wolf match for the ancestry of eastern dogs.

With the Siberian Zhokhov dog (9.5k BP) as the target, each candidate wolf X is added in turn into the rotating qpAdm analysis. When X is not part of the sources, it is placed in the reference list. Models placed within the grey space labelled “Failed” have p-values that fall below the lower limit of the plot.

Extended Data Fig. 7 "Ocean plot" searching for the best available wolf match for the west Eurasian wolf-related ancestry in western dogs.

With the African Basenji dog as a target, all available post-LGM and present-day wolf genomes X are tested as sources combined with the 9.5k-year-old Siberian Zhokhov dog, which is assumed to represent a baseline for the Eastern-related dog progenitor ancestry. When X is not part of the sources, it is placed in the reference list. If a target has a model with p > 0.01, models with a larger number of sources are not plotted. Only four individuals achieve good fits in the two-source model (Zhokhov + X): WolfSyria, Wolf07Israel, Wolf20Iran and Wolf19India. For other individuals, including ancient and present-day European wolves, the two-source model can be rejected, and a three-source model with an unsampled ancestry component (Zhokhov + X + unsampled) is needed to fit the data.
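The plotting rule described in this caption, where the simplest model passing p > 0.01 is shown and models with more sources are then omitted, can be sketched as a small selection function. This is one reading of the rule, applied per candidate wolf X; the candidate names and p-values below are placeholders for illustration, not results from the paper, and no qpAdm fitting is performed here.

```python
def smallest_passing_model(p_values_by_size, threshold=0.01):
    """Return (n_sources, p) for the simplest model with p > threshold, else None."""
    for n_sources in sorted(p_values_by_size):
        p = p_values_by_size[n_sources]
        if p > threshold:
            return n_sources, p
    return None

# Placeholder p-values keyed by number of sources (hypothetical, not real results).
candidates = {
    "wolf_A": {2: 0.20, 3: 0.65},   # two-source model already fits
    "wolf_B": {2: 1e-6, 3: 0.30},   # needs a third source
    "wolf_C": {2: 1e-9, 3: 1e-4},   # no model passes the threshold
}
plotted = {name: smallest_passing_model(p) for name, p in candidates.items()}
```

Under this reading, a candidate such as the hypothetical wolf_A would be drawn only at its two-source model, even though its three-source model also passes.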

Extended Data Table 1 Selection peaks

Extended Data Table 2 qpWave tests of dog cladality

Supplementary information

Supplementary Information

This file contains the following five sections: section 1: mitochondrial phylogeny and dating; section 2: wolf population history analyses; section 3: MSMC2 analyses of Nₑ history and divergence times; section 4: natural selection analyses; section 5: dog ancestry analyses. The file also includes Supplementary Figs. 1–22, Supplementary Tables 1–3 and additional references.

Reporting Summary

Peer Review File

Supplementary Data 1–3

This Excel file contains the following three sections: Supplementary Data 1: metadata for ancient wolf genomes; Supplementary Data 2: metadata and sources of previously published genomic data; Supplementary Data 3: detailed results from the natural selection scan.

Supplementary Data 4

Newly reported mitochondrial consensus sequences in fasta format.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Bergström, A., Stanton, D.W.G., Taron, U.H. et al. Grey wolf genomic history reveals a dual ancestry of dogs. Nature 607, 313–320 (2022). https://doi.org/10.1038/s41586-022-04824-9

Received: 27 August 2021
Accepted: 28 April 2022
Published: 29 June 2022
Issue Date: 14 July 2022
DOI: https://doi.org/10.1038/s41586-022-04824-9
