Letter

Genomic analyses inform on migration events during the peopling of Eurasia

Nature volume 538, pages 238–242 (13 October 2016)


High-coverage whole-genome sequence studies have so far focused on a limited number1 of geographically restricted populations2,3,4,5, or been targeted at specific diseases, such as cancer6. Nevertheless, the availability of high-resolution genomic data has led to the development of new methodologies for inferring population history7,8,9 and refuelled the debate on the mutation rate in humans10. Here we present the Estonian Biocentre Human Genome Diversity Panel (EGDP), a dataset of 483 high-coverage human genomes from 148 populations worldwide, including 379 new genomes from 125 populations, which we group into diversity and selection sets. We analyse this dataset to refine estimates of continent-wide patterns of heterozygosity, long- and short-distance gene flow, archaic admixture, and changes in effective population size through time as well as for signals of positive or balancing selection. We find a genetic signature in present-day Papuans that suggests that at least 2% of their genome originates from an early and largely extinct expansion of anatomically modern humans (AMHs) out of Africa. Together with evidence from the western Asian fossil record11, and admixture between AMHs and Neanderthals predating the main Eurasian expansion12, our results contribute to the mounting evidence for the presence of AMHs out of Africa earlier than 75,000 years ago.


The paths taken by AMHs out of Africa (OoA) have been the subject of considerable debate over the past two decades. Fossil and archaeological evidence13,14, and craniometric studies15 of African and Asian populations, demonstrate that Homo sapiens was present outside of Africa ~120–70 thousand years ago (kya)11. However, this colonization has been viewed as a failed expansion OoA16 since genetic analyses of living populations have been consistent with a single OoA followed by serial founder events17.

Ancient DNA (aDNA) sequencing studies have found support for admixture between early Eurasians and at least two archaic human lineages18,19, and suggest modern humans reached Eurasia at around 100 kya12. In addition, aDNA from modern humans suggests population structuring and turnover, but little additional archaic admixture, in Eurasia over the last 35–45 thousand years20,21,22. Overall, these findings indicate that the majority of human genetic diversity outside Africa derives from a single dispersal event that was followed by admixture with archaic humans18,23.

We used ADMIXTURE to analyse the genetic structure in our diversity set (Extended Data Figs 1, 2; Supplementary Information 1.1–7). We further compared the individual-level haplotype similarity of our samples using fineSTRUCTURE (Extended Data Fig. 3). Despite small sample sizes, we inferred 106 genetically distinct populations forming 12 major regional clusters, corresponding well to the 148 self-identified population labels. This clustering forms the basis for the groupings used in the scans of natural selection. Similar genetic affinities are highlighted by plotting the outgroup f3 statistic9 in the form f3(X, Y; Yoruba), which here measures shared drift between a non-African population X and any modern or ancient population Y from Yoruba as an African outgroup (Supplementary Information 2.2.6, Extended Data Fig. 4).
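The outgroup f3 statistic described above can be illustrated with a minimal sketch. It assumes per-SNP allele frequencies are already available as arrays and omits the finite-sample bias correction used in practice; the function name and toy frequencies are hypothetical and not taken from the study.

```python
import numpy as np

def outgroup_f3(freq_x, freq_y, freq_out):
    """Mean over SNPs of (x - o) * (y - o).

    Larger values mean X and Y share more drift since splitting from the
    outgroup. No finite-sample bias correction is applied (simplification).
    """
    x = np.asarray(freq_x, dtype=float)
    y = np.asarray(freq_y, dtype=float)
    o = np.asarray(freq_out, dtype=float)
    return float(np.mean((x - o) * (y - o)))

# Toy data (hypothetical): X and Y drifted together away from the outgroup,
# Z did not drift at all.
out = np.array([0.10, 0.50, 0.80, 0.30])
shared_drift = np.array([0.20, -0.10, 0.10, 0.30])
pop_x = out + shared_drift
pop_y = out + shared_drift
pop_z = out.copy()

f3_xy = outgroup_f3(pop_x, pop_y, out)
f3_xz = outgroup_f3(pop_x, pop_z, out)
```

In this toy case `f3_xy` exceeds `f3_xz`, reflecting the drift that X and Y share relative to the outgroup.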

Our sampling allowed us to consider geographic features correlated with gene flow by spatially interpolating genetic similarity measures between pairs of populations (Supplementary Information 2.2.2). We considered several measures and report gradients of allele frequencies in Fig. 1, which was compared to gene flow patterns from EEMS24 as a validation (Extended Data Fig. 5). Controlling for pairwise geographic distance, we find a correlation between these genetic gradients and geographic and climatic features such as precipitation and elevation (inset of Fig. 1, Supplementary Information 2.2.2).

Figure 1: Genetic barriers across space.

Spatial visualization of genetic barriers inferred from genome-wide genetic distances, quantified as the magnitude of the gradient of spatially interpolated allele frequencies (value denoted by colour bar; grey areas have been land during the last glacial maximum but are currently underwater). Here we used a spatial kernel smoothing method based on the matrix of pairwise average heterozygosity and a MATLAB script that plots the hexagons of the grid with a colour coding to represent gradients. Inset, partial correlation between magnitude of genetic gradients and combinations of different geographic factors, elevation (E), temperature (T) and precipitation (R), for genetic gradients from fineSTRUCTURE (red) and allele frequencies (blue). This analysis (Supplementary Information 2.2.2 for details) shows that genetic differences within this region display some correlation with physical barriers such as mountain ranges, deserts, forests, and open water (such as the Wallace line).

We screened for evidence of selection by first focusing on loci that showed the highest allelic differentiation among groups (Supplementary Information 3). We then performed positive and purifying selection scans (Methods), and found some candidate loci that replicate previously known and functionally supported findings (Supplementary Table 1:3.3.4-I, Supplementary Information 3.1, Extended Data Fig. 6; Supplementary Table 1:3.1-IV,VI). Additionally, we infer more purifying selection in Africans in genes involved in pigmentation (bootstrapping p value (bpv) for RX/Y scores < 0.05) (Extended Data Fig. 6) and immune response against viruses (bpv < 0.05), while further purifying selection was indicated on olfactory receptor genes in Asians (bpv < 0.05) (Supplementary Table 1:3.1.1-II). Our scans for ancient balancing selection found a significant enrichment (FDR < 0.01) of antigen processing/presentation, antigen binding, and MHC and membrane component genes (Supplementary Information 3.2 and 3.3, Supplementary Table 1:3.3.2-I–III). The HLA (HLA-C)-associated gene (BTNL2) was the highest-scoring candidate in 8 of 12 geographic regions for the HKA test (Supplementary Table 1:3.3.1-I). Our positive selection scans, variant-based analyses (Supplementary Information 3.2 and 3.3) and gene enrichment studies also suggest new candidate loci (Supplementary Information 3.4 and 3.5, Supplementary Table 1:3.5-I–VI), a subset of which is highlighted in Supplementary Table 1:3-I.

Using fineSTRUCTURE, we find in the genomes of Papuans and Philippine Negritos more short haplotypes assigned as African than seen in genomes for individuals from other non-African populations (Extended Data Fig. 7). This pattern remains after correcting for potential confounders such as phasing errors and sampling bias (Supplementary Information 2.2.1). These shorter shared haplotypes would be consistent with an older population split25. Indeed, the Papuan–Yoruban median genetic split time (using multiple sequential Markovian coalescent (MSMC)) of 90 kya predates the split of all mainland Eurasian populations from Yorubans at ~75 kya (Supplementary Table 1:2.2.3-I, Extended Data Fig. 4, Fig. 2a). This result is robust to phasing artefacts (Extended Data Fig. 8, see Methods). Furthermore, the Papuan–Eurasian MSMC split time of ~40 kya is only slightly older than splits between west Eurasian and East Asian populations dated at ~30 kya (Extended Data Fig. 4). The Papuan split times from Yoruba and Eurasia are therefore incompatible with a simple bifurcating population tree model.

Figure 2: Evidence of an xOoA signature in the genomes of modern Papuans.

a, MSMC split times plot. The Yoruba–Eurasia split curve shows the mean of all Eurasian genomes against one Yoruba genome. The grey area represents top and bottom 5% of runs. We chose a Koinanbe genome as representative of the Sahul populations. b–d, Decomposition of Papuan haplotypes inferred as African by fineSTRUCTURE. b, Semi-parametric decomposition of the joint distribution of haplotype lengths and non-African derived allele rate per SNP, showing the relative proportion of haplotypes in K = 20 components of the distribution, ordered by non-African derived allele rate, relative to the overall proportion of haplotypes in each component. The four datasets produced by considering haplotypes inferred as (African/Denisova) in (Europeans/Papuans) are shown with our inferred ‘extra Out-of-Africa’ (xOoA) component. AFR, African; DEN, Denisova; PNG, Papuans; EUR, Europeans. c, The properties of the components in terms of non-African derived allele rate, on which the components are ordered, and length. d, The reconstruction of haplotypes inferred as African in the genomes of Papuan individuals, using a mixture of all other data (red) and with the addition of the xOoA signature (black).

At least two main models could explain our estimates of older divergence dates for Sahul populations from Africa than mainland Eurasians in our sample: 1) admixture in Sahul with a potentially un-sampled archaic human population that split from modern humans either before or at the same time as did Denisova and Neanderthal; or 2) admixture in Sahul with a modern human population (extinct OoA line; xOoA) that left Africa after the split between modern humans and Neanderthals, but before the main expansion of modern humans in Eurasia (main OoA).

We consider support for these two non-mutually exclusive scenarios. Because the introgressing lineage has not been observed with aDNA, standard methods are limited in their ability to distinguish between these hypotheses. Furthermore, we show (Supplementary Information 2.2.7) that single-site statistics, such as Patterson’s D9,18 and sharing of non-African Alleles (nAAs), are inherently affected by confounding effects owing to archaic introgression in non-African populations23. Our approach therefore relies on multiple lines of evidence using haplotype-based MSMC and fineSTRUCTURE comparisons (which we show should have power at this timescale26; Supplementary Information 2.2.13).

We located and masked putatively introgressed27 Denisova haplotypes from the genomes of Papuans, and evaluated phasing errors by symmetrically phasing Papuan and Eurasian genomes (Methods). Neither modification (Fig. 2a, Supplementary Information 2.2.9, Supplementary Table 1:2.2.9-I) changed the estimated split time (based on MSMC) between Africans and Papuans (Methods, Supplementary Information 2.2.8, Extended Data Fig. 8, Supplementary Table 1.2.8-I). MSMC dates behave approximately linearly under admixture (Extended Data Fig. 8), implying that the hypothesized lineage may have split from most Africans around 120 kya (Supplementary Information 2.2.4 and 2.2.8).

We compared the effect on the MSMC split times of an xOoA or a Denisova lineage in Papuans by extensive coalescent simulations (Supplementary Information 2.2.8). We could not simulate the large Papuan–African and Papuan–Eurasian split times inferred from the data, unless assuming an implausibly large contribution from a Denisova-like population. Furthermore, while the observed shift in the African–Papuan MSMC split curve can be qualitatively reproduced when including a 4% genomic component that diverged 120 kya from the main human lineage within Papuans, a similar quantity of Denisova admixture does not produce any significant effect (Extended Data Fig. 8). This favours a small presence of xOoA lineages rather than Denisova admixture alone as the likely cause of the observed deep African–Papuan split. We also show (Methods) that such a scenario is compatible with the observed mitochondrial DNA and Y chromosome lineages in Oceania, as also previously argued13,28.

We further tested our hypothesized xOoA model by analysing haplotypes in the genomes of Papuans that show African ancestry not found in other Eurasian populations. We re-ran fineSTRUCTURE adding the Denisova, Altai Neanderthal and the Human Ancestral Genome sequences29 to a subset of the diversity set. FineSTRUCTURE infers haplotypes that have a most recent common ancestor (MRCA) with another individual. Papuan haplotypes assigned as African had, regardless, an elevated level of non-African derived alleles (that is, nAAs fixed ancestral in Africans) compared to such haplotypes in Eurasians. They therefore have an older mean coalescence time with our African samples.

Owing to the deep divergence between the sampled Denisova and the one introgressed into modern humans, it is possible that some archaic haplotypes have a MRCA with an African instead of Denisova and are assigned as ‘African’. We can resolve the coalescence time, and hence origin, of these haplotypes by their sequence similarity with modern Africans. To account for the archaic introgression we modelled these genomic segments as a mixture of haplotypes assigned a) as African or b) as Denisova in Eurasians and c) haplotypes assigned as Denisova in Papuans. These haplotypes are modelled (see Methods, Extended Data Fig. 9) in terms of the distribution of length and mutation rate measured as a density of non-African derived alleles. Since Eurasians (specifically Europeans) have not experienced Denisova admixture, this approach disentangles lineages that coalesce before the human/Denisova split from those that coalesce after.

We found that the xOoA signature (Fig. 2b–d; Supplementary Information 2.2.10) was necessary to account for the number of short haplotypes with ‘moderate’ nAAs density in the data (that is, proportion of non-African-derived sites higher than that of Eurasian haplotypes assigned as African but significantly lower than that of those assigned Denisova in either Eurasians or Papuans). Consistent with our MSMC findings (Supplementary Information 2.2.4), xOoA haplotypes have an estimated MRCA 1.5 times older than the Eurasian haplotypes in Papuan genomes, while the Denisovan haplotypes in Papuans are four times older than the Eurasian haplotypes. Adding up the contributions across the genome (Methods) leads to a genome-wide estimate of 1.9% xOoA (95% confidence interval 1.5–3.3) in Papuans, which we view as a lower bound.

Our results consistently point towards a contribution from a modern human source for derived29 alleles that are found in the genome sequence of Papuans but not in Africans. Possible confounders could involve a shorter generation time in Papuan and Philippine Negrito populations30, different recombination processes, or alternative demographic histories that have not been investigated here. We therefore strongly encourage the development of new model-based approaches that can investigate further the haplotype patterns described here.

In conclusion, our results suggest that while the genomes of modern Papuans derive primarily from the main expansion of modern humans out of Africa, we estimate that at least 2% of their genome sequence reflects an earlier, otherwise extinct, dispersal (Extended Data Fig. 10).

The inferred date of the xOoA split time (~120 kya) is consistent with fossil and archaeological evidence for an early expansion of H. sapiens from Africa13,14. Furthermore, the recently identified modern human admixture into the Altai Neanderthal before 100 kya12 is consistent with a modern human presence outside Africa well before the main OoA split time (~75 kya). Further studies will confirm whether the Papuan genetic signature reported here and the one observed in Altai Neanderthals reflect the same xOoA human group, as well as clarify the timing and route followed during such an early expansion. The high similarity between Papuans and the Altai Neanderthal reported in Extended Data Fig. 1 may indeed reflect a shared xOoA component. Further studies are needed to explore this model and suggest that understanding human evolutionary history will require the recovery of aDNA from additional fossils, and further archaeological investigations in under-explored geographical regions.


No statistical methods were used to predetermine sample size. The experiments were not randomized and the investigators were not blinded to allocation during experiments and outcome assessment.

Data preparation

We analyse a set of genomes sequenced by the same technology (Complete Genomics Inc.), which results in minimal platform differences between batches of samples analysed by slight modifications of the CG proprietary pipeline (Extended Data Fig. 2; Supplementary Information 1.6). Informed consent forms and REC approvals were obtained for all samples newly collected for this study. We see good concordance between CG sequence and Illumina genotyping array results for the same samples with minor reference bias in the latter data (Extended Data Fig. 2; Supplementary Information 1.6). In the final dataset, we retained only one second-degree (Australians, to make use of all the available samples) and five third-degree relative pairs (Supplementary Table 1:1.7-I). All genomes were annotated against the Ensembl GRCh37 database and compared to dbSNP Human Build 141 and Phase 1 of the 1000 Genomes Project dataset29 (Supplementary Information 1.1–1.6). We found 10,212,117 new SNPs, 401,911 of which were exonic. As expected from our sampling scheme, existing lists of variable sites have been extended mostly by the Siberian, Southeast Asian and South Asian genomes, which contribute 89,836 (22.4%), 63,964 (15.9%) and 40,758 (10.1%) of the new exonic variants detected in this study.

Compared to the genome-wide average, we see fewer heterozygous sites on chromosomes 1 and 2, and an excess on chromosomes 16, 19 and 21 (Extended Data Fig. 2). This pattern is independent of simple potential confounders, such as rough estimates of recombination activity and gene density (Supplementary Information 1.8), and mirrors the inter-chromosomal differences in divergence from chimpanzee31, suggesting large-scale differences in mutation rates among chromosomes. We confirmed this general pattern using 1000 Genomes Project data (Supplementary Information 1.8).

The ‘ancient genome diversity panel’ consisted of 106 samples from the main Diversity panel along with Altai Neanderthal, Denisova and the Modern Human reference genome. Sites that are heterozygous in archaic humans were removed.

Geographic gradient analyses

We used a Gaussian kernel smoothing (based on the shortest distance on land to each sample) to interpolate genetic patterns across space. Averaging over all markers, we obtained an expression for the mean square gradient of allele frequencies in terms of the matrix of genetic distance between pairs of samples (Supplementary Information 2.2.2). This provides a simple way to identify spatial regions that contribute strongly to genetic differences between samples, and can be used, in principle, for any measure of genetic difference (for fineSTRUCTURE data, we used negative shared haplotype length as a measure of differentiation).
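The interpolation step can be sketched as follows. This is a simplified illustration, not the study's MATLAB implementation: it assumes plain Euclidean distance in place of the shortest distance on land, a regular rectangular grid in place of hexagons, and numerical differentiation of the smoothed surface rather than the closed-form expression in terms of pairwise genetic distances. All function names, coordinates and the bandwidth are hypothetical.

```python
import numpy as np

def smooth_frequency(grid_xy, sample_xy, sample_freq, bandwidth):
    """Gaussian-kernel weighted average of sample allele frequencies
    at each grid point (Euclidean distance is an assumption here)."""
    grid_xy = np.asarray(grid_xy, dtype=float)
    sample_xy = np.asarray(sample_xy, dtype=float)
    d2 = ((grid_xy[:, None, :] - sample_xy[None, :, :]) ** 2).sum(axis=2)
    weights = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return (weights @ np.asarray(sample_freq, dtype=float)) / weights.sum(axis=1)

def mean_square_gradient(sample_xy, freqs, xs, ys, bandwidth):
    """Mean over markers of the squared gradient of the smoothed surface.

    freqs has shape (n_samples, n_markers); the result has shape
    (len(ys), len(xs)).
    """
    freqs = np.asarray(freqs, dtype=float)
    gx, gy = np.meshgrid(xs, ys)
    grid = np.column_stack([gx.ravel(), gy.ravel()])
    total = np.zeros(gx.shape)
    for m in range(freqs.shape[1]):
        surface = smooth_frequency(grid, sample_xy, freqs[:, m], bandwidth)
        surface = surface.reshape(gx.shape)
        d_dy, d_dx = np.gradient(surface, ys, xs)  # axis 0 is y, axis 1 is x
        total += d_dx ** 2 + d_dy ** 2
    return total / freqs.shape[1]

# Toy example: two groups of samples separated by a gap around x = 5.
sample_xy = np.array([[0, 0], [1, 0], [2, 0], [8, 0], [9, 0], [10, 0]], dtype=float)
freqs = np.array([[0.1], [0.1], [0.1], [0.9], [0.9], [0.9]])
xs = np.linspace(0.0, 10.0, 21)
ys = np.array([-1.0, 0.0, 1.0])
barrier = mean_square_gradient(sample_xy, freqs, xs, ys, bandwidth=1.0)
```

In the toy example the largest values of `barrier` fall midway between the two sample groups, which is where a genetic barrier would be drawn.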

To quantify the link between the magnitude of genetic gradients (from fineSTRUCTURE and allele frequency data) and geographic factors, we fitted a generalized linear model to the sum of genetic magnitude gradients on the shortest paths between samples to elevation, minimum quarterly temperature, and annual precipitation summed in the same way, controlling for path length and spatial random effects (Supplementary Information 2.2.2), and calculated partial correlations between genetic gradient magnitudes and geographic factors.
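A partial correlation of this kind can be illustrated with ordinary least-squares residuals. This is a deliberately simpler stand-in for the generalized linear model with spatial random effects used in the study; the variable names and simulated values below are hypothetical.

```python
import numpy as np

def partial_correlation(y, x, covariates):
    """Correlation between y and x after regressing both on the
    covariates (with an intercept) by ordinary least squares."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    z = np.column_stack([np.ones(len(y)), np.asarray(covariates, dtype=float)])
    res_y = y - z @ np.linalg.lstsq(z, y, rcond=None)[0]
    res_x = x - z @ np.linalg.lstsq(z, x, rcond=None)[0]
    return float(np.corrcoef(res_y, res_x)[0, 1])

# Simulated paths (hypothetical): summed elevation and summed genetic
# gradient both grow with path length but are otherwise unrelated.
rng = np.random.default_rng(0)
path_length = rng.uniform(1.0, 10.0, size=500)
summed_elevation = 2.0 * path_length + rng.normal(size=500)
summed_gradient = 3.0 * path_length + rng.normal(size=500)

raw = float(np.corrcoef(summed_gradient, summed_elevation)[0, 1])
partial = partial_correlation(summed_gradient, summed_elevation, path_length)
```

Here the raw correlation is high only because both quantities accumulate along longer paths; once path length is controlled for, the partial correlation is close to zero.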

FineSTRUCTURE analysis

FineSTRUCTURE32 was run as described in Supplementary Information 2.2.1. Within the 106 genetically distinct groups, labels were typically genetically homogeneous: 113 of the 148 population labels (76%) were assigned to only one ‘genetic cluster’. Similarly, genetic clusters were typically specific to a label, with 66 of the 106 ‘genetic clusters’ (62%) containing only one population label.
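The two concordance counts reported above can be computed from per-individual assignments as in the following sketch. The input format (one population label and one cluster identifier per individual) and the toy data are assumptions for illustration.

```python
from collections import defaultdict

def single_assignment_counts(labels, clusters):
    """Count labels that fall in exactly one cluster, and clusters that
    contain exactly one label. Returns
    (labels_in_one_cluster, n_labels, clusters_with_one_label, n_clusters)."""
    clusters_per_label = defaultdict(set)
    labels_per_cluster = defaultdict(set)
    for label, cluster in zip(labels, clusters):
        clusters_per_label[label].add(cluster)
        labels_per_cluster[cluster].add(label)
    labels_in_one = sum(len(c) == 1 for c in clusters_per_label.values())
    clusters_with_one = sum(len(l) == 1 for l in labels_per_cluster.values())
    return (labels_in_one, len(clusters_per_label),
            clusters_with_one, len(labels_per_cluster))

# Toy assignments (hypothetical): label B is split across clusters 2 and 3,
# and cluster 3 mixes labels B and C.
labels = ["A", "A", "B", "B", "C", "D"]
clusters = [1, 1, 2, 3, 3, 4]
counts = single_assignment_counts(labels, clusters)
```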

Correction for phasing errors

To check whether phasing errors could produce the shorter Papuan haplotypes, we focused on regions of the genome that had an extended (>500 kb) run of homozygosity. We ran ChromoPainter for each individual on only these regions, meaning each individual was only painted where it had been perfectly phased. This did not change the qualitative features (Supplementary Information 2.2.1).
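A simple way to locate such regions is to scan the genotype calls for long stretches without heterozygous sites. The sketch below assumes sorted site positions with a heterozygosity flag per site and measures a run from its first to its last homozygous site; it ignores missing data and genotyping error, which a production caller would need to handle.

```python
def homozygous_runs(positions, is_het, min_length=500_000):
    """Return (start, end) for each maximal stretch of consecutive
    homozygous sites spanning more than min_length base pairs."""
    runs = []
    start = None
    last = None
    for pos, het in zip(positions, is_het):
        if het:
            # A heterozygous site closes the current run, if any.
            if start is not None and last - start > min_length:
                runs.append((start, last))
            start = None
        else:
            if start is None:
                start = pos
            last = pos
    if start is not None and last - start > min_length:
        runs.append((start, last))
    return runs

# Toy chromosome (hypothetical positions): one 600 kb run, then a
# heterozygous site, then a 300 kb run that is too short to keep.
positions = [0, 100_000, 300_000, 600_000, 650_000, 700_000, 1_000_000]
is_het = [False, False, False, False, True, False, False]
runs = homozygous_runs(positions, is_het)
```

Within the returned intervals the two haplotypes are identical, so phasing cannot introduce switch errors there.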

Removal of similar samples

Papuans are genetically distinct from other populations due to tens of thousands of years of isolation. We wanted to check whether the length of haplotypes assigned as African was biased by the inclusion of a large number of relatively homogeneous Eurasians with few Papuans. To do this we repeated the n = 447 painting allowing only donors from dissimilar populations, including only individuals who donated < 2% of a genome in the main painting. This did not change the qualitative haplotype length features (Supplementary Information 2.2.1).

Inclusion of ancient samples

We ran our smaller individual panel with (n = 109) and without (n = 106) ancient samples (Denisova, Neanderthal and ancestral human). This did not change the qualitative haplotype length features (Supplementary Information 2.2.1).

Selection analyses

We investigated balancing, positive and purifying selection for a part of the dataset with larger group sizes which was defined as the Selection subset (Supplementary Table 1:3.1-I and 3.2-I) using a wide range of window-based as well as variant-based approaches. Furthermore, we investigated how these signals relate to shared demographic history. Where possible we contextualized our findings by integrating them with information from various functional databases. Detailed descriptions of all methods used are available in Supplementary Information section 3.

MSMC, Denisova masking, simulations of alternative scenarios and assessment of phasing robustness

Genetic split times were initially calculated following the standard MSMC procedure8, and subsequently modified as follows. To estimate the effect of archaic admixture, putative Denisova haplotypes were identified in Papuans using a previously published method27 and masked from all the analysed genomes. Particularly, whether a putative archaic haplotype was found in heterozygous or homozygous state within the chosen Papuan genome, the ‘affected’ locus was inserted into the MSMC mask files and, hence, removed from the analysis.

We note that a fraction of the Denisova and Neanderthal contributions to the Papuan genomes may be indistinguishable, owing to the shared evolutionary history of these two archaic populations. As a result, some of the removed ‘Denisova’ haplotypes may have actually entered the genome of Papuans through Neanderthal. Regardless of this, our exercise successfully shows that the MSMC split time estimates are not affected by the documented presence of an archaic genomic component (whether coming entirely from Denisova or partially shared with Neanderthal).

We further excluded the role of Denisova admixture in explaining the deeper African–Papuan MSMC split times through coalescent simulations (using ms to generate 30 chromosomes of 5 Mbp each, and simulating each scenario 30 times). These showed that the addition of 4% Denisova lineages to the Papuan genomes does not change the MSMC results, while the addition of 4% xOoA lineages recreates the qualitative shift observed in the empirical data.

Phasing artefacts were also taken into account as putative confounders of the MSMC split time estimates. We re-ran MSMC after re-phasing one Estonian, one Papuan and 20 West African and Pygmy genomes in a single experiment. This way we ruled out potential artefacts stemming from the excess of Eurasian over Sahul samples during the phasing process. Both the archaic and phasing corrections yielded the same split time as the standard MSMC runs.

Emulation of all pairwise MSMC split times

We confirmed that none of the other populations behaved as an outlier from those identified in the n = 22 full pairwise analysis by estimating the MSMC split times between all pairs. We chose 9 representative populations (including Papuan, Yoruba and Baka) from the 22, and compared each of the 447 diversity panel genomes to them. For each individual not in our panel, we obtain the positive mixture weights using the model [equation not recovered]. The parameters are estimated using the observations for which we have data using a quadratic loss function. We can then predict the unobserved values [equation not recovered]. Examination of this matrix (Supplementary Information 2.2.3, Supplementary Table 1:2.2.3-III) implies no other populations are expected to have unusual MSMC split times from Africa.

Mixture model for African haplotypes in Papuans

Obtaining haplotypes from painting. We define African or Archaic haplotypes in Eurasians or Papuans as genomic loci spanning at least 1,000 bp, and showing SNPs that were assigned by chromopainter a ≥50% chance of copying from either an African or Archaic genome, respectively. For each haplotype we then calculated the number of non-African mutations, defined as sites found in derived state in a given haplotype and in ancestral state in all of the African genomes included in the present study.
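The haplotype definition above can be sketched as a scan over per-SNP painting probabilities. The input layout (parallel lists per SNP) and all names are assumptions for illustration and do not reflect ChromoPainter's actual output format.

```python
def african_haplotypes(positions, p_african, is_derived, african_derived_count,
                       min_prob=0.5, min_span=1000):
    """Find stretches of consecutive SNPs with copy probability >= min_prob
    from an African donor that span at least min_span bp, and count the
    non-African derived alleles on each (derived on the haplotype,
    ancestral in every African genome).

    Returns a list of (start, end, n_non_african_derived)."""
    haplotypes = []
    n = len(positions)
    i = 0
    while i < n:
        if p_african[i] < min_prob:
            i += 1
            continue
        j = i
        while j + 1 < n and p_african[j + 1] >= min_prob:
            j += 1
        if positions[j] - positions[i] >= min_span:
            naa = sum(1 for k in range(i, j + 1)
                      if is_derived[k] and african_derived_count[k] == 0)
            haplotypes.append((positions[i], positions[j], naa))
        i = j + 1
    return haplotypes

# Toy painting (hypothetical values).
positions = [100, 600, 1500, 2000, 2600, 2700, 5000, 5400]
p_african = [0.9, 0.8, 0.7, 0.2, 0.6, 0.9, 0.3, 0.9]
is_derived = [True, False, True, True, True, False, True, True]
african_derived_count = [0, 0, 3, 0, 0, 0, 0, 2]
haps = african_haplotypes(positions, p_african, is_derived, african_derived_count)
```

Only the first stretch (100 to 1,500 bp) passes the length filter, and it carries one non-African derived allele; the derived allele at position 1,500 is excluded because it is also seen in Africans.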

Modelling. We used a non-parametric model for the joint distribution of length and non-African derived allele mutation rate in haplotypes. We fit K = 20 components to the joint distribution. Each component has a characteristic length, variability and mutation rate. A haplotype of a given length, carrying a given number of such mutations and drawn from a given component, has the following distribution: [equation not recovered]. This model for haplotype lengths is motivated by the extreme age of the split times we seek to model. Recent splits would lead to an exponential distribution of haplotype lengths. However, owing to haplotype fixation caused by finite population size, very old splits have finite (non-zero) haplotype lengths. Additionally, the data are left-censored since we cannot reliably detect haplotypes that are very short. We note that while this makes a single component a reasonable fit to the data, as K increases the specific choice becomes less important.

We then impose the prior [not recovered] and use the expectation-maximization algorithm to estimate the mixture proportions along with the maximum likelihood parameter estimates. We do this for the four combinations of haplotypes assigned as African (AFR) and Denisova (DEN) found in Papuans (PNG) or Europeans (EUR), in order to learn the parameters. Supplementary Information 2.2.10 describes this in more detail. We then describe the distribution of haplotypes for each class of haplotype in terms of the expected proportion of haplotypes found in each component: [equation not recovered], defined in terms of the number of haplotypes of each class. The result for a class is a vector of the proportions from each of the components.

Single-out-of-Africa model. We fit haplotypes assigned as African in Papuans as a mixture of the others in a second layer of mixture modelling: [equation not recovered], where the mixture contributions sum to 1. This is straightforward to fit.

xOoA model. We jointly estimate an additional component and the mixture contributions under the mixture [equation not recovered]. This is non-trivial to fit. We use a penalization scheme to simultaneously ensure we a) obtain a valid mixture for the additional component; b) give a prediction that is also a valid mixture; c) leave little signal in the residuals; and d) obtain a good fit. Cross-validation is used to obtain the optimal penalization parameters (A and B) with the loss function: [equation not recovered], whose terms are the residuals in each component, a penalty for a valid mixture, and a penalty for requirement c (good solutions will have similar residuals across components). The loss is minimized via standard optimization techniques. Supplementary Information 2.2.10 details how initial values are found and explores the robustness of the solution to changes in A and B; the results do not change qualitatively for reasonable choices of these parameters, and the mixtures are valid to within numerical error.

Genome-wide xOoA estimation. We used the estimated xOoA derived allele mutation rate to estimate the xOoA contribution in haplotypes classed as Eurasian or Papuan by ChromoPainter. First we obtained the required estimates using the single out-of-Africa model above, additionally allowing for a EUR.EUR contribution. We then estimate the xOoA contribution using the observed mutation rate and that predicted under the mixture model by rearranging the mixture: [equation not recovered]. Estimates less than 0 are set to 0. The genome-wide estimate is obtained by weighting each estimate by the proportion of the genome that was painted with that donor. Neanderthal and Denisova haplotypes were assumed to be proxied by PNG.DEN (0% xOoA by assumption); African haplotypes by PNG.AFR; Papuan and Australian by PNG.PNG and all other haplotypes by PNG.EUR. We obtain confidence intervals by bootstrap resampling of haplotypes for each donor/recipient pair.
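The weighting step can be sketched as below. The per-class formula is an assumption: it treats the observed derived allele rate as a two-way mixture of the rate predicted without xOoA and the xOoA rate, and solves for the xOoA share. The exact rearrangement used in the study is given in Supplementary Information 2.2.10 and may differ. All names and numbers are hypothetical, and bootstrap resampling is omitted.

```python
def xooa_fraction(observed_rate, predicted_rate, xooa_rate):
    """Assumed two-way mixture:
    observed = (1 - a) * predicted + a * xooa, solved for a.
    Negative estimates are set to 0, as described in the text."""
    a = (observed_rate - predicted_rate) / (xooa_rate - predicted_rate)
    return max(0.0, a)

def genome_wide_xooa(fractions, painted_proportions):
    """Average the per-class xOoA fractions, weighting each by the
    proportion of the genome painted with that donor class."""
    total = sum(painted_proportions)
    weighted = sum(f * p for f, p in zip(fractions, painted_proportions))
    return weighted / total

# Toy values (hypothetical rates and painting proportions).
frac_a = xooa_fraction(observed_rate=0.012, predicted_rate=0.010, xooa_rate=0.020)
frac_b = xooa_fraction(observed_rate=0.009, predicted_rate=0.010, xooa_rate=0.020)
estimate = genome_wide_xooa([frac_a, frac_b], [0.25, 0.75])
```

A class whose observed rate falls below the prediction contributes zero rather than a negative share, so the genome-wide figure cannot be pulled down by noise in classes with no xOoA signal.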

We estimate the proportion of xOoA in Papuan haplotypes assigned as both Eurasian (0.1%, 95% CI 0–2.6) and Papuan (4%, 95% CI 2.9–4.5) (Supplementary Information 2.2.10), by using the estimated mutation density in xOoA.

Y chromosome and mtDNA haplogroup analysis. The presence of an extinct xOoA trace in the genome of modern Papuans may seem at odds with analyses of mtDNA and Y chromosome phylogenies, which point to a single, recent origin for all non-African lineages (mtDNA L3, which gives rise to all mtDNA lineages outside Africa has been dated at ~70,000 years old33,34). However, uniparental markers inform on a small fraction of our genetic history, and a single origin for all non-African lineages does not exclude multiple waves OoA from a shared common ancestor. We show analytically (Supplementary Information 2.2.12) that, if the xOoA signature entered the genome of Papuan individuals > 40 kya, their mtDNA and Y lineages could have been lost by genetic drift even assuming an initial xOoA mixing component of up to 35%. Similar findings have been reported recently13.
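The study's argument is analytical, but the intuition can be illustrated with a small neutral Wright–Fisher simulation of a single uniparental lineage class. This is a swapped-in illustration, not the calculation in Supplementary Information 2.2.12; the population size, number of generations and replicate count are hypothetical choices.

```python
import numpy as np

def probability_lineage_lost(pop_size, initial_fraction, generations,
                             replicates=2000, seed=0):
    """Fraction of replicate haploid Wright-Fisher populations in which a
    lineage class starting at initial_fraction has drifted to zero copies
    after the given number of generations (no selection, constant size)."""
    rng = np.random.default_rng(seed)
    counts = np.full(replicates, round(pop_size * initial_fraction))
    for _ in range(generations):
        # Each generation resamples pop_size lineages from the previous one.
        counts = rng.binomial(pop_size, counts / pop_size)
    return float(np.mean(counts == 0))

# Hypothetical scenario: 100 breeding females, xOoA mtDNA starts at 35%.
lost = probability_lineage_lost(pop_size=100, initial_fraction=0.35,
                                generations=2000)
```

Under neutrality a lineage class starting at frequency p is eventually lost with probability 1 − p, so in this small toy population roughly 65% of replicates end with no xOoA lineages. The time needed to reach that outcome scales with population size, which is the quantity the analytical treatment has to account for.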


Primary accessions

European Nucleotide Archive

Data deposits

The newly sequenced genomes are part of the Estonian Biocentre Human Genome Diversity Panel (EGDP). They were deposited in the European Nucleotide Archive (ENA) under accession number PRJEB12437 and are also freely available through the Estonian Biocentre website (www.ebc.ee/free_data).



Support was provided by: Estonian Research Infrastructure Roadmap grant no. 3.2.0304.11-0312; Australian Research Council Discovery grants (DP110102635 and DP140101405) (D.M.L., M.W. and E.W.); Danish National Research Foundation; the Lundbeck Foundation and KU2016 (E.W.); ERC Starting Investigator grant (FP7 - 261213) (T.K.); Estonian Research Council grant PUT766 (G.C. and M.K.); EU European Regional Development Fund through the Centre of Excellence in Genomics to Estonian Biocentre (R.V., M.Me. and A.Me.), and Centre of Excellence for Genomics and Translational Medicine Project No. 2014-2020.4.01.15-0012 to EGC of UT (A.Me.) and EBC (M.Me.); Estonian Institutional Research grant IUT24-1 (L.S., M.J., A.K., B.Y., K.T., C.B.M., Le.S., H.Sa., S.L., D.M.B., E.M., R.V., G.H., M.K., G.C., T.K. and M.Me.) and IUT20-60 (A.Me.); French Ministry of Foreign and European Affairs and French ANR grant number ANR-14-CE31-0013-01 (F.-X.R.); Gates Cambridge Trust Funding (E.J.); ICG SB RAS (No. VI.58.1.1) (D.V.L.); Leverhulme Programme grant no. RP2011-R-045 (A.B.M., P.G. and M.G.T.); Ministry of Education and Science of Russia, Project 6.656.2014/K (S.A.F.); NEFREX grant funded by the European Union (People Marie Curie Actions; International Research Staff Exchange Scheme; call FP7-PEOPLE-2012-IRSES-number 318979) (M.Me., G.H. and M.K.); NIH grants 5DP1ES022577 05, 1R01DK104339-01, and 1R01GM113657-01 (S.Tis.); Russian Foundation for Basic Research (grant N 14-06-00180a) (M.G.); Russian Foundation for Basic Research grant 16-04-00890 (O.B. and E.B.); Russian Science Foundation grant 14-14-00827 (O.B.); The Russian Foundation for Basic Research (14-04-00725-a), The Russian Humanitarian Scientific Foundation (13-11-02014) and the Program of the Basic Research of the RAS Presidium “Biological diversity” (E.K.K.); Wellcome Trust and Royal Society grant WT104125AIA & the Bristol Advanced Computing Research Centre (http://www.bris.ac.uk/acrc/) (D.J.L.); Wellcome Trust grant 098051 (Q.A., C.T.-S. and Y.X.); Wellcome Trust Senior Research Fellowship grant 100719/Z/12/Z (M.G.T.); Young Explorers Grant from the National Geographic Society (8900-11) (C.A.E.); ERC Consolidator Grant 647787 ‘LocalAdaptatio’ (A.Ma.); Program of the RAS Presidium “Basic research for the development of the Russian Arctic” (B.M.); Russian Foundation for Basic Research grant 16-06-00303 (E.B.); a Rutherford Fellowship (RDF-10-MAU-001) from the Royal Society of New Zealand (M.P.C.).

Author information

Author notes

    Luca Pagani, Daniel John Lawson, Evelyn Jagoda, Alexander Mörseburg, Anders Eriksson, Richard Villems, Eske Willerslev, Toomas Kivisild & Mait Metspalu

    These authors contributed equally to this work.


  1. Estonian Biocentre, 51010 Tartu, Estonia

    Luca Pagani, Georgi Hudjashov, Lauri Saag, Mari Järve, Monika Karmin, Alena Kushniarevich, Bayazit Yunusbayev, Kristiina Tambets, Chandana Basu Mallick, Hovhannes Sahakyan, Gyaneshwer Chaubey, Sergei Litvinov, Doron M. Behar, Ene Metspalu, Richard Villems, Toomas Kivisild & Mait Metspalu
  2. Department of Archaeology and Anthropology, University of Cambridge, Cambridge CB2 1QH, UK

    Luca Pagani, Evelyn Jagoda, Alexander Mörseburg, Florian Clemente, Alexia Cardona, Sarah Kaewert, Charlotte Inchley, Christiana L. Scheib, Florin Mircea Iliescu, Christina A. Eichstaedt & Toomas Kivisild
  3. Department of Biological, Geological and Environmental Sciences, University of Bologna, Via Selmi 3, 40126 Bologna, Italy

    Luca Pagani
  4. Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Bristol BS8 2BN, UK

    Daniel John Lawson
  5. Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts 02138, USA

    Evelyn Jagoda
  6. Integrative Systems Biology Lab, Division of Biological and Environmental Sciences & Engineering, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia

    Anders Eriksson
  7. Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK

    Anders Eriksson & Andrea Manica
  8. Estonian Genome Center, University of Tartu, 51010 Tartu, Estonia

    Mario Mitt, Reedik Mägi, Evelin Mihailov & Andres Metspalu
  9. Department of Biotechnology, Institute of Molecular and Cell Biology, University of Tartu, 51010 Tartu, Estonia

    Mario Mitt & Andres Metspalu
  10. Institut de Biologie Computationnelle, Université Montpellier 2, 34095 Montpellier, France

    Florian Clemente
  11. Department of Psychology, University of Auckland, Auckland 1142, New Zealand

    Georgi Hudjashov & Monika Karmin
  12. Statistics and Bioinformatics Group, Institute of Fundamental Sciences, Massey University, 4442 Palmerston North, New Zealand

    Georgi Hudjashov & Murray P. Cox
  13. Department of Biology, Pennsylvania State University, University Park, Pennsylvania 16802, USA

    Michael DeGiorgio
  14. Institute for Human Genetics, University of California, San Francisco, California 94143, USA

    Jeffrey D. Wall
  15. MRC Epidemiology Unit, University of Cambridge, Institute of Metabolic Science, Box 285, Addenbrooke’s Hospital, Hills Road, Cambridge CB2 0QQ, UK

    Alexia Cardona
  16. School of Life Sciences, Arizona State University, Tempe, Arizona 85287, USA

    Melissa A. Wilson Sayres
  17. Center for Evolution and Medicine, The Biodesign Institute, Tempe, Arizona 85287, USA

    Melissa A. Wilson Sayres
  18. Department of Evolutionary Biology, Institute of Molecular and Cell Biology, University of Tartu, 51010 Tartu, Estonia

    Monika Karmin, Lehti Saag, Ene Metspalu & Richard Villems
  19. Mathematical Sciences, University of Southampton, Southampton SO17 1BJ, UK

    Guy S. Jacobs
  20. Institute for Complex Systems Simulation, University of Southampton, Southampton SO17 1BJ, UK

    Guy S. Jacobs
  21. Division of Biological Sciences, University of Montana, Missoula, Montana 59812, USA

    Tiago Antao
  22. Institute of Genetics and Cytology, National Academy of Sciences, BY-220072 Minsk, Belarus

    Alena Kushniarevich
  23. The Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK

    Qasim Ayub, Chris Tyler-Smith & Yali Xue
  24. Institute of Biochemistry and Genetics, Ufa Scientific Center of RAS, 450054 Ufa, Russia

    Bayazit Yunusbayev, Alexandra Karunas, Sergei Litvinov, Rita Khusainova, Vita Akhmetova, Irina Khidiyatova & Elza K. Khusnutdinova
  25. Kuban State Medical University, 350040 Krasnodar, Russia

    Elvira Pocheshkhova
  26. Scientific Research Center of the Caucasian Ethnic Groups, St. Andrews Georgian University, 0162 Tbilisi, Georgia

    George Andriadze
  27. Center for GeoGenetics, University of Copenhagen, 1350 Copenhagen, Denmark

    Craig Muller, Rasmus Nielsen & Eske Willerslev
  28. Research Centre for Human Evolution, Environmental Futures Research Institute, Griffith University, Nathan, Queensland 4111, Australia

    Michael C. Westaway & David M. Lambert
  29. Center of Molecular Diagnosis and Genetic Research, University Hospital of Obstetrics and Gynecology, 1000 Tirana, Albania

    Grigor Zoraqi
  30. Center of High Technology, Academy of Sciences, 100047 Tashkent, Uzbekistan

    Shahlo Turdikulova
  31. Institute of Bioorganic Chemistry Academy of Science, 100047 Tashkent, Uzbekistan

    Dilbar Dalimova
  32. L.N. Gumilyov Eurasian National University, 010008 Astana, Kazakhstan

    Zhaxylyk Sabitov
  33. Centre for Advanced Research in Sciences (CARS), DNA Sequencing Research Laboratory, University of Dhaka, Dhaka-1000, Bangladesh

    Gazi Nurun Nahar Sultana
  34. Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6145, USA

    Joseph Lachance
  35. School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, USA

    Joseph Lachance
  36. Departments of Genetics and Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6313, USA

    Sarah Tishkoff
  37. DNcode laboratories, 117623 Moscow, Russia

    Kuvat Momynaliev
  38. Institute of Molecular Biology and Medicine, 720040 Bishkek, Kyrgyzstan

    Jainagul Isakova
  39. Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, 630090 Novosibirsk, Russia

    Larisa D. Damba, Marina Gubina, Daria V. Lichman, Mikhail Voevoda & Ludmila P. Osipova
  40. Mongolian Academy of Medical Sciences, 210620 Ulaanbaatar, Mongolia

    Pagbajabyn Nymadawa
  41. Northern State Medical University, 163000 Arkhangelsk, Russia

    Irina Evseeva
  42. Anthony Nolan, The Royal Free Hospital, Pond Street, London NW3 2QG, UK

    Irina Evseeva
  43. V. N. Karazin Kharkiv National University, 61022 Kharkiv, Ukraine

    Lubov Atramentova & Olga Utevska
  44. Evolutionary Medicine group, Laboratoire d’Anthropologie Moléculaire et Imagerie de Synthèse, UMR 5288, Centre National de la Recherche Scientifique, Université de Toulouse 3, Toulouse 31073, France

    François-Xavier Ricaut, Nicolas Brucato & Thierry Letellier
  45. Genome Diversity and Diseases Laboratory, Eijkman Institute for Molecular Biology, 10430 Jakarta, Indonesia

    Herawati Sudoyo
  46. Department of Molecular Genetics, Yakut Scientific Centre of Complex Medical Problems, 677027 Yakutsk, Russia

    Nikolay A. Barashkov & Sardana A. Fedorova
  47. Laboratory of Molecular Biology, Institute of Natural Sciences, M.K. Ammosov North-Eastern Federal University, 677027 Yakutsk, Russia

    Nikolay A. Barashkov & Sardana A. Fedorova
  48. Genos DNA laboratory, 10000 Zagreb, Croatia

    Vedrana Škaro
  49. University of Osijek, Medical School, 31000 Osijek, Croatia

    Vedrana Škaro & Dragan Primorac
  50. Center for Genomics and Transcriptomics, CeGaT, GmbH, D-72076 Tübingen, Germany

    Lejla Mulahasanović
  51. St. Catherine Specialty Hospital, 49210 Zabok and 10000 Zagreb, Croatia

    Dragan Primorac
  52. Eberly College of Science, The Pennsylvania State University, University Park, Pennsylvania 16802, USA

    Dragan Primorac
  53. University of Split, Medical School, 21000 Split, Croatia

    Dragan Primorac
  54. Laboratory of Ethnogenomics, Institute of Molecular Biology, National Academy of Sciences, Republic of Armenia, 7 Hasratyan Street, 0014 Yerevan, Armenia

    Hovhannes Sahakyan & Levon Yepiskoposyan
  55. Department of Applied Social Sciences, University of Winchester, Sparkford Road, Winchester SO22 4NR, UK

    Maru Mormina
  56. Thoraxklinik Heidelberg, University Hospital Heidelberg, 69120 Heidelberg, Germany

    Christina A. Eichstaedt
  57. Novosibirsk State University, 630090 Novosibirsk, Russia

    Daria V. Lichman, Mikhail Voevoda & Ludmila P. Osipova
  58. RIPAS Hospital, Bandar Seri Begawan, BE1518 Brunei

    Syafiq Abdullah
  59. National Cancer Centre Singapore, 169610 Singapore

    Joseph T. S. Wee
  60. Department of Genetics and Fundamental Medicine, Bashkir State University, 450000 Ufa, Russia

    Alexandra Karunas, Sergei Litvinov, Rita Khusainova, Natalya Ekomasova, Irina Khidiyatova & Elza K. Khusnutdinova
  61. Department of Genetics and Bioengineering, Faculty of Engineering and Information Technologies, International Burch University, 71000 Sarajevo, Bosnia and Herzegovina

    Damir Marjanović
  62. Institute for Anthropological Researches, 10000 Zagreb, Croatia

    Damir Marjanović
  63. Research Centre for Medical Genetics, Russian Academy of Sciences, Moscow 115478, Russia

    Elena Balanovska & Oleg Balanovsky
  64. Genetics Laboratory, Institute of Biological Problems of the North, Russian Academy of Sciences, 685000 Magadan, Russia

    Miroslava Derenko & Boris Malyarchuk
  65. Institute of Internal Medicine, Siberian Branch of Russian Academy of Medical Sciences, 630009 Novosibirsk, Russia

    Mikhail Voevoda
  66. Leverhulme Centre for Human Evolutionary Studies, Department of Archaeology and Anthropology, University of Cambridge, Cambridge CB2 1QH, UK

    Marta Mirazón Lahr
  67. Research Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, UK

    Pascale Gerbault & Mark G. Thomas
  68. Department of Archaeology, University of Papua New Guinea, University PO Box 320, 134 NCD, Papua New Guinea

    Matthew Leavesley
  69. College of Arts, Society and Education, James Cook University, PO Box 6811, Cairns, Queensland 4870, Australia

    Matthew Leavesley
  70. Department of Anthropology, University College London, London WC1H 0BW, UK

    Andrea Bamberg Migliano
  71. Max Planck Institute for the Science of Human History, Kahlaische Strasse 10, D-07743 Jena, Germany

    Michael Petraglia
  72. Vavilov Institute for General Genetics, Russian Academy of Sciences, 119333 Moscow, Russia

    Oleg Balanovsky
  73. Department of Integrative Biology, University of California Berkeley, Berkeley 94720, California, USA

    Rasmus Nielsen
  74. Estonian Academy of Sciences, 6 Kohtu Street, Tallinn 10130, Estonia

    Richard Villems



R.V., E.W., T.K. and M.Me. conceived the study. A.K., K.T., C.B.M., Le.S., E.P., G.A., C.M., M.W., D.L., G.Z., S.T., D.D., Z.S., G.N.N.S., K.M., J.I., L.D.D., M.G., P.N., I.E., L.At., O.U., F.-X.R., N.B., H.S., T.L., M.P.C., N.A.B., V.S., L.A., D.Pr., H.Sa., M.Mo., C.A.E., D.V.L., S.A., G.C., J.T.S.W., E.Mi., A.Ka., S.L., R.K., N.T., V.A., I.K., D.M., L.Y., D.M.B., E.B., A.Me., M.D., B.M., M.V., S.A.F., L.P.O., M.Mi., M.L., A.B.M., O.B., E.K.K., E.M., M.G.T. and E.W. conducted anthropological research and/or sample collection and management. J.L. and S.Ti. provided access to data. L.P., D.J.L., E.J., A.Mo., A.E., M.Mi., F.C., G.H., M.D., L.S., J.W., A.C., R.M., M.A.W.S., S.K., C.I., C.L.S., M.J., M.K., G.S.J., T.A., F.M.I., A.K., Q.A., C.T.-S., Y.X., B.Y., C.B.M., T.K. and M.Me. analysed data. L.P., D.J.L., E.J., A.Mo., L.S., M.K., K.T., C.B.M., Le.S., G.C., M.Mi., P.G., M.L., A.B.M., M.P., E.M., M.G.T., A.Ma., R.N., R.V., E.W., T.K. and M.Me. contributed to the interpretation of results. L.P., D.J.L., E.J., A.Mo., A.E., F.C., G.H., M.D., A.C., M.A.W.S., B.Y., J.L., S.Ti., M.Mi., P.G., M.L., A.B.M., M.P., M.G.T., A.Ma., R.N., R.V., E.W., T.K. and M.Me. wrote the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Luca Pagani or Toomas Kivisild or Mait Metspalu.

Reviewer Information Nature thanks R. Dennell and the other anonymous reviewer(s) for their contribution to the peer review of this work.

Supplementary information

PDF files

  1. Supplementary Information

    This file contains Supplementary Text and Data, Supplementary Figures, Supplementary Tables and additional references (see Contents for more details).

Excel files

  1. Supplementary Tables

    This file contains Supplementary Tables.
