Ancient gene flow from early modern humans into Eastern Neanderthals

Journal name: Nature
Volume: 530
Pages: 429–433
DOI: 10.1038/nature16544

It has been shown that Neanderthals contributed genetically to modern humans outside Africa 47,000–65,000 years ago. Here we analyse the genomes of a Neanderthal and a Denisovan from the Altai Mountains in Siberia together with the sequences of chromosome 21 of two Neanderthals from Spain and Croatia. We find that a population that diverged early from other modern humans in Africa contributed genetically to the ancestors of Neanderthals from the Altai Mountains roughly 100,000 years ago. By contrast, we do not detect such a genetic contribution in the Denisovan or the two European Neanderthals. We conclude that in addition to later interbreeding events, the ancestors of Neanderthals from the Altai Mountains and early modern humans met and interbred, possibly in the Near East, many thousands of years earlier than previously thought.

Figures

  1. Figure 1: Divergence and heterozygosity in the Altai Neanderthal and Denisovan genomes.

    a, The maximum divergence between windows in the two archaic genomes versus their minimum divergence to Africans. Error bars represent the 95% confidence intervals from 1,000 bootstrap replicates. Regions previously described as inbred in the Altai Neanderthal genome2 were excluded. b, Heterozygosity (per 1,000 bp) in windows of each archaic genome versus their minimum divergence to Africans. c, d, Simulation of a model with gene flow into the Denisovan lineage from both the Altai Neanderthal (0.65%, 50,000 years ago) and an unknown archaic hominin (1%, 200,000 years ago) that diverged from other hominins 1.5 million years ago. The constant mutation rate used makes the slope of the simulated curves less steep than in the actual genomes, where mutation rate varies among windows. e, f, Simulation of a model that also includes modern human gene flow into the Altai Neanderthal lineage (3.55%, 100,000 years ago).

  2. Figure 2: Distinguishing between two scenarios of introgression into archaic humans.

    a, The age distribution of ‘African’ haplotypes (≥50 kb) in the Altai Neanderthal and the Denisovan genomes as inferred by ARGweaver. Error bars represent the 95% credible intervals from 302 Markov chain Monte Carlo (MCMC) replicates. An ‘African’ haplotype coalesces within the African subtree before coalescing with the other archaic individual (inset), and its age is inferred as that coalescent time (arrowhead). The majority of the young ‘African’ haplotypes in the Altai Neanderthal genome are estimated to coalesce 100,000–230,000 years ago, with just a few estimated to coalesce less than 100,000 years ago (Supplementary Information section 10). b, The age distribution of ‘deep ancestral’ haplotypes (≥50 kb) in the Altai Neanderthal and Denisovan genomes. A ‘deep ancestral’ haplotype coalesces above the African subtree and the other archaic lineage (inset), and its age is inferred as that coalescent time (arrowhead). ky, thousand years.

  3. Figure 3: Refined demography of archaic and modern humans.

    a, Total migration rates of six gene flow events inferred by G-PhoCS. The ranges correspond to 95% Bayesian credible intervals aggregated across runs. Five gene flow events have been previously reported, including gene flow from an unknown archaic group into Denisovans (blue arrow). In addition, we infer gene flow from a population related to modern humans into a population ancestral to the Altai Neanderthal (red arrow). It appears to come from a population that either split from the ancestors of present-day Africans or separated fairly early in the history of African populations (shaded circle). b, Effective population sizes and divergence times inferred by G-PhoCS. The ranges correspond to 95% Bayesian credible intervals aggregated across runs. The horizontal bars indicate posterior mean estimates for divergence times. Archaic samples (dots) are located at their estimated ages.

  4. Figure 4: Homozygous segments on chromosome 21.

    The range of the cumulative length (Mb) of homozygous segments is shown as the surface of a polygon, with individuals at the extremes of each group’s range serving as vertices. Dots represent human individuals, archaic or otherwise, whereas great apes are not depicted individually. The Altai Neanderthal clusters with the other archaic individuals (inset) when recently inbred genomic regions are excluded.

  5. Extended Data Fig. 1: Migration rates in preliminary demographic inference.

    Total migration rates estimated for 22 directional migration bands in five separate preliminary G-PhoCS runs. Rows correspond to source populations and columns to the target populations. The 20 migration bands between modern and archaic populations were considered in five separate runs, each containing the four bands associated with a different modern human population (Supplementary Fig. 15A). The two migration bands between the Denisovan and the Altai Neanderthal populations were considered in all five runs, and the values shown here correspond to an aggregate of all five runs. The estimates are as shown in Supplementary Fig. 15B. Shade indicates the posterior mean total migration rate (legend), which approximates the probability that a lineage in the target population originated in the source population. The 95% Bayesian credible intervals from 2,000 MCMC replicates are indicated for migration bands whose upper credible interval bound is above 0.3%. We identified four clusters of migration bands, corresponding to what were likely at least four different cases of introgression between populations: (1) Neanderthals into non-African modern humans (red box), (2) Denisovans into Oceanians (green box), (3) between Neanderthals and Denisovans (magenta), and (4) modern humans into Neanderthals (blue box). Alt, Altai Neanderthal; Chi, Chinese; Den, Denisovan; Fre, French; Pap, Papuan; Yor, Yoruba.

  6. Extended Data Fig. 2: Demographic inference on simulated data.

    Simulated data were generated under the demographic model as inferred by G-PhoCS (Supplementary Table 13). Each simulated data set consisted of 10,000 loci of 1 kb length. We simulated the Altai Neanderthal, the Denisovan, and three modern human populations corresponding to the San, Yoruba, and French, with modern human demography consistent with recent studies (Supplementary Information section 8). Three migration bands were simulated: (1) from the Altai Neanderthal to the Denisovan, (2) from a population that diverged from the ancestors of all present-day humans 300,000 years ago into the Altai Neanderthal, and (3) from a population that diverged from the ancestors of all modern and archaic humans roughly 2.6 million years ago into the Denisovan. a, Estimates of effective population sizes (theta, θ), population divergence times (tau, τ) and migration rates (m) from three G-PhoCS runs on data simulated with gene flow from modern humans into the Altai Neanderthal lineage. Each run analyses an individual from a different present-day population, using the exact same setup used in our main analysis (Supplementary Fig. 15A). Parameters are typically estimated accurately, with 95% Bayesian credible intervals containing the values used in simulations (horizontal red lines). Rates of archaic gene flow into the Denisovan appear to be somewhat overestimated, and differences between analyses of African and non-African populations are consistent with those observed in the data analysis (Supplementary Fig. 15B). b, Similar analysis done on data simulated without gene flow from modern humans into the Altai Neanderthal lineage. Accurate estimates are obtained for all model parameters, and no gene flow is inferred from modern humans into the ancestors of the Altai Neanderthal. Error bars represent the 95% Bayesian credible intervals from 2,000 MCMC replicates.

  7. Extended Data Fig. 3: Simulation of different source populations for modern introgression into Neanderthals.

    Estimated rates of migration from the modern human population to the Altai Neanderthal population obtained from 15 G-PhoCS runs on five simulated data sets. All demographic parameters are set according to the values inferred by G-PhoCS in our data analysis (Supplementary Table 13), and the five data sets differ by the source population for migration: no migration (none), population ancestral to all present-day humans (ancestral), population ancestral to Yoruba and Europeans (Yoruba), San population (San), and European population (European). The first two sets are the ones analysed in Extended Data Fig. 2. Each data set is analysed three times, using different present-day samples: European (French), Yoruba, or San. Significant differences in estimates are observed between the data sets with and without gene flow and the data set with gene flow from a source population related to Europeans. Only minor differences were observed between values inferred for the three data sets with source population diverging from African populations (San, Yoruba, and ancestral). We conclude from this that the source population likely diverged from an African human population before the divergence of present-day Eurasians. The shaded circle in Fig. 3a represents this conclusion. Error bars represent the 95% Bayesian credible intervals from 2,000 MCMC replicates.

  8. Extended Data Fig. 4: Simulation of present-day human contamination.

    Simulated windows of 100 kb for the Altai Neanderthal and Denisovan genomes with present-day human contamination of 5% at the genotype level. Windows are binned by their minimum divergence to Africans using derived alleles at >0.9 frequency in the simulated African population. The x and y axes are as in Fig. 1. a, Gene flow from a deeply divergent archaic hominin into the Denisovan lineage (1%) and Altai Neanderthal gene flow into the Denisovan lineage (0.65%). b, Gene flow from a deeply divergent archaic hominin into the Denisovan lineage, Altai Neanderthal gene flow into the Denisovan lineage and modern human gene flow into the Altai Neanderthal lineage (1.8%). Error bars represent the 95% confidence intervals from 1,000 bootstrap replicates.

  9. Extended Data Fig. 5: Haplotype ages inferred by ARGweaver on simulated data.

    a, Distribution of ‘African’ haplotype ages in sequences simulated with introgression into the Altai Neanderthal lineage from modern humans 100,000 years ago. ‘African’ haplotypes are identified as in Fig. 2. Error bars represent the 95% Bayesian credible intervals from 302 MCMC replicates. b, Distribution of true haplotype ages for each of the estimated ages. The horizontal dotted lines show the estimated age. The plot is divided into four quadrants; the lower half represents ‘African’ haplotypes having true ages between 100,000 and 620,000 years ago (the divergence time between archaic and present-day humans), which are necessarily due to post-divergence gene flow from modern humans. The left side of the plot represents regions that would be identified as introgressed based on a threshold of ≤ 234,000 years. The counts in each quadrant are for Altai Neanderthal (red) and Denisovan (blue), respectively. The counts for the Denisovan in the lower two quadrants are zero because there was no simulated migration from modern humans into the Denisovan lineage. Note that this is a somewhat nonstandard plot of true age versus estimated age; a more standard, reversed view is given in Supplementary Fig. 33 and demonstrates that the estimated ages are largely unbiased. Error bars as in the standard Tukey box plot (R boxplot function).

  10. Extended Data Fig. 6: Main G-PhoCS demographic inference.

    Summary of the main demographic inference using G-PhoCS in a model with four archaic populations and one modern human population. a, The population phylogeny assumed in each of the G-PhoCS runs. Labels on internal edges indicate names of the four ancestral populations: population ancestral to the two Western Neanderthals (W.NEA), population ancestral to all three Neanderthals (NEA), population ancestral to all four archaic individuals (ARC), and population ancestral to all human samples (HUM). We augmented the phylogeny with 14 directional migration bands (arrows) between all pairs of sampled populations except for the pairs of Neanderthal populations. In one of the runs we added an unknown ‘ghost’ population and a migration band from that population into the Denisovan population. b, Parameter estimates obtained by G-PhoCS in six separate runs analysing 13,754 neutral and loosely linked loci, substituting samples in the ‘Modern’ population with pairs of present-day humans from five different modern populations (Supplementary Table 11). The last run has gene flow from the ‘ghost’ population and uses two Yoruba individuals in the modern human population. Bar heights indicate posterior mean and error bars correspond to 95% Bayesian credible intervals. Estimates of divergence times (τ) and effective population sizes (θ) are given in raw form, scaled by number of mutations per 10 kb (left axis), and calibrated to absolute units (right axis; 1,000 years for time and 1,000 individuals for effective population size), assuming an average mutation rate of 0.5 × 10^−9 mutations per year per bp and an average generation time of 29 years. For each of the 14 migration bands, we show the estimated total migration rates (m). See Supplementary Information section 8 for more information on parameter calibration and setup for G-PhoCS. A graphical summary of these estimates is given in Fig. 3.

  11. Extended Data Fig. 7: Migration rates in main demographic inference.

    Total migration rates estimated for 46 directional migration bands in five separate G-PhoCS runs. Rows correspond to source populations and columns to the target populations. The 40 migration bands between modern (present-day) and archaic populations were considered in five separate runs, each containing the eight bands associated with a different modern human population (Extended Data Fig. 6a). The six migration bands between the Denisovan population and the three Neanderthal populations were considered in all five runs, and the values shown here were estimated as an aggregate of all five runs. The estimates are as shown in Extended Data Fig. 6. Shade indicates the posterior mean total migration rate (legend), which approximates the probability that a lineage in the target population originated in the source population. The 95% Bayesian credible intervals from 2,000 MCMC replicates are indicated for migration bands whose upper credible interval bound is above 0.3%. We identified four clusters of migration bands, corresponding to what were likely at least four different cases of introgression between populations: (1) Western (European) Neanderthals into non-African modern humans (red box), (2) Denisovans into East Asians and Oceanians (green box), (3) Neanderthals into Denisovans (magenta), and (4) modern humans into Eastern Neanderthals (blue box). Directed arrows in Fig. 3a depict these introgression events. Sid, El Sidrón Neanderthal; Vin, Vindija Neanderthal.

  12. Extended Data Fig. 8: Principal component analysis.

    The putatively introgressed segments in the Altai Neanderthal genome, defined by derived alleles in two individuals from the San, Yoruba, Mbuti, Dinka or Mandenka populations. The introgressed segments show no clear affinity to one present-day African population. A, Altai Neanderthal; S, San; M, Mbuti; Y, Yoruba; D, Dinka; N, Mandenka.

  13. Extended Data Fig. 9: Natural selection in chromosome 21.

    Ratio of functional (putatively deleterious) to neutral polymorphism in archaic and present-day humans (Supplementary Information section 7). TFBS, transcription factor binding sites; upstream refers to 5 kb before the transcription start site of genes; UTRs, untranslated regions (in the mRNA). PhastCons ≥ 0.9 for a site to be used. Ne, Neanderthals; Af, Africans; As, Asians; Am, Americans.
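The calibration described in the caption of Extended Data Fig. 6 can be illustrated with a short sketch. It assumes the usual G-PhoCS parameterization, in which τ is a divergence time multiplied by the per-year mutation rate and θ equals 4 × Ne × the per-generation mutation rate; this convention is not spelled out in the caption and should be checked against Supplementary Information section 8. The mutation rate, generation time and 10 kb scaling come from the caption. The function names and the input values in the usage line are hypothetical.

```python
# Minimal sketch: convert raw G-PhoCS estimates to absolute units.
# Assumption (not stated in the caption): tau = T * mu_year and
# theta = 4 * Ne * mu_generation, both expressed per base pair.

MU_PER_YEAR = 0.5e-9     # mutations per bp per year (from the caption)
GENERATION_YEARS = 29    # average generation time in years (from the caption)
RAW_SCALE = 1e-4         # raw units are mutations per 10 kb, i.e. 1e-4 per bp


def tau_to_years(raw_tau):
    """Convert a raw divergence time (mutations per 10 kb) to years."""
    return raw_tau * RAW_SCALE / MU_PER_YEAR


def theta_to_ne(raw_theta):
    """Convert a raw theta (mutations per 10 kb) to an effective population size."""
    mu_per_generation = MU_PER_YEAR * GENERATION_YEARS
    return raw_theta * RAW_SCALE / (4 * mu_per_generation)


if __name__ == "__main__":
    # Hypothetical raw values, chosen only to show the unit conversion.
    print(f"tau = 1.0   -> {tau_to_years(1.0):,.0f} years")
    print(f"theta = 1.0 -> {theta_to_ne(1.0):,.0f} individuals")
```

Under these assumptions one raw unit of τ corresponds to 200,000 years and one raw unit of θ to roughly 1,724 individuals, which is how the left and right axes of the figure relate to each other.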

Tables

  1. Extended Data Table 1: Shared derived alleles

Accession codes

Primary accessions

European Nucleotide Archive

References

  1. Arsuaga, J. L. et al. Neandertal roots: Cranial and chronological evidence from Sima de los Huesos. Science 344, 1358–1363 (2014)
  2. Prüfer, K. et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505, 43–49 (2014)
  3. Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328, 710–722 (2010)
  4. Fu, Q. et al. Genome sequence of a 45,000-year-old modern human from western Siberia. Nature 514, 445–449 (2014)
  5. Fu, Q. et al. An early modern human from Romania with a recent Neanderthal ancestor. Nature (2015)
  6. Reich, D. et al. Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature 468, 1053–1060 (2010)
  7. Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012)
  8. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015)
  9. Pickrell, J. K. et al. Ancient west Eurasian ancestry in southern and eastern Africa. Proc. Natl Acad. Sci. USA 111, 2632–2637 (2014)
  10. Llorente, M. G. et al. Ancient Ethiopian genome reveals extensive Eurasian admixture throughout the African continent. Science 350, 820–822 (2015)
  11. Gronau, I., Hubisz, M. J., Gulko, B., Danko, C. G. & Siepel, A. Bayesian inference of ancient human demography from individual genome sequences. Nature Genet. 43, 1031–1034 (2011)
  12. Sankararaman, S., Patterson, N., Li, H., Pääbo, S. & Reich, D. The date of interbreeding between Neandertals and modern humans. PLoS Genet. 8, e1002947 (2012)
  13. Rasmussen, M. D., Hubisz, M. J., Gronau, I. & Siepel, A. Genome-wide inference of ancestral recombination graphs. PLoS Genet. 10, e1004342 (2014)
  14. Burbano, H. A. et al. Targeted investigation of the Neandertal genome by array-based sequence capture. Science 328, 723–725 (2010)
  15. Fu, Q. et al. DNA analysis of an early modern human from Tianyuan Cave, China. Proc. Natl Acad. Sci. USA 110, 2223–2227 (2013)
  16. Rausa, F. M., Galarneau, L., Bélanger, L. & Costa, R. H. The nuclear receptor fetoprotein transcription factor is coexpressed with its target gene HNF-3β in the developing murine liver intestine and pancreas. Mech. Dev. 89, 185–188 (1999)
  17. Enard, W. FOXP2 and the role of cortico-basal ganglia circuits in speech and language evolution. Curr. Opin. Neurobiol. 21, 415–424 (2011)
  18. McVicker, G., Gordon, D., Davis, C. & Green, P. Widespread genomic signatures of natural selection in hominid evolution. PLoS Genet. 5, e1000471 (2009)
  19. Veeramah, K. R., Gutenkunst, R. N., Woerner, A. E., Watkins, J. C. & Hammer, M. F. Evidence for increased levels of positive and negative selection on the X chromosome versus autosomes in humans. Mol. Biol. Evol. 31, 2267–2282 (2014)
  20. Sankararaman, S. et al. The genomic landscape of Neanderthal ancestry in present-day humans. Nature 507, 354–357 (2014)
  21. Pemberton, T. J. et al. Genomic patterns of homozygosity in worldwide human populations. Am. J. Hum. Genet. 91, 275–292 (2012)
  22. Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005)
  23. Castellano, S. et al. Patterns of coding variation in the complete exomes of three Neandertals. Proc. Natl Acad. Sci. USA 111, 6666–6671 (2014)
  24. Hublin, J. J. in Neandertals and Modern Humans in Western Asia (eds Akazawa, T., Aoki, K. & Bar-Yosef, O.) (Kluwer Academic Publishers, 1998)
  25. Mercier, N. H. V., Bar-Yosef, O., Vandermeersch, B., Stringer, C. & Joron, J.-L. Thermoluminescence date for the Mousterian burial site of Es-Skhul, Mt. Carmel. J. Archaeol. Sci. 20, 169–174 (1993)
  26. Grün, R. et al. U-series and ESR analyses of bones and teeth relating to the human burials from Skhul. J. Hum. Evol. 49, 316–334 (2005)
  27. Armitage, S. J. et al. The southern route “Out of Africa”: evidence for an early expansion of modern humans into Arabia. Science 331, 453–456 (2011)
  28. Rose, J. I. A. & Marks, A. E. “Out of Arabia” and the Middle–Upper Palaeolithic transition in the southern Levant. Quartär 61, 49–85 (2014)
  29. Liu, W. et al. The earliest unequivocally modern humans in southern China. Nature 526, 696–699 (2015)
  30. Rohland, N. & Hofreiter, M. Comparison and optimization of ancient DNA extraction. Biotechniques 42, 343–352 (2007)
  31. Meyer, M. & Kircher, M. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb. Protocols 2010, http://dx.doi.org/10.1101/pdb.prot5448 (2010)
  32. Kircher, M., Sawyer, S. & Meyer, M. Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 40, e3 (2012)
  33. Briggs, A. W. et al. Removal of deaminated cytosines and detection of in vivo methylation in ancient DNA. Nucleic Acids Res. 38, e87 (2010)
  34. Gnirke, A. et al. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nature Biotechnol. 27, 182–189 (2009)
  35. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010)
  36. Paten, B., Herrero, J., Beal, K., Fitzgerald, S. & Birney, E. Enredo and Pecan: genome-wide mammalian consistency-based multiple alignment with paralogs. Genome Res. 18, 1814–1828 (2008)
  37. Hudson, R. R. Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics 18, 337–338 (2002)
  38. Roach, J. C. et al. Analysis of genetic inheritance in a family quartet by whole-genome sequencing. Science 328, 636–639 (2010)
  39. Fenner, J. N. Cross-cultural estimation of the human generation interval for use in genetics-based population divergence studies. Am. J. Phys. Anthropol. 128, 415–423 (2005)
  40. Freedman, A. H. et al. Genome sequencing highlights the dynamic early history of dogs. PLoS Genet. 10, e1004016 (2014)
  41. Gravel, S. et al. Demographic history and rare allele sharing among human populations. Proc. Natl Acad. Sci. USA 108, 11983–11988 (2011)
  42. Prado-Martinez, J. et al. Great ape genetic diversity and population history. Nature 499, 471–475 (2013)
  43. Durinck, S. et al. BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics 21, 3439–3440 (2005)
  44. Arbiza, L. et al. Genome-wide inference of natural selection on human transcription factor binding sites. Nature Genet. 45, 723–729 (2013)

Author information

  1. These authors contributed equally to this work.

    • Martin Kuhlwilm &
    • Ilan Gronau

Affiliations

  1. Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany

    • Martin Kuhlwilm,
    • Cesare de Filippo,
    • Martin Kircher,
    • Qiaomei Fu,
    • Hernán A. Burbano,
    • Aida M. Andrés,
    • Svante Pääbo,
    • Matthias Meyer &
    • Sergi Castellano
  2. Efi Arazi School of Computer Science, Herzliya Interdisciplinary Center (IDC), Herzliya 46150, Israel

    • Ilan Gronau
  3. Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, New York 14850, USA

    • Melissa J. Hubisz &
    • Adam Siepel
  4. Institute of Evolutionary Biology (UPF-CSIC), 08003 Barcelona, Spain

    • Javier Prado-Martinez,
    • Carles Lalueza-Fox &
    • Tomas Marques-Bonet
  5. Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA

    • Martin Kircher
  6. Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA

    • Qiaomei Fu
  7. Key Laboratory of Vertebrate Evolution and Human Origins of Chinese Academy of Sciences, IVPP, CAS, Beijing 100044, China

    • Qiaomei Fu
  8. Department of Molecular Biology, Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany

    • Hernán A. Burbano
  9. Área de Prehistoria, Departamento de Historia, Universidad de Oviedo, 33011 Oviedo, Spain

    • Marco de la Rasilla
  10. Departamento de Paleobiología, Museo Nacional de Ciencias Naturales, CSIC, 28006 Madrid, Spain

    • Antonio Rosas
  11. Anthropology Center of the Croatian Academy of Sciences and Arts, 10000 Zagreb, Croatia

    • Pavao Rudan,
    • Željko Kućan &
    • Ivan Gušić
  12. Croatian Academy of Sciences and Arts, Institute for Quaternary Paleontology and Geology, 10000 Zagreb, Croatia

    • Dejana Brajković
  13. Catalan Institution of Research and Advanced Studies (ICREA), 08010 Barcelona, Spain

    • Tomas Marques-Bonet
  14. Centro Nacional de Análisis Genómico (CRG-CNAG), 08028 Barcelona, Spain

    • Tomas Marques-Bonet
  15. Department of Anthropology, University of Toronto, Toronto, Ontario M5S 2S2, Canada

    • Bence Viola
  16. Department of Human Evolution, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany

    • Bence Viola
  17. Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA

    • Adam Siepel

Contributions

M.M. and Q.F. performed experiments; M.Ku., I.Gr., M.J.H., C.d.F., J.P.-M., M.Ki., Q.F., H.A.B., T.M.-B., A.M.A., S.P., M.M., A.S. and S.C. analysed genetic data; C.L.-F., M.d.l.R., A.R., P.R., D.B., Ž.K., I.Gu. and B.V. analysed anthropological data; M.Ku., I.Gr., M.J.H., B.V., S.P., A.S. and S.C. wrote the manuscript.

Competing financial interests

The authors declare no competing financial interests.

Sequence data are available in the European Nucleotide Archive (ENA) under accession number PRJEB11828.

Author details

Extended data figures and tables

Extended Data Figures

  1. Extended Data Figure 1: Migration rates in preliminary demographic inference. (307 KB)

    Total migration rates estimated for 22 directional migration bands in five separate preliminary G-PhoCS runs. Rows correspond to source populations and columns to the target populations. The 20 migration bands between modern and archaic populations were considered in five separate runs, each containing the four bands associated with a different modern human population (Supplementary Fig. 15A). The two migration bands between the Denisovan and the Altai Neanderthal populations were considered in all five runs, and the values shown here correspond to an aggregate of all five runs. The estimates are as shown in Supplementary Fig. 15B. Shade indicates the posterior mean total migration rate (legend), which approximates the probability that a lineage in the target population originated in the source population. The 95% Bayesian credible intervals from 2,000 MCMC replicates are indicated for migration bands whose upper credible interval bound is above 0.3%. We identified four clusters of migration bands, corresponding to what were likely at least four different cases of introgression between populations: (1) Neanderthals into non-African modern humans (red box), (2) Denisovans into Oceanians (green box), (3) between Neanderthals and Denisovans (magenta), and (4) modern humans into Neanderthals (blue box). Alt, Altai Neanderthal; Chi, Chinese; Den, Denisovan; Fre, French; Pap, Papuan; Yor, Yoruba.

  2. Extended Data Figure 2: Demographic inference on simulated data. (138 KB)

    Simulated data were generated under the demographic model as inferred by G-PhoCS (Supplementary Table 13). Each simulated data set consisted of 10,000 loci of 1 kb length. We simulated the Altai Neanderthal, the Denisovan, and three modern human populations corresponding to the San, Yoruba, and French, with modern human demography consistent with recent studies (Supplementary information section 8). Three migration bands were simulated: (1) from the Altai Neanderthal to the Denisovan, (2) from a population that diverged from the ancestors of all present-day humans 300,000 years ago into the Altai Neanderthal, and (3) from a population that diverged from the ancestors of all modern and archaic humans roughly 2.6 million years ago into the Denisovan. a, Estimates of effective population sizes (theta, θ), population divergence times (tau, τ) and migration rates (m) from three G-PhoCS runs on data simulated with gene flow from modern humans into the Altai Neanderthal lineage. Each run analyses an individual from a different present-day population, using the exact same setup used in our main analysis (Supplementary Fig. 15A). Parameters are typically estimated accurately, with 95% Bayesian credible intervals containing the values used in simulations (horizontal red lines). Rates of archaic gene flow into Denisovan appear to be somewhat overestimated, and differences between analyses of African and non-African populations are consistent with those observed in the data analysis (Supplementary Fig. 15B). b, Similar analysis done on data simulated without gene flow from modern humans into the Altai Neanderthal lineage. Accurate estimates are obtained for all model parameters, and no gene flow is inferred from modern humans into the ancestors of the Altai Neanderthal. Error bars represent the 95% Bayesian credible intervals from 2,000 MCMC replicates.

  3. Extended Data Figure 3: Simulation of different source populations for modern introgression into Neanderthals. (50 KB)

    Estimated rates of migration from the modern human population to the Altai Neanderthal population obtained from 15 G-PhoCS runs on five simulated data sets. All demographic parameters are set according to the values inferred by G-PhoCS in our data analysis (Supplementary Table 13), and the five data sets differ by the source population for migration: no migration (none), population ancestral to all present-day humans (ancestral), population ancestral to Yoruba and Europeans (Yoruba), San population (San), and European population (European). The first two sets are the ones analysed in Extended Data Fig. 2. Each data set is analysed three times, using different present-day samples: European (French), Yoruba, or San. Significant differences in estimates are observed between the data sets with and without gene flow and the data set with gene flow from a source population related to Europeans. Only minor differences were observed between values inferred for the three data sets with source population diverging from African populations (San, Yoruba, and ancestral). We conclude from this that the source population likely diverged from an African human population before the divergence of present-day Eurasians. The shaded circle in Fig. 3a represents this conclusion. Error bars represent the 95% Bayesian credible intervals from 2,000 MCMC replicates.

  4. Extended Data Figure 4: Simulation of present-day human contamination. (212 KB)

    Simulated windows of 100 kb for the Altai Neanderthal and Denisovan genomes with present-day human contamination of 5% at the genotype level. Windows are binned by their minimum divergence to Africans using derived alleles at >0.9 frequency in the simulated African population. The x and y axis as in Fig. 1. a, Gene flow from a deeply divergent archaic hominin into the Denisovan lineage (1%) and Altai Neanderthal gene flow into the Denisovan lineage (0.65%). b, Gene flow from a deeply divergent archaic hominin into the Denisovan lineage, Altai Neanderthal gene flow into the Denisovan lineage and modern human gene flow into the Altai Neanderthal lineage (1.8%). Error bars represent the 95% confidence intervals from 1,000 bootstrap replicates.

  5. Extended Data Figure 5: Haplotype ages inferred by ARGweaver on simulated data. (156 KB)

    a, Distribution of ‘African’ haplotype ages in sequences simulated with introgression into the Altai Neanderthal lineage from modern humans 100,000 years ago. ‘African’ haplotypes are identified as in Fig. 2. Error bars represent the 95% Bayesian credible intervals from 302 MCMC replicates. b, Distribution of true haplotype ages for each of the estimated ages. The horizontal dotted lines show the estimated age. The plot is divided into four quadrants; the lower half represents ‘African’ haplotypes having true ages between 100,000 and 620,000 years ago (the divergence time between archaic and present-day humans), which are necessarily due to post-divergence gene flow from modern humans. The left side of the plot represents regions that would be identified as introgressed based on a threshold of ≤ 234,000 years. The counts in each quadrant are for Altai Neanderthal (red) and Denisovan (blue), respectively. The counts for the Denisovan in the lower two quadrants are zero because there was no simulated migration from modern humans into the Denisovan lineage. Note that this is a somewhat nonstandard plot of true age versus estimated age; a more standard, reversed view is given in Supplementary Fig. 33 and demonstrates that the estimated ages are largely unbiased. Error bars as in the standard Tukey box plot (R boxplot function).

  6. Extended Data Figure 6: Main G-PhoCS demographic inference. (241 KB)

    Summary of the main demographic inference using G-PhoCS in a model with four archaic populations and one modern human population. a, The population phylogeny assumed in each of the G-PhoCS runs. Labels on internal edges indicate names of the four ancestral populations: population ancestral to the two Western Neanderthals (W.NEA), population ancestral to all three Neanderthals (NEA), population ancestral to all four archaic individuals (ARC), and population ancestral to all human samples (HUM). We augmented the phylogeny with 14 directional migration bands (arrows) between all pairs of sampled populations except for the pairs of Neanderthal populations. In one of the runs we added an unknown ‘ghost’ population and a migration band from that population into the Denisovan population. b, Parameter estimates obtained by G-PhoCS in six separate runs analysing 13,754 neutral and loosely linked loci, substituting samples in the ‘Modern’ population with pairs of present-day humans from five different modern populations (Supplementary Table 11). The last run has gene flow from the ‘ghost’ population and uses two Yoruba individuals in the modern human population. Bar heights indicate posterior means and error bars correspond to 95% Bayesian credible intervals. Estimates of divergence times (τ) and effective population sizes (θ) are given in raw form, scaled by number of mutations per 10 kb (left axis), and calibrated to absolute units (right axis; 1,000 years for time and 1,000 individuals for effective population size), assuming an average mutation rate of 0.5 × 10⁻⁹ mutations per year per bp and an average generation time of 29 years. For each of the 14 migration bands, we show the estimated total migration rate (m). See Supplementary Information section 8 for more information on parameter calibration and setup for G-PhoCS. A graphical summary of these estimates is given in Fig. 3.

  7. Extended Data Figure 7: Migration rates in main demographic inference. (272 KB)

    Total migration rates estimated for 46 directional migration bands in five separate G-PhoCS runs. Rows correspond to source populations and columns to the target populations. The 40 migration bands between modern (present-day) and archaic populations were considered in five separate runs, each containing the eight bands associated with a different modern human population (Extended Data Fig. 6a). The six migration bands between the Denisovan population and the three Neanderthal populations were considered in all five runs, and the values shown here were estimated as an aggregate of all five runs. The estimates are as shown in Extended Data Fig. 6. Shade indicates the posterior mean total migration rate (legend), which approximates the probability that a lineage in the target population originated in the source population. The 95% Bayesian credible intervals from 2,000 MCMC replicates are indicated for migration bands whose upper credible interval bound is above 0.3%. We identified four clusters of migration bands, corresponding to what were likely at least four different cases of introgression between populations: (1) Western (European) Neanderthals into non-African modern humans (red box), (2) Denisovans into East Asians and Oceanians (green box), (3) Neanderthals into Denisovans (magenta box), and (4) modern humans into Eastern Neanderthals (blue box). Directed arrows in Fig. 3a depict these introgression events. Sid, El Sidrón Neanderthal; Vin, Vindija Neanderthal.

  8. Extended Data Figure 8: Principal component analysis. (140 KB)

    Principal component analysis of the putatively introgressed segments in the Altai Neanderthal genome, defined by derived alleles in two individuals from the San, Yoruba, Mbuti, Dinka or Mandenka populations. The introgressed segments show no clear affinity to any one present-day African population. A, Altai Neanderthal; S, San; M, Mbuti; Y, Yoruba; D, Dinka; N, Mandenka.

  9. Extended Data Figure 9: Natural selection in chromosome 21. (117 KB)

    Ratio of functional (putatively deleterious) to neutral polymorphism in archaic and present-day humans (Supplementary Information section 7). TFBS, transcription factor binding sites; upstream refers to 5 kb before the transcription start site of genes; UTRs, untranslated regions (in the mRNA). Only sites with PhastCons ≥ 0.9 were used. Ne, Neanderthals; Af, Africans; As, Asians; Am, Americans.
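
The calibration described in the legend of Extended Data Fig. 6 (raw τ and θ per 10 kb converted to years and to numbers of individuals) can be illustrated with a short sketch. It uses the mutation rate and generation time stated in that legend and assumes the usual G-PhoCS parameterization, τ = Tμ and θ = 4Neμ. The function names and the example input values are hypothetical and are not taken from the paper's results.

```python
# Hedged sketch: convert raw G-PhoCS-style estimates to absolute units.
# Assumes tau = T * mu and theta = 4 * Ne * mu (standard parameterization).

MUTATION_RATE_PER_YEAR = 0.5e-9   # mutations per bp per year (from the legend)
GENERATION_TIME_YEARS = 29        # years per generation (from the legend)
RAW_SCALE = 1e-4                  # raw values are mutations per 10 kb


def tau_to_years(tau_raw):
    """Divergence time in years from a raw tau (mutations per 10 kb)."""
    return tau_raw * RAW_SCALE / MUTATION_RATE_PER_YEAR


def theta_to_ne(theta_raw):
    """Effective population size from a raw theta (mutations per 10 kb)."""
    mu_per_generation = MUTATION_RATE_PER_YEAR * GENERATION_TIME_YEARS
    return theta_raw * RAW_SCALE / (4 * mu_per_generation)


if __name__ == "__main__":
    # Hypothetical raw values chosen only to illustrate the arithmetic.
    print(round(tau_to_years(3.1)))   # divergence time in years
    print(round(theta_to_ne(1.0)))    # effective number of individuals
```

Under these assumptions a raw τ of 3.1 corresponds to about 620,000 years, and a raw θ of 1.0 to an effective size of roughly 1,700 individuals.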

Extended Data Tables

  1. Extended Data Table 1: Shared derived alleles (43 KB)

Supplementary information

PDF files

  1. Supplementary Information (10.1 MB)

    This file contains Supplementary Text 1–11, Supplementary Tables 1–19, Supplementary Figures 1–34 and additional references (see Contents).

Additional data