Genetic evidence for two founding populations of the Americas

Journal name: Nature
Volume: 525
Pages: 104–108
DOI: 10.1038/nature14895

Genetic studies have consistently indicated a single common origin of Native American groups from Central and South America [1–4]. However, some morphological studies have suggested a more complex picture, whereby the northeast Asian affinities of present-day Native Americans contrast with a distinctive morphology seen in some of the earliest American skeletons, which share traits with present-day Australasians (indigenous groups in Australia, Melanesia, and island Southeast Asia) [5–8]. Here we analyse genome-wide data to show that some Amazonian Native Americans descend partly from a Native American founding population that carried ancestry more closely related to indigenous Australians, New Guineans and Andaman Islanders than to any present-day Eurasians or Native Americans. This signature is not present to the same extent, or at all, in present-day Northern and Central Americans or in a ~12,600-year-old Clovis-associated genome, suggesting a more diverse set of founding populations of the Americas than previously accepted.

At a glance

Figures

  1. Figure 1: South Americans share ancestry with Australasian populations that is not seen in Mesoamericans or North Americans.

    a, Quantile–quantile plot of the Z-scores for the D-statistic symmetry test for whether Mixe and Suruí share an equal rate of derived alleles with a candidate non-American population X, compared to the expected ranked quantiles for the same number of normally distributed values. b, Z-scores for the h4-statistic. c, Z-scores for the ChromoPainter statistic. d, Heatmap of ChromoPainter statistics. For non-Americans we display the symmetry statistic S(non-American; Mixe, Suruí and Karitiana) for donating as many haplotypes to Mixe as to Suruí and Karitiana. For the Americas we plot S(Onge; Mixe, American) for receiving as many haplotypes from the Onge as do the Mixe.

  2. Figure 2: A model of population history that can explain the excess affinity to Oceanians observed in Amazonian populations.

    a, We fit an admixture graph model where a population related to the Andamanese Onge contributed a fraction α of the ancestry of ‘Population Y’, which later contributed a fraction γ to the ancestry of Amazonian groups today (the remainder of which is related to Mesoamerican Mixe). b, Two-dimensional grid of combinations of the admixture proportions α and γ which are compatible with the data in terms of how many predicted f4-statistics deviate by Z ≥ 3.0 from empirical values.

  3. Extended Data Fig. 1: Clustering analysis.

    ADMIXTURE [38] clustering analysis performed on the Affymetrix Human Origins data used in this study. To aid in visualization, we only show results for Native American samples and for selected samples from Eurasian populations.

  4. Extended Data Fig. 2: qpWave coefficients.

    Weights from qpWave for Native American populations and for non-American outgroup populations. No weights are given for Yoruba and Cabecar, as they are used in the computation.

  5. Extended Data Fig. 3: Excess allele sharing between the Suruí and the Onge.

    a, Tests for excess shared derived alleles with the Onge in all possible comparisons of 8 Suruí and 10 Mixe individuals. All Mixe–Suruí comparisons show a positive skew whereas all Mixe–Mixe and Suruí–Suruí comparisons are consistent with 0. Lines correspond to one standard error in either direction. b, Random sequence or genotype errors cannot explain the affinity of the Amazonians to Australasians, as simulated increased errors in the Onge do not cause an increased affinity to Suruí.

  6. Extended Data Fig. 4: Signals of admixture as a function of proximity to functional regions.

    a, The affinity of 16 Papuan high-coverage genomes to 2 Amazonian Suruí high-coverage genomes as a function of proximity to regions of functional importance (measured by B-value). b, A total of 395 tests of quartets D(Yoruba, X; Y, Z) shows that quartets with significantly positive slopes (|Z| > 3) also yield significant genome-wide D-statistics of the opposite sign. This suggests that signals of admixture are systematically stronger close to functionally important regions.

  7. Extended Data Fig. 5: Linkage disequilibrium-based symmetry tests.

    a, h4(Yoruba, X; Mixe, Suruí) for SNP pairs within 0.01 cM of each other contrasted with the fraction of SNP pairs in linkage equilibrium in population X (H = 0). Error bars show ± 1 s.e. b, Scatterplot of Z-scores for the f4- and h4-statistics for the same quartets. For both of these panels we only use populations with at least 6 samples. c, d, We computed D(Yoruba, X; Y, Z) and h4(Yoruba, X; Y, Z) for many combinations of populations as X, Y and Z using phased Affymetrix Human Origins SNP array data ascertained in a Yoruba individual. Except for Africans (who have ancestry from lineages that diverged before the Yoruba individual used for ascertainment) and Oceanians (who have archaic Denisovan ancestry), we observe that h4-statistics with |Z| > 3 are always associated with a significantly positive D for the same quartet. e, Correlation of the h4-statistic with the genetic distance separation of pairs of SNPs for h4(Yoruba, X; Mixe, Suruí).

  8. Extended Data Fig. 6: Admixture graphs for fitted population history models.

    a, An admixture graph where all of Mixe, Suruí and Karitiana are of 100% First American ancestry is rejected with 6 predicted f-statistics at least 3 standard errors from the empirically observed value. b, An admixture graph where the ancestors of Suruí and Karitiana receive 2% ancestry from a lineage related to the Onge is consistent with the data with no outliers. c, An admixture graph where the distinct ancestry in Amazonians is more closely related to Han than to Onge produces 6 outliers. d, An admixture graph with no distinctive ancestry in Karitiana or Suruí but East Asian gene flow into the Mixe produces 7 outliers. e, An admixture graph with no distinctive ancestry in Karitiana or Suruí but MA1-related gene flow into the Mixe produces 6 outliers.

  9. Extended Data Fig. 7: Plausible range for the non-First American admixture proportion in Amazonians.

    a, Range obtained assuming entirely First American ancestry in the Mixe. b, The maximum proportion of non-First American ancestry in the Mixe that is consistent with the data.
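
As an illustration of the symmetry test described for Figure 1a, the sketch below computes a D-statistic from per-SNP allele frequencies together with a block-jackknife standard error. It uses a common allele-frequency formulation of D and a simplified equal-block jackknife. The function names, the sign convention and the toy frequencies are assumptions made for illustration; they are not the authors' pipeline or data.

```python
import math

def d_terms(p_w, p_x, p_y, p_z):
    """Per-SNP numerator and denominator of D(W, X; Y, Z).

    Arguments are frequencies of the same allele in the four populations.
    Sign convention (assumed here): positive when X shares more alleles with Z.
    """
    num = (p_w - p_x) * (p_y - p_z)
    den = (p_w + p_x - 2 * p_w * p_x) * (p_y + p_z - 2 * p_y * p_z)
    return num, den

def block_jackknife(terms, n_blocks):
    """Return (D, standard error) using a leave-one-block-out jackknife.

    `terms` is a list of (numerator, denominator) pairs in genome order.
    Blocks are contiguous and of (nearly) equal size, a simplification of
    the weighted jackknife usually applied to real data.
    """
    n = len(terms)
    bounds = [i * n // n_blocks for i in range(n_blocks + 1)]
    total_num = sum(t[0] for t in terms)
    total_den = sum(t[1] for t in terms)

    leave_one_out = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        block_num = sum(t[0] for t in terms[start:end])
        block_den = sum(t[1] for t in terms[start:end])
        leave_one_out.append((total_num - block_num) / (total_den - block_den))

    mean = sum(leave_one_out) / n_blocks
    var = (n_blocks - 1) / n_blocks * sum((v - mean) ** 2 for v in leave_one_out)
    return total_num / total_den, math.sqrt(var)

# Toy input (made up): three SNPs where X shares the allele with Z,
# one SNP where X shares it with Y.
toy_terms = [d_terms(0.0, 1.0, 0.0, 1.0)] * 3 + [d_terms(0.0, 1.0, 1.0, 0.0)]
d_value, d_se = block_jackknife(toy_terms, n_blocks=4)
z_value = d_value / d_se
```

The Z-score is D divided by its standard error. If the two American populations are symmetrically related to population X, Z-scores across many choices of X should follow an approximately normal distribution, which is what the quantile–quantile plot checks.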

Tables

  1. Extended Data Table 1: qpWave analysis provides evidence that Central and South American genetic variation is inconsistent with being derived from a single homogeneous population
  2. Extended Data Table 2: Top 20 D-statistics observed for D(chimpanzee, Old World population; Central Americans, Amazonians)
  3. Extended Data Table 3: f4-statistics for which the statistic predicted by the fitted admixture graphs deviates by |Z| > 3 from the statistic computed on the empirical data
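
The outlier criterion used to accept or reject admixture graphs (Extended Data Fig. 6, Extended Data Table 3) can be sketched as follows: for each f4-statistic, the graph's prediction is compared with the empirical value in units of the empirical standard error, and predictions with |Z| > 3 are counted as outliers. The class name, field names and numeric values below are hypothetical and exist only to show the arithmetic.

```python
from typing import List, NamedTuple

class F4Fit(NamedTuple):
    """One f4-statistic: empirical estimate, its standard error, and the
    value predicted by a fitted admixture graph (all values hypothetical)."""
    label: str
    observed: float
    std_err: float
    predicted: float

def z_score(fit: F4Fit) -> float:
    """Deviation of the prediction from the data, in standard errors."""
    return (fit.predicted - fit.observed) / fit.std_err

def outliers(fits: List[F4Fit], threshold: float = 3.0) -> List[F4Fit]:
    """Statistics whose prediction deviates by more than `threshold` s.e."""
    return [fit for fit in fits if abs(z_score(fit)) > threshold]

# Made-up example values, not taken from the paper.
toy_fits = [
    F4Fit("f4(A, B; C, D)", observed=0.0010, std_err=0.0002, predicted=0.0000),
    F4Fit("f4(A, C; B, D)", observed=0.0005, std_err=0.0005, predicted=0.0010),
]
toy_outliers = outliers(toy_fits)
```

In this framing, a graph with no outliers is described as consistent with the data, and a graph with several outliers is rejected.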

References

  1. Wang, S. et al. Genetic variation and population structure in Native Americans. PLoS Genet. 3, e185 (2007)
  2. Reich, D. et al. Reconstructing Native American population history. Nature 488, 370–374 (2012)
  3. Rasmussen, M. et al. The genome of a Late Pleistocene human from a Clovis burial site in western Montana. Nature 506, 225–229 (2014)
  4. Raghavan, M. et al. Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans. Nature 505, 87–91 (2014)
  5. Neves, W. & Pucciarelli, H. The origins of the first Americans—an analysis based on the cranial morphology of early South American remains. Am. J. Phys. Anthropol. 81, 274 (1990)
  6. Neves, W. et al. Early Holocene human skeletal remains from Cerca Grande, Lagoa Santa, Central Brazil, and the origins of the first Americans. World Archaeol. 36, 479–501 (2004)
  7. Neves, W. A., Prous, A., González-José, R., Kipnis, R. & Powell, J. Early Holocene human skeletal remains from Santana do Riacho, Brazil: implications for the settlement of the New World. J. Hum. Evol. 45, 19–42 (2003)
  8. González-José, R. et al. Late Pleistocene/Holocene craniofacial morphology in Mesoamerican Paleoindians: implications for the peopling of the New World. Am. J. Phys. Anthropol. 128, 772–780 (2005)
  9. Rasmussen, M. et al. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature 463, 757–762 (2010)
  10. Raghavan, M. et al. The genetic prehistory of the New World Arctic. Science 345, (2014)
  11. Gilbert, M. T. P. et al. DNA from pre-Clovis human coprolites in Oregon, North America. Science 320, 786–789 (2008)
  12. Chatters, J. C. et al. Late Pleistocene human skeleton and mtDNA link Paleoamericans and modern Native Americans. Science 344, 750–754 (2014)
  13. Jantz, R. L. & Owsley, D. W. Variation among early North American crania. Am. J. Phys. Anthropol. 114, 146–155 (2001)
  14. Neves, W. A., Hubbe, M. & Correal, G. Human skeletal remains from Sabana de Bogota, Colombia: a case of Paleoamerican morphology late survival in South America? Am. J. Phys. Anthropol. 133, 1080–1098 (2007)
  15. González-José, R. et al. Craniometric evidence for Palaeoamerican survival in Baja California. Nature 425, 62–65 (2003)
  16. Sparks, C. S. & Jantz, R. L. A reassessment of human cranial plasticity: Boas revisited. Proc. Natl Acad. Sci. USA 99, 14636–14639 (2002)
  17. Relethford, J. H. Apportionment of global human genetic diversity based on craniometrics and skin color. Am. J. Phys. Anthropol. 118, 393–398 (2002)
  18. Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012)
  19. Lazaridis, I. et al. Ancient human genomes suggest three ancestral populations for present-day Europeans. Nature 513, 409–413 (2014)
  20. Qin, P. & Stoneking, M. Denisovan ancestry in East Eurasian and Native American populations. Mol. Biol. Evol. (2015)
  21. Green, R. E. et al. A draft sequence of the Neandertal genome. Science 328, 710–722 (2010)
  22. Meyer, M. et al. A high-coverage genome sequence from an Archaic Denisovan individual. Science 338, 222–226 (2012)
  23. Prüfer, K. et al. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature 505, 43–49 (2014)
  24. McVicker, G., Gordon, D., Davis, C. & Green, P. Widespread genomic signatures of natural selection in hominid evolution. PLoS Genet. 5, e1000471 (2009)
  25. Gillespie, J. H. Genetic drift in an infinite population: the pseudohitchhiking model. Genetics 155, 909–919 (2000)
  26. Coop, G. et al. The role of geography in human adaptation. PLoS Genet. 5, e1000500 (2009)
  27. Moorjani, P. et al. The history of African gene flow into Southern Europeans, Levantines, and Jews. PLoS Genet. 7, e1001373 (2011)
  28. Hellenthal, G. et al. A genetic atlas of human admixture history. Science 343, 747–751 (2014)
  29. Sankararaman, S., Patterson, N., Li, H., Pääbo, S. & Reich, D. The date of interbreeding between Neandertals and modern humans. PLoS Genet. 8, e1002947 (2012)
  30. Lawson, D. J., Hellenthal, G., Myers, S. & Falush, D. Inference of population structure using dense haplotype data. PLoS Genet. 8, e1002453 (2012)
  31. Delaneau, O., Marchini, J. & Zagury, J.-F. A linear complexity phasing method for thousands of genomes. Nature Methods 9, 179–181 (2011)
  32. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009)
  33. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010)
  34. Busing, F. M., Meijer, E. & Van Der Leeden, R. Delete-m jackknife for unequal m. Stat. Comput. 9, 3–8 (1999)
  35. Reich, D., Thangaraj, K., Patterson, N., Price, A. L. & Singh, L. Reconstructing Indian population history. Nature 461, 489–494 (2009)
  36. Robbins, R. B. Some applications of mathematics to breeding problems III. Genetics 3, 375–389 (1918)
  37. Becker, R. A. & Wilks, A. R. Maps in S. AT&T Bell Laboratories Statistics Research Report [93.2] (1993)
  38. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009)

Author information

Affiliations

  1. Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA

    • Pontus Skoglund,
    • Swapan Mallick,
    • Niru Chennagiri &
    • David Reich
  2. Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA

    • Pontus Skoglund,
    • Swapan Mallick,
    • Niru Chennagiri,
    • Nick Patterson &
    • David Reich
  3. Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts 02115, USA

    • Swapan Mallick &
    • David Reich
  4. Departamento de Genética, Instituto de Biociências, Universidade Federal do Rio Grande do Sul, 91501-970 Porto Alegre, RS, Brazil

    • Maria Cátira Bortolini &
    • Francisco Mauro Salzano
  5. Departamento de Genética e Biologia Evolutiva, Universidade de São Paulo, 05508-090, SP, Brazil

    • Tábita Hünemeier
  6. Departamento de Genética, Universidade Federal do Paraná, 81531-980 Curitiba, PR, Brazil

    • Maria Luiza Petzl-Erler

Contributions

P.S. performed analyses. P.S., S.M., M.C.B., N.C., T.H., M.L.P.-E., F.M.S., N.P. and D.R. prepared datasets. P.S. and D.R. wrote the paper.

Competing financial interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to:

Genome sequence data are available from https://www.simonsfoundation.org/life-sciences/simons-genome-diversity-project-dataset/. New Affymetrix Human Origins array genotype data are available to researchers who send D.R. a signed letter agreeing to respect specific conditions (Supplementary Information section 1).

Supplementary information

PDF files

  1. Supplementary Information (909 KB)

    This file contains Supplementary Text and Data 1-6, Supplementary Tables and additional references.
