
Genomic insights into the formation of human populations in East Asia

Abstract

The deep population history of East Asia remains poorly understood owing to a lack of ancient DNA data and sparse sampling of present-day people (refs. 1, 2). Here we report genome-wide data from 166 East Asian individuals dating to between 6000 bc and ad 1000 and 46 present-day groups. Hunter-gatherers from Japan, the Amur River Basin, and people of Neolithic and Iron Age Taiwan and the Tibetan Plateau are linked by a deeply splitting lineage that probably reflects a coastal migration during the Late Pleistocene epoch. We also follow expansions during the subsequent Holocene epoch from four regions. First, hunter-gatherers from Mongolia and the Amur River Basin have ancestry shared by individuals who speak Mongolic and Tungusic languages, but do not carry ancestry characteristic of farmers from the West Liao River region (around 3000 bc), which contradicts theories that the expansion of these farmers spread the Mongolic and Tungusic proto-languages. Second, farmers from the Yellow River Basin (around 3000 bc) probably spread Sino-Tibetan languages, as their ancestry dispersed both to Tibet (where it forms approximately 84% of the gene pool in some groups) and to the Central Plain, where it has contributed around 59–84% to modern Han Chinese groups. Third, people from Taiwan from around 1300 bc to ad 800 derived approximately 75% of their ancestry from a lineage that is widespread in modern individuals who speak Austronesian, Tai–Kadai and Austroasiatic languages, and that we hypothesize derives from farmers of the Yangtze River Valley. Ancient people from Taiwan also derived about 25% of their ancestry from a northern lineage that is related to, but different from, farmers of the Yellow River Basin, which suggests an additional north-to-south expansion. Fourth, ancestry from Yamnaya Steppe pastoralists arrived in western Mongolia after around 3000 bc but was displaced by previously established lineages even while it persisted in western China, as would be expected if this ancestry was associated with the spread of proto-Tocharian Indo-European languages. Two later gene flows affected western Mongolia: migrants after around 2000 bc with Yamnaya and European farmer ancestry, and episodic influences of later groups with ancestry from Turan.

Fig. 1: Overview.
Fig. 2: Model of deep population relationships.
Fig. 3: Estimates of mixture proportions using qpAdm.

Data availability

The aligned sequences are available through the European Nucleotide Archive under accession number PRJEB42781. The newly generated genotype data of 383 modern East Asian individuals have been deposited in Zenodo (https://doi.org/10.5281/zenodo.4058532). The previously published data co-analysed with our newly reported data can be obtained as described in the original publications, which are all referenced in Supplementary Table 4; a compiled dataset that includes the merged genotypes used in this paper is available as the Allen Ancient DNA Resource at https://reich.hms.harvard.edu/allen-ancient-dna-resource-aadr-downloadable-genotypes-present-day-and-ancient-dna-data. Any other relevant data are available from the corresponding authors upon reasonable request.

References

  1. Cavalli-Sforza, L. L. The Chinese human genome diversity project. Proc. Natl Acad. Sci. USA 95, 11501–11503 (1998).

  2. HUGO Pan-Asian SNP Consortium. Mapping human genetic diversity in Asia. Science 326, 1541–1545 (2009).

  3. Haak, W. et al. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature 522, 207–211 (2015).

  4. Allentoft, M. E. et al. Population genomics of Bronze Age Eurasia. Nature 522, 167–172 (2015).

  5. Damgaard, P. B. et al. 137 ancient human genomes from across the Eurasian steppes. Nature 557, 369–374 (2018).

  6. Narasimhan, V. M. et al. The formation of human populations in South and Central Asia. Science 365, eaat7487 (2019).

  7. Fu, Q. et al. An early modern human from Romania with a recent Neanderthal ancestor. Nature 524, 216–219 (2015).

  8. Fu, Q. et al. DNA analysis of an early modern human from Tianyuan Cave, China. Proc. Natl Acad. Sci. USA 110, 2223–2227 (2013).

  9. Korneliussen, T. S., Albrechtsen, A. & Nielsen, R. ANGSD: analysis of next generation sequencing data. BMC Bioinformatics 15, 356 (2014).

  10. Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).

  11. Loh, P. R. et al. Inferring admixture histories of human populations using linkage disequilibrium. Genetics 193, 1233–1254 (2013).

  12. Alexander, D. H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).

  13. Yang, M. A. et al. 40,000-year-old individual from Asia provides insight into early population structure in Eurasia. Curr. Biol. 27, 3202–3208 (2017).

  14. Massilani, D. et al. Denisovan ancestry and population history of early East Asians. Science 370, 579–583 (2020).

  15. Wang, C. C. & Li, H. Inferring human history in East Asia from Y chromosomes. Investig. Genet. 4, 11 (2013).

  16. Patterson, N. et al. Ancient admixture in human history. Genetics 192, 1065–1093 (2012).

  17. Yang, M. A. et al. Ancient DNA indicates human population shifts and admixture in northern and southern China. Science 369, 282–288 (2020).

  18. Nakashima, A., Ishida, H., Shigematsu, M., Goto, M. & Hanihara, T. Nonmetric cranial variation of Jomon Japan: implications for the evolution of eastern Asian diversity. Am. J. Hum. Biol. 22, 782–790 (2010).

  19. Bellwood, P. & Renfrew, C. Examining the Farming/Language Dispersal Hypothesis (McDonald Institute for Archaeological Research, 2002).

  20. Robbeets, M. & Savelyev, A. The Oxford Guide to the Transeurasian Languages (Oxford Univ. Press, 2020).

  21. Siska, V. et al. Genome-wide data from two early Neolithic East Asian individuals dating to 7700 years ago. Sci. Adv. 3, e1601877 (2017).

  22. Kamberov, Y. G. et al. Modeling recent human evolution in mice by expression of a selected EDAR variant. Cell 152, 691–702 (2013).

  23. Zhang, X. L. et al. The earliest human occupation of the high-altitude Tibetan Plateau 40 thousand to 30 thousand years ago. Science 362, 1049–1051 (2018).

  24. Chen, F. H. et al. Agriculture facilitated permanent human occupation of the Tibetan Plateau after 3600 B.P. Science 347, 248–250 (2015).

  25. Zhang, M., Yan, S., Pan, W. & Jin, L. Phylogenetic evidence for Sino-Tibetan origin in northern China in the Late Neolithic. Nature 569, 112–115 (2019).

  26. van Driem, G. in The Peopling of East Asia: Putting Together Archaeology, Linguistics and Genetics (eds Sagart, L. et al.) 81–106 (Routledge, 2005).

  27. Liu, S. et al. Genomic analyses from non-invasive prenatal testing reveal genetic associations, patterns of viral infections, and Chinese population history. Cell 175, 347–359 (2018).

  28. Chiang, C. W. K., Mangul, S., Robles, C. & Sankararaman, S. A comprehensive map of genetic variation in the world’s largest ethnic group—Han Chinese. Mol. Biol. Evol. 35, 2736–2750 (2018).

  29. Reich, D. et al. Reconstructing Native American population history. Nature 488, 370–374 (2012).

  30. Lipson, M. et al. Ancient genomes document multiple waves of migration in Southeast Asian prehistory. Science 361, 92–95 (2018).

  31. McColl, H. et al. The prehistoric peopling of Southeast Asia. Science 361, 88–92 (2018).

  32. Wang, L. X. et al. Reconstruction of Y-chromosome phylogeny reveals two neolithic expansions of Tibeto-Burman populations. Mol. Genet. Genomics 293, 1293–1300 (2018).

  33. Ge, J. X., Wu, S. D. & Chao, S. J. Zhongguo yimin shi (The Migration History of China) (Fujian People’s Publishing House, 1997).

  34. Ning, C. et al. Ancient genomes from northern China suggest links between subsistence changes and human migration. Nat. Commun. 11, 2700 (2020).

  35. Wei, L. H. et al. Phylogeography of Y-chromosome haplogroup O3a2b2-N6 reveals patrilineal traces of Austronesian populations on the eastern coastal regions of Asia. PLoS ONE 12, e0175080 (2017).

  36. Ko, A. M. et al. Early Austronesians: into and out of Taiwan. Am. J. Hum. Genet. 94, 426–436 (2014).

  37. Skoglund, P. et al. Genomic insights into the peopling of the Southwest Pacific. Nature 538, 510–513 (2016).

  38. Lipson, M. et al. Reconstructing Austronesian population history in island Southeast Asia. Nat. Commun. 5, 4689 (2014).

  39. Bellwood, P. The checkered prehistory of rice movement southwards as a domesticated cereal—from the Yangzi to the equator. Rice 4, 93–103 (2011).

  40. Yang, X. et al. Early millet use in northern China. Proc. Natl Acad. Sci. USA 109, 3726–3730 (2012).

  41. Wilkin, S. et al. Dairy pastoralism sustained eastern Eurasian steppe populations for 5,000 years. Nat. Ecol. Evol. 4, 346–355 (2020).

  42. Kovalev, A. The great migration of the Chemurchek people from France to the Altai in the early 3rd millennium bce. Int. J. Eurasian Stud. 1, 1–58 (2011).

  43. Jeong, C. et al. A dynamic 6,000-year genetic history of Eurasia’s Eastern Steppe. Cell 183, 890–904 (2020).

  44. Ning, C. et al. Ancient genomes reveal Yamnaya-related ancestry and a potential source of Indo-European speakers in Iron Age Tianshan. Curr. Biol. 29, 2526–2532 (2019).

  45. Bellwood, P. in The Encyclopedia of Global Human Migration (Wiley-Blackwell, 2013).

  46. Mallory, J. P. In Search of the Indo-Europeans: Language, Archaeology and Myth (Thames & Hudson, 1991).

  47. Anthony, D. The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World (Princeton Univ. Press, 2007).

  48. Schiffels, S. & Durbin, R. Inferring human population size and separation history from multiple genome sequences. Nat. Genet. 46, 919–925 (2014).

  49. Wang, K., Mathieson, I., O’Connell, J. & Schiffels, S. Tracking human population structure through time from whole genome sequences. PLoS Genet. 16, e1008552 (2020).

  50. Jeong, C. et al. Long-term genetic stability and a high-altitude East Asian origin for the peoples of the high valleys of the Himalayan arc. Proc. Natl Acad. Sci. USA 113, 7485–7490 (2016).

  51. Pinhasi, R., Fernandes, D. M., Sirak, K. & Cheronet, O. Isolating the human cochlea to generate bone powder for ancient DNA analysis. Nat. Protocols 14, 1194–1205 (2019).

  52. Sirak, K. A. et al. A minimally-invasive method for sampling human petrous bones from the cranial base for ancient DNA analysis. Biotechniques 62, 283–289 (2017).

  53. Dabney, J. et al. Complete mitochondrial genome sequence of a Middle Pleistocene cave bear reconstructed from ultrashort DNA fragments. Proc. Natl Acad. Sci. USA 110, 15758–15763 (2013).

  54. Korlević, P. et al. Reducing microbial and human contamination in DNA extractions from ancient bones and teeth. Biotechniques 59, 87–93 (2015).

  55. Rohland, N., Glocke, I., Aximu-Petri, A. & Meyer, M. Extraction of highly degraded DNA from ancient bones, teeth and sediments for high-throughput sequencing. Nat. Protocols 13, 2447–2461 (2018).

  56. Rohland, N., Harney, E., Mallick, S., Nordenfelt, S. & Reich, D. Partial uracil-DNA-glycosylase treatment for screening of ancient DNA. Phil. Trans. R. Soc. Lond. B 370, 20130624 (2015).

  57. Gansauge, M. T. & Meyer, M. Selective enrichment of damaged DNA molecules for ancient genome sequencing. Genome Res. 24, 1543–1549 (2014).

  58. Meyer, M. et al. A high-coverage genome sequence from an archaic Denisovan individual. Science 338, 222–226 (2012).

  59. Maricic, T., Whitten, M. & Pääbo, S. Multiplexed DNA sequence capture of mitochondrial genomes using PCR products. PLoS ONE 5, e14004 (2010).

  60. Rohland, N. & Hofreiter, M. Ancient DNA extraction from bones and teeth. Nat. Protocols 2, 1756–1762 (2007).

  61. John, J. S. SeqPrep. GitHub https://github.com/jstjohn/SeqPrep (2011).

  62. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  63. Behar, D. M. et al. A “Copernican” reassessment of the human mitochondrial DNA tree from its root. Am. J. Hum. Genet. 90, 675–684 (2012).

  64. Weissensteiner, H. et al. HaploGrep 2: mitochondrial haplogroup classification in the era of high-throughput sequencing. Nucleic Acids Res. 44, W58–W63 (2016).

  65. Günther, T. & Nettelblad, C. The presence and impact of reference bias on population genomic studies of prehistoric human populations. PLoS Genet. 15, e1008302 (2019).

  66. Kennett, D. J. et al. Archaeogenomic evidence reveals prehistoric matrilineal dynasty. Nat. Commun. 8, 14115 (2017).

  67. Lohse, J. C., Madsen, D. B., Culleton, B. J. & Kennett, D. J. Isotope paleoecology of episodic mid-to-late Holocene bison population expansions in the southern Plains, U.S.A. Quat. Sci. Rev. 102, 14–26 (2014).

  68. Reimer, P. J. et al. The IntCal20 Northern Hemisphere radiocarbon age calibration curve (0–55 cal kBP). Radiocarbon 62, 725–757 (2020).

  69. Bronk Ramsey, C. Bayesian analysis of radiocarbon dates. Radiocarbon 51, 337–360 (2009).

  70. Rasmussen, M. et al. An Aboriginal Australian genome reveals separate human dispersals into Asia. Science 334, 94–98 (2011).

  71. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 4, 7 (2015).

  72. Busing, F. T. A., Meijer, E. & van der Leeden, R. Delete-m jackknife for unequal m. Stat. Comput. 9, 3–8 (1999).

  73. Mallick, S. et al. The Simons Genome Diversity Project: 300 genomes from 142 diverse populations. Nature 538, 201–206 (2016).

  74. Monroy Kuhn, J. M., Jakobsson, M. & Günther, T. Estimating genetic kin relationships in prehistoric populations. PLoS ONE 13, e0195491 (2018).

  75. Ringbauer, H., Novembre, J. & Steinruecken, M. Human parental relatedness through time — detecting runs of homozygosity in ancient DNA. Preprint at bioRxiv https://doi.org/10.1101/2020.05.31.126912 (2020).

Acknowledgements

We thank D. Anthony, O. Bar-Yosef, K. Brunson, R. Flad, P. Flegontov, Q. Fu, W. Haak, I. Lazaridis, M. Lipson, I. Mathieson, R. Meadow, I. Olalde, N. Patterson, P. Skoglund, D. Xu, P. Bellwood and C. Chiang for comments; N. Saitou and the Asian DNA Repository Consortium for sharing genotype data from present-day Japanese groups; T. Nishimoto and T. Fujisawa from the Rebun Town Board of Education for sharing the Funadomari Jomon samples, and H. Tanaka and W. Nagahara from the Archeological Center of Chiba City, who are excavators of the Rokutsu Jomon site. The excavations at Boisman-2 site (Boisman culture), the Pospelovo-1 site (Yankovsky culture) and the Roshino-4 site (Heishui Mohe culture) were funded by the Far Eastern Federal University and the Institute of History, Archaeology and Ethnology Far Eastern Branch of the Russian Academy of Sciences; research on Pospelovo-1 is funded by RFBR project number 18-09-40101. C.-C.W. was funded by the Max Planck Society, the National Natural Science Foundation of China (NSFC 31801040), the Nanqiang Outstanding Young Talents Program of Xiamen University (X2123302), the Major project of National Social Science Foundation of China (20&ZD248), a European Research Council (ERC) grant to D. Xu (ERC-2019-ADG-883700-TRAM) and Fundamental Research Funds for the Central Universities (ZK1144). H.M. was supported by grant JSPS 16H02527. M.R. and C.-C.W. received funding from the ERC under the European Union’s Horizon 2020 research and innovation program (grant no. 646612) to M.R. H. Li was funded by NSFC (91731303, 31671297) and the B&R International Joint Laboratory of Eurasian Anthropology (18490750300). J.K. was funded by DFG grant KR 4015/1-1, the Baden Württemberg Foundation and the Max Planck Institute. Accelerator Mass Spectrometry radiocarbon dating work was supported by the National Science Foundation (NSF) (BCS-1460369) to D.J.K. and B.J.C. D.R. was funded by NSF grant BCS-1032255, NIH (NIGMS) grant GM100233, the Paul M. Allen Frontiers Group, John Templeton Foundation grant 61220, a gift from J.-F. Clin and the Howard Hughes Medical Institute.

Author information

Contributions

C.-C.W., H.-Y.Y., A.N.P., H.M., A.M.K., L.J., H. Li, J.K., R.P. and D.R. conceptualized the study. C.-C.W., R.B., M. Mah, S.M., Z.Z., B.J.C. and D.R. carried out the formal analysis; C.-C.W., K. Sirak, O.C., A.K., N.R., A.M.K., M. Mah, S.M., K.W., N.A., N.B., K.C., F.C., K.S.D.C., B.J.C., L.E., S.F., D.K., A.M.L., K.M., M. Michel, J.O., K.T.O., K. Stewardson, S.W., S.Y., F.Z., J.G., Q.D., L.K., Dawei Li, Dongna Li, R.L., W.C., N., R.S., L.-X.W., L.W., G.X., H.Y., M.Z., G.H., X.Y., R.H., S.S., D.J.K., L.J., H. Li, J.K., R.P. and D.R. carried out the investigation. H.-Y.Y., A.N.P., R.B., D.T., J.Z., Y.-C.L., J.-Y.L., M. Mah, S.M., Z.Z., R.C., H. Looh, C.-J.H., C.-C.S., Y.G.N., A.V.T., A.A.T., S.L., Z.-Y.S., X.-M.W., T.-L.Y., X.H., L.C., H.D., J.B., E. Mijiddorj, D.E., T.-O.I., E. Myagmar, H.K.-K., M.N., K.-i.S., O.A.S., D.J.K., R.P. and D.R. provided resources. C.-C.W., K. Sirak, O.C., A.K., N.R., R.B., M. Mah, S.M., B.J.C., L.E., A.A.T. and D.R. curated the data. C.-C.W., H.-Y.Y., A.N.P., H.M., A.K. and D.R. wrote the paper. C.-C.W., H.-Q.Z., N.R., M.R., S.S., D.J.K., L.J., H. Li, J.K., R.P. and D.R. supervised the study.

Corresponding authors

Correspondence to Chuan-Chao Wang, Johannes Krause, Ron Pinhasi or David Reich.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Nature thanks Peter Bellwood, Charleston Chiang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 PCA of ancient samples.

Projection of ancient samples onto PCA dimensions 1 and 2 defined by East Asian, European, Siberian and Native American populations.

Extended Data Fig. 2 PCA of present-day samples.

a, PCA dimensions 1 and 2 defined by present-day East Asian, European, Siberian and Native American populations. b, PCA dimensions 1 and 2 defined by present-day East Asian groups with little West Eurasian mixture.

Extended Data Fig. 3 Neighbour-joining tree of present-day East Eurasian individuals using the Human Origins dataset.

a, Neighbour-joining tree of present-day East Eurasian individuals based on FST distances using the Human Origins dataset. Branch lengths are shown in units of FST distance. b, Neighbour-joining tree of present-day East Eurasian individuals in which internal branches are all shown with the same branch length for better visualization.

Extended Data Fig. 4 Admixture plot at K = 15 using the Human Origins dataset.

a–f, We grouped the populations roughly into six groups based on geographical and genetic affinity. a, Populations mainly from Africa (yellow), America (magenta), West Eurasia (dark green and light brown) and Oceania (light magenta). b, Populations mainly from Mongolia (blue) and Siberia (purple). c, Populations mainly from southern China and Southeast Asia (light blue). d, Populations mainly from the Tibetan Plateau (olive) and Neolithic Yellow River Basin (red). e, Mainly Han Chinese groups from China (light blue and red). f, Populations mainly from the Amur River Basin (blue and red) and northeast Asia.

Extended Data Fig. 5 Estimates of population split times.

a, Cross-coalescence rates for selected population pairs. We ran MSMC for four pairs of populations: Tibetan–Ami, Tibetan–Atayal, Tibetan–Ulchi and Tibetan–Mixe. We used one individual from each population in this analysis. The modern genomic data for those individuals are from the Simons Genome Diversity Project. The times are calculated based on the mutation rate and generation time specified on the x axis. b, Cross-coalescence rates for selected population pairs. The same analysis as shown in a but using MSMC2 instead of MSMC, and using two individuals per population except for the Tibetan–Atayal pair, for which we used only one.
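The time scaling and cross-coalescence rates mentioned in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: it assumes the convention described in the MSMC documentation, in which mutation-scaled time is divided by the per-generation mutation rate and multiplied by the generation time to give years, and in which the relative cross-coalescence rate is twice the between-population rate divided by the sum of the two within-population rates. The mutation rate, generation time, input values and function names are hypothetical placeholders. Reading off the time at which the relative rate reaches 0.5 is a common heuristic for a split time, not a result from the paper.

```python
def scaled_time_to_years(scaled_time, mutation_rate, generation_years):
    """Convert mutation-scaled time to years (assumed MSMC convention)."""
    return scaled_time / mutation_rate * generation_years


def relative_ccr(lambda_within_1, lambda_within_2, lambda_between):
    """Relative cross-coalescence rate for a single time segment."""
    return 2.0 * lambda_between / (lambda_within_1 + lambda_within_2)


def first_time_at_or_above(times, rates, threshold=0.5):
    """Scan from the present backwards; return the first time whose rate
    reaches the threshold, or None if it never does."""
    for t, r in zip(times, rates):
        if r >= threshold:
            return t
    return None


# Hypothetical inputs: one entry per time segment, most recent first.
MU = 1.25e-8   # assumed mutation rate per site per generation
GEN = 29       # assumed generation time in years
scaled = [1e-5, 5e-5, 1e-4, 2e-4]
l00 = [2000.0, 1500.0, 1000.0, 800.0]   # within population 1
l11 = [2000.0, 1500.0, 1000.0, 800.0]   # within population 2
l01 = [100.0, 450.0, 800.0, 790.0]      # between populations

years = [scaled_time_to_years(t, MU, GEN) for t in scaled]
ccr = [relative_ccr(a, b, c) for a, b, c in zip(l00, l11, l01)]
split_estimate = first_time_at_or_above(years, ccr)
```

Because the result depends directly on the assumed mutation rate and generation time, split times from such curves are best compared between population pairs rather than read as absolute dates.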

Extended Data Fig. 6 Admixture graph model.

This figure is the same as Fig. 2, except that we show the fitted genetic drift on each lineage. We used all available sites in the dataset, comprising 1,237,207 SNPs, and also restricted to transversions only to confirm that the same model fit (Supplementary Information section 3). We started with a skeleton tree that fits the data for Denisovan, Mbuti, Onge, Tianyuan and Luxembourg Loschbour and one admixture event. We grafted on Mongolia East Neolithic, Late Neolithic farmers from the Upper Yellow River, Liangdao 2, Japan Jomon, Nepal Chokhopani, Taiwan Hanben and Late Neolithic farmers from the West Liao River in turn, adding them consecutively to all possible edges in the tree and retaining only graph solutions that provided no differences of |Z| > 3 between fitted and estimated statistics (maximum |Z| = 2.95 here). We used the MSMC and MSMC2 relative population split time estimates to constrain models. Deep splits are not well constrained because of the minimal availability of data on East Asian populations from the Upper Paleolithic. a, Locations and dates of the East Asian individuals used in model fitting, with colours indicating whether the majority ancestry is from the hypothesized coastal expansion (green), interior expansion south (red) and interior expansion north (blue). The map is based on the ‘Google Map Layer’ from ArcGIS Online Basemaps (map data ©2020 Google). The grey circles represent sampled populations and white circles represent unsampled hypothesized nodes. b, In the model visualization, we colour lineages modelled as deriving entirely from one of these expansions, and also colour populations according to ancestry proportions. Dashed lines represent admixture (proportions are marked), and we show the amount of genetic drift on each lineage in units of FST × 1,000.
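The acceptance rule for candidate graphs, under which a graph is kept only if no statistic predicted by the fitted model differs from its estimate in the data by more than three standard errors, can be sketched as below. This is a minimal illustration, assuming the estimated statistics, fitted values and standard errors are already available as plain lists; the function names and numbers are hypothetical and are not taken from the paper.

```python
def z_scores(estimated, fitted, standard_errors):
    """Residual of each statistic in units of its standard error."""
    return [(e - f) / se for e, f, se in zip(estimated, fitted, standard_errors)]


def worst_abs_z(estimated, fitted, standard_errors):
    """Largest absolute Z-score across all statistics."""
    return max(abs(z) for z in z_scores(estimated, fitted, standard_errors))


def graph_is_acceptable(estimated, fitted, standard_errors, threshold=3.0):
    """Keep the graph only if no residual exceeds the threshold."""
    return worst_abs_z(estimated, fitted, standard_errors) <= threshold


# Hypothetical f-statistics for one candidate graph.
estimated = [0.0120, -0.0034, 0.0005]
fitted = [0.0110, -0.0030, 0.0020]
errors = [0.0005, 0.0004, 0.0006]

max_z = worst_abs_z(estimated, fitted, errors)           # about 2.5
accepted = graph_is_acceptable(estimated, fitted, errors)
```

A graph with a single residual beyond the threshold would be rejected under this rule, which is why the caption reports the maximum |Z| rather than an average.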

Extended Data Fig. 7 Shared genetic drift among Tibetan groups, measured by f3(X, Y; Mbuti).

Lighter colours indicate more shared drift. The Lahu group with the Southeast Asian cluster, probably owing to substantial admixture. The Tibetan_Yajiang are geographically in the Tibeto-Burman Corridor but group with Core Tibetan individuals, presumably reflecting less genetic admixture from people of the Southeast Asian cluster.
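The outgroup statistic f3(X, Y; Mbuti) used in this figure can be illustrated from allele frequencies. The sketch below uses the basic definition of an f3-statistic (ref. 16): the average over SNPs of (o − x)(o − y), where o, x and y are the allele frequencies in the outgroup and in the two test populations. It omits the finite-sample bias correction and block-jackknife standard errors that dedicated software applies, and the frequencies and function name are invented for illustration.

```python
def outgroup_f3(freq_x, freq_y, freq_outgroup):
    """Mean of (o - x) * (o - y) over SNPs; larger values mean X and Y
    share more drift since splitting from the outgroup."""
    if not (len(freq_x) == len(freq_y) == len(freq_outgroup)):
        raise ValueError("frequency lists must have the same length")
    if not freq_x:
        raise ValueError("at least one SNP is required")
    total = 0.0
    for x, y, o in zip(freq_x, freq_y, freq_outgroup):
        total += (o - x) * (o - y)
    return total / len(freq_x)


# Hypothetical allele frequencies at four SNPs.
outgroup = [0.10, 0.50, 0.90, 0.30]
pop_a = [0.40, 0.20, 0.60, 0.30]
pop_b = [0.45, 0.25, 0.55, 0.35]   # drifted together with pop_a
pop_c = [0.15, 0.80, 0.85, 0.10]   # drifted in other directions

f3_ab = outgroup_f3(pop_a, pop_b, outgroup)
f3_ac = outgroup_f3(pop_a, pop_c, outgroup)
```

In this toy example the pair (pop_a, pop_b) gives the larger value, which corresponds to a lighter cell in a heat map of the kind the caption describes.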

Extended Data Table 1 Population information for newly genotyped present-day individuals
Extended Data Table 2 Kinship detected between pairs of individuals

Supplementary information

Supplementary Information

This Supplementary Information file contains an Ethics Statement and Supplementary Information sections 1–4, including 15 Supplementary Figures, 5 Supplementary Tables and Supplementary References. The supplementary figures and tables provide information on the genetic structure and population history of East Asians.

Reporting Summary

Supplementary Tables

This zipped file contains 26 Supplementary Tables and a table guide.

Supplementary Data

Genotypes of the newly reported 166 ancient individuals.

Peer Review File

Cite this article

Wang, CC., Yeh, HY., Popov, A.N. et al. Genomic insights into the formation of human populations in East Asia. Nature 591, 413–419 (2021). https://doi.org/10.1038/s41586-021-03336-2

  • DOI: https://doi.org/10.1038/s41586-021-03336-2
