This is an unedited manuscript that has been accepted for publication. Nature Research are providing this early version of the manuscript as a service to our authors and readers. The manuscript will undergo copyediting, typesetting and a proof review before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers apply.

Genomic Insights into the Formation of Human Populations in East Asia


The deep population history of East Asia remains poorly understood due to a lack of ancient DNA data and sparse sampling of present-day people1,2. We report genome-wide data from 166 East Asians dating to 6000 BCE – 1000 CE and 46 present-day groups. Hunter-gatherers from Japan and the Amur River Basin, and people of Neolithic and Iron Age Taiwan and the Tibetan Plateau, are linked by a deeply splitting lineage, likely reflecting a Late Pleistocene coastal migration. We follow Holocene expansions from four regions. First, hunter-gatherers of Mongolia and the Amur River Basin have ancestry shared by Mongolic and Tungusic language speakers but do not carry West Liao River farmer ancestry, contradicting theories that the expansion of these farmers spread these proto-languages. Second, Yellow River Basin farmers at ~3000 BCE likely spread Sino-Tibetan languages, as their ancestry dispersed both to Tibet, where it forms up to ~84% of the ancestry of some groups, and to the Central Plain, where it contributed ~59-84% to Han Chinese. Third, people from Taiwan from ~1300 BCE to 800 CE derived ~75% of their ancestry from a lineage also common in modern Austronesian, Tai-Kadai and Austroasiatic speakers, likely deriving from Yangtze River Valley farmers; ancient Taiwan people also derived ~25% of their ancestry from a northern lineage related to but different from Yellow River farmers, implying an additional north-to-south expansion. Fourth, Yamnaya Steppe pastoralist ancestry arrived in western Mongolia after ~3000 BCE but was displaced by previously established lineages, even while it persisted in western China, as expected if it spread the ancestor of Tocharian Indo-European languages. Two later gene flows affected western Mongolia: migrants with Yamnaya and European farmer ancestry after ~2000 BCE, and episodic impacts of later groups with ancestry from Turan.


Author information



Corresponding authors

Correspondence to Chuan-Chao Wang or Johannes Krause or Ron Pinhasi or David Reich.

Supplementary information

Supplementary Information

This Supplementary Information file contains an Ethics Statement and Supplementary Information sections 1-4, including 15 Supplementary Figures, 5 Supplementary Tables and Supplementary References. The supplementary figures and tables provide information on the genetic structure and population history of East Asians.

Reporting Summary

Supplementary Tables

This zipped file contains 26 Supplementary Tables and a table guide.

Supplementary Data

Genotypes of the newly reported 166 ancient individuals.

Peer Review File


About this article


Cite this article

Wang, CC., Yeh, HY., Popov, A.N. et al. Genomic Insights into the Formation of Human Populations in East Asia. Nature (2021).
