The genetic variation in Northern Asian populations is currently undersampled. To address this, we generated a new genetic variation reference panel by whole-genome sequencing of 175 ethnic Mongolians, representing six tribes. The cataloged variation in the panel shows strong population stratification among these tribes, which correlates with the diverse demographic histories in the region. Incorporating our results with the 1000 Genomes Project panel identifies derived alleles shared between Finns and Mongolians/Siberians, suggesting that substantial gene flow between northern Eurasian populations has occurred in the past. Furthermore, we highlight that North, East, and Southeast Asian populations are more aligned with each other than these groups are with South Asian and Oceanian populations.

Data availability

Raw sequencing data and variant sets have been deposited in the CNGB (China National GeneBank) Nucleotide Sequence Archive (CNSA) under accession CNP0000063 (https://db.cngb.org/cnsa/).

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.




Acknowledgements

We sincerely thank the Mongolian volunteers who agreed to contribute blood samples and participate in this study. We thank D. Reich for sharing genotype data on populations from Siberia and South Asia, and J. Fekecs for graphical assistance. We acknowledge F.S. Collins and C.D. Bustamante for their helpful discussions and comments on the manuscript, as well as Shuangshan Shuangshan, Y. Bao, and S. Ba for contributing to the sample collection process. This study was supported by the Shenzhen Municipal Government of China (CXB201108250094A), the Inner Mongolia University for Nationalities Scientific Research Project (MD2012038), the National Science Foundation of China (81560176, 81511130050), China National GeneBank, the Foundation of the Inner Mongolia Department of Science and Technology (2015MS0875, 201502103), the Science and Technology Planning Project of Inner Mongolia, China (20120409), and the Guangdong Provincial Key Laboratory of Genome Read and Write (2017B030301010). C.R.G. is supported by the US National Institutes of Health (4U01HG007419-04) and the National Science Foundation (1201234). N.N., S.R.B., and L.C.B. are supported by the Intramural Research Program of the National Human Genome Research Institute, National Institutes of Health.

Author information

Author notes

  1. These authors contributed equally: Haihua Bai, Xiaosen Guo, Narisu Narisu, Tianming Lan, Qizhu Wu.


Affiliations

  1. School of Life Science, Inner Mongolia University for the Nationalities, Tongliao, China

    Haihua Bai, Ying Gao, Suyalatu Suyalatu, Huiguang Wu & Yujie Chen

  2. Inner Mongolia Engineering Research Center of Personalized Medicine, Tongliao, China

    Haihua Bai

  3. BGI-Shenzhen, Shenzhen, China

    Xiaosen Guo, Tianming Lan, Yong Zhang, Dandan Zhang, Bingyi Ding, Haorong Lu, Wangsheng Li, Ningxin Dang, Huixin Xu, Xin Luo, Xiaolian Ning, Bo Wang, Chen Ye, Lin Fang, Wenhao Xu, Xin Liu, Xun Xu, Huanming Yang, Jun Wang & Karsten Kristiansen

  4. Laboratory of Genomics and Molecular Biomedicine, Department of Biology, University of Copenhagen, Copenhagen, Denmark

    Xiaosen Guo, Tianming Lan, Zongze Wu, Jun Wang, Karsten Kristiansen & Ye Yin

  5. China National GeneBank, BGI-Shenzhen, Shenzhen, China

    Xiaosen Guo, Tianming Lan, Yong Zhang, Dandan Zhang, Bingyi Ding, Haorong Lu, Wangsheng Li, Ningxin Dang, Huixin Xu, Xin Luo, Xiaolian Ning, Bo Wang, Chen Ye, Lin Fang, Wenhao Xu, Xin Liu, Xun Xu & Huanming Yang

  6. Medical Genomics and Metabolic Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA

    Narisu Narisu

  7. Affiliated Hospital of Inner Mongolia University for the Nationalities, Tongliao, China

    Qizhu Wu, Liqing Yang, Hashenqimuge Hashenqimuge & Burenbatu Burenbatu

  8. College of Life Science, Inner Mongolia Agricultural University, Hohhot, China

    Yanping Xing, Yanru Zhang, Dong Zhang, Li Zhang, Junwei Cao, Yiyi Liu, Shenyuan Wang, Chunxia Liu, Xueqiong Li, Fanhua Meng, Kaifeng Wu, Yingchun Liu, Lu Li, Tao Li & Huanmin Zhou

  9. Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA

    Stephen R. Bond

  10. College of Computer Science and Technology, Inner Mongolia University for the Nationalities, Tongliao, China

    Zhili Pei & Mingyang Jiang

  11. College of Mathematics, Inner Mongolia University for the Nationalities, Tongliao, China

    Jirimutu Jirimutu

  12. BGI Genomics, BGI-Shenzhen, Shenzhen, China

    Xukui Yang, Zongze Wu & Ye Yin

  13. College of Mongolian Studies, Inner Mongolia University for the Nationalities, Tongliao, China

    Morigenbatu Morigenbatu & Dingzhu Wang

  14. Inner Mongolia International Mongolian Hospital, Hohhot, China

    Baozhu Guan

  15. Guangdong Provincial Key Laboratory of Genome Read and Write, Shenzhen, China

    Haorong Lu

  16. Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark

    Kalle Leppälä

  17. College of Life Science and Technology, Huazhong Agricultural University, No.1 Shizishan Street, Wuhan, China

    Wenhao Xu

  18. Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA

    Christopher R. Gignoux

  19. James D. Watson Institute of Genome Sciences, Hangzhou, China

    Huanming Yang

  20. Gene and Environment Interaction Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA

    Lawrence C. Brody

  21. School of Life Science and Biotechnology, Dalian University of Technology, Dalian, China

    Ye Yin




Contributions

Y.Y., H.Z., B.B., and H.B. initiated and supervised the project. H.B., Q.W., Y.X., Z.P., J.J., X.Y., M.M., B.G., D.W., Y.G., H.H., S.S., Y.C., YanruZ., L.Z., YiyiL., C.L., F.M., K.W., L.L., and YingchunL. surveyed and collected the samples. Y.X., YanruZ., DongZ., J.C., S.W., X.Li, and T.Li extracted the genomic DNA. H.B., X.G., Q.W., M.J., and B.W. performed the genome sequencing. YongZ., L.F., H.W., and T.Lan performed the mapping and variant calling. T.Lan, X.G., H.L., W.L., Z.W., and B.W. performed experimental validation. X.G., T.Lan, and B.D. constructed the haplotype reference panel. X.G., T.Lan, DandanZ., H.X., N.D., X.Luo, W.X., and L.Y. analyzed population diversity and genetic structure. T.Lan, X.G., N.N., B.D., and X.N. inferred the population demographic history. N.N., S.R.B., K.L., and C.R.G. analyzed the phylogeny of East Asians. X.G., N.N., T.Lan, and S.R.B. wrote the manuscript. X.G., C.Y., X.Luo, and T.Li were in charge of data submission. N.N., X.G., T.Lan, S.R.B., N.D., C.R.G., X.X., X.Liu, H.Y., L.C.B., J.W., and K.K. revised the manuscript.

Competing interests

The authors declare no competing interests.

Corresponding authors

Correspondence to Burenbatu Burenbatu, Huanmin Zhou, or Ye Yin.

Supplementary information

  1. Supplementary Text and Figures

    Supplementary Figures 1–17 and Supplementary Tables 1–8

  2. Reporting Summary
