There is a long-standing debate on the genetic origin of Chinese Muslim populations, such as Uygur, Dongxiang, and Hui. However, genetic information for those Muslim populations, except Uygur, is extremely limited. In this study, we investigated the genetic structure and ancestry of Chinese Muslims by analyzing 15 autosomal short tandem repeats in 652 individuals from Dongxiang, Hui, and Han Chinese populations in Gansu province. Both genetic distance and Bayesian-clustering methods showed significant genetic homogeneity between the two Muslim populations and East Asian populations, suggesting a common genetic ancestry. Our analysis found no evidence of substantial gene flow from the Middle East or Europe into the Dongxiang and Hui people during their Islamization. The dataset generated in the present study is also valuable for forensic identification and paternity tests in China.
Chinese Muslim populations refer to ten officially recognized Muslim ethnic groups: Uygur, Dongxiang, Hui, Bo’an, Kazakh, Kirghiz, Salar, Tatar, Tajik, and Uzbek. Whether those populations originated via demic diffusion, which involves the mass movement of people, or via simple cultural diffusion is a long-standing debate.
According to historical materials, Islam was first introduced to China some 1,400 years ago in the Tang Dynasty (618–907 AD) by a large number of soldiers, merchants and political emissaries from Arabia and Persia (the present-day Middle East)1. Chinese Muslim populations are believed to be descendants of those immigrants. Uygur has already been shown by genome-wide scan to be a typical admixture of East Asian and European ancestry2. However, Dongxiang, Bo’an, and Hui in Gansu and Ningxia have the common physical features of East Asians (Mongoloid type)3,4,5. Xie and Shan6 also detected genetic similarity between Hui and Han Chinese using two autosomal short tandem repeats (STRs), TH01 and D13S317. Paternal Y chromosomal STR clustering has placed Hui of Liaoning and Ningxia in the group of Han Chinese and Tibeto-Burman populations7. About 24–30% of the Y chromosomes of Salar, Bo’an, and Dongxiang belong to the East Asian specific haplogroup O3-M122. The Y chromosomal lineage R-M17, prevalent in Central Asia, South Asia, and Europe, also comprises 17%, 26%, and 28% of Salar, Bo’an, and Dongxiang, respectively8.
The origin of Chinese Muslim populations likely involved massive assimilation of indigenous ethnic groups9. However, previous studies with limited genetic markers and small sample sizes have not been able to give a clear answer on the genetic ancestry of those Muslim populations. Therefore, we analyzed 15 autosomal STRs in 652 individuals from Dongxiang, Hui, and the co-resident Han Chinese populations in Linxia, Gansu province, to explore the genetic diversity of Chinese Muslims and to test population affinities and the level of admixture. Dongxiang and Hui are typical of contemporary Chinese Muslim communities. A comprehensive comparison of those two populations with worldwide Muslims and non-Muslims will shed more light on the origin of Chinese Muslims.
We collected blood samples from 163 and 219 unrelated individuals of the two Muslim populations Dongxiang and Hui, respectively, in Linxia, Gansu province. We also collected blood samples from 270 unrelated Han Chinese individuals in Linxia for comparison. Our study was approved by the Ethical Committee of Gansu Institute of Political Science and Law. The study was conducted in accordance with the human and ethical research principles of Gansu Institute of Political Science and Law. All individuals were adequately informed and signed their informed consent before participation. For each sample, genomic DNA was extracted according to the Chelex-100 method and proteinase K protocol10. Fifteen of the most widely used forensic STR loci (D8S1179, D21S11, D7S820, CSF1PO, D3S1358, D13S317, D16S539, D2S1338, D19S433, vWA, D18S51, D5S818, FGA, D6S1043 and D12S391) were amplified simultaneously using the AmpFlSTR Sinofiler PCR Amplification Kit (Applied Biosystems, Foster City, CA, USA). The PCR products were analyzed with the 3500XL DNA Genetic Analyzer and Genemapper ID-X software (Applied Biosystems, Foster City, CA, USA).
Allele frequency, heterozygosity, polymorphism information content (PIC), power of discrimination (PD), probability of paternity exclusion (PPE), and other forensic parameters were calculated using PowerStatsV12 (http://www.promega.com/) and Cervus 3.011. Tests for Hardy–Weinberg equilibrium (HWE) were performed in Arlequin12 using a likelihood ratio test and an exact test to check for miscalled STR genotypes or biased sampling. Because the statistical analyses in this study were based on a Bayesian-clustering algorithm, raw genotypic data of 13 STRs (excluding D6S1043 and D12S391) from 45 populations (13,793 individuals) from around the world were extracted to determine population affinity13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39. Average number of pairwise differences, pairwise Fst, Slatkin's linearized Fst, and coancestry coefficients were all calculated in Arlequin using genotype data12. The detailed population genetic structure was inferred using the model-based clustering method implemented in Structure 2.3.440,41 under assumptions of admixture and correlated allele frequencies. Each run used 100,000 estimation iterations for K = 2 to 8 after a burn-in of 20,000 iterations, with several replicates. Posterior probabilities for each K were computed for each set of runs. The matrix plot of genetic distances and the population structure plot were produced in R statistical software v3.0.242 and Distruct v1.143.
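As an illustration of the locus-level statistics named above, the following Python sketch computes allele frequencies, observed and expected heterozygosity, PIC, and PD for a single locus from a list of diploid genotypes. It uses standard textbook definitions (He = 1 − Σp², the Botstein PIC formula, and PD = 1 − Σ of squared genotype frequencies); that these match the exact conventions of PowerStats and Cervus is an assumption, and the function name and toy genotypes are hypothetical rather than taken from the study.

```python
from collections import Counter
from itertools import combinations


def locus_statistics(genotypes):
    """Return basic forensic statistics for one STR locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    """
    n = len(genotypes)
    # Allele frequencies from the 2n allele copies in the sample.
    allele_counts = Counter(a for g in genotypes for a in g)
    freqs = {a: c / (2 * n) for a, c in allele_counts.items()}

    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity under Hardy-Weinberg proportions
    # (no small-sample correction is applied in this sketch).
    he = 1.0 - sum(p * p for p in freqs.values())
    # Polymorphism information content: He minus the sum over allele pairs of 2*p_i^2*p_j^2.
    pic = he - sum(2 * (p * q) ** 2 for p, q in combinations(freqs.values(), 2))
    # Power of discrimination from observed genotype frequencies.
    genotype_counts = Counter(tuple(sorted(g)) for g in genotypes)
    power_disc = 1.0 - sum((c / n) ** 2 for c in genotype_counts.values())
    return {"freqs": freqs, "Ho": ho, "He": he, "PIC": pic, "PD": power_disc}


# Hypothetical toy sample of four individuals typed at one locus.
toy = [(10, 11), (10, 10), (11, 12), (10, 12)]
stats = locus_statistics(toy)
```

Real forensic software may apply sample-size corrections or estimate PD from expected rather than observed genotype frequencies, so small differences from published tables would not be surprising.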
Forensic parameter analysis
The genotype data for the three populations, Dongxiang, Hui, and Han Chinese, are given in Supplementary Table 1. The allele frequency distributions and forensic parameters are listed in Table 1 and Supplementary Table 2. Observed heterozygosity (Ho) ranged from 0.688 at the CSF1PO locus in Hui to 0.914 at the D6S1043 locus in Dongxiang, while expected heterozygosity (He) ranged from 0.704 at the D3S1358 locus in Han to 0.883 at the D6S1043 locus in Dongxiang. In the test of HWE, the genotype frequency distributions showed no significant deviations from expectations, except at the D19S433 locus in Hui (p = 0.030). PIC of all selected loci ranged from 0.652 at D3S1358 in Han to 0.868 at D6S1043 in Dongxiang. The values of PD ranged from 0.861 at D3S1358 in Han to 0.969 in Dongxiang and Hui. The highest PPE was found at the D6S1043 locus in Dongxiang (0.824), and the lowest at the CSF1PO locus in Hui (0.410). The most polymorphic loci in all three populations were highly discriminating, which indicates that this set of loci will be useful for forensic identification.
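The reported PPE values can be related to observed heterozygosity with a short calculation. The sketch below assumes the commonly used power-of-exclusion formula PE = h²(1 − 2hH²), where h is the proportion of heterozygotes and H = 1 − h is the proportion of homozygotes. Whether the software applied exactly this formula here is an assumption, and the function name is hypothetical.

```python
def power_of_exclusion(ho):
    """Power of exclusion from observed heterozygosity (assumed formula)."""
    h = ho            # proportion of heterozygotes
    hom = 1.0 - ho    # proportion of homozygotes
    return h * h * (1.0 - 2.0 * h * hom * hom)


# Ho values quoted in the text for the two extreme loci.
pe_csf1po_hui = power_of_exclusion(0.688)         # CSF1PO in Hui
pe_d6s1043_dongxiang = power_of_exclusion(0.914)  # D6S1043 in Dongxiang
print(round(pe_csf1po_hui, 3), round(pe_d6s1043_dongxiang, 3))
```

Under this assumed formula, the two Ho values give approximately 0.410 and 0.824, which agrees with the lowest and highest PPE values reported for the same loci.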
Interpopulation genetic distances
We calculated several measures of genetic distance to infer population structure (Fig. 1 and Supplementary Table 3). The Chinese Muslim populations Dongxiang and Hui showed the largest pairwise genetic distances from populations in Africa and the Middle East. The smallest genetic distances were between the Chinese Muslim populations and East Asian populations, especially Han Chinese. Dongxiang showed no significant pairwise Fst difference from Hui in Linxia and Ningxia, Han Chinese in Linxia, Shaanxi, Shanghai, and Guangdong, or Tibetan in Lhasa (p > 0.005). The genetic divergence between Dongxiang and those populations is relatively small (pairwise Fst < 0.002 and Slatkin linearized Fst < 0.003). Hui in Linxia also showed no significant genetic difference from Hui in Ningxia, Uygur in Yili, Han Chinese in Shaanxi and Yunnan, Russian in Inner Mongolia, or Tibetan in Lhasa. Hui in Ningxia also did not differ from any of the five Han Chinese populations in this study. However, almost all the pairwise Fst values between Dongxiang or Hui and European, Middle Eastern, and African populations are above 0.01. The average pairwise differences exhibit a very similar pattern. The two Uygur populations differed statistically from all other populations. The genetic distances of Uygur from East Asian, European, and most Middle Eastern populations are almost the same, indicating that Uygur is an admixed population.
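Two of the distance measures used here are simple transformations of pairwise Fst. The following Python sketch applies the usual definitions, Slatkin's linearized Fst as Fst / (1 − Fst) and the Reynolds coancestry distance as −ln(1 − Fst). It is assumed that these correspond to the quantities reported in Supplementary Table 3; the function names are hypothetical, and the inputs are only the thresholds mentioned in the text, not values from the study's tables.

```python
import math


def slatkin_linearized_fst(fst):
    """Slatkin's linearized Fst: Fst / (1 - Fst)."""
    if fst >= 1.0:
        raise ValueError("Fst must be below 1")
    return fst / (1.0 - fst)


def reynolds_coancestry(fst):
    """Reynolds' coancestry distance: -ln(1 - Fst)."""
    if fst >= 1.0:
        raise ValueError("Fst must be below 1")
    return -math.log(1.0 - fst)


# Thresholds mentioned in the text, used here only as illustrative inputs.
for fst in (0.002, 0.01):
    print(fst, slatkin_linearized_fst(fst), reynolds_coancestry(fst))
```

For Fst values as small as those between Dongxiang and East Asian populations, both transforms stay very close to Fst itself, so the three measures rank population pairs almost identically.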
Clustering by structure analysis
Analysis of genetic distance failed to support a genetic affinity of the Chinese Muslim populations Dongxiang and Hui with Middle Eastern or European populations. We then employed a cluster-based algorithm to further clarify population genetic structure at the individual level. According to the highest posterior probabilities, the most suitable K was K = 3 (Supplementary Table 4). The clustering showed a very clear geographic pattern (Fig. 2). East Asian, European, and African populations belong to clusters 1, 2, and 3, respectively. Middle Eastern populations appear to be an admixture of African and European populations. Uygur populations shared a similar degree of membership in the East Asian and European clusters (35% to 40%), which is consistent with the genetic distance analysis and previous reports2. The case of Uygur shows clearly that the 15 STRs and Structure analysis have enough power to delineate the ancestry of populations. The proportion of membership of Dongxiang and Hui in cluster 1 reaches 58.3% to 63.9%. Although this proportion is about 10% lower than that of Han Chinese (66.8% to 74.4%), it still falls within the general range of East Asian populations, from 57.8% (Lhoba) to 80.3% (She) (Supplementary Table 4). It is possible that some individuals of Dongxiang and Hui have excess affinity with West Eurasians, as they live close to Uygur and Central Asians (Supplementary 6). However, the majority of Dongxiang and Hui samples share very similar membership with other East Asian populations, revealing a common genetic makeup.
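The two summary steps described here, choosing K from posterior probabilities and reporting population-level membership, can be sketched in Python as follows. The first function uses the ad hoc approximation Pr(K | X) ∝ exp(ln P(X | K)) under a uniform prior on K, and the second simply averages individual membership coefficients within a population. Both function names and all numbers are hypothetical, and this is not claimed to be the exact procedure behind Supplementary Table 4.

```python
import math


def posterior_of_k(log_likelihoods):
    """Approximate Pr(K | data) from ln P(X | K), assuming a uniform prior on K."""
    top = max(log_likelihoods.values())
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    weights = {k: math.exp(v - top) for k, v in log_likelihoods.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def mean_membership(q_rows):
    """Average individual membership coefficients into a population profile."""
    n = len(q_rows)
    return [sum(col) / n for col in zip(*q_rows)]


# Hypothetical ln P(X | K) values, not taken from the study.
ln_prob = {2: -5020.0, 3: -5000.0, 4: -5012.0}
post = posterior_of_k(ln_prob)
best_k = max(post, key=post.get)

# Hypothetical membership coefficients (clusters 1 to 3) for three individuals.
q = [[0.60, 0.25, 0.15], [0.70, 0.20, 0.10], [0.50, 0.30, 0.20]]
profile = mean_membership(q)
```

Averaging within populations hides individual variation, which is why outlying individuals with excess West Eurasian membership need to be inspected separately from the population means.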
Whether Chinese Muslim populations originated via demic diffusion or simple cultural diffusion has long been debated. Previous genetic studies with limited markers and small sample sizes often came to contradictory conclusions. In this study, we focused on the genetic makeup and ancestry clustering of Dongxiang and Hui using 15 autosomal STRs genotyped in more than 600 individuals. The two Chinese Muslim populations Dongxiang and Hui showed significant genetic homogeneity with co-resident Han Chinese in Linxia and other East Asian populations rather than with European or Middle Eastern populations, which supports simple cultural diffusion as the origin of Dongxiang and Hui in China. This cultural transformation phenomenon has also been observed in other Muslim populations. Although the Utsat people in Hainan Island are thought to be descendants of the Champa Kingdom and have been officially recognized as Hui nationality, they are genetically much closer to the Hainan indigenous ethnic groups than to the Cham and other mainland Southeast Asian populations9. Analyses of autosomal STRs44,45, mitochondrial DNA46,47, and the Y chromosome47 also showed that the spread of Islam in the Indian subcontinent was predominantly cultural diffusion associated with minor gene flow from West Asia and Arabia. Autosomal STRs also reveal a common genetic ancestry of the Thai-Malay Muslims and Thai Buddhists36. In this context, cultural transformation has shaped worldwide Muslim populations.
How to cite this article: Yao, H.-B. et al. Genetic evidence for an East Asian origin of Chinese Muslim populations Dongxiang and Hui. Sci. Rep. 6, 38656; doi: 10.1038/srep38656 (2016).
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Gladney, D. C. Ethnic identity in China: the making of a Muslim minority nationality. (Harcourt Brace College, 1998).
Xu, S., Huang, W., Qian, J. & Jin, L. Analysis of genomic admixture in Uyghur and its implication in mapping strategy. Am J Hum Genet 82, 883–94 (2008).
Yang, D. Y. & Dai, Y. J. Physical characteristics of Bonan Nationality from Gansu province, northwest China. Acta Anthropologica Sinica 9, 56–63 (1990).
Dai, Y. J. & Yang, D. Y. Research on the physical characteristics of Dongxiang Nationality in Gansu province, northwest China. Acta Anthropologica Sinica 10, 127–134 (1991).
Zheng, L. B. et al. Physical characters of Hui Nationality in Ningxia. Acta Anthropologica Sinica 16, 11–21 (1997).
Xie, X. D. & Shan, X. M. The DNA evidence of the origin of Hui. Researches on the Hui 3, 75–78 (2002).
Zhang, F., Li, H., Huang, L., Lu, Y. & Hu, S. Similarities in Patrilineal Genetics between the Han Chinese of Central China and Chaoshanese in Southern China. Commun Contemp Anthropol 4, e2 (2010).
Wang, W., Wise, C., Baric, T., Black, M. L. & Bittles, A. H. The origins and genetic structure of three co-resident Chinese Muslim populations: the Salar, Bo’an and Dongxiang. Hum Genet 113, 244–52 (2003).
Li, D. N. et al. Substitution of Hainan indigenous genetic lineage in the Utsat people, exiles of the Champa kingdom. J Syst Evol 51, 287–294 (2013).
Zhou, Y. Q., Zhu, W., Liu, Z. P. & Wu, W. Q. A quick method of extraction of DNA by Chelex-100 from trace bloodstains. Fudan Univ J Med Sci 30, 379–380 (2003).
Marshall, T. C., Slate, J., Kruuk, L. E. & Pemberton, J. M. Statistical confidence for likelihood-based paternity inference in natural populations. Mol Ecol 7, 639–655 (1998).
Excoffier, L., Laval, G. & Schneider, S. Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evol Bioinform Online 1, 47–50 (2005).
Yan, J. et al. Genetic analysis of 15 STR loci on Chinese Tibetan in Qinghai Province. Forensic Sci Int 169, e3–6 (2007).
Zhu, B. F. et al. Population genetic analysis of 15 autosomal STR loci in the Russian population of northeastern Inner-Mongolia, China. Mol Biol Rep 37, 3889–3895 (2010).
Wu, Y. M., Zhang, X. N., Zhou, Y., Chen, Z. Y. & Wang, X. B. Genetic polymorphisms of 15 STR loci in Chinese Han population living in Xi’an city of Shaanxi Province. Forensic Sci Int Genet 2, e15–18 (2008).
Chen, J. G. et al. Population genetic data of 15 autosomal STR loci in Uygur ethnic group of China. Forensic Sci Int Genet 6, e178–9 (2012).
Zhu, B. F., Shen, C. M., Wu, Q. J. & Deng, Y. J. Population data of 15 STR loci of Chinese Yi ethnic minority group. Leg Med (Tokyo) 10, 220–224 (2008).
Zhu, J. et al. Population data of 15 STR in Chinese Han population from north of Guangdong. J Forensic Sci 50, 1510–1511 (2005).
Wang, Z. et al. Genetic polymorphisms of 15 STR loci in Chinese Hui population. J Forensic Sci 50, 1508–1509 (2005).
Zhu, B. et al. Genetic analysis of 15 STR loci of Chinese Uigur ethnic population. J Forensic Sci 50, 1235–1236 (2005).
Li, C. et al. Genetic polymorphism of 17 STR loci for forensic use in Chinese population from Shanghai in East China. Forensic Sci Int Genet 3, e117–118 (2009).
Rerkamnuaychoke, B. et al. Thai population data on 15 tetrameric STR loci-D8S1179, D21S11, D7S820, CSF1PO, D3S1358, TH01, D13S317, D16S539, D2S1338, D19S433, vWA, TPOX, D18S51, D5S818 and FGA. Forensic Sci Int 158, 234–237 (2006).
Alenizi, M., Goodwin, W., Ismael, S. & Hadi, S. STR data for the AmpFlSTR Identifiler loci in Kuwaiti population. Leg Med (Tokyo) 10, 321–325 (2008).
Coudray, C. et al. Population genetic data of 15 tetrameric short tandem repeats (STRs) in Berbers from Morocco. Forensic Sci Int 167, 81–86 (2007).
Forward, B. W., Eastman, M. W., Nyambo, T. B. & Ballard, R. E. AMPFlSTR Identifiler STR allele frequencies in Tanzania, Africa. J Forensic Sci 53, 245 (2008).
Muro, T. et al. Allele frequencies for 15 STR loci in Ovambo population using AmpFlSTR Identifiler Kit. Leg Med (Tokyo) 10, 157–159 (2008).
Immel, U. D., Erhuma, M., Mustafa, T., Kleiber, M. & Klintschar, M. Population genetic analysis in a Libyan population using the PowerPlex 16 system. Int Congr Ser 1288, 421–423 (2006).
Coudray, C., Guitard, E., EL-Chennawi, F., Larrouy, G. & Dugoujon, J. M. Allele frequencies of 15 short tandem repeats (STRs) in three Egyptian populations of different ethnic groups. Forensic Sci Int 169, 260–265 (2007).
Coudray, C. et al. Allele frequencies of 15 tetrameric short tandem repeats (STRs) in Andalusians from Huelva (Spain). Forensic Sci Int 68, e21–24 (2007).
Marian, C. et al. STR data for the 15 AmpFlSTR identifiler loci in the Western Romanian population. Forensic Sci Int 170, 73–75 (2007).
Sánchez-Diz, P., Menounos, P. G., Carracedo, A. & Skitsa, I. 16 STR data of a Greek population. Forensic Sci Int Genet 2, e71–72 (2008).
Mertens, G. et al. Flemish population genetic analysis using 15 STRs of the Identifiler® kit. Int Congr Ser 1288, 328–330 (2006).
Hernández-Gutiérrez, S., Hernández-Franco, P., Martínez-Tripp, S., Ramos-Kuri, M. & Rangel-Villalobos, H. STR data for 15 loci in a population sample from the central region of Mexico. Forensic Sci Int 151, 97–100 (2005).
Ossmani, H. E., Talbi, J., Bouchrif, B. & Chafik, A. Allele frequencies of 15 autosomal STR loci in the southern Morocco population with phylogenetic structure among worldwide populations. Leg Med (Tokyo) 11, 155–158 (2009).
Halima, M. S. A., Bernal, L. P. & Sharif, F. A. Genetic variation of 15 autosomal short tandem repeat (STR) loci in the Palestinian population of Gaza Strip. Leg Med (Tokyo) 11, 203–204 (2009).
Kutanan, W., Kitpipit, T., Phetpeng, S. & Thanakiatkrai, P. Forensic STR loci reveal common genetic ancestry of the Thai-Malay Muslims and Thai Buddhists in the deep Southern region of Thailand. J Hum Genet 59, 675–681 (2014).
Butler, J. M., Schoske, R., Vallone, P. M., Redman, J. W. & Kline, M. C. Allele frequencies for 15 autosomal STR loci on U.S. Caucasian, African American, and Hispanic populations. J Forensic Sci 48, 908–911 (2003).
Pereira, L. et al. PopAffiliator: online calculator for individual affiliation to a major population group based on 17 autosomal short tandem repeat genotype profile. Int J Legal Med 125, 629–636 (2011).
Kang, L. et al. Genetic structures of the Tibetans and the Deng people in the Himalayas viewed from autosomal STRs. J Hum Genet 55, 270–277 (2010).
Falush, D., Stephens, M. & Pritchard, J. K. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164, 1567–1587 (2003).
Hubisz, M. J., Falush, D., Stephens, M. & Pritchard, J. K. Inferring weak population structure with the assistance of sample group information. Mol Ecol Resour 9, 1322–1332 (2009).
R Core Team. R: A language and environment for statistical computing. (R Foundation for Statistical Computing, Vienna, 2013).
Rosenberg, N. A. Distruct: A program for the graphical display of population structure. Mol Ecol Notes 4, 137–138 (2004).
Eaaswarkhanth, M. et al. Diverse genetic origin of Indian Muslims: evidence from autosomal STR loci. J Hum Genet 54, 340–348 (2009).
Eaaswarkhanth, M. et al. Microsatellite diversity delineates genetic relationships of Shia and Sunni Muslim populations of Uttar Pradesh, India. Hum Biol 81, 427–445 (2009).
Terreros, M. C. et al. North Indian Muslims: enclaves of foreign DNA or Hindu converts? Am J Phys Anthropol 133, 1004–1012 (2007).
Eaaswarkhanth, M. et al. Traces of sub-Saharan and Middle Eastern lineages in Indian Muslim populations. Eur J Hum Genet 18, 354–363 (2010).
We would like to thank Dr. Masood S.H. Abu Halima, Dr. Andrea Berti, and Dr. Phuvadol Thanakiatkrai for sharing STR genotype data with us. This work was supported by the Natural Science Foundation of Gansu province (1308RJZA190), Scientific Research Project for Colleges of Gansu province (2014A-085), National Excellent Youth Science Foundation of China (31222030), National Natural Science Foundation of China (91131002, 31260252, 31671297, 31071098, and 30760097), Shanghai Rising-Star Program (12QA1400300), MOE University Doctoral Research Supervisor’s Funds (20120071110021), and MOE Scientific Research Project (113022A). C.C.W. is supported by the Max Planck Society. The research leading to these results has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 646612) granted to Martine Robbeets.
The authors declare no competing financial interests.
Rights and permissions
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/