Chinese Muslim populations comprise ten officially recognized Muslim ethnic groups: Uygur, Dongxiang, Hui, Bo’an, Kazakh, Kirghiz, Salar, Tatar, Tajik, and Uzbek. Whether those populations originated through demic diffusion, which involves mass movement of people, or through simple cultural diffusion has long been debated.

According to historical materials, Islam was first introduced to China some 1,400 years ago, in the Tang Dynasty (618–907 AD), by a large number of soldiers, merchants, and political emissaries from Arabia and Persia (the present-day Middle East)1. Chinese Muslim populations are believed to be descendants of those immigrants. Genome-wide scans have already shown Uygur to be a typical admixture of East Asian and European ancestry2. However, Dongxiang, Bo’an, and Hui in Gansu and Ningxia share the common physical features of East Asians (Mongoloid type)3,4,5. Xie and Shan6 also detected genetic similarity between Hui and Han Chinese using two autosomal short tandem repeats (STRs), TH01 and D13S317. Clustering of paternal Y chromosomal STRs has placed Hui of Liaoning and Ningxia in the group of Han Chinese and Tibeto-Burman populations7. About 24–30% of the Y chromosomes of Salar, Bo’an, and Dongxiang belong to the East Asian specific haplogroup O3-M122. The Y chromosomal lineage R-M17, which is prevalent in Central Asia, South Asia, and Europe, also accounts for 17%, 26%, and 28% of Salar, Bo’an, and Dongxiang, respectively8.

The origin of Chinese Muslim populations likely involved massive assimilation of indigenous ethnic groups9. However, previous studies, with limited genetic markers and small sample sizes, have not been able to give a clear answer about the genetic ancestry of those Muslim populations. Therefore, we analyzed 15 autosomal STRs in 652 individuals from the Dongxiang, Hui, and co-resident Han Chinese populations in Linxia, Gansu province, to explore the genetic diversity of Chinese Muslims and to test population affinities and the level of admixture. Dongxiang and Hui are typical of contemporary Chinese Muslim communities. A comprehensive comparison of those two populations with worldwide Muslim and non-Muslim populations will shed more light on the origin of Chinese Muslims.


We collected blood samples from 163 unrelated Dongxiang individuals and 219 unrelated Hui individuals, two Muslim populations in Linxia, Gansu province. We also collected blood samples from 270 unrelated Han Chinese individuals in Linxia for comparison. Our study was approved by the Ethical Committee of Gansu Institute of Political Science and Law and was conducted in accordance with the human and ethical research principles of that institute. All individuals were adequately informed and gave signed informed consent before their participation. For each sample, genomic DNA was extracted according to the Chelex-100 method and proteinase K protocol10. Fifteen widely used forensic STR loci (D8S1179, D21S11, D7S820, CSF1PO, D3S1358, D13S317, D16S539, D2S1338, D19S433, vWA, D18S51, D5S818, FGA, D6S1043, and D12S391) were amplified simultaneously using the AmpFlSTR Sinofiler PCR Amplification Kit (Applied Biosystems, Foster City, CA, USA). The PCR products were analyzed with the 3500XL DNA Genetic Analyzer and GeneMapper ID-X software (Applied Biosystems, Foster City, CA, USA).

Allele frequency, heterozygosity, polymorphism information content (PIC), power of discrimination (PD), probability of paternity exclusion (PPE), and other forensic parameters were calculated using PowerStats V12 and Cervus 3.011. Tests for Hardy–Weinberg equilibrium (HWE) were performed in Arlequin v3.5.1.312 using a likelihood ratio test and an exact test to check for miscalled STR genotypes or biased sampling. Because the statistical analyses in this study were based on a Bayesian clustering algorithm, raw genotypic data for 13 STRs (excluding D6S1043 and D12S391) from 45 populations (13793 individuals) around the world were extracted to determine population affinity13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39. The average number of pairwise differences, pairwise Fst, Slatkin's linearized Fst, and coancestry coefficients were all calculated in Arlequin v3.5.1.3 using genotype data12. The detailed population genetic structure was inferred using the model-based clustering method implemented in Structure 2.3.440,41 under the assumptions of admixture and correlated allele frequencies. For each K from 2 to 8, several replicate runs were performed, each with 100,000 estimation iterations after a burn-in of 20,000 iterations. Posterior probabilities for each K were computed for each set of runs. The matrix plot of genetic distances and the population structure plot were drawn in R statistical software v3.0.242 and Distruct v1.143.
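The diversity statistics named above can be illustrated with a minimal sketch. The formulas below are the standard textbook definitions of expected heterozygosity, PIC, and PD; the text does not state which exact variants the software applies (some programs add a sample-size correction to expected heterozygosity, for example), so this is an assumption rather than a reproduction of the analysis. The function names and the four-allele locus are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(allele_freqs):
    """Uncorrected expected heterozygosity: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in allele_freqs)


def polymorphism_information_content(allele_freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(2.0 * (p * p) * (q * q)
                     for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - cross_term


def power_of_discrimination(genotype_freqs):
    """PD = 1 - sum of squared observed genotype frequencies."""
    return 1.0 - sum(g * g for g in genotype_freqs)


# Hypothetical locus with four equally frequent alleles (not study data).
toy_freqs = [0.25, 0.25, 0.25, 0.25]
print(expected_heterozygosity(toy_freqs))           # 0.75
print(polymorphism_information_content(toy_freqs))  # 0.703125
```

PIC is always somewhat lower than expected heterozygosity for the same locus, which is why the two ranges reported for a marker set differ slightly.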


Forensic parameter analysis

The genotype data for the three populations Dongxiang, Hui, and Han Chinese are given in Supplementary Table 1. The allele frequency distributions and forensic parameters are listed in Table 1 and Supplementary Table 2. Observed heterozygosity (Ho) ranged from 0.688 at the CSF1PO locus in Hui to 0.914 at the D6S1043 locus in Dongxiang, while expected heterozygosity (He) ranged from 0.704 at the D3S1358 locus in Han to 0.883 at the D6S1043 locus in Dongxiang. In the test of HWE, the genotype frequency distributions showed no significant deviations from expectations except at the D19S433 locus in Hui (p = 0.030). PIC of all selected loci ranged from 0.652 at D3S1358 in Han to 0.868 at D6S1043 in Dongxiang. PD values ranged from 0.861 at D3S1358 in Han to 0.969 in Dongxiang and Hui. The highest PPE was found at the D6S1043 locus in Dongxiang (0.824), and the lowest at the CSF1PO locus in Hui (0.410). The most polymorphic loci in all three populations were highly discriminating, indicating that this set of loci is useful for forensic identification.
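As a consistency check on the reported values, the sketch below computes the power of exclusion from observed heterozygosity using the approximation PE = h^2 (1 - 2hH^2), where h is the heterozygote proportion and H = 1 - h. This formula is commonly attributed to Brenner and Morris; that the software used here applies it is an assumption, since the text does not state the formula. The function name is hypothetical. The two inputs are the Ho extremes quoted above.

```python
def power_of_exclusion(observed_heterozygosity):
    """Approximate PE = h^2 * (1 - 2*h*H^2), with H = 1 - h (assumed formula)."""
    h = observed_heterozygosity
    homozygote_fraction = 1.0 - h
    return h * h * (1.0 - 2.0 * h * homozygote_fraction ** 2)


# Ho and PPE pairs quoted in the text for the lowest and highest loci.
reported = [
    ("CSF1PO, Hui", 0.688, 0.410),
    ("D6S1043, Dongxiang", 0.914, 0.824),
]
for label, ho, ppe in reported:
    print(f"{label}: computed {power_of_exclusion(ho):.3f}, reported {ppe:.3f}")
```

Under this assumption both reported PPE values are recovered to three decimals, which is why the lowest and highest PPE fall at the loci with the lowest and highest Ho.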

Table 1 Forensic statistical parameters of the 15 autosomal short tandem repeats from Dongxiang, Hui, and Han Chinese populations in Linxia.

Interpopulation genetic distances

We calculated several measures of genetic distance to infer population structure (Fig. 1 and Supplementary Table 3). The Chinese Muslim populations Dongxiang and Hui showed the largest pairwise genetic distances from African and Middle Eastern populations and the smallest from East Asian populations, especially Han Chinese. Dongxiang showed nonsignificant pairwise Fst differences from Hui in Linxia and Ningxia, Han Chinese in Linxia, Shaanxi, Shanghai, and Guangdong, and Tibetan in Lhasa (p > 0.005). The genetic divergence between Dongxiang and those populations is relatively small (pairwise Fst < 0.002 and Slatkin's linearized Fst < 0.003). Hui in Linxia also showed nonsignificant genetic differences from Hui in Ningxia, Uygur in Yili, Han Chinese in Shaanxi and Yunnan, Russian in Inner Mongolia, and Tibetan in Lhasa. Hui in Ningxia likewise did not differ from any of the five Han Chinese populations in this study. However, almost all pairwise Fst values between Dongxiang or Hui and European, Middle Eastern, and African populations are above 0.01. The average pairwise differences exhibit a very similar pattern. The two Uygur populations differed significantly from all other populations. The genetic distances of Uygur from East Asian, European, and most Middle Eastern populations are almost the same, indicating that Uygur is an admixed population.
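For readers comparing the distance measures, the sketch below shows how the two derived quantities relate to pairwise Fst under their usual definitions: Slatkin's linearized Fst as Fst / (1 - Fst) and the coancestry coefficient as -ln(1 - Fst). These definitions are assumed to match what the software computes; the function names are hypothetical and the input values are only the thresholds mentioned in the text, not entries from Supplementary Table 3.

```python
import math


def slatkin_linearized_fst(fst):
    """Slatkin's linearized distance: Fst / (1 - Fst)."""
    if fst >= 1.0:
        raise ValueError("Fst must be below 1")
    return fst / (1.0 - fst)


def coancestry_coefficient(fst):
    """Reynolds-style coancestry coefficient: -ln(1 - Fst)."""
    if fst >= 1.0:
        raise ValueError("Fst must be below 1")
    return -math.log(1.0 - fst)


# Thresholds mentioned in the text, used only as illustrative inputs.
for fst in (0.002, 0.01):
    print(f"Fst={fst}: linearized={slatkin_linearized_fst(fst):.6f}, "
          f"coancestry={coancestry_coefficient(fst):.6f}")
```

At divergences this small, both transformations stay very close to the raw Fst, so the ranking of populations is essentially the same whichever measure is plotted.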

Figure 1

Plots of pairwise Fst of Dongxiang, Hui, and Han Chinese in Linxia and 45 other worldwide populations.

Clustering by structure analysis

Analysis of genetic distance failed to support a genetic affinity of the Chinese Muslim populations Dongxiang and Hui with Middle Eastern or European populations. We then employed a cluster-based algorithm to further clarify population genetic structure at the individual level. According to the highest posterior probabilities, the most suitable K was K = 3 (Supplementary Table 4). The clustering showed a very clear geographic pattern (Fig. 2). East Asian, European, and African populations belong to clusters 1, 2, and 3, respectively. Middle Eastern populations appear to be an admixture of African and European populations. Uygur populations showed similar degrees of membership in the East Asian and European clusters (35% to 40%), which is consistent with the genetic distance analysis and previous reports2. The case of Uygur clearly shows that the 15 STRs and Structure analysis have enough power to delineate the ancestry of populations. The proportion of membership of Dongxiang and Hui in cluster 1 reaches 58.3% to 63.9%. Although this proportion is about 10% lower than that of Han Chinese (66.8% to 74.4%), it still falls within the general range of East Asian populations, from 57.8% (Lhoba) to 80.3% (She) (Supplementary Table 4). It is possible that some Dongxiang and Hui individuals have excess affinity with West Eurasians because they live close to Uygur and Central Asians (Supplementary 6). However, the majority of Dongxiang and Hui samples share very similar membership with other East Asian populations, revealing a common genetic makeup.
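The selection of K by posterior probability can be sketched as follows. Assuming a uniform prior over the candidate K values, the estimated log probability of the data for each K is normalized into an approximate posterior, and the K with the largest value is chosen. The function name is hypothetical and the log probabilities are invented purely for illustration; they are not the values in Supplementary Table 4, and the exact procedure used in the study may differ.

```python
import math


def posterior_over_k(ln_prob_by_k):
    """Normalize ln P(data | K) into approximate posteriors (uniform prior on K)."""
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    peak = max(ln_prob_by_k.values())
    weights = {k: math.exp(v - peak) for k, v in ln_prob_by_k.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical ln P(data | K) values, NOT taken from the study.
ln_prob = {2: -512340.0, 3: -512310.5, 4: -512318.2, 5: -512330.9}
posterior = posterior_over_k(ln_prob)
best_k = max(posterior, key=posterior.get)
print(best_k, round(posterior[best_k], 4))
```

Because the log probabilities are large negative numbers, even modest differences between K values translate into posteriors concentrated almost entirely on one K.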

Figure 2

Estimated population genetic structure of Dongxiang, Hui, and Han Chinese in Linxia and 45 other worldwide populations.


Whether Chinese Muslim populations originated via demic diffusion or simple cultural diffusion has long been debated. Previous genetic studies, with limited markers and small sample sizes, often came to contradictory conclusions. In this study, we focused on the genetic makeup and ancestry clustering of Dongxiang and Hui using 15 autosomal STRs genotyped in more than 600 individuals. The two Chinese Muslim populations Dongxiang and Hui showed significant genetic homogeneity with co-resident Han Chinese in Linxia and other East Asian populations rather than with European or Middle Eastern populations, which supports simple cultural diffusion as the origin of Dongxiang and Hui in China. This cultural transformation phenomenon has also been observed in other Muslim populations. Although the Utsat people of Hainan Island are thought to be descendants of the Champa Kingdom and have been officially recognized as Hui nationality, they are genetically much closer to the indigenous ethnic groups of Hainan than to the Cham and other mainland Southeast Asian populations9. Analyses of autosomal STRs44,45, mitochondrial DNA46,47, and the Y chromosome47 have likewise shown that the spread of Islam in the Indian subcontinent was predominantly cultural diffusion associated with minor gene flow from West Asia and Arabia. Autosomal STRs also reveal a common genetic ancestry of the Thai-Malay Muslims and Thai Buddhists36. In this context, cultural transformation has shaped worldwide Muslim populations.

Additional Information

How to cite this article: Yao, H.-B. et al. Genetic evidence for an East Asian origin of Chinese Muslim populations Dongxiang and Hui. Sci. Rep. 6, 38656; doi: 10.1038/srep38656 (2016).
