Introduction

Patterns of Y-chromosome diversity provide a unique perspective on the origins and composition of populations.1, 2 Its strictly paternal inheritance, smaller effective population size and specific clustering of Y variants make the Y chromosome an indispensable source of genetic evidence.3, 4 In addition, a growing number of stable binary markers have been discovered in the past decade, and the topology and nomenclature of the Y-genealogical tree have been established progressively.5 These refinements allow haplogroups to be discriminated with higher resolution and less ambiguity. A comprehensive approach that combines high-resolution Y-single-nucleotide polymorphisms with the more rapidly evolving microsatellite markers can shed more light on the origins and complex history of populations.6

Central Asia is a geographic junction between East Asia, West Asia and East Europe, lying between Siberia to the north and the South Asian subcontinent to the south. Within the non-African context, Central Asia shows a high level of both genetic and ethnic diversity, indicating that the settlement of this region was a complex process. Two competing hypotheses have been raised concerning the origin of Central Asians. One suggests that Central Asians represent an early incubator of Eurasian variation, whereas the other proposes that their current rich genetic diversity results from recent admixture between western and eastern Eurasian populations. Y-chromosomal data have been interpreted as indicating that Central Asia was a major source of population migration events.7, 8 However, studies of mitochondrial DNA found that a considerable number of western and eastern Eurasian haplogroups overlap in present-day Central Asian populations, and that European and East Asian mitochondrial DNA lineages can be clearly demarcated, suggesting recent peopling.9, 10, 11

Northwest China closely neighbors Central Asia, and the Xinjiang Uygur Autonomous Region in particular extends into Central Asia. In addition, the Silk Road through Northwest China, a trans-Eurasian trade route established in the second century BC, had a vital role in the many forms of east–west exchange, and has therefore drawn attention to human migration in this region. A number of ethnic groups with different religious faiths, cultures and customs inhabit northwestern China and have presumably experienced a complicated history.12, 13 Although some of them have been settled in this region for a long time, several groups were formed recently. The abundance of human genetic resources in this region and the lack of knowledge about population structure, especially the absence of a detailed Y-chromosomal dissection, motivated us to dissect the paternal genetic architecture of northwestern populations using Y-chromosomal variation data. We also anticipated that, combined with archaeology- and language-based research by others, the results would provide valuable information about the ethnogenesis of Central Asians.

To elucidate the origin and the population-forming processes of the human populations in Northwest China and to explore their paternal genetic structure, we performed detailed analyses of the extant local ethnic groups.

Materials and methods

Population samples

A total of 503 male Y chromosomes from 14 geographically and linguistically representative ethnic groups were assayed. All individuals in this study are unrelated and their families have belonged to the same ethnic group for at least three generations; blood samples were collected with appropriate ethical approval and informed consent. Information about the ethnic groups in the study is listed in Table 1, and sampling locations are marked in Figure 1. Genomic DNA was extracted from whole blood using the standard phenol−chloroform method and stored in 10 mM Tris-1 mM EDTA (TE solution) at −80 °C until further testing.

Table 1 Population genotyped in this study
Figure 1

Map of the sampling locations of the 14 ethnic groups in this study and the geographic distribution of these ethnic groups. Colored areas represent the distribution of each nationality, and the number codes (consistent with those in Table 1) denote the sampling sites. A full color version of this figure is available at the Journal of Human Genetics journal online.

Y Haplogrouping and terminology

The selection of Y-chromosome single-nucleotide polymorphism markers and the nomenclature of haplogroups on the Y genealogical tree followed the International Society of Genetic Genealogy (Y-DNA Haplogroup Tree 2007, http://isogg.org/tree/ISOGG_YDNA_SNP_Index07.html) and previous studies.2, 14, 15 A set of 29 informative binary polymorphic sites was included in our survey, defining Y haplogroups from C to R (Figure 2). Cladistic nodes internal to P-M45 were resolved in detail using more derived markers (M3, M17, M56, M87, M120, M124, M157, M173, M207, M242 and M343), given the high frequency of M45 in Central Asians. We adopted the hierarchical typing strategy3 to screen these markers using PCR-restriction fragment length polymorphism or sequencing methods. M175, a 5-bp indel polymorphism, was genotyped by GENESCAN on an ABI 377 genetic analyzer (Applied Biosystems, Foster City, CA, USA). The purified PCR products were sequenced on the ABI 377 genetic analyzer using the ABI PRISM BigDye Terminator V3.1 Sequencing Kit.

Figure 2

The phylogenetic relationships and nomenclature of Y-chromosome binary haplogroups surveyed in this study.

Y-STR typing

A total of eight multiallelic short tandem repeat (STR) loci were genotyped for subjects of haplogroups R1 and J, including two trinucleotide-repeat polymorphisms (DYS388 and DYS392) and six tetranucleotide-repeat polymorphisms (DYS19, DYS389I, DYS389II, DYS390, DYS391 and DYS393). These eight loci were amplified by PCR using the published primers described by Kayser et al.16 and on the STR website (http://www.yhrd.org). The twin polymorphisms, DYS389I and DYS389II, were amplified with a single fluorescently labeled forward primer and two respective reverse primers. PCR products were run directly on an ABI 377 sequencer, with ABI GS500 TAMRA as the internal lane standard. The GENESCAN and GENOTYPER software packages were used to collect the data and to call alleles. Y-STR alleles were named according to the number of repeat units they carried.

Data analysis

The frequencies of Y haplogroups were computed. Multidimensional scaling (MDS) analysis was conducted using SPSS 11.5 software after an Fst genetic distance matrix had been generated with Arlequin 3.0.17 The MDS results are presented as two-dimensional plots. The genetic structure of populations was dissected by analysis of molecular variance (AMOVA),18 also using Arlequin. Additional data for Daur, Ewenki, Hezhe, Manchu, Oroqen, Korean, Sichuan Han (SC Han), Guangdong Han (GD Han), Yao, Buyi, Xinjiang Han (XJ Han) and Gansu Han (GS Han) from Xue et al.,19 Tibetan from Gayden et al.,20 and Turkmen, Kyrgyz, Tajik, Uzbek and Kazak (distinguished from Kirghiz, Tajike, Ozbek and Kazakh in the present study) from Wells et al.7 were included in the MDS and AMOVA analyses as reference populations.
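The first two steps above (haplogroup frequencies, then a pairwise distance matrix) can be illustrated with a minimal Python sketch. It is not the Arlequin procedure: the distance used here is Nei's G_ST for two equally weighted populations, a simpler stand-in for the Fst estimator, and the population names and haplogroup calls are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplogroup_frequencies(calls):
    """Relative frequency of each haplogroup in one population sample."""
    counts = Counter(calls)
    n = len(calls)
    return {hg: c / n for hg, c in counts.items()}


def pairwise_gst(freq_a, freq_b):
    """Nei's G_ST for two equally weighted populations: (H_T - H_S) / H_T."""
    groups = set(freq_a) | set(freq_b)
    # Within-population diversity of each sample (1 - sum of squared frequencies)
    h_a = 1.0 - sum(freq_a.get(g, 0.0) ** 2 for g in groups)
    h_b = 1.0 - sum(freq_b.get(g, 0.0) ** 2 for g in groups)
    h_s = (h_a + h_b) / 2.0
    # Diversity of the pooled population, using mean frequencies
    h_t = 1.0 - sum(
        ((freq_a.get(g, 0.0) + freq_b.get(g, 0.0)) / 2.0) ** 2 for g in groups
    )
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


# Hypothetical haplogroup calls, for illustration only (not data from this study).
samples = {
    "PopA": ["O3", "O3", "R1a1", "C", "O3", "N"],
    "PopB": ["R1a1", "R1a1", "J2", "R1a1", "O3", "C"],
    "PopC": ["O3", "O3", "R1a1", "C", "O3", "N"],
}
freqs = {pop: haplogroup_frequencies(calls) for pop, calls in samples.items()}
distance_matrix = {
    (a, b): pairwise_gst(freqs[a], freqs[b]) for a, b in combinations(samples, 2)
}
```

A matrix of this kind is what an MDS routine takes as input; populations with identical haplogroup frequencies (PopA and PopC here) end up at distance zero.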

Median-joining (MJ) networks21 of Y-STR haplotypes were constructed for R1a1-M17 and J2-M172 using NETWORK 4.2.0.1 (http://www.fluxus-engineering.com). Epsilon was set to zero. Published Y-STR data for 121 M17-carrying Russians from the South Russia region22 and 236 western Eurasians belonging to J2-M172 (ref. 23) were also included, in light of the putative origins of these two haplogroups. To integrate the published and present data, MJ networks for R1a1-M17 haplotypes were based on seven STR loci (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392 and DYS393), and those for J2-M172 on six STR loci (DYS19, DYS388, DYS389I, DYS390, DYS391 and DYS392). Y-STR data are listed in the Supplementary data file.

The time to the most recent common ancestor was estimated from Y-chromosome microsatellite variation using the linear expansion,24, 25 BATWING (Bayesian analysis of trees with internal node generation)26 and single-nucleotide polymorphism-STR coalescence methods.27 All three methods assumed an average Y-STR mutation rate of 0.00069 per locus per 25 years.28 For the BATWING analysis, a model of exponential growth from an initially constant population size at time beta was used, with prior distributions specified as gamma (1.01, 1) for alpha, uniform (0, 15) for beta and gamma (1, 0.0001) for N. The upper bound for the coalescent time was determined under the assumption V0=0 (ref. 29), and the lower bound under the assumption V0=Va (Va is the within-population variance in ancestral populations30).
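The variance-based reasoning behind such age estimates can be sketched in Python. This is a simplified illustration, not the authors' pipeline: it assumes the founder haplotype can be approximated by the per-locus median repeat count and that age equals the mean squared deviation from the founder divided by the mutation rate quoted above. The function name and toy haplotypes are hypothetical.

```python
from statistics import median

MUTATION_RATE = 0.00069  # per locus per 25 years, the rate assumed in the text
YEARS_PER_STEP = 25


def tmrca_years(haplotypes):
    """Age estimate from the mean squared deviation from the founder haplotype."""
    n_loci = len(haplotypes[0])
    # Assumption: the founder is approximated by the per-locus median repeat count
    founder = [median(h[i] for h in haplotypes) for i in range(n_loci)]
    # Average squared difference (ASD) over all chromosomes and loci
    asd = sum(
        (h[i] - founder[i]) ** 2 for h in haplotypes for i in range(n_loci)
    ) / (len(haplotypes) * n_loci)
    return asd / MUTATION_RATE * YEARS_PER_STEP


# Hypothetical repeat counts at three loci for five chromosomes (illustration only).
toy_haplotypes = [
    (14, 13, 30),
    (14, 13, 30),
    (15, 13, 30),
    (14, 12, 30),
    (14, 13, 31),
]
age = tmrca_years(toy_haplotypes)
```

Because the estimate scales inversely with the mutation rate, the choice of rate dominates the result; BATWING additionally models demography through the priors listed above, which this sketch does not attempt.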

Results

Diversity of Y-single-nucleotide polymorphism markers

A total of 24 haplogroups were detected among the 14 ethnic groups from Northwest China, including paragroup C, D, F*, G, H, I, J, K*, N, O, P*, P and R. However, no polymorphism was observed at four markers (Q3-M3, R1a1a-M56, R1a1b-M157 and R1a1c-M87). The frequency distribution of haplogroups is shown in Table 2.

Table 2 Haplogroup distribution and Y-chromosome diversity in 14 northwestern populations

The patrilineal gene pool of populations in this area showed high haplogroup diversity in general (0.7602±0.0546 on average), similar to the value previously observed in Central Asians,7 with the highest value in the Bao’an population (0.8946±0.0305). As to the distribution of haplogroups, O and P were the two predominant lineages. The clade R1a1-M17 within haplogroup R was widely distributed among populations and reached a high frequency in several groups (68.9% in the Kirghiz population, 60.6% in Tataer, 54.3% in Dongxiang, 45.2% in Tajike and 40% in Salar), but three derivatives of R1a1 (R1a1a-M56, R1a1b-M157 and R1a1c-M87) were not observed. Among the individual clades within O, O1-M119, O2a-M95 and O3-M122 were detected in our samples, and the major constituent of haplogroup O was the O3-M122 lineage. Intriguingly, 47% of males in the Russ population belonged to O3-M122. N-M231 occurred at a moderate frequency in multiple ethnic groups. Haplogroup J was observed only in several groups from the extreme northwest of China. Moreover, 32 of the 37 J-related samples were assigned to J2-M172 (which reached 30.4% in Ozbek). G-M201, H-M69 and I-M170 occurred sporadically in several populations. Individuals carrying I-M170 were mostly from the Tataer population. An extensive but uneven distribution of haplogroup C-M130 was observed among the majority of ethnic groups under investigation, peaking in Kazakhs and Mongolians at 58.5 and 40%, respectively. Haplogroup R2-M124 is infrequent but informative in Central Asia, Anatolia, Kurdistan, and particularly in Pakistan and India.7, 14, 31, 32, 33 In this study, only three representatives of R2-M124 were observed (one from Kazakh and two from Bao’an). All YAP+ individuals were assigned to lineage D because they carried the diagnostic marker M174. This may be attributed to a large extent to the long-term influence of Tibetans, in whom D lineages are notably prevalent.34, 35 The phylogeography of D was rather irregular in this region, with significant differences even among geographically neighboring groups. This pattern is concordant with the relic distribution of D-M174 delineated in a recent study by Shi et al.,36 which proposed that M174 lineages represent the ancient northward colonizers of Asia.
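For readers unfamiliar with the haplogroup diversity values reported at the start of this subsection, the following Python sketch shows one common way to compute such a statistic. It assumes the values follow Nei's unbiased gene diversity estimator, which the text does not state explicitly; the function name and the toy sample are hypothetical.

```python
from collections import Counter


def haplogroup_diversity(calls):
    """Nei's unbiased diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(calls)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    sum_sq = sum((c / n) ** 2 for c in Counter(calls).values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample of ten chromosomes (illustration only, not study data).
toy_calls = ["O3"] * 4 + ["R1a1"] * 3 + ["C"] * 2 + ["N"]
h = haplogroup_diversity(toy_calls)
```

The statistic approaches 1 when many haplogroups occur at similar frequencies and equals 0 when a single haplogroup is fixed in the sample.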

Clustering analysis by MDS plot

Figure 3 presents MDS plots based on the Fst genetic distance matrix calculated from haplogroup frequencies. The MDS plot of 32 populations (Figure 3a) showed the pattern of differentiation and clustering between populations, reflecting by and large an east–west cline. Nearly all Central Asians clustered together on the left of the plot. In contrast, the right was occupied by East Asians (Southeast Asians on the upper side and Northeast Asians on the lower side). The 14 populations in this study clustered either with Central Asians or with East Asians: Kazakh, Mongolian, Xibo, Tu, Russ and Yugu were closer to East Asians, and the other eight populations clustered loosely with Central Asians. In general, the paternal genetic structure of ethnic groups in northwestern China was perceptibly similar to that of Central Asians. The two points for northwestern Han Chinese (XJ Han and GS Han) lay closer to Northeastern Asians on the plot and were clearly separated from Central Asians.

Figure 3

Two-dimensional plots of multidimensional scaling analysis based on Fst genetic distances: (a) expanded data including referential populations; and (b) the data of 14 populations in this study.

When only the haplogroup frequency data from the 14 populations surveyed in this study were subjected to MDS clustering analysis (Figure 3b), little coherence with either language affinity or geographic proximity was evident. To extend this observation, a Mantel test was conducted in Arlequin to test the correlation between genetic and geographic distances, using pairwise Fst values and pairwise geographic distances (obtained from the Great Circle Distance Calculator program [http://www.marinewaypoints.com/learn/greatcircle.shtml]). No correlation was observed (r=−0.063, P=0.675), further indicating that the paternal genetic pattern in Northwest China is inconsistent with the isolation-by-distance model, which is commonly used to interpret differences in the genetic structure of populations across geographic scales.37
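The logic of this test can be sketched in Python under stated assumptions: great-circle distances are computed with the haversine formula on a spherical Earth, and significance is assessed with a one-sided permutation test. This is an illustration rather than Arlequin's implementation, and the coordinates and Fst values below are hypothetical.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def _pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(m1, m2, permutations=999, seed=0):
    """Matrix correlation and a one-sided permutation P-value."""
    n = len(m1)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    x = [m1[i][j] for i, j in pairs]
    r_obs = _pearson(x, [m2[i][j] for i, j in pairs])
    rng = random.Random(seed)
    idx = list(range(n))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(idx)  # relabel populations in the second matrix
        r_perm = _pearson(x, [m2[idx[i]][idx[j]] for i, j in pairs])
        if r_perm >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Illustrative coordinates (lat, lon) and Fst values; not the study's sites or data.
sites = [(43.8, 87.6), (36.1, 103.8), (39.5, 76.0), (36.6, 101.8)]
geo_km = [[great_circle_km(*a, *b) for b in sites] for a in sites]
toy_fst = [
    [0.00, 0.05, 0.02, 0.06],
    [0.05, 0.00, 0.07, 0.01],
    [0.02, 0.07, 0.00, 0.08],
    [0.06, 0.01, 0.08, 0.00],
]
r, p = mantel(toy_fst, geo_km)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations. Under isolation by distance, r would be positive with a small P-value.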

AMOVA dissecting population structure

The rationale behind AMOVA is that, among different grouping hypotheses, a grouping of populations that accurately reflects their genetic architecture should allocate a higher proportion of the genetic variance among groups and a lower proportion among populations within groups. Table 3 presents variance components and P-values at 10 grouping levels. Grouping levels 1–3 indicated rich divergence among the ethnic populations in this study, even among populations with close language affinity and geographic proximity, because large percentages of the genetic variance lay among populations within groups. This point was reinforced by the negative values of the genetic variance among groups. Levels 4 and 5 were grouped according to the clustering result on the MDS plot (Figure 3b) instead of geographic or linguistic affiliation. The high proportion of variance among groups (nearly 14%; P<10^−5) suggested genetic intricacy in these populations and also confirmed the validity of the MDS analysis. After the reference population data were incorporated, the variance among groups of 3.05 and 3.57% for grouping levels 6 and 7, respectively, suggested that the genetic profile of northwestern Han Chinese is closer to that of northeastern Asians than to that of northwestern populations, and that northwestern populations are appreciably differentiated from Northeastern Asians, Southeastern Asians and Central Asians. However, northwestern populations are in general genetically more similar to Central Asians than to East Asians, as corroborated by the marked difference in variance among groups between level 8 (4.83%) and level 9 (negative value). Grouping level 10, defined according to the MDS plot (Figure 3a), yielded 9.03% variance among groups, which supports genetic contributions from both East Asians and Central Asians to the gene pool of northwestern populations.
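For reference, the hierarchical partition underlying these percentages can be written in the standard AMOVA notation. The symbols are not used in the original text and are introduced here only for illustration: the variance among groups is \sigma_a^2, the variance among populations within groups is \sigma_b^2 and the variance within populations is \sigma_c^2.

```latex
\sigma_T^2 = \sigma_a^2 + \sigma_b^2 + \sigma_c^2, \qquad
\Phi_{CT} = \frac{\sigma_a^2}{\sigma_T^2}, \qquad
\Phi_{SC} = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_c^2}, \qquad
\Phi_{ST} = \frac{\sigma_a^2 + \sigma_b^2}{\sigma_T^2}
```

The "variance among groups" percentages quoted above correspond to 100 × \sigma_a^2 / \sigma_T^2. Because the components are estimated as covariances, a slightly negative estimate can arise and is usually read as the absence of structure at that level.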

Table 3 AMOVA results

MJ networks and age estimation

MJ networks indicated that neither haplogroup R1a1 nor J2 showed sharing of Y-STR haplotypes between northwestern Chinese and the reference populations (Figure 4). This clear differentiation in Y-STR motifs suggests that these two lineages in Northwest China are unlikely to have come directly from their regions of origin. MJ networks showed high haplotype diversity within R1a1-M17 and no obvious modal haplotype. Given its prevalence in Central Asia, the M17 lineages in Northwest China might have originated from multiple founders rather than a single early dispersal event, and then experienced rapid demographic expansion and differentiation. In contrast, M172 seemingly underwent a strong bottleneck after a single early immigration event.

Figure 4

Median-joining networks of Y-STR haplotypes for haplogroups R1a1-M17 (a) and J2-M172 (b). Haplotypes are represented by circles with area proportional to the number of individuals. Colors indicate the geographic origin. Northwestern Chinese haplotypes are shown in black, Russian haplotypes in grey and western Eurasian haplotypes in white.

The coalescent age estimates of R1a1 and J2 in Northwest China are listed in Table 4. The times calculated using the linear expansion and BATWING methods were somewhat higher than the divergence time. J2-M172 appeared slightly younger than R1a1-M17, except under the linear expansion model. Overall, the likely ages of these two haplogroups range from 5000 to 10 000 years. Although the precision of regional haplogroup dating can be affected by potential multiple founders, three independent estimation methods suggested similar time ranges. The age estimates indicate that the expansion of these lineages in this area is a recent event.

Table 4 Estimates of TMRCA for R1a1-M17 and J2-M172 based on Y-chromosome microsatellite variation

Discussion

The 14 ethnic groups from the Silk Road region in Northwest China belong to either the Turkic and Mongolian branches of the Altaic family or the Indo-European family, and most of them are Islamic. Nevertheless, the Y-chromosome data obtained here suggest that their genetic backgrounds are quite heterogeneous because of contributions from multiple separate paternal lineages. To explore the composition of the paternal gene pool in this region, we recomputed the proportional contributions of individual lineages to the total gene pool of the 14 ethnic groups. Figure 5 shows that haplogroups O-M175 and R1a1-M17 account for the largest proportions, and that C-M130 is also notable.

Figure 5

The proportion of different lineages in 14 ethnic groups as a whole. F*-M89 encompasses haplogroup G-M201, H-M69 and I-M170. P*-M45 contains all derived subhaplogroups except R1a1-M17.

Two major lineages in paternal gene pool

Haplogroup O-M175, the East Asian-specific lineage, is extensively distributed across the entire East Asian region, where its overall frequency (the sum of the O1-M119, O2-M95 and O3-M122 frequencies, above 50%) is uniquely high.1, 34, 35 Among the three major haplogroups under O-M175, O3-M122 occurs at the highest frequency, above 40% on average. After systematically screening O3-M122-associated male subjects over nearly the whole of East Asia, Shi et al.30 argued strongly for a southern origin of lineage O, which expanded northward approximately 25 000–30 000 years ago. In the 14 ethnic groups involved in the present study, the overall frequency of haplogroup O-M175 was 23.5%, of which O3-M122 accounted for 14.1%. The present data are consistent with the argument for northward migration and support the proposed northwestward entrance of haplogroup O. As a result, O lineages became one of the main sources of the northwestern populations.

M173 is putatively regarded as an ancient marker across Eurasia. The vast majority of Y chromosomes belonging to R-M207 can be assigned to R1-M173 lineages, from which two predominant subclades, R1b3-M269 and R1a1-M17, were derived. Substantial evidence indicates that M269 has been well established throughout Europe since the Paleolithic era, with frequencies of 40–80% that decline from west to east.7, 14, 38, 39, 40 R1b-M343, the ancestor of M269, was present at only a detectable frequency in the populations involved in this study. R1a1-M17 is widely and frequently distributed in Europe, the whole of Central Asia, Pakistan, northwestern India and West Asia.7, 32, 38, 41 Haplogroup R1a1-M17 has been proposed to have originated in the South Russia/Ukraine region approximately 10 000 years ago and to mark the spread of the Kurgan culture over the Central Asian steppes.42 The distribution pattern of R1a1-M17 in Europe, in which the frequency increases eastward,43, 44, 45 is the opposite of that of R1b3-M269. In South Asia, a frequency of 15.8% was observed for Indian populations as a whole, and 24.4% for Pakistanis.32 The frequency of R1a1-M17 is highest in Central Asia.8 Thus, R1a1-M17 served as one main source of expansion by Central Asian nomads. R1a1-M17 makes up a significant proportion of the genetic components in populations of Northwest China (Figure 5), which, together with the multiple founders of R1a1-M17 inferred from the MJ network analysis, signifies that large numbers of Altaic-speaking nomads from Central Asia entered Northwest China, although their arrival, as estimated by haplogroup dating (Table 4), may be far more recent than that of the East Asian O lineages. Our estimate agrees well with the history of pastoralism in Central Asia, which dates back 6000 years.46

Consequently, the significant proportions of haplogroups O-M175 and R1a1-M17 in the paternal genetic composition indicate that the main ancestors of the present northwestern populations in China were earlier East Asians who moved northwestward and later Central Asians who immigrated eastward.

Component originating from West Asia

Currently available data support the Near East as the homeland of haplogroup J, which bears a mark of the Neolithic demic diffusion associated with the development of agriculture.23, 47 Early farmers introduced J2-M172 lineages through the Levant Corridor into Europe, where this haplogroup is mainly confined to the Mediterranean coastal areas, southeastern Europe and Anatolia.14, 38, 48, 49, 50, 51 It has also been observed at a noticeable frequency in Central Asians.7, 8 The eastward expansion of haplogroup J2 to Iraq, Iran and Central Asia is also well documented in the Neolithic archeological record.52 In this study, M172 was observed in Uygurs, Tajikes and Ozbeks, reaching a frequency of 30.4% in Ozbeks. Our result shows how far lineage J reached eastward. On the other hand, Islam was introduced into China 1400 years ago from Arabia and Persia,12, 13 and the western lineages have largely been attributed to the arrival of these Muslims.53 However, the estimated age of J2-M172 in this region appears to be older than the history of Islam. The haplotype clustering in the MJ networks clearly separated northwestern populations from western Eurasians. Therefore, the J2-M172 lineages probably penetrated into China with nomads moving eastward from Central Asia, rather than directly from West Asia.

Limited gene flow

I-M170, a European-specific haplogroup, is considered to be of Balkan origin.29, 38 I-M170 lineages subsequently dispersed toward the Caucasus region and Central Europe.49, 54 In our samples, I-M170 was observed predominantly in the Tataer population, at a frequency of 33.3%. G-M201 probably arose in Mesopotamia, and Caucasus and Anatolian populations show a relatively high frequency.14, 55, 56 Semino et al.38 observed that haplogroup G-M201 occurred at 30.1% in Georgians, a Caucasus population. However, the frequency of G-M201 is much lower in West Asia, India and Pakistan.31, 32, 57 Haplogroup G-M201 was detected in only five individuals from three of our northwestern populations. H-M69 emerged during the second great human migration from the Middle East into the Indian subcontinent, with a supposed history of 25 000 years. Moreover, H-M69 lineages are concentrated in clade H1-M52 in Indians, yet are rare or even absent in other populations.32, 40, 58, 59, 60 We identified only four Y chromosomes harboring M69, all from the Uygur population. The restricted and occasional distribution of G-M201, H-M69 and I-M170 under trunk F-M89 among northwestern populations of China and Central Asians may result from gene flow from West Eurasia and South Asia mediated by the Silk Road. This transcontinental communication complicated the genetic scenario of northwestern populations, but the resulting genetic admixture had only a marginal effect.

Haplogroup Q-M242, the sister clade of R-M207 derived from P-M45, is a predominant lineage throughout Siberia.61 A subset of Q-M242 lineages crossed the Bering Strait and entered America. During this expansion, a characteristic marker, M3, emerged, defining Q3. As reported elsewhere, M3 in Asia has been observed only in the far northeastern Siberian region of the Chukchi Peninsula.62, 63 No Q3-M3 lineage was observed in our samples, and individuals carrying M242 occurred only sporadically. Thus, there is little convincing evidence for genetic interaction between northwestern populations and Siberians.

Relay station during the northward migration of N

We typed M231, a marker equivalent to the LLY22g polymorphism, to identify haplogroup N, which was largely underrepresented in previous studies of East Asians. A detailed survey has shed more light on the phylogeography of haplogroup N.64 N-M231 presumably originated in the southwest of East Asia and then spread across almost the whole of North Eurasia, with only a limited distribution in Central Siberia. N3 is the most frequent subclade under haplogroup N. A higher frequency of M231 was observed in Northeast Asia than in the south.19 N3-Tat was also observed in Central Asians at a moderate frequency.7, 8 Our finding that N-M231 occurred widely, although infrequently, in northwestern populations suggests that Northwest China may have served as an intermediate station on the migratory trajectory, from which haplogroup N moved west into Central Asia and then entered Siberia.

Influence by recent historic events

C-M130 is a predominant haplogroup in northeastern Asians, in whom its frequency is higher than in southern populations. In particular, it is significantly frequent in Mongolians.7, 19, 37 Haplogroup C was also observed at an appreciable frequency in Central Asians and northwestern populations of China. Zerjal et al.,65 who calculated a divergence time of 1000 years, proposed that the expansion of the Mongol empire spread C-M130 lineages across a broad region from East to Central Asia. In our samples, the frequent occurrence of C-M130 in the Kazakh, and not only in the Mongolian, is probably related to a large degree to the historical record that large numbers of Mongols admixed into the precursors of the Kazakh ethnic group.66

An additional finding of great interest was that 58% of the paternal components in the Russ population belonged to haplogroup O. A typical East Asian haplogroup was thus observed at an aberrant frequency in a population that is not typically East Asian. The ethnic origin of this group is unambiguous: the Russ in China descend from immigrants who arrived from Russia after the eighteenth century. As they admixed with East Asians, the higher mobility of females in a patrilocal society gave rise to sex-biased admixture,67 which resulted in more East Asian genetic components being assimilated into the patrilineal gene pool of this ethnic population. Our unpublished mitochondrial DNA data, which indicate more mitochondrial DNA lineages of European origin in the Russ population, substantiate this supposition.

Genetic heterogeneity among populations

Almost all ethnic groups in this study are descendants of nomadic pastoralists. Notably, endogamous marriage and seasonal migration are often more prevalent among nomads.46 Highly mobile and endogamous populations will not show associations between genetic variation and geography, as has been shown for Jewish populations.68 Multiple analyses (the distribution of haplogroups, the MDS plots and AMOVA), not least the significant variance components among populations within groups (P<10^−5, second column in Table 3), concordantly showed significant heterogeneity even among linguistic and geographic neighbors. The genetic heterogeneity may be attributed to two causes: (1) differences in isolation and strong cultural boundaries, resulting from different cultures and customs, led to differences in the degree of genetic admixture; and (2) different populations experienced different population events such as migration, and consequently the regions where these ethnic groups currently reside may not reflect the original geographic settlement patterns of the nomadic groups.

Collectively, the above analyses of Y-chromosomal variation revealed the paternal genetic constitution and the origin of northwestern populations in China, and specified the processes of gene admixture. Early East Asians colonizing northwestward met later immigrants from Central Asia. J2-M172 was probably introduced into China by Central Asians. Gene flow from West Eurasia along the Silk Road diversified the genetic scenario, although its influence on local populations was limited.