Reproductive success, widely used as a measure of fitness, is described as the genetic contribution of an individual to future generations.1 Human behavioural ecologists use various fertility-specific measures to directly estimate reproductive success in extant populations (eg, Strassmann and Gillespie2); however, indirect estimates of past reproductive success can also be obtained through evolutionary population genetic approaches, which demonstrate that its cultural transmission can be extremely effective.3, 4

High variance of male reproductive success is detectable from genetic data because it leads to many Y-chromosomal lineages becoming extinct through drift and others expanding markedly. For any increase to lead to a high population frequency, two factors are needed: biological, with a need for men to be fertile, and cultural, with a continued transmission of reproductive success over generations. Indeed, previously, strong signals of successful transmission of Y-lineages have been associated with recent social selection, and were explained by inherited social status, with two cases in Asia and one in the British Isles. The best-known instance is the finding that ~0.5% of the world’s Y-chromosomes belongs to a single Asian patrilineage,5 descending from a common ancestor in historical times, and suggested to be due to the imperial dynasty founded by Genghis Khan (died 1227). The lineage is characterised by a high-frequency Y-microsatellite haplotype and a set of close mutational neighbours—a so-called ‘star-cluster’. In its structure and haplotype diversity, it resembles the recent descent clusters (DCs) observed within rare British surnames,6 despite its presence in populations spread over an enormous geographical range and speaking many different languages. Subsequently, two further examples of high-frequency DCs were described and were associated with the Qing dynasty descendants of Giocangga (died 1582) in Asia,7 and the Irish early medieval ‘Uí Néill’ dynasty in Europe.8 These three examples might suggest an association between high reproductive success (detectable at the population level) and both wealth and socio-political power, as is observed in some present-day populations (eg, the Tsimané of Bolivia9).

In this study, we ask whether the two signals of continued transmission of success over generations detected in mainland Asia are the only examples of likely recent social selection, or if similar patterns can be detected among the Y-chromosomes of this continent, known for the numerous expansive polities that emerged there during the Bronze Age and later (beginning 5000–4000 YBP). Pinpointing the historical figures associated with any such signals of Y-lineage transmission would require their identification via ancient DNA testing, and/or a comparison with certified living descendants and will not be attempted here. Instead, we aim to understand whether efficient transmission is linked to rules that apply in populations with specific histories of subsistence or culture, or can be detected in other groups with diverse cultural features. We also ask whether any expansions detected converge to the same time period or reflect distinct temporal episodes.

To address these questions, we generated novel Y-microsatellite and Y-SNP data and combined these with published data in order to carry out a systematic analysis of the most frequent haplotypes and their associated DCs. We chose a lineage-based, or interpopulation approach, rather than an intrapopulation approach,10 as we reasoned that dominant lineages are likely to migrate.11 We considered the geographical pattern of identified expansions, their ages and the cultural factors (language and mode of subsistence) characteristic of each population in which they were detected. In total, the 5321 chromosomes analysed belong to 127 populations distributed from the Middle East to Korea. Our analysis focuses upon the 15 most frequent haplotypes that include two haplotypes reflecting the previously recognised ‘Khan’ and ‘Giocangga’ expansions.5, 7 Our analysis shows that the most successful Y-chromosomes in Asia form part of expansions that began between 2100 BCE and 1100 CE, found both among sedentary agriculturalists and pastoral nomads. The ages of these expansions are considered in their cultural contexts.

Materials and methods

DNA samples

DNA samples (461) from Central Asia were collected by the authors with appropriate informed consent.12 An additional 4860 samples form part of sets published previously13, 14, 15, 16, 17, 18, 19, 20, 21 with the exclusion of population samples containing fewer than 15 individuals. A total of 5321 males from 127 populations were analysed (Supplementary Table 1 and Figure 1).

Figure 1
figure 1

Locations of populations included in the analysis. In the upper panel, numbers represent a numerical code given to each population (Supplementary Table 1) and in the lower panel three-letter abbreviations of population names are given. AHE: Aheu; ALK: Alak; AMB: Ambalakarar; ACT: Anatolia Centre; AES: Anatolia East; AIS: Anatolia Istanbul; ANR: Anatolia North; ANE: Anatolia NE; ANW: Anatolia NW; AST: Anatolia South; ASE: Anatolia SE; ASW: Anatolia SW; AZE: Azeri; BLC: Balochi; BJH: Beijin-Han; BIT: Bit; BO: Bo; BRH: Brahui; BRA: Brau; BRS: Burusho; BRY: Buryat; BUY: Buyi; CHA: Chamar; DAU: Daur; DUN: Dungans; EWK: Ewenki; GEO: Georgian; HAL: Halba; HAN: Han; HCD: Han (Chendgu); HHR: Han (Harbin); HLZ: Han (Lanzhou); HMX: Han (Meixian); HYL: Han (Yili); HNI: Hani; HAO: Hanoi; HAZ: Hazara; HEZ: Hezhe; HMD: Hmong Daw; HO: Ho; HUI: Hui; INH: Inh; IMG: Inner Mongolian; IRA: Iran; ILA: Irula; IYN: Iyengar; IYR: Iyer; JAM: Jamatia; JAV: Javanese; JEH: Jeh; JOR: Jordan; KLS: Kalash; KMR: Kamar; OTU: Karakalpaks (On Tört Uruw); KK: Karakalpaks (Qongrat); KGD: Kataang; KUF: Katu; KAZ: Kazak (Karakalpakia); KZK: Kazak; KHL: Khalkh; KHM: Khmu; KBR: Koknasth Brahmin; KRY: Konda Reddy; KKO: Korean; KOR: Korean; KCH: Korean (China); KOT: Kota; KDR: Koya Dora; KYK: Krasnoyarsk Kurgan; KRD: Kurds; KRB: Kurumba; KYR: Kyrgyz; KRA: Kyrgyz (Andijan); KRM: Kyrgyz (Narin); KRG: Kyrgyz (Narin); LBN: Lamet; LBO: Laven; LI: Li; LOD: Lodha; MKR: Makrani; MLF: Mal; MAC: Manchu; MAN: Manchurian; MRT: Maratha; MZO: Mizo; MON: Mongolian; MUR: Muria; MUS: Muslim; NGT: Ngeq (Nkriang); ORQ: Oroqen; OSS: Ossetian; OMG: Outer Mongolian; UZ: Ouzbeks (Karakalpakia); OYB: Oy; PLN: Pallan; PAH: Pathan; QIG: Qiang; RAJ: Raiput; SHE: She; SDH: Sindhi; SO: So; SUY: Suy; SVA: Svan; SYR: Syria; TAJ: Tajik; TJK: Tajik (Ferghana); TJR: Tajik (Ferghana); TJA: Tajik (Samarkand); TJU: Tajik (Samarkand); TAL: Talieng; THA: Thai; TIB: Tibet; TRI: Tripuri; TUR: Turkmen; TKM: Turkmen (Karakalpakia); ULY: Uyghur; UYU: Uygur (Urumqi); UYY: Uygur (Yili); UZB: Uzbek; VAN: Vanniyan; VLR: Vellalan; WBR: W Bengali Brahman; XBE: Xibe; YKC: Central Yakutia; YKT: Yakut; YBM: Yao (Bama); YLN: Yao (Liannan). Different colours correspond to the mode of subsistence characteristic of each population as follows: orange—agricultural; green—pastoral nomadic; red—hunter–gatherer; purple—multiple modes; grey—information unavailable.

Y-chromosome haplotyping and nomenclature

In the 461 Central Asian samples up to 31 binary markers were typed hierarchically by using the SNaPshot protocol on an ABI3100 capillary electrophoresis apparatus (Applied Biosystems Inc., Foster City, CA, USA). Primers were based on ones published previously22, 23, 24 with additional primers based on published sequences.25 The binary markers define haplogroups that are represented in a maximum parsimony tree25 (Supplementary Figure 1). For comparison among data sets, some simplification of resolution was required, but the original haplogroup information was also retained and correspondence between previous nomenclatures was established.

Eight Y-specific microsatellites (DYS19, DYS389I, DYS389b (equivalent to DYS389II-DYS389I), DYS390, DYS391, DYS392, DYS393 and DYS439) were typed in two previously described multiplexes26, 27 by using an ABI3100 apparatus and GeneMapper v 4.0 (Applied Biosystems Inc.). When DYS19 was observed as two alleles, we retained for analysis the allele found at higher frequency in the population. Allele nomenclature was as described27 with appropriate adjustments in some published data sets.

DC definition and display

Having identified frequent haplotypes, we applied a rule to define a DC centred upon each, using a script (‘Cluster Generator’; Supplementary Information) to extract these DCs from the database. This selects individuals from the data set, starting from a given haplotype and including all haplotypes that are continuously traceable to a core haplotype via a pathway of increasing frequency, a topology consistent with a Poisson-distributed single-step mutational model.8 An additional ‘single-step+10% two-steps’ model was tested, but had the effect of incorporating unrelated chromosomes belonging to distinct haplogroups. We therefore applied only the most conservative single-step model. For each population, we reported the proportion of each DC detected, and, as a measure of DC diversity, its average within-population microsatellite variance. The latter measure was used only for cases in which the number of DC chromosomes per population was >7 in order to avoid strong variance bias.28, 29 Use of seven or six microsatellites instead of eight identifies the same DCs (data not shown).

Frequencies of DCs and local microsatellite variances were calculated, and both frequency and highest variance value (source) were displayed using Surfer 8.02 (Golden Software, LLC, Golden, CO, USA) by the gridding method. Latitudes and longitudes were based on sampling centres.

TMRCA estimation and multivariate statistical analysis

TMRCA estimates were generated using BATWING30 using an exponential growth model, an average mutation rate of 0.0028 per locus per generation for microsatellites31, 32 and a generation time of 30 years, compatible with population-based estimates for males.33 BATWING was run on samples comprising DCs to reduce computation time. Tests showed that running BATWING on populations and extracting DC TMRCAs led to similar results (data not shown).

Statistical tests were done using R.34 PCA was carried out on the frequency of DCs per population (arcsine square root transformed) to test relative contributions of DCs and potential associations between expansions. To test for shared cultural features (language and subsistence methods), a co-inertia analysis35 was carried out on the 95 populations for which cultural information was available. This examines two synthetic variables, one in each data set (here, frequency of DC and cultural features), with maximum covariance. The Rv-coefficient (a multivariate extension of the Pearson correlation coefficient) was calculated and tested by means of a Monte–Carlo method (1000 random permutations of the rows of each data set). PCA and co-inertia analyses were done using the ADE package.36


Identification of unusually frequent haplotypes and associated DCs

Through haplotyping of Central Asian samples (Supplementary Table 2) and a literature survey, we collected 5321 eight-locus Y-chromosomal microsatellite haplotypes belonging to 127 Asian populations (Figure 1 and Supplementary Table 1). We ranked haplotypes by frequency (Figure 2), reasoning that particularly frequent haplotypes should represent potential cores of clusters indicating past expansions of paternal lineages. The sample contained 2552 distinct haplotypes, 67% of which were unique, and 15 haplotypes (0.6%) were each present >20 times (and up to 71 times), indicating probable examples of successful transmissions (Figure 2). We focused on these 15 haplotypes and studied their spatial distributions and ages to illuminate the history of high reproductive success of their respective common ancestors.

Figure 2
figure 2

Microsatellite haplotype frequency distribution. The 2552 distinct microsatellite haplotypes among 5321 individuals are plotted by frequency. The 15 frequent haplotypes studied in detail here are numbered from 1 to 15, and contained in the dotted rectangle.

Frequent haplotypes are expected to be accompanied by neighbouring haplotypes within the same Y-SNP haplogroup that arise from them via mutation, thereby forming DCs.6 These DCs were uniformly and objectively extracted from the database using ‘Cluster Generator’. When applied to the 15 most frequent haplotypes, this led to some common haplotypes merging into clusters centred on other common haplotypes, and thus to a total of 11 DCs, whose characteristics are summarised in Table 1.

Table 1 Features of the fifteen primary descent clusters

Characteristics of DCs

A total of 2000 (37.8%) males carry Y-chromosomes belonging to these DCs, and they are remarkably widespread, with only three of the 127 populations analysed lacking any DC chromosomes; DC2 is also detected in one of the two ancient DNA population samples in our database (Krasnoyarsk_Kurgan17). Overall DC proportions in each population vary greatly from 5 to 96.7% (Supplementary Table 3).

As DC4 is restricted to Korea,18 it was not considered for interpopulation analysis. For the remaining 10 DCs, we explored three characteristics for each: its geographical frequency distribution, the geographical distribution of its mean microsatellite variance to indicate its likely expansion source and its TMRCA to suggest the age of expansion. Frequency and variance are represented on maps (Figure 3 and Supplementary Figure 2), and are summarised in Table 2 and Figure 4, which also gives TMRCA estimates. Note that the dating method used, BATWING, provides estimates with large confidence intervals.

Figure 3
figure 3

Geographical distribution of frequency and microsatellite variance for two DCs. The examples here are for DC1 and DC12 (see Supplementary Figure 2 for the remainder). Open circles indicate populations lacking the DC and filled circles indicate populations possessing it. Increasing frequency is indicated by darkening colour as indicated in the scale to the top right. The population with maximum microsatellite variance for the DC is indicated by a coloured circle and the direction of the expansion based on DC variance is indicated by arrows.

Table 2 TMRCA estimates for the 11 DCs and summary of geographical distributions and likely origins
Figure 4
figure 4

Schematic synthesis of DC features illustrated geographically. Black open circles indicate potential location origins and arrows indicate directionality. Mean TMRCA is given for each DC.

Our results show heterogeneity in DC features for all variables analysed: (i) Geographical pattern: two kinds of pattern are observed: six DCs display high frequencies spread over large geographic areas (DCs 1, 2, 6, 8, 10 and 14), and four DCs are more restricted to specific areas (DCs 3, 5, 8 and 12). (ii) Origin: three main source locations emerge: Silk Road/East Mongolia (DCs 1, 8 and 10), Middle East/Central Asia (DCs 2, 3 and 5) and East India/South East Asia (DCs 2, 6, 11, 12 and 14). The potential ‘source’ populations are given in Table 2. (iii) Age: mean TMRCA estimates range from 900 to 4100 YA. In considering these results, we focus on two periods: the ‘historical’ period (700–1060 CE) and the protohistorical period (2100–300 BCE), corresponding, respectively, to pre-Imperial ‘Ancient’ China,37 and the Bronze Age/Iron Age in Levantine/Inner Asia.

Seeking common features among expansions

We performed a PCA on DC frequencies per population (Figure 5) to reveal the relative contributions of the 10 DCs and to test any potential associations between expansions. The two first axes of the PCA contribute, respectively, 27% and 21% of the total variance, with the main contributions of DCs 2, 5, 12 and 14 along axis 1, and those of DC1, 8 and 10 along axis 2. DCs 12 and 14 are systematically associated, as are DCs 2 and 5, and DCs 1, 8 and 10, suggesting shared characteristics between these three sets of expansions (Figure 5).

Figure 5
figure 5

PCA showing the relative contribution of each DC on axes 1 and 2.

To test whether individuals included in these 10 expansions share any particular cultural features, such as language or subsistence methods, as is the case for previously published examples,5, 7 we performed a co-inertia test (Figure 6). This revealed the preferential association of DCs 12 and 14 with Sino-Tibetan and Austro-Asiatic languages and a ‘multiple’ mode of subsistence, of DC2s 2 and 5 with Indo-European languages and ‘agricultural’ subsistence, and of DCs 1, 8 and 10 with Altaic languages and pastoral nomadism (Figure 6). Genetic and cultural patterns (mode of subsistence and language) are significantly correlated (Rv=0.47, P-value <0.0001, N=95), suggesting that several expansions share the same cultural characteristics.

Figure 6
figure 6

Co-inertia analyses showing preferential associations between DCs and cultural factors.

The TMRCA estimates are consistent with six expansions beginning in protohistorical times (DCs 2, 5, 6, 11, 12 and 14), and four beginning in historical times (DCs 1, 3, 8 and 10; Table 2). DCs detected in the different periods differ (Figure 7) in regard to mode of subsistence (χ2=345.59; df=2; P<0.0001) and language (χ2=211.67; df=4; P<0.0001). This suggests a shift in population dynamics between the two periods, with an increase of successful Y-lineage expansions in Altai-speaking pastoral nomadic populations in more recent times. The historical-period expansions show the highest growth rates (Table 2).

Figure 7
figure 7

Proportion of DCs in protohistorical and historical periods in populations divided by different cultural factors.

Associating protohistorical and historical expansions with subsistence mode and language

Six of the 10 DCs analysed originated during the Bronze Age or earlier (2100–300 BCE), and correspond to 30% of the individuals in our data set. The ‘origins’ of these expansions lie within three main regions: South East Asia, including Laos (DCs 12 and 14), Tibet/East India (DCs 6 and 11) and Central Asia/Fertile Crescent (DC2 and DC5). The co-inertia analysis shows that these expansions are found mainly among agricultural populations, speaking Indo-European and Austro-Asiatic languages. DCs 5, 12 and 14 have their putative sources centred both on the Fertile Crescent and South East Asia—places known as centres of agricultural innovation;38 these DCs could be, then, by-products of the civilisations that underwent a shift to agriculture during the Bronze Age (Figure 4). The origin of DC2 is located in Central Asia and its TMRCA estimate is 1300 BCE; this is coherent with the ancient DNA Y-chromosomes from the Middle Bronze Age (Andronovo) detected in DC2.17

Four expansions date to the historical period (700–1100 CE). The most recent (DC3; 1100 CE), originating in the Near East and extending to the South East Indian coast, is mainly detected in agricultural populations. It could be linked to the rapid expansion of Muslim power from the Middle East, across Central Asia and to the borders of China and India, after the establishment of a unified polity in the Arabian Peninsula by Muhammad in the 7th century and under the subsequent Caliphates.

The three other expansions (DCs 1, 8 and 10; 700–1060 CE) encompass 79% of individuals involved in historical expansions and are almost exclusively detected in Altaic-speaking pastoral nomadic populations. The territory is large and follows the Silk Road corridor. The signal of expansion spreads from East to West (from Mongolia to the Caspian Sea), as DC1 has its source in Inner Mongolia (hgC3[xC3c]), DC8 in the Oroqen (hgC3c) and DC10 in the Hezhe (hgN1). Given their different sources and associated haplogroups, these DCs are likely to represent three distinct expansions. These expansions show high variance values not only at their sources but also in other locations (Supplementary Table 3); this could be explained by a rapid migration of founders and parallel transmissions to generate diversity in these different places, all descending from a unique ancestor.

The two previously recognised Asian star-clusters (‘Khan’ and ‘Giocangga’5, 7) are identified here, and correspond to DCs 1 and 8, respectively. Their TMRCAs are estimated at 1090 CE for DC1, almost identical to the published estimate of ~1000 YA,5 and at 700 CE for DC8, older than the published estimate of 590±340 YA.7 The ‘Khan’ DC remains the most striking signal of an Asian expansion lineage, representing 2.7% of the entire data set, and the highest number of identical Y-chromosomes (N=71). Interestingly, the westward directions of expansions DC8 and DC10, their potential sources in northeast China, their geographic extents from China to Karakalpakia, and also the Altai-speaking populations associated with them, could also indicate involvement of the Imperial or elite lineages associated with the Khitan Empire. Abaoji, Emperor Taizu of Liao and the Great Khan of the Khitans (died 926 CE), who officially designated his eldest son as his successor, maintained a pattern of seasonal movements typical of pastoral societies, which could also explain the geographic distribution of these lineages.


Highly represented Y-lineages as a proxy for high reproductive success

Reproductive success, widely used as a measure of fitness, is the genetic contribution of one individual to future generations; it implies that offspring are plentiful and that they survive. In humans, offspring number and survival can be strongly influenced by culture.4 Examples of long-term high reproductive success have been revealed by unusually frequent Asian Y-chromosome haplotypes suggested to descend from two individuals, Genghis Khan5 and Giocangga,7 and to have spread in past generations via social selection (because of associated power and prestige) during historical times. In this study, we used a similar approach, analysing 5321 Y-microsatellite haplotypes belonging to 127 Asian populations to ask whether other examples of continued Y-lineage transmission are observed.

A favourable post-Bronze Age socio-political context for successful Y-lineage transmissions

We have detected 11 highly represented Y-chromosome lineages among the 5321 males analysed; altogether the DCs derived from 11 founding chromosomes represent a large proportion (38%) of the Y-chromosomes analysed, attesting to the importance of efficient Y-lineage transmission in the history of the continent. The presence of DCs with TMRCAs from 2100 BCE to 1100 CE suggests that the socio-cultural context, from the Bronze Age up to more recent historical periods, has been sufficiently favourable to ensure a continuous transmission of these different Y-lineages. Indeed, several admixture events have deeply affected the Asian gene pool during this period, involving migrant groups coming from North East and South East Asia.39 It has been argued40 that the development and impact of complex polities beginning with the Bronze Age ~2500–100 BCE in Central Asia were responsible for drastic changes in population structure, movements and organisation. These polities and early states emerged from local traditions of mobility and multiresource pastoralism, including agriculture and distributed hierarchy and administration,41, 42 with a royal or imperial form of leadership combining elements of kinship and political office.42 Archaeological evidence from the late Bronze and early Iron Ages (1400–400 BCE) confirms the existence of control hierarchies associated with wealth differences and socio-political power, even before the emergence of political entities.42 The development of these polities was accompanied by the emergence of stratified societies, recognised elite lineages and differences in power and status between individuals.

High reproductive success is often associated with high social status, ‘prestigious’ men having higher intramarital fertility, lower offspring mortality9 and access to a greater than average number of wives.43 This suggests that the link between status and fitness detected in modern populations (eg, Nettle and Pollet,44 Snyder et al45 and Heyer et al46) also existed in ancient populations. For the TMRCA period from 2100 BCE to 1100 CE, the DCs are equally distributed between sedentary agriculturalist (36.8%) and pastoral nomadic populations (36.7%), indicating that uninterrupted transmission of Y-lineages over many generations could occur in both groups. Cultural transmission of fertility, reproductive success and prestige, seen in contemporaneous populations,47 could also explain the persistence of these frequent Y-lineages over time. The high variance of DC proportions among populations indicates that the ‘memes’ responsible for cultural transmission can vary greatly among populations, even when populations share similar cultural characteristics.

A shift from sedentary agriculturalist to pastoral nomadic populations for successful Y-lineages

Our study highlights a difference in population composition between protohistorical and historical periods. In protohistorical times, DCs are detected mainly in agriculturalist but also in multiresource and pastoralist populations, whereas in historical times they are predominantly seen in pastoralist populations. This shift towards pastoralists only suggests a modification of social organisation in Asia and a higher probability to generate reproductively successful lineages in pastoral-derived societies and states. The development of complex polities began with the Bronze Age in Central Asia ~2500–100 BCE.40 Agriculture arrived ~6000 BCE (Djeitun, Turkmenistan48, 49) and dispersed during the Bronze Age (3000–2000 BCE) in coexistence with hunter–gatherer communities. The pastoral nomadic lifestyle, linked to horse domestication,50 emerged in Central Asia ~5000 BCE and gained importance during the late Bronze Age. The pastoral nomadic tribes eventually came to dominate the Ponto–Caspian steppes during the first millennium BCE and entered historical records as Scythians (Herodotus) or Sakas (Persian accounts).51

The over-representation of historical-period DCs in pastoral nomadic Altaic-speaking populations is compatible with the development of new forms of political and social organisation in pastoralist economies. Among present-day pastoral nomads, patrilineal descent rules lead to a cultural transmission of reproductive success (Heyer et al, submitted) in male lineages.

New social systems and economic adaptations emerged after horse domestication. Horse-riding greatly enhanced both east–west connections and north–south trade between Siberia and southerly regions, and allowed new techniques of warfare, a key element explaining the successes of mobile pastoralists in their conflicts with more sedentary societies.41 A series of expansive polities emerged in Inner Asia by 200 BCE: 15 steppe polities beginning with Xiongnu (Khunnu) ~200 BCE and concluding with the Zunghars in the mid-18th century have been described, including the Mongol and the Qing empires/dynasties.42, 52 The DCs detected along the Silk Road corridor originated from 700 to 1060 CE and may be associated with dynasties that coexisted with five major empires that dominated the Ponto–Caspian steppes from the 7th to the 13th centuries: the Khitan (Great Liao), Tangut Xia, Jurchin, Kara-Khitan and the Mongol Empires.42 In order to identify the prestigious DC founders among potential candidates, the expansion strategies of these different dynasties, and also the mechanisms of co-inheritance of Y-lineages and prestige via marriage rules should be carefully investigated. As an example, sororal polygyny (in which a man can marry two or more sisters) followed by a shift to the Han Chinese system of taking one wife and one or more concubines, as occurred among the Liao elite throughout the length of the Liao dynasty, are likely to promote the efficient transmission of prestigious lineages among males.

Further investigations including ancient DNA approaches, and next-generation sequencing of modern expansion lineage Y-chromosomes could be undertaken to refine the history of the expansion lineages we have observed, and perhaps to identify the prestigious and powerful pastoralist founders associated with them.