Main

Population admixture may cause various genetic effects, including different disease incidences. In some instances, a population may show different admixture patterns in paternal and maternal lineages, a condition that is called ‘sex bias population admixture’.1 Sex bias admixture may result from specific population histories, such as military migration or frequent intermarriage. A military migration might involve a mobile male soldier population that marries indigenous females, which results in two different lineage origins. Intermarriage between neighboring populations might also eventually alter maternal lineages. In a patriarchy, women usually move more frequently than men because of virilocal residence practices. Although military migration and intermarriage account for the majority of sex bias admixture, the Gelong population of Hainan is an exception.

The Gelong are a small population derived from the Gelao people in Guizhou, China. They arrived in Hainan, China, around 1000 years ago and have since had only limited genetic contact with the neighboring Han and Hlai people.2 In this study, we investigated the mitochondrial DNA (mtDNA) diversity of 110 Gelong individuals. Our study was approved by the Fudan School of Life Sciences Ethics Committee. All subjects signed informed consent. We employed the same methods used in our previous studies.1, 3, 4 The hypervariable segments of mtDNA were sequenced and 23 coding-region single-nucleotide polymorphisms (SNPs) were typed using restriction fragment length polymorphism analysis (Supplementary Table 1). All sequences have been submitted to GenBank (accession numbers: HM145965–6074). The haplogroups were determined according to the most up-to-date nomenclature for mtDNA.5

The Gelong people show moderate genetic diversity (haplotype diversity6 0.968±0.008, close to the average of southern China3) in mtDNA lineages (Figure 1a) and have a high frequency of the M7 lineage (30%). Previous studies of Y chromosome lineages show that the Gelong originated from the Gelao and were also influenced by the neighboring Han and Hlai.2 Therefore, all three populations may serve as genetic contributors to the Gelong mtDNA lineages as well. To assess the possible sex bias admixture of the Gelong, we performed 5000 simulations for both Y chromosome and mtDNA data using Admix 2.07 (reference data are from our previous studies,2, 3, 4, 8 including Hlai,3 Gelao3 and southern Han Chinese4). The admixture patterns show an obvious sex bias, with only 4.9% paternal contribution but 30.7% maternal contribution from the Hlai. Also, there is less genetic material from the Gelao in maternal lineages (37.6%) than in paternal lineages (67.9%) (Figure 1b).
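The haplotype diversity quoted above can be illustrated with a minimal sketch, assuming the commonly used unbiased estimator h = n/(n−1) · (1 − Σ pᵢ²), where pᵢ is the sample frequency of haplotype i. The function name and the toy labels are hypothetical and are not the Gelong data; the standard error reported in the text is not computed here.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype diversity: h = n/(n-1) * (1 - sum(p_i^2))."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled haplotypes")
    counts = Counter(haplotypes)
    # Sum of squared sample frequencies of each distinct haplotype.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Toy sample with hypothetical haplotype labels (not real data).
toy_sample = ["hapA", "hapA", "hapB", "hapC"]
toy_h = haplotype_diversity(toy_sample)
```

A value close to 1 means that almost every sampled individual carries a different haplotype; a value near 0 means that one haplotype dominates the sample.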

Figure 1

Mitochondrial DNA haplogroup frequencies in Gelong (a) and sex bias admixture pattern of Gelong (b). Detailed frequency data of each haplogroup are shown in Supplementary Table 1. The admixture analyses were done using Admix 2.0 (see Excoffier et al.6) running at the default settings of the program.

To further investigate which lineages were obtained from the Hlai, we compared the haplotype diversity of mtDNA among East Asian populations1, 3, 4, 9 using Network4.201.10 The Gelong share several haplotypes with the Hlai in lineages M7e, R9c and R9b (Figure 2). One sublineage (locus 1d in Figure 2) is unique to populations from Hainan: within M7e, only one haplotype was found among 14 Gelong individuals, whereas five haplotypes were found among the Hlai. It is most likely that this haplotype was passed from the Hlai to the Gelong and spread quickly because of a genetic advantage. The high frequencies of the M7 lineages might also result from positive selection.

Figure 2

Networks of mtDNA showing the gene flow from the Hlai to the Gelong. The Network program was run at the default settings with the same weight for each SNP.9 Node sizes are proportional to the number of individuals carrying each haplotype (see the lower left corner for scale). The lengths of the lines between nodes are proportional to the step numbers of the STR mutations. Loci 1a and 1b contain haplotypes shared only by the Gelong and Hlai. At locus 1c, the clade is shared only by the Gelong and Hlai. The clade at locus 1d is a Hainan-specific clade; all of its individuals originate from Hainan, including the Hlai, Gelong, Tsat, Hainan Han and Hainan Kimmun. The haplotypes at loci 1e and 1f are more widely shared among populations. Loci 2a and 2b show a close genetic relationship between the Gelong and Gelao.

Selection signals on mtDNA were examined with a series of classical frequency spectrum tests for deviation from neutrality, including Tajima's D,11 Fu and Li's D, D*, F and F*,12, 13 calculated using DnaSP5.014 (Table 1). Significant negative values of these tests indicate positive selection, purifying selection or demographic expansion; significant positive values imply balancing selection or population subdivision. Against neutral constant-size simulations, we detected significant negative values for the M7b1 and M7c3b lineages in the Gelong but, contrary to expectation, not for M7e. Interestingly, some values for all lineages combined were also significantly negative, indicating that selection might have affected various lineages.
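To make the sign convention of these tests concrete, the following is a minimal sketch of Tajima's D computed from a set of aligned sequences with the standard published formula. It is an illustration only, not the DnaSP implementation used in this study; the function name and the toy alignments are hypothetical, and gaps or missing data are not handled.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (no gap handling)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least four sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if S == 0:
        raise ValueError("D is undefined without segregating sites")

    # pi: mean number of pairwise differences.
    n_pairs = n * (n - 1) / 2
    pi = sum(sum(a != b for a, b in zip(s, t))
             for s, t in combinations(seqs, 2)) / n_pairs

    # Constants of the standard formula.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


# Toy alignments (hypothetical, not the study data).
# Star-like sample: every variant is a singleton, so D is negative.
star_like = ["AAAAA", "TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAT"]
# Two haplotypes at equal frequency: D is positive.
two_clusters = ["AAAA"] * 3 + ["TTTT"] * 3

d_star = tajimas_d(star_like)
d_clusters = tajimas_d(two_clusters)
```

An excess of rare variants, as in the star-like toy sample, pulls the mean pairwise difference below the expectation derived from the number of segregating sites and gives a negative D. An excess of intermediate-frequency variants does the opposite.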

Table 1 Frequency spectrum tests for deviation from neutrality

These significant values might be signals of selection, demographic expansion or both. To separate out the effect of demographic expansion, we performed 10 000 demographic coalescent simulations using the ms program.15 Reference East Asian and African population models16, 17 were set as best-fit demographic models. The mutation rate was set to 3.24 × 10⁻⁶ per site per generation18 (20 years per generation). According to the demographic history of the Gelong, we assumed three scenarios for simulation (Supplementary Table 2). The P=0.05 significance cut-off values for each statistic are shown in Table 1. The previously observed significant values remain significant when compared with the simulation results under demographic expansion. Therefore, these values are most likely signals of selection.
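The comparison of an observed statistic with its simulated null distribution can be sketched as follows. This is a minimal illustration, assuming a one-sided lower-tail test on a list of statistic values already computed from simulated samples; it does not run ms, and the function names and toy values are hypothetical.

```python
def lower_tail_cutoff(simulated, alpha=0.05):
    """Largest simulated value lying in the lowest `alpha` fraction."""
    values = sorted(simulated)
    # Number of simulated values allowed in the rejection region.
    k = int(round(alpha * len(values), 9))
    if k < 1:
        raise ValueError("too few simulations for this alpha")
    return values[k - 1]


def empirical_p_lower(observed, simulated):
    """Fraction of simulated values at or below the observed value."""
    return sum(1 for v in simulated if v <= observed) / len(simulated)


# Toy null distribution (hypothetical stand-in for simulated statistics).
toy_null = list(range(-50, 50))  # 100 values: -50 .. 49

cutoff = lower_tail_cutoff(toy_null, alpha=0.05)
p_value = empirical_p_lower(-48, toy_null)
```

An observed value at or below the cut-off is more negative than expected under the simulated demographic model alone, which is the sense in which the observed values are described above as remaining significant.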

To validate these signals of selection, we employed one more test, the mismatch test of demographic expansion,19 using Arlequin3.116 to show how the expansion effect differs among lineages. The less significant the result, the stronger the expansion effect on the lineage. The result for M7b1 and M7c3b is not significant (P=0.128), whereas those for the total lineages (P=0.011) and for all the non-M7 lineages (P=0.006) are significant, indicating that the M7 lineages expanded much faster than other lineages in the population. This might be another signal of a selective advantage of the M7 lineages among the Gelong people.
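The mismatch test is based on the distribution of pairwise differences between sequences. The sketch below computes only that observed distribution for a set of aligned sequences; it does not reproduce the expansion-model fit or the bootstrap P-values produced by Arlequin. The function name and the toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Relative frequency of each pairwise difference count."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(
        sum(a != b for a, b in zip(s, t))
        for s, t in combinations(seqs, 2)
    )
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}


# Toy star-like alignment (hypothetical, not the study data).
toy_seqs = ["AAAAA", "TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAT"]
toy_mismatch = mismatch_distribution(toy_seqs)
```

A smooth, single-peaked distribution is the pattern usually associated with a past expansion, whereas a ragged, multi-peaked distribution is more typical of a population of stable size.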

As a source of cellular energy, mtDNA may affect human adaptation to different climates. Therefore, some variants of mtDNA may have undergone selection. Hainan is the hottest province of China, in stark contrast to the mountain areas of Guizhou from which the Gelong originated. Tropical climates might serve as a selective force on mtDNA M7 lineages. However, in the phylogeny of mtDNA (http://www.phylotree.org/),5 we found only three synonymous variations (C6455T, T9824A and C4071T) and one control region variation (T199C) in M7b1 and M7c3b. Future re-sequencing of the complete mtDNA genome of these M7 samples is expected to reveal the functional relevance of M7 lineages.