Genetic Variability and Phylogenetic Analysis of Han Population from Guanzhong Region of China based on 21 non-CODIS STR Loci

Abstract

In the present study, we present population genetic data and forensic parameters for 21 non-CODIS autosomal STR loci in the Chinese Guanzhong Han population. A total of 166 alleles were observed, with allelic frequencies ranging from 0.0018 to 0.5564. After Bonferroni correction, no STR locus deviated from Hardy-Weinberg equilibrium and no pair of loci showed significant linkage disequilibrium. The cumulative power of discrimination and probability of exclusion of the 21 STR loci were 0.99999999999999999993814 and 0.999998184, respectively. Genetic distances, phylogenetic trees and principal component analysis revealed that the Guanzhong Han population was more closely related to the Ningxia Han, Tujia and Bai groups than to the other populations tested. In summary, these 21 STR loci are highly polymorphic in the Guanzhong Han population and could be used for forensic applications and population genetic studies.

Introduction

China is an ancient country with a civilization of some 5,000 years and has the largest population in the world, about 1.371 billion according to the sixth national population census of China in 2010. As the largest of the 56 ethnic groups, with a population of approximately 1.226 billion, the Han are widespread across China. Their spoken and written language is Chinese, one branch of the Sino-Tibetan language family. Chu et al. constructed phylogenies using the neighbor-joining method based on data from different populations for short tandem repeat (STR) loci and concluded that there was a distinction between southern and northern populations in China1. Previous population genetic studies based on STRs or single nucleotide polymorphisms (SNPs) have shown that the Chinese Han population is intricately sub-structured and clusters roughly into two (northern Han and southern Han)2,3 or three (northern Han, central Han and southern Han) subgroups4. It is therefore important to further clarify the genetic structure of Chinese Han populations from different regions.

The Guanzhong region, whose name literally means “within the passes” in Chinese, is located in the middle of the Chinese mainland and includes the cities of Xi'an, Tongchuan, Baoji, Xianyang and Weinan in Shaanxi province, China. Several ethnic groups, mainly the Han, Hui and Manchu nationalities, live together in the region. Shen et al. reported that the Guanzhong Han population had a close genetic relationship with both the northern and southern Han populations, using genetic distance measurements, neighbor-joining dendrograms and principal component analysis (PCA) based on different HLA loci5.

STRs are among the most widely used markers in forensic science and population genetics. To provide more genetic information and increase the power of discrimination (PD) and probability of exclusion (PE), additional novel STR loci with high genetic polymorphism have been integrated into a single fluorescence-labeled multiplex amplification system. It is necessary to analyze the allelic distribution of STR loci before they are used in forensic applications. We have so far reported population data6,7,8,9,10,11,12,13,14 for a panel of 21 STR loci, and these loci demonstrated considerable potential for forensic applications. In the present study, we first aimed to present the population genetic data and forensic parameters of the Chinese Guanzhong Han (geographically a Northern Han population) for a panel of 21 non-CODIS autosomal STRs. We then investigated the genetic relationships and population differentiation between the Guanzhong Han and other Chinese groups.

Methods

Populations and DNA extraction

Blood samples were randomly collected from 275 unrelated Han Chinese individuals living in the Guanzhong region, Shaanxi province, China. Before taking part in the study, all participants gave written informed consent for sample collection and subsequent analyses. This study was conducted according to humane and ethical research principles and was approved by the ethical committee of Xi'an Jiaotong University Health Science Center, China. Genomic DNA was extracted from bloodstain samples using the Chelex-100 method as described by Walsh et al.15.

Genotyping results for the 21 STR loci from 10 Chinese groups were chosen for population comparison: Mongolian (n = 86) from Inner Mongolia autonomous region6, Bai (n = 106) from Yunnan province7, Kazak (n = 114) from Xinjiang autonomous region8, Ningxia Han (Northern Han) (n = 202) from Ningxia autonomous region9, Russian (n = 114) from Inner Mongolia autonomous region10, Tibetan (n = 104) from Tibet autonomous region11, Tujia (n = 107) from Hubei province12, Uigur (n = 218) from Xinjiang autonomous region13, Yi (n = 110) from Yunnan province14 and Salar (n = 120) from Qinghai province16. The geographical locations of the reference populations are shown in Figure 1.

Figure 1

The geographical locations of the Guanzhong Han and 10 reference groups in China.

The map was created in MATLAB R2013b software (MathWorks Inc., USA).

PCR amplification and STR typing

A panel of STRs was amplified in a single reaction using the AGCU 21+1 STR system (AGCU ScienTech Incorporation, Wuxi, Jiangsu, China), according to the manufacturer's instructions. The PCR products were separated and detected by capillary electrophoresis on the ABI 3130xl Genetic Analyzer (Applied Biosystems, Foster City, CA, USA). STR genotypes were assigned by comparison with the 21+1 Allelic Ladder using GeneMapper® ID-X v1.3 (Applied Biosystems, Foster City, CA, USA). Control DNA from the 9947A cell line (Promega Corporation, Madison, WI, USA) was typed for quality control. All laboratory procedures were in accordance with the laboratory's internal control standards.

Statistical analyses

Allelic frequencies and forensic parameters were calculated using the modified Powerstats v1.2 (ref. 17). Genepop v4.0.10 (http://genepop.curtin.edu.au/) was used to test for linkage disequilibrium (LD) between all pairs of STR loci. To estimate the differentiation between the Guanzhong Han and the 10 reference populations in China, locus-by-locus Fst values, associated p values and overall Fst values were calculated by analysis of molecular variance (AMOVA) in ARLEQUIN v3.1 (http://cmpg.unibe.ch/software/arlequin3), and DA distances were calculated using the DISPAN program18. To visualize the genetic relationships between the Guanzhong Han and the reference populations, we constructed two phylogenetic trees: one in MEGA v5 using the unweighted pair-group method with arithmetic means (UPGMA) based on DA distances, and one in PHYLIP v3.6 using a bootstrap-over-loci method with 1,000 replicates based on allelic frequencies. A PCA plot was generated with MATLAB 2007a (MathWorks Inc., USA) based on the allelic frequencies of the 21 STRs. Because significant LD among STRs affects some of these analyses, including the DA calculation and the MEGA tree, STR loci observed to be in significant LD with one or more other loci were removed from those analyses.
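
The per-locus quantities named above can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions (PD as one minus the sum of squared genotype frequencies, PIC following Botstein et al., and expected heterozygosity without a small-sample correction); whether the modified Powerstats spreadsheet applies exactly these forms is an assumption here. The function name and the toy genotypes are hypothetical and are not data from this study.

```python
from collections import Counter
from itertools import combinations


def forensic_parameters(genotypes):
    """Summarize one STR locus from a list of (allele, allele) genotypes."""
    n = len(genotypes)
    # Each individual contributes two alleles to the frequency estimate.
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [count / (2 * n) for count in allele_counts.values()]
    # Treat (a, b) and (b, a) as the same genotype.
    genotype_counts = Counter(tuple(sorted(genotype)) for genotype in genotypes)

    ho = sum(1 for a, b in genotypes if a != b) / n      # observed heterozygosity
    he = 1 - sum(p * p for p in freqs)                   # expected heterozygosity (uncorrected)
    pd = 1 - sum((count / n) ** 2 for count in genotype_counts.values())
    pic = he - sum(2 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return {"Ho": ho, "He": he, "PD": pd, "PIC": pic}


# Hypothetical toy genotypes (repeat numbers), not data from this study.
toy = [(12, 13), (12, 12), (13, 14), (12, 14)]
print(forensic_parameters(toy))
```

With real data, the input would be the 275 genotypes at a single locus, and the function would be called once per locus.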

Results and Discussion

The typing results of the 21 STR loci from the Guanzhong Han population are listed in supplemental Table 1, and the allelic frequencies and forensic parameters are shown in Table 1. A total of 166 alleles were observed, with allelic frequencies ranging from 0.0018 to 0.5564. No STR locus deviated from Hardy-Weinberg equilibrium after Bonferroni correction (all p > 0.05/21 = 0.00238). All 21 loci showed high PD values, ranging from 0.7700 (D1S1627) to 0.9437 (D19S433). Polymorphism information content ranged from 0.5325 (D1S1627) to 0.7916 (D19S433), and PE ranged from 0.2738 (D1GATA113) to 0.5856 (D19S433). Observed heterozygosity ranged from 0.5855 (D1GATA113) to 0.7927 (D19S433), while expected heterozygosity ranged from 0.5940 (D1S1627) to 0.8147 (D19S433). The cumulative PD and PE of the 21 STR loci were 0.99999999999999999993814 and 0.999998184, respectively. These results indicate that the panel of 21 STRs is highly polymorphic and well suited to personal identification and parentage testing in forensic science.
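
As a hedged consistency check, the sketch below assumes PE is computed from the observed heterozygote proportion h as h²(1 − 2h(1 − h)²), a form commonly associated with Powerstats, and that cumulative values combine independent loci as one minus the product of the per-locus complements. Under that assumption the two extreme PE values quoted above are reproduced from the quoted observed heterozygosities. The function names are illustrative only.

```python
from math import prod


def power_of_exclusion(h):
    """PE from the observed heterozygote proportion h (homozygotes: 1 - h)."""
    hom = 1 - h
    return h * h * (1 - 2 * h * hom * hom)


def combined_complement(values):
    """Product of (1 - v) over loci: the chance that every locus fails."""
    return prod(1 - v for v in values)


print(round(power_of_exclusion(0.5855), 4))  # 0.2738, as reported for D1GATA113
print(round(power_of_exclusion(0.7927), 4))  # 0.5856, as reported for D19S433

# Only the two extreme PD values are quoted in the text, so this combines
# just those two loci for illustration; it is not the 21-locus result.
two_locus = combined_complement([0.7700, 0.9437])
print(1 - two_locus)
```

One practical note: the reported cumulative PD differs from 1 by far less than the spacing of double-precision numbers near 1 (about 1.1e-16), so subtracting the product from 1 in floating point would print 1.0. Carrying the complement itself, or using arbitrary-precision arithmetic, avoids that loss.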

Table 1 The allelic frequencies and statistical parameters for the 21 STR loci in Han population from Guanzhong region, Shaanxi, China (n = 275)

LD refers to correlations among neighboring alleles that reflect descent from single, ancestral chromosomes19. The level of LD is affected by multiple factors, for example genetic linkage, population structure and natural selection. In the present study, 11 of the 210 locus pairs formed by the 21 STR loci were observed to be in LD in the Guanzhong Han population before correction (supplementary Table 2). However, none remained significant after Bonferroni correction (significance threshold p < 0.05/210 = 0.00024). In addition, loci previously reported to be in significant LD with other loci in the reference groups were removed from some subsequent analyses, leaving 10 loci (D10S1248, D11S4463, D14S1434, D18S853, D1GATA113, D22S1045, D2S441, D4S2408, D6S1017 and D9S1122) for the DA calculation and the MEGA tree.
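
The two Bonferroni thresholds used in this section follow directly from the number of tests, as the short sketch below shows. The constant and function names are illustrative, and the p values in the last line are hypothetical, not taken from supplementary Table 2.

```python
from math import comb

ALPHA = 0.05
N_LOCI = 21

n_pairs = comb(N_LOCI, 2)          # number of distinct locus pairs
hwe_threshold = ALPHA / N_LOCI     # one Hardy-Weinberg test per locus
ld_threshold = ALPHA / n_pairs     # one LD test per locus pair


def significant_after_bonferroni(p_values, n_tests, alpha=ALPHA):
    """Return the p values that stay below alpha / n_tests."""
    return [p for p in p_values if p < alpha / n_tests]


print(n_pairs, round(hwe_threshold, 5), round(ld_threshold, 5))  # 210 0.00238 0.00024

# Hypothetical p values for illustration only.
print(significant_after_bonferroni([0.03, 0.004, 0.0001], n_pairs))  # [0.0001]
```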

Population differentiation between the Guanzhong Han and the 10 previously published groups was assessed by AMOVA based on the allelic frequencies of the 21 STR loci. As shown in Table 2, the Guanzhong Han population differed significantly from the Uigur group at 9 loci, and from the Yi, Tibetan, Kazak, Russian and Salar groups at 3, 3, 2, 1 and 1 STR loci, respectively (after Bonferroni correction; p < 0.00238). No significant difference was observed between the Guanzhong Han population and the Ningxia Han, Bai, Tujia and Mongolian groups. Nine loci (D11S4463, D14S1434, D18S853, D19S433, D20S482, D2S1776, D4S2408, D6S1017 and D9S1122) showed no significant difference between the Guanzhong Han and any reference group. By contrast, as many as 4 reference groups differed significantly from the Guanzhong Han population at D22S1045, and 2 groups did so at each of D12ATA63, D1GATA113, D3S4529 and D5S2500. These loci therefore show higher population differentiation and are appropriate for inter-population comparison.

Table 2 Pairwise Fst and associated p values of 21 STR loci between Chinese Guanzhong Han population and 10 reference populations

The DA distances based on the 10 loci between the Guanzhong Han and the 10 reference groups are shown in Table 3. The largest DA distance (0.0337) was observed between the Guanzhong Han and the Yi group, followed by the Russian (0.0281) and Salar (0.0264) groups, whereas the smallest distance was found with the Ningxia Han population (0.0073), followed by the Tujia (0.0077) and Bai (0.0091) groups. The DA distances between the Guanzhong Han and the Kazak, Tibetan, Mongolian and Uigur groups were 0.0126, 0.0133, 0.0141 and 0.0153, respectively. The DA distances thus indicated a closer relationship between the Guanzhong Han and the Ningxia Han, Tujia and Bai populations. In addition, the population differentiation between the Guanzhong Han and the reference groups inferred from the overall Fst values, based on all 21 loci by the AMOVA method, was broadly in line with that inferred from the DA distances.
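
For readers unfamiliar with the DA measure, the following minimal sketch assumes the usual definition by Nei and colleagues: for each locus, one minus the sum over alleles of the square root of the product of the two populations' allele frequencies, averaged over loci. It is not the DISPAN implementation, and the function name, locus names and frequencies are hypothetical.

```python
from math import sqrt


def nei_da(pop_x, pop_y):
    """DA distance between two populations.

    Each population maps a locus name to a dict of {allele: frequency}.
    """
    loci = sorted(pop_x.keys() & pop_y.keys())
    total = 0.0
    for locus in loci:
        fx, fy = pop_x[locus], pop_y[locus]
        # Alleles missing from either population contribute sqrt(0) = 0.
        overlap = sum(sqrt(fx[a] * fy[a]) for a in fx.keys() & fy.keys())
        total += 1 - overlap
    return total / len(loci)


# Hypothetical frequencies for two made-up loci, not values from Table 1.
pop_a = {"locus1": {10: 0.5, 11: 0.5}, "locus2": {7: 1.0}}
pop_b = {"locus1": {10: 0.5, 12: 0.5}, "locus2": {7: 1.0}}
print(nei_da(pop_a, pop_a))  # 0.0
print(nei_da(pop_a, pop_b))  # 0.25
```

Identical populations give a distance of 0 and populations sharing no alleles give 1, so the values in Table 3 (all below 0.04) correspond to heavily overlapping allele frequency distributions.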

Table 3 The DA distances between Guanzhong Han population and other groups based on 10 STR loci

The phylogenetic tree constructed with MEGA v5 based on DA distances is shown in Figure 2A. Three clusters were observed: the Guanzhong Han, Ningxia Han, Tibetan, Tujia and Bai groups shared the same clade; the Yi, Russian, Salar and Mongolian groups formed a second branch; and the remaining Uigur and Kazak groups clustered together. To further confirm the phylogenetic relationships, a tree was also constructed using PHYLIP v3.6 based on the allelic frequencies of the 21 STR loci (Figure 2B). The two trees were very similar, the only exception being the position of the Tibetan group. This exception may be due to the different numbers of STR loci used (10 versus 21).

Figure 2

Phylogenetic trees for the Guanzhong Han and 10 reference populations constructed with MEGA v5 based on DA distances (A) and with PHYLIP v3.6 based on allelic frequencies (B).

As shown in Figure 3, in the PCA plot of the 11 groups the first two components accounted for 29.92% and 16.37% of the variance, respectively, and together explained 46.29%. The Guanzhong Han population clustered closest to the Ningxia Han population, and then to the Tujia and Bai groups, consistent with the phylogenetic trees above. The genetic evidence in our study showed that the Guanzhong Han population was more closely related to the Ningxia Han, Tujia and Bai populations than to the other 7 groups. This result is broadly consistent with the previous HLA-based result described by Shen et al.5. To further understand their genetic relationships and ancestry information, more genetic markers, such as SNPs and insertion/deletion polymorphisms, should be analyzed in future.
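
The "percentage of variance explained" figures can be understood through the sketch below, which treats each population as a row of allele frequencies and reports the share of total variance carried by the leading components. It is a pure-Python stand-in that uses power iteration with deflation, not the MATLAB routine used in the study; the function name, its parameters and the toy frequencies are hypothetical, and it assumes the populations are not all identical (non-zero total variance).

```python
import math
import random


def explained_variance(rows, n_components=2, n_iter=1000, seed=0):
    """Fraction of total variance carried by the leading principal components.

    rows: one equal-length list of allele frequencies per population.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(m)]
    centered = [[row[j] - means[j] for j in range(m)] for row in rows]
    # Sample covariance matrix of the allele-frequency columns (m x m).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(m)]
           for i in range(m)]
    total = sum(cov[i][i] for i in range(m))

    rng = random.Random(seed)
    shares = []
    for _ in range(n_components):
        # Power iteration from a random start converges to the top eigenvector.
        v = [rng.uniform(-1.0, 1.0) for _ in range(m)]
        for _ in range(n_iter):
            w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(t * t for t in w))
            if norm < 1e-15:  # no variance left to capture
                break
            v = [t / norm for t in w]
        v_sq = sum(t * t for t in v)
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        eigenvalue = sum(a * b for a, b in zip(v, w)) / v_sq  # Rayleigh quotient
        shares.append(eigenvalue / total)
        # Deflate: remove the component just found before the next pass.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] / v_sq for j in range(m)]
               for i in range(m)]
    return shares


# Hypothetical frequencies of two alleles at one locus in three populations.
toy = [[0.1, 0.9], [0.2, 0.8], [0.4, 0.6]]
print(explained_variance(toy))
```

In this toy case the two columns always sum to 1, so the first component should carry essentially all of the variance. With 11 populations and many alleles, as in Figure 3, the variance is spread over more components, which is why the first two together explain less than half.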

Figure 3

Principal component analysis plot based on the allelic frequencies of 21 STR loci in 11 populations.

Conclusions

In conclusion, we presented genetic data for the Guanzhong Han population at 21 STR loci. These loci showed a high level of genetic polymorphism and are suited to forensic application in the Guanzhong Han population. The population comparison showed that, among the populations tested, the Guanzhong Han had a close genetic relationship with the Ningxia Han, Tujia and Bai populations.

References

  1. Chu, J. Y. et al. Genetic relationship of populations in China. Proc. Natl. Acad. Sci. USA 95, 11763–11768 (1998).

  2. Chen, J. et al. Genetic structure of the Han Chinese population revealed by genome-wide SNP variation. Am. J. Hum. Genet. 85, 775–785 (2009).

  3. Qin, P. et al. A panel of ancestry informative markers to estimate and correct potential effects of population stratification in Han Chinese. Eur. J. Hum. Genet. 22, 248–253 (2014).

  4. Xu, S. et al. Genomic dissection of population substructure of Han Chinese and its implication in association studies. Am. J. Hum. Genet. 85, 762–774 (2009).

  5. Shen, C. M. et al. Allelic diversity and haplotype structure of HLA loci in the Chinese Han population living in the Guanzhong region of the Shaanxi province. Hum. Immunol. 71, 627–633 (2010).

  6. Gao, Y. et al. Structural polymorphism analysis of Chinese Mongolian ethnic group revealed by a new STR panel: Genetic relationship to other groups. Electrophoresis 35, 2008–2013 (2014).

  7. Shen, C. M. et al. Allelic polymorphic investigation of 21 autosomal short tandem repeat loci in a Chinese Bai ethnic group. Leg. Med. (Tokyo) 15, 109–113 (2013).

  8. Yuan, J. Y. et al. Genetic profile characterization and population study of 21 autosomal STR in Chinese Kazak ethnic minority group. Electrophoresis 35, 503–510 (2014).

  9. Wang, H. D. et al. Allelic diversity distributions of 21 new autosomal short tandem repeat loci in Chinese Ningxia Han population. Forensic Sci. Int. Genet. 7, 78–79 (2013).

  10. Wang, H. D. et al. Allelic frequency distributions of 21 non-combined DNA index system STR loci in a Russian ethnic minority group from Inner Mongolia, China. J. Zhejiang Univ. Sci. B 14, 533–540 (2013).

  11. Zhu, B. F. et al. Genetic diversities of 21 non-CODIS autosomal STRs of a Chinese Tibetan ethnic minority group in Lhasa. Int. J. Legal Med. 125, 581–585 (2011).

  12. Yuan, G. L. et al. Genetic data provided by 21 autosomal STR loci from Chinese Tujia ethnic group. Mol. Biol. Rep. 39, 10265–10271 (2012).

  13. Deng, Y. J. et al. Polymorphic analysis of 21 new STR loci in Chinese Uigur group. Forensic Sci. Int. Genet. 7, e97–e98 (2013).

  14. Zhu, B. F. et al. Population genetics and forensic efficiency of twenty-one novel microsatellite loci of Chinese Yi ethnic group. Electrophoresis 34, 3345–3351 (2013).

  15. Walsh, P. S., Metzger, D. A. & Higuchi, R. Chelex 100 as a medium for simple extraction of DNA for PCR-based typing from forensic material. Biotechniques 10, 506–513 (1991).

  16. Teng, Y. et al. Genetic variation of new 21 autosomal short tandem repeat loci in a Chinese Salar ethnic group. Mol. Biol. Rep. 39, 1465–1470 (2012).

  17. Tereba, A. Tools for Analysis of Population Statistics. Profiles in DNA 2, 14–16 (1999).

  18. Ota, T. DISPAN: genetic distance and phylogenetic analysis. (Pennsylvania State Univ. 1993).

  19. Reich, D. E. et al. Linkage disequilibrium in the human genome. Nature 411, 199–204 (2001).

Acknowledgements

This project was supported by the National Natural Science Foundation of China (NSFC, No. 81373248, 81471824), the Program for New Century Excellent Talents of the Ministry of Education, China (NECT-10-0687), and the Fundamental Research Funds for the Central University, China (2011jdgz20).

Author information

Contributions

B.Z. and Y.Z. wrote the main manuscript text; X.T., H.M., W.L., H.W., G.Y., R.J. and C.Y. processed the data and revised the manuscript; and J.Y. and C.S. prepared the figures. All authors reviewed the manuscript.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Rights and permissions

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder in order to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Cite this article

Zhang, YD., Tang, XL., Meng, HT. et al. Genetic Variability and Phylogenetic Analysis of Han Population from Guanzhong Region of China based on 21 non-CODIS STR Loci. Sci Rep 5, 8872 (2015). https://doi.org/10.1038/srep08872
