Article | Published:

A genome-wide characterization of copy number variations in native populations of Peninsular Malaysia

European Journal of Human Genetics volume 26, pages 886–897 (2018)


Copy number variations (CNVs) are genomic structural variations that result from the deletion or duplication of large genomic segments. CNVs remain poorly characterized, particularly in indigenous populations such as the Orang Asli of Peninsular Malaysia. In the present study, we first characterized the genome-wide CNVs of four major native populations from Peninsular Malaysia: the Malays and three Orang Asli populations, namely Proto-Malay, Senoi, and Negrito (collectively called PM). We subsequently assessed the distribution of CNVs across the four populations. The resulting global CNV map revealed 3102 CNVs, with an average of more than 100 CNVs per individual. We identified genes harboring CNVs that are highly differentiated between PM and global populations. These genes are predominantly enriched in immune response and defense functions, including APOBEC3A_B, beta-defensin genes, and CCL3L1, followed by other biological functions such as drug and toxin metabolism and responses to radiation. This suggests that some CNVs may be linked to adaptation of the PM groups to the local environmental conditions of tropical rainforests.
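As an illustration of what "highly differentiated" can mean for a CNV, the sketch below computes V_ST, a commonly used variance-based measure of population differentiation for copy number, defined as (V_T - V_S) / V_T, where V_T is the variance across all individuals and V_S is the size-weighted average of the within-population variances. The abstract does not state which statistic the study used, so this is an assumption made for illustration only. The function name `v_st`, the population labels, and the copy-number values are hypothetical, and integer copy numbers are used in place of intensity ratios for simplicity.

```python
from statistics import pvariance


def v_st(copy_numbers_by_pop):
    """Return V_ST = (V_T - V_S) / V_T for a single CNV locus.

    copy_numbers_by_pop maps a population label to a list of
    per-individual copy numbers at that locus (hypothetical input format).
    """
    # Pool every individual to get the total variance V_T.
    all_values = [cn for values in copy_numbers_by_pop.values() for cn in values]
    total_var = pvariance(all_values)
    if total_var == 0:
        # No variation at all: the locus cannot differentiate populations.
        return 0.0

    # V_S: within-population variances, weighted by population size.
    n_total = len(all_values)
    within_var = sum(
        len(values) * pvariance(values)
        for values in copy_numbers_by_pop.values()
    ) / n_total

    return (total_var - within_var) / total_var


# Hypothetical copy numbers: completely separated populations versus
# populations with identical distributions.
fixed_difference = {"pop_A": [2, 2, 2, 2], "pop_B": [4, 4, 4, 4]}
no_difference = {"pop_A": [2, 3], "pop_B": [2, 3]}

print(v_st(fixed_difference))
print(v_st(no_difference))
```

Values near 1 indicate that almost all copy-number variance lies between populations, while values near 0 indicate that the populations share similar copy-number distributions.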

Additional information

These authors contributed equally: Ruiqing Fu, Boon-Peng Hoh.



We thank the Department of Orang Asli Development (JAKOA) and especially all subjects who voluntarily participated in this study. SX acknowledges financial support from the Strategic Priority Research Program (XDB13040100) and Key Research Program of Frontier Sciences (QYZDJ-SSW-SYS009) of the Chinese Academy of Sciences (CAS), the National Natural Science Foundation of China (NSFC) grant (91331204, 91731303, 31771388, and 31711530221), the National Science Fund for Distinguished Young Scholars (31525014), the National Key Research and Development Program (2016YFC0906403), and the Program of Shanghai Academic Research Leader (16XD1404700). B-PH acknowledges the Chinese Academy of Sciences President’s International Fellowship Initiatives (2017VBA0008) awarded to him. This study is also supported by the Ministry of Science, Technology and Innovation (MOSTI) grant erBiotek Grant #100-RM/BIOTEK 16/6/2 B (1/2011) and [100- RMI/GOV 16/6/2 (19/2011)] awarded to B-PH and MEP. SX is a Max-Planck Independent Research Group Leader and a member of the CAS Youth Innovation Promotion Association. SX also gratefully acknowledges the support of the National Program for Top-notch Young Innovative Talents of The “Wanren Jihua” Project. We thank LetPub for providing linguistic assistance during the preparation of this manuscript. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Author notes


    1. Chinese Academy of Sciences (CAS), Key Laboratory of Computational Biology, Max Planck Independent Research Group on Population Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, 200031, China

      • Ruiqing Fu
      • Boon-Peng Hoh
      • Shuhua Xu
    2. University of Chinese Academy of Sciences, Beijing, 100049, China

      • Ruiqing Fu
      • Shuhua Xu
    3. Institute of Medical Molecular Biotechnology, Faculty of Medicine, Universiti Teknologi MARA, Sungai Buloh Campus, Selangor, Malaysia

      • Siti Shuhada Mokhtar
    4. School of Medicine, Monash University Sunway Campus, Petaling Jaya, Malaysia

      • Maude Elvira Phipps
    5. Faculty of Medicine and Health Sciences, UCSI University, Jalan Menara Gading, Taman Connaught, Cheras, Kuala Lumpur, Malaysia

      • Boon-Peng Hoh
    6. School of Life Science and Technology, ShanghaiTech University, Shanghai, 201210, China

      • Shuhua Xu
    7. Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, 650223, China

      • Shuhua Xu


    Conflict of interest

    The authors declare that they have no conflict of interest.

    Corresponding author

    Correspondence to Shuhua Xu.
