Genotype, haplotype and copy-number variation in worldwide human populations

Jakobsson, Mattias; Scholz, Sonja W.; Scheet, Paul; Gibbs, J. Raphael; VanLiere, Jenna M.; Fung, Hon-Chung; Szpiech, Zachary A.; Degnan, James H.; Wang, Kai; Guerreiro, Rita; Bras, Jose M.; Schymick, Jennifer C.; Hernandez, Dena G.; Traynor, Bryan J.; Simon-Sanchez, Javier; Matarin, Mar; Britton, Angela; van de Leemput, Joyce; Rafferty, Ian; Bucan, Maja; Cann, Howard M.; Hardy, John A.; Rosenberg, Noah A.; Singleton, Andrew B.

doi:10.1038/nature06742

Letter
Published: 21 February 2008

Nature volume 451, pages 998–1003 (2008)

Abstract

Genome-wide patterns of variation across individuals provide a powerful source of data for uncovering the history of migration, range expansion, and adaptation of the human species. However, high-resolution surveys of variation in genotype, haplotype and copy number have generally focused on a small number of population groups^1,2,3. Here we report the analysis of high-quality genotypes at 525,910 single-nucleotide polymorphisms (SNPs) and 396 copy-number-variable loci in a worldwide sample of 29 populations. Analysis of SNP genotypes yields strongly supported fine-scale inferences about population structure. Increasing linkage disequilibrium is observed with increasing geographic distance from Africa, as expected under a serial founder effect for the out-of-Africa spread of human populations. New approaches for haplotype analysis produce inferences about population structure that complement results based on unphased SNPs. Despite a difference from SNPs in the frequency spectrum of the copy-number variants (CNVs) detected—including a comparatively large number of CNVs in previously unexamined populations from Oceania and the Americas—the global distribution of CNVs largely accords with population structure analyses for SNP data sets of similar size. Our results produce new inferences about inter-population variation, support the utility of CNVs in human population-genetic research, and serve as a genomic resource for human-genetic studies in diverse worldwide populations.
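As an illustration of the linkage disequilibrium (LD) quantity discussed in the abstract, the following is a minimal sketch of the widely used r² statistic between two biallelic SNPs, computed from phased haplotypes. The abstract does not state which LD statistic or software the authors used, so this should not be read as their method; the function name `r_squared` and the 0/1 allele coding are assumptions made for this example.

```python
from typing import Sequence, Tuple


def r_squared(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """Return r^2 between two biallelic SNPs from phased haplotypes.

    Each haplotype is a pair (a, b) of 0/1 allele codes at SNP A and SNP B
    (hypothetical coding chosen for this sketch).
    """
    n = len(haplotypes)
    # Frequency of allele 1 at each SNP.
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    # Frequency of the haplotype carrying allele 1 at both SNPs.
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    # D is the deviation of the joint frequency from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


if __name__ == "__main__":
    linked = [(0, 0), (0, 0), (1, 1), (1, 1)]       # alleles always co-occur
    unlinked = [(0, 0), (0, 1), (1, 0), (1, 1)]     # alleles independent
    print(r_squared(linked), r_squared(unlinked))
```

Under a serial founder model, one would expect values such as this, averaged over SNP pairs at a fixed physical distance, to tend to be higher in populations sampled farther from Africa.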

Figure 1: **SNP, haplotype, and copy-number variation across populations.**

Figure 2: **Genetic distance and linkage disequilibrium.**

Figure 3: **Haplotype cluster frequencies for 156 consecutive SNPs on chromosome 2 in the region surrounding the** ***LCT*** **gene (136.373–136.478 megabases).**

Figure 4: **CNVs across populations, based on 3,552 CNVs at 1,428 copy-number-variable loci.**

Accession codes

Primary accessions

Gene Expression Omnibus

GSE10331

Data deposits

The array data described in this paper are deposited in the Gene Expression Omnibus (http://www.ncbi.nlm.nih.gov/geo) under accession number GSE10331.

References

1. The International Haplotype Map Consortium. A haplotype map of the human genome. Nature 437, 1299–1320 (2005)
2. Hinds, D. A. et al. Whole-genome patterns of common DNA variation in three human populations. Science 307, 1072–1079 (2005)
3. Redon, R. et al. Global variation in copy number in the human genome. Nature 444, 444–454 (2006)
4. Cann, H. M. et al. A human genome diversity cell line panel. Science 296, 261–262 (2002)
5. Kalinowski, S. T. Counting alleles with rarefaction: private alleles and hierarchical sampling designs. Conserv. Genet. 5, 539–543 (2004)
6. Falush, D., Stephens, M. & Pritchard, J. K. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164, 1567–1587 (2003)
7. Bastos-Rodrigues, L., Pimenta, J. R. & Pena, S. D. J. The genetic structure of human populations studied through short insertion–deletion polymorphisms. Ann. Hum. Genet. 70, 658–665 (2006)
8. Rosenberg, N. A. et al. Clines, clusters, and the effect of study design on the inference of human population structure. PLoS Genet. 1, e70 (2005)
9. Rosenberg, N. A. et al. Genetic structure of human populations. Science 298, 2381–2385 (2002)
10. Lawson Handley, L. J., Manica, A., Goudet, J. & Balloux, F. Going the distance: human population genetics in a clinal world. Trends Genet. 23, 432–439 (2007)
11. Ramachandran, S. et al. Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. Proc. Natl Acad. Sci. USA 102, 15942–15947 (2005)
12. Sabatti, C. & Risch, N. Homozygosity and linkage disequilibrium. Genetics 160, 1707–1719 (2002)
13. Conrad, D. F. et al. A worldwide survey of haplotype variation and linkage disequilibrium in the human genome. Nature Genet. 38, 1251–1260 (2006)
14. Gabriel, S. B. et al. The structure of haplotype blocks in the human genome. Science 296, 2225–2229 (2002)
15. Reich, D. E. et al. Linkage disequilibrium in the human genome. Nature 411, 199–204 (2001)
16. Tishkoff, S. A. & Kidd, K. K. Implications of biogeography of human populations for ‘race’ and medicine. Nature Genet. 36, S21–S27 (2004)
17. McVean, G. A. T. A genealogical interpretation of linkage disequilibrium. Genetics 162, 987–991 (2002)
18. Bersaglieri, T. et al. Genetic signatures of strong recent positive selection at the lactase gene. Am. J. Hum. Genet. 74, 1111–1120 (2004)
19. Tishkoff, S. A. et al. Convergent adaptation of human lactase persistence in Africa and Europe. Nature Genet. 39, 31–40 (2007)
20. Wang, K. et al. PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res. 17, 1665–1674 (2007)
21. Wong, K. K. et al. A comprehensive analysis of common copy-number variations in the human genome. Am. J. Hum. Genet. 80, 91–104 (2007)
22. Locke, D. P. et al. Linkage disequilibrium and heritability of copy-number polymorphisms within duplicated regions of the human genome. Am. J. Hum. Genet. 79, 275–290 (2006)
23. Sharp, A. J. et al. Segmental duplications and copy-number variation in the human genome. Am. J. Hum. Genet. 77, 78–88 (2005)
24. Scherer, S. W. et al. Challenges and standards in integrating surveys of structural variation. Nature Genet. 39, S7–S15 (2007)
25. Servin, B. & Stephens, M. Imputation-based analysis of association studies: candidate regions and quantitative traits. PLoS Genet. 3, e114 (2007)
26. Need, A. C. & Goldstein, D. B. Genome-wide tagging for everyone. Nature Genet. 38, 1227–1228 (2006)
27. Eberle, M. A., Rieder, M. J., Kruglyak, L. & Nickerson, D. A. Allele frequency matching between SNPs reveals an excess of linkage disequilibrium in genic regions of the human genome. PLoS Genet. 2, e142 (2006)
28. Scheet, P. & Stephens, M. A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am. J. Hum. Genet. 78, 629–644 (2006)
29. Jakobsson, M. & Rosenberg, N. A. CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics 23, 1801–1806 (2007)
30. Zhang, J., Feuk, L., Duggan, G. E., Khaja, R. & Scherer, S. W. Development of bioinformatics resources for display and analysis of copy number and other structural variants in the human genome. Cytogenet. Genome Res. 115, 205–214 (2006)

Acknowledgements

We thank the Biological Resource Center at the Fondation Jean Dausset – CEPH for preparing HGDP–CEPH diversity panel DNA samples, and S. Chanock and A. Hutchinson for assistance with the DNAs. This work was supported in part by NIH grants, by a postdoctoral fellowship from the University of Michigan Center for Genetics in Health and Medicine, by grants from the Alfred P. Sloan Foundation and the Burroughs Wellcome Fund, by the National Center for Minority Health and Health Disparities, and by the Intramural Program of the National Institute on Aging. The study used the Biowulf Linux cluster at the National Institutes of Health (http://biowulf.nih.gov).

Author Contributions

N.A.R. and A.B.S. wish to be regarded as joint last authors.

Author information

Mattias Jakobsson, Sonja W. Scholz and Paul Scheet: These authors contributed equally to this work.

Authors and Affiliations

Center for Computational Medicine and Biology, University of Michigan, Ann Arbor, Michigan 48109, USA
Mattias Jakobsson, Paul Scheet, Jenna M. VanLiere, Zachary A. Szpiech, James H. Degnan & Noah A. Rosenberg
Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
Mattias Jakobsson, James H. Degnan & Noah A. Rosenberg
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, USA
Paul Scheet & Noah A. Rosenberg
Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, Maryland 20892, USA
Sonja W. Scholz, J. Raphael Gibbs, Hon-Chung Fung, Rita Guerreiro, Jose M. Bras, Jennifer C. Schymick, Dena G. Hernandez, Bryan J. Traynor, Javier Simon-Sanchez, Mar Matarin, Angela Britton, Joyce van de Leemput, Ian Rafferty & Andrew B. Singleton
Department of Molecular Neuroscience and Reta Lila Weston Institute of Neurological Studies, Institute of Neurology, University College London, Queen Square, London WC1N 3BG, UK
Sonja W. Scholz, J. Raphael Gibbs, Joyce van de Leemput & John A. Hardy
Department of Neurology, Chang Gung Memorial Hospital and College of Medicine, Chang Gung University, Taipei 10591, Taiwan
Hon-Chung Fung
Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
Kai Wang & Maja Bucan
Center for Neurosciences and Cell Biology, Faculty of Medicine, University of Coimbra, 3004-504 Coimbra, Portugal
Rita Guerreiro & Jose M. Bras
Department of Clinical Neurology, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DU, UK
Jennifer C. Schymick
Neurogenetics Branch, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland 20892, USA
Bryan J. Traynor
Departamento de Genómica y Proteómica, Unidad de Genética Molecular, Instituto de Biomedicina de Valencia-CSIC, 46010 Valencia, Spain
Javier Simon-Sanchez
Fondation Jean Dausset – Centre d’Étude du Polymorphisme Humain (CEPH), 27 rue Juliette Dodu, 75010 Paris, France
Howard M. Cann
Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia 22908, USA
Andrew B. Singleton

Corresponding authors

Correspondence to Noah A. Rosenberg or Andrew B. Singleton.

Supplementary information

Supplementary Information

This file contains extensive Supplementary Information with Supplementary Notes, Supplementary Data, Supplementary Tables S1-S17, Supplementary Figures S1-S30 with Legends and additional references. (PDF 10195 kb)

About this article

Cite this article

Jakobsson, M., Scholz, S., Scheet, P. et al. Genotype, haplotype and copy-number variation in worldwide human populations. Nature 451, 998–1003 (2008). https://doi.org/10.1038/nature06742

Received: 02 December 2007
Accepted: 29 January 2008
Issue Date: 21 February 2008
DOI: https://doi.org/10.1038/nature06742
