Integrating common and rare genetic variation in diverse human populations


Despite great progress in identifying genetic variants that influence human disease, most inherited risk remains unexplained. A more complete understanding requires genome-wide studies that fully examine less common alleles in populations with a wide range of ancestry. To inform the design and interpretation of such studies, we genotyped 1.6 million common single nucleotide polymorphisms (SNPs) in 1,184 reference individuals from 11 global populations, and sequenced ten 100-kilobase regions in 692 of these individuals. This integrated data set of common and rare alleles, called ‘HapMap 3’, includes both SNPs and copy number polymorphisms (CNPs). We characterized population-specific differences among low-frequency variants, measured the improvement in imputation accuracy afforded by the larger reference panel, especially in imputing SNPs with a minor allele frequency of ≤5%, and demonstrated the feasibility of imputing newly discovered CNPs and SNPs. This expanded public resource of genome variants in global populations supports deeper interrogation of genomic variation and its role in human disease, and serves as a step towards a high-resolution map of the landscape of human genetic variation.
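The abstract singles out SNPs with a minor allele frequency (MAF) of ≤5% as the class where the larger reference panel helps imputation most. As a minimal sketch of how that threshold might be applied to genotype calls, the snippet below computes the MAF of one biallelic SNP and flags it as low-frequency. The function names, the two-character genotype strings, and the "NN" code for a missing call are assumptions made for illustration; they are not taken from the HapMap 3 pipeline.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of one biallelic SNP.

    `genotypes` is a list of two-character strings such as "AG".
    Missing calls are encoded as "NN" (an assumption of this sketch).
    """
    # Count every allele observed in the called genotypes.
    counts = Counter(
        allele
        for genotype in genotypes
        if genotype != "NN"
        for allele in genotype
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no called genotypes")
    if len(counts) > 2:
        raise ValueError("expected a biallelic SNP")
    # A monomorphic site carries a single allele, so its MAF is 0.
    if len(counts) < 2:
        return 0.0
    return min(counts.values()) / total


def is_low_frequency(genotypes, threshold=0.05):
    """True when the SNP's MAF is at or below the threshold (5% by default)."""
    return minor_allele_frequency(genotypes) <= threshold


# Hypothetical sample: 18 homozygotes and 2 heterozygotes, so 2 of 40 alleles are G.
sample = ["AA"] * 18 + ["AG"] * 2
print(minor_allele_frequency(sample), is_low_frequency(sample))
```

Counting alleles rather than individuals matters here: each diploid sample contributes two alleles, so two heterozygotes among twenty people give a frequency of 2/40, not 2/20.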


Figure 1: Size and frequency spectra of common and rare CNPs.
Figure 2: SNP discovery informativeness across populations.
Figure 3: Effect of sample size on SNP ascertainment.
Figure 4: Haplotype sharing around SNPs and CNPs.
Figure 5: Imputation accuracy and reference panel size.
Figure 6: Imputation: new populations, new variants.





We dedicate this work to Leena Peltonen for her vital leadership role in this study, and in memory of a valued friend and colleague. We thank E. Boerwinkle and R. Durbin for critical reading of the manuscript. We thank the USA National Institutes of Health, the National Human Genome Research Institute, the National Institute on Deafness and Other Communication Disorders and the Wellcome Trust for supporting the majority of this work. Funding was also provided by the Louis-Jeantet Foundation and the NCCR ‘Frontiers in Genetics’ (Swiss National Science Foundation). We thank the people from the following communities who were generous in donating their blood samples to be studied in this project: the Yoruba in Ibadan, Nigeria; the Maasai in Kinyawa, Kenya; the Luhya in Webuye, Kenya; the Han Chinese in Beijing, China; the Japanese in Tokyo, Japan; the Chinese in metropolitan Denver, Colorado; the Gujarati Indians in Houston, Texas; the Toscani in Italia; the community of African ancestry in the southwestern USA; and the community of Mexican ancestry in Los Angeles, California. We also thank the people in the Utah Centre d’Etude du Polymorphisme Humain community who allowed the samples they donated earlier to be used for the project. The authors acknowledge use of DNA from the 1958 British birth cohort collection, funded by the UK Medical Research Council grant G0000934 and the Wellcome Trust grant 068545/Z/02. The Illumina 550K genotype data for the 1958 British birth cohort samples were made available by the Sanger Institute. For the 1958 British birth cohort Affymetrix 500K genotype data, we thank the Wellcome Trust Case Control Consortium, which was funded by Wellcome Trust award 076113.

Author information


Corresponding authors

Correspondence to David M. Altshuler or Richard A. Gibbs.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Additional information

The HapMap 3/ENCODE 3 data set has been deposited. The sequence traces of ENCODE 3 can be accessed by submitting the query: species_code = “HOMO SAPIENS” and CENTER_NAME = “BCM” and CENTER_PROJECT = “RHIAY”.

A list of participants and their affiliations appears at the end of the paper

Supplementary information

Supplementary Information

This file contains Supplementary Information comprising Introduction, Large Scale Genotyping, Rare Allele Calling Bias, Deep PCR Sequencing, Copy Number Polymorphism (CNP) Analysis, Population Analyses, Recurrent SNPs and Haplotype sharing (see Contents page for full details). It also includes Supplementary Tables 1-11, Supplementary Figures 1-9 with legends and additional references. (PDF 1111 kb)


About this article

Cite this article

The International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010).
