Abstract
Despite great progress in identifying genetic variants that influence human disease, most inherited risk remains unexplained. A more complete understanding requires genome-wide studies that fully examine less common alleles in populations with a wide range of ancestry. To inform the design and interpretation of such studies, we genotyped 1.6 million common single nucleotide polymorphisms (SNPs) in 1,184 reference individuals from 11 global populations, and sequenced ten 100-kilobase regions in 692 of these individuals. This integrated data set of common and rare alleles, called ‘HapMap 3’, includes both SNPs and copy number polymorphisms (CNPs). We characterized population-specific differences among low-frequency variants, measured the improvement in imputation accuracy afforded by the larger reference panel, especially in imputing SNPs with a minor allele frequency of ≤5%, and demonstrated the feasibility of imputing newly discovered CNPs and SNPs. This expanded public resource of genome variants in global populations supports deeper interrogation of genomic variation and its role in human disease, and serves as a step towards a high-resolution map of the landscape of human genetic variation.
Acknowledgements
We dedicate this work to Leena Peltonen for her vital leadership role in this study, and in memory of a valued friend and colleague. We thank E. Boerwinkle and R. Durbin for critical reading of the manuscript. We thank the USA National Institutes of Health, the National Human Genome Research Institute, the National Institute on Deafness and Other Communication Disorders and the Wellcome Trust for supporting the majority of this work. Funding was also provided by the Louis-Jeantet Foundation and the NCCR ‘Frontiers in Genetics’ (Swiss National Science Foundation). We thank the people from the following communities who were generous in donating their blood samples to be studied in this project: the Yoruba in Ibadan, Nigeria; the Maasai in Kinyawa, Kenya; the Luhya in Webuye, Kenya; the Han Chinese in Beijing, China; the Japanese in Tokyo, Japan; the Chinese in metropolitan Denver, Colorado; the Gujarati Indians in Houston, Texas; the Toscani in Italia; the community of African ancestry in the southwestern USA; and the community of Mexican ancestry in Los Angeles, California. We also thank the people in the Utah Centre d’Etude du Polymorphisme Humain community who allowed the samples they donated earlier to be used for the project. The authors acknowledge use of DNA from the 1958 British birth cohort collection, funded by the UK Medical Research Council grant G0000934 and the Wellcome Trust grant 068545/Z/02. The Illumina 550K genotype data for the 1958 British birth cohort samples were made available by the Sanger Institute. For the 1958 British birth cohort Affymetrix 500K genotype data, we thank the Wellcome Trust Case Control Consortium (http://www.wtccc.org.uk), which was funded by Wellcome Trust award 076113.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Additional information
The HapMap 3/ENCODE 3 data set has been deposited at http://www.hapmap.org. The sequence traces of ENCODE 3 can be accessed at http://www.ncbi.nlm.nih.gov/Traces/trace.cgi by submitting the query: species_code = “HOMO SAPIENS” and CENTER_NAME = “BCM” and CENTER_PROJECT = “RHIAY”.
A list of participants and their affiliations appears at the end of the paper
Supplementary information
Supplementary Information
This file contains Supplementary Information comprising Introduction, Large Scale Genotyping, Rare Allele Calling Bias, Deep PCR Sequencing, Copy Number Polymorphism (CNP) Analysis, Population Analyses, Recurrent SNPs and Haplotype sharing (see Contents page for full details). It also includes Supplementary Tables 1-11, Supplementary Figures 1-9 with legends and additional references. (PDF 1111 kb)
About this article
Cite this article
The International HapMap 3 Consortium. Integrating common and rare genetic variation in diverse human populations. Nature 467, 52–58 (2010). https://doi.org/10.1038/nature09298
DOI: https://doi.org/10.1038/nature09298