An effort to sequence thousands of people’s genomes reaches the end of the beginning.
The 1000 Genomes Project
From One Genome to 1000 and Beyond in 25 Years
We celebrate the 25th anniversary of the launch of the Human Genome Project (HGP) with recognition for human genomics resources, from one genome to over one thousand genomes and beyond.
The 1000 Genomes Project began in 2007 with the goal of developing a comprehensive resource of human genetic variation across worldwide populations. Eight years later, we publish in this issue the final phase reports from this project, representing the most comprehensive assessment of human genetic variation across global populations to date. The already established datasets have, since their launch, provided a foundational open resource that has enabled a wealth of robust genetic associations to disease as well as many key insights into population history and evolution.
The final phase 1000 Genomes Project publications represent not only the completion of this project, but also the culmination of a series of international collaborations stemming from the HGP, including the International HapMap Project, all focused on establishing open reference catalogues of genetic variation as a resource to the community.
We are pleased to present this Nature Collection of all the primary publications and related news and commentary on the International HapMap and 1000 Genomes Projects.
- Orli Bahcall, Senior Editor, Nature
Podcast on the completion of the 1000 Genomes Project with Gil McVean and NHGRI Director Eric Green, covering what has been achieved in the twenty-five years since the start of the Human Genome Project.
News & Comment
The Human Genome Project, which launched a quarter of a century ago this week, still holds lessons for the consortium-based science it ushered in, say Eric D. Green, James D. Watson and Francis S. Collins.
In the final phase of a seven-year project, the genomes of 2,504 people across five continental regions have been sequenced. The result is a compendium of in-depth data on variation in human populations. See Articles p.68 & p.75
The 1000 Genomes Project has completed its pilot phase, sequencing the whole genomes of 179 individuals and characterizing all the protein-coding sequences of many others. Welcome to the third phase of human genomics. See Article p.1061
Medical genomics has focused almost entirely on those of European descent. Other ethnic groups must be studied to ensure that more people benefit, say Carlos D. Bustamante, Esteban González Burchard and Francisco M. De La Vega.
Humans are identical at most of the 3 billion base pairs in their genome, but focusing on the 300,000 to 600,000 key changes that separate us is already leading to genetic disease identification and treatments. These changes in single nucleotides often travel in blocks, or haplotypes. The International HapMap Project has mapped the haplotypes in several human populations based on over a million single changes, and the first fruits of their work are published in a landmark paper this week. The cover depicts part of the genome around the CFTR (cystic fibrosis) gene for three geographic populations. You can see clear breaks in the blocks of variation that each population shares within itself (dark colours) and regions of the genome that have changed more over time and not travelled in blocks (the intervening grey lines). The vast resource of genomic variation is freely available to the public via
The consortium that is mapping human haplotypes establishes some important principles of access and credit in this issue.
Single base differences between human genomes underlie differences in susceptibility to, or protection from, a host of diseases. Hence the great potential of such information in medicine.
The burgeoning commercial sector that is based on genome information poses a challenge to the norms of scientific publication. But it remains to be established that the conditions of access to published sequence data need to change.
A new effort to map human genetic variation should provide a shortcut for researchers trying to uncover the roots of disease. Carina Dennis profiles the 'HapMap' project.
Looking back over the past decade of human genomics, Francis Collins finds five key lessons for the future of personalized medicine — for technology, policy, partnerships and pharmacogenomics.
Results for the final phase of the 1000 Genomes Project are presented, including whole-genome sequencing, targeted exome sequencing, and genotyping on high-density SNP arrays for 2,504 individuals across 26 populations, providing a global reference data set to support biomedical genetics.
The Structural Variation Analysis Group of The 1000 Genomes Project reports an integrated structural variation map based on discovery and genotyping of eight major structural variation classes in 2,504 unrelated individuals from across 26 populations; structural variation is compared within and between populations and its functional impact is quantified.
This report from the 1000 Genomes Project describes the genomes of 1,092 individuals from 14 human populations, providing a resource for common and low-frequency variant analysis in individuals from diverse populations; hundreds of rare non-coding variants at conserved sites, such as motif-disrupting changes in transcription-factor-binding sites, can be found in each individual.
Harnessing information from whole genome sequencing in 185 individuals, this study generates a high-resolution map of copy number variants. Nucleotide resolution of the map facilitates analysis of structural variant distribution and identification of the mechanisms of their origin. The study provides a resource for sequence-based association studies.
The goal of the 1000 Genomes Project is to provide in-depth information on variation in human genome sequences. In the pilot phase reported here, different strategies for genome-wide sequencing, using high-throughput sequencing platforms, were developed and compared. The resulting data set includes more than 95% of the currently accessible variants found in any individual, and can be used to inform association and functional studies.
Integrating sequence and array data to create an improved 1000 Genomes Project haplotype reference panel
1000 Genomes imputation can increase the power of genome-wide association studies to detect genetic variants associated with human traits and diseases. Here, the authors develop a method to integrate and analyse low-coverage sequence data and SNP array data, and show that it improves imputation performance.
Here, the analysis of 'HapMap 3' is reported — a public data set of genomic variants in human populations. The resource integrates common and rare single nucleotide polymorphisms (SNPs) and copy number polymorphisms (CNPs) from 11 global populations, providing insights into population-specific differences among variants. It also demonstrates the feasibility of imputing newly discovered rare SNPs and CNPs.
A consortium reports the tripling of the number of genetic markers in Phase II of the International HapMap Project. This map of human genetic variation will continue to revolutionize discovery of susceptibility loci in common genetic diseases, and study of genes under selection in humans.
Sabeti et al. build on their previous work on detecting selection on human genes, using the many more markers available in the Phase II HapMap project. Three examples of apparent population-specific selection based on geographic area are described, and how these may relate to human biology is discussed.