The number of people to have their genome fully sequenced has doubled, from two to four. Craig Venter and James Watson are now joined by two anonymous individuals — one Han Chinese and the other Nigerian — as reported in two studies that explore the potential of 'next-generation' technologies for individual genome sequencing.

The new wave of sequencing technologies is characterized by short reads, greatly increased throughput and reduced costs (see our recent Poster and Podcast for an overview of these technologies). One of these methods (the 454/Roche technology) was used in the sequencing of the Venter and Watson genomes. The two new individual genomes were sequenced with another commercially available technology, developed by Illumina, which generates millions of 35-nucleotide reads in a single sequencing cycle. Both studies used a combination of single reads and a paired-end read strategy, in which stretches of DNA that are a known distance apart are sequenced as a pair. The latter approach is invaluable for the assembly of some genomic regions. For both individuals, high-quality sequence was generated for more than 92% of the genome, with a coverage (that is, the number of reads per position) of greater than 30×.
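To give a sense of scale for the coverage figure above, the following minimal sketch estimates average coverage as total sequenced bases divided by genome length. The genome size of roughly 3 billion bases and the function names are assumptions made for illustration; neither comes from the studies, and the estimate ignores reads that fail quality filters or cannot be placed on the genome.

```python
# Rough coverage arithmetic for short-read sequencing.
# Assumption: a genome of about 3 billion bases (approximate, illustrative only).
GENOME_SIZE = 3_000_000_000


def mean_coverage(num_reads: int, read_length: int, genome_size: int = GENOME_SIZE) -> float:
    """Average number of reads covering each position, assuming every read is usable."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: int, read_length: int, genome_size: int = GENOME_SIZE) -> int:
    """Smallest number of reads that reaches the target average coverage."""
    total_bases = target_coverage * genome_size
    # Integer ceiling division avoids floating-point rounding on large counts.
    return -(-total_bases // read_length)


if __name__ == "__main__":
    n = reads_needed(30, 35)
    print(f"35-nucleotide reads needed for 30x: {n:,}")
    print(f"Coverage achieved: {mean_coverage(n, 35):.2f}x")
```

Under these assumptions, 30× coverage with 35-nucleotide reads requires on the order of billions of reads, which is why throughput matters so much for this class of technology.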

An important goal of individual genome sequencing is to provide a detailed picture of how genetic variants correlate with phenotypes, particularly those that are relevant to human health. Some envisage an era of 'personalized genomics', in which people will gain insights into their susceptibility to disease and likely response to treatment through a full picture of their genetic variation. With these goals in mind, sequencing accuracy and the comprehensive surveying of variation are paramount.

Both papers showed that the Illumina method produces highly accurate sequence information. In terms of variation, this technology proved to be well suited for the identification of SNPs; comparing SNP identification with that from array-based genotyping revealed a high level of accuracy, with very low (less than 1%) rates of both false positives and false negatives. The findings were similarly encouraging for short insertions and deletions (indels), and revealed a previously unrecognized amount of variation of this kind. A wide range of structural variants were also identified, and the studies highlight the potential of combining single and paired-end sequencing to gain a more comprehensive picture of structural variation.
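The comparison against array-based genotyping described above can be sketched as a simple concordance check. This is an illustrative sketch only, not the method used in either paper: it assumes the array calls are treated as the truth set, that each site is reduced to a yes/no "carries a variant" flag, and that only sites assayed by both platforms are compared. The function name and the toy data are hypothetical.

```python
def snp_error_rates(array_variant: dict, seq_variant: dict) -> tuple:
    """Return (false_positive_rate, false_negative_rate) for sequencing calls.

    Both arguments map a site identifier to True (variant) or False (reference).
    Array genotypes are treated as the truth set (an assumption of this sketch).
    """
    shared = array_variant.keys() & seq_variant.keys()
    truth_variant = [s for s in shared if array_variant[s]]
    truth_reference = [s for s in shared if not array_variant[s]]

    # False negative: the array sees a variant, sequencing does not.
    false_neg = sum(1 for s in truth_variant if not seq_variant[s])
    # False positive: sequencing reports a variant at an array-reference site.
    false_pos = sum(1 for s in truth_reference if seq_variant[s])

    fnr = false_neg / len(truth_variant) if truth_variant else 0.0
    fpr = false_pos / len(truth_reference) if truth_reference else 0.0
    return fpr, fnr


if __name__ == "__main__":
    # Invented toy data, for illustration only.
    array = {"site1": True, "site2": True, "site3": False, "site4": False}
    seq = {"site1": True, "site2": False, "site3": False, "site4": True, "site5": True}
    fpr, fnr = snp_error_rates(array, seq)
    print(f"false-positive rate: {fpr:.2f}, false-negative rate: {fnr:.2f}")
```

Sites called only by sequencing (such as `site5` in the toy data) are excluded, because the array offers no genotype to compare against; this is also why array concordance cannot by itself validate novel variants.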

The genomes described in these studies were sequenced in weeks, rather than months or years, and at a fraction of the cost of the initial human reference genome. Ongoing progress in improving existing high-throughput sequencing technologies, and developing new ones, should bring down costs further. An era in which information from individual genomes becomes a core part of genetic research — spearheaded by endeavours such as the 1000 Genomes project and the Personal Genome Project — is moving closer.