The rapid evolution of high-throughput sequencing has shifted the focus from data acquisition to data interpretation. Researchers working on human genomes have made big strides in primary-sequence analysis: they can map short reads, call single-nucleotide polymorphisms and insertion-deletions (indels), and even map more complex structural variants. But two key challenges remain. The first is to determine the haplotypes, the individual sequences of each chromosome in a person's genome, especially when no information on the genomes of close relatives exists. The second is to determine the functional effect of all the detected variants and to pinpoint those that cause disease.

The impact of mutations may depend on whether they occur together on the same chromosome or not. Credit: Katie Vicari

The year 2011 has seen several efforts to determine an individual's haplotype-resolved genome, for example, the high-throughput sequencing of fosmid clones in the Max Planck One genome (Genome Res. 10, 1672–1685; 2011) or the sequence analysis of individual chromosomes separated via a microfluidic device (Nat. Biotechnol. 29, 51–57; 2011). These studies revealed novel variants in many genes, and they underscored the importance of phasing these mutations in order to assess their impact on the individual.
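Why phasing matters can be illustrated with a small sketch. It assumes phased genotypes written in the VCF-style "0|1" notation, where the left and right alleles sit on the two copies of a chromosome and any nonzero allele is a variant. The function name `gene_copies_hit` and the example genotypes are hypothetical and are not taken from the studies cited above.

```python
def gene_copies_hit(phased_genotypes):
    """Count how many of the two chromosome copies carry at least one variant.

    Each genotype is a phased string such as "1|0": the allele left of the
    bar lies on one haplotype, the allele right of the bar on the other.
    Allele "0" is the reference; anything else is treated as a variant.
    """
    hit = [False, False]
    for genotype in phased_genotypes:
        left, right = genotype.split("|")
        hit[0] = hit[0] or left != "0"
        hit[1] = hit[1] or right != "0"
    return sum(hit)


# Two heterozygous variants in the same gene, phased in two different ways.
cis = ["1|0", "1|0"]    # both variants on the same copy; the other copy is intact
trans = ["1|0", "0|1"]  # one variant on each copy; no intact copy remains

print(gene_copies_hit(cis))    # 1
print(gene_copies_hit(trans))  # 2
```

Without phase information the two cases look identical, since both are simply two heterozygous calls, yet only the second leaves the individual with no unaffected copy of the gene.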

Assigning mutations to a haplotype is only the first step; one then needs to decide which of these mutations are functionally deleterious to the individual, a task Gregory Cooper and Jay Shendure referred to as “finding needles in a pile of needles once the haystack has been cleared away” (Nat. Rev. Genet. 12, 628–640; 2011). Computational approaches based on evolutionary conservation of sequences, biochemical properties of protein sequences and structural information have come a long way in providing candidate lists of potentially deleterious mutations, both in protein-coding sequences and in noncoding stretches of the genome. But even at their best, these programs only provide candidate lists. They will need to be complemented with large-scale experimental approaches that analyze these variants and provide molecular phenotypes, leading to the functional assessment of a given mutation.