When DNA sequencing is used to find disease-causing mutations, the most important issue is keeping the sequencing error rate as low as possible. Although base-calling algorithms for short-read sequencing technologies have improved, David Galas and Leroy Hood from the Institute for Systems Biology in Seattle sought further gains and decided to test the power of Mendelian genetics for sequence-error correction.

The researchers and their colleagues selected a family of four (two siblings affected by two recessive disorders, Miller syndrome and ciliary dyskinesia, and their unaffected parents) and used the sequencing service of the company Complete Genomics to sequence each genome.

“The quality of the data we got was superb,” said Hood, but he added that for this technology, as for all short-read technologies, repeat regions are still a challenge. Nevertheless, 92% of the sequences across the four genomes aligned to the human reference genome.

By comparing the genomic sequences of parents and offspring, the scientists delineated a very precise recombination map for the children's genomes, showing exactly which segments of the parental chromosomes each child had inherited. This allowed them not only to correct 70% of sequencing errors but also to reduce the search space for the disease-causing variants. In their final analysis, mutations in only four candidate genes remained, including the gene known to be mutated in ciliary dyskinesia and the gene carrying the variant that causes Miller syndrome.
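The simplest form of family-based error checking can be sketched in a few lines of Python. This is only an illustration of the general idea, not the researchers' pipeline: the function names, the two-letter genotype strings and the example positions are all hypothetical. A child's genotype call is treated as suspect when it cannot be formed from one allele of each parent. Such a call may be a sequencing error, although it could also reflect a genuine new mutation.

```python
from itertools import product


def mendelian_consistent(father, mother, child):
    """Return True if the child's genotype can be formed by taking
    one allele from the father and one from the mother.

    Genotypes are two-letter strings such as "AG" (hypothetical encoding);
    allele order within a genotype is ignored.
    """
    possible = {tuple(sorted(pair)) for pair in product(father, mother)}
    return tuple(sorted(child)) in possible


def flag_suspect_sites(quartet_calls):
    """Return the sorted positions where at least one child's call
    violates Mendelian inheritance.

    quartet_calls maps position -> (father, mother, child1, child2).
    """
    suspects = []
    for pos, (father, mother, child1, child2) in sorted(quartet_calls.items()):
        children = (child1, child2)
        if not all(mendelian_consistent(father, mother, c) for c in children):
            suspects.append(pos)
    return suspects


# Made-up calls for illustration only.
calls = {
    101: ("AA", "AG", "AG", "AA"),
    202: ("CC", "CC", "CT", "CC"),  # child 1 carries a T seen in neither parent
    303: ("GT", "GT", "TT", "GG"),
}

print(flag_suspect_sites(calls))  # [202]
```

A check like this catches only calls that are impossible given the parents' genotypes. The recombination map described above goes further, because knowing which parental segments each child inherited can also expose calls that pass this per-site test but conflict with the inherited segment.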

The causal mutation for Miller syndrome has just been identified in a study in which the researchers sequenced only the exome, the protein-coding part of the genome, rather than the full genome. Although this approach is much cheaper, Hood cautions that not all variants underlying Mendelian traits lie in coding regions, and thus one needs the entire genome sequence to find them.

For Hood, these results suggest that for any simple Mendelian trait, full genome sequencing of one or two families will likely identify the causal variant. His plan now is to see whether sequencing genomes of families with Huntington's disease will bring to light the genes modulating this disorder.

Hood summarizes the lesson from this work: “When genomes are going to be part of our medical record, it will be really imperative to sequence [genomes of] whole families to do error correction.”