In the hands of geneticists, the old saw “each of us is unique” becomes a challenge. The human genetic landscape includes 6 to 8 million common genetic variants, defined as those carried by at least 5% of the population, a subset of which contribute appreciably to the heritability of many common diseases. Only recently have researchers begun to rise to this challenge, as typified by the study of Duerr et al.1

These authors are among the first to deploy a genome-wide association strategy to identify risk factors for common disease—in this case, Crohn disease [p 26-28]. Such genome-wide approaches have been used previously to identify genetic variants conferring risk for age-related macular degeneration2, and promise to accelerate the search for disease-associated polymorphisms in the future.

Whole-genome approaches have only recently become feasible through advances in high-throughput genotyping, which enable hundreds of thousands of markers to be genotyped in hundreds of individuals at an affordable cost, and the development of a reference panel of common genetic variation known as the HapMap3. With such tools, geneticists can examine a panel of genetic markers that effectively capture a substantial fraction of common variants indirectly, taking advantage of the fact that many variants are in 'linkage disequilibrium', or statistically correlated, in the human population3. The marker panel used by Duerr et al., for instance, consisted of 308,332 single nucleotide polymorphisms and is estimated to provide information on roughly 75% of the common variants present in populations of European ancestry4,5.
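To make the idea of statistical correlation between variants concrete, the sketch below computes r², a standard measure of linkage disequilibrium between two biallelic markers, from haplotype and allele frequencies. The function name and the example frequencies are illustrative assumptions, not values from Duerr et al. or the HapMap.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Return r^2 between two biallelic markers.

    p_ab: frequency of haplotypes carrying allele A at marker 1
          and allele B at marker 2
    p_a:  frequency of allele A at marker 1
    p_b:  frequency of allele B at marker 2
    (All three inputs are hypothetical example quantities.)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both markers must be polymorphic (0 < frequency < 1)")
    # D measures how far the observed haplotype frequency departs from
    # what independent assortment of the two alleles would predict.
    d = p_ab - p_a * p_b
    # Normalising by the allele-frequency variances gives r^2 in [0, 1].
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Two alleles that always travel together: a genotyped marker
# would fully capture the untyped variant.
r2_perfect = ld_r_squared(p_ab=0.2, p_a=0.2, p_b=0.2)

# Two alleles that co-occur only as often as chance predicts:
# the marker carries no information about the variant.
r2_independent = ld_r_squared(p_ab=0.06, p_a=0.2, p_b=0.3)
```

When r² between a genotyped marker and an untyped variant is close to 1, the marker serves as a good proxy for that variant, which is the sense in which a marker panel can capture common variation indirectly.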

Key aspects of any genome-wide association study are the study design and analysis strategy. As with any epidemiological study, sample sizes must be sufficiently large to allow identification of risk factors, which, in the setting of common diseases, are generally expected to confer only small effects on overall risk. Balanced against this is the considerable cost of genotyping large numbers of individuals at high density. The study by Duerr et al. strikes a good balance, comparing more than 300,000 genetic markers in roughly 500 individuals with Crohn disease and 500 healthy controls.

The next key challenge, after controlling for potential sources of bias such as genotyping error, is to set an appropriate statistical threshold for deciding which of the many observed differences in marker frequency between affected individuals and controls are significant. This is a critical issue given the very large number of markers being examined. Duerr et al. applied a stringent statistical cutoff to minimize the risk of false positives, leaving them with three markers considered significant at a genome-wide level. As a final critical step in establishing the robustness of their results, Duerr et al. replicated the association between the gene IL23R and Crohn disease in two independent collections of samples.
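The scale of the multiple-testing problem can be illustrated with a short calculation. The sketch below uses the Bonferroni correction, one common and conservative way to set a genome-wide threshold; it is offered as an assumption for illustration and is not necessarily the exact procedure applied by Duerr et al. The marker count is the 308,332 quoted above, and the nominal significance level of 0.05 is an assumed conventional value.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


N_MARKERS = 308_332   # size of the marker panel described in the text
ALPHA = 0.05          # assumed nominal significance level

# If no marker were truly associated, this many would still be expected
# to pass an uncorrected P < 0.05 test by chance alone (roughly 15,000).
expected_false_positives = N_MARKERS * ALPHA

# Corrected per-marker cutoff, on the order of 1.6e-7.
threshold = bonferroni_threshold(ALPHA, N_MARKERS)

print(f"Expected chance hits at P < {ALPHA}: {expected_false_positives:.0f}")
print(f"Bonferroni per-marker threshold: {threshold:.2e}")
```

Because nearby markers are correlated through linkage disequilibrium, the number of effectively independent tests is smaller than the number of markers, so a Bonferroni cutoff of this kind tends to be stricter than necessary.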

In contrast to 'candidate gene' approaches, genome-wide association studies have a key advantage: they offer a fairly unbiased survey of the genome and make no a priori assumptions about where risk variants might reside. An important weakness is that current genotyping platforms are limited in their coverage of rare genetic variants, which may also contribute substantially to the genetic architecture of common diseases. At present, rare variants are best examined through targeted resequencing efforts, as exemplified by recent studies showing that rare variants in two genes, ATM and BRIP1, are associated with increased risk of familial breast cancer6,7. If the so-called '$1,000 genome' comes to fruition, these targeted efforts may soon give way to whole-genome resequencing that would reveal the complete spectrum of genetic variants underlying common diseases.