Genome-wide association studies, also known as GWASs, have made headlines in recent years, touting risk markers for illness. The approach relies on a simple concept: compare the genomes of affected individuals with those of healthy controls and look for significant points of genetic variation between the two groups that could contribute to the illness.

But the statistical methods used to determine points of interest can yield thousands of single nucleotide polymorphisms (SNPs), the sites of genetic difference. “Traditionally, you can go from millions [of candidates] to 25,000 SNPs, all with the appropriate significance value,” says Jack Taylor of the US National Institute of Environmental Health Sciences (NIEHS). “Almost all of those are false positives, though, and you'd have to individually filter them out.”
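The arithmetic behind that flood of false positives can be sketched in a few lines. The numbers below are illustrative assumptions, not figures from Taylor's group: one million independent SNP tests and a per-test significance threshold of 0.05. The function names are hypothetical, and the Bonferroni correction shown is one standard remedy, not the method SNPinfo uses.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of tests passing threshold alpha by chance alone,
    assuming every SNP is truly unassociated with the illness."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, family_alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the chance of even one false positive
    across all tests at or below family_alpha."""
    return family_alpha / n_tests


# Assumed, illustrative values (not taken from the study described here).
n_snps = 1_000_000
per_test_alpha = 0.05

print("Expected chance hits:", expected_false_positives(n_snps, per_test_alpha))
print("Bonferroni per-SNP threshold:", bonferroni_threshold(n_snps))
```

Under these assumptions, tens of thousands of SNPs would clear the threshold by chance, which is why a much stricter cutoff or additional filtering on biological information is needed.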

That filtering is a difficult process, and one that can also throw out true associations that only become apparent with information on gene function, gene interactions or other factors. For those reasons, Taylor and fellow NIEHS researcher Zongli Xu have worked on a solution: the web-based SNP selection tool called SNPinfo (Nucl. Acids Res., doi:10.1093/nar/gkp290; 2009).

“SNPinfo looks to use all the information we have available—other GWAS data, functional information, et cetera—to help isolate which SNPs are truly important,” Taylor says.

Put to the test with a study of prostate cancer, the system helped identify five SNPs of interest. A more traditional GWAS algorithm had found the same five points, but SNPinfo did so with less than 3% of the computing effort needed to connect SNPs to genes, according to Taylor.

Meanwhile, Melanie Wilson, a doctoral student at Duke University, has developed an as-yet-unpublished method that takes into account the potential interactions between SNPs. The approach, dubbed Multilevel Inference of SNP Associations, or MISA, also uses statistical probability to remove highly unlikely SNPs as the data set becomes larger.

Both methods, just two among many being developed, will be tweaked with time. The SNPinfo developers, for example, are improving their tool on the basis of feedback from the more than 18,000 visitors who have accessed the online tool since its launch in mid-September.

“I think we can always do better,” says Stephen Chanock at the US National Cancer Institute. “The question is whether we rush to find more variants [that contribute to illness] or we make a better study to account for issues such as environment and gene-by-gene interactions.”