The unstoppable pace at which genotype, phenotype and other 'omic' data are being amassed for hundreds of human traits obliges geneticists to focus on the most useful way to organize this information.

Samuels and Rouleau argue in their Comment (p378) that locus-specific databases (LSDBs) are the most comprehensive repositories of data on a specific locus or disease. LSDBs — most of which are freely accessible — are curated by expert academics who can provide the best assessment of the biological validity of the deposited information. This advantage of LSDBs is also their main drawback: research groups, which rely on academic grants, cannot ensure regular site updates, a standardized format or an automated submission system. These obstacles are being overcome, although progress cannot come fast enough given the tide of allelic variants generated by resequencing studies and the increasing need to compare data across populations. Equally important is the need to ensure that the variants are linked to consistent clinical descriptions.

The challenge of linking biological data with clinical information also lies at the heart of another effort in human genomics research: the use of electronic health records (EHRs) to match genomics data to an individual's medical history. As Kohane describes in his Review (p417), EHR-driven genomics research has many advantages, mainly timeliness and cost savings, because populations can be phenotyped for clinically relevant traits faster and more cheaply than is possible with traditional cohorts. Over 100,000 individuals are now included in EHRs, which have been used to validate the results of genome-wide association studies and to identify novel associations.

The continued success of LSDBs and EHRs will depend on standardizing their frameworks and on attracting both data and adoption on a wider scale.