“Using data is popular; contributing data is unpopular.”

This is a typically fresh quote from the official report (http://www.cordis.lu/lifescihealth/genomics/home.htm) of the far-sighted Workshop on European Database and Analysis Resources for Research in Human Genetic Variation. Held on March 2–3, 2006, in Brussels, this effective workshop brought bioinformaticians together with medical, clinical and biological experts to examine ways to extend existing European projects into an integrated human genome variation database along the lines discussed in our August 2005 editorial (http://www.nature.com/ng/journal/v37/n8/full/ng0805-783.html). The workshop concluded that Europe has already started most of the projects necessary for the success of such an integrated database (which at the global scale we call HUGOBase). Participants also emphasized that it will be necessary to hold a peer-reviewed competition to identify those coalitions that have the capacity to integrate the results of other data producers.

Some of the existing HUGOBase components assembled by the EU researchers and their allies. As with the EU itself, the eventual number of participants is an open question.

A single database for human variation data is unfeasible because of the diversity of producers, disciplines, funding mechanisms and user needs. Fortunately, Europe is home to a number of bioinformatics grid technologies for linking databases (for example, see http://www.embracegrid.info), and it is hoped that these will provide the interface through which specialized users of the underlying data will apply their own tools. From such a regional nucleus, HUGOBase could grow via grid technology by linking to databases elsewhere in the world (the US, Australia, Japan, China, India and Brazil immediately spring to mind). It will be particularly interesting to see how the proposals of the EU group are integrated into the Human Variome meeting to be held later this month (http://www.humanvariomeproject.org/).

Europe is also a good place to start a global human variation database because of the quality of existing data annotation and the existence of systematic health care with associated record-keeping. The report warns, however, that genetic data in isolation will not be useful. “With some exceptions, understanding of genetic etiology in isolation from environmental considerations will NOT lead to new therapies and will have NO direct impact on the health of European populations with chronic diseases.” In other words, the maximal health benefit will come at the cost of collecting and sharing the sensitive health and environmental data from European citizens' private lives. Since the workshop sought immediate practical steps toward data integration, participants concentrated only on sources of information already in the public domain, leaving aside the considerable negotiation to come concerning donors' interests in controlling access to genotypes and phenotypes originating in their own bodies.

This integrated approach of bringing representative experts to a common round table revealed the holes in the overall research strategy that need to be filled by funding for shared resources. In particular, there is a need for a comprehensive study describing the population genetics of multiple control populations and a need for agreement on HapMap SNPs or another maximally informative set of pan-European SNP markers, bearing in mind that the SNPs are markers for what may prove to be more complex sets of causative genetic variants.

The usual problems of attribution for data producers were raised, exacerbated by the prospect of a community knowledge web spanning across and outside existing journals. Clearly, there is more for us to do than sit on the sidelines suggesting that researchers get their act together. When this splendid report serves up lines such as, “A full range of consultations should be initiated with journals to investigate ways of improving direct deposition and facilitating data mining via publication standards,” the ball is firmly back in our court.

Obtaining long-term sustainable funding for database projects is always a concern, but surely it must be possible to quantify the benefit of open, communal resources. We hope that the EU will incorporate the idea of integrating genetics research (and indeed, all European science) within successive Framework funding programs.

May the best-organized and the most collaborative consortia win!