Collaborative approach aims to keep pace with discoveries.
Barend Mons's first objective would be ambitious enough for most people: to meld some of the most important biomedical databases into a single information resource. But that's just the beginning. Mons, a bioinformatician at the Erasmus Medical Centre in Rotterdam, the Netherlands, also wants to apply the Wikipedia philosophy. He's inviting the whole research community to help update a vast store of interlinked data. If he and his colleagues can pull it off — and even the project's advocates are not sure they can — they could transform the databases that are central to the work of many life scientists.
A test version of the project, provisionally dubbed Wiki for Professionals (http://www.wikiprofessional.info), is due to launch in the next month. It already contains data from key sources, such as protein information from Swiss-Prot and gene descriptions from Gene Ontology. Over the past year, Mons's team has woven together these and other archives to create what, from a user's point of view, seems to be a single database. The page on the muscular-dystrophy protein dystrophin, for example, contains data from Swiss-Prot together with links to disease information from the US National Library of Medicine, as well as explanatory text. Links to relevant publications in PubMed are also available.
Existing databases interlink to an extent, although the new resource is more comprehensive. But the next stage is the really radical bit. Biomedical research produces hundreds of thousands of papers a year, overwhelming database curators. To clear this bottleneck, Mons and his colleagues are allowing anyone to edit the entries, modifying and adding text and links as new work is published.
That's an attractive proposition, say database administrators. Michael Ashburner, a geneticist at the University of Cambridge, UK, helps run FlyBase, a collection of gene data on the model organism Drosophila melanogaster. The database receives around US$4 million a year from the US National Institutes of Health and employs up to five full-time curators, but still can't keep up with the relevant literature, says Ashburner, who is working with Mons on the new project. “We have a list of around 12 journals that we try to cover. Even that's tough.”
Anyone motivated to register can curate Wiki for Professionals. Visitors to the dystrophin entry, for example, can update almost any of the information on the page, such as statements about the role of the protein in disease. Users can also start new pages, and from later this year will be given the option of creating pages for themselves, with links to relevant publications. A final function, and the one that most excites Mons, is the availability of text-mining software. This will allow users to probe links between proteins, genes and disease that may be revealed only by comparing a large number of papers and other data.
“Mons is a visionary,” says Amos Bairoch at the Swiss Institute of Bioinformatics in Geneva, a collaborator on the project and the creator of Swiss-Prot. “This will be a revolution.”
Yet realizing the vision will be difficult. Top of the list of challenges is persuading the community to get involved. Adding one's own data is likely to be the biggest motivator — Bairoch and Ashburner say they get several calls a week asking for updates to databases, usually from researchers who want their own papers added. Whether this will be enough to keep the database fresh remains to be seen, given that employers and funders tend not to value updating information highly.
Wiki for Professionals will also have to ensure that additions don't just reflect individual researchers' pet theories. Mons hopes scientists will adopt entries relevant to their work and use automated systems to alert them to changes, which they can then amend if necessary. The original data in Swiss-Prot and other databases will also be protected.
The resource has been set up by Knewco, a scientific computing company based in Rockville, Maryland, and co-founded by Mons. The firm raised around $2 million in private funding to pay for the initial effort, and says basic access will be free. Revenue will be generated by charging drug firms and other users for premium services, such as the option to run a private version of the system incorporating proprietary data.
Giles, J. Key biology databases go wiki. Nature 445, 691 (2007). https://doi.org/10.1038/445691a