Databases in peril

Life-sciences databases are in crisis, say their operators, as funders keen to support exciting new projects lose interest in maintaining existing services. Nature investigates the scale of the problem.

A lack of stable funding is threatening biology's core databases. Unless funding agencies set aside dedicated grants, the fear is that researchers will lose access to information vital to their work.

Several major international databases and research centres, including the European Bioinformatics Institute (EBI) at Hinxton near Cambridge, UK, face funding cuts. And the outlook for specialist databases is even worse: more than half of the operators contacted by Nature say their databases are updated sporadically or not at all because no funding was available after their original grants expired.

“There is a funding crisis right now,” says Rolf Apweiler, a member of the EBI and head of the UniProt/Swiss-Prot protein-sequence database.

“It's a paradox,” adds Lincoln Stein, a bioinformaticist at Cold Spring Harbor Laboratory in New York. “The funding system assumes that projects have a lifespan of three to five years. But if biological databases are to do their job, they need funding for a decade or so.”

Life-sciences databases have proliferated over the past decade, driven by genome-sequencing efforts and easy Internet access.

For some scientists, resources such as UniProt are as much a part of the basic research infrastructure as reagents and test-tubes. The EBI's website, for example, recorded 2 million hits on a single day this April. Researchers use the site to access everything from molecular structures to nucleotide sequences. Hundreds of smaller databases, often maintained by individual labs, focus on molecules and genes associated with particular functions or species.

But as the number of databases mushrooms, many operators are finding that once their initial money has run out, funding agencies show little interest in helping maintain their service.

EBI director Janet Thornton is using the institute's reserves to support three databases whose funding has run out, including InterPro, an archive of data on protein families. “If we don't get new money we'll have to halve the number of staff on those projects,” she says.

Thornton and others say the problem is particularly acute in Europe, where most grants are tied to original research. “Researchers feel like they have to invent new projects every three years to get money,” says Thornton.

But databases in the United States are also feeling the pinch. The Alliance for Cellular Signaling, an ambitious ten-year attempt to amass data on the chemical signals inside cells, has scaled back its operations following a mid-project review. Funders at the National Institute of General Medical Sciences ruled last month that the project will receive less than half of the $5 million a year it had asked for. The alliance says it will now have to shut five of the nine labs that are generating data from mouse-cell experiments.

Another key North American resource — the Biomolecular Interaction Network Database (BIND) — also faces cuts. Many journals, including Nature, routinely send their papers to BIND staff, who curate records on almost 180,000 molecular interactions (see page 1028). Last month BIND was forced to cut 33 jobs when a grant application to the Canadian government, which provided money to establish the database, fell through. Although BIND will continue to function thanks to money from the Singapore government, plans to integrate with other databases have been put on hold.

“Canada is good at starting up projects like this, but there is no mechanism for continuing them,” says Chris Hogue, principal investigator at the Blueprint Initiative, the Toronto-based organization that runs BIND.

Quest for novelty

The cutback at the Alliance for Cellular Signaling may be the result of a conscious change of heart by funders, but bioinformaticists say other cuts are part of a broader problem facing databases: agencies want to fund innovative and hypothesis-driven initiatives, rather than ongoing infrastructure projects.

“Long-term maintenance is expensive,” says Carol Bult of the Jackson Laboratory in Bar Harbor, Maine, home of the Mouse Genome Database. She says the database costs around US$4 million a year to run. The resource is widely used and Bult is confident that funding will be renewed this year, but many other databases aren't so lucky. “We've faced this issue for a decade, but the funding agencies haven't caught up.”

Smaller, cheaper databases are in even more trouble. Nature contacted 89 databases operating in 2000, and more than half said they are now struggling financially. Seven databases have folded, and many others are updated on an irregular basis as a labour of love by their owners (see ‘Survival of the fittest?’).


