Munich

In a bid to speed up the exploitation of human genome sequencing efforts, the European Bioinformatics Institute (EBI) — an outstation of the European Molecular Biology Laboratory (EMBL) — is planning to launch a publicly accessible repository for DNA microarray-based gene expression data.

It hopes to create a single location where all the data on gene expression obtained from microarray technologies can be stored. But some scientists doubt whether the technology is sufficiently developed.

The EBI is organizing an international meeting later this year for representatives of laboratories that use such technologies. The meeting will take place at the Wellcome Trust Genome Campus in Hinxton, near Cambridge, UK, where the EBI is based. The institute hopes to set up working groups to develop standards for microarray-based gene expression data and analysis.

DNA microarray, or ‘chip’, technology allows scientists to monitor rapidly, and in a highly automated way, which genes are being expressed in a particular tissue. The microarrays are coated with a mixture of cDNAs (or synthetic oligonucleotides identifying particular genes) whose sequences have been identified through genome sequencing projects. These bind mRNAs, the specific messengers made by a gene when it is turned on.

Expression patterns can then be compared between healthy and diseased tissues, providing clues to the genetic complexities of diseases such as cancer.

It is not enough to know whether a gene is present in a disease, says Annemarie Poustka, senior molecular geneticist at the German Cancer Research Centre (DKFZ) in Heidelberg. It is also necessary to know whether it is switched on, and whether it stays on through all the stages of a disease. As much data as possible needs to be pooled for this, she says.

“The microarray gene expression repository may become one of the most important databases in bioinformatics,” says Alvis Brazma, a staff scientist at the EBI.

Although DNA microarray technology is in its infancy, it has already created a large amount of data, which are either held privately or scattered across the Internet. “As more laboratories acquire this technology, the amounts of large-scale gene expression data and profiles will grow rapidly,” says Brazma. “This could lead to an explosion in gene expression data that may dwarf even the human sequencing projects.”

But some researchers think the move may be premature. Poustka agrees that a central public database is needed, and that the EBI is an ideal host, but says that different laboratories have not yet developed the tools to make comparisons between their data straightforward.

The EBI, which has discussed its plans with European and US laboratories that use these technologies, is nevertheless confident that the time is right to develop resources and standards.

“The database will allow us to cross-validate data obtained by different technologies, to characterize various techniques, and to establish error rates, benchmarks and gold standards,” says Brazma.

EBI scientists, along with European colleagues, are applying for grants to develop the database, and hope it will become an international, not just a European, effort.