Free for all: from genomes to proteomes.

A vast body of annotated and linked data is available in the public databases. But how do you find the database that best fits your needs? One place to start is the annual database supplement that the journal Nucleic Acids Research publishes as its first issue of each year, and which is free online.

Alternatively, you could plunge straight into one of the large, general-purpose bioinformatics databases such as the Ensembl Genome Browser (EGB) or the National Center for Biotechnology Information's Entrez portal. Most are now so closely integrated with more specialized databases that moving from one database to another is all but seamless. Careful design makes them powerful tools even in the hands of a novice, yet it is easy to progress to more sophisticated use. Simple text-search boxes run a term against a selection of databases, and various options control the amount and type of data returned. Help buttons, along with links that provide simple explanations of each term, are never far away.
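
For readers who would rather script such a search than type into a box, the same idea (a term, a choice of database and a limit on what comes back) can be sketched in a few lines of Python. This is only an illustration, not part of the portals' web pages described above: it assumes NCBI's E-utilities ESearch service and its db, term and retmax parameters, none of which the article mentions, and the function name and example query are hypothetical. The sketch only builds the request address; it does not contact the server.

```python
from urllib.parse import urlencode

# Assumption: NCBI's E-utilities ESearch endpoint (not named in the article).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_entrez_search_url(term, db="protein", max_records=20):
    """Return an ESearch address for `term` in one Entrez database.

    `db` selects the database to search and `max_records` caps the
    number of record identifiers returned (the `retmax` parameter).
    The function name and defaults are hypothetical, for illustration.
    """
    params = {"db": db, "term": term, "retmax": max_records}
    # urlencode escapes spaces and brackets so the term is safe in a URL.
    return ESEARCH_URL + "?" + urlencode(params)


# Example: an illustrative query against the gene database, five hits at most.
url = build_entrez_search_url("BRCA1 AND human[organism]", db="gene", max_records=5)
print(url)
```

Pasting the printed address into a browser should return a short list of record identifiers, much as the search box would, although the exact response format is set by the service rather than by this sketch.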

For genomics, the EGB is a popular port of entry. This collaborative effort between the Sanger Institute, the European Bioinformatics Institute (EBI) and the European Molecular Biology Laboratory provides automated annotation of the human, mouse, rat, fugu, zebrafish, mosquito, fruitfly and nematode genomes. The homepage contains links to an ‘Ensembl tour’ and to ‘worked examples’. Users can search across all species for a term relating to a protein, disease or mRNA, or can follow links to a page dedicated to each species.

PEP, a database of Predictions for Entire Proteomes, is the result of a sophisticated analysis of proteomes from more than 60 species. Produced by Burkhard Rost's bioinformatics group at Columbia University, New York, PEP primarily contains open reading frames (ORFs) along with the structural domains predicted within them. PEP can be searched online either directly or through the EBI. Alternatively, the entire PEP database can be downloaded, if you have the disk space.

Some commercial companies offer academic researchers free online access to their own high-quality databases. One example is MendelBase, a database of structural and functional protein information from Array Genetics of Newtown, Connecticut.

S.B.