Steve Gardner: linking concepts. Credit: BIOWISDOM

Although ontologies are relative newcomers to bioinformatics, they have already attracted commercial interest. BioWisdom of Cambridge, UK, supplies ontologies in various fields. “Life science R&D poses a multidimensional problem,” says Steve Gardner, BioWisdom's chief technical officer. “The problem is being able to communicate the information to a user interested not just in a molecule, but also in the context surrounding that molecule.” BioWisdom currently offers more than 10 million distinct concepts linked by over 100 million relationships.

BioWisdom can also help researchers to develop their own ontology. The first task is to build a database framework to encapsulate it. An additional framework embeds methods to normalize the incoming data, so that an entity is recognized despite having different names in different data sources. This is not easy: the sedative diazepam, for example, has some 197 synonyms.
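
The normalization step can be pictured with a minimal Python sketch. It is not BioWisdom's method: the `SYNONYMS` table and the `normalize` function are hypothetical names, and the table lists only three of diazepam's many synonyms.

```python
# Hypothetical synonym table: every known name maps to one canonical concept.
# Only a few illustrative entries are shown; a real table would be far larger.
SYNONYMS = {
    "diazepam": "diazepam",
    "valium": "diazepam",
    "diastat": "diazepam",
}


def normalize(name):
    """Return the canonical concept for a name, or None if it is unknown."""
    key = name.strip().lower()  # ignore case and stray whitespace
    return SYNONYMS.get(key)


# Names as they might arrive from different data sources.
records = ["Valium", " DIAZEPAM ", "Diastat", "aspirin"]
canonical = [normalize(r) for r in records]
print(canonical)
```

Names that are missing from the table come back as `None`, so they can be flagged for curation rather than silently merged.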

Good ontology software can even help the researcher develop new hypotheses. “We have inferencing programs that draw together different concepts,” says Gardner. “If one ontology says that COX2 is expressed in synoviocytes, and another says that synoviocytes are implicated in rheumatoid arthritis, the inferencing program would suggest that COX2 may be implicated in rheumatoid arthritis.”
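
Gardner's example amounts to chaining two relationships that share a middle concept. The Python sketch below illustrates the idea under stated assumptions: the triple format, the relationship labels and the `suggest_links` function are invented for illustration and are not BioWisdom's inferencing program.

```python
# Each fact is a (subject, relationship, object) triple. The relationship
# labels are hypothetical names chosen for this example.
ontology_a = [("COX2", "expressed_in", "synoviocytes")]
ontology_b = [("synoviocytes", "implicated_in", "rheumatoid arthritis")]


def suggest_links(facts):
    """Chain 'expressed_in' with 'implicated_in' to propose hypotheses."""
    suggestions = []
    for gene, rel1, tissue in facts:
        if rel1 != "expressed_in":
            continue
        for subject, rel2, disease in facts:
            # The cell type is the shared concept that links the two facts.
            if rel2 == "implicated_in" and subject == tissue:
                suggestions.append((gene, "may_be_implicated_in", disease))
    return suggestions


hypotheses = suggest_links(ontology_a + ontology_b)
print(hypotheses)
```

The result is a suggestion, not a conclusion: the derived relationship is labelled 'may be implicated in' so that it stays distinct from the curated facts.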

The output of an ontology is a graph: a representation of the relationships between concepts. Once a graph has been generated, users can then bring their experience to bear. For example, they can exclude types of information on the strength of the evidence. “We call this a semantic lens,” says Gardner. “You pass this lens over the data and it filters them out like a polarizing filter. This makes a new graph that lets you highlight the interactions that are interesting to you.” BioWisdom's system has a hierarchical family of relationships: the protein-to-protein class, for example, has 400 potential relationships (such as ‘interacts with’, ‘upregulates’ and ‘activates’). Thus, ontologies allow the user to search using one key term by resolving the meaning of that term, and then searching against it.
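
One way to picture a semantic lens is as a filter over the edges of the graph. The Python sketch below assumes that each relationship carries a numerical evidence score; that score, the `Edge` class and the `apply_lens` function are all hypothetical and do not describe BioWisdom's system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str
    target: str
    evidence: float  # hypothetical confidence score between 0 and 1


def apply_lens(edges, min_evidence=0.0, relations=None):
    """Return the new graph formed by the edges that pass the lens."""
    return [
        e for e in edges
        if e.evidence >= min_evidence
        and (relations is None or e.relation in relations)
    ]


# Placeholder proteins and scores, for illustration only.
graph = [
    Edge("protein A", "interacts with", "protein B", 0.9),
    Edge("protein A", "upregulates", "protein C", 0.4),
    Edge("protein B", "activates", "protein C", 0.8),
]

# Keep well-supported edges of two chosen relationship types.
filtered = apply_lens(
    graph, min_evidence=0.5, relations={"interacts with", "activates"}
)
for e in filtered:
    print(e.source, e.relation, e.target)
```

The original graph is left untouched, so different lenses can be passed over the same data.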

A taste of how ontologies work is provided by the public-domain Gene Ontology (GO) Browser, which gives free access to the ontologies developed by the GO Consortium. Three ontologies have been developed: molecular function, biological process and cellular component. Using the Ensembl GO browser, the user can find the Ensembl genes that have been mapped to these ontologies. The search term is presented at the centre of a ‘mind map’. Clicking on a ‘child’ or ‘parent’ term will produce a new Ensembl GO report centred on that term. The genes found are listed, along with links to different views of each gene and its chromosomal location. The ontologies can also be searched directly, with the results showing the connections between the terms.
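
The parent and child navigation can be sketched in Python as a lookup over a small term hierarchy. Everything here is a placeholder: the term names, gene identifiers and the `report` function are invented for illustration and are not real GO or Ensembl records or interfaces.

```python
# Illustrative fragment only: term names and gene IDs are placeholders,
# not real GO or Ensembl records.
PARENTS = {
    "term B": ["term A"],
    "term C": ["term A"],
    "term D": ["term B", "term C"],  # a term may have more than one parent
}
GENES = {
    "term B": ["gene1"],
    "term D": ["gene2", "gene3"],
}


def report(term):
    """Build a report centred on one term: its neighbours and mapped genes."""
    children = sorted(t for t, ps in PARENTS.items() if term in ps)
    return {
        "term": term,
        "parents": PARENTS.get(term, []),
        "children": children,
        "genes": GENES.get(term, []),
    }


centre = report("term B")
print(centre)

# 'Clicking' on a child term produces a new report centred on that child.
next_report = report(centre["children"][0])
print(next_report)
```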

S.B.