web focus Proteomics
Review

Nature Reviews Drug Discovery, S4 (2005)

Inferring function

Joanna Owens

The availability of whole-genome sequences for many species has made it possible to structurally characterize many of the proteins that make up the proteome, including those with unknown function. However, it has so far been difficult to learn about the function of a protein from sequence and structure alone, without any experimental characterization. To help elucidate the functions of unknown proteins, scientists at the Howard Hughes Medical Institute have developed a new informatics tool called ProKnow, which infers function for uncharacterized proteins from existing knowledge of the structural features of proteins with known functions.

Currently used methods for assigning function are often based on the assumption that similar sequences have descended from a common ancestor and therefore share similar function. However, several reports suggest that this approach is not particularly accurate. A more accurate annotation of function can be obtained by using protein folds, sequence motifs, domains and sequence orthology, or by the use of algorithms based on the identification of functionally significant residues. But for all these methods, some existing knowledge of sequence or structural similarity is essential.

To add further challenges to protein characterization, new insights into the complexities of protein function have recently been described. The existence of moonlighting proteins that behave differently depending on cellular context has led to attempts to study proteins in their native environments, and even proteins that have the same fold and active site architecture have been shown to have completely different functions.

Learning systems based on statistical theory, such as support vector machines, have recently been developed using information about the properties of amino-acid residues, such as hydrophobicity and polarity. Neural networks trained on protein features are also a promising tool. However, both technologies are limited by their coverage and accuracy.

When an uncharacterized protein is submitted to ProKnow, the tool extracts its features, such as three-dimensional fold, motifs, sequence and functional linkages (for example, those from the Database of Interacting Proteins, DIP, http://dip.doe-mbi.ucla.edu). It maps these to the same features within the ProKnow knowledgebase, which in turn links them to functions that are described (annotated) using the controlled vocabulary of Gene Ontology. The functions linked to most of the features are statistically weighted to give a final list of putative functions with their statistical scores. Using ProKnow, the authors were able to assign function correctly for 70% of proteins overall, with 93% coverage of the function annotations for 1,507 distinct folded proteins. The authors plan to update the knowledgebase regularly and to include additional algorithms in future releases to improve prediction accuracy.
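The mapping-and-weighting idea described above can be illustrated with a minimal sketch. This is not ProKnow's actual code or statistical model: the toy knowledgebase, its counts, the function name infer_functions and the simple frequency-based weighting are all assumptions made for illustration. Each feature of the query protein votes for the functions annotated to knowledgebase proteins that share that feature, and the votes are normalized into a ranked list.

```python
from collections import defaultdict

# Hypothetical toy knowledgebase (illustrative counts, not real data):
# (feature type, feature value) -> {function term: number of annotated proteins}
KNOWLEDGEBASE = {
    ("fold", "TIM barrel"): {"catalytic activity": 6, "binding": 2},
    ("motif", "P-loop"): {"ATP binding": 3, "catalytic activity": 1},
    ("interaction", "kinase partner"): {"kinase activity": 1, "ATP binding": 1},
}


def infer_functions(features, knowledgebase, type_weights=None):
    """Rank putative functions for a protein described by `features`.

    Assumed weighting scheme: each matched feature contributes the
    fraction of knowledgebase proteins with that feature that carry a
    given function term, scaled by an optional per-feature-type weight.
    Scores are normalized to sum to 1.
    """
    type_weights = type_weights or {}
    scores = defaultdict(float)
    for feature in features:
        term_counts = knowledgebase.get(feature)
        if not term_counts:
            continue  # feature absent from the knowledgebase: no evidence
        total = sum(term_counts.values())
        weight = type_weights.get(feature[0], 1.0)
        for term, count in term_counts.items():
            scores[term] += weight * count / total
    norm = sum(scores.values())
    if norm == 0:
        return []
    ranked = [(term, score / norm) for term, score in scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Usage: a query protein with a known fold and one sequence motif.
query = [("fold", "TIM barrel"), ("motif", "P-loop")]
for term, score in infer_functions(query, KNOWLEDGEBASE):
    print(f"{term}: {score:.3f}")
```

In this sketch a function supported by several independent features rises to the top of the list, which mirrors the article's description of functions that link to most of the features being weighted most heavily.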

WEB SITES: ProKnow | Gene Ontology

 

ORIGINAL RESEARCH PAPER

1. Pal, D. & Eisenberg, D. Inference of protein function from protein structure. Structure 13, 121–130 (2005).


© 2004 Nature Publishing Group