Figure 1 - Protein analysis using InterProScan.
From the following article
Linking publication, gene and protein data
Paul Kersey & Rolf Apweiler
Nature Cell Biology 8, 1183 - 1189 (2006)
doi:10.1038/ncb1495

Curators of the InterPro member databases identify proteins that share a known functional domain. These manually identified sequences can be supplemented through iterative searching of the public databases until a comprehensive set is found. The sequences are aligned, and a classifying function designed to recognize other proteins that possess the same domain is built from the alignment. Many of these classifiers are built using hidden Markov models. Alternative classifying functions judged (by curators) to identify the same domain are grouped into a single entry in the InterPro database and annotated with relevant information (expressed both in free text and in the terms of the GO controlled vocabulary). InterProScan is a program that allows users to characterize a sequence by applying these classifying functions. Performance can be improved by precomputing the results for known sequences.
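
The workflow described in this caption can be illustrated with a small toy model. This is a sketch under stated assumptions, not InterProScan's actual implementation: regular expressions stand in for the hidden Markov models, all accessions, names, patterns and annotation strings are hypothetical, and keying the precomputed results by a sequence checksum is one plausible design rather than a detail taken from the article.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass
class Signature:
    """One classifying function from a member database.

    A regular expression is used as a simple stand-in for a real
    classifier such as a hidden Markov model (an assumption of this sketch).
    """
    accession: str
    pattern: str

    def matches(self, sequence: str) -> bool:
        return re.search(self.pattern, sequence) is not None


@dataclass
class Entry:
    """Signatures judged to identify the same domain, plus annotation."""
    accession: str
    name: str
    annotation_terms: list  # placeholder for controlled-vocabulary terms
    signatures: list


def scan(sequence, entries):
    """Return accessions of entries with at least one matching signature."""
    return sorted(
        entry.accession
        for entry in entries
        if any(sig.matches(sequence) for sig in entry.signatures)
    )


def checksum(sequence):
    """Stable lookup key for a sequence (hypothetical choice: MD5 hex digest)."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


class CachedScanner:
    """Answers from precomputed results when the sequence is already known."""

    def __init__(self, entries, known_sequences=()):
        self.entries = entries
        # Precompute once for known sequences, keyed by checksum.
        self.precomputed = {
            checksum(seq): scan(seq, entries) for seq in known_sequences
        }
        self.cache_hits = 0

    def scan(self, sequence):
        key = checksum(sequence)
        if key in self.precomputed:
            self.cache_hits += 1
            return self.precomputed[key]
        # Unknown sequence: apply every classifying function directly.
        return scan(sequence, self.entries)


# Hypothetical entries; patterns and names are toy values for illustration.
ENTRIES = [
    Entry(
        accession="ENTRY_1",
        name="toy nucleotide-binding loop",
        annotation_terms=["placeholder term A"],
        signatures=[
            Signature("SIG_A", r"G.{4}GK[ST]"),
            Signature("SIG_B", r"GK[ST]T"),
        ],
    ),
    Entry(
        accession="ENTRY_2",
        name="toy cysteine pair",
        annotation_terms=["placeholder term B"],
        signatures=[Signature("SIG_C", r"C.{2}C")],
    ),
]

if __name__ == "__main__":
    scanner = CachedScanner(ENTRIES, known_sequences=["MGAAAAGKSTLL"])
    print(scanner.scan("MGAAAAGKSTLL"))  # served from precomputed results
    print(scanner.scan("MKCAACLL"))      # computed on demand
```

Grouping several signatures under one entry means that a match from any of them reports the same domain, which mirrors the curated grouping described above. The precomputed table only helps for sequences that have been seen before; a novel sequence still has to be run through every classifier.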
