Cell Research

FIGURE 3

FROM:

Understanding biological functions through molecular networks

Jing-Dong Jackie Han

Figure 3.

An example of data integration by a probabilistic model. Heterogeneous dataset types can be evaluated against gold standard positive (GSP) and gold standard negative (GSN) sets of functional relationships, for example PPIs. The potential of a protein/gene pair to form a true functional relationship can be scored as a likelihood ratio (LR): how likely the observed evidence is for true positive interactions versus true negative interactions, as estimated from the GSP and GSN datasets. Treating the data types as independent, a naïve Bayesian model can be used to integrate the heterogeneous data. Each interaction is assigned an LR within a data type. When evidence for a gene pair arises from more than one dataset within a data type, the maximal LR among those datasets is used. The LRs given by the different data types are then multiplied to generate a final prediction score for a potential functional relationship. Applying a cutoff at an acceptable confidence level yields a final integrated network in which each edge carries a likelihood of forming the functional relationship. PCC, GO, SSBP and DDI stand for Pearson Correlation Coefficient, Gene Ontology, Smallest Shared Biological Process and Domain-Domain Interaction, respectively. Adapted from 41.
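
The scoring scheme in the caption can be illustrated with a minimal Python sketch. It is not the implementation behind the figure: the function names, the example LR values, and the cutoff are all hypothetical. It also assumes that the LR for a piece of evidence is estimated as the fraction of GSP pairs showing that evidence divided by the fraction of GSN pairs showing it, and that a data type with no evidence for a pair contributes a neutral factor of 1.

```python
def likelihood_ratio(gsp_hits, gsp_total, gsn_hits, gsn_total):
    """Estimate LR = P(evidence | GSP) / P(evidence | GSN) from gold standard counts.

    Assumption: simple frequency estimates, no smoothing.
    """
    if gsp_total <= 0 or gsn_total <= 0:
        raise ValueError("gold standard sets must be non-empty")
    if gsn_hits == 0:
        raise ValueError("LR is undefined when no GSN pair shows the evidence")
    return (gsp_hits / gsp_total) / (gsn_hits / gsn_total)


def integrate(evidence):
    """Combine per-dataset LRs for one gene pair into a final score.

    `evidence` maps a data type name (e.g. "PCC", "GO", "DDI") to the list of
    LRs assigned by the datasets of that type. Within a data type the maximal
    LR is used; across data types the LRs are multiplied (naive Bayes
    independence assumption). A data type with no LRs contributes a factor
    of 1 (an assumption of this sketch).
    """
    score = 1.0
    for lrs in evidence.values():
        if lrs:
            score *= max(lrs)
    return score


def build_network(pair_evidence, cutoff):
    """Keep only the gene pairs whose integrated score reaches the cutoff."""
    network = {}
    for pair, evidence in pair_evidence.items():
        score = integrate(evidence)
        if score >= cutoff:
            network[pair] = score
    return network


if __name__ == "__main__":
    # Hypothetical LRs for two gene pairs; the values are illustrative only.
    pair_evidence = {
        ("geneA", "geneB"): {"PCC": [2.0, 5.0], "GO": [3.0], "DDI": []},
        ("geneA", "geneC"): {"PCC": [0.5]},
    }
    for pair, score in build_network(pair_evidence, cutoff=10.0).items():
        print(pair, score)
```

Taking the maximum within a data type avoids counting correlated datasets of the same kind more than once, while the product across data types relies on the independence assumption stated in the caption.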
