Ideker and colleagues used the gene-expression data from two previous metastasis studies: those carried out by Wang and colleagues and by van de Vijver and colleagues. Only the 8,141 genes that were common to both data sets were considered. In both studies, patients who subsequently developed metastases formed the metastatic class, and the remaining patients formed the non-metastatic class. The relevant protein–protein interaction data were assembled from three sources (two-hybrid screening, computational predictions and interactions gathered from the literature) and consisted of a pool of 57,235 interactions among 11,203 proteins. To integrate the gene and protein data, the authors overlaid the expression values of each gene on its corresponding protein in the interaction network and searched for subnetworks whose activities across the patients were predictive of metastasis. Each subnetwork is suggestive of a functional pathway or complex.
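The overlay-and-search idea described above can be illustrated with a small sketch. The text does not specify how subnetwork activity is computed, how predictiveness is scored or how the search proceeds, so everything here is an assumption made for illustration: activity is taken as the mean z-scored expression of the member genes, the score is a standardized difference in mean activity between the two classes, and the search is a simple greedy expansion from a seed protein. The function names, the toy expression values and the toy interactions are all hypothetical and are not taken from either data set.

```python
from statistics import mean, pstdev

def zscore_rows(expr):
    """Z-score each gene's expression values across patients."""
    out = {}
    for gene, values in expr.items():
        mu, sd = mean(values), pstdev(values)
        out[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return out

def subnetwork_activity(z, genes):
    """Per-patient activity: mean z-score over member genes (assumed aggregate)."""
    members = [z[g] for g in genes if g in z]
    return [mean(column) for column in zip(*members)]

def discriminative_score(activity, labels):
    """Assumed score: |mean(metastatic) - mean(non-metastatic)| / overall spread."""
    pos = [a for a, y in zip(activity, labels) if y == 1]
    neg = [a for a, y in zip(activity, labels) if y == 0]
    spread = pstdev(activity)
    return abs(mean(pos) - mean(neg)) / spread if spread > 0 else 0.0

def greedy_subnetwork(seed, edges, z, labels, max_size=5):
    """Grow a subnetwork from a seed, adding the neighbour that most improves the score."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    current = [seed]
    best = discriminative_score(subnetwork_activity(z, current), labels)
    while len(current) < max_size:
        # Proteins adjacent to the current subnetwork that have expression data.
        frontier = set().union(*(neighbours.get(g, set()) for g in current))
        frontier = sorted(g for g in frontier - set(current) if g in z)
        candidates = [
            (discriminative_score(subnetwork_activity(z, current + [g]), labels), g)
            for g in frontier
        ]
        if not candidates:
            break
        score, gene = max(candidates)
        if score <= best:  # stop when no neighbour improves the score
            break
        current.append(gene)
        best = score
    return current, best

# Toy data (invented): three genes, six patients; 1 = metastatic, 0 = non-metastatic.
expr = {
    "A": [2.0, 1.0, 3.0, 1.0, 2.0, 0.0],
    "B": [1.0, 3.0, 2.0, 0.0, 1.0, 2.0],
    "C": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
}
labels = [1, 1, 1, 0, 0, 0]
edges = [("A", "B"), ("B", "C")]

z = zscore_rows(expr)
genes, score = greedy_subnetwork("A", edges, z, labels)
print(genes, round(score, 3))
```

In this toy case the combined activity of two neighbouring genes separates the classes better than either gene alone, which is the intuition behind scoring subnetworks rather than individual genes.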
The authors identified 149 discriminative subnetworks in the van de Vijver data and 243 in the Wang data. Importantly, the subnetworks were more reproducible between the two data sets than the original metastasis gene-expression signatures isolated in each study, and were also more accurate. Moreover, the subnetworks can identify proteins that are not subject to changes in gene-expression profile, such as Myc and cyclin D1, but still contribute to metastasis development; almost all of the subnetworks contained at least one of these proteins, and many of the subnetworks also contained known cancer susceptibility genes, such as HRAS and TP53. And, once a network is identified, the biological relevance of the proteins in question can also be more easily established.