Michiels S et al. (2005) Prediction of cancer outcome with microarrays: a multiple random validation strategy. Lancet 365: 488–492

Gene-expression profiling aims to classify patients according to a 'molecular signature' derived from microarray analysis. In the field of oncology, this approach promises to identify genes that are differentially expressed in tumors with different outcomes. Treatment strategies can then be tailored to the patient, based on the gene-expression profile.

Because this type of analysis involves large amounts of data and several possible methods for classifying patients, some findings might not be robust. Michiels and co-workers reanalyzed data from the seven largest studies in this area and concluded that 'the prognostic value of published microarray results in cancer studies should be considered with caution'.

Each study included at least 60 patients and provided data on disease-free, event-free or overall survival. Using data from multiple, random sets of patients derived from the original training and validation sets, Michiels et al. examined the stability of the molecular signature and the extent to which patients were misclassified. In each case, they defined 'favorable' and 'unfavorable' expression profiles based on the 50 signature genes for which expression most closely predicted outcome.
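
The resampling idea described here can be illustrated with a short Python sketch. It is not the authors' code: the function names (`top_genes`, `multiple_random_validation`), the binary coding of outcome (0 = favorable, 1 = unfavorable), the use of absolute Pearson correlation to rank genes, and the nearest-centroid classification rule are all assumptions made for illustration, and the sketch ignores censoring in survival data.

```python
import numpy as np

def top_genes(X, y, k=50):
    """Indices of the k genes whose expression correlates most strongly
    (absolute Pearson correlation) with a binary outcome y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom = np.where(denom == 0, 1.0, denom)  # guard against constant genes
    corr = (Xc.T @ yc) / denom
    return np.argsort(-np.abs(corr))[:k]

def multiple_random_validation(X, y, n_train, n_splits=100, k=50, seed=0):
    """Repeatedly split patients at random into training and validation
    sets, rebuild the signature on each training set, and record the
    misclassification rate on the matching validation set.

    X: (patients x genes) expression matrix; y: 0/1 outcome per patient.
    Returns the array of error rates and the list of signatures (gene sets).
    """
    rng = np.random.default_rng(seed)
    errors, signatures = [], []
    for _ in range(n_splits):
        perm = rng.permutation(len(y))
        train, test = perm[:n_train], perm[n_train:]
        if len(np.unique(y[train])) < 2:
            continue  # a signature cannot be built without both outcomes
        genes = top_genes(X[train], y[train], k)
        # Assumed rule: assign each validation patient to the nearer of the
        # two training-set mean profiles over the signature genes.
        fav = X[train][y[train] == 0][:, genes].mean(axis=0)
        unfav = X[train][y[train] == 1][:, genes].mean(axis=0)
        Xt = X[test][:, genes]
        d_fav = ((Xt - fav) ** 2).sum(axis=1)
        d_unfav = ((Xt - unfav) ** 2).sum(axis=1)
        pred = (d_unfav < d_fav).astype(int)
        errors.append(np.mean(pred != y[test]))
        signatures.append(set(genes.tolist()))
    return np.array(errors), signatures
```

Because the signature is rebuilt on every training set, gene selection never sees the validation patients. The spread of the returned error rates then indicates how much the apparent accuracy depends on which patients happened to be in the training set.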

The molecular signature was highly unstable; the list of genes that appeared to predict outcome varied greatly, depending on which patients were included in the training sets. In five of the seven studies, classification of patients was no more accurate than would have been expected by chance. Michiels et al. recommend that future studies include larger samples and use repeated random sampling for validation.
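
One way to put a number on this kind of instability is to compare the gene lists obtained from different random training sets. The Python sketch below is illustrative only: the two measures (per-gene selection frequency and mean pairwise Jaccard overlap) and the function names are assumptions, not necessarily the statistics reported by Michiels et al. It expects at least two non-empty signatures, each given as a set of gene identifiers.

```python
from collections import Counter
from itertools import combinations

def selection_frequency(signatures):
    """Fraction of resampled signatures in which each gene appears."""
    counts = Counter(gene for sig in signatures for gene in sig)
    return {gene: c / len(signatures) for gene, c in counts.items()}

def mean_pairwise_overlap(signatures):
    """Average Jaccard overlap (shared genes / all genes) over every
    pair of signatures; 1.0 means identical lists, 0.0 means disjoint."""
    pairs = list(combinations(signatures, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)
```

A stable signature would show most genes selected in nearly every resample and a pairwise overlap close to 1. An unstable signature shows the opposite pattern, with few genes recurring and a low overlap.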