Original Paper
Oncogene (2005) 24, 6155–6164. doi:10.1038/sj.onc.1208984; published online 1 August 2005
Colon cancer prognosis prediction by gene expression profiling
Alain Barrier1,2,3, Antoinette Lemoine4, Pierre-Yves Boelle2, Chantal Tse5, Didier Brault5, Franck Chiappini4, Julia Breittschneider6, François Lacaine1, Sidney Houry1, Michel Huguier1, Mark J Van der Laan3, Terry Speed6, Brigitte Debuire4, Antoine Flahault2 and Sandrine Dudoit3
- 1Service de Chirurgie Digestive, Hôpital Tenon, Université Pierre et Marie Curie, Assistance Publique, 75020 Paris, France
- 2INSERM U444, Faculté de Médecine Saint-Antoine, Université Pierre et Marie Curie, 75571 Paris Cedex 12, France
- 3Division of Biostatistics, School of Public Health, University of California, 140 Earl Warren Hall, Berkeley, CA 94720-7360, USA
- 4INSERM U602, Service de Biochimie, Hôpital Paul Brousse, Université Paris XI, 94800 Villejuif, France
- 5Service de Biochimie, Hôpital Tenon, Université Pierre et Marie Curie, Assistance Publique, 75020 Paris, France
- 6Department of Statistics, University of California, Berkeley, CA 94720-7360, USA
Correspondence: A Barrier, E-mail: barrier@stat.Berkeley.edu
Received 18 February 2005; Revised 8 June 2005; Accepted 23 June 2005; Published online 1 August 2005.
Abstract
This study assessed the feasibility of building a prognosis predictor, based on microarray gene expression measures, for stage II and III colon cancer patients. Tumour (T) and non-neoplastic mucosa (NM) mRNA samples from 18 patients (nine with a recurrence, nine without) were profiled using the Affymetrix HGU133A GeneChip. The k-nearest neighbour method was used for prognosis prediction from T and NM gene expression measures. Six-fold cross-validation was applied to select the number of neighbours and the number of informative genes to include in the predictors. Based on this information, one T-based and one NM-based predictor were proposed, and their accuracies were estimated by double cross-validation. In six-fold cross-validation, the smallest numbers of informative genes achieving the minimum number of false predictions (two out of 18) were 30 for the T and 70 for the NM gene expression measures. A 30-gene T-based predictor and a 70-gene NM-based predictor were then built, with estimated accuracies of 78% and 83%, respectively. This study suggests that an accurate prognosis predictor for stage II and III colon cancer patients can be built from gene expression measures, using either tumour or non-neoplastic mucosa.
Keywords: functional genomics, colon cancer, prognosis prediction, non-neoplastic mucosa, cross-validation
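The procedure summarised in the abstract (k-nearest neighbour prediction, six-fold cross-validation to choose the number of neighbours and of informative genes, then double cross-validation to estimate accuracy) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The data are simulated, and several choices are assumptions that the abstract does not specify: the gene-ranking score (difference of class means over summed standard deviations), Euclidean distance, class-balanced folds, the candidate grid, and the reading of "double cross-validation" as an outer loop that estimates accuracy around an inner loop that selects parameters. All function names are hypothetical.

```python
import random

def mean(v):
    return sum(v) / len(v)

def sd(v):
    m = mean(v)
    return (sum((x - m) ** 2 for x in v) / len(v)) ** 0.5

def simulate(n_per_class=9, n_genes=200, n_informative=20, shift=2.0, seed=1):
    """Synthetic expression matrix (assumption): only the first genes differ by class."""
    rng = random.Random(seed)
    y = [0] * n_per_class + [1] * n_per_class   # 0 = no recurrence, 1 = recurrence
    X = [[rng.gauss(shift * label if g < n_informative else 0.0, 1.0)
          for g in range(n_genes)] for label in y]
    return X, y

def make_folds(samples, y, n_folds, rng):
    """Deal the shuffled members of each class round-robin into folds."""
    folds = [[] for _ in range(n_folds)]
    pos = 0
    for cls in (0, 1):
        members = [i for i in samples if y[i] == cls]
        rng.shuffle(members)
        for i in members:
            folds[pos % n_folds].append(i)
            pos += 1
    return folds

def rank_genes(X, y, train):
    """Order genes from most to least informative (assumed score), training samples only."""
    scores = []
    for g in range(len(X[0])):
        a = [X[i][g] for i in train if y[i] == 0]
        b = [X[i][g] for i in train if y[i] == 1]
        scores.append((abs(mean(a) - mean(b)) / (sd(a) + sd(b) + 1e-9), g))
    return [g for _, g in sorted(scores, reverse=True)]

def knn_predict(X, y, train, test_i, genes, k):
    """Majority vote among the k nearest training samples (squared Euclidean distance)."""
    dists = sorted((sum((X[i][g] - X[test_i][g]) ** 2 for g in genes), y[i])
                   for i in train)
    votes = sum(label for _, label in dists[:k])
    return 1 if 2 * votes > k else 0

def cv_errors(X, y, samples, grid, n_folds, rng):
    """Number of false predictions for every (k, n_genes) pair in the grid."""
    errors = {p: 0 for p in grid}
    for fold in make_folds(samples, y, n_folds, rng):
        train = [i for i in samples if i not in fold]
        ranked = rank_genes(X, y, train)   # gene selection is redone inside each fold
        for k, n_genes in grid:
            genes = ranked[:n_genes]
            errors[(k, n_genes)] += sum(
                knn_predict(X, y, train, i, genes, k) != y[i] for i in fold)
    return errors

def select_params(errors):
    """Fewest errors first; ties go to fewer genes, then fewer neighbours."""
    return min(errors, key=lambda p: (errors[p], p[1], p[0]))

def double_cv(X, y, grid, n_folds=6, seed=0):
    """Outer loop estimates accuracy; parameters are chosen by an inner CV on training data."""
    samples = list(range(len(y)))
    wrong = 0
    for fold in make_folds(samples, y, n_folds, random.Random(seed)):
        train = [i for i in samples if i not in fold]
        inner = cv_errors(X, y, train, grid, n_folds, random.Random(seed + 1))
        k, n_genes = select_params(inner)
        genes = rank_genes(X, y, train)[:n_genes]
        wrong += sum(knn_predict(X, y, train, i, genes, k) != y[i] for i in fold)
    return 1 - wrong / len(y)

X, y = simulate()
grid = [(k, n) for k in (1, 3, 5) for n in (10, 30, 70)]   # odd k avoids tied votes
errors = cv_errors(X, y, list(range(len(y))), grid, 6, random.Random(0))
print("selected (k, n_genes):", select_params(errors))
print("double cross-validation accuracy:", round(double_cv(X, y, grid), 2))
```

The sketch keeps parameter selection and accuracy estimation separate because an error count that was itself used to choose the number of neighbours and genes tends to be optimistic; the outer loop scores each held-out patient with a predictor that was tuned without that patient.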
