A universal molecular prognostic score for gastrointestinal tumors

Shimizu, Hideyuki; Nakayama, Keiichi I.

doi:10.1038/s41525-021-00172-1

Download PDF

Article
Open access
Published: 04 February 2021

A universal molecular prognostic score for gastrointestinal tumors

npj Genomic Medicine volume 6, Article number: 6 (2021) Cite this article

1682 Accesses
5 Citations
1 Altmetric
Metrics details

Subjects

Abstract

Colorectal and gastric cancers are a leading cause of cancer deaths in developed countries. Precise estimation of prognosis is important with regard to clinical decision making for individuals with such cancers. We here comprehensively compiled a complete atlas of prognostic genes based on an integrated meta-analysis of one of the largest assembled colorectal cancer cohorts. A simple yet robust machine learning approach was then applied to establish a universal molecular prognostic score (mPS_colon) that relies on the expression status of only 16 genes and which was validated with independent data sets. This score was found to be an independent prognostic indicator in multivariate models including cancer stage, to be valid independent of tumor characteristics or patient ethnicity, and to be also applicable to gastric cancer. We conclude that mPS_colon is a universal prognostic classifier for patients with gastrointestinal cancers and that it should prove informative for optimization of personalized therapy for such patients.

RNA expression of 6 genes from metastatic mucosal gastric cancer serves as the global prognostic marker for gastric cancer with functional validation

Article Open access 11 March 2024

Yun-Suhk Suh, Jieun Lee, … Charles Lee

Development and validation of a prognostic and predictive 32-gene signature for gastric cancer

Article Open access 09 February 2022

Jae-Ho Cheong, Sam C. Wang, … Tae Hyun Hwang

Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta

Article Open access 19 April 2023

Neha Shree Maurya, Shikha Kushwah, … Ashutosh Mani

Introduction

Cancer is the leading cause of death in developed countries, with an estimated 1.8 million new cases expected in the United States alone in 2020¹. Among the many types of cancer, gastrointestinal cancers, including colorectal cancer (CRC) and gastric cancer, are among the most prevalent worldwide².

Only a few biomarkers—including mutant KRAS, mutant BRAF, and microsatellite instability (MSI)—are currently recommended by expert panels for estimation of prognosis in CRC³, with the result that most patients receive similar treatment. The therapeutic strategy for CRC has thus been TNM staging, surgery, and chemotherapy. Current TNM criteria, however, may give rise to substantial under- or overtreatment of individuals with CRC. In addition, despite their receipt of similar treatment, CRC patients at the same stage show a wide range of outcomes. We hypothesized that such a difference in clinical outcome might be related to diverse transcriptome profiles of tumors. Identification of the molecular features of CRC that determine patient prognosis and stratification of patients on the basis of these features might be expected to inform the development of more effective clinical strategies and personalized therapies.

There is thus a growing need for new and efficient biomarkers to ensure optimal treatment of CRC patients. An ideal biomarker should be readily translated into clinical practice, identify patients who can be spared treatment or who can benefit from therapy, and, ultimately, support the implementation of precision medicine. Many analyses of CRC transcriptomes have been performed, and consensus molecular subtypes (CMSs) have recently been proposed by a multicenter initiative that undertook a comprehensive and cross-sectional comparison of such transcriptomes⁴. Almost all CRC tumors can thus be classified into one of four subtypes that show substantial biological differences: CMS1 is characterized by a high mutation rate, MSI, and pronounced activation of the immune system; CMS2 is epithelial in nature and manifests activation of WNT and MYC signaling pathways; CMS3 is also epithelial and shows overt metabolic dysregulation; and CMS4 exhibits marked transforming growth factor–β activity, stromal invasion, and angiogenesis⁴. However, CMS classification is not suited to prognostication, given that only CMS4 patients show a significant difference in overall survival (OS) and disease-free survival (DFS), with no prognostic difference being apparent among the other three subtypes⁴. These results highlight the need for a data-driven approach that does not rely on known biological findings and is focused on clinical outcomes.

Several previous studies have identified gene expression signatures that have a prognostic impact in individuals with stage 2 or 3 CRC. ColoGuide EX can stratify the prognosis of CRC patients on the basis of the expression levels of 13 genes⁵. However, this scoring system is specific to stage 2 patients and stratifies them into only two groups⁵. Oncotype DX Colon Recurrence Score is a 12 gene-based classifier that is applicable either to patients with stage 2 tumors that are mismatch repair (MMR) proficient or to those with stage 3 cancer^6,7. The major drawback of this latter system is that it is protected by patents and only available with the use of an expensive test kit.

We recently developed the molecular Prognostic Score (mPS), a machine learning-based method for stratifying the prognosis of breast cancer patients⁸ on the basis of the expression levels of only 23 genes. Unlike existing prognostic classifiers for breast cancer (such as MammaPrint⁹ and Oncotype¹⁰), mPS is a universal indicator that can stratify prognosis regardless of breast cancer subtype or stage. We, therefore, aimed to develop a similarly simple but the accurate method to estimate the prognosis of CRC patients.

In this study, we exploited a large and multicenter series of gastrointestinal cancer samples to establish a robust molecular classification method on the basis of their transcriptomic profiles. We first identified prognostic genes differentially expressed between CRC and surrounding normal mucosa. We then trained a simple yet robust machine learning model and finally developed a 16-gene classifier, which we termed mPS for colon cancer (mPS_colon). We also demonstrate the applicability of mPS_colon to gastric cancer. mPS_colon is a universal prognostic indicator for gastrointestinal tumors across cancer types and patient ethnicities.

Results

Identification of differentially expressed and prognostic genes in CRC

We first identified genes that met both the following criteria: (1) genes that are differentially expressed between normal mucosa and CRC tissue (DEGs), and (2) genes that are associated with patient prognosis. We surveyed 4123 DEGs in the TCGA-COAD (CRC cohort of The Cancer Genome Atlas) data set (Supplementary Data 1). For the integrated identification of prognostic genes across different CRC cohorts, we collected eight public CRC data sets for which gene expression and prognostic data were available (Supplementary Data 2). We divided CRC patients of each cohort into two groups according to the median expression level for each identified DEG, and the hazard ratios (HRs) for relapse-free survival (RFS) were combined in a random-effects model (meta-analysis). The application of these two consecutive filters resulted in the identification of 77 prognostic and differentially expressed genes for CRC patients (Supplementary Data 3).

Establishment of a 16 gene-based predictive score, mPS for CRC

We next attempted to stratify the prognosis of CRC patients according to the expression levels of these 77 genes. We applied a simple yet robust machine learning method known as lasso regression to TCGA-COAD training data in order to predict 5-year DFS (Fig. 1a). This approach resulted in the extraction of 16 of the 77 genes that are important for estimation of prognosis and provided weight for each (Table 1). We named this weighted summation mPS_colon, and it was designed to range from 0 to 50, with an average of 25.05 (Fig. 1b). A representative example of mPS_colon calculation is shown in Supplementary Fig. 1a.

**Fig. 1: mPS_colon provides a simple and robust estimation of prognosis for colorectal cancer (CRC) patients.**

Table 1 The 16 genes necessary and sufficient for calculation of mPS_colon.

Full size table

On the basis of this scoring method, we stratified CRC patients in the training data set into four groups (<10, 10–25, 25–40, and >40), and we found that the higher the score, the more likely the patients were to experience relapse (Supplementary Fig. 1b). To test the robustness of mPS_colon, we adopted another independent cohort, GSE39582¹¹, which is the largest publicly available CRC data set, for external validation. We found that mPS_colon also stratified RFS in this validation cohort (Fig. 1c), and we, therefore, concluded that mPS_colon is a robust prognostic indicator for CRC patients.

mPS_colon is a universal prognostic scoring system

CRC has been well investigated as a model of multistage carcinogenesis¹². In particular, about half of CRC tumors have been found to harbor TP53 mutations, with the frequency of such mutations being higher in cancers of the distal colon and rectum than in those of the proximal colon¹³. In accordance with this observation, 53.75% of CRC patients in the TCGA cohort harbor TP53 mutations. Any prognostic indicator would therefore need to be applicable to TP53 mutation-positive patients. The providers of the GSE39582 data set have proposed molecular subtypes (C1–C6) of CRC based on unsupervised consensus hierarchical clustering¹¹. However, this classification system is not able to stratify patients with TP53 mutations with regard to RFS (Fig. 2a). In contrast, mPS_colon was able to stratify CRC patients harboring these mutations (Fig. 2b).

**Fig. 2: mPS_colon stratifies colorectal cancer patients in the GSE39582 cohort regardless of molecular status.**

Mutations in KRAS and BRAF are also frequently present in CRC patients¹⁴. We surveyed patients in the GSE39582 cohort for the seven most frequent mutations in codons 12 or 13 of KRAS¹⁵ and for the c.1799T > A (p.V600E) mutation of BRAF, and we found that mPS_colon was also able to stratify the prognosis of CRC patients with these mutations (Fig. 2c, d).

Among molecular markers developed previously for CRC characterization and prognosis estimation, MSI dependent largely on deficient MMR is the only one reproducibly found to be a significant prognostic factor in early CRC by both a meta-analysis and a prospective trial^3,16. We found that mPS_colon could stratify the prognosis of CRC patients with either MMR-deficient (dMMR) (Fig. 2e) or MMR-proficient (pMMR) (Supplementary Fig. 2a) tumors. Of note, whereas Oncotype DX is not applicable to stage 2 patients with dMMR tumors, mPS_colon was able to stratify these patients (Fig. 2f), suggestive of broader applicability of mPS_colon.

Other molecular features of CRC include its CpG island methylator phenotype (CIMP) and chromosome instability (CIN). CIMP is an indicator based on DNA methylation status, whereas CIN is based on chromosomal aberrations¹⁷. The prognosis of subsets of CRC patients classified according to CIMP (Supplementary Fig. 2b, c) or CIN (Supplementary Fig. 2d, e) could be further stratified by mPS_colon.

We found that mPS_colon could also stratify prognosis in both young (<60 years old) (Fig. 3a) and elderly (>75 years) (Fig. 3b) CRC patients. In addition, the prognosis for patients of TNM stage 3 (Fig. 3c) or stage 4 (Fig. 3d) was stratified by mPS_colon. These various findings indicate that mPS_colon is a universally applicable prognostic indicator without regard to molecular subtypes and macroscopic clinical features of CRC.

**Fig. 3: mPS_colon stratifies colorectal cancer patients regardless of age or disease stage.**

To assess whether mPS_colon is a prognostic indicator for CRC patients independent of various clinicopathologic features, we performed the multivariate analysis with the use of a Cox proportional hazard model including patient age and sex, disease stage, TP53 and KRAS mutation status, MMR status, and molecular subtypes proposed by the providers of the GSE39582 data set (Fig. 4a, Table 2). We found that mPS_colon was able to stratify the recurrence risk of CRC patients independently of these features, with its impact on RFS being similar to that of TNM stage classification. Of note, mPS_colon was the most powerful prognostic factor among the known microscopic indicators (TP53 mutation, KRAS mutation, MMR status, and molecular subtypes).

**Fig. 4: mPS_colon is an independent prognostic indicator.**

Table 2 Univariate and multivariate analyses for relapse-free survival of metastasis-free patients with colorectal cancer in the GSE39582 cohort.

Full size table

Stage 2 and 3 patients often receive adjuvant chemotherapy. We found that CRC patients with a high mPS_colon score at either of these stages have a higher recurrence rate after chemotherapy compared with those with a low score (Fig. 4b, c). These results suggest that mPS_colon also has the potential to identify high-risk patients among those receiving adjuvant chemotherapy and that such patients with a high mPS_colon score may need more frequent follow-up examinations or additional therapy, or both.

mPS_colon stratifies patients with gastric cancer

Stomach cancer is responsible for an estimated 783,000 deaths annually worldwide, making it the third leading cause of cancer deaths¹⁸. The stomach and colon share a similar embryological origin and similar cancer histopathology. Although gastric cancer is the second most common gastrointestinal cancer after CRC, no molecular score for stratification of the prognosis of gastric cancer patients has been adopted clinically.

We hypothesized that mPS_colon, which was trained with data for the CRC cohort (mostly Caucasian) of TCGA, might also be applicable to gastric cancer, given the common characteristics shared by gastrointestinal cancers. Application of mPS_colon to patients in the gastric cancer cohort of TCGA (TCGA-STAD) without metastasis revealed that both DFS (Fig. 5a) and OS (Fig. 5b) were significantly stratified. These results thus indicate that mPS_colon also partially stratifies gastric cancer patients.

**Fig. 5: mPS_colon universally stratifies patients with gastrointestinal tumors.**

Given that the incidence of gastric cancer shows regional differences, being highest in East Asia¹⁸, we also assessed the utility of mPS_colon with the GSE84437 cohort, which is the largest gastric cancer data set currently available to the public and was derived from patients in South Korea. We found that mPS_colon also stratified the prognosis of these East Asian patients (Fig. 5c). Our results thus suggested that mPS_colon is a simple yet robust classifier for gastrointestinal tumors across cancer types and patient ethnicities.

Discussion

In this study, we made use of public data to comprehensively identify prognosis-related genes with a meta-analysis of >1200 patients and we then developed a universal prognostic classifier for gastrointestinal cancers, mPS_colon, with the use of machine learning technology. Although it is simple to apply, mPS_colon is a robust prognostic indicator for gastrointestinal tumors across cancer types and patient ethnicities.

This score is calculated from the binary expression status of only 16 genes. It is of note that well-characterized genes associated with CRC, including APC and KRAS, are not included among the prognostic genes we identified, suggestive of strong experimenter bias in previous studies that focused mostly on mutation status. In contrast, we adopted a strategy to identify in systematic manner genes that are most associated with patient prognosis at the transcriptome level with an unbiased machine learning approach. Given that mPS_colon was developed in a manner independent of the mechanistic contribution of each gene, most of the 16 genes identified in our study are not well characterized with regard to how they might affect the prognosis of patients with gastrointestinal cancers.

The biological relevance of these novel genes remains to be determined, but a few studies have suggested possible relations between some of these genes and disease. The nicotinic cholinergic receptor gene CHRNA3 is highly expressed in the human colon and small intestine¹⁹; the product of CCT2 was shown to cooperate with Gli1 and Hedgehog in the development of CRC²⁰; CEP55 contributes to a feedback loop with the master transcription factor FOXM1 in malignant transformation²¹; and the prognosis of patients with a high level of TYRP1 expression in their tumors was found to be poor in a small CRC cohort²², consistent with our present findings. No previous biological analysis has examined a possible relationship between CRC and seven of the 16 genes (PRSS53, KIFC3, BTBD16, TESK2, LYPD6, F2RL1, and SIAE). Further studies are thus warranted to elucidate the biological relevance of each of the 16 genes in the context of CRC.

The 16 prognostic genes found here to be related to colorectal and gastric cancer show no overlap with the 23 prognostic genes that we previously identified for breast cancer by a similar data-driven approach⁸. The transcriptomic characteristics of cancer would not be expected to be highly tissue-specific, given the common mechanisms thought to contribute to the development of various cancer types. Again, it should be emphasized, however, that most of the prognostic genes we identified in both the present and our previous⁸ study have not been extensively investigated in the field of cancer research. Further advances in unbiased methodology, together with the increasing accumulation of data, should lead to a more precise stratification of cancer patients for personalized medicine.

Although mPS_colon is a simple 16 gene-based classifier, it is substantially superior to the molecular subtypes proposed by the providers of the GSE39582 data set, especially for patients harboring TP53 mutations (Fig. 2a, b). The simplicity of mPS_colon would minimize the effort required for and cost of its application to clinical practice. We have also developed a Web tool to help clinicians who are unfamiliar with computational skills to perform the necessary calculations (https://hideyukishimizu.github.io/mPS_GI). The model we present is highly interpretable, making it easy to identify transcriptome patterns that may increase the risk for disease recurrence. The development of drugs that target the identified prognostic genes may reduce such risk in the future.

Other transcriptome-based methods, such as ColoGuide EX, have been developed for the prediction of the prognosis of patients with CRC. However, whereas these previous methods categorize patients into binary classes, mPS_colon is able to stratify patients into several groups. Furthermore, whereas many of the existing methods can be applied only to patients at specific stages of the disease or require certain platforms (as is the case for Oncotype DX Colon), our scoring system is applicable to a broader range of patients and is platform-independent, with both RNA-sequencing data (TCGA-COAD and TCGA-STAD) and microarray data (GSE39582 and GSE84437) being acceptable.

There are three main limitations of our study. First, all analyses were performed in a retrospective manner. Demonstration of the effectiveness of our prognostic stratification method for clinical use will require prospective evaluation of patients with gastrointestinal cancers and their prognosis for 10 years. Second, some information may be lost in the binarization process, which we used to maximize available data and to develop a platform-independent score. The availability of more RNA-sequencing data associated with clinical outcomes in the future may allow the application of other processes (such as meta-regression) that produce a better prognosis classifier. Third, although mPS_colon can stratify patient outcomes, it remains difficult to determine the best treatment for each group of patients. This problem is not limited to mPS_colon, however. It applies to all prognostic methods in the developmental stage, with the determination of an optimal treatment for each patient group generally requiring many years of further study. In breast cancer, for example, attempts to stratify patient prognosis on the basis of molecular markers have a relatively long history. A representative method, MammaPrint, was proposed in 2002²³, but interventions for the poor and good prognosis groups were not proposed until 2016, after the performance of several clinical trials²⁴. Our mPS_colon will also undergo multiple validation and intervention trials in the future in order to determine the appropriate course of treatment for each category of patients.

In summary, we have developed a universal prognostic indicator for gastrointestinal cancers that is based on the expression status of only 16 genes. The resulting score, mPS_colon, is able to stratify patients with gastrointestinal cancers, and further detailed characterization of each group of such patients categorized by mPS_colon may provide clues for future personalized medicine. In addition, many of the 16 prognostic genes identified have not been characterized in the context of colorectal or gastric cancer, and further studies of these genes may therefore provide insight into the development and progression of gastrointestinal cancers.

Methods

Study design and data sources

An overview of the development of the mPS_colon system is shown in Fig. 1a. We performed an integrated retrospective analysis of nine independent CRC cohorts for the establishment of mPS_colon. The initial analysis (Supplementary Data 1) was conducted with the TCGA-COAD data set (n = 382)²⁵. We then performed a meta-analysis (random-effects model) to determine prognosis-related genes in a large combined multicenter cohort consisting of eight international CRC data sets (GSE14333²⁶, n = 290; GSE17536²⁷, n = 177; GSE29623²⁸, n = 65; GSE33113²⁹, n = 90; GSE37892³⁰, n = 130; GSE38832³¹, n = 122; GSE72970³², n = 124; and GSE87211³³, n = 203) that include 1201 patients (Supplementary Data 2). These cohorts were selected because of their inclusion of a substantial number of CRC patients (>50 each) with both clinicopathologic and prognostic data available. We then adopted half of the TCGA-COAD cohort as a training set (TCGA-COAD-training) and the other half as an internal validation set (TCGA-COAD-validation) for machine learning to develop mPS_colon (Fig. 1a). For external validation, another independent data set, GSE39582¹¹ (n = 585), which is the largest public CRC data set available, was used. We extracted data with an RFS > 0 (n = 536). The clinicopathologic characteristics of each cohort are described in the original reports^{11,25,26,27,28,29,30,31,32,33}. Molecular subtypes for the GSE39582 cohort were calculated by the providers and included in the public data set¹¹. For the application of mPS_colon to gastric cancer, we adopted the stomach cancer data set of TCGA (TCGA-STAD, n = 415) together with the largest gastric cancer data set publicly available (GSE84437, n = 433³⁴, which was established by researchers at Yonsei University in South Korea. We extracted data with a DFS > 0 (TCGA-STAD, n = 291; GSE84437, n = 431). Both the GSE39582 (CRC) and TCGA-STAD (stomach cancer) cohorts have a sufficient number of patients to achieve a statistical power of at least 80%.

Predictive modeling

We first identified DEGs between normal mucosa and CRC samples deposited in TCGA-COAD (Supplementary Data 1). We performed this analysis with the TCGAbiolinks R package as recommended by the developer. In brief, we fitted a negative binomial generalized log-linear model to the read counts for each gene (TCGAanalyze_DEA function with method = ‘glmLRT’ option). We defined DEGs as genes with a false discovery rate (FDR) of <0.01 and absolute log₂[fold change] of >1, meaning that the expression level differs by a factor of >2 or <0.5 between normal and cancer samples. We did not use the data in the other columns of Supplementary Data 1, including log[CPM], to define DEGs.

We next downloaded eight public CRC data sets from GEO (Supplementary Data 2) and examined the relation between DEGs and prognosis by meta-analysis (random-effects model). For this analysis, we used the median value as the cutoff between low and high expression levels for each DEG in each cohort. A total of 77 genes was identified after the application of these two consecutive filters (Supplementary Data 3).

We then used the TCGA-COAD data set for the establishment of the prognostic classifier. The expression status (X) of the 77 genes was first transformed to “Gene_Score” on the basis of the expression level and integrated HR for each gene with the following step function (Eq. (1))

$${\rm{Gene}}\_{\rm{Score}} = \left\{ {\begin{array}{*{20}{c}} {1,\,{\rm{if}}\,X\,{\rm{is}}\,{\rm{LOW}}\,{\rm{and}}\,{\rm{integrated}}\,{\rm{HR}} < 1} \\ {1,\,{\rm{if}}\,X\,{\rm{is}}\,{\rm{HIGH}}\,{\rm{and}}\,{\rm{integrated}}\,{\rm{HR}} > 1} \\ {0,\,{\rm{if}}\,X\,{\rm{is}}\,{\rm{LOW}}\,{\rm{and}}\,{\rm{integrated}}\,{\rm{HR}} > 1} \\ {0,\,{\rm{if}}\,X\,{\rm{is}}\,{\rm{HIGH}}\,{\rm{and}}\,{\rm{integrated}}\,{\rm{HR}} < 1} \end{array}} \right.$$

(1)

We truncated the clinical information to 5 years (60 months), built a simple learning algorithm, L1 (lasso) regression, and trained the model to predict the defined period, 60 minus DFS in months. Lasso regression tends to be unstable when the sample size is limited compared with the number of variables. We, therefore, narrowed down the promising genes by the application of two consecutive filters before lasso regression. A similar approach (DEGs + meta-analysis followed by lasso) was also adopted in a recent study³⁵.

We generated the machine learning model with the use of the Python-based Keras library. It extracted 16 genes necessary to predict the prognosis of CRC patients. We defined Gene_Weight of these 16 genes in order to maximize the value to 50. Importantly, in this initial training, we used data only from CRC patients. However, we found that the 16 gene-based molecular score is also applicable to stomach cancer patients.

Statistics

Kaplan–Meier plots were generated with the survival R package. In general, we used a four-way split for the survival curves. When the number of patients in each category was small, we used a two-way split, with the exception of the data in Fig. 3d (which were split four ways for easy comparison with Fig. 3c). We used the meta R package for meta-analysis (random-effects model). The l1 lambda parameter of the lasso regression was set to 0.5 because with this value among the tested values (0.01, 0.05, 0.1, 0.5, and 1) the model yielded the best prediction with the TCGA-COAD-validation cohort. For the external validation of mPS_colon, we truncated the survival data at 10 years. We computed time from the date of diagnosis to the date of the event. Survival outcomes were compared with the log-rank test. The HR and its 95% confidence interval (CI) were calculated by univariate or multivariate Cox regression. Statistical significance was determined at a two-sided P value of <0.05, with the exception of the initial RNA-sequencing data analysis (FDR of <0.01, with the use of the TCGAbiolinks R package³⁶) and subsequent meta-analysis (P value of <0.01).

Ethics

Ethical approval was not needed because the datasets are publicly available.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

All the data analyzed in this study were downloaded from cBioPortal or GEO. The links for each data are provided in Supplementary Data 2.

Code availability

Algorithms used for the creation of mPS_colon are detailed in the “Methods” section. A Web-based tool we generated in this study is freely available on our GitHub page.

References

Siegel, R. L., Miller, K. D. & Jemal, A. Cancer statistics, 2020. CA Cancer J. Clin. 70, 7–30 (2020).
Article Google Scholar
Torre, L. A. et al. Global cancer incidence and mortality rates and trends—an update. Cancer Epidemiol. Biomark. Prev. 25, 16–27 (2016).
Article Google Scholar
Hutchins, G. et al. Value of mismatch repair, KRAS, and BRAF mutations in predicting recurrence and benefits from chemotherapy in colorectal cancer. J. Clin. Oncol. 29, 1261–1270 (2011).
Article Google Scholar
Guinney, J. et al. The consensus molecular subtypes of colorectal cancer. Nat. Med. 21, 1350–1356 (2015).
Article CAS Google Scholar
Agesen, T. H. et al. ColoGuideEx: a robust gene classifier specific for stage II colorectal cancer prognosis. Gut 61, 1560–1567 (2012).
Article Google Scholar
Clark-Langone, K. M. et al. Translating cancer biology into personalized treatment planning: analytical performance characteristics of the Oncotype DX Colon Cancer Assay. BMC Cancer 10, 691 (2010).
Article CAS Google Scholar
Gray, R. G. et al. Validation study of a quantitative multigene reverse transcriptase-polymerase chain reaction assay for assessment of recurrence risk in patients with stage II colon cancer. J. Clin. Oncol. 29, 4611–4619 (2011).
Article Google Scholar
Shimizu, H. & Nakayama, K. I. A 23 gene-based molecular prognostic score precisely predicts overall survival of breast cancer patients. EBioMedicine 46, 150–159 (2019).
Article Google Scholar
Slodkowska, E. A. & Ross, J. S. MammaPrint 70-gene signature: another milestone in personalized medical care for breast cancer patients. Expert Rev. Mol. Diagn. 9, 417–422 (2009).
Article Google Scholar
Breaux, A. et al. Impact of 21-gene expression assay on staging estrogen receptor-positive HER2-negative breast cancer. Clin. Breast Cancer 19, e261–e269 (2019).
Article CAS Google Scholar
Marisa, L. et al. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value. PLoS Med. 10, e1001453 (2013).
Article CAS Google Scholar
Fearon, E. R. Molecular genetics of colorectal cancer. Annu. Rev. Pathol. 6, 479–507 (2011).
Article CAS Google Scholar
Iacopetta, B. TP53 mutation in colorectal cancer. Hum. Mutat. 21, 271–276 (2003).
Article CAS Google Scholar
Cicenas, J. et al. KRAS, NRAS and BRAF mutations in colorectal cancer and melanoma. Med. Oncol. 34, 26 (2017).
Article Google Scholar
Lievre, A. et al. KRAS mutations as an independent prognostic factor in patients with advanced colorectal cancer treated with cetuximab. J. Clin. Oncol. 26, 374–379 (2008).
Article CAS Google Scholar
Popat, S., Hubner, R. & Houlston, R. S. Systematic review of microsatellite instability and colorectal cancer prognosis. J. Clin. Oncol. 23, 609–618 (2005).
Article CAS Google Scholar
Yiu, A. J. & Yiu, C. Y. Biomarkers in colorectal cancer. Anticancer Res. 36, 1093–1102 (2016).
CAS PubMed Google Scholar
Bray, F. et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 68, 394–424 (2018).
Article Google Scholar
Zhang, B. et al. Uncovering the transcriptomic and epigenomic landscape of nicotinic receptor genes in non-neuronal tissues. BMC Genomics 18, 439 (2017).
Article Google Scholar
Park, S. H. et al. Activating CCT2 triggers Gli-1 activation during hypoxic condition in colorectal cancer. Oncogene 39, 136–150 (2020).
Article CAS Google Scholar
Jeffery, J. et al. Beyond cytokinesis: the emerging roles of CEP55 in cancerigenesis. Oncogene 35, 683–690 (2016).
Article CAS Google Scholar
Hsu, Y. L. et al. Interaction between cancer-associated dendritic cells and colon cancer cells contributes to cancer progression via CXCL1. Int. J. Mol. Sci. 19, 2427 (2018).
Article Google Scholar
van de Vijver, M. J. et al. A gene-expression signature as a predictor of survival in breast cancer. N. Engl. J. Med. 347, 1999–2009 (2002).
Article Google Scholar
Cardoso, F. et al. 70-Gene signature as an aid to treatment decisions in early-stage breast cancer. N. Engl. J. Med. 375, 717–729 (2016).
Article CAS Google Scholar
TCGA Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337 (2012).
Article Google Scholar
Jorissen, R. N. et al. Metastasis-associated gene expression changes predict poor outcomes in patients with Dukes stage B and C colorectal cancer. Clin. Cancer Res. 15, 7642–7651 (2009).
Article CAS Google Scholar
Smith, J. J. et al. Experimentally derived metastasis gene expression profile predicts recurrence and death in patients with colon cancer. Gastroenterology 138, 958–968 (2010).
Article CAS Google Scholar
Chen, D. T. et al. Complementary strand microRNAs mediate acquisition of metastatic potential in colonic adenocarcinoma. J. Gastrointest. Surg. 16, 905–913 (2012).
Article Google Scholar
de Sousa, E. M. F. et al. Methylation of cancer-stem-cell-associated Wnt target genes predicts poor prognosis in colorectal cancer patients. Cell Stem Cell 9, 476–485 (2011).
Article Google Scholar
Laibe, S. et al. A seven-gene signature aggregates a subgroup of stage II colon cancers with stage III. OMICS 16, 560–565 (2012).
Article CAS Google Scholar
Tripathi, M. K. et al. Nuclear factor of activated T-cell activity is associated with metastatic capacity in colon cancer. Cancer Res. 74, 6947–6957 (2014).
Article CAS Google Scholar
Del Rio, M. et al. Molecular subtypes of metastatic colorectal cancer are associated with patient response to irinotecan-based therapies. Eur. J. Cancer 76, 68–75 (2017).
Article Google Scholar
Hu, Y. et al. Colorectal cancer susceptibility loci as predictive markers of rectal cancer prognosis after surgery. Genes Chromosomes Cancer 57, 140–149 (2018).
Article CAS Google Scholar
Dai, J. et al. Whole genome messenger RNA profiling identifies a novel signature to predict gastric cancer survival. Clin. Transl. Gastroenterol. 10, e00004 (2019).
Article Google Scholar
Prasad, B., Tian, Y. & Li, X. Large-scale analysis reveals gene signature for survival prediction in primary glioblastoma. Mol. Neurobiol. 57, 5235–5246 (2020).
Article CAS Google Scholar
Colaprico, A. et al. TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data. Nucleic Acids Res. 44, e71 (2016).
Article Google Scholar

Download references

Acknowledgements

This work was supported by KAKENHI grants from the Japan Society for the Promotion of Science (JSPS) to H.S. (19K20403) and to K.I.N. (18H05215). We thank members of our laboratory for discussion as well as A. Ohta for help with the preparation of the paper.

Author information

Authors and Affiliations

Department of Molecular and Cellular Biology, Medical Institute of Bioregulation, Kyushu University, Fukuoka, Japan
Hideyuki Shimizu & Keiichi I. Nakayama

Authors

Hideyuki Shimizu
View author publications
You can also search for this author in PubMed Google Scholar
Keiichi I. Nakayama
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

H.S. conceived the study, performed the data analysis, developed the mPS_colon, prepared the figures, and wrote the original draft of the paper. K.I.N. revised the draft and supervised the study.

Corresponding author

Correspondence to Keiichi I. Nakayama.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Data 1

Supplementary Data 2

Supplementary Data 3

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Shimizu, H., Nakayama, K.I. A universal molecular prognostic score for gastrointestinal tumors. npj Genom. Med. 6, 6 (2021). https://doi.org/10.1038/s41525-021-00172-1

Download citation

Received: 07 September 2020
Accepted: 06 January 2021
Published: 04 February 2021
DOI: https://doi.org/10.1038/s41525-021-00172-1

This article is cited by

Discovery of pathway-independent protein signatures associated with clinical outcome in human cancer cohorts
- Mariam M. Konaté
- Ming-Chung Li
- Yingdong Zhao
Scientific Reports (2022)