Abstract

Colorectal cancer (CRC) is a frequently lethal disease with heterogeneous outcomes and drug responses. To resolve inconsistencies among the reported gene expression–based CRC classifications and facilitate clinical translation, we formed an international consortium dedicated to large-scale data sharing and analytics across expert groups. We show marked interconnectivity between six independent classification systems coalescing into four consensus molecular subtypes (CMSs) with distinguishing features: CMS1 (microsatellite instability immune, 14%), hypermutated, microsatellite unstable and strong immune activation; CMS2 (canonical, 37%), epithelial, marked WNT and MYC signaling activation; CMS3 (metabolic, 13%), epithelial and evident metabolic dysregulation; and CMS4 (mesenchymal, 23%), prominent transforming growth factor–β activation, stromal invasion and angiogenesis. Samples with mixed features (13%) possibly represent a transition phenotype or intratumoral heterogeneity. We consider the CMS groups the most robust classification system currently available for CRC—with clear biological interpretability—and the basis for future clinical stratification and subtype-based targeted interventions.

Acknowledgements

The authors would like to acknowledge the goodwill and generosity of the colorectal research community who made this study possible. J.G. and S.H.F. are supported by the Integrative Cancer Biology Program of the National Cancer Institute (grant U54CA149237). R.D. is supported by La Caixa International Program for Cancer Research & Education. L.V. is supported by grants from the Dutch Cancer Society (UVA2011-4969 and UVA2014-7245), Worldwide Cancer Research (14-1164), the Maag Lever Darm Stichting (MLDS) (MLDS-CDG 14-03) and the European Research Council (ERG-StG 638193). J.P.M. is supported by grants from the Dutch Cancer Society (UVA2012-573, UVA2013-6331 and UVA2015-7587) and the MLDS (FP012). S.K. is supported by the US National Institutes of Health (grants R01CA172670, R01CA184843, R01 CA187238 and P30CA016672 (Biostatistic and Bioinformatic Core)). A. Sadanandam and G.N. acknowledge support from the National Health Service. S.T. is supported by the Katholieke Universiteit Leuven GOA/12/2106 grant, the EU FP7 Coltheres grant, the Research Foundation Flanders and the Belgian National Cancer Plan.

Author information

Author notes

    • Justin Guinney
    • Rodrigo Dienstmann
    • Xin Wang
    • Aurélien de Reyniès
    • Andreas Schlicker
    • Charlotte Soneson
    • Laetitia Marisa
    • Paul Roepman
    • Gift Nyamundanda

    These authors contributed equally to this work.

    • Pierre Laurent-Puig
    • Jan Paul Medema
    • Anguraj Sadanandam
    • Lodewyk Wessels
    • Mauro Delorenzi
    • Scott Kopetz
    • Louis Vermeulen
    • Sabine Tejpar

    These authors jointly directed this work.

Affiliations

  1. Sage Bionetworks, Seattle, Washington, USA.

    • Justin Guinney
    • Rodrigo Dienstmann
    • Brian M Bot
    • Ted Laderas
    • Stephen H Friend
  2. Vall d'Hebron Institute of Oncology (VHIO), Universitat Autònoma de Barcelona, Barcelona, Spain.

    • Rodrigo Dienstmann
    • Josep Tabernero
  3. Laboratory for Experimental Oncology and Radiobiology (LEXOR), Center for Experimental Molecular Medicine (CEMM), Academic Medical Center (AMC), University of Amsterdam, Amsterdam, the Netherlands.

    • Xin Wang
    • Evelyn Fessler
    • Felipe De Sousa E Melo
    • Jan Paul Medema
    • Louis Vermeulen
  4. Department of Biomedical Sciences, City University of Hong Kong, Hong Kong.

    • Xin Wang
  5. Ligue Nationale Contre le Cancer, Paris, France.

    • Aurélien de Reyniès
    • Laetitia Marisa
  6. Netherlands Cancer Institute (NKI), Amsterdam, the Netherlands.

    • Andreas Schlicker
    • Rene Bernards
    • Lodewyk Wessels
  7. Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.

    • Charlotte Soneson
    • Paolo Angelino
    • Sarah Gerster
    • Edoardo Missiaglia
    • Hena Ramay
    • David Barras
    • Mauro Delorenzi
  8. Agendia NV, Amsterdam, the Netherlands.

    • Paul Roepman
    • Iris M Simon
  9. Institute of Cancer Research, London, UK.

    • Gift Nyamundanda
    • Anguraj Sadanandam
  10. The University of Texas, M.D. Anderson Cancer Center, Houston, Texas, USA.

    • Jeffrey S Morris
    • Dipen Maru
    • Ganiraju C Manyam
    • Bradley Broom
    • Scott Kopetz
  11. École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland.

    • Krisztian Homicsko
    • Douglas Hanahan
  12. Gustave Roussy, Villejuif, France.

    • Valerie Boige
  13. Laboratorio de Genomica y Microarrays, Instituto de Investigación Sanitaria San Carlos, Hospital Clinico San Carlos, Madrid, Spain.

    • Beatriz Perez-Villamil
  14. Institut Catala d'Oncologia, L'Institut d'Investigació Biomèdica de Bellvitge, Barcelona, Spain.

    • Ramon Salazar
  15. Biomedical Engineering, Oregon Health Sciences University, Portland, Oregon, USA.

    • Joe W Gray
  16. Université Paris Descartes, Paris, France.

    • Pierre Laurent-Puig
  17. Department of Biology, Hôpital Européen Georges-Pompidou, Assistance Publique - Hôpitaux de Paris, Paris, France.

    • Pierre Laurent-Puig
  18. Ludwig Center for Cancer Research, University of Lausanne, Lausanne, Switzerland.

    • Mauro Delorenzi
  19. Department of Oncology, University of Lausanne, Lausanne, Switzerland.

    • Mauro Delorenzi
  20. Universitair ziekenhuis Leuven, Leuven, Belgium.

    • Sabine Tejpar

Authors

  1. Justin Guinney
  2. Rodrigo Dienstmann
  3. Xin Wang
  4. Aurélien de Reyniès
  5. Andreas Schlicker
  6. Charlotte Soneson
  7. Laetitia Marisa
  8. Paul Roepman
  9. Gift Nyamundanda
  10. Paolo Angelino
  11. Brian M Bot
  12. Jeffrey S Morris
  13. Iris M Simon
  14. Sarah Gerster
  15. Evelyn Fessler
  16. Felipe De Sousa E Melo
  17. Edoardo Missiaglia
  18. Hena Ramay
  19. David Barras
  20. Krisztian Homicsko
  21. Dipen Maru
  22. Ganiraju C Manyam
  23. Bradley Broom
  24. Valerie Boige
  25. Beatriz Perez-Villamil
  26. Ted Laderas
  27. Ramon Salazar
  28. Joe W Gray
  29. Douglas Hanahan
  30. Josep Tabernero
  31. Rene Bernards
  32. Stephen H Friend
  33. Pierre Laurent-Puig
  34. Jan Paul Medema
  35. Anguraj Sadanandam
  36. Lodewyk Wessels
  37. Mauro Delorenzi
  38. Scott Kopetz
  39. Louis Vermeulen
  40. Sabine Tejpar

Contributions

J.G., R.D., J.P.M., A. Sadanandam, L.W., M.D., S.K., L.M., L.V., S.T. and S.H.F. conceived and designed the study. A.d.R., P.R., P.L.-P., I.M.S., E.F., F.D.S.E.M., E.M., D.B., K.H., J.W.G., B.B., D.H., J.T., R.B., J.P.M., A. Sadanandam, L.W., M.D., S.K., L.V., V.B. and S.T. provided study materials. J.G., R.D., P.A., B.B., S.G., E.F., D.B., K.H., D.M., G.C.M. and B.M.B. collected and assembled the data. J.G., R.D., X.W., A.d.R., A. Schlicker, C.S., L.M., G.N., P.A., B.M.B., J.M., T.L., L.V., A. Schlicker, J.S.M., B.P.-V., R.S. and M.D. analysed and interpreted the data. J.G., R.D., X.W., A.d.R., A. Sadanandam, C.S., L.M., J.T., R.S., J.P.M., A. Schlicker, M.D., S.K., L.V. and S.T. wrote the manuscript. All authors contributed to the final approval of the manuscript.

Competing interests

I.M.S. and P.R. are employees of Agendia. R.B. is a shareholder of Agendia.

Corresponding authors

Correspondence to Justin Guinney or Louis Vermeulen or Sabine Tejpar.

Supplementary information

PDF files

  1. Supplementary Text and Figures: Supplementary Figures 1–13

Excel files

  1. Supplementary Table 1: Summary of individual groups subtyping strategy
  2. Supplementary Table 2: Summary of clinical, pathological and molecular associations of individual groups' subtypes
  3. Supplementary Table 3: Data sets and variables used for correlative analyses
  4. Supplementary Table 4: Report of Random Forest CMS classifier during training and validation steps
  5. Supplementary Table 5: Clinicopathological and molecular associations of CMS groups
  6. Supplementary Table 6: Adjusted P values for enrichment in selected copy number counts across CMS groups
  7. Supplementary Table 7: Adjusted P values for enrichment in reverse-phase protein array measurements across CMS groups
  8. Supplementary Table 8: Adjusted P values for enrichment in cancer drivers mutations across CMS groups
  9. Supplementary Table 9: Adjusted P values for gene set mRNA enrichment analysis
  10. Supplementary Table 10: Comparison of TCGA proteomic subtypes and CMS groups
  11. Supplementary Table 11: Adjusted P values for gene set protein enrichment analysis
  12. Supplementary Table 12: Differential microRNA expression levels across CMS groups
  13. Supplementary Table 13: Univariate and multivariate survival models
  14. Supplementary Table 14: Major clinicopathological and molecular features of classified and undeterminate samples
  15. Supplementary Table 15: Major clinicopathological and molecular features of samples with network labels (consensus samples) versus samples with classifier labels (non-consensus classified samples) for each CMS group
  16. Supplementary Table 16: Final performance metrics of CMS classifiers (Random Forest and Single Sample Predictor) applied to consensus samples

About this article

Publication history

DOI

https://doi.org/10.1038/nm.3967