Metastatic recurrence in colorectal cancer arises from residual EMP1+ cells

Abstract

Around 30–40% of patients with colorectal cancer (CRC) undergoing curative resection of the primary tumour will develop metastases in the subsequent years1. Therapies to prevent disease relapse remain an unmet medical need. Here we uncover the identity and features of the residual tumour cells responsible for CRC relapse. An analysis of single-cell transcriptomes of samples from patients with CRC revealed that the majority of genes associated with a poor prognosis are expressed by a unique tumour cell population that we named high-relapse cells (HRCs). We established a human-like mouse model of microsatellite-stable CRC that undergoes metastatic relapse after surgical resection of the primary tumour. Residual HRCs occult in mouse livers after primary CRC surgery gave rise to multiple cell types over time, including LGR5+ stem-like tumour cells2,3,4, and caused overt metastatic disease. Using Emp1 (encoding epithelial membrane protein 1) as a marker gene for HRCs, we tracked and selectively eliminated this cell population. Genetic ablation of EMP1high cells prevented metastatic recurrence and mice remained disease-free after surgery. We also found that HRC-rich micrometastases were infiltrated with T cells, yet became progressively immune-excluded during outgrowth. Treatment with neoadjuvant immunotherapy eliminated residual metastatic cells and prevented mice from relapsing after surgery. Together, our findings reveal the cell-state dynamics of residual disease in CRC and anticipate that therapies targeting HRCs may help to avoid metastatic relapse.

Fig. 1: Identification of epithelial CRC cells associated with poor prognosis.
Fig. 2: Spatiotemporal dynamics of CRC metastases resolved by scRNA-seq.
Fig. 3: EMP1 marks cells enriched in invasion fronts and micrometastases.
Fig. 4: EMP1high cells are the origin of metastatic relapse.
Fig. 5: Neoadjuvant immunotherapy prevents metastatic relapse in CRC.

Data availability

All data relevant to this study are available from the corresponding author on reasonable request. Expression array and RNA-seq data are available at the Gene Expression Omnibus (GEO). The GEO accession numbers for the gene expression experiments reported in this paper are GSE190055 (arrays, EMP1high versus EMP1low AKTP tumour cells), GSE208139 (arrays, MTOs co-cultured with fibroblasts), GSE207974 (RNA-seq chemotherapy) and GSE207668 (RNA-seq CTOs). Count matrices for single-cell RNA-seq experiments were deposited at ArrayExpress under accession numbers E-MTAB-11284 (10x AKTP primary tumours), E-MTAB-11302 (Smart-seq metastatic progression) and E-MTAB-11981 (Smart-seq AKP micrometastases). Additional metadata and processed data files, including UMAP embeddings and gene signature scores, are available at Synapse (syn35000645). Source data are provided with this paper.
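
As a reading aid, the accessions listed above can be grouped by repository from their prefixes. The sketch below is illustrative only: the prefix patterns (GSE for GEO, E-MTAB for ArrayExpress, syn for Synapse) are assumptions about identifier format, and ACCESSIONS, repository_for and group_by_repository are hypothetical helper names, not part of any repository's tooling.

```python
import re

# Accessions and descriptions copied from the data availability statement.
ACCESSIONS = {
    "GSE190055": "arrays, EMP1high versus EMP1low AKTP tumour cells",
    "GSE208139": "arrays, MTOs co-cultured with fibroblasts",
    "GSE207974": "RNA-seq chemotherapy",
    "GSE207668": "RNA-seq CTOs",
    "E-MTAB-11284": "10x AKTP primary tumours",
    "E-MTAB-11302": "Smart-seq metastatic progression",
    "E-MTAB-11981": "Smart-seq AKP micrometastases",
    "syn35000645": "metadata, UMAP embeddings and gene signature scores",
}

# Assumed identifier formats, used only to sort accessions into repositories.
PATTERNS = [
    (re.compile(r"^GSE\d+$"), "GEO"),
    (re.compile(r"^E-MTAB-\d+$"), "ArrayExpress"),
    (re.compile(r"^syn\d+$"), "Synapse"),
]


def repository_for(accession):
    """Return the repository name whose assumed pattern matches the accession."""
    for pattern, repository in PATTERNS:
        if pattern.match(accession):
            return repository
    raise ValueError(f"unrecognised accession format: {accession}")


def group_by_repository(accessions):
    """Group accession identifiers by repository, preserving input order."""
    grouped = {}
    for accession in accessions:
        grouped.setdefault(repository_for(accession), []).append(accession)
    return grouped


if __name__ == "__main__":
    for repository, ids in group_by_repository(ACCESSIONS).items():
        print(repository, ", ".join(ids))
```

An identifier that matches none of the assumed patterns raises an error rather than being assigned to a repository by guesswork.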

References

  1. Amin, M. B. et al. The Eighth Edition AJCC Cancer Staging Manual: continuing to build a bridge from a population-based to a more “personalized” approach to cancer staging. CA Cancer J. Clin. 67, 93–99 (2017).

  2. Shimokawa, M. et al. Visualization and targeting of LGR5+ human colon cancer stem cells. Nature 545, 187–192 (2017).

  3. de Sousa e Melo, F. et al. A distinct role for Lgr5+ stem cells in primary and metastatic colon cancer. Nature 543, 676–680 (2017).

  4. Cortina, C. et al. A genome editing approach to study cancer stem cells in human tumors. EMBO Mol. Med. 9, 869–879 (2017).

  5. Calon, A. et al. Dependency of colorectal cancer on a TGF-β-driven program in stromal cells for metastasis initiation. Cancer Cell 22, 571–584 (2012).

  6. Calon, A. et al. Stromal gene expression defines poor-prognosis subtypes in colorectal cancer. Nat. Genet. 47, 320–329 (2015).

  7. Isella, C. et al. Stromal contribution to the colorectal cancer transcriptome. Nat. Genet. 47, 312–319 (2015).

  8. Lee, H.-O. et al. Lineage-dependent gene expression programs influence the immune landscape of colorectal cancer. Nat. Genet. 52, 594–603 (2020).

  9. Guinney, J. et al. The consensus molecular subtypes of colorectal cancer. Nat. Med. 21, 1350–1356 (2015).

  10. Raghavan, S. et al. Microenvironment drives cell state, plasticity, and drug response in pancreatic cancer. Cell 184, 6119–6137 (2021).

  11. Joanito, I. et al. Single-cell and bulk transcriptome sequencing identifies two epithelial tumor cell states and refines the consensus molecular classification of colorectal cancer. Nat. Genet. 54, 963–975 (2022).

  12. Tauriello, D. V. F. et al. TGFβ drives immune evasion in genetically reconstituted colon cancer metastasis. Nature 554, 538–543 (2018).

  13. Massagué, J. & Obenauf, A. C. Metastatic colonization by circulating tumour cells. Nature 529, 298–306 (2016).

  14. Barriga, F. M. et al. Mex3a marks a slowly dividing subpopulation of Lgr5+ intestinal stem cells. Cell Stem Cell 20, 801–816 (2017).

  15. Lange, M. et al. CellRank for directed single-cell fate mapping. Nat. Methods 19, 159–170 (2022).

  16. Álvarez-Varela, A. et al. Mex3a marks drug-tolerant persister colorectal cancer cells that mediate relapse after chemotherapy. Nat. Cancer 3, 1052–1070 (2022).

  17. Tyler, M. & Tirosh, I. Decoupling epithelial-mesenchymal transitions from stromal profiles by integrative expression analysis. Nat. Commun. 12, 2592 (2021).

  18. Grigore, A. D., Jolly, M. K., Jia, D., Farach-Carson, M. C. & Levine, H. Tumor budding: the name is EMT. Partial EMT. J. Clin. Med. 5, 51 (2016).

  19. Roa-Peña, L. et al. Keratin 17 identifies the most lethal molecular subtype of pancreatic cancer. Sci. Rep. 9, 11239 (2019).

  20. Durgan, J. et al. SOS1 and Ras regulate epithelial tight junction formation in the human airway through EMP1. EMBO Rep. 16, 87–96 (2015).

  21. Bangsow, T. et al. The epithelial membrane protein 1 is a novel tight junction protein of the blood-brain barrier. J. Cereb. Blood Flow Metab. 28, 1249–1260 (2008).

  22. Aceto, N. et al. Circulating tumor cell clusters are oligoclonal precursors of breast cancer metastasis. Cell 158, 1110–1122 (2014).

  23. Barry, E. R. et al. Restriction of intestinal stem cell expansion and the regenerative response by YAP. Nature 493, 106–110 (2013).

  24. Cheung, P. et al. Regenerative reprogramming of the intestinal stem cell state via hippo signaling suppresses metastatic colorectal cancer. Cell Stem Cell 27, 590–604 (2020).

  25. Vasquez, E. G. et al. Dynamic and adaptive cancer stem cell population admixture in colorectal neoplasia. Cell Stem Cell 29, 1213–1228 (2022).

  26. Han, T. et al. Lineage reversion drives WNT independence in intestinal cancer. Cancer Discov. 10, 1590–1609 (2020).

  27. Lupo, B. et al. Colorectal cancer residual disease at maximal response to EGFR blockade displays a druggable Paneth cell-like phenotype. Sci. Transl. Med. 12, eaax8313 (2020).

  28. Heinz, M. C. et al. Liver colonization by colorectal cancer metastases requires YAP-controlled plasticity at the micrometastatic stage. Cancer Res. 82, 1953–1968 (2022).

  29. Solé, L. et al. p53 wild-type colorectal cancer cells that express a fetal gene signature are associated with metastasis and poor prognosis. Nat. Commun. 13, 2866 (2022).

  30. Ohta, Y. et al. Cell-matrix interface regulates dormancy in human colon cancer stem cells. Nature 608, 784–794 (2022).

  31. Mustata, R. C. et al. Identification of Lgr5-independent spheroid-generating progenitors of the mouse fetal intestinal epithelium. Cell Rep. 5, 421–432 (2013).

  32. Wang, Y. et al. Comprehensive molecular characterization of the hippo signaling pathway in cancer. Cell Rep. 25, 1304–1317 (2018).

  33. Yuan, Y. et al. YAP1/TAZ-TEAD transcriptional networks maintain skin homeostasis by regulating cell proliferation and limiting KLF4 activity. Nat. Commun. 11, 1472 (2020).

  34. Morral, C. et al. Zonation of ribosomal DNA transcription defines a stem cell hierarchy in colorectal cancer. Cell Stem Cell 26, 845–861 (2020).

  35. Le, D. T. et al. PD-1 blockade in tumors with mismatch-repair deficiency. N. Engl. J. Med. 372, 2509–2520 (2015).

  36. Fumagalli, A. et al. Plasticity of Lgr5-negative cancer cells drives metastasis in colorectal cancer. Cell Stem Cell 26, 569–578 (2020).

  37. Ganesh, K. et al. L1CAM defines the regenerative origin of metastasis-initiating cells in colorectal cancer. Nat. Cancer 1, 28–45 (2020).

  38. Padmanaban, V. et al. E-cadherin is required for metastasis in multiple models of breast cancer. Nature 573, 439–444 (2019).

  39. Chalabi, M. et al. Neoadjuvant immunotherapy leads to pathological responses in MMR-proficient and MMR-deficient early-stage colon cancers. Nat. Med. 26, 566–576 (2020).

  40. Matano, M. et al. Modeling colorectal cancer using CRISPR-Cas9–mediated engineering of human intestinal organoids. Nat. Med. 21, 256–262 (2015).

  41. Drost, J. et al. Sequential cancer mutations in cultured human intestinal stem cells. Nature 521, 43–47 (2015).

  42. Céspedes, M. V. et al. Orthotopic microinjection of human colon cancer cells in nude mice induces tumor foci in all clinically relevant metastatic sites. Am. J. Pathol. 170, 1077–1085 (2007).

  43. Chen, Y.-C. et al. Gut fecal microbiota transplant in a mouse model of orthotopic rectal cancer. Front. Oncol. 10, 568012 (2020).

  44. Conti, S. et al. CAFs and cancer cells co-migration in 3D spheroid invasion assay. Methods Mol. Biol. 2179, 243–256 (2020).

  45. Gonzalez-Roca, E. et al. Accurate expression profiling of very small cell populations. PLoS ONE 5, e14418 (2010).

  46. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  47. Tarasov, A., Vilella, A. J., Cuppen, E., Nijman, I. J. & Prins, P. Sambamba: fast processing of NGS alignment formats. Bioinformatics 31, 2032–2034 (2015).

  48. Liao, Y., Smyth, G. K. & Shi, W. The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. Nucleic Acids Res. 47, e47 (2019).

  49. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  50. Carvalho, B. S. & Irizarry, R. A. A framework for oligonucleotide microarray preprocessing. Bioinformatics 26, 2363–2367 (2010).

  51. Bolstad, B. M. et al. in Bioinformatics and Computational Biology Solutions Using R and Bioconductor (eds Gentleman, R. et al.) (Springer, 2005).

  52. Fridlyand, J. Microarray Data Analysis. in Selected Works in Probability and Statistics (ed Dudoit, S.) https://doi.org/10.1007/978-1-4614-1347-9_15 (Springer, 2012).

  53. Ritchie, M. E. et al. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).

  54. Eklund, A. C. & Szallasi, Z. Correction of technical bias in clinical microarray data improves concordance with known biological information. Genome Biol. 9, R26 (2008).

  55. Wu, D. et al. ROAST: rotation gene set tests for complex microarray experiments. Bioinformatics 26, 2176–2182 (2010).

  56. Efron, B. & Tibshirani, R. On testing the significance of sets of genes. Ann. Appl. Stat. 1, 107–129 (2007).

  57. Lee, E., Chuang, H. Y., Kim, J. W., Ideker, T. & Lee, D. Inferring pathway activity toward precise disease classification. PLoS Comput. Biol. 4, e1000217 (2008).

  58. Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).

  59. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).

  60. Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015).

  61. Butler, A., Hoffman, P., Smibert, P., Papalexi, E. & Satija, R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. Nat. Biotechnol. 36, 411–420 (2018).

  62. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).

  63. Hafemeister, C. & Satija, R. Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression. Genome Biol. 20, 296 (2019).

  64. van Dijk, D. et al. Recovering gene interactions from single-cell data using data diffusion. Cell 174, 716–729 (2018).

  65. Subramanian, A. et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA 102, 15545–15550 (2005).

  66. Parekh, S., Ziegenhain, C., Vieth, B., Enard, W. & Hellmann, I. zUMIs—a fast and flexible pipeline to process RNA sequencing data with UMIs. Gigascience 7, giy059 (2018).

  67. La Manno, G. et al. RNA velocity of single cells. Nature 560, 494–498 (2018).

  68. Bergen, V., Lange, M., Peidli, S., Wolf, F. A. & Theis, F. J. Generalizing RNA velocity to transient cell states through dynamical modeling. Nat. Biotechnol. 38, 1408–1414 (2020).

  69. R Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2020).

  70. Barrett, T. & Edgar, R. [19] Gene Expression Omnibus: microarray data storage, submission, retrieval, and analysis. Methods Enzymol. 411, 352–369 (2006).

  71. Grossman, R. L. et al. Toward a shared vision for cancer genomic data. N. Engl. J. Med. 375, 1109–1112 (2016).

  72. Muzny, D. M. et al. Comprehensive molecular characterization of human colon and rectal cancer. Nature 487, 330–337 (2012).

  73. Tripathi, M. K. et al. Nuclear factor of activated T-cell activity is associated with metastatic capacity in colon cancer. Cancer Res. 74, 6947–6957 (2014).

  74. Sanz-Pamplona, R. et al. Aberrant gene expression in mucosa adjacent to tumor reveals a molecular crosstalk in colon cancer. Mol. Cancer 13, 46 (2014).

  75. Kemper, K. et al. Mutations in the Ras-Raf axis underlie the prognostic value of CD133 in colorectal cancer. Clin. Cancer Res. 18, 3132–3141 (2012).

  76. Jorissen, R. N. et al. Metastasis-associated gene expression changes predict poor outcomes in patients with dukes stage B and C colorectal cancer. Clin. Cancer Res. 15, 7642–7651 (2009).

  77. Marisa, L. et al. Gene expression classification of colon cancer into molecular subtypes: characterization, validation, and prognostic value. PLoS Med. 10, e1001453 (2013).

  78. Laibe, S. et al. A seven-gene signature aggregates a subgroup of stage II colon cancers with stage III. OMICS 16, 560–565 (2012).

  79. Jorissen, R. N. et al. DNA copy-number alterations underlie gene expression differences between microsatellite stable and unstable colorectal cancers. Clin. Cancer Res. 14, 8061–8069 (2008).

  80. Azzalini, A. & Menardi, G. Clustering via nonparametric density estimation: the R package pdfcluster. J. Stat. Softw. 57, 1–26 (2014).

  81. Azzalini, A. & Torelli, N. Clustering via nonparametric density estimation. Stat. Comput. 17, 71–80 (2007).

  82. Smedley, D. et al. The BioMart community portal: an innovative alternative to large, centralized data repositories. Nucleic Acids Res. 43, W589–W598 (2015).

  83. Drost, H. G. & Paszkowski, J. Biomartr: genomic data retrieval with R. Bioinformatics 33, 1216–1217 (2017).

  84. Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-seq data with or without a reference genome. BMC Bioinform. 12, 323 (2011).

  85. Bates, D., Mächler, M., Bolker, B. M. & Walker, S. C. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67, 1–48 (2015).

  86. Therneau, T. M., Grambsch, P. M. & Pankratz, V. S. Penalized survival models and frailty. J. Comput. Graph. Stat. 12, 156–175 (2003).

  87. Therneau, T. coxme: mixed effects Cox models. R package version 2.2-3 www.cran.R-project.org/package=coxme.Oikos (2012).

  88. Sanchez-Vega, F. et al. Oncogenic signaling pathways in the Cancer Genome Atlas. Cell 173, 321–337 (2018).

  89. Mootha, V. K. et al. PGC-1α-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat. Genet. 34, 267–273 (2003).

Acknowledgements

We thank all of the members of the laboratory for their support and discussions; the staff at the IRB Barcelona core facilities for biostatistics, histopathology, functional genomics and advanced digital microscopy, as well as the flow cytometry and animal facilities of the UB/PCB and the CRG genomic unit, for assistance; and the staff at the HCB-IDIBAPS Biobank for sample and data procurement. Sample collection for this work was also supported by the Xarxa de Bancs de Tumours de Catalunya (XBTC), sponsored by Pla Director d’Oncologia (PDO). A.C.-S., G.T. and L.J.-G. have held FPU fellowships from the Spanish Ministry of Universities and A.A.-V. has held a La Caixa predoctoral fellowship. S.C. holds an FPI fellowship from the Spanish Ministry of Economy and Competitiveness (MINECO). H.H. is a Miguel Servet (CP14/00229) researcher funded by the Spanish Institute of Health Carlos III (ISCIII) and the Agencia Estatal de Investigación (AEI) and FEDER (SAF2017-89109-P). This work has been supported by ERC advanced grants 884623 (residualCRC to E.B.) and 883739 (Epifold to X.T.); LCF/PR/HR19/52160018 from La Caixa foundation; PID2020-119917RB-I00 from the Spanish MICINN; and CRUK Accelerator Award C7932/A26825 (ACRCelerate), in collaboration with AECC (grant GEACC19006BAT_2021). E.B., X.T. and H.H. are supported by the Fundació La Marató de TV3 (201903-30-31-32). H.H. received support for the project PID2020-115439GB-I00 funded by MCINN/AEI/10.13039/501100011033. S.L. was supported by a Wellcome Trust Senior Clinical Research Fellowship (206314/Z/17/Z). E.J.M. is funded by the Lee Placito Medical Research Fund (University of Oxford). IRB Barcelona and IBEC are recipients of a Severo Ochoa Award of Excellence from MINECO. Single-cell profiling of CRC samples was supported by the Belgian Federation against Cancer grant nos 2018-127 and 2016-133 and by a grant from Fondation Roi-Baudouin. S.T. is supported by a Fundamental Clinical Researchers KU Leuven grant and a Foundation against Cancer grant for this work. The Magdalena Socias Moyà fund supports metastasis research at IRB Barcelona.

Author information

Contributions

E.B. and A.C.-S. conceived the study, coordinated experiments and wrote the manuscript. A.C.-S. designed and performed key experiments including the profiling of residual disease, the analyses of tumour buds and micrometastases, genetic ablation studies and immunotherapy experiments in the CRC relapse model. A.C.-S., C.C., G.T., F.S. and D.S. generated and characterized MTO knock-in lines. A.C.-S., X.H.-M., S.P.-P. and G.T. developed the CRC relapse model. A.C.-S., X.H.-M. and S.P.-P. performed all mouse work. L.M. and C.S.-O.A. analysed scRNA-seq data. C.S.-O.A. and A.B.-L. analysed human CRC transcriptomic datasets. A.C.-M., L.M. and C.S.-O.A. performed statistical analyses. C.C. and T.S. performed IF and imaged organoids in vitro. A.C.-S., C.C. and O.R. generated and characterized YAP-KD models. A.C.-S., C.C., O.R., S.C. and X.T. performed in vitro co-cultures. A.C.-S. and A.A.-V. quantified immunofluorescence stainings. A.C.-S. and N.F. developed the method to purify residual tumour cells in whole livers. M.S. performed IF and IHC. E.J.M. and S.L. performed multiplex IF. L.N. performed chemotherapy experiments. L.B. and J.C. performed 3D light-sheet imaging. L.J.-G., C.C., P.L. and H.H. provided support with scRNA-seq experiments. S.T. generated scRNA-seq data from CRC patient samples. D.V.F.T. and D.S. generated MTO and CTO biobanks. E.S. provided strategic support and helped with figures and manuscript writing. E.B. supervised the study.

Corresponding author

Correspondence to Eduard Batlle.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature thanks Hugo Snippert, Itay Tirosh, Nicola Valeri and the other, anonymous, reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 The EpiHR gene set marks a defined tumour cell population across CRCs.

a–c, UMAP layout of whole tumours (stromal and epithelial cells) from 7 CRC patients in the KUL dataset. Cells are coloured by the expression of (a) all high hazard ratio genes (AllHR), (b) tumour microenvironment-specific HR genes (TME-HR) and (c) epithelial-specific HR genes (EpiHR). d, Association between clinical variables and the EpiHR signature in the CRC meta cohort, assessed by fitting a linear model for each variable independently. Technical factors (dataset and centre, as described in extended methods) were included as covariates. Lines span the lower and upper bounds of the confidence intervals. n = 1688 patients. e, Kaplan-Meier curves showing relapse-free survival according to EpiHR gene signature expression for CRC patients classified by CMS. Two-sided Wald test. f,g, UMAP layout of 2718 CRC tumour cells from the KUL cohort coloured by (f) patient ID and (g) expression of the EpiHR signature. h, Heatmap showing Pearson correlation scores in gene expression among EpiHR signature genes in patients from the SMC cohort. Note that most genes belong to one coherent subset (Cluster 1). Gene lists are detailed in Supplementary Table 2. i, UMAP layout of human CRC tumour cells coloured by the expression of genes belonging to Clusters 1, 2, 3 and 4 identified in h.

Extended Data Fig. 2 Characterization of HRC features.

a, Heatmap showing scaled expression of the 50 genes most correlated with the EpiHR signature across SMC patients. Tumour cells are divided into non-HRCs (left, FALSE) and HRCs (right, TRUE). The EpiHR signature score for each individual cell is plotted above the heatmap. b, UMAP layout of CRC tumour cells coloured by the expression of the coreHRC signature. The coreHRC signature is defined as the 100 genes that correlate best with the EpiHR signature. c,d, Heatmaps showing normalized enrichment scores (NES) for gene sets in Gene Ontology Biological Processes (GOBP) in HRCs from different patients in the KUL (c) and SMC (d) cohorts. Only GOBP gene sets with NES above 0.5 are shown. Gene sets and patients are ordered by hierarchical clustering. e, UMAP layout of human CRC tumour cells in the SMC cohort coloured by the basal cell state signature of pancreatic ductal adenocarcinoma (PDAC) from Raghavan et al.10. f, UMAP layout of tumour cells from the KUL cohort showing the expression of the Lgr5 signature. g, UMAP of the same tumour cells labelled according to their classification as HRCs, Lgr5+, double-positive or other cells. h–j, UMAP layout of the same tumour cells showing expression levels of the canonical intestinal stem cell genes LGR5, OLFM4 and ASCL2. k–o, UMAPs of tumour cells in the SMC dataset showing expression levels of the canonical intestinal stem cell genes LGR5, ASCL2, AXIN2, OLFM4 and SMOC2. p,q, Violin plots showing WNT-ON signature expression levels in epithelial tumour cells from patients in the SMC (p) and KUL (q) cohorts. r, Bar plot quantifying the HRC composition of each patient (combined SMC and KUL datasets). Patients are classified as iCMS2 or iCMS3 according to Joanito et al.11. Two-sided Kruskal-Wallis test. s,t, UMAPs of mouse CRC AKTP tumour cells coloured according to (s) the coreHRC signature and (t) the basal cell state signature of pancreatic cancer from Raghavan et al.10. u, Expression levels of the EpiHR (left) and coreHRC (right) signatures in MTOs derived from primary tumours or from liver metastases. Boxes represent the first, second (median) and third quartiles. Whiskers indicate maximum and minimum values. Welch two-sided t-test. n = 5 (primary), 10 (metastatic).
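
The coreHRC definition in panel b (the 100 genes that correlate best with the EpiHR signature) can be illustrated with a minimal sketch. It assumes that a cell's signature score is the mean expression of the signature genes and that correlation means Pearson correlation across cells; neither detail is stated in this legend. The function names and toy expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def signature_score(expr, signature_genes):
    """Per-cell mean expression of the signature genes (assumed scoring rule)."""
    present = [g for g in signature_genes if g in expr]
    n_cells = len(next(iter(expr.values())))
    return [sum(expr[g][i] for g in present) / len(present) for i in range(n_cells)]


def top_correlated_genes(expr, score, n_top=100):
    """Rank genes by correlation with the score and keep the n_top best."""
    ranked = sorted(expr, key=lambda g: pearson(expr[g], score), reverse=True)
    return ranked[:n_top]


# Toy matrix: gene -> expression in four cells (made-up values).
toy_expr = {
    "EMP1": [0.0, 1.0, 2.0, 3.0],
    "LGR5": [3.0, 2.0, 1.0, 0.0],
    "KRT20": [1.0, 1.0, 1.0, 1.0],
    "ANXA1": [0.0, 2.0, 3.0, 5.0],
}
toy_score = signature_score(toy_expr, ["EMP1"])
core = top_correlated_genes(toy_expr, toy_score, n_top=2)
```

In this toy case the signature gene itself ranks first; whether signature genes are allowed into the derived list is a design choice that the legend does not specify.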

Extended Data Fig. 3 Analysis of the CRC relapse mouse models and purification of DTCs.

a, Representative micrograph of a haematoxylin and eosin (HE)-stained adenocarcinoma with subserosal invasion (T4) generated by injection of an AKTP MTO into the mouse caecum. Tumour centre (TC), invasive fronts (IF), muscle layer (ML) and normal mucosa (NM) are indicated. Scale bar, 2.5 mm. b, Representative image of a different T4 tumour penetrating the muscle layer (ML) and reaching the serosa layer. TB, tumour buds. Scale bar, 1 mm. c, Photograph of a caecum 21 days after injection, taken at the time of surgery, showing a primary tumour (arrow) in the distal part. d,e, HE staining of micrometastases (d) and large metastases (e) observed in the livers of orthotopic isografted mice. Scale bars, 50 µm and 1 mm, respectively. In e, tumoral tissue is surrounded by dashed lines. f–h, HE staining of lung (f), lymph node (g) and diaphragm (h) metastases from orthotopic isografted mice. Scale bars, 100 µm (f,h), 1 mm (g). i, Graph showing longitudinal liver BLI measurements (photons per second), normalized to the day of primary tumour resection. Points and lines represent individual mice. n = 9 (AKTP), 24 (AKP) mice. j, Schematic representation of a novel tissue-dissociation strategy that enables recovery of DTCs from livers. Whole livers are dissected and minced thoroughly. After a mild collagenase IV digestion, samples are filtered through 100 µm meshes. The filter-retained sample is highly enriched in tumour cells. The tissue remaining in the filter is re-digested with a stronger enzymatic cocktail to digest it fully, and then re-filtered. k, Representative bioluminescent image of a whole liver sample containing luciferase+ tumour cells before enzymatic digestion (B, input), after filtering through 100 µm (B’) and 40 µm (B’’) meshes (previous protocol), and after recovering and re-digesting tissue retained in the 100 µm filter (B’’’). l, Image showing the large cell pellet containing liver cells after one mild digestion and the small pellet in the retained and re-digested sample enriched in DTCs. m, Percentage of GFP+ cells measured by flow cytometry in samples with one round of digestion compared with re-digested samples. Boxes represent the first, second (median) and third quartiles. Whiskers indicate maximum and minimum values. Paired two-sided Wilcoxon test on percentages. n = 6 independent paired samples examined in 2 independent experiments. n, Representative bioluminescent images, tumour burden and flow-cytometry plots of the 4 different stages analysed by single-cell Smart-sequencing described in Fig. 2. Micrometastasis samples were DTCs collected from livers with absent or low bioluminescence in which metastases were not visible. For small-metastasis samples, metastatic nodules were visible but small in size (<1.5 mm). Macrometastasis samples were metastatic nodules larger than 4 mm.
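
The normalization in panel i (liver BLI signal expressed relative to the day of primary tumour resection) amounts to dividing each measurement from a mouse by that mouse's value on the resection day. A minimal sketch follows; the function name and the photon counts are made up for illustration and are not taken from the study.

```python
def normalize_to_resection(bli_by_day, resection_day):
    """Express each BLI reading as a fold change over the resection-day reading.

    bli_by_day maps a time point (days relative to surgery) to photons per second.
    """
    if resection_day not in bli_by_day:
        raise KeyError(f"no measurement on resection day {resection_day}")
    baseline = bli_by_day[resection_day]
    if baseline <= 0:
        raise ValueError("baseline signal must be positive to normalize")
    return {day: value / baseline for day, value in sorted(bli_by_day.items())}


# One hypothetical mouse: signal doubles by day 7, then falls to half of baseline.
mouse = {0: 2.0e5, 7: 4.0e5, 14: 1.0e5}
relative = normalize_to_resection(mouse, resection_day=0)
```

After normalization every mouse starts at 1.0 on the resection day, so curves from animals with different absolute signal can be compared on one axis.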

Extended Data Fig. 4 Additional description of residual AKTP and AKP metastatic cells.

a, UMAPs of colorectal primary tumours and liver metastases at different stages (micro, small and large) coloured according to sequencing batch, mouse ID, and sample ID. b, UMAPs showing the expression levels of coreHRC, EpiHR, and mKi67 gene signatures and Lgr5 and Krt20 genes. c, Violin plots showing expression of relevant genes used to define the 6 different Seurat clusters. d, Fraction of cells (y axis) from each Seurat cluster (x axis) present in the different sample types: primary tumour, micro-, small- and macrometastases, according to the indicated colour code. Note that the “HRCs Krt20-” cells are found almost exclusively in micrometastasis samples, whereas Lgr5+ cells are highly enriched in small-metastasis samples. e, Smoothed Krt20 gene and partial EMT gene signature17 expression trends fitted with generalized additive models as a function of pseudotime in primary tumours, micro+small and large metastases. f, g, UMAP of AKP liver micrometastases coloured according to timing of profiling and Seurat clusters. h, Barplot showing proportion of different Seurat tumour cell types captured in AKP early vs late micrometastases. i-l, UMAPs showing the expression levels of the coreHRC, mKi67 and Mex3a gene signatures and Lgr5 mRNA in AKP metastases. m, Barplot showing Seurat cluster distribution across AKP early and late micrometastases. n, Violin plots showing expression levels of the Mex3a signature16 in AKP early and late micrometastases versus AKTP micro and small metastases. o, Vector fields representing RNA velocity projected on UMAPs of AKP micrometastases, coloured by the pseudotime estimated for each cell with scVelo. p, Smoothed coreHRC, mKi67, and Lgr5 gene signature expression trends in the early and late AKP micrometastasis dataset fitted with generalized additive models as a function of CellRank pseudotime.

Source data

Extended Data Fig. 5 Epithelial membrane protein 1 (EMP1) marks HRCs.

a, Scatter plot showing the correlation value between individual genes in the human SMC cohort (x axis) and in mouse primary tumours (y axis) with the EpiHR signature. Genes with correlation scores higher than 0.8 in both datasets are highlighted. b, UMAP of tumour cells from CRC patients in the SMC dataset coloured according to the expression of the EpiHR signature (left) and of the EMP1 gene (right). c, As in b, for CRC tumour cells from the KUL datasets. d, UMAP representation of Smart-sequencing single-cell data of AKTP mouse tumour cells along the metastatic relapse sequence, coloured by the EpiHR signature (left) and Emp1 gene (right). e, Vector fields representing RNA velocity projected on AKTP primary CRC, micro+small and macro metastases UMAPs, coloured by the pseudotime estimated for each cell with scVelo. f, AKTP tumour cell UMAPs coloured by Emp1 gene expression. g, Smoothed Emp1 gene expression trends fitted with generalized additive models as a function of pseudotime in AKTP primary tumour, micro+small and macro metastases samples. h, UMAP representation of AKP micrometastases coloured by Emp1 gene expression. i, Smoothed Emp1 gene expression trends fitted with generalized additive models as a function of pseudotime in AKP micrometastases samples. j, Representative flow cytometry plot of TOM expression in wild-type and Emp1-iCT AKTP MTOs. k, Relative mRNA expression of indicated marker genes in Emp1-TOMhigh and Emp1-TOMlow sorted cell populations from Emp1-iCT AKTP MTOs in vitro. Two-sided t-test after normalizing by Ppia. n = 3 technical replicates. Mean ± SD. l, Boxplot showing normalized intensity of coreHRC signature expression in Emp1-TOMhigh and Emp1-TOMlow cells dissociated from primary tumours 4 weeks post-implantation. Box plots have whiskers of maximum 1.5 times the interquartile range; boxes represent first, second (median) and third quartiles. n = 4 mice per condition. ROAST-GSA adjusted p-values are shown.

Source data

Extended Data Fig. 6 HRCs are enriched in invasion fronts and micrometastases.

a, Primary tumour outlined by a cyan line and coloured in 4 different regions identified with the HALO image analysis classifier (tumour-red, stroma-green, background-yellow, necrosis-blue). Scale bar, 1 mm. b, TOM cell intensity analysis in the tumour area after segmentation into individual cells. B’ and B’’ show magnified regions corresponding to tumour core (B’) and invasion fronts + tumour buds (B’’). Scale bars, 1 mm (B), 100 µm (B’ and B’’). c, Representative immunostaining of TOM and E-CADHERIN in the tumour core and in tumour buds of primary tumours derived from Emp1-iCT MTOs 4 weeks post-implantation in the caecum. TOM fluorescence is shown with mpl-inferno LUT. The dashed line delimits the caecum edge. Arrows point to tumour buds. Scale bars, 100 µm (tumour core) and 50 µm (tumour buds). d, Quantification of Emp1-TOMhigh cells (defined as cells in percentile 90 for TOM expression) in the tumour core (submucosal area), invasion fronts (inside muscular layer) and isolated glands (over muscular layer). Boxes represent the first, second (median) and third quartiles. Whiskers indicate maximum and minimum values. Two-sided Wilcoxon test on percentages. n = 8 mice. e, Immunofluorescence of TOM, CD31 and DAPI in primary tumours. Amplified insets show the tumour core and invasive glands intermingled in mucosal layers (ML) next to blood vessels. Dashed lines outline healthy intestinal epithelium. Scale bars, 250 µm, 100 µm (tumour core) and 50 µm (tumour buds). f, Representative flow cytometry plot of TOM expression in wild-type and Emp1-iCT AKP MTOs. g, Relative mRNA expression of indicated genes in Emp1-TOMhigh and Emp1-TOMlow sorted cell populations from Emp1-iCT AKP MTOs. Two-sided t-test after normalizing by Ppia. n = 3 technical replicates. Mean ± SD. h, Representative immunostaining for TOM and E-CADHERIN in Emp1-iCT AKP tumours implanted in the caecum 4 weeks post-implantation. Emp1-TOM fluorescence is shown with an mpl-inferno LUT. 
Dashed lines delimit the edge of the caecum. Scale bar: 250 µm. i, Representative images of TOM and E-CADHERIN staining in micro (left) and medium-size (right) metastases. Scale bars: 50 µm and 250 µm. j, Percentage of tumour area containing TOM-high and TOM-low fluorescent pixels versus metastasis size (in pixels). Each dot represents an individual metastasis. k, TOM, KRT20 and E-CADHERIN staining in primary tumours generated by Emp1-iCasp9-tdTomato AKTP MTOs. Dashed lines encompass invasion fronts and tumour buds. KRT20 staining is observed in normal mucosa (NM) and to a lesser extent in the tumour core. Tumour cell clusters invading the muscular layer (ML) express high levels of TOM and no KRT20. Amplified insets show an example of tumour core (K’) and invasion fronts (K’’) with TOM (left) and KRT20 (right) stainings. Scale bars, 500 µm (k) and 100 µm (K’ and K’’). l, Immunofluorescence of TOM and E-CADHERIN (left) and KRT20 and E-CADHERIN (right) in a cluster of tumour cells that enter the liver through a portal vein (PV, delimited with dashed lines). Scale bar, 50 µm.
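The percentile-based definition of Emp1-TOMhigh cells used in panel d can be sketched as follows. This is an illustration only, not the authors' pipeline: it assumes that "percentile 90" means cells whose TOM intensity lies above the 90th percentile of all segmented tumour cells, and the function name, region labels and intensities are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles
from typing import Dict, List, Tuple


def tom_high_fraction_by_region(cells: List[Tuple[str, float]]) -> Dict[str, float]:
    """Return, for each region, the fraction of cells above the global 90th percentile.

    `cells` holds (region label, TOM intensity) pairs, one per segmented cell.
    """
    intensities = [intensity for _, intensity in cells]
    # quantiles(..., n=10) returns the 9 decile cut points; the last is the 90th percentile.
    threshold = quantiles(intensities, n=10, method="inclusive")[-1]
    totals: Dict[str, int] = defaultdict(int)
    high: Dict[str, int] = defaultdict(int)
    for region, intensity in cells:
        totals[region] += 1
        if intensity > threshold:
            high[region] += 1
    return {region: high[region] / totals[region] for region in totals}


# Hypothetical intensities: 80 dim cells in the core, 20 brighter cells at the invasion front.
cells = [("core", float(i)) for i in range(1, 81)]
cells += [("invasion_front", float(i)) for i in range(81, 101)]
fractions = tom_high_fraction_by_region(cells)
```

Because the threshold is computed over all cells together, the per-region fractions show where the brightest 10% of cells are concentrated, which is the comparison the panel reports for tumour core, invasion fronts and isolated glands.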

Source data

Extended Data Fig. 7 HRCs retain an epithelial phenotype.

a, Immunostaining of E-CADHERIN and TOM in Emp1-iCT primary tumours 4 weeks post-implantation of MTOs. Arrows point at examples of E-CADHERIN+ invasion fronts and tumour buds. Dashed lines show the caecum edge. Scale bars, 100 µm. b, Boxplot showing normalized expression of genes related to EMT in Emp1-TOMhigh versus Emp1-TOMlow cells. Box plots have whiskers of maximum 1.5 times the interquartile range; boxes represent first, second (median) and third quartiles. P-value for differential expression with Linear Model for Microarray Analysis (limma). n = 4 biological replicates. c-d, Violin plots showing expression of selected EMT-related genes in HRCs versus all other cells in mouse epithelial primary tumour cells (c) and human tumour cells from the SMC cohort (d). Genes present in b but not shown here (Snai1 and Snai2) were undetected in c. e, Representative example of EMP1 mRNA FISH combined with LAMC2 immunofluorescence on a human primary CRC tissue section showing an overlapping pattern of expression of EMP1 and LAMC2 in invasion fronts and tumour buds (arrows). Scale bar, 100 µm. f, Representative example of EMP1 mRNA FISH combined with KRT17 and EPCAM immunofluorescence on human primary CRC tissue sections showing an overlapping pattern of expression of EMP1 and KRT17 in invading fronts and tumour cell clusters (arrows). Scale bars, 100 µm. g, Violin plots showing enrichment of LAMC2, KRT17 and several cell-to-cell adhesion genes in HRCs (SMC cohort).

Source data

Extended Data Fig. 8 Emp1 and Lgr5 mark distinct tumour cell populations.

a, Emp1-iCasp9-tdTomato and Lgr5-EGFP alleles introduced in AKTP MTOs. Confocal imaging of TOM, EGFP and E-CADHERIN immunostaining in edited MTOs. Single z-plane. Scale bar, 10 µm. b, Relative mRNA expression of indicated marker genes in EGFP-high/TOM-low and EGFP-low/TOM-high sorted cells dissociated from subcutaneous AKTP Emp1-iCT Lgr5-EGFP tumours. Two-sided t-test after normalizing by PPIA. Mean ± SD. n = 3 technical replicates. c, Immunostaining of TOM, EGFP and E-CADHERIN in Emp1-iCT Lgr5-EGFP primary tumours 4 weeks post-implantation of MTOs in the caecum. Dashed lines encompass tumour buds. Scale bar, 250 µm. d, Scatter plot showing normalized Emp1-TOM intensity versus normalized Lgr5-EGFP intensity in 855,330 cells from 18 different primary tumours. Note the absence of double-positive cells (TOM and EGFP high). e, Representative immunofluorescence staining of TOM, EGFP and E-CADHERIN in liver metastases of increasing size (micro, small, medium) generated from the mouse CRC relapse model. Scale bars, 25 µm (micro), 100 µm (small) and 250 µm (medium). f, Scatter plot showing TOM intensity versus EGFP intensity in 318,276 cells from 137 different liver metastases. Note the absence of double-positive cells (TOM and EGFP high). g-p, Examples of dual EMP1 and LGR5 mRNA ISH combined with E-CADHERIN immunofluorescence on human primary CRC tissue sections demonstrating a mutually exclusive pattern of expression of EMP1 and LGR5. Note that EMP1 expression is elevated in invasion fronts and tumour cell buds (white arrows). Scale bars, 500 µm (l, p), 250 µm (g, h, i, j, m, n, o) and 50 µm (H’, k).

Source data

Extended Data Fig. 9 YAP/TAZ signalling is not required for HRC specification.

a-c, Violin plots comparing the expression of genes belonging to the YAP-22 core signature32 (a), the top 50 genes from the Fetal intestine progenitor signature31 (b) and the top 50 genes of the coreHRC signature (c) in HRCs vs other cells in the SMC scRNA-seq cohort. d, Venn diagram showing genes that overlap between the coreHRC signature and YAP-22 or Fetal intestine progenitor signatures. e, Boxplot showing normalized intensity of YAP-22 signature expression in Emp1-TOMhigh and Emp1-TOMlow cells dissociated from primary tumours 4 weeks post-implantation. Box plots have whiskers of maximum 1.5 times the interquartile range; boxes represent first, second (median) and third quartiles. n = 4 mice per condition. ROAST-GSA adjusted p-values are shown. f, Relative mRNA expression measured by RT-qPCR in Emp1-TOMhigh and Emp1-TOMlow sorted cell populations from AKTP Emp1-iCT primary CRCs. Two-sided t-test after normalizing by PPIA. n = 4 biological replicates. Mean ± SD. g, Western blot for YAP and VINCULIN in non-infected, shControl and shYap infected AKTP organoids. h, Western blot quantification of YAP normalized by VINCULIN. i, Relative mRNA expression (mean ± SD) in MTOs infected with shControl plasmid compared to uninfected MTOs and MTOs infected with three different shYAP plasmids. Analysed with a mixed effects linear model after normalizing by PPIA housekeeping gene. n = 2 biological replicates with 3 technical replicates. j, Percentage (mean ± SD) of Emp1-TOMhigh cells in organoids infected with shControl or shYAP plasmids. Two-sided Wilcoxon rank-sum test. n = 3 (sh67) and 7 (all others) measurements examined over 4 independent experiments. k, Relative mRNA expression (mean ± SD) in MTOs infected with pInducer GFP-TEADi plasmid treated or untreated with doxycycline (DOX). GFP+ cells were sorted in DOX-treated organoids, whereas live cells were sorted in untreated MTOs. n = 3 technical replicates. 
Analysed with a mixed effects linear model after normalizing by PPIA housekeeping gene. l, Representative flow cytometry plot showing Emp1-TOM fluorescence versus pInducer GFP-TEADi fluorescence in TEADi MTOs untreated or treated with DOX. m, Quantification (mean ± SD) of Emp1-TOMhigh cells in TEADi MTOs untreated or treated with DOX for 5 days. Two-sided Wilcoxon rank-sum test. n = 2 biological replicates with 3 technical replicates. n, Boxplot showing expression levels (normalized intensity) of YAP-22, Fetal and coreHRC signature genes in control MTOs versus MTOs treated with chemotherapy (FOLFIRI) for 4 days. Boxes represent the first, second (median) and third quartiles. Whiskers indicate maximum and minimum values. n = 3 biological replicates per condition. Two-sided t-test. o, Boxplot showing the expression levels of relevant genes in control MTOs versus MTOs treated with chemotherapy (FOLFIRI) for 4 days. Boxes represent the first, second (median) and third quartiles. Whiskers indicate maximum and minimum values. n = 3 biological replicates per condition. Two-sided t-test.

Source data

Extended Data Fig. 10 KRAS mutations and CAFs specify the HRC population responsible for metastatic relapse.

a-b, Normalized intensity of EpiHR and coreHRC signature expression in CTOs grouped by gain-of-function mutation in Kras (G12D) and loss-of-function mutations in p53, Smad4 and Tgfbr2. Box plots have whiskers of maximum and minimum values; boxes represent first, second (median) and third quartiles. n = 6 (WT) and 5 (MUT) CTOs; 2 technical replicates. P-values for two-sided t-tests. c, Percentage of Emp1high tumour cells (defined as the top 10% of the TOM population in control MTOs, mean ± SD) in parental (non-infected), control shRNA or shRNAs targeting YAP1. n = 5 biological replicates. P-value for two-sided t-test. d, Normalized intensity of the coreHRC signature expression in control MTOs versus MTOs co-cultured with colon fibroblasts. Box plots have whiskers of maximum 1.5 times the interquartile range; boxes represent first, second (median) and third quartiles. n = 4 biological replicates. ROAST-GSA adjusted p-value is shown. e, Representative images of MTOs Emp1-iCT Lgr5-EGFP co-cultured with colon fibroblasts. Maximum intensity projection of confocal stacks, step 4 µm, z stack 120 µm. Scale bar, 50 µm. f, Immunostaining of α-SMA in Emp1-iCT Lgr5-EGFP MTOs co-cultured with colon fibroblasts for 2 days. Scale bars, 100 µm. g, Immunostaining of KRT17 in MTO Emp1-iCT organoids grown for 4 days, either co-cultured with fibroblasts or cultured alone as controls. Scale bars, 50 µm. h, Ablation by dimerizer (DIM) treatment and surgery schedule of mice with AKTP Emp1-iCT primary tumours to assess the recovery of HRCs upon treatment cessation. i, Percentage (mean ± SD) of Emp1-TOMhigh cells (defined as top 10% in control animals) in untreated mice versus mice treated with DIM, with treatment discontinued at various timepoints post-injection. Two-sided t-test. n = 3 (control) and 4 (rest) mice. j, Representative immunostainings showing effective Emp1-TOMhigh cell ablation in DIM-treated primary tumours and recovery upon treatment cessation. Dashed lines delimit the caecum edge. 
Scale bars, 250 µm. k, Lung metastases (mean ± SD) generated by MTO Emp1-iCT up to one month after primary tumour resection, treated with vehicle or DIM as in Fig. 4a. Each dot is a mouse; n = 34 (control) and 29 (DIM). P-value for generalized linear model with negative binomial family. l, Inducible ablation schedule of nude mice (nu/nu) implanted with AKTP Emp1-iCT primary tumours. Resection was not possible due to local spreading of tumours to neighbouring tissue. m, Primary tumour area (mean ± SD) measured at sacrifice. Each dot is a mouse; n = 4 (control), 5 (DIM) mice. P-value for linear model. n, Liver metastases (mean ± SD) generated by MTO Emp1-iCT. Each dot is a mouse; n = 4 (control), 6 (DIM) mice. P-value for generalized linear model. o, Schematics of an experiment to analyse the potential of Emp1+, Lgr5+ or double-negative (DN) cells to colonize the liver and generate metastases. 25,000 FACS-sorted Emp1-TOM-high, Lgr5-EGFP-high or double-negative cells were injected intrasplenically. p, Metastatic growth measured by BLI. q, Liver metastases (mean ± SD) generated by Emp1-high, Lgr5-high or double-negative cells. Each dot is a mouse; n = 5 mice. P-value for generalized linear model. r, Distribution of liver metastasis diameters (mean ± SD). n = 5 mice per condition. s, Percentage (mean ± SD) of Emp1-TOMhigh, Lgr5-EGFPhigh and double-negative tumour cells in metastases generated by the injection of Emp1-TOMhigh, Lgr5-EGFPhigh and double-negative cells. n = 9 (Emp1 and Lgr5) and 10 (DN) mice. Two-sided t-test. t, Experimental setup showing inducible HRC ablation after surgery of primary AKTP CRCs. u, Liver metastases (mean ± SD) generated by MTO Emp1-iCT in mice treated with vehicle or DIM 1 day after primary tumour resection and until experimental endpoints. Each dot is a mouse; n = 30 (control) and 12 (DIM) mice. P-value for generalized linear model. 
v, Percentage (mean ± SD) of small (diameter equal to or smaller than 1 mm²) and big metastases (bigger than 1 mm²) in mice treated with vehicle or DIM 1 day after primary tumour resection and until experimental endpoints. Mixed effects linear model after Box-Cox transformation with mouse as random effect. n = 20 (control) and 7 (DIM) mice. w, Percentage of mice that developed liver metastases in control and Emp1-ablated mice. Analysed with a generalized linear model.

Source data

Extended Data Fig. 11 Metastatic relapse in different mouse CRC models arises from HRCs.

a, Inducible ablation and surgery schedule of mice with AKP Emp1-iCT primary tumours. Panels A and A’ show immunostaining of TOM and E-CADHERIN demonstrating effective ablation of Emp1-high cells in primary CRCs. Dashed lines delimit the caecum edge. Scale bars, 500 µm. b, Primary tumour area (mean ± SD) measured after resection. Each dot is a mouse, n = 12 (control) and 6 (DIM) mice. P-value for linear model after Box-Cox transformation. c, Liver metastases (mean ± SD) generated by MTO AKP Emp1-iCT up to one month after primary tumour resection. Each dot is a mouse, n = 12 (control) and 6 (DIM) mice. P-value for generalized linear model with negative binomial family. Bottom panel indicates the percentage of mice that developed liver metastases in the same experiment. Analysed with a two-sided Fisher test. d, Inducible ablation and surgery schedule of mice with AKPS Emp1-iCT primary tumours. Panels D and D’ show immunostaining of TOM demonstrating effective ablation of Emp1-TOMhigh cells in DIM-treated primary tumours. Dashed lines delimit the caecum edge. Scale bars, 250 µm. e, Primary tumour area (mean ± SD) measured after resection. Each dot is a mouse, n = 17 (control) and 19 (DIM) mice. P-value for linear model after Box-Cox transformation. f, Liver metastases (mean ± SD) generated by MTO AKPS Emp1-iCT up to one month after primary tumour resection. Each dot is a mouse, n as in panel e. P-value for generalized linear model with negative binomial family. Bottom panel indicates the percentage of mice that developed liver metastases in the same experiment. Analysed with a two-sided Fisher test. g, Inducible ablation schedule of mice implanted with AKTP Emp1-iCT MTOs in the rectum. h, Longitudinal intravital BLI quantification of AKTP MTOs implanted in the rectum. i, Representative TOM and E-CADHERIN immunostainings of lungs from mice bearing AKTP rectal tumours. Lung metastases of increasing size are shown. 
Note that TOM expression is higher in micrometastases and progressively reduced. Scale bars, 50 µm. j, Primary rectal tumour area (mean ± SD) measured at sacrifice. Each dot is a mouse, n = 9 (control) and 10 (DIM) mice. P-value for linear model after Box-Cox transformation. k, Lung (left panel) and liver (middle panel) metastases (mean ± SD) generated by MTO Emp1-iCT injected in the rectum. Each dot is a mouse, n as in panel j. P-value for generalized linear model with Poisson family. Right panel shows the percentage of mice that developed metastases in the same experiment. Analysed with a two-sided Fisher test. l, CRISPR-Cas9 targeting strategy to introduce a DTR-GFP cassette into the Lgr5 locus of MTOs. Confocal imaging of immunostaining for EGFP and EPCAM in Lgr5-DTR-EGFP organoids. Scale bar, 30 µm. Right panel shows a representative flow cytometry plot of EGFP expression in wild-type and Lgr5-EGFP organoids. m, Relative Lgr5 mRNA expression (mean ± SD) of Lgr5-EGFPhigh versus -low cells isolated from Lgr5-DTR-EGFP subcutaneous tumours. n = 3 biological replicates. Two-sided t-test normalizing to B2M. n, Immunofluorescence showing EGFP and E-CADHERIN in primary tumours. Insets (N’ and N’’) correspond to invasion fronts and tumour buds lacking EGFP expression at higher magnification. Scale bars, 500 µm (n) and 100 µm (N’ and N’’). o, Quantification of Lgr5-EGFPhigh cells (defined as cells in percentile 90 for EGFP expression) in the tumour core, invasion fronts and tumour buds. Boxes represent the first, second (median) and third quartiles. Whiskers indicate maximum and minimum values. Paired two-sided Wilcoxon test on percentages. n = 11 mice. p, Representative images of Lgr5-EGFP staining in micro (P) and small (P’) metastases. Dashed lines and the yellow arrow surround a micrometastasis. Scale bars: (P) 50 µm; (P’) 250 µm. q, Percentage of tumour area containing Lgr5-EGFPhigh and Lgr5-EGFPlow cells versus metastasis size. 
Each dot represents an individual metastasis. r, CRISPR-Cas9 targeting strategy to introduce an iCaspase-9-TOM cassette into the Lgr5 locus of AKTP MTOs. s, Representative flow cytometry plot of TOM expression in Lgr5-iCasp9-tdTomato organoids. t, Quantification of Lgr5 mRNA (mean ± SD) by RT-qPCR in Lgr5-TOMhigh and Lgr5-TOMlow cells dissociated from primary tumours grown for 4 weeks. n = 3 primary tumours. Analysed with a mixed effects linear model. u, Timing of inducible ablation and surgery in mice implanted with AKTP Lgr5-iCasp9-TOM primary tumours. v, Representative flow cytometry plot of Lgr5-TOM fluorescence in controls versus dimerizer-treated mice. DAPI-/EPCAM+ cells are shown. w, Percentage (mean ± SD) of Lgr5high tumour cells (defined as the top 10% of the TOM+ population) in control and treated mice. n = 4 mice per group. Two-sided Wilcoxon test. x, Primary tumour area measured after resection. n = 15 mice per group. Mean ± SD, p-value of linear model after Box-Cox transformation. y, Liver metastases counted at experimental endpoints after primary tumour resection. n = 16 (control) and 21 (Lgr5-ablation) mice. Mean ± SD. Analysed with a linear model with negative binomial family. Left panel shows the percentage of mice that developed liver metastases in control and Lgr5-ablated tumours in the same experiment. Two-sided Fisher test.

Source data

Extended Data Fig. 12 Immune checkpoint immunotherapies prevent metastatic relapse.

a, Representative image of CD3+ cell distribution in primary AKTP CRC showing T cell exclusion. Arrows point to T cells located at the tumour periphery. b, Representative immunostaining of Emp1-TOM, CD3 and α-SMA in primary tumours. Scale bars, 100 µm. c, Dotplot summarizing regression models applied to multiplex immunofluorescence data. Effects of the total number of cells on the composition of every cell population are represented by different point sizes (defining the magnitude of the effect) and colours (showing both the sign of the effect in blue(-)/red(+) and the statistical significance by colour intensity). d, Dotplot showing examples of interferon response genes across 6 tumour scRNA-seq cell clusters as defined in Fig. 2g. e, Bioluminescence monitoring of the effect of the neoadjuvant immunotherapy regime used in Fig. 5k on primary tumour growth. Points and lines represent individual mice, trend lines (bold) show a LOESS model. n = 19 (control) and 10 (PD1+CTLA4) mice. Mixed effects linear model with data normalized to time 0 and mouse as random effect. f, Schematics of an experiment comparing metastatic relapse in untreated mice and mice given neoadjuvant treatment with anti-PD1 monotherapy or combined anti-PD1 and anti-CTLA4 therapy. g, Primary tumour area (mean ± SD) measured after resection in the experiment described in f. Each dot is an individual mouse. n = 10 mice per group. Linear model. h, Liver metastases (mean ± SD) generated by AKTP primary tumours up to one month after primary tumour resection in the experiment described in f. Each dot is an individual mouse. n = 9 (control), 8 (PD1), 10 (PD1/CTLA4). Generalized linear model of Poisson family. i, Percentage of mice that developed liver metastases or remained metastasis-free at experimental endpoints (4 weeks after resection) in control and immunotherapy-treated tumours. n = 9 (control), 8 (PD1), 10 (PD1/CTLA4). Generalized linear model with beta-binomial distribution.

Source data

Supplementary information

Supplementary Data

Supplementary Fig. 1: the raw western blot scan related to Extended Data Fig. 9g. Supplementary Fig. 2: the flow cytometry gating strategy.

Reporting Summary

Supplementary Table 1

Descriptive table of CRC metacohort. Datasets used for the metacohort of 1,830 patients and clinical information for each one.

Supplementary Table 2

EpiHR and coreHRC signatures. Correlation scores with the EpiHR signature for individual genes, in the SMC and KUL cohorts.

Supplementary Table 3

Correlation of basal pancreatic genes with the EpiHR signature. Correlation scores with the EpiHR signature for genes in the basal pancreatic signature.

Supplementary Table 4

Functional enrichment of cell population across metastatic relapse. −log10[Padj] for functional enrichment analysis on Hallmark gene sets in Seurat tumour cell clusters from Fig. 2g.

Supplementary Table 5

Genotyping primers to validate the insertion of our cassettes in the Emp1 and Lgr5 loci.

Supplementary Table 6

Immunofluorescence protocol. This table shows the antibodies that were used for immunofluorescence stainings, as well as the experimental protocol that we followed.

Supplementary Table 7

Gene signatures used in the study. The custom gene set used in this study. The table provides the NCBI gene symbol of the components of every set.

Supplementary Video 1

Surgical resection of AKTP primary tumours in the distal caecum. The surgical procedure to extirpate a single CRC tumour implanted in the tip of the caecum for three weeks is shown.

Source data

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Cañellas-Socias, A., Cortina, C., Hernando-Momblona, X. et al. Metastatic recurrence in colorectal cancer arises from residual EMP1+ cells. Nature 611, 603–613 (2022). https://doi.org/10.1038/s41586-022-05402-9

  • DOI: https://doi.org/10.1038/s41586-022-05402-9
