This article has been updated


Intratumoral heterogeneity is a major obstacle to cancer treatment and a significant confounding factor in bulk-tumor profiling. We performed an unbiased analysis of transcriptional heterogeneity in colorectal tumors and their microenvironments using single-cell RNA–seq from 11 primary colorectal tumors and matched normal mucosa. To robustly cluster single-cell transcriptomes, we developed reference component analysis (RCA), an algorithm that substantially improves clustering accuracy. Using RCA, we identified two distinct subtypes of cancer-associated fibroblasts (CAFs). Additionally, epithelial–mesenchymal transition (EMT)-related genes were found to be upregulated only in the CAF subpopulation of tumor samples. Notably, colorectal tumors previously assigned to a single subtype on the basis of bulk transcriptomics could be divided into subgroups with divergent survival probability by using single-cell signatures, thus underscoring the prognostic value of our approach. Overall, our results demonstrate that unbiased single-cell RNA–seq profiling of tumor and matched normal samples provides a unique opportunity to characterize aberrant cell states within a tumor.


Change history

  • 12 November 2018

    In the version of the article published, the author list is not accurate. Igor Cima and Min-Han Tan should have been authors, appearing after Mark Wong in the author list, while Paul Jongjoon Choi should not have been listed as an author. Igor Cima and Min-Han Tan both have the affiliation Institute of Bioengineering and Nanotechnology, Singapore, Singapore, and their contributions should have been noted in the Author Contributions section as "I.C. preprocessed Primary Cell Atlas data with inputs from M.-H.T." The following description of the contribution of Paul Jongjoon Choi should not have appeared: "P.J.C. supported the smFISH experiments." In the 'RCA: global panel' section of the Online Methods, the following sentence should have appeared as the second sentence, "An expression atlas of human primary cells (the Primary Cell Atlas) was preprocessed similarly to in ref. 55," with new reference 55 (Cima, I. et al. Tumor-derived circulating endothelial cell clusters in colorectal cancer. Science Transl. Med. 8, 345ra89, 2016).


Primary accessions

Gene Expression Omnibus



We would like to thank L. Suteja, C. Kang, S. Sudhagar, J. Sheik and M.N. Ramalingam for technical assistance, A. Brichkina (Institute of Molecular and Cell Biology, A*STAR) for providing the antibody to SMA, V. Sivakamasundari for guidance on single-cell protocols, M.H. Chew, R. Ten, W.J. Lim, J.H. Lai, C.Y. Ng and D. Koh for assistance with clinical sample collection, and Y. Hu, S. Ghosh, H. Kitano and D. Tan for feedback and scientific discussions. This study was supported by core funds from the Agency for Science, Technology and Research (A*STAR) and also by grant JCO1331CFG080 from A*STAR's Joint Council Office. P.R. acknowledges support from Agency for Science, Technology and Research grants IAF111091 and IAF111128 and associated in-kind contributions from industry partners Fluidigm Singapore and Becton Dickinson Holdings, respectively. S.L.K. and A.M.H. acknowledge the Strategic Positioning Fund (SPF2012/003) from the Biomedical Research Council (BMRC).

Author information

Author notes

    • Huipeng Li & Elise T Courtois

    These authors contributed equally to this work.


  1. Computational and Systems Biology, Genome Institute of Singapore, Singapore.

    • Huipeng Li, Elise T Courtois, Debarka Sengupta, Yuliana Tan & Shyam Prabhakar
  2. Developmental Cellomics Laboratory, Genome Institute of Singapore, Singapore.

    • Elise T Courtois, Yuliana Tan & Paul Robson
  3. Department of Computer Science and Engineering and Center for Computational Biology, Indraprastha Institute of Information Technology, Delhi, India.

    • Debarka Sengupta
  4. Synthetic Biology, Genome Institute of Singapore, Singapore.

    • Kok Hao Chen, Jolene Jie Lin Goh & Paul Jongjoon Choi
  5. Cancer Therapeutics and Stratified Oncology, Genome Institute of Singapore, Singapore.

    • Say Li Kong, Axel M Hillmer & Iain Beehuat Tan
  6. Department of Medical Oncology, National Cancer Centre Singapore, Singapore.

    • Clarinda Chua & Iain Beehuat Tan
  7. Department of Pathology, Singapore General Hospital, Singapore.

    • Lim Kiat Hon
  8. Department of Colorectal Surgery, Singapore General Hospital, Singapore.

    • Wah Siew Tan & Mark Wong
  9. Data Analytics Department, Institute for Infocomm Research, Singapore.

    • Lawrence J K Wee
  10. Program in Cancer and Stem Cell Biology, Duke–NUS Medical School, Singapore.

    • Iain Beehuat Tan
  11. The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA.

    • Paul Robson
  12. Department of Genetics and Genome Sciences, Institute for Systems Genomics, University of Connecticut, Farmington, Connecticut, USA.

    • Paul Robson
  13. Department of Biological Sciences, National University of Singapore, Singapore.

    • Paul Robson



H.L., E.T.C., L.J.K.W., I.B.T., P.R. and S.P. conceived the idea and designed the study. H.L. developed the computational algorithms and performed the bioinformatic analysis. E.T.C. optimized and conducted the experiments. D.S. assisted with the data analysis. Y.T. assisted with the experiments. J.J.L.G. and K.H.C. performed the smFISH experiments. S.L.K. assisted with the initial protocol validation. W.S.T. and L.K.H. extracted and preprocessed clinical samples and guided patient selection. C.C. coordinated the clinical sample registration, preprocessing and logistics. P.J.C. supported the smFISH experiments. A.H. provided guidance in experimental protocol design. H.L. and E.T.C. analyzed the data and interpreted the results. S.P. guided the development of computational algorithms. I.B.T., P.R. and S.P. provided guidance in data analysis and interpretation of the results. H.L., E.T.C., I.B.T., P.R. and S.P. wrote the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Iain Beehuat Tan or Paul Robson or Shyam Prabhakar.

Supplementary information

PDF files

  1.

    Supplementary Text and Figures

    Supplementary Figures 1–16, Supplementary Tables 2 and 6, and Supplementary Note.

CSV files

  1.

    Supplementary Table 1

    Cell type annotation for the melanoma data set.

  2.

    Supplementary Table 5

    Differentially expressed genes between bulk tumor and matched normal tissue (TCGA data).

Excel files

  1.

    Supplementary Table 3

    Differentially expressed genes between epithelial subtypes in normal mucosa.

  2.

    Supplementary Table 4

    Cell-type-specific differential expression analysis between normal mucosa and tumor.
