Cultured cell lines are the workhorse of cancer research, but the extent to which they recapitulate the heterogeneity observed among malignant cells in tumors is unclear. Here we used multiplexed single-cell RNA-seq to profile 198 cancer cell lines from 22 cancer types. We identified 12 expression programs that are recurrently heterogeneous within multiple cancer cell lines. These programs are associated with diverse biological processes, including cell cycle, senescence, stress and interferon responses, epithelial–mesenchymal transition and protein metabolism. Most of these programs recapitulate those recently identified as heterogeneous within human tumors. We prioritized specific cell lines as models of cellular heterogeneity and used them to study subpopulations of senescence-related cells, demonstrating their dynamics, regulation and unique drug sensitivities, which were predictive of clinical response. Our work describes the landscape of heterogeneity within diverse cancer cell lines and identifies recurrent patterns of heterogeneity that are shared between tumors and specific cell lines.
Raw and processed scRNA-seq data are available through the Broad Institute’s single-cell portal (SCP542) and at the Gene Expression Omnibus (GEO) (accession number GSE157220). Publicly available databases used in our analysis included the DepMap portal (18q3 data release; https://depmap.org/), the CCLE portal (https://portals.broadinstitute.org/ccle), the CTD2 portal (https://ocg.cancer.gov/programs/), GTRD database version 20.06 (http://gtrd.biouml.org) and MSigDB version 7.0 (https://www.gsea-msigdb.org/gsea/msigdb/index.jsp).
R code for reproducing the analyses shown in the main figures is available at https://github.com/gabrielakinker/CCLE_heterogeneity. Additional code related to extended data and supplementary figures is available upon request from the corresponding author. Code used for the assignment of cells to reference expression profiles is available at https://github.com/broadinstitute/single_cell_classification.
This work was supported by funding from the Israel Science Foundation (I.T.), the Zuckerman STEM leadership program (I.T.), a Mexican Friends New Generation grant (I.T.), the Rising Tide Foundation (I.T.), the AMN Fund for the Promotion of Science, Culture and Arts in Israel (I.T.), the Estate of Dr. David Levinson, the Dr. Celia Zwillenberg-Fridman and Dr. Lutz Zwillenberg Career Development Chair (I.T.), the Sao Paulo Research Foundation (FAPESP) (fellowships 2014/27287-0 and 2017/24287-8 (G.S.K.)), the Clore Foundation Postdoctoral Fellowship (A.C.G.), the Klarman Cell Observatory (A.R.), the Howard Hughes Medical Institute (A.R.), the National Cancer Institute (K08CA237732; S.V.P.), a V Foundation V Scholars Award (S.V.P.), a Cancer Research Foundation Young Investigator Award (S.V.P.) and the Doris Duke Fund to Retain Clinical Scientists (S.V.P.).
A.R. is a cofounder and equity holder of Celsius Therapeutics, an equity holder of Immunitas and was an SAB member of Neogene Therapeutics, Thermo Fisher Scientific, Asimov and Syros Pharmaceuticals until 31 July 2020. Since 1 August 2020, A.R. has been an employee of Genentech, a member of the Roche group. O.R.-R. is a co-inventor on patent applications filed by the Broad Institute for inventions relating to single-cell genomics, such as PCT/US2018/060860 and US provisional application no. 62/745,259. No other authors declare competing interests.
a, Performance of DBSCAN using different sizes of epsilon neighborhood (eps) and minimum numbers of points required to form a dense region (MinPts). We randomly selected cells from two different cell lines and tested the ability of DBSCAN to distinguish between them (two-sided Fisher’s exact test, P < 0.001) using different parameter combinations. The procedure was repeated 1,000 times and the combination yielding the highest rate of correct classification was applied in the subsequent analyses. b, t-SNE plots for two additional examples of cell lines from each of the four classes defined by the presence and number of discrete subpopulations identified by DBSCAN (as in Fig. 2b). c, d, Identification of discrete programs of heterogeneity, as in Fig. 2b, using less stringent eps (1.2 and 1.5) highlights common trends. e, Number of heterogeneity programs identified per cell line using NMF. NMF was applied to each cell line using k (number of factors) of 6–9, and gene programs identified as variable with 2 or more values of k were retained (left panel, n = 1,445). To identify common expression programs varying within multiple cell lines, we excluded programs with limited similarity to all other programs as well as those associated with technical confounders (right panel, n = 800). f, Pairwise similarities between programs identified by NMF across all the cell lines analyzed, with programs ordered by hierarchical clustering. Programs with limited similarity to all other programs were excluded. Top panel indicates correlations between program scores and cell complexity (that is, the number of genes detected per cell). The cluster of programs that correlates with complexity (indicated by dashed lines) was excluded from subsequent analyses.
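The parameter search described in panel a can be illustrated with a minimal sketch. It is not the authors' R code: the cells are toy two-dimensional points rather than t-SNE coordinates, the success criterion is an exact match between clusters and cell lines instead of the Fisher's exact test used in the paper, and every function name and grid value below is a hypothetical choice made for illustration.

```python
import math
import random


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: one label per point, cluster id >= 0 or -1 for noise."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbors(j)
            if len(nb_j) >= min_pts:  # core point: keep expanding
                queue.extend(nb_j)
    return labels


def splits_two_lines(labels, truth):
    """True if there are exactly two clusters, no noise, and each cluster is one line."""
    return (-1 not in labels
            and len(set(labels)) == 2
            and len(set(zip(labels, truth))) == 2)


def toy_cells(rng, center, n=15):
    """Hypothetical stand-in for cells sampled from one cell line."""
    return [(center[0] + rng.uniform(-0.5, 0.5),
             center[1] + rng.uniform(-0.5, 0.5)) for _ in range(n)]


def success_rates(eps_grid, min_pts_grid, trials=20, seed=0):
    """Fraction of random trials in which each (eps, MinPts) pair separates the lines."""
    rng = random.Random(seed)
    hits = {(e, m): 0 for e in eps_grid for m in min_pts_grid}
    for _ in range(trials):
        points = toy_cells(rng, (0, 0)) + toy_cells(rng, (10, 10))
        truth = ["line_A"] * 15 + ["line_B"] * 15
        for key in hits:
            if splits_two_lines(dbscan(points, *key), truth):
                hits[key] += 1
    return {key: count / trials for key, count in hits.items()}


rates = success_rates(eps_grid=(0.01, 1.5, 20.0), min_pts_grid=(3,))
best = max(rates, key=rates.get)  # combination with the highest success rate
```

In this toy setting a very large eps merges both lines into one cluster and a very small eps leaves most cells as noise, so the intermediate value is selected.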
Extended Data Fig. 2 In vivo programs of variability and comparison to in vitro cell cycle programs.
a, Heatmap depicts pairwise similarities between programs identified in tumor samples using NMF. Programs with limited similarity to all other programs were excluded. Top panel shows tumor type and correlations between program scores and cell complexity (that is, the number of genes detected per cell). Hierarchical clustering emphasizes multiple clusters (shown by squares), one of which is correlated with cell complexity and thus excluded as a potential technical artifact. b, NMF scores of G1/S genes (top panel) and G2/M genes (bottom panel) across all NMF programs associated with the corresponding cell cycle phase; each program is from a different cell line. Genes are ranked in each panel by average scores, and their assignment to in vitro and in vivo cell cycle programs is indicated in the right bar, demonstrating that G1/S programs differ both across cell lines and between cell lines and tumors, while G2/M programs are more consistent. Venn diagrams (right) illustrate the overlap of genes between in vivo and in vitro RHPs. c, Single-cell profiles (n = 264 cells from NCIH2126) showing G1/S and G2/M program score thresholds used to assign cells to different cell cycle phases. d, Examples of genes with distinct cell cycle upregulation in vitro and in vivo. Expression of HIST genes (preferentially induced in vitro) and MCM genes (preferentially induced in vivo) is shown along the cell cycle (relative to cells in G0) in cell lines (C, green lines) and tumors (D, yellow lines). e, Comparison of cell cycle phase distribution in vitro and in vivo. Scatterplot shows the percentage of cells in G0 (x-axis) and the ratio between the percentage of cells in G1/S and G2/M (y-axis) for each cell line (green, n = 198) and tumor (yellow, n = 25) analyzed. Cell lines display a significantly lower percentage of cells in G0 (P = 2e−10, two-sided t-test).
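The threshold-based phase assignment in panel c and the two summary statistics plotted in panel e can be sketched as follows. This is an illustration under assumptions, not the published procedure: the cutoff values, the precedence of G2/M over G1/S when both scores are high, and all names are hypothetical.

```python
from collections import Counter


def assign_phase(g1s_score, g2m_score, g1s_cut, g2m_cut):
    """Assign a cell to a phase from its two program scores (assumed rule)."""
    if g2m_score >= g2m_cut:
        return "G2/M"
    if g1s_score >= g1s_cut:
        return "G1/S"
    return "G0"  # neither program is high: treated as non-cycling


def summarize(cells, g1s_cut, g2m_cut):
    """Return (% of cells in G0, ratio of G1/S to G2/M cells) for one sample."""
    counts = Counter(assign_phase(g1s, g2m, g1s_cut, g2m_cut) for g1s, g2m in cells)
    pct_g0 = 100 * counts["G0"] / len(cells)
    ratio = counts["G1/S"] / counts["G2/M"] if counts["G2/M"] else None
    return pct_g0, ratio


# Hypothetical (G1/S score, G2/M score) pairs for five cells.
cells = [(0.1, 0.2), (1.5, 0.1), (1.2, 1.4), (0.0, 1.8), (0.3, 0.1)]
pct_g0, ratio = summarize(cells, g1s_cut=1.0, g2m_cut=1.0)
```

Each cell line or tumor then contributes one point (pct_g0, ratio) to a scatterplot of the kind described in panel e.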
a, Average distribution of cell cycle phases in cells with high RHP scores (top 5%). b, Median RHP scores of cells in each phase of the cell cycle. Cell cycle state was estimated for each individual cell based on the relative expression of the G1/S and G2/M metaprograms (see Extended Data Fig. 2c). For each RHP in (a) and (b) we only considered the respective model cell lines (see Supplementary Table 3). c, Distribution of RHPs (n = 680) across the 9 pools (H corresponds to the custom HNSCC pool). Each RHP was observed in multiple pools, underscoring the lack of pool-specific effects and the robustness of RHPs. d, The fraction of cell lines from each cancer type (rows) observed in each RHP (columns) is indicated by circle size. Red circles depict significant enrichments (P < 0.001 by hypergeometric test). Left panel shows the fraction of each cancer type in our dataset and the number of cell lines profiled. e, Hierarchical clustering of cell lines based on their set of RHPs (bottom panel). Clusters composed of more than 5 cell lines are annotated (1–9). Relevant cell line features are shown on top and their associations with each cell line cluster were tested using a hypergeometric test. Significant associations (P < 0.001) are indicated by black squares and arrows. These include associations of cluster 2 with melanoma, cluster 4 with HNSCC and squamous cells, and cluster 8 with wild-type p53.
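The hypergeometric enrichment tests used in panels d and e can be illustrated with a short sketch of the upper-tail P value. The function name and the example counts are hypothetical; only the total of 198 cell lines comes from the text.

```python
from math import comb


def hypergeom_enrichment_p(total, annotated, drawn, overlap):
    """P(X >= overlap) when `drawn` items are sampled without replacement
    from `total` items, of which `annotated` carry the feature of interest."""
    upper = min(annotated, drawn)
    tail = sum(comb(annotated, i) * comb(total - annotated, drawn - i)
               for i in range(overlap, upper + 1))
    return tail / comb(total, drawn)


# Hypothetical example: 198 cell lines, 20 of one cancer type, 30 lines
# showing a given RHP, 12 of which belong to that cancer type.
p_value = hypergeom_enrichment_p(total=198, annotated=20, drawn=30, overlap=12)
is_enriched = p_value < 0.001
```

The same function applies to panel e by treating a cell line cluster as the drawn set and a cell line feature, such as wild-type p53, as the annotation.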
a–d, Combined PCA of in vitro and in vivo cells in melanoma (A-B) and HNSCC (C-D). (a, c) Top 10 genes by positive (left) and negative (right) loadings of PC1–5; Genes of respective RHPs (SkinPig. and EMT-I in melanoma, EpiSen and EMT-II in HNSCC) are emphasized (bold). (b, d) Coordinates of the three PCs associated with the respective RHPs across in vitro (left, melanoma = 3,033 cells, HNSCC = 2,780 cells) and in vivo (right, melanoma = 1,169 cells, HNSCC = 1,078 cells) cells. Cells are colored by their cell line or tumor. Similar results were obtained when PCs were not combined (not shown). (e) Heatmap shows relative expression of genes shared by paired in vivo and in vitro programs in selected melanoma (top panels) and HNSCC (bottom panels) cell lines and tumors, highlighting similar patterns of variability in vivo and in vitro. Cells are sorted according to the relative average expression of genes in each program, showing the negative correlation between the AXL and MITF programs in melanomas and the pEMT and EpiSen programs in HNSCC. Programs are annotated (right) and selected genes are indicated (left).
a, Association between RHP variability scores and somatic non-silent mutations. We compared the variability of each program in mutated and non-mutated cell lines using a two-sided t-test. Model cell lines (high variability) of the EMT-II, IFN response and p53-dependent senescence RHPs are depleted of NOTCH4, MRE11 and TP53 mutations, respectively. Horizontal lines indicate the median. b, Association between drug response (CTRP database) and program variability, calculated using linear regression with tumor type and program variability as independent variables. Increased sensitivity to NOTCH inhibition (gamma secretase inhibitors) and MDM2 inhibition (Nutlin-3) was observed in model cell lines (high variability) of the EMT-II and the p53-dependent senescence RHPs, respectively.
a, Significance (-log10(P), hypergeometric test) of the overlaps between sets of genes upregulated upon senescence, as defined by multiple studies (see Table S8). b, Most common genes across the different senescence programs, including all genes appearing in at least three programs, ranked from top to bottom by the number of programs. c, Density of RHP signature genes within sliding windows of 300 genes, among the top 6,000 expressed genes, arranged by their expression response to etoposide, from the most upregulated (top) to the most downregulated (bottom), as also shown in Fig. 6b. RHPs are sorted from left to right by their enrichment with upregulated and downregulated genes, respectively. d, e, Significance of gene-set overlaps (d), and the most common genes (e), as in (a, b), when restricting the analysis to secreted genes, in order to define SASP.
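The sliding-window density in panel c can be sketched as below. The caption specifies windows of 300 genes over the top 6,000 expressed genes ranked by etoposide response; the window step of one gene, the function name and the toy gene list are assumptions made for illustration.

```python
def sliding_density(ranked_genes, signature, window):
    """Fraction of signature genes in each window of consecutive ranked genes.

    ranked_genes: genes ordered from most upregulated to most downregulated.
    Returns one density per window start position (step of one gene).
    """
    if window <= 0 or window > len(ranked_genes):
        raise ValueError("window must be between 1 and the number of genes")
    hits = [1 if gene in signature else 0 for gene in ranked_genes]
    current = sum(hits[:window])
    densities = [current / window]
    for start in range(1, len(hits) - window + 1):
        # Slide by one gene: drop the gene leaving, add the gene entering.
        current += hits[start + window - 1] - hits[start - 1]
        densities.append(current / window)
    return densities


# Toy example: six ranked genes, a two-gene signature, windows of three genes.
toy_density = sliding_density(list("abcdef"), {"a", "b"}, window=3)
```

With the caption's settings the call would be `sliding_density(ranked_genes, rhp_genes, window=300)` on a list of 6,000 genes, giving 5,701 window densities per RHP.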
a, Gating scheme for isolation of EpiSen-high and EpiSen-low subpopulations in the JHU006 cell line, including doublet and dead cell exclusion. The top 10% (high) and bottom 10% (low) of cells were sorted. Enrichment of the full EpiSen RHP in AXL-CLDN4+ (EpiSen-high) sorted cells was validated by bulk RNA-Seq of sorted cells (Fig. 5c, Methods). b, FACS analysis of cell cycle by the DNA binding dye propidium iodide (PI) on sorted EpiSen-high and EpiSen-low cells in SCC47, as shown for JHU006 in Fig. 5d. The table summarizes the results. c, Cell cycle scores of the G1/S (X-axis) and G2/M (Y-axis) programs, shown for in vivo HNSCC cells from six tumors (panels). Top 10% of cells scoring highly for the EpiSen-related program (EpiDif1) are shown in black, demonstrating their enrichment among non-cycling cells in each of the tumors (P < 0.001, hypergeometric test). d, Three subpopulations were isolated by FACS (EpiSen-high: AXL-CLDN4+, EpiSen-low: AXL+CLDN4-, control: unsorted) from SCC47 (left) and JHU006 (right), and analyzed immediately after sorting (day 0, top) and at two additional time points in culture (day 7 and 14, middle and bottom, respectively). Density plots correspond to the pie charts in Fig. 5e.
a, Average expression log-ratio for the gene-sets representing each RHP (each row represents an RHP), upon treatment of SCC47 or JHU006 with multiple perturbations (each column represents one perturbation in one cell line, as indicated at the bottom, averaged over duplicates). b, Comparison of the EMT program induced upon TGFβ treatment of unsorted cells (X-axis) vs. the EMT-II RHP gene scores (Y-axis). In both axes, data were averaged over the results for JHU006 and SCC47. c, Isolation by FACS of the EMT-II-high population (AXL+ITGA6+) and the EMT-II-low population (AXL-ITGA6-) in JHU006. d, Both TGFβ3-treated cells and EMT-II-high sorted cells are associated with increased migration. Shown is a gap closure (migration) assay, performed on unsorted, unsorted but TGFβ3-treated, EMT-II-high, and EMT-II-low cells, at 0 h and 12 h following gap generation. Scale bar length is 1,000 μm.
a, Pie charts depict the proportions of primary screen hits by type. b, Shared hits between SCC47 and JHU006 for compounds that preferentially killed the EpiSen-high (green) and EpiSen-low (orange) states. Selected hits are labeled. c, Viability of the control population (X-axis) and differential viability of the EpiSen-high vs. EpiSen-low populations (Y-axis) upon treatment with 2198 compounds in JHU006 (left) and SCC47 (right). Dotted lines represent thresholds for differential sensitivity, and hits are colored as defined in the lower legend. d, Dose response curves of selected compounds in three SCC47 subpopulations at seven concentrations measured in duplicate (continued from Fig. 6c). Change in viability was calculated relative to vehicle (DMSO-treated) controls. Error bars represent standard deviation, data points represent the mean of replicates.
Extended Data Fig. 10 Association between the abundance of EpiSen cells and clinical response of HNSCC patients to Cetuximab.
a, Volcano plots depict differential expression analysis comparing bulk pretreatment samples of HNSCC patients with short (n = 14) and long (n = 26) PFS following Cetuximab treatment plus platinum-based chemotherapy. Comparisons were performed using a two-sided t-test, and P values were adjusted using the FDR procedure. Genes in the HNSCC EpiSen program from cell lines (in vitro, top panel) and tumors (in vivo, middle panel) are highlighted, as well as those shared between the two programs (bottom panel). b, Heatmap showing the expression of shared EpiSen genes in samples stratified into short and long PFS. Genes are ordered by differential expression (log2(fold change)), as shown in (a), and tumors are ordered within each group according to the EpiSen score. Top panels depict sample type, tumor stage (T), tumor grade, and site of primary tumors.
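The P value adjustment mentioned in panel a can be sketched as follows, assuming that "the FDR procedure" refers to the Benjamini-Hochberg method; the caption does not name the method, so this is an assumption, and the function name and example P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonic adjusted values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene P values from the short-versus-long PFS comparison.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
```

Genes would then be plotted in the volcano plot by log2(fold change) against the adjusted P value.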
Cite this article
Kinker, G.S., Greenwald, A.C., Tal, R. et al. Pan-cancer single-cell RNA-seq identifies recurring programs of cellular heterogeneity. Nat Genet 52, 1208–1218 (2020). https://doi.org/10.1038/s41588-020-00726-6