Single-cell analysis reveals a stem-cell program in human metastatic breast cancer cells


Despite major advances in understanding the molecular and genetic basis of cancer, metastasis remains the cause of >90% of cancer-related mortality1. Understanding metastasis initiation and progression is critical to developing new therapeutic strategies to treat and prevent metastatic disease. Prevailing theories hypothesize that metastases are seeded by rare tumour cells with unique properties, which may function like stem cells in their ability to initiate and propagate metastatic tumours2,3,4,5. However, the identity of metastasis-initiating cells in human breast cancer remains elusive, and whether metastases are hierarchically organized is unknown2. Here we show at the single-cell level that early stage metastatic cells possess a distinct stem-like gene expression signature. To identify and isolate metastatic cells from patient-derived xenograft models of human breast cancer, we developed a highly sensitive fluorescence-activated cell sorting (FACS)-based assay, which allowed us to enumerate metastatic cells in mouse peripheral tissues. We compared gene signatures in metastatic cells from tissues with low versus high metastatic burden. Metastatic cells from low-burden tissues were distinct owing to their increased expression of stem cell, epithelial-to-mesenchymal transition, pro-survival, and dormancy-associated genes. By contrast, metastatic cells from high-burden tissues were similar to primary tumour cells, which were more heterogeneous and expressed higher levels of luminal differentiation genes. Transplantation of stem-like metastatic cells from low-burden tissues showed that they have considerable tumour-initiating capacity, and can differentiate to produce luminal-like cancer cells. Progression to high metastatic burden was associated with increased proliferation and MYC expression, which could be attenuated by treatment with cyclin-dependent kinase (CDK) inhibitors. These findings support a hierarchical model for metastasis, in which metastases are initiated by stem-like cells that proliferate and differentiate to produce advanced metastatic disease.

Figure 1: Single-cell analysis of normal human mammary epithelial cells.
Figure 2: Identification of human metastatic cells in PDX mice.
Figure 3: Early stage metastatic cells possess a distinct basal/stem-cell program.
Figure 4: Metastatic progression is blocked by cell cycle inhibition.

Accession codes

Primary accessions

Gene Expression Omnibus

Data deposits

Single-cell multiplex qPCR data have been deposited in the Gene Expression Omnibus under accession number GSE70555.


References

  1. Weigelt, B., Peterse, J. L. & van’t Veer, L. J. Breast cancer metastasis: markers and models. Nature Rev. Cancer 5, 591–602 (2005)
  2. Oskarsson, T., Batlle, E. & Massagué, J. Metastatic stem cells: sources, niches, and vital pathways. Cell Stem Cell 14, 306–321 (2014)
  3. Hermann, P. C. et al. Distinct populations of cancer stem cells determine tumor growth and metastatic activity in human pancreatic cancer. Cell Stem Cell 1, 313–323 (2007)
  4. Pang, R. et al. A subpopulation of CD26+ cancer stem cells with metastatic capacity in human colorectal cancer. Cell Stem Cell 6, 603–615 (2010)
  5. Dieter, S. M. et al. Distinct types of tumor-initiating cells form human colon cancer tumors and metastases. Cell Stem Cell 9, 357–365 (2011)
  6. Grigoriadis, A. et al. Establishment of the epithelial-specific transcriptome of normal and malignant human breast cells based on MPSS and array expression data. Breast Cancer Res. 8, R56 (2006)
  7. Jones, C. et al. Expression profiling of purified normal human luminal and myoepithelial breast cells: identification of novel prognostic markers for breast cancer. Cancer Res. 64, 3037–3045 (2004)
  8. Kendrick, H. et al. Transcriptome analysis of mammary epithelial subpopulations identifies novel determinants of lineage commitment and cell fate. BMC Genomics 9, 591 (2008)
  9. Raouf, A. et al. Transcriptome analysis of the normal human mammary cell commitment and differentiation process. Cell Stem Cell 3, 109–118 (2008)
  10. Shehata, M. et al. Phenotypic and functional characterisation of the luminal cell hierarchy of the mammary gland. Breast Cancer Res. 14, R134 (2012)
  11. Shackleton, M. et al. Generation of a functional mammary gland from a single stem cell. Nature 439, 84–88 (2006)
  12. Stingl, J. et al. Purification and unique properties of mammary epithelial stem cells. Nature 439, 993–997 (2006)
  13. Lim, E. et al. Aberrant luminal progenitors as the candidate target population for basal tumor development in BRCA1 mutation carriers. Nature Med. 15, 907–913 (2009)
  14. DeRose, Y. S. et al. Tumor grafts derived from women with breast cancer authentically reflect tumor pathology, growth, metastasis and disease outcomes. Nature Med. 17, 1514–1520 (2011)
  15. Dent, R. et al. Triple-negative breast cancer: clinical features and patterns of recurrence. Clin. Cancer Res. 13, 4429–4434 (2007)
  16. Malik, N., Canfield, V. A., Beckers, M. C., Gros, P. & Levenson, R. Identification of the mammalian Na,K-ATPase β3 subunit. J. Biol. Chem. 271, 22754–22758 (1996)
  17. Mani, S. A. et al. The epithelial-mesenchymal transition generates cells with properties of stem cells. Cell 133, 704–715 (2008)
  18. Guo, W. et al. Slug and Sox9 cooperatively determine the mammary stem cell state. Cell 148, 1015–1028 (2012)
  19. Landis, M. D., Lehmann, B. D., Pietenpol, J. A. & Chang, J. C. Patient-derived breast tumor xenografts facilitating personalized cancer therapy. Breast Cancer Res. 15, 201 (2013)
  20. Cheung, K. J., Gabrielson, E., Werb, Z. & Ewald, A. J. Collective invasion in breast cancer requires a conserved basal epithelial program. Cell 155, 1639–1651 (2013)
  21. Bragado, P. et al. TGF-β2 dictates disseminated tumour cell fate in target organs through TGF-β-RIII and p38α/β signalling. Nature Cell Biol. 15, 1351–1361 (2013)
  22. Kim, R. S. et al. Dormancy signatures and metastasis in estrogen receptor positive and negative breast cancer. PLoS ONE 7, e35569 (2012)
  23. Horiuchi, D. et al. MYC pathway activation in triple-negative breast cancer is synthetic lethal with CDK inhibition. J. Exp. Med. 209, 679–696 (2012)
  24. Huskey, N. E. et al. CDK1 inhibition targets the p53-NOXA-MCL1 axis, selectively kills embryonic stem cells, and prevents teratoma formation. Stem Cell Reports 4, 374–389 (2015)
  25. Parry, D. et al. Dinaciclib (SCH 727965), a novel and potent cyclin-dependent kinase inhibitor. Mol. Cancer Ther. 9, 2344–2353 (2010)
  26. Luo, B. et al. Highly parallel identification of essential genes in cancer cells. Proc. Natl Acad. Sci. USA 105, 20380–20385 (2008)
  27. Györffy, B. et al. An online survival analysis tool to rapidly assess the effect of 22,277 genes on breast cancer prognosis using microarray data of 1,809 patients. Breast Cancer Res. Treat. 123, 725–731 (2010)
  28. Nguyen-Ngoc, K. V. et al. ECM microenvironment regulates collective migration and local dissemination in normal and malignant mammary epithelium. Proc. Natl Acad. Sci. USA 109, E2595–E2604 (2012)
  29. Dalerba, P. et al. Single-cell dissection of transcriptional heterogeneity in human colon tumors. Nature Biotechnol. 29, 1120–1127 (2011)
  30. Guo, G. et al. Resolution of cell fate decisions revealed by single-cell gene expression analysis from zygote to blastocyst. Dev. Cell 18, 675–685 (2010)
  31. Devonshire, A. S., Elaswarapu, R. & Foy, C. A. Applicability of RNA standards for evaluating RT-qPCR assays and platforms. BMC Genomics 12, 118 (2011)
  32. R Development Core Team. R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2012)
  33. McDavid, A. et al. Data exploration, quality control and testing in single-cell qPCR-based gene expression experiments. Bioinformatics 29, 461–467 (2013)


Acknowledgements

We thank A. Welm for providing access to PDX tissues developed by her group, which served as the foundation for this study. We also thank K. Lee, R. Kumar, A. Le, R. Daneman, J. Stingl and M. Binneweis for comments and technical contributions. This study was supported by funds from the National Cancer Institute (CA180039 and CA136717), Stand Up To Cancer/AACR (DT0409), the Era of Hope Scholar Award (W81XWH-12-1-0272), the Breast Cancer Research Foundation and the Atwater Foundation, and D. and J. Vander Wall. D.A.L. was supported by a US Department of Defense Congressionally Directed Medical Research Program postdoctoral fellowship (11-1-0742), and C.W. is supported by a grant from the Ministry of Science and Technology, Taiwan (104-2917-I-006-002).

Author information




K.T. initiated the PDX models, and along with D.A.L., Y.Y., H.E. and A.Z. performed transplants and maintained serial passages of PDX models. D.A.L., K.D.P., and Y.Y. harvested and analysed PDX tissues. K.D.P. performed histological analysis of PDX mouse tissues. D.A.L., K.D.P., Y.Y., A.Z. and H.E. performed dinaciclib treatment experiments. K.K. performed dinaciclib experiments. P.Y. prepared reduction mammoplasty samples. D.A.L. isolated cells by FACS and performed single-cell dynamic array experiments. N.R.B. designed algorithms for single-cell qPCR analyses in R and contributed to multiplex PCR experimental design. D.A.L. and N.R.B. performed analyses in R. D.A.L. wrote the manuscript, and with K.K. designed figures and schematics. C.-Y.W. and S.B. performed bioinformatics analyses. All authors contributed to experimental design and conceived experiments. A.G. and Z.W. provided overall guidance, funding and assisted in manuscript completion.

Corresponding authors

Correspondence to Andrei Goga or Zena Werb.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 Identification and validation of CD298 for detection of human cells.

a, Analysis of published microarray data identified CD298 as highly expressed on many PDX breast cancer models and corresponding original patient tumours. The heatmap shows genes rank ordered from highest to lowest for raw expression values across all samples. The inset (bottom) highlights expression for CD298 (also known as ATP1B3). CD298 ranked number 35 out of over 590 plasma membrane genes. b, FACS for CD298 on human (top) and mouse (bottom) mammary cell lines to establish species specificity. c, FACS on primary PDX tumour cells comparing CD298 expression with other markers used in related applications (EpCAM, CD24, MHC I; percentages indicate dual-positive cells) (n = 3). EpCAM is used to identify CTCs in the clinic; CD24 is a pan-epithelial marker; and MHC I is used as a ubiquitous marker on all nucleated cells. These markers were not used in this study because they were not robustly expressed on all PDX models.
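The rank ordering in panel a can be illustrated with a minimal Python sketch. It is not the authors' code: the gene names other than ATP1B3 are placeholders, every expression value is invented, and ranking by the mean across samples is an assumption (the caption says only that genes were ordered by raw expression values across all samples).

```python
from statistics import mean

def rank_genes(expression):
    """Return gene names sorted from highest to lowest mean expression."""
    return sorted(expression, key=lambda g: mean(expression[g]), reverse=True)

# Hypothetical raw expression values (one value per sample); not real data.
expression = {
    "ATP1B3": [11.2, 10.8, 11.5],   # CD298
    "GENE_A": [12.9, 13.1, 12.7],   # placeholder gene name
    "GENE_B": [6.3, 9.8, 4.1],      # placeholder gene name
    "GENE_C": [10.1, 9.9, 10.4],    # placeholder gene name
}

ranking = rank_genes(expression)
rank_of_cd298 = ranking.index("ATP1B3") + 1  # convert to a 1-based rank
print(rank_of_cd298, "of", len(ranking))
```

A marker that ranks highly in every model by this kind of ordering is a natural candidate for a species-specific sorting antibody, which is the role CD298 plays in the assay.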

Extended Data Figure 2 Analysis of primary tumour growth kinetics and metastasis in PDX mice.

a, Weekly caliper measurements of primary tumours in two independent cohorts of animals show that growth kinetics were consistent within each PDX model. b, Bar graph shows that the average tumour volume at the endpoint was similar across PDX models. c, Bar graph shows the average number of weeks for tumours to reach endpoint (20–25 mm diameter) in each PDX model. d, Correlation plot shows that metastatic burden did not correlate with tumour volume in PDX animals. e, Table summarizing the number of metastatic cells detected (#Cells) and analysed (#Sort) from each tissue from each PDX animal. Tissues were rank ordered according to metastatic burden, from lowest (lightest grey) to highest (black). The table also shows the number of primary tumour cells analysed (#Sort) from PDX animals, and the number of normal mammary epithelial cells analysed (#Sort) from mammoplasty patients (Individuals 1, 2, and 3). ‘Transplanted’ indicates primary tumour cells derived from transplant of lymph node metastatic cells into mammary fat pads; ‘resected’ indicates lung metastatic cells analysed 8 weeks after resection of the primary tumour. B, basal/stem; BM, bone marrow; BR, brain; L, luminal; LN, lymph node; LP, luminal progenitor; LU, lung; PB, peripheral blood; T, primary tumour.

Extended Data Figure 3 Primary tumours contain rare stem-like cells.

a, Unsupervised hierarchical clustering of metastatic and primary tumour cells from 10 animals (Extended Data Fig. 2e lists cells analysed from each animal) based on their expression of the 49-gene differentiation signature. The dendrogram shows two major clusters, where major cluster A contains basal/stem-like cells and major cluster B contains more luminal-like cells. The majority of low-burden metastatic cells reside in subcluster A3. 1.4% of the primary tumour cells analysed in this study reside in subcluster A3, and are therefore similar to low-burden metastatic cells in their stem-like differentiation status. The pie graph and table list the percentage of primary tumour cells that reside in each cluster. The table also shows the data by PDX model. b, Unsupervised hierarchical clustering of metastatic and primary tumour cells, based on their expression of genes associated with cell cycle and dormancy. Two major clusters are evident. Major cluster A contains cells with a less-proliferative signature, which express higher levels of ‘negative’ cell-cycle-associated genes and lower levels of ‘positive’ cell-cycle-associated genes. Major cluster B contains cells with a more proliferative signature. The majority of low-burden metastatic cells reside in major cluster A, and possess a less-proliferative signature. The pie chart and table show the number of primary tumour cells in each cluster.
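As a rough illustration of unsupervised hierarchical clustering of single-cell expression profiles, the Python sketch below merges cells bottom-up until two clusters remain. It is a toy example under stated assumptions, not the authors' R pipeline: Euclidean distance and average linkage are assumed (the caption does not name the metric or linkage), the cell labels are hypothetical, and the two-gene profiles are invented.

```python
from itertools import combinations
from math import dist

def average_linkage(a, b, points):
    """Mean Euclidean distance between all pairs drawn from clusters a and b."""
    return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], points),
        )
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return [sorted(c) for c in clusters]

# Hypothetical log2 expression per cell: (basal gene, luminal gene). Not real data.
cells = {
    "low_burden_1": (9.0, 1.0),
    "low_burden_2": (8.5, 1.5),
    "high_burden_1": (2.0, 8.0),
    "high_burden_2": (1.5, 9.0),
    "primary_1": (2.5, 8.5),
}
names = list(cells)
clusters = agglomerate([cells[n] for n in names], n_clusters=2)
labelled = sorted(sorted(names[i] for i in c) for c in clusters)
print(labelled)
```

In this toy input the two basal-high cells separate from the three luminal-high cells, mirroring the major cluster A versus major cluster B split described in the caption. The percentage of primary tumour cells in a cluster is then a simple count of labels per cluster.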

Extended Data Figure 4 The correlation between differentiation and metastatic burden is conserved in each PDX model.

a, Unsupervised hierarchical clustering of lung metastatic cells from each PDX model is shown separately. Lung metastatic cells were specifically chosen for this analysis because lung was the only tissue for which there were sufficient numbers of low- and high-burden cells. In each dendrogram, low-burden metastatic cells form a distinct cluster due to their basal/stem-like expression signature. High-burden metastatic cells also form distinct clusters and express higher levels of luminal genes. Supplementary Data 2 shows the entire heatmap for each PDX model. b, Unsupervised hierarchical clustering of lung metastatic cells that developed after primary tumour resection (#453, red) at 10–12 mm in diameter. Post-resection metastatic cells were clustered with lung metastatic cells from non-resected animals to investigate their differentiation status. All animals bore the HCI-010 model. 85.4% of post-resection metastatic cells displayed a luminal-like expression pattern, showing that luminal-like metastatic cells can arise from cells that disseminate at early stages of primary tumour growth. Supplementary Data 2 shows the entire heatmap. c, Unsupervised hierarchical clustering of lung metastatic cells from all three PDX models by their expression of the top genes differentially expressed between them. Although there were statistically significant differences between the models, the dendrogram shows that they were not powerful enough to cluster the cells separately by model. Supplementary Data 2 shows the entire heatmap. d, Box plots show top selected genes differentially expressed between the three PDX models. By ANOVA, 53 genes were significantly differentially expressed (P < 0.05, Supplementary Data 3). e, Immunofluorescence stains for basal and luminal lineage-specific proteins (red) in micro- and macrometastatic lesions. Autofluorescent red blood cells (RBCs) are also present in the lung (arrows), but do not represent positive immunostaining. Scale bars, 50 µm. Bar graphs quantify the percentage of low- and high-burden metastatic cells, and primary tumour cells that were positive for antibody staining. Data from at least three fields in three different mice were collected from each group, and P values were calculated as described in the Methods. Error bars represent standard deviation.
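The per-gene ANOVA mentioned in panel d compares one gene's expression across the three PDX models. The Python sketch below computes only the one-way ANOVA F statistic for a single gene; converting F to a P value requires the F distribution, which is left out here. The model labels and expression values are hypothetical, and this is an illustration of the test rather than the authors' implementation.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared deviation of the group mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviation of each value from its group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical log2 expression of one gene in cells from three models; not real data.
model_a = [1.0, 2.0, 3.0]
model_b = [2.0, 3.0, 4.0]
model_c = [5.0, 6.0, 7.0]
f_stat = one_way_anova_f([model_a, model_b, model_c])
print(round(f_stat, 2))
```

A large F means the spread between model means is large relative to the spread within each model. Running the same test on every gene in the panel would yield the list of differentially expressed genes, subject to whatever multiple-testing handling the Methods specify.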

Extended Data Figure 5 Low-burden metastatic cells have tumour-initiating and differentiation capacity.

a, Schematic overview of orthotopic transplant experiments to investigate the tumour-initiating and differentiation capacity of low-burden metastatic cells. Images of resulting grafts show that 2/4 transplants of low-burden cells grew large tumours, while 0/10 transplants from primary tumour cells developed tumours. b, Unsupervised hierarchical clustering of tumour cells derived from transplants of low-burden metastatic cells. Transplant-derived tumour cells were clustered with metastatic and primary tumour cells from previous experiments (Extended Data Fig. 3a) to investigate their differentiation status. Transplant-derived tumour cells were heterogeneous, where 1.3% of them were basal/stem-like, and 98.7% of them clustered with more luminal-like cells. This shows that low-burden basal/stem-like metastatic cells have the capacity to give rise to luminal-like cancer cells.

Extended Data Figure 6 Metastatic cells found in different organs show distinct gene expression signatures.

a, Supervised clustering of metastatic cells by target organ emphasizes tissue-specific gene signatures. Arrows indicate genes significantly differentially expressed between at least two tissues, as shown in b. b, Box plots show genes most characteristic of each tissue type, as determined by ANOVA and pair-wise analyses. P values and fold change for each gene and tissue pair are listed in Supplementary Table 3. Box plots for all 80 genes differentially expressed between the tissue pairs are shown in Supplementary Data 7. BM, bone marrow; BR, brain; LN, lymph node; LU, lung; PB, peripheral blood (CTC); T, tumour. c, Pearson correlations indicate similarity of CTCs to other metastatic tissue types across all genes analysed. Each dot represents an individual gene. BM, bone marrow; BR, brain; LN, lymph node; LU, lung; PB, peripheral blood (CTC).
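The Pearson correlations in panel c can be sketched as below, with one value per gene for each tissue. This is a hedged illustration: summarising each gene by a single per-tissue value (for example its mean expression) is an assumption, and the numbers are invented.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical per-gene expression summaries (one entry per gene); not real data.
ctc = [1.0, 2.0, 3.0, 4.0]     # peripheral blood (CTC)
lung = [2.0, 4.0, 6.0, 8.0]    # perfectly co-varying with ctc in this toy input
brain = [4.0, 3.0, 2.0, 1.0]   # perfectly anti-correlated in this toy input

r_lung = pearson(ctc, lung)
r_brain = pearson(ctc, brain)
print(r_lung, r_brain)
```

A coefficient near 1 indicates that genes high in CTCs are also high in the compared tissue; values near 0 indicate little linear relationship across the gene panel.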

Extended Data Figure 7 Analysis of dinaciclib-treated animals.

a, Immunofluorescence stains for Ki67 in micro- and macrometastatic lesions from low- and high-burden animals, as well as in primary tumours. Scale bars, 50 µm. b, Bar graphs quantify the percentage of MYC, phospho-histone H3 (pH3), and Ki67 positive cells per lesion in micro- and macrometastatic lesions. Error bars represent standard deviation. c, d, Waterfall plots show the longest final tumour diameter for each PDX animal treated with vehicle (black bars) or drug (white bars). e, Bar graphs show the average number of days to endpoint (4 weeks, or 20 mm primary tumour size) for animals treated with vehicle or drug.

Extended Data Figure 8 Model for tumour cell heterogeneity during metastatic progression.

a, Metastatic cells from animals with low metastatic burden (blue) are distinct from animals with higher burden, due to their increased expression of stemness, anti-apoptosis, EMT, and dormancy/quiescence-related genes. In contrast, higher burden metastatic cells are more heterogeneous, and comprise larger numbers of proliferative, differentiated cells (red). Transplant experiments of stem-like metastatic cells showed that they have tumour-initiating potential, and can produce luminal-like cancer cells. This strongly suggests that metastases derive from stem-like cells, which differentiate and undergo a switch from dormancy into proliferation as they colonize and produce more advanced metastatic tumours. Metastatic progression was also associated with increased MYC expression, and could be attenuated with CDK inhibition. We believe this is due to apoptosis of cells as they upregulate MYC, since our previous work has shown that CDK inhibition induces apoptosis in high MYC-expressing cancer cells through synthetic lethality23. b, Comparison of gene signatures in primary tumour and metastatic cells showed that 1.4% of primary tumour cells, and 16.7% of CTCs possessed a stem-like signature. This suggests that these cells may be the origin of metastatic tumours.

Extended Data Table 1 Metastatic frequency and tissue tropism identified by FACS in each PDX model
Extended Data Table 2 All genes differentially expressed in low-burden metastatic cells relative to primary tumour cells

Supplementary information

Supplementary Methods

This file contains the analysis of single-cell real-time qPCR experiments using dynamic arrays in R statistical language. Ct values generated in single-cell real-time qPCR experiments were processed in the R statistical language, using algorithms we generated (PDF 1393 kb)

Supplementary Data

This file contains Supplementary Data 1-8. (PDF 4372 kb)

Supplementary Table 1

List of genes analysed in normal and tumour cell populations: The table lists each gene, forward and reverse primer sequence, and rationale for inclusion in dynamic array experiments. (XLSX 27 kb)

Supplementary Table 2

Genes comprising the 49-gene differentiation signature that are differentially expressed between normal human mammary epithelial populations: The table lists fold change (FC) and p-values for each gene that was differentially expressed between normal human basal/stem, luminal, and luminal progenitor cells. (XLSX 14 kb)

Supplementary Table 3

Quantification of gene expression differences in metastatic cells from different tissues: The table lists a p-value for each gene identified by ANOVA, followed by fold change (FC) and p-values for each tissue pair that demonstrated a significant difference by post hoc pair-wise analysis. (XLSX 26 kb)
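The Supplementary Methods describe processing Ct values from single-cell qPCR, and the Supplementary Tables report fold changes. The authors' actual algorithms are in the Supplementary Methods PDF and are not reproduced on this page. The Python sketch below shows one common convention instead: log2 expression taken as a limit-of-detection Ct minus the measured Ct, floored at zero. The limit-of-detection value of 24 is an assumption chosen for illustration, and the Ct values are invented.

```python
LOD_CT = 24.0  # Assumed limit-of-detection Ct; the value used in the study is not stated here.

def ct_to_log2_expression(ct, lod_ct=LOD_CT):
    """Convert a Ct value to log2 expression above the detection limit (floored at 0)."""
    return max(lod_ct - ct, 0.0)

def fold_change(log2_a, log2_b):
    """Linear fold change of a over b, given two log2 expression values."""
    return 2.0 ** (log2_a - log2_b)

# Hypothetical Ct values for one gene in three cells; not real data.
cell_1 = ct_to_log2_expression(18.0)      # well above the detection limit
cell_2 = ct_to_log2_expression(21.0)      # three cycles later than cell_1
undetected = ct_to_log2_expression(30.0)  # beyond the assumed limit, treated as not expressed

print(cell_1, cell_2, undetected, fold_change(cell_1, cell_2))
```

Because each PCR cycle ideally doubles the product, a three-cycle difference corresponds to an eight-fold difference in starting template under this convention.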

About this article

Cite this article

Lawson, D., Bhakta, N., Kessenbrock, K. et al. Single-cell analysis reveals a stem-cell program in human metastatic breast cancer cells. Nature 526, 131–135 (2015).
