Breast tumours maintain a reservoir of subclonal diversity during expansion

Abstract

Our knowledge of copy number evolution during the expansion of primary breast tumours is limited (refs. 1, 2). Here, to investigate this process, we developed a single-cell, single-molecule DNA-sequencing method and performed copy number analysis of 16,178 single cells from 8 human triple-negative breast cancers and 4 cell lines. The results show that breast tumours and cell lines comprise a large milieu of subclones (7–22) that are organized into a few (3–5) major superclones. Evolutionary analysis suggests that after clonal TP53 mutations, multiple loss-of-heterozygosity events and genome doubling, there was a period of transient genomic instability followed by ongoing copy number evolution during the primary tumour expansion. By subcloning single daughter cells in culture, we show that tumour cells rediversify their genomes and do not retain isogenic properties. These data show that triple-negative breast cancers continue to evolve chromosome aberrations and maintain a reservoir of subclonal diversity during primary tumour growth.

Fig. 1: The ACT method and technical performance.
Fig. 2: Clonal substructure of eight triple-negative breast tumours.
Fig. 3: Evolutionary analysis of clonal lineages in patients with TNBC.
Fig. 4: Mathematical modelling of transient instability after punctuated copy number evolution.
Fig. 5: Clonal substructure of TNBC cell lines and single-cell expansions.

Data availability

The data from this study were deposited in the NCBI Sequence Read Archive under accession number PRJNA629885.

Code availability

Code used in this study is available at https://github.com/navinlabcode/ACT_paper.

References

  1. Davis, A., Gao, R. & Navin, N. Tumor evolution: linear, branching, neutral or punctuated? Biochim. Biophys. Acta Rev. Cancer 1867, 151–161 (2017).

  2. Burrell, R. A., McGranahan, N., Bartek, J. & Swanton, C. The causes and consequences of genetic heterogeneity in cancer evolution. Nature 501, 338–345 (2013).

  3. Pfister, K. et al. Identification of drivers of aneuploidy in breast tumors. Cell Rep. 23, 2758–2769 (2018).

  4. Xu, J., Huang, L. & Li, J. DNA aneuploidy and breast cancer: a meta-analysis of 141,163 cases. Oncotarget 7, 60218–60229 (2016).

  5. Gordon, D. J., Resio, B. & Pellman, D. Causes and consequences of aneuploidy in cancer. Nat. Rev. Genet. 13, 189–203 (2012).

  6. Fearon, E. R. & Vogelstein, B. A genetic model for colorectal tumorigenesis. Cell 61, 759–767 (1990).

  7. Gao, R. et al. Punctuated copy number evolution and clonal stasis in triple-negative breast cancer. Nat. Genet. 48, 1119–1130 (2016).

  8. Navin, N. et al. Tumour evolution inferred by single-cell sequencing. Nature 472, 90–94 (2011).

  9. Cross, W. et al. The evolutionary landscape of colorectal tumorigenesis. Nat. Ecol. Evol. 2, 1661–1672 (2018).

  10. Baca, S. C. et al. Punctuated evolution of prostate cancer genomes. Cell 153, 666–677 (2013).

  11. Carter, S. L. et al. Absolute quantification of somatic DNA alterations in human cancer. Nat. Biotechnol. 30, 413–421 (2012).

  12. Zack, T. I. et al. Pan-cancer patterns of somatic copy number alteration. Nat. Genet. 45, 1134–1140 (2013).

  13. Gerstung, M. et al. The evolutionary history of 2,658 cancers. Nature 578, 122–128 (2020).

  14. Cross, W. Ch., Graham, T. A. & Wright, N. A. New paradigms in clonal evolution: punctuated equilibrium in cancer. J. Pathol. 240, 126–136 (2016).

  15. Hadimioglu, B., Stearns, R. & Ellson, R. Moving liquids with sound: the physics of acoustic droplet ejection for robust laboratory automation in life sciences. J. Lab. Autom. 21, 4–18 (2016).

  16. Zahn, H. et al. Scalable whole-genome single-cell library preparation without preamplification. Nat. Methods 14, 167–173 (2017).

  17. Chavez, K. J., Garimella, S. V. & Lipkowitz, S. Triple negative breast cancer cell lines: one tool in the search for better treatment of triple negative breast cancer. Breast Dis. 32, 35–48 (2010).

  18. Williams, M. J., Werner, B., Barnes, C. P., Graham, T. A. & Sottoriva, A. Identification of neutral tumor evolution across cancer types. Nat. Genet. 48, 238–244 (2016).

  19. Liberzon, A. et al. The Molecular Signatures Database (MSigDB) hallmark gene set collection. Cell Syst. 1, 417–425 (2015).

  20. Wang, Y. et al. Clonal evolution in breast cancer revealed by single nucleus genome sequencing. Nature 512, 155–160 (2014).

  21. Cross, W. et al. Stabilising selection causes grossly altered but stable karyotypes in metastatic colorectal cancer. Preprint at https://doi.org/10.1101/2020.03.26.007138 (2020).

  22. Fehrmann, R. S. et al. Gene expression analysis identifies global gene dosage sensitivity in cancer. Nat. Genet. 47, 115–125 (2015).

  23. Ben-David, U. et al. Genetic and transcriptional evolution alters cancer cell line drug response. Nature 560, 325–330 (2018).

  24. Greenfield, E. A. Single-cell cloning of hybridoma cells by limiting dilution. Cold Spring Harb. Protoc. https://doi.org/10.1101/pdb.prot103192 (2019).

  25. Zong, C., Lu, S., Chapman, A. R. & Xie, X. S. Genome-wide detection of single-nucleotide and copy-number variations of a single human cell. Science 338, 1622–1626 (2012).

  26. Xi, L. et al. New library construction method for single-cell genomes. PLoS ONE 12, e0181163 (2017).

  27. Laks, E. et al. Clonal decomposition and DNA replication states defined by scaled single-cell genome sequencing. Cell 179, 1207–1221 (2019).

  28. Vitak, S. A. et al. Sequencing thousands of single-cell genomes with combinatorial indexing. Nat. Methods 14, 302–308 (2017).

  29. Baslan, T. et al. Genome-wide copy number analysis of single cells. Nat. Protoc. 7, 1024–1041 (2012).

  30. Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).

  31. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  32. Venkatraman, E. S. & Olshen, A. B. A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 23, 657–663 (2007).

  33. Hahsler, M., Piekenbrock, M. & Doran, D. Fast density-based Clustering with R. J. Stat. Softw. https://doi.org/10.18637/jss.v091.i01 (2019).

  34. Leung, M. L. et al. Highly multiplexed targeted DNA sequencing from single nuclei. Nat. Protoc. 11, 214–235 (2016).

  35. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).

  36. Nilsen, G. et al. Copynumber: efficient algorithms for single- and multi-track copy number segmentation. BMC Genomics 13, 591 (2012).

  37. McInnes, L., Healy, J. & Melville, J. UMAP: uniform manifold approximation and projection for dimension reduction. Preprint at https://arxiv.org/abs/1802.03426 (2018).

  38. Lun, A. T., McCarthy, D. J. & Marioni, J. C. A step-by-step workflow for low-level analysis of single-cell RNA-seq data with Bioconductor. F1000Res. 5, 2122 (2016).

  39. Csardi, G. & Nepusz, T. The igraph software package for complex network research. InterJournal 1695 (2006).

  40. McInnes, L., Healy, J. & Astels, S. hdbscan: hierarchical density based clustering. JOSS 2, 205 (2017).

  41. Gu, Z., Eils, R. & Schlesner, M. Complex heatmaps reveal patterns and correlations in multidimensional genomic data. Bioinformatics 32, 2847–2849 (2016).

  42. Zhang, Z., Lange, K. & Sabatti, C. Reconstructing DNA copy number by joint segmentation of multiple sequences. BMC Bioinformatics 13, 205 (2012).

  43. Desper, R. & Gascuel, O. Fast and accurate phylogeny reconstruction algorithms based on the minimum-evolution principle. J. Comput. Biol. 9, 687–705 (2002).

  44. Paradis, E. & Schliep, K. ape 5.0: an environment for modern phylogenetics and evolutionary analyses in R. Bioinformatics 35, 526–528 (2019).

  45. Yu, G., Smith, D. K., Zhu, H., Guan, Y. & Lam, T. T.-Y. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods Ecol. Evol. 8, 28–36 (2017).

  46. McKenna, A. et al. The genome analysis toolkit: a mapreduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).

  47. Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).

  48. Knaus, B. J. & Grünwald, N. J. vcfr: a package to manipulate and visualize variant call format data in R. Mol. Ecol. Resour. 17, 44–53 (2017).

  49. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).

  50. Ng, P. C. & Henikoff, S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 31, 3812–3814 (2003).

  51. Adzhubei, I., Jordan, D. M. & Sunyaev, S. R. Predicting functional effect of human missense mutations using PolyPhen-2. Curr. Protoc. Hum. Genet. 7, 20 (2013).

  52. 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 526, 68–74 (2015).

  53. Van Loo, P. et al. Allele-specific copy number analysis of tumors. Proc. Natl Acad. Sci. USA 107, 16910–16915 (2010).

  54. Nik-Zainal, S. et al. The life history of 21 breast cancers. Cell 149, 994–1007 (2012).

  55. Patro, R., Duggal, G., Love, M. I., Irizarry, R. A. & Kingsford, C. Salmon provides fast and bias-aware quantification of transcript expression. Nat. Methods 14, 417–419 (2017).

  56. Frankish, A. et al. GENCODE reference annotation for the human and mouse genomes. Nucleic Acids Res. 47, D766–D773 (2019).

  57. Soneson, C., Love, M. I. & Robinson, M. D. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Res. 4, 1521 (2015).

  58. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).

  59. Korotkevich, G., Sukhov, V. & Sergushichev, A. Fast gene set enrichment analysis. Preprint at https://doi.org/10.1101/060012 (2019).

  60. R Core Team. R: A Language and Environment for Statistical Computing http://www.R-project.org/ (R Foundation for Statistical Computing, 2013).

  61. Kassambara, A. rstatix: Pipe-Friendly Framework for Basic Statistical Tests https://CRAN.R-project.org/package=rstatix (2020).

  62. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer-Verlag, 2016).

  63. Virtanen, P. et al. SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17, 261–272 (2020).

  64. McKinney, W. Data structures for statistical computing in Python. In Proc. 9th Python in Science Conference (eds van der Walt, S. & Millman, J.) 56–61 (2010).

Acknowledgements

This work was supported by grants to N.E.N. from the American Cancer Society (129098-RSG-16-092-01-TBG), the National Cancer Institute (R01CA240526, R01CA236864), the Emerson Collective Cancer Research Fund (144300-80-121457-19) and the CPRIT Single Cell Genomics Center (RP180684). N.E.N. is an AAAS Wachtel Scholar and AAAS Fellow. This study was supported by core facility grants (CA016672, CA016672) from the NIH. This work was supported by the Dana-Farber Cancer Institute Physical Sciences Oncology Center (grant no. U54CA143798 to F.M.) and the DFCI Center for Cancer Evolution (to F.M.). T.B., M.T. and P.V.L. are supported by the Francis Crick Institute, Cancer Research UK (FC001202), the UK Medical Research Council (FC001202) and the Wellcome Trust (FC001202). P.V.L. is a Winton Group Leader in recognition of the Winton Charitable Foundation’s support towards the establishment of the Francis Crick Institute. T.B. is supported by a fellowship from the Boehringer Ingelheim Fonds. We thank K. Polyak, H. Tang, L. Ramagli, E. Thompson, S. Shah, A. McPherson, N. Ramesh and A. Kalia for their assistance on this project.

Author information

Contributions

D.C.M. was involved in all aspects of the work. M.D.N., T.O.M. and F.M.-B. performed mathematical modelling. T.B., P.V.L. and M.T. performed WGD and LOH analysis. M.H. and A.D. performed data analysis. H.Y., K.W., M.R., C.P., H.D., E.S., S.B., A.S. and Y.-w.L. performed single-cell experiments. A.M. and J.M. performed cytogenetics experiments. M.H. and A.C. performed data processing. A.B., H.C., B.L., B.A. and F.M.-B. provided tissue samples, managed IRBs and contributed clinical expertise. F.M., P.V.L. and N.E.N. managed the project, analysed data and wrote the manuscript.

Corresponding authors

Correspondence to Franziska Michor or Nicholas E. Navin.

Ethics declarations

Competing interests

F.M. is the co-founder of an oncology company. The other authors declare no competing interests.

Additional information

Peer review information Nature thanks the anonymous reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 Technical metrics and performance of ACT.

a, ACT single-cell DNA library size distributions for TN1, TN2 and TN3 after pooling 384 cell libraries. b, Schematic showing the use of positional barcoding information to determine single-molecule information by tagmentation during ACT, compared to whole-genome amplification using DOP-PCR, where the original DNA fragmentation sites of single molecules cannot be resolved. c, Breadth of coverage for sparse depth data from different scDNA-seq methods plotted by individual samples, using n = 100 random cells per sample. d, Overdispersion of bin counts for sparse depth data from different scDNA-seq methods plotted by individual samples, using n = 100 random cells per sample. e, Distribution of sequencing reads across a diploid region of chromosome 4p14 for a single SK-BR-3 cell sequenced by DOP-PCR compared to ACT, in which the PCR duplicates were retained or removed to obtain single-molecule data. f, Distribution of sequencing reads across a diploid region of chromosome 4p (top) and 10q (bottom) for a single SK-BR-3 cell sequenced by DOP-PCR compared to ACT, with or without duplicate molecules retained. g, Lorenz curves of coverage uniformity for ACT, DOP-PCR and one bulk DNA-seq data from SK-BR-3 single cells, downsampled to equal coverage depth. h, Breadth of coverage as a function of pseudo-bulk reconstruction by combining multiple cells for ACT, DOP-PCR and bulk sequencing.
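For readers unfamiliar with the Lorenz curves mentioned in panel g, the following is a minimal sketch of how such a curve can be computed from per-bin read counts. It is not the authors' code: the function names `lorenz_curve` and `gini_coefficient` and the example counts are hypothetical, and the Gini coefficient is an added summary statistic that the caption does not mention.

```python
def lorenz_curve(bin_counts):
    """Return Lorenz curve points (fraction of bins, fraction of reads).

    Bins are sorted from lowest to highest count, so perfectly uniform
    coverage falls on the diagonal y = x.
    """
    counts = sorted(bin_counts)
    total = sum(counts)
    if total <= 0:
        raise ValueError("bin_counts must contain at least one read")
    n = len(counts)
    points = [(0.0, 0.0)]
    running = 0
    for i, count in enumerate(counts, start=1):
        running += count
        points.append((i / n, running / total))
    return points


def gini_coefficient(points):
    """Gini coefficient from Lorenz points using the trapezoid rule.

    0 means perfectly uniform coverage; values near 1 mean reads are
    concentrated in very few bins. (Assumed summary, not from the paper.)
    """
    area = sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    return 1 - 2 * area


# Hypothetical per-bin read counts for two cells (illustrative only).
uniform_cell = [5, 5, 5, 5]
skewed_cell = [0, 0, 0, 10]
uniform_gini = gini_coefficient(lorenz_curve(uniform_cell))
skewed_gini = gini_coefficient(lorenz_curve(skewed_cell))
```

A curve closer to the diagonal indicates more even coverage, which is the property the panel compares between methods at equal depth.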

Extended Data Fig. 2 Molecular properties of subclonal chromosome aberrations.

a, FACS profiles of DAPI-stained nuclei flow-sorted for ACT from eight patients with TNBC, showing ploidy distributions, with vertical red lines showing the sorting gates. b, Shannon diversity indexes calculated from the single-cell copy number data from each of the eight individuals with 95% confidence intervals indicated. c, Heat map of the genomic regions of cCNAs, sCNAs and uCNAs across the eight tumour samples. d, Distributions of the genomic segment sizes of clonal, subclonal and unique CNAs across the eight tumours. e, Proportion of genome altered relative to the tumour ploidy classified as copy number losses in blue, neutral ground state copy number in white and gains in red. f, Bootstrapping of subclone clusters showing the mean Jaccard similarity for each subclone across the eight tumours. g, Scatter plots of number of cells in each subclone cluster by mean Jaccard similarity for each of the eight tumours.
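The Shannon diversity index in panel b can be illustrated with a short sketch computed from per-cell subclone labels. This is an assumption-laden example rather than the study's implementation: the function name `shannon_index`, the use of the natural logarithm and the example labels are all hypothetical, and the confidence-interval estimation mentioned in the caption is not reproduced here.

```python
import math
from collections import Counter


def shannon_index(labels):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over subclone frequencies.

    `labels` holds one subclone assignment per cell. The natural log is an
    assumption; other bases only rescale the index.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one cell label is required")
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical subclone assignments for ten cells (illustrative only).
example_labels = ["c1"] * 5 + ["c2"] * 3 + ["c3"] * 2
example_h = shannon_index(example_labels)
```

A tumour dominated by a single subclone yields an index near 0, whereas many subclones at similar frequencies yield a higher value.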

Extended Data Fig. 3 Copy number substructure of additional patients with TNBC.

a, Clustered heat maps of single-cell copy number profiles for TN3–TN8 with left annotation bars representing superclones and subclones, and bottom annotation bars representing different genomic regions of CNA classes as well as annotations for selected breast cancer genes. b, Matrix plots for TN3–TN8, showing integer copy number states for selected breast cancer genes in regions of cCNAs, sCNAs and uCNAs across the different subclones in each tumour.

Extended Data Fig. 4 Validation of clonal substructure using a microdroplet approach.

a, Co-clustering of ACT and 10X Genomics copy number data for samples TN1 (n = 1,976 cells) and TN3 (n = 2,171 cells), showing subclones detected in the merged datasets. b, Frequency of subclones detected on each platform in the merged datasets from 10X and ACT. c, Clustered heat maps of single-cell copy number profiles for TN1 and TN3 with left annotation bars representing the scDNA-seq technology platform and the different subclones, with annotations for selected breast cancer genes indicated below. d, Bar plots of copy number state frequencies of selected breast cancer genes for ACT and 10X CNV showing the proportion of copy number states for all cells separated by platform.

Extended Data Fig. 5 WGD estimates and additional copy number lineages.

a, Most frequent exonic mutations in genes with significant SIFT (<0.05) and PolyPhen-2 (>0.85) scores. b, Exome mutation counts of each tumour indicating mutations that were classified as clonal or subclonal based on allele-specific copy number frequencies. c, Density plots showing the probability of genome doubling as a function of relative mutational time for 7 out of the 8 patients with TNBC with sufficient number of truncal exome mutations. d, Minimum evolution trees of single-cell copy number profiles using Manhattan distances for TN3–TN8, indicating the distance from the diploid root node to the MRCA and the distance from the MRCA to the terminal nodes. Annotations indicate the timing of genome doubling and timing of TP53 mutations before WGD in all of the tumours. e, Summary of the truncal distances from the diploid root node to the MRCA and the branching distances from the MRCA to the last terminal node.
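As a rough illustration of the Manhattan distances referred to in panel d, the sketch below computes the distance between integer copy number profiles, including the distance from a diploid root. It is a simplified example, not the authors' pipeline: the function name `manhattan_distance`, the six-bin profiles and the profile names are hypothetical, and the minimum evolution tree construction itself is not shown.

```python
def manhattan_distance(profile_a, profile_b):
    """Sum of absolute per-bin differences between two copy number profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomic bins")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


# Hypothetical integer copy number states over six genomic bins.
diploid_root = [2, 2, 2, 2, 2, 2]
inferred_mrca = [4, 4, 3, 2, 4, 5]
subclone = [4, 4, 3, 1, 4, 6]

# Truncal distance: diploid root to the most recent common ancestor.
truncal = manhattan_distance(diploid_root, inferred_mrca)
# Branching distance: MRCA to one terminal profile.
branching = manhattan_distance(inferred_mrca, subclone)
```

In this toy example the truncal distance is much larger than the branching distance, which mirrors the kind of comparison summarized in panel e.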

Extended Data Fig. 6 Evolutionary analysis of clonal lineages in additional patients with TNBC.

a, Left, minimum evolution trees after the MRCA generated using the consensus CNA profiles of subclones for TN3–TN8 rooted by a neutral node to the MRCA and coloured by superclones and subclones. Right, heat maps of consensus subclones profiles, with annotations for the superclones and subclones on left annotation bars and bottom annotation bars showing different CNA classes, as well as selected breast cancer genes. The last row in the clustered heat maps shows the inferred MRCA copy number profiles. b, Genome-wide copy number profiles of TNBC tumours with segments of the rounded total copy number (orange) and the rounded number of copies of the minor allele (blue). Thick segments are ASCAT profiles from the exome bulk, and thinner segments are from the superclones with slight offset relative to integer values for visualization. For each superclone, parentheses show the percentage of the genomic region in which both the minor and major allele copy numbers are the same as in the exome, restricting analysis to the genomic region where the total is also the same.

Extended Data Fig. 7 Chromosome-breakpoint frequency spectra of additional tumours.

a, Comparison of the expected CNA frequency spectrum obtained from theory and simulation. Simulations include a flexible fitness distribution, whereas the theoretical analysis considers neutral and lethal changes only. Different colours correspond to varying the increase in CNA rate during the transient instability phase, and the tumour size at which the instability subsides. Exact parameters are provided in the Supplementary Methods. b, Maximum-likelihood fits for the breakpoint frequency spectra obtained for TNBC tumours under models of gradual and transient instability after PCNE; parameter values for simulations and further details are provided in the Supplementary Methods. c, Maximum-likelihood fits for the breakpoint frequency spectra obtained from expanded clones of MDA-MB-231 under models of gradual and transient instability. Further details are provided in the Supplementary Methods.

Extended Data Fig. 8 Clonal substructure of additional TNBC cell lines and single-cell expansions.

a, b, Clustered heat maps of single-cell copy number data from the BT-20 (n = 1,231 cells) and MDA-MB-157 (n = 1,210 cells) cell lines, in which left annotation bars represent superclones and subclones, and the bottom annotation bar represents different classes of CNA types. c, Number of superclones and subclones identified in the TNBC cell lines. d, Number of clonal, subclonal and unique CNAs detected in the four TNBC cell lines, as well as the two MDA-MB-231 expanded daughter cells. e, Distributions of the genomic sizes of clonal, subclonal and unique CNAs across the four TNBC cell lines and the two MDA-MB-231 expanded daughter cell lines. f, Shannon indexes calculated from the single-cell copy number profiles from the four TNBC cell lines and the two expanded MDA-MB-231 daughter cells with 95% confidence intervals. g, Microscopic field of DNA-FISH experiments of MDA-MB-231 using AKT3 and BCAS2 probes at 60× original magnification. h, Bar plots showing the results of DNA-FISH copy number states counted across 1,000 cells for each of the probes compared to the ACT data. i, Clustered heat map of single-cell copy number data for MDA-MB-231 EX2 cell line expansion (n = 897 cells), in which left annotation bars represent superclones and subclones, and the bottom annotation bar represents different classes of CNA types.

Extended Data Fig. 9 DNA and RNA analysis of expanded clones from MDA-MB-231.

a, Schematic of physical single-cell subcloning experiments of daughter cells to generate 78 expansions from the MDA-MB-231 parental cell line. b, Co-clustering of the single-cell copy number data from the parental MDA-MB-231 cell line (n = 820 cells) with the 78 expanded clone bulk DNA-seq copy number profiles. c, Principal component analysis of bulk RNA-seq profiles of the 78 expanded daughter cell lines triplicates, with contour colour representing superclones and point colour representing the subclone clusters from the genotypes of the single-cell and bulk DNA-seq co-clustering. d, Clustered heat map of bulk DNA copy number profiles from the 78 expanded clones, with left annotation bars representing superclones and subclones, as determined by co-clustering with the parental single-cell copy number data. e, Mean gene expression levels of different copy number states for 78 expansions from the MDA-MB-231 parental cell line. f, Cumulative number of subclonal segments as a function of Kruskal–Wallis test P-value, in which the red line denotes a P-value of 0.05. g, Mean gene expression as a function of copy number segments with points representing expanded clusters for two subclonal CNAs on chr11 and chr19. h, i, Consensus integer copy number profiles of the 10 expanded clone clusters on chromosome 11 (h) and chromosome 19 (i) (top) with matched RNA-seq expression (bottom) using moving windows of 100 genes. Right, selected breast cancer genes in subclonal CNA regions and their corresponding box plots of RNA expression for each expanded cluster. Box plots show the median, box edges represent the first and third quartiles, and the whiskers extend to 1.5× interquartile range. j, Cancer hallmark signatures with significant variability of normalized enrichment scores (NES) across the expanded clone clusters.

Extended Data Fig. 10 Models of chromosome evolution during primary tumour expansion.

a–c, Three models of chromosome evolution dynamics during the expansion of primary TNBC tumours, with schematic plots of chromosome accumulation over time (left) and Muller plots of clonal frequencies (right). a, Gradual model of copy number evolution, in which CNAs are acquired sequentially throughout tumour progression leading to the expansion of successive subclones over time. b, Punctuated copy number evolution model, in which an initial burst of instability generates a large number of CNAs and subclones that undergo stable expansions to form the primary tumour mass, with no (or few) new CNAs acquired after the initial burst. c, Model of punctuated evolution and transient instability, in which the early acquisition of TP53 mutations and genome doubling lead to a burst of genomic instability in which a large number of CNA events are acquired and subclones are generated. These events are followed by a period of transient instability and ongoing copy number evolution during the expansion of the primary tumour mass, which leads to the generation of additional subclones and genomic diversity.

Supplementary information

Supplementary Information

This file contains Supplementary Table 1: Relevant clinical information for the eight TNBC tumors and four TNBC cell lines that were analyzed in this study; Supplementary Table 2: Exome sequencing mutational information from bulk DNA exome of the 8 TNBC tumors; Supplementary Table 3: List of dual barcodes from ACT protocol; and Supplementary Methods: Extensive description of the methodology used for the mathematical modeling section, including Supplementary Figures 1–20 and additional references.

Reporting Summary

Cite this article

Minussi, D.C., Nicholson, M.D., Ye, H. et al. Breast tumours maintain a reservoir of subclonal diversity during expansion. Nature 592, 302–308 (2021). https://doi.org/10.1038/s41586-021-03357-x
