Subclonal reconstruction from bulk tumor DNA sequencing has become a pillar of cancer evolution studies, providing insight into the clonality and relative ordering of mutations and mutational processes. We provide an outline of the complex computational approaches used for subclonal reconstruction from single and multiple tumor samples. We identify the underlying assumptions and uncertainties in each step and suggest best practices for analysis and quality assessment. This guide provides a pragmatic resource for the growing user community of subclonal reconstruction methods.
This is a preview of subscription content, access via your institution
Open Access articles citing this article.
Oncogene Open Access 26 April 2023
Nature Open Access 09 November 2022
Intratumoral heterogeneity affects tumor regression and Ki67 proliferation index in perioperatively treated gastric carcinoma
British Journal of Cancer Open Access 08 November 2022
Access Nature and 54 other Nature Portfolio journals
Get Nature+, our best-value online-access subscription
$29.99 / 30 days
cancel any time
Subscribe to this journal
Receive 12 print issues and online access
$259.00 per year
only $21.58 per issue
Rent or buy this article
Prices vary by article type
Prices may be subject to local taxes which are calculated during checkout
Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: the next generation. Cell 144, 646–674 (2011).
Nowell, P. C. The clonal evolution of tumor cell populations. Science 194, 23–28 (1976).
Gundem, G. et al. The evolutionary history of lethal metastatic prostate cancer. Nature 520, 353–357 (2015).
Hong, M. K. H. et al. Tracking the origins and drivers of subclonal metastatic expansion in prostate cancer. Nat. Commun. 6, 6605 (2015).
Mitchell, T. J. et al. Timing the landmark events in the evolution of clear cell renal cell cancer: TRACERx Renal. Cell 173, 611–623.e17 (2018).
Turajlic, S. et al. Tracking cancer evolution reveals constrained routes to metastases: TRACERx Renal. Cell 173, 581–594.e12 (2018).
Andor, N. et al. Pan-cancer analysis of the extent and consequences of intratumor heterogeneity. Nat. Med. 22, 105–113 (2016).
Espiritu, S. M. G. et al. The evolutionary landscape of localized prostate cancers drives clinical aggression. Cell 173, 1003–1013.e15 (2018).
Jamal-Hanjani, M. et al. Tracking the evolution of non–small-cell lung cancer. N. Engl. J. Med. 376, 2109–2121 (2017).
Fittall, M. W. & Van Loo, P. Translating insights into tumor evolution to clinical practice: promises and challenges. Genome Med. 11, 20 (2019).
Sendorek, D. H. et al. Germline contamination and leakage in whole genome somatic single nucleotide variant detection. BMC Bioinformatics 19, 28 (2018).
Alioto, T. S. et al. A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing. Nat. Commun. 6, 10001 (2015).
Sun, R. et al. Between-region genetic divergence reflects the mode and tempo of tumor evolution. Nat. Genet. 49, 1015–1024 (2017).
Salehi, S. et al. ddClone: joint statistical inference of clonal populations from single cell and bulk tumour sequencing data. Genome Biol. 18, 44 (2017).
Dentro, S. C. et al. Characterizing genetic intra-tumor heterogeneity across 2,658 human cancer genomes. Preprint at bioRxiv https://doi.org/10.1101/312041 (2020).
Gerstung, M. et al. The evolutionary history of 2,658 cancers. Nature 578, 122–128 (2020).
Abbosh, C. et al. Phylogenetic ctDNA analysis depicts early-stage lung cancer evolution. Nature 545, 446–451 (2017).
Noorani, A. et al. Genomic evidence supports a clonal diaspora model for metastases of esophageal adenocarcinoma. Nat. Genet. 52, 74–83 (2020).
McGranahan, N. et al. Clonal neoantigens elicit T cell immunoreactivity and sensitivity to immune checkpoint blockade. Science 351, 1463–1469 (2016).
Gomez, K. et al. Somatic evolutionary timings of driver mutations. BMC Cancer 18, 85 (2018).
Opasic, L., Zhou, D., Werner, B., Dingli, D. & Traulsen, A. How many samples are needed to infer truly clonal mutations from heterogenous tumours? BMC Cancer 19, 403 (2019).
Salcedo, A. et al. A community effort to create standards for evaluating tumor subclonal reconstruction. Nat. Biotechnol. 38, 97–107 (2020).
Griffith, M. et al. Optimizing cancer genome sequencing and analysis. Cell Syst. 1, 210–223 (2015).
Caravagna, G. et al. Subclonal reconstruction of tumors by using machine learning and population genetics. Nat. Genet. 52, 898–907 (2020).
Deshwar, A. G. et al. PhyloWGS: reconstructing subclonal composition and evolution from whole-genome sequencing of tumors. Genome Biol. 16, 35 (2015).
Laks, E. et al. Clonal decomposition and DNA replication states defined by scaled single-cell genome sequencing. Cell 179, 1207–1221.e22 (2019).
Schwarz, R. F. et al. Phylogenetic quantification of intra-tumour heterogeneity. PLOS Comput. Biol. 10, e1003535 (2014).
Ha, G. et al. TITAN: inference of copy number architectures in clonal cell populations from tumor whole-genome sequence data. Genome Res. 24, 1881–1893 (2014).
El-Kebir, M. SPhyR: tumor phylogeny estimation from single-cell sequencing data under loss and error. Bioinformatics 34, i671–i679 (2018).
Zhang, J. et al. Intratumor heterogeneity in localized lung adenocarcinomas delineated by multiregion sequencing. Science 346, 256–259 (2014).
Roth, A. et al. PyClone: statistical inference of clonal population structure in cancer. Nat. Methods 11, 396–398 (2014).
Shi, W. et al. Reliability of whole-exome sequencing for assessing intratumor genetic heterogeneity. Cell Rep. 25, 1446–1457 (2018).
Yates, L. R. et al. Subclonal diversification of primary breast cancer revealed by multiregion sequencing. Nat. Med. 21, 751–759 (2015).
Schuh, A. et al. Monitoring chronic lymphocytic leukemia progression by whole genome sequencing reveals heterogeneous clonal evolution patterns. Blood 120, 4191–4196 (2012).
Boutros, P. C. et al. Spatial genomic heterogeneity within localized, multifocal prostate cancer. Nat. Genet. 47, 736–745 (2015).
Robbe, P. et al. Clinical whole-genome sequencing from routine formalin-fixed, paraffin-embedded specimens: pilot study for the 100,000 Genomes Project. Genet. Med. 20, 1196–1205 (2018).
Chin, S.-F. et al. Shallow whole genome sequencing for robust copy number profiling of formalin-fixed paraffin-embedded breast cancers. Exp. Mol. Pathol. 104, 161–169 (2018).
Nik-Zainal, S. et al. The life history of 21 breast cancers. Cell 149, 994–1007 (2012).
Deshpande, A., Walradt, T., Hu, Y., Koren, A. & Imielinski, M. Robust foreground detection in somatic copy number data. Preprint at bioRxiv https://doi.org/10.1101/847681 (2019).
Van Loo, P. et al. Allele-specific copy number analysis of tumors. Proc. Natl Acad. Sci. USA 107, 16910–16915 (2010).
Dentro, S. C., Wedge, D. C. & Van Loo, P. Principles of reconstructing the subclonal architecture of cancers. Cold Spring Harb. Perspect. Med. 7, a026625 (2017).
Chiang, D. Y. et al. High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat. Methods 6, 99–103 (2009).
Olshen, A. B., Venkatraman, E. S., Lucito, R. & Wigler, M. Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 5, 557–572 (2004).
Nilsen, G. et al. Copynumber: efficient algorithms for single- and multi-track copy number segmentation. BMC Genomics 13, 591 (2012).
Lai, D. & Shah, S. HMMcopy: copy number prediction with correction for GC and mappability bias for HTS data. R Package Version 1 (2012).
Fischer, A., Vázquez-García, I., Illingworth, C. J. R. & Mustonen, V. High-definition reconstruction of clonal composition in cancer. Cell Rep. 7, 1740–1752 (2014).
McPherson, A. W. et al. ReMixT: clone-specific genomic structure estimation in cancer. Genome Biol. 18, 140 (2017).
Oesper, L., Mahmoody, A. & Raphael, B. J. THetA: inferring intra-tumor heterogeneity from high-throughput DNA sequencing data. Genome Biol. 14, R80 (2013).
Jiang, Y., Qiu, Y., Minn, A. J. & Zhang, N. R. Assessing intratumor heterogeneity and tracking longitudinal and spatial clonal evolutionary history by next-generation sequencing. Proc. Natl Acad. Sci. USA 113, E5528–E5537 (2016).
Müller, C. A. et al. The dynamics of genome replication using deep sequencing. Nucleic Acids Res. 42, e3 (2014).
Carter, S. L. et al. Absolute quantification of somatic DNA alterations in human cancer. Nat. Biotechnol. 30, 413–421 (2012).
Steele, C. D. et al. Undifferentiated sarcomas develop through distinct evolutionary pathways. Cancer Cell 35, 441–456.e8 (2019).
Almendro, V. et al. Genetic and phenotypic diversity in breast tumor metastases. Cancer Res. 74, 1338–1348 (2014).
Farahani, H. et al. Engineered in-vitro cell line mixtures and robust evaluation of computational methods for clonal decomposition and longitudinal dynamics in cancer. Sci. Rep. 7, 13467 (2017).
Miller, C. A. et al. SciClone: inferring clonal architecture and tracking the spatial and temporal patterns of tumor evolution. PLOS Comput. Biol. 10, e1003665 (2014).
Popic, V. et al. Fast and scalable inference of multi-sample cancer lineages. Genome Biol. 16, 91 (2015).
Sottoriva, A. et al. A Big Bang model of human colorectal tumor growth. Nat. Genet. 47, 209–216 (2015).
Williams, M. J., Werner, B., Barnes, C. P., Graham, T. A. & Sottoriva, A. Identification of neutral tumor evolution across cancer types. Nat. Genet. 48, 238–244 (2016).
Williams, M. J. et al. Quantification of subclonal selection in cancer from bulk sequencing data. Nat. Genet. 50, 895–903 (2018).
Strino, F., Parisi, F., Micsinai, M. & Kluger, Y. TrAp: a tree approach for fingerprinting subclonal tumor composition. Nucleic Acids Res. 41, e165 (2013).
Marass, F. et al. A phylogenetic latent feature model for clonal deconvolution. Ann. Appl. Stat. 10, 2377–2404 (2016).
Jiao, W., Vembu, S., Deshwar, A. G., Stein, L. & Morris, Q. Inferring clonal evolution of tumors from single nucleotide somatic mutations. BMC Bioinformatics 15, 35 (2014).
Ewing, A. D. et al. Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection. Nat. Methods 12, 623–630 (2015).
Zhou, T., Müller, P., Sengupta, S. & Ji, Y. PairClone: a Bayesian subclone caller based on mutation pairs. J. R. Stat. Soc. Ser. C Appl. Stat. 68, 705–725 (2019).
El-Kebir, M., Satas, G. & Raphael, B. J. Inferring parsimonious migration histories for metastatic cancers. Cancer 2, 5 (2018).
Zamani Esteki, M. et al. Concurrent whole-genome haplotyping and copy-number profiling of single cells. Am. J. Hum. Genet. 96, 894–912 (2015).
Mantere, T., Kersten, S. & Hoischen, A. Long-read sequencing emerging in medical genetics. Front. Genet. 10, 426 (2019).
Sedlazeck, F. J. et al. Accurate detection of complex structural variations using single-molecule sequencing. Nat. Methods 15, 461–468 (2018).
Dong, X. et al. Accurate identification of single-nucleotide variants in whole-genome-amplified single cells. Nat. Methods 14, 491–493 (2017).
Martelotto, L. G. et al. Whole-genome single-cell copy number profiling from formalin-fixed paraffin-embedded samples. Nat. Med. 23, 376–385 (2017).
Huddleston, J. et al. Discovery and genotyping of structural variation from long-read haploid genome sequence data. Genome Res. 27, 677–685 (2017).
Malikic, S., Jahn, K., Kuipers, J., Sahinalp, S. C. & Beerenwinkel, N. Integrative inference of subclonal tumour evolution from single-cell and bulk sequencing data. Nat. Commun. 10, 2750 (2019).
Abécassis, J. et al. Assessing reliability of intra-tumor heterogeneity estimates from single sample whole exome sequencing data. PLoS One 14, e0224143 (2019).
Liu, L. Y. et al. Quantifying the influence of mutation detection on tumour subclonal reconstruction. Preprint at bioRxiv https://doi.org/10.1101/418780 (2020).
Parikh, A. R. et al. Liquid versus tissue biopsy for detecting acquired resistance and tumor heterogeneity in gastrointestinal cancers. Nat. Med. 25, 1415–1421 (2019).
Litchfield, D. K. et al. Representative sequencing: unbiased sampling of solid tumor tissue. Cell Rep. 31, 107550 (2019).
Eirew, P. et al. Dynamics of genomic clones in breast cancer patient xenografts at single-cell resolution. Nature 518, 422–426 (2015).
Kim, C. et al. Chemoresistance evolution in triple-negative breast cancer delineated by single-cell sequencing. Cell 173, 879–893.e13 (2018).
Gawad, C., Koh, W. & Quake, S. R. Single-cell genome sequencing: current state of the science. Nat. Rev. Genet. 17, 175–188 (2016).
Yoshida, K. et al. Tobacco smoking and somatic mutations in human bronchial epithelium. Nature 578, 266–272 (2020).
Zahn, H. et al. Scalable whole-genome single-cell library preparation without preamplification. Nat. Methods 14, 167–173 (2017).
Chkhaidze, K. et al. Spatially constrained tumour growth affects the patterns of clonal selection and neutral drift in cancer genomic data. PLOS Comput. Biol. 15, e1007243 (2019).
Eaton, J., Wang, J. & Schwartz, R. Deconvolution and phylogeny inference of structural variations in tumor genomic samples. Bioinformatics 34, i357–i365 (2018).
Cmero, M. et al. Inferring structural variant cancer cell fraction. Nat. Commun. 11, 730 (2020).
Noorbakhsh, J. & Chuang, J. H. Uncertainties in tumor allele frequencies limit power to infer evolutionary pressures. Nat. Genet. 49, 1288–1289 (2017).
Tarabichi, M. et al. Neutral tumor evolution? Nat. Genet. 50, 1630–1633 (2018).
Heide, T. et al. Reply to ‘Neutral tumor evolution?’. Nat. Genet. 50, 1633–1637 (2018).
Williams, M. J., Werner, B., Barnes, C. P., Graham, T. A. & Sottoriva, A. Reply: Uncertainties in tumor allele frequencies limit power to infer evolutionary pressures. Nat. Genet. 49, 1289–1291 (2017).
Zare, F., Dow, M., Monteleone, N., Hosny, A. & Nabavi, S. An evaluation of copy number variation detection tools for cancer using whole exome sequencing data. BMC Bioinformatics 18, 286 (2017).
Gerlinger, M. et al. Genomic architecture and evolution of clear cell renal cell carcinomas defined by multiregion sequencing. Nat. Genet. 46, 225–233 (2014).
Vinci, M. et al. Functional diversity and cooperativity between subclonal populations of pediatric glioblastoma and diffuse intrinsic pontine glioma cells. Nat. Med. 24, 1204–1215 (2018).
Kuipers, J., Jahn, K., Raphael, B. J. & Beerenwinkel, N. Single-cell sequencing data reveal widespread recurrence and loss of mutational hits in the life histories of tumors. Genome Res. 27, 1885–1894 (2017).
Rieber, N. et al. Coverage bias and sensitivity of variant calling for four whole-genome sequencing technologies. PLoS One 8, e66621 (2013).
A.S. was supported by an NSERC CGS. M.T. and P.V.L. are supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001202), the UK Medical Research Council (FC001202) and the Wellcome Trust (FC001202). M.T. is a postdoctoral fellow supported by the European Union’s Horizon 2020 research and innovation program (Marie Skłodowska-Curie grant agreement no. 747852-SIOMICS). P.V.L. is a Winton Group Leader in recognition of the Winton Charitable Foundation’s support towards the establishment of the Francis Crick Institute. M.N.L. was supported by a Junior Research Fellowship (Trinity College, University of Oxford). P.C.B. was supported by a Terry Fox Research Institute New Investigator Award and a CIHR New Investigator Award. Q.M. is supported by an Associate Investigator Award from the Ontario Institute of Cancer Research and holds a Canada CIFAR AI chair. This work was supported by the NIH/NCI under award numbers P30CA016042 (P.C.B.) and P30-CA008748 (Q.M.), and through support from the ITCR (1U24CA248265-01) to P.C.B.
P.C.B is a member of the Scientific Advisory Boards of BioSymetrics Inc. and Intersect Diagnostics Inc. M.T., A.S., A.G.D., M.N.L., J.W., D.C.W., Q.D.M., and P.V.L. declare no competing interests.
Peer review information Nature Methods thanks the anonymous reviewers for their contribution to the peer review of this work. Lin Tang was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Tarabichi, M., Salcedo, A., Deshwar, A.G. et al. A practical guide to cancer subclonal reconstruction from DNA sequencing. Nat Methods 18, 144–155 (2021). https://doi.org/10.1038/s41592-020-01013-2
This article is cited by
Intratumoral heterogeneity affects tumor regression and Ki67 proliferation index in perioperatively treated gastric carcinoma
British Journal of Cancer (2023)
Convergent evolution and multi-wave clonal invasion in H3 K27-altered diffuse midline gliomas treated with a PDGFR inhibitor
Acta Neuropathologica Communications (2022)
Algorithms for Molecular Biology (2022)
Evaluating statistical approaches to define clonal origin of tumours using bulk DNA sequencing: context is everything
Genome Biology (2022)