Protocol

Copy-number analysis and inference of subclonal populations in cancer genomes using Sclust

Nature Protocols volume 13, pages 1488–1501 (2018)


The genomes of cancer cells constantly change during pathogenesis. This evolutionary process can lead to the emergence of drug-resistant mutations in subclonal populations, which can hinder therapeutic intervention in patients. Data derived from massively parallel sequencing can be used to infer these subclonal populations using tumor-specific point mutations. The accurate determination of copy-number changes and tumor impurity is necessary to reliably infer subclonal populations by mutational clustering. This protocol describes how to use Sclust, a copy-number analysis method with a recently developed mutational clustering approach. In a series of simulations and comparisons with alternative methods, we have previously shown that Sclust accurately determines copy-number states and subclonal populations. Performance tests show that the method is computationally efficient, with copy-number analysis and mutational clustering taking <10 min. Sclust is designed such that even non-experts in computational biology or bioinformatics with basic knowledge of the Linux/Unix command-line syntax should be able to carry out analyses of subclonal populations.
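To illustrate why copy number and tumor purity must be known before mutations can be clustered, the sketch below converts an observed variant allele frequency (VAF) into a cancer cell fraction (CCF) using the standard mixture relation between tumor and normal cells. This is not Sclust's implementation; the function names, the `multiplicity` parameter (number of mutated copies per cancer cell), and the default diploid normal copy number are assumptions chosen for illustration.

```python
def expected_vaf(ccf, purity, tumor_cn, multiplicity=1, normal_cn=2):
    """Expected VAF of a mutation present in a fraction `ccf` of cancer cells.

    Illustrative helper (hypothetical, not part of Sclust).
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1 or tumor_cn < multiplicity:
        raise ValueError("need 1 <= multiplicity <= tumor_cn")
    # Average number of copies of the locus per cell in the mixed sample.
    avg_copies = purity * tumor_cn + (1 - purity) * normal_cn
    # Mutated copies come only from the cancer cells that carry the mutation.
    return ccf * purity * multiplicity / avg_copies


def cancer_cell_fraction(vaf, purity, tumor_cn, multiplicity=1, normal_cn=2):
    """Invert expected_vaf: estimate CCF from an observed VAF."""
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1 or tumor_cn < multiplicity:
        raise ValueError("need 1 <= multiplicity <= tumor_cn")
    avg_copies = purity * tumor_cn + (1 - purity) * normal_cn
    return vaf * avg_copies / (purity * multiplicity)


if __name__ == "__main__":
    # Heterozygous mutation in a diploid region of a sample with 60% tumor cells.
    print(cancer_cell_fraction(vaf=0.30, purity=0.6, tumor_cn=2))
    print(cancer_cell_fraction(vaf=0.15, purity=0.6, tumor_cn=2))
```

Under these assumptions, a VAF of 0.30 at 60% purity in a diploid region corresponds to a clonal mutation (CCF of 1.0), whereas a VAF of 0.15 corresponds to a mutation in about half of the cancer cells. If purity or copy number is misestimated, the same VAF maps to a different CCF, which is why errors in either quantity propagate directly into the inferred subclonal clusters.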





Acknowledgements

We thank the Evolution and Heterogeneity Working Group of the PCAWG initiative for fruitful discussions; P. van Loo, D. Wedge, and S. Dentro for making the Battenberg calls for PD4120a available; and L. Maas for proofreading. The computation was performed on the DFG-funded CHEOPS Cluster of the Regional Computing Centre of Cologne. This work was supported by German Cancer Aid (Deutsche Krebshilfe, grant ID: 109679), the German Ministry of Science and Education (BMBF) as part of the e:Med program (grant nos. 01ZX1303A and 01ZX1406), and the Deutsche Forschungsgemeinschaft (CRU-286, CP2).

Author information

Author notes

    These authors contributed equally to this work: Yupeng Cun, Tsun-Po Yang & Viktor Achter.


  1. Department of Translational Genomics, Center for Integrated Oncology Cologne–Bonn, Medical Faculty, University of Cologne, Cologne, Germany.

    Yupeng Cun, Tsun-Po Yang & Martin Peifer

  2. Center for Molecular Medicine Cologne (CMMC), University of Cologne, Cologne, Germany.

    Tsun-Po Yang & Martin Peifer

  3. Computing Center, University of Cologne, Cologne, Germany.

    Viktor Achter & Ulrich Lang

  4. Department of Informatics, University of Cologne, Cologne, Germany.

    Ulrich Lang




Contributions

Y.C. and M.P. conceived the project. Y.C., T.-P.Y., V.A., and M.P. wrote the manuscript. Y.C., T.-P.Y., V.A., and M.P. developed and optimized the algorithm. Y.C., T.-P.Y., V.A., and M.P. performed computational analysis. V.A. and U.L. provided and optimized computing and data infrastructure. All co-authors reviewed the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Martin Peifer.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Note.
