  • Protocol

Copy-number analysis and inference of subclonal populations in cancer genomes using Sclust


The genomes of cancer cells constantly change during pathogenesis. This evolutionary process can lead to the emergence of drug-resistant mutations in subclonal populations, which can hinder therapeutic intervention in patients. Data derived from massively parallel sequencing can be used to infer these subclonal populations using tumor-specific point mutations. The accurate determination of copy-number changes and tumor impurity is necessary to reliably infer subclonal populations by mutational clustering. This protocol describes how to use Sclust, a copy-number analysis method with a recently developed mutational clustering approach. In a series of simulations and comparisons with alternative methods, we have previously shown that Sclust accurately determines copy-number states and subclonal populations. Performance tests show that the method is computationally efficient, with copy-number analysis and mutational clustering taking <10 min. Sclust is designed such that even non-experts in computational biology or bioinformatics with basic knowledge of the Linux/Unix command-line syntax should be able to carry out analyses of subclonal populations.
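To illustrate why copy-number states and tumor purity have to be known before mutations can be clustered, the following minimal sketch converts an observed variant allele frequency (VAF) into a cancer cell fraction (CCF) using the widely used textbook relation. It is not taken from Sclust and does not reproduce its model; the function name, the parameter names, and the assumption of a diploid normal genome are all illustrative.

```python
def cancer_cell_fraction(vaf, purity, tumor_cn, multiplicity=1, normal_cn=2):
    """Estimate the fraction of cancer cells carrying a mutation.

    Illustrative textbook relation, not the Sclust implementation.
    vaf          -- observed variant allele frequency (0..1)
    purity       -- fraction of tumor cells in the sample (0..1]
    tumor_cn     -- total copy number at the locus in tumor cells
    multiplicity -- number of mutated copies per mutated tumor cell
    normal_cn    -- copy number in normal cells (assumed 2, autosomes)
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if not 1 <= multiplicity <= tumor_cn:
        raise ValueError("multiplicity must be between 1 and tumor_cn")

    # Average number of copies of the locus per cell in the mixed sample.
    avg_copies = purity * tumor_cn + (1 - purity) * normal_cn

    # Rescale the VAF by the share of copies that can carry the mutation.
    return vaf * avg_copies / (purity * multiplicity)


if __name__ == "__main__":
    # A heterozygous mutation in a diploid region of a 50% pure sample:
    # a VAF of 0.25 corresponds to a clonal mutation (CCF of 1.0).
    print(cancer_cell_fraction(vaf=0.25, purity=0.5, tumor_cn=2))
```

Under these assumptions, the same VAF maps to very different cancer cell fractions as the purity or the local copy number changes, which is why errors in either estimate can shift mutations between apparent clonal and subclonal clusters.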


Figure 1: Validation of the copy-number analysis of Sclust against Absolute and a mixing series of the small-cell lung-cancer cell line H2171 with its matched normal cell line.
Figure 2: Reconstruction of the subclonal structure of the breast cancer case PD4120a.
Figure 3: Simulations of a clonal and subclonal population with different proportions of clonal mutations.
Figure 4: Overview of the Sclust workflow.
Figure 5: Example of the first two pages of the <sample>_cn_profile.pdf file.




We thank the Evolution and Heterogeneity Working Group of the PCAWG initiative for fruitful discussions; P. van Loo, D. Wedge, and S. Dentro for making the Battenberg calls for PD4120a available; and L. Maas for proofreading. The computation was performed on the DFG-funded CHEOPS Cluster of the Regional Computing Centre of Cologne. This work was supported by German Cancer Aid (Deutsche Krebshilfe, grant ID: 109679), the German Ministry of Science and Education (BMBF) as part of the e:Med program (grant nos. 01ZX1303A and 01ZX1406), and the Deutsche Forschungsgemeinschaft (CRU-286, CP2).

Author information



Y.C. and M.P. conceived the project. Y.C., T.-P.Y., V.A., and M.P. wrote the manuscript. Y.C., T.-P.Y., V.A., and M.P. developed and optimized the algorithm. Y.C., T.-P.Y., V.A., and M.P. performed computational analysis. V.A. and U.L. provided and optimized computing and data infrastructure. All co-authors reviewed the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Martin Peifer.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Note. (PDF 662 kb)


About this article


Cite this article

Cun, Y., Yang, TP., Achter, V. et al. Copy-number analysis and inference of subclonal populations in cancer genomes using Sclust. Nat Protoc 13, 1488–1501 (2018).
