  • Protocol

CONIPHER: a computational framework for scalable phylogenetic reconstruction with error correction

Abstract

Intratumor heterogeneity provides the fuel for the evolution and selection of subclonal tumor cell populations. However, accurate inference of tumor subclonal architecture and reconstruction of tumor evolutionary histories from bulk DNA sequencing data remain challenging. Frequently, sequencing and alignment artifacts are not fully filtered out from cancer somatic mutations, and errors in the identification of copy number alterations or complex evolutionary events (e.g., mutation losses) affect the estimated cellular prevalence of mutations. Together, such errors propagate into the analysis of mutation clustering and phylogenetic reconstruction. In this Protocol, we present a new computational framework, CONIPHER (COrrecting Noise In PHylogenetic Evaluation and Reconstruction), that accurately infers subclonal structure and phylogenetic relationships from multisample tumor sequencing, accounting for both copy number alterations and mutation errors. CONIPHER has been used to reconstruct subclonal architecture and tumor phylogeny from multisample tumors with high-depth whole-exome sequencing from the TRACERx421 dataset, as well as from matched primary-metastatic cases. CONIPHER outperforms similar methods on simulated datasets and, in particular, scales to a large number of tumor samples and clones, while completing in under 1.5 h on average. CONIPHER enables automated phylogenetic analysis that can be effectively applied to large sequencing datasets generated with different technologies. Running CONIPHER requires only a basic knowledge of bioinformatics and of the R and bash scripting languages.
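
To illustrate why copy number errors affect the estimated cellular prevalence of a mutation, the following is a minimal sketch in Python of a commonly used approximation for the cancer cell fraction (CCF). It is not CONIPHER's own model or code: the function name, parameter names and example values are all hypothetical, and the formula assumes a single tumor copy number state at the locus and a diploid normal genome by default.

```python
def estimate_ccf(var_reads, depth, purity, tumor_cn, multiplicity=1, normal_cn=2):
    """Approximate cancer cell fraction (CCF) of a mutation in one sample.

    Assumed model (a standard approximation, not taken from the protocol):
        expected VAF = purity * multiplicity * CCF
                       / (purity * tumor_cn + (1 - purity) * normal_cn)
    Solving for CCF gives the expression returned below.
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")

    vaf = var_reads / depth  # observed variant allele frequency
    # Average number of copies of the locus per cell in the mixed sample.
    mean_copies = purity * tumor_cn + (1 - purity) * normal_cn
    return vaf * mean_copies / (purity * multiplicity)


# Hypothetical mutation: 30 variant reads out of 100, tumor purity 0.6.
ccf_correct_cn = estimate_ccf(30, 100, purity=0.6, tumor_cn=2)
# Same reads, but with the tumor copy number miscalled as 3.
ccf_wrong_cn = estimate_ccf(30, 100, purity=0.6, tumor_cn=3)
print(ccf_correct_cn, ccf_wrong_cn)
```

With these made-up values, the correct copy number yields a CCF of about 1 (a clonal mutation), whereas the miscalled copy number yields about 1.3, a value above 1 that cannot correspond to a real fraction of cancer cells. Errors of this kind are what then propagate into mutation clustering and tree building.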

Key points

  • CONIPHER is a computational framework for accurately inferring subclonal structure and phylogenetic relationships from multisample tumor sequencing, accounting for both copy number alterations and mutation errors.

  • Benchmarking analyses on simulations show that CONIPHER outperforms similar methods, and in particular scales to a large number of tumor samples and clones. This enables automated phylogenetic analysis that can be effectively applied to large sequencing datasets generated with different technologies.

Fig. 1: Overview of the CONIPHER clustering and tree-building methods.
Fig. 2: Benchmarking CONIPHER against current state-of-the-art methods.
Fig. 3: Method workflow.
Fig. 4: Example input table for clustering and tree building: input.tsv.
Fig. 5: Example panels from CONIPHER clustering stage output.
Fig. 6: Example pytree_and_bar.pdf for case CRUK0063.
Fig. 7: Example pytree_multipletrees.pdf for case CRUK0063.

Data availability

The WES data (from the TRACERx study) used during this study have been deposited in the European Genome–phenome Archive, which is hosted by The European Bioinformatics Institute and the Centre for Genomic Regulation under the accession code EGAS00001006494; access is controlled by the TRACERx data access committee. Details on how to apply for access are available on the linked page. The three simulated datasets used in the benchmarking analyses are available at https://zenodo.org/doi/10.5281/zenodo.10048164.

Code availability

The code to run the CONIPHER clustering and tree-building wrapper functions is available, with documentation and run examples, on GitHub at https://github.com/McGranahanLab/CONIPHER-wrapper. The source code for the CONIPHER R package is available at https://github.com/McGranahanLab/CONIPHER. The simulation framework is available at https://github.com/zaccaria-lab/TRACERx_simulation_tool. The code in this protocol has been peer reviewed.

Acknowledgements

The TRACERx study (ClinicalTrials.gov no. NCT01888601) is sponsored by University College London (UCL/12/0279) and has been approved by an independent Research Ethics Committee (13/LO/1546). TRACERx is funded by Cancer Research UK (CRUK) (C11496/A17786) and coordinated through the CRUK and UCL Cancer Trials Centre, which has a core grant from CRUK (C444/A15953). We gratefully acknowledge the patients and relatives who participated in the TRACERx study. We thank all site personnel, investigators, funders and industry partners that supported the generation of the data within this study. This work was supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK (CC2041), the UK Medical Research Council (CC2041) and the Wellcome Trust (CC2041). This work was also supported by the Cancer Research UK Lung Cancer Centre of Excellence, the CRUK City of London Centre Award (C7893/A26233) and the UCL Experimental Cancer Research Centre. C.S. is a Royal Society Napier Research Professor (RSRP\R\210001) and is supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK (CC2041), the UK Medical Research Council (CC2041) and the Wellcome Trust (CC2041). For the purpose of open access, the author has applied a CC BY public copyright licence to any author accepted manuscript version arising from this submission. C.S. is funded by Cancer Research UK (TRACERx (C11496/A17786), PEACE (C416/A21999) and CRUK Cancer Immunotherapy Catalyst Network); Cancer Research UK Lung Cancer Centre of Excellence (C11496/A30025); the Rosetrees Trust, Butterfield and Stoneygate Trusts; NovoNordisk Foundation (ID16584); Royal Society Professorship Enhancement Award (RP/EA/180007); National Institute for Health Research (NIHR) University College London Hospitals Biomedical Research Centre; the Cancer Research UK–University College London Centre; the Experimental Cancer Medicine Centre; the Breast Cancer Research Foundation (US); the Mark Foundation for Cancer Research Aspire Award (21-029-ASP); and is in receipt of an ERC Advanced Grant (PROTEUS) from the European Research Council under the European Union’s Horizon 2020 research and innovation programme (835297). This work was supported by a Stand Up To Cancer-LUNGevity-American Lung Association Lung Cancer Interception Dream Team Translational Research Grant (SU2C-AACR-DT23-17 to S. M. Dubinett and A. E. Spira). Stand Up To Cancer is a division of the Entertainment Industry Foundation. Research grants are administered by the American Association for Cancer Research, the Scientific Partner of SU2C. S.Z. is a Cancer Research UK Career Development Fellow (RCCCDF-Nov21\100005), is supported by Rosetrees Trust (M917) and is also supported by a Cancer Research UK UCL Centre Non-Clinical Training Award (CANTAC721\100022). N.M. is a Sir Henry Dale Fellow, jointly funded by the Wellcome Trust and the Royal Society (211179/Z/18/Z), and also receives funding from Cancer Research UK, Rosetrees and the NIHR BRC at University College London Hospitals and the CRUK University College London Experimental Cancer Medicine Centre.

Author information

Contributions

K.G., A.H., E.C., A.M.F., K.T., N.J.B. and N.M. helped to develop the protocol and wrote the manuscript. A.B. and S.Z. created the simulations, performed the benchmarking and wrote the manuscript. M.S.H. helped with bioinformatics pipeline development. C.S., S.Z. and N.M. jointly designed and supervised the study and helped to write the manuscript.

Corresponding authors

Correspondence to Charles Swanton, Simone Zaccaria or Nicholas McGranahan.

Ethics declarations

Competing interests

A.M.F. is a co-inventor on a patent application to determine methods and systems for tumor monitoring (PCT/EP2022/077987). N.J.B. is a co-inventor on a patent to identify responders to cancer treatment (PCT/GB2018/051912) and a co-inventor on a patent for methods for predicting anti-cancer response (US14/466,208). C.S. acknowledges grant support from AstraZeneca, Boehringer-Ingelheim, Bristol Myers Squibb, Pfizer, Roche-Ventana, Invitae (previously Archer Dx Inc; collaboration in minimal residual disease sequencing technologies), Ono Pharmaceutical and Personalis. He is an AstraZeneca Advisory Board member and Chief Investigator for the AZ MeRmaiD 1 and 2 clinical trials and is also Co-Chief Investigator of the NHS Galleri trial funded by GRAIL and a paid member of GRAIL’s Scientific Advisory Board. He receives consultant fees from Achilles Therapeutics (also a Scientific Advisory Board member), Bicycle Therapeutics (also a Scientific Advisory Board member), Genentech, Medicxi, China Innovation Centre of Roche (CICoR) formerly Roche Innovation Centre–Shanghai, Metabomed (until July 2022) and the Sarah Cannon Research Institute. He has received honoraria from Amgen, AstraZeneca, Pfizer, Novartis, GlaxoSmithKline, MSD, Bristol Myers Squibb, Illumina and Roche-Ventana; had stock options in Apogen Biotechnologies and GRAIL until June 2021; currently has stock options in Epic Bioscience and Bicycle Therapeutics; and has stock options in and is a co-founder of Achilles Therapeutics. C.S. is listed as an inventor on a European patent application relating to assay technology to detect tumor recurrence (PCT/GB2017/053289), which has been licensed to commercial entities and, under his terms of employment, C.S. is due a revenue share of any revenue generated from such license(s). C.S. holds patents relating to targeting neoantigens (PCT/EP2016/059401), to identifying patient response to immune checkpoint blockade (PCT/EP2016/071471), to determining human leukocyte antigen loss of heterozygosity (HLA LOH) (PCT/GB2018/052004), to predicting survival rates of patients with cancer (PCT/GB2020/050221), to identifying patients who respond to cancer treatment (PCT/GB2018/051912), to detecting tumor mutations (US patent: PCT/US2017/28013) and to methods for lung cancer detection (US20190106751A1). He also holds both a European and US patent related to identifying indel targets (PCT/GB2018/051892), is a co-inventor on a patent application to determine methods and systems for tumor monitoring (PCT/EP2022/077987) and is a named inventor on a provisional patent protection related to a ctDNA detection algorithm. N.M. has received consultancy fees and has stock options in Achilles Therapeutics. N.M. holds European patents relating to targeting neoantigens (PCT/EP2016/059401), identifying patient response to immune checkpoint blockade (PCT/EP2016/071471), determining HLA LOH (PCT/GB2018/052004) and predicting survival rates of patients with cancer (PCT/GB2020/050221).

Peer review

Peer review information

Nature Protocols thanks Alexander Anderson, Tim Coorens, Andrew Roth and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key references using this protocol

Frankell, A. M. et al. Nature 616, 525–533 (2023): https://doi.org/10.1038/s41586-023-05783-5

Al Bakir, M. et al. Nature 616, 534–542 (2023): https://doi.org/10.1038/s41586-023-05729-x

Martínez-Ruiz, C. et al. Nature 616, 543–552 (2023): https://doi.org/10.1038/s41586-023-05706-4

Abbosh, C. et al. Nature 616, 553–562 (2023): https://doi.org/10.1038/s41586-023-05776-4

Karasaki, T. et al. Nat. Med. 29, 833–845 (2023): https://doi.org/10.1038/s41591-023-02230-w

Supplementary information

Supplementary Methods 1–5, Notes 1–4, Figs. 1–4 and Table 1

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Grigoriadis, K., Huebner, A., Bunkum, A. et al. CONIPHER: a computational framework for scalable phylogenetic reconstruction with error correction. Nat Protoc 19, 159–183 (2024). https://doi.org/10.1038/s41596-023-00913-9

  • DOI: https://doi.org/10.1038/s41596-023-00913-9
