Abstract
Single-cell barcoding technologies enable genome sequencing of thousands of individual cells in parallel, but with extremely low sequencing coverage (<0.05×) per cell. While the total copy number of large multi-megabase segments can be derived from such data, important allele-specific mutations—such as copy-neutral loss of heterozygosity (LOH) in cancer—are missed. We introduce copy-number haplotype inference in single cells using evolutionary links (CHISEL), a method to infer allele- and haplotype-specific copy numbers in single cells and subpopulations of cells by aggregating sparse signal across hundreds or thousands of individual cells. We applied CHISEL to ten single-cell sequencing datasets of ~2,000 cells from two patients with breast cancer. We identified extensive allele-specific copy-number aberrations (CNAs) in these samples, including copy-neutral LOHs, whole-genome duplications (WGDs) and mirrored-subclonal CNAs. These allele-specific CNAs affect genomic regions containing well-known breast-cancer genes. We also refined the reconstruction of tumor evolution, timing allele-specific CNAs before and after WGDs, identifying low-frequency subpopulations distinguished by unique CNAs and uncovering evidence of convergent evolution.
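The benefit of aggregating sparse signal across cells can be illustrated with a minimal sketch. It is not the CHISEL algorithm: it only assumes that reads covering phased germline SNPs in one genomic region have already been counted per cell and per haplotype, and that the cells being pooled share the same copy-number state. The function name `pooled_baf` and the toy counts are hypothetical. At <0.05× coverage most cells contribute zero or one read to a region, so a per-cell allele fraction is uninformative, whereas the pooled fraction can separate a balanced region from one with LOH.

```python
from typing import Iterable, Optional, Tuple


def pooled_baf(counts: Iterable[Tuple[int, int]]) -> Optional[float]:
    """Pool haplotype-resolved read counts across cells for one region.

    Each item is (reads_on_haplotype_A, reads_on_haplotype_B) for one cell.
    Returns the fraction of pooled reads on haplotype B, or None when no
    read covers the region in any cell.
    """
    cells = list(counts)  # materialize so the input is traversed only once
    a_total = sum(a for a, _ in cells)
    b_total = sum(b for _, b in cells)
    total = a_total + b_total
    if total == 0:
        return None
    return b_total / total


# Toy counts (illustrative only): each cell has 0-2 reads in the region.
balanced_cells = [(1, 0), (0, 1), (0, 0), (1, 0), (0, 1),
                  (1, 1), (0, 0), (1, 0), (0, 1)]
loh_cells = [(1, 0), (0, 0), (1, 0), (2, 0), (0, 0), (1, 0)]

print(pooled_baf(balanced_cells))  # both haplotypes equally represented
print(pooled_baf(loh_cells))       # haplotype B never observed
```

A pooled fraction near 0.5 is consistent with two balanced haplotypes, and a fraction near 0 or 1 is consistent with LOH. Telling copy-neutral LOH apart from a deletion additionally requires the total copy number estimated from read depth, which this sketch does not model.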
Data availability
The sequencing data from 10x Genomics Chromium Single Cell CNV Solution for patient S0 are available at https://support.10xgenomics.com/single-cell-dna/datasets. Raw read counts and phased SNP counts for patient S0 are available at https://doi.org/10.5281/zenodo.3817605 and for patient S1 at https://doi.org/10.5281/zenodo.3817536. The DOP-PCR sequencing data of 89 breast tumor cells are available from the NCBI Sequence Read Archive under accession SRA: SRP114962. All the processed data for all datasets of patients S0 and S1 and for the DOP-PCR data, as well as all the results of CHISEL, are available on GitHub at https://github.com/raphael-group/chisel-data.
Code availability
CHISEL is available on GitHub at https://github.com/raphael-group/chisel and on Code Ocean at https://doi.org/10.24433/CO.6796686.v1.
Acknowledgements
We thank L. Hepler and K. Ganapathy from 10x Genomics for providing additional data for our study, for providing access to the published data of the total copy-number analysis and for useful feedback. This work is supported by US National Institutes of Health (NIH) grants R01HG007069 and U24CA211000, a US National Science Foundation (NSF) CAREER Award (CCF-1053753) and Chan Zuckerberg Initiative DAF grant 2018-182608 (B.J.R.). Additional support was provided by NIH grant (Rutgers) 2P30CA072720-20, the O’Brien Family Fund for Health Research and the Wilke Family Fund for Innovation (B.J.R.).
Author information
Contributions
S.Z. and B.J.R. conceived the project, developed the theory and algorithms and wrote the paper. S.Z. implemented the algorithms and performed the analyses.
Ethics declarations
Competing interests
B.J.R. is a cofounder of, and consultant to, Medley Genomics.
Supplementary Information
Supplementary Figs. 1–28, Results 1–4 and Methods 1–12.
About this article
Cite this article
Zaccaria, S., Raphael, B.J. Characterizing allele- and haplotype-specific copy numbers in single cells with CHISEL. Nat Biotechnol 39, 207–214 (2021). https://doi.org/10.1038/s41587-020-0661-6
DOI: https://doi.org/10.1038/s41587-020-0661-6