Abstract
Chromosomal instability (CIN) results in the accumulation of large-scale losses, gains and rearrangements of DNA¹. The broad genomic complexity caused by CIN is a hallmark of cancer²; however, there is no systematic framework to measure different types of CIN and their effect on clinical phenotypes pan-cancer. Here we evaluate the extent, diversity and origin of CIN across 7,880 tumours representing 33 cancer types. We present a compendium of 17 copy number signatures that characterize specific types of CIN, with putative aetiologies supported by multiple independent data sources. The signatures predict drug response and identify new drug targets. Our framework refines the understanding of impaired homologous recombination, which is one of the most therapeutically targetable types of CIN. Our results illuminate a fundamental structure underlying genomic complexity in human cancers and provide a resource to guide future CIN research.
Data availability
All data used in this study were obtained from publicly available sources and are described in detail in Supplementary Table 1, section ‘Data and Code’ in the Supplementary Methods. Some raw data have restricted access (TCGA dbGaP accession number: phs000178.v11.p8; ICGC EGA accession number: EGAS00001001692). Access can be obtained by applying to the relevant Data Access Committees (TCGA or ICGC). The authors declare that all other data supporting the findings of this study, including the source data for all figures, are publicly available without restrictions and are also available in the Supplementary Information and the GitHub repositories. All data supporting the analysis of our copy number signatures are navigable via our web portal (https://markowetz.cruk.cam.ac.uk/cincompendium/).
Code availability
The code is publicly accessible via our hub repository (https://github.com/markowetzlab/Drews2022_CIN_Compendium), which describes how the CIN signatures were derived and how to reproduce the figures and tables in this publication. The repository also contains the publicly accessible data and intermediary results used and produced in this study. The hub repository links to other repositories containing the code for specialized tasks.
References
Bakhoum, S. F. & Cantley, L. C. The multifaceted role of chromosomal instability in cancer and its microenvironment. Cell 174, 1347–1360 (2018).
Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: the next generation. Cell 144, 646–674 (2011).
Tijhuis, A. E., Johnson, S. C. & McClelland, S. E. The emerging links between chromosomal instability (CIN), metastasis, inflammation and tumour immunity. Mol. Cytogenet. 12, 17 (2019).
Chakravarti, D., LaBella, K. A. & DePinho, R. A. Telomeres: history, health, and hallmarks of aging. Cell 184, 306–322 (2021).
Bakhoum, S. F. et al. Chromosomal instability drives metastasis through a cytosolic DNA response. Nature 553, 467–472 (2018).
Davies, H. et al. HRDetect is a predictor of BRCA1 and BRCA2 deficiency based on mutational signatures. Nat. Med. 23, 517–525 (2017).
Cohen-Sharir, Y. et al. Aneuploidy renders cancer cells vulnerable to mitotic checkpoint inhibition. Nature 590, 486–491 (2021).
Macintyre, G. et al. Copy number signatures and mutational processes in ovarian carcinoma. Nat. Genet. 50, 1262–1270 (2018).
Steele, C. D. et al. Undifferentiated sarcomas develop through distinct evolutionary pathways. Cancer Cell 35, 441–456.e8 (2019).
Ben-David, U. & Amon, A. Context is everything: aneuploidy in cancer. Nat. Rev. Genet. 21, 44–62 (2020).
Stok, C., Kok, Y. P., van den Tempel, N. & van Vugt, M. A. T. M. Shaping the BRCAness mutational landscape by alternative double-strand break repair, replication stress and mitotic aberrancies. Nucleic Acids Res. 49, 4239–4257 (2021).
Takemon, Y. et al. Multi-omic analyses reveal a role for mammalian CIC in cell cycle regulation and mitotic fidelity. Preprint at bioRxiv https://www.biorxiv.org/content/10.1101/533323v2 (2019).
Hell, M. P., Duda, M., Weber, T. C., Moch, H. & Krek, W. Tumor suppressor VHL functions in the control of mitotic fidelity. Cancer Res. 74, 2422–2431 (2014).
Brownlee, P. M., Chambers, A. L., Cloney, R., Bianchi, A. & Downs, J. A. BAF180 promotes cohesion and prevents genome instability and aneuploidy. Cell Rep. 6, 973–981 (2014).
Silverman, J. S., Skaar, J. R. & Pagano, M. SCF ubiquitin ligases in the maintenance of genome stability. Trends Biochem. Sci. 37, 66–73 (2012).
Godinho, S. A. & Pellman, D. Causes and consequences of centrosome abnormalities in cancer. Phil. Trans. R. Soc. B 369, 20130467 (2014).
Menghi, F. et al. The tandem duplicator phenotype is a prevalent genome-wide cancer configuration driven by distinct gene mutations. Cancer Cell 34, 197–210.e5 (2018).
Abkevich, V. et al. Patterns of genomic loss of heterozygosity predict homologous recombination repair defects in epithelial ovarian cancer. Br. J. Cancer 107, 1776–1782 (2012).
Popova, T. et al. Ploidy and large-scale genomic instability consistently identify basal-like breast carcinomas with BRCA1/2 inactivation. Cancer Res. 72, 5454–5462 (2012).
The Cancer Genome Atlas Research Network. Integrated genomic analyses of ovarian carcinoma. Nature 474, 609–615 (2011).
Nik-Zainal, S. et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature 534, 47–54 (2016).
Ogden, A., Rida, P. C. G. & Aneja, R. Prognostic value of CA20, a score based on centrosome amplification-associated genes, in breast tumors. Sci. Rep. 7, 262 (2017).
Piazza, A. & Heyer, W.-D. Homologous recombination and the formation of complex genomic rearrangements. Trends Cell Biol. 29, 135–149 (2019).
Guirouilh-Barbat, J., Lambert, S., Bertrand, P. & Lopez, B. S. Is homologous recombination really an error-free process? Front. Genet. 5, 175 (2014).
Knijnenburg, T. A. et al. Genomic and molecular landscape of DNA damage repair deficiency across The Cancer Genome Atlas. Cell Rep. 23, 239–254.e6 (2018).
Saavedra, H. I., Fukasawa, K., Conn, C. W. & Stambrook, P. J. MAPK mediates RAS-induced chromosome instability. J. Biol. Chem. 274, 38083–38090 (1999).
Perl, A. L. et al. Protein phosphatase 2A controls ongoing DNA replication by binding to and regulating cell division cycle 45 (CDC45). J. Biol. Chem. 294, 17043–17059 (2019).
Chen, L. et al. The augmented R-loop is a unifying mechanism for myelodysplastic syndromes induced by high-risk splicing factor mutations. Mol. Cell 69, 412–425.e6 (2018).
Li, Q. et al. ERCC2 helicase domain mutations confer nucleotide excision repair deficiency and drive cisplatin sensitivity in muscle-invasive bladder cancer. Clin. Cancer Res. 25, 977–988 (2019).
Menon, V. & Povirk, L. Involvement of p53 in the repair of DNA double strand breaks: multifaceted roles of p53 in homologous recombination repair (HRR) and non-homologous end joining (NHEJ). Subcell. Biochem. 85, 321–336 (2014).
Dentro, S. C. et al. Characterizing genetic intra-tumor heterogeneity across 2,658 human cancer genomes. Cell 184, 2239–2254.e39 (2021).
Dewhurst, S. M. et al. Tolerance of whole-genome doubling propagates chromosomal instability and accelerates cancer genome evolution. Cancer Discov. 4, 175–185 (2014).
Davoli, T. & de Lange, T. The causes and consequences of polyploidy in normal development and cancer. Annu. Rev. Cell Dev. Biol. 27, 585–610 (2011).
Berenjeno, I. M. et al. Oncogenic PIK3CA induces centrosome amplification and tolerance to genome doubling. Nat. Commun. 8, 1773 (2017).
Darp, R., Vittoria, M. A., Ganem, N. J. & Ceol, C. J. Oncogenic BRAF induces whole-genome doubling through suppression of cytokinesis. Preprint at bioRxiv https://doi.org/10.1101/2021.04.08.439023 (2021).
Zhang, Q. et al. FBXW7 facilitates nonhomologous end-joining via K63-linked polyubiquitylation of XRCC4. Mol. Cell 61, 419–433 (2016).
Citri, A., Skaria, K. B. & Yarden, Y. The deaf and the dumb: the biology of ErbB-2 and ErbB-3. Exp. Cell Res. 284, 54–65 (2003).
Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).
Venkatesan, S. et al. Induction of APOBEC3 exacerbates DNA replication stress and chromosomal instability in early breast and lung cancer evolution. Cancer Discov. 11, 2456–2473 (2021).
Crockford, A. et al. Cyclin D mediates tolerance of genome-doubling in cancers with functional p53. Ann. Oncol. 28, 149–156 (2017).
Ray Chaudhuri, A. & Nussenzweig, A. The multifaceted roles of PARP1 in DNA repair and chromatin remodelling. Nat. Rev. Mol. Cell Biol. 18, 610–621 (2017).
Goel, S. et al. CDK4/6 inhibition triggers anti-tumour immunity. Nature 548, 471–475 (2017).
Brownlee, P. M., Meisenberg, C. & Downs, J. A. The SWI/SNF chromatin remodelling complex: Its role in maintaining genome stability and preventing tumourigenesis. DNA Repair 32, 127–133 (2015).
Kops, G. J. P., Foltz, D. R. & Cleveland, D. W. Lethality to human cancer cells through massive chromosome loss by inhibition of the mitotic checkpoint. Proc. Natl Acad. Sci. USA 101, 8699–8704 (2004).
Quinton, R. J. et al. Whole-genome doubling confers unique genetic vulnerabilities on tumour cells. Nature 590, 492–497 (2021).
Janssen, A., Kops, G. J. P. L. & Medema, R. H. Elevating the frequency of chromosome mis-segregation as a strategy to kill tumor cells. Proc. Natl Acad. Sci. USA 106, 19108–19113 (2009).
Datta, D. et al. Nucleolar GTP-binding protein-1 (NGP-1) promotes G1 to S phase transition by activating cyclin-dependent kinase inhibitor p21 Cip1/Waf1. J. Biol. Chem. 290, 21536–21552 (2015).
Martin, L. P., Hamilton, T. C. & Schilder, R. J. Platinum resistance: the role of DNA repair pathways. Clin. Cancer Res. 14, 1291–1295 (2008).
Zack, T. I. et al. Pan-cancer patterns of somatic copy number alteration. Nat. Genet. 45, 1134–1140 (2013).
Scheinin, I. et al. DNA copy number analysis of fresh and formalin-fixed specimens by shallow whole-genome sequencing with identification and exclusion of problematic regions in the genome assembly. Genome Res. 24, 2022–2032 (2014).
Acknowledgements
We thank M. Eldridge for setting up the online resource and A. Berman for bug fixes. R.M.D., P.S.S., D.-L.C., J.D.B. and F.M. are funded by Cancer Research UK (core grants C14303/A17197, A22905 and A19274) and the Cambridge Cancer Centre (grant C9685/A25117). G.M. and B.H. are hosted by the Centro Nacional de Investigaciones Oncológicas (CNIO), which is supported by the Instituto de Salud Carlos III and recognized as a ‘Severo Ochoa’ Centre of Excellence (ref. CEX2019-000891-S) by the Spanish Ministry of Science and Innovation (MCIN/AEI/ 10.13039/501100011033). G.M. and B.H. were also supported by a Spanish Ministry of Science and Innovation grant PID2019-111356RA-I00 (MCIN/AEI/ 10.13039/501100011033). M.T. was supported as a postdoctoral researcher of the F.R.S.-FNRS. L.M.G. was supported by the Wellcome Trust PhD programme in Mathematical Genomics and Medicine (grant number RG92770). M.S. was supported by the Horizon 2020 (H2020) Integrated Training Network CONTRA (grant 766030-CONTRA-H2020-MSCA-ITN-2017). This work was supported by the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001202), the UK Medical Research Council (FC001202) and the Wellcome Trust (FC001202). P.V.L. is a Winton Group Leader in recognition of the Winton Charitable Foundation’s support towards the establishment of The Francis Crick Institute.
Author information
Contributions
G.M. and F.M. contributed equally to this work. R.M.D., G.M. and F.M. conceived and designed the study. R.M.D., B.H., D.-L.C., M.S. and G.M. developed the methodology of the study. R.M.D., B.H., M.T., K.H., T.L., P.S.S., L.M.G., L.L., M.S. and G.M. developed the software for the study. R.M.D., M.T., K.H., T.L., P.S.S., D.-L.C. and G.M. contributed to the validation of the method and results. R.M.D., B.H., L.M.G., D.-L.C., L.L. and G.M. contributed to the formal analysis presented in this study. R.M.D., B.H., K.H., T.L., P.S.S., L.M.G. and P.V.L. provided access to data and contributed to gathering, processing and curating data. R.M.D., J.D.B., P.V.L., G.M. and F.M. wrote the original draft. R.M.D., B.H., G.M. and F.M. produced and contributed to the visualizations of the study. R.M.D., G.M. and F.M. supervised the project. All authors had access to all of the data in the study. All authors contributed to the review and the editing of the manuscript. All authors approved the manuscript before the initial submission and all other resubmissions.
Ethics declarations
Competing interests
J.D.B., G.M. and F.M. are co-founders, directors and shareholders of Tailor Bio Ltd. Cancer Research UK, the University of Cambridge and the Spanish National Cancer Research Centre (CNIO) are in the process of applying for a patent (application GB2114203.9) covering the copy number signature methodology and drug target associations that lists R.M.D., B.H., G.M. and F.M. as inventors. G.M., F.M. and J.D.B. are listed on a patent on using copy number signatures to predict response to doxorubicin treatment in ovarian cancer (PCT/EP2021/065058).
Peer review
Peer review information
Nature thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 Workflow of sample filtering and detectable chromosomal instability (dCIN).
a, REMARK diagram showing the flow of samples through the study. b, For each copy number feature of the previous ovarian signatures, a histogram of the number of events per sample that could not be assigned to an ovarian copy number signature in the TCGA ovarian cohort. The red dotted line indicates the 0.95 quantile. c, Scatterplot of cancer types comparing our estimate of detectable CIN (Supplementary Methods) to estimates reported in the Mitelman database. d,e, Boxplots comparing our estimate of detectable CIN with aneuploidy score and four CNA-specific metrics. Boxes represent the interquartile range (IQR) with the median as a bold line. The whiskers extend to the largest/smallest value no further than 1.5 × IQR from the hinge. Outliers beyond the end of the whiskers are marked individually as points. Results of two-sided Welch’s t-tests are shown on top of the boxplots.
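The boxplot convention described in this legend can be illustrated with a minimal sketch. The function name `boxplot_stats` and the example values are hypothetical, and the quartile method used here (linear interpolation) is an assumption; plotting libraries may compute hinges slightly differently.

```python
import statistics


def boxplot_stats(values):
    """Summarise one group using the 1.5 x IQR whisker rule."""
    data = sorted(values)
    # Quartiles by linear interpolation (an assumption; hinge definitions vary).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Hypothetical example values, not data from the study.
example = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

In this toy input the value 100 lies beyond the upper fence, so it would be drawn as an individual point while the upper whisker stops at 9.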
Extended Data Fig. 2 Overview of copy number features and signature identification.
a, A schematic showing the 5 fundamental copy number features that were computed using 6,335 samples with detectable CIN (dCIN). Note that a feature capturing absolute copy number is not included in our method. b, A schematic showing how mixture modelling is used to split the genome-wide feature distributions into smaller components by either Variational Bayes Gaussian mixture models or Finite Poisson mixture models. The actual number of resulting components is listed below each feature distribution. These components represent basic building blocks of each feature distribution. c, An example of how the probability of a CNA belonging to a mixture component (posterior probability) is calculated and how these are summed. d, (Right) The resulting 43-dimensional feature vectors for each sample, after all posterior probabilities are summed for each component. (Left) A schematic of how the sum-of-posterior matrix for all 6,335 samples was split into two matrices by a Bayesian implementation of non-negative matrix factorisation (NMF), resulting in a signature catalogue and an activity catalogue.
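The sum-of-posterior step in panels c and d can be sketched as follows for a single feature modelled with Gaussian components. This is an illustrative sketch only: the component weights, means and standard deviations are made-up values, the function names are hypothetical, and Poisson components (used for some features) are not covered.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def posteriors(x, components):
    """Posterior probability of each (weight, mean, sd) component given x."""
    weighted = [w * gaussian_pdf(x, m, s) for w, m, s in components]
    total = sum(weighted)  # assumed non-zero for values near the components
    return [v / total for v in weighted]


def sum_of_posteriors(values, components):
    """Sum per-event posteriors so each component gets one entry per sample."""
    sums = [0.0] * len(components)
    for x in values:
        for k, p in enumerate(posteriors(x, components)):
            sums[k] += p
    return sums


# Hypothetical components (weight, mean, sd) and feature values for one sample.
components = [(0.5, 0.0, 1.0), (0.5, 10.0, 1.0)]
sample_vector = sum_of_posteriors([0.0, 0.1, 10.0], components)
```

Because each event contributes posteriors that sum to one, the entries of `sample_vector` add up to the number of events in the sample. Concatenating such vectors across all features would give the per-sample feature vector described in panel d.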
Extended Data Fig. 3 Schematic of the signature compendium identification.
a, From the complete input matrix, 10 pan-cancer signatures were identified. b, For the 20 cancer types with over 100 samples each, 128 cancer-type enriched signatures (CTES) were identified. c, All CTES that had a cosine similarity over 0.74 with any pan-cancer signature were removed. d, From each group of CTES that had cosine similarities over 0.74 to each other, the signature with activities in the largest number of samples was taken as the representative signature. e, We fitted each pair of pan-cancer signatures to each CTES by non-negative least squares; a CTES was removed if any pair reconstructed it with an error below 0.1. f, The sets of 10 pan-cancer signatures and 7 CTES were joined into a compendium of 17 signatures. g, Using linear combination decomposition, the signature activities were calculated for the 6,335 TCGA samples.
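The cosine-similarity filter in panel c can be sketched as below. The 0.74 threshold comes from the legend; the function names, signature names and toy three-component vectors are hypothetical, and the later steps (grouping of similar CTES and the non-negative least squares check) are not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def drop_redundant(candidates, references, threshold=0.74):
    """Keep candidates whose similarity to every reference is at most threshold."""
    kept = {}
    for name, vector in candidates.items():
        if all(cosine_similarity(vector, ref) <= threshold
               for ref in references.values()):
            kept[name] = vector
    return kept


# Toy signature definitions (hypothetical, three components each).
pan_cancer = {"P1": [1.0, 0.0, 0.0]}
ctes = {"A": [0.9, 0.1, 0.0], "B": [0.0, 1.0, 0.0]}
remaining = drop_redundant(ctes, pan_cancer)
```

Here candidate `A` is nearly parallel to `P1` and is dropped, while `B` is orthogonal to it and is retained.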
Extended Data Fig. 4 Signature interpretation matrix.
Displayed on the left are the five features, their mixture components and component means. The heatmap on the right shows the signature interpretation values, which combine information from the sum-of-posterior matrix, signature activity matrix and the signature definition matrix (Supplementary Methods). Only components that are positively correlated with signature activity levels are displayed. Interpretation values are normalised per feature and signature.
Extended Data Fig. 5 Monte Carlo simulation results for determining signature-specific noise thresholds.
a, Each plot (1 per signature) shows the interquartile range of sample signature activities after the introduction of noise in the copy number features using a Monte Carlo simulation. Samples are ordered by their observed signature activity (red line). b, Schematic showing how we fitted a Gaussian distribution to the simulated values of all samples with an observed signature activity of 0 (red line). The horizontal black line represents the 0.95 quantile of the fitted Gaussian and forms the basis of our signature-specific noise threshold, where values below this line are not distinguishable from 0. c, Plot of the signature-specific thresholds for the 17 copy number signatures.
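The thresholding idea in panel b can be sketched as follows, assuming the Gaussian is fitted by its sample mean and standard deviation. The function names and simulated values are hypothetical, and the Monte Carlo simulation that would produce the inputs is not shown.

```python
from statistics import NormalDist, mean, stdev


def noise_threshold(simulated_activities, quantile=0.95):
    """Fit a Gaussian to simulated activities and return its upper quantile.

    Assumes at least two values with non-zero spread.
    """
    mu = mean(simulated_activities)
    sigma = stdev(simulated_activities)
    return NormalDist(mu, sigma).inv_cdf(quantile)


def is_detectable(activity, threshold):
    """An activity counts as non-zero only if it exceeds the noise threshold."""
    return activity > threshold


# Hypothetical simulated activities for samples whose observed activity is 0.
simulated = [0.0, 0.01, 0.02, 0.03, 0.04]
threshold = noise_threshold(simulated)
```

Activities at or below `threshold` would then be treated as indistinguishable from 0 for that signature.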
Extended Data Fig. 6 Signature stability across different copy number profiling technologies.
Across the same set of 478 tumours, we compared the SNP6-array based copy number profiles and signatures to copy number profiles and signatures derived using different copy number profiling technologies. The columns contain results for the different technologies and the rows contain results for comparison between copy number profiles (top), signature activities (middle) and signature definitions (bottom, limited to pan-cancer signatures). For each comparison we show results for a range of penalties for ASCAT’s piecewise constant fitting or ASCAT.sc’s circular binary segmentation. (*): For settings marked with a star, it was not possible to derive solutions for K = 10; instead, the optimal value of K was chosen (lower than K = 10).
Extended Data Fig. 7 Workflow for determining signature aetiology and confidence rating.
a, Flowchart showing how an association between a mutated gene and signature activity was used to derive a hypothesis for a putative aetiology. b, Flowchart representing the decision making process leading to the assignment of a 3-star rating confidence score. c, Example of the star rating process for CX3.
Extended Data Fig. 8 Summary of associations between signatures and other covariates.
a, Main panel shows significant associations between copy number signatures and mutated genes. Gene annotations are summarised in the panels below. Boxes with a red line indicate significant associations that were not considered when determining signature aetiologies. In these cases, the enrichment was driven by amplification of a gene that also resided in an ecDNA amplicon; such amplification could be a consequence of the signature rather than a cause, potentially producing a spurious correlation with the amplification signatures (CX8, CX9, CX11, CX13). b, Each row shows highly significant associations between signatures and different covariates. Unless otherwise specified, only positive correlations are shown.
Extended Data Fig. 9 Impaired homologous recombination signatures and their associations.
a, Boxplots summarise signature activities of different patient groups (rows) defined by their driver gene mutation status. Ovarian samples are coloured in dark green and breast in orange. Boxes represent the interquartile range (IQR) with the median as a bold line. The whiskers extend to the largest/smallest value no further than 1.5 × IQR from the hinge. Outliers beyond the end of the whiskers are marked individually as points. Significance was tested with a two-sided Welch’s t-test between WT BRCA1/2 and each of the categories and corrected for multiple testing using the Benjamini-Hochberg method. Statistically significant comparisons are shown to the right of the boxplots with stars denoting significance (q < 0.05) and arrows denoting the two groups used for the statistical test. (BRCA1/2 = BRCA1 and BRCA2, WT = wild type; LOH = loss of heterozygosity). b, Boxplots (with same characteristics as in a) summarise the scaled signature activities of 5,466 TCGA samples split by low, medium and high cell cycle scores. The brackets and stars (q < 0.05) show where there was a significant increase from low to medium to high cell cycle groups, tested with a Welch’s t-test and corrected for multiple testing with the Benjamini-Hochberg method. c, Volcano plots showing the results of a correlation between signature activity and expression of genes involved in nucleotide excision repair (NER). Each dot represents a gene; coloured dots show significant correlations. d, Spearman correlation coefficient (y-axis) of correlation between signature activities and seven common metrics of HRD (listed at top). Individual coefficients are displayed for impaired homologous recombination (IHR) signatures and the distribution of coefficients from remaining signatures is represented by boxplots (with same characteristics as in a).
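The Benjamini-Hochberg correction mentioned in this legend converts a set of P values into q values. The following is a minimal sketch of the standard step-up procedure; the function name and the example P values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (q values) in input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Hypothetical P values from four comparisons.
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
significant = [value < 0.05 for value in q]
```

With these toy inputs only the first comparison passes the q < 0.05 cut-off used for the stars in the figure.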
Extended Data Fig. 10 Performance of classifiers for predicting platinum sensitivity.
a, Kaplan-Meier estimator showing the overall survival probabilities of TCGA ovarian cancer patients split into two groups using our CX3/CX2 classifier. b, Hazard ratios and their 95% confidence interval obtained from a Cox proportional hazards model trained on our CX3/CX2 classification predicting overall survival of TCGA ovarian cancer patients. The model also corrected for age and cancer stage of the patients. P-value represents the significance of a Wald test. c,d, Median survival and hazard ratios generated for five cancer cohorts from the TCGA, PCAWG and ICGC projects using predictions from three classifiers (our CX3/CX2 classifier, HRDetect and Myriad myChoice based on the HRD score). Improvements in median survival tested by log-rank test (Kaplan-Meier survival analysis), with the minus symbol representing the predicted resistant group and the plus symbol the predicted sensitive group. Hazard ratios, their 95% confidence interval, and Wald test significance of the predicted sensitive group compared to the predicted resistant group are obtained from Cox proportional hazards models correcting for stage and age of patients, except for HRDetect where tumour stage was omitted as the models did not converge if included. The number and proportion of patients predicted to be sensitive (with HRD) and resistant (without HRD) by each classifier are listed on the right.
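The Kaplan-Meier estimator and the median survival reported in panels a and c can be illustrated with a small sketch. The function names and the follow-up data are hypothetical; the Cox proportional hazards models and log-rank tests are not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below, else None."""
    for time, survival in curve:
        if survival <= 0.5:
            return time
    return None


# Hypothetical follow-up times (months) and event indicators.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
median = median_survival(curve)
```

Computing one such curve per predicted group (sensitive versus resistant) gives the two curves compared in panel a.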
Supplementary information
Supplementary Methods
This file contains full descriptions of all methods, links to software, additional results and Supplementary Tables 1–13.
Supplementary Figures
This file contains Supplementary Figs. 1–54.
Supplementary Tables
This zip folder contains Supplementary Tables 14–64 and their accompanying legends.
About this article
Cite this article
Drews, R.M., Hernando, B., Tarabichi, M. et al. A pan-cancer compendium of chromosomal instability. Nature 606, 976–983 (2022). https://doi.org/10.1038/s41586-022-04789-9
DOI: https://doi.org/10.1038/s41586-022-04789-9