Comprehensive analysis of chromothripsis in 2,658 human cancers using whole-genome sequencing

Chromothripsis is a mutational phenomenon characterized by massive, clustered genomic rearrangements that occurs in cancer and other diseases. Recent studies in selected cancer types have suggested that chromothripsis may be more common than initially inferred from low-resolution copy-number data. Here, as part of the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA), we analyze patterns of chromothripsis across 2,658 tumors from 38 cancer types using whole-genome sequencing data. We find that chromothripsis events are pervasive across cancers, with a frequency of more than 50% in several cancer types. Whereas canonical chromothripsis profiles display oscillations between two copy-number states, a considerable fraction of events involve multiple chromosomes and additional structural alterations. In addition to non-homologous end joining, we detect signatures of replication-associated processes and templated insertions. Chromothripsis contributes to oncogene amplification and to inactivation of genes such as mismatch-repair-related genes. These findings show that chromothripsis is a major process that drives genome evolution in human cancer.


Summary
Chromothripsis is a newly discovered mutational phenomenon involving massive, clustered genomic rearrangements that occurs in cancer and other diseases. Recent studies in cancer suggest that chromothripsis may be far more common than initially inferred from low resolution DNA copy number data. Here, we analyze the patterns of chromothripsis across 2,658 tumors spanning 39 cancer types using whole-genome sequencing data. We find that chromothripsis events are pervasive across cancers, with a frequency of >50% in several cancer types. Whereas canonical chromothripsis profiles display oscillations between two copy number states, a considerable fraction of the events involves multiple chromosomes as well as additional structural alterations. In addition to non-homologous end-joining, we detect signatures of replicative processes and templated insertions. Chromothripsis contributes to oncogene amplification as well as to inactivation of genes such as mismatch-repair related genes. These findings show that chromothripsis is a major process driving genome evolution in human cancer.

INTRODUCTION
Chromothripsis is a mutational phenomenon characterized by massive genomic rearrangements, often generated in a single catastrophic event and localized to isolated chromosomal regions [1][2][3][4]. In contrast to the traditional view of tumorigenesis as a gradual Darwinian process of progressive mutation accumulation, chromothripsis provides a mechanism for the rapid accrual of hundreds of rearrangements in a few cell divisions.
This phenomenon has been studied in primary tumors of diverse histological origin 5-10 , but similar random joining of chromosomal fragments has also been observed in the germline 11 . There has been considerable progress in elucidating the mechanism by which chromothripsis may arise, including fragmentation and subsequent reassembly of a single chromatid in aberrant nuclear structures called micronuclei 2,12 , as well as fragmentation of dicentric chromosomes in telomere crisis 13,14 . Chromothripsis is not specific to cancer, as it can cause rare congenital human disease and can be transmitted through the germline 11,15 ; it has also been described in plants, where it has been linked to micronucleation 16 . However, despite the recent rapid progress on chromothripsis, much remains to be discovered regarding its cause, prevalence, and consequences.
A hallmark of chromothripsis is multiple oscillations between two or three copy number (CN) states 1,6 . Applying this criterion to copy number profiles inferred from SNP arrays, chromothripsis was initially estimated to occur in at least 2-3% of human cancers and in ~25% of bone cancers 1 . Subsequent studies of large array-based datasets gave similar frequencies: 1.5% (124 of 8,227 tumors across 30 cancer types) 17 and 5% (918 out of 18,394 tumors across 132 cancer types) 18 , with the highest frequencies detected for soft-tissue tumors (54% for liposarcomas, 24% for fibrosarcomas, and 23% for sarcomas) 18 . These estimates relied on the detection of copy number oscillations that are more densely clustered than expected by chance, e.g., at least 10 adjacent CN oscillations in medulloblastomas 8 .
Whole-genome sequencing (WGS) data provide a greatly enhanced view of structural variations (SVs) in the genome, with breakpoints identified at single-nucleotide resolution. They also provide information on the rearranged DNA sequence, which can be used to determine the type of SVs (e.g., deletion, insertion, inversion) and to infer the likely repair mechanism for joining fragments (e.g., based on the degree of microhomology, or the presence or absence of small insertions at the breakpoints). With the higher spatial resolution and additional information provided by WGS data, it is possible to postulate a more nuanced set of criteria for chromothripsis and enhance detection specificity 3 . Our earlier analysis of WGS data from cutaneous melanomas already found chromothripsis-like rearrangements in 38% (45 out of 117 patients) 10 ; other studies based on WGS found 60-65% 5 for pancreatic cancer, and 32% for esophageal adenocarcinomas 7 . Whether these examples are outliers reflecting the unique biology of these tumors, or whether they reflect a more general underestimation of the frequency of chromothripsis, remained unclear.
Motivated by the importance of chromothripsis during tumor evolution and the need for more systematic and comprehensive analysis, we sought to determine the frequency and spectrum of chromothripsis events in the WGS data for 2,658 cancer patients spanning 39 cancer types, available through the International Cancer Genome Consortium (ICGC). In addition to deriving more accurate per-tumor type prevalence of chromothripsis, we determine the size and genomic distribution of such events, examine their role in amplification of oncogenes or loss of tumor suppressors, describe their relationship to genome ploidy, and investigate whether their presence is correlated with patient survival. Our chromothripsis calls can be browsed at the accompanying website: http://compbio.med.harvard.edu/chromothripsis/.

Prevalence of chromothripsis across cancer types
We first sought to formulate a set of criteria for identifying chromothripsis events with varying complexities (Fig. 1a). The generally acknowledged model of chromothripsis posits that some of the DNA fragments generated by the shattering of the DNA are lost; thus, copy number oscillations between two or three states 1,6 are an obvious first criterion (Fig. 1a). Such deletions also lead to interspersed loss of heterozygosity (LOH), or altered haplotype ratios if there is only a single copy of the parental homolog of the fragmented chromatid. Although chromosome shattering and reassembly has been demonstrated to experimentally generate chromothripsis 2 , template-switching DNA replication errors can generate a similar pattern 19 . Indeed, the shattering and replication error models are not mutually exclusive and could co-occur 2 . Therefore, for the discussion below we will refer generally to "chromothripsis" as encompassing both classes of models.
To detect chromothripsis from WGS data, we developed ShatterSeek (a detailed description of the algorithm and its performance is provided in Online Methods and Supplementary Note). A key feature of our method is to identify clusters of breakpoints belonging to SVs that are interleaved, i.e., the regions bridged by their breakpoints overlap instead of being nested (Fig. 1), as is expected from random joining of genomic fragments. This encompasses the many cases that do not display simple oscillations, e.g., partially oscillating CN profiles with interspersed amplifications, and oscillations spanning multiple CN levels due to aneuploidy 5,20 . Rearrangements in chromothripsis should also follow a roughly even distribution for the different types of fragment joins (i.e., duplication-like, deletion-like, head-to-head and tail-to-tail inversions, depicted in blue, orange, black, and green, respectively, in Fig. 1a and throughout the manuscript) and have breakpoints randomly distributed across the affected region 1-3 . Finally, by criteria to be described below, we also use interchromosomal SVs to identify chromothripsis events involving multiple chromosomes.
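The interleaving criterion can be illustrated with a short sketch. This is not the ShatterSeek implementation: the function names and the representation of each intrachromosomal SV as a (start, end) pair of breakpoint coordinates are assumptions made for illustration only. Two SVs are treated as interleaved when their spans partially overlap without one being nested in the other, and clusters are taken to be the connected components of that relation.

```python
from itertools import combinations


def is_interleaved(sv_a, sv_b):
    """True if two SV spans partially overlap (neither nested nor disjoint)."""
    # Order each SV as (start, end), then order the pair by start coordinate.
    (a1, a2), (b1, b2) = sorted([tuple(sorted(sv_a)), tuple(sorted(sv_b))])
    # b must start strictly inside a and end strictly after a ends.
    return a1 < b1 < a2 < b2


def interleaved_clusters(svs):
    """Group SVs into connected components of the 'interleaved' relation."""
    parent = list(range(len(svs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(svs)), 2):
        if is_interleaved(svs[i], svs[j]):
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(svs)):
        clusters.setdefault(find(i), []).append(svs[i])
    # Largest cluster first.
    return sorted(clusters.values(), key=len, reverse=True)


# Hypothetical coordinates: three chained, interleaved SVs plus a nested pair.
example_svs = [(100, 500), (300, 800), (600, 1000), (2000, 2100), (2020, 2050)]
clusters = interleaved_clusters(example_svs)
```

In this toy input the first three SVs form one cluster, whereas the nested pair does not, which mirrors the distinction between interleaved and nested rearrangements drawn above. A real caller would also need to handle interchromosomal SVs and the remaining statistical criteria.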
After removing low-quality samples using stringent quality control criteria, we applied our chromothripsis detection method to 2,543 tumor-normal pairs spanning 37 cancer types (Supplementary Table 1 and Online Methods). 2,428 cases harbored SVs and were considered for further analysis. To tune the parameters in our method, we used statistical thresholds and visual inspection. For the minimum number of oscillating CN segments, we used two thresholds: 'high-confidence' calls display oscillations between two states in at least 7 adjacent segments, whereas 'low-confidence' calls involve between 4 and 6 segments ( Fig. 1b and Supplementary Note). The analyses described in the following sections were performed using the high-confidence call set unless otherwise stated.
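The two confidence tiers can be expressed as a simple rule on the number of adjacent oscillating segments. The sketch below assumes a simplified definition, namely the longest run of adjacent segments that strictly alternate between two CN values; the function names are hypothetical and the way ShatterSeek actually counts oscillations may differ.

```python
def longest_two_state_oscillation(cn_segments):
    """Length of the longest run of adjacent segments alternating between two CN states."""
    best = 0
    n = len(cn_segments)
    for start in range(n - 1):
        a, b = cn_segments[start], cn_segments[start + 1]
        if a == b:
            continue
        length = 2
        # Extend the run while the pattern a, b, a, b, ... continues.
        while start + length < n and cn_segments[start + length] == (a if length % 2 == 0 else b):
            length += 1
        best = max(best, length)
    return best


def confidence_tier(cn_segments):
    """Map the oscillation run length onto the thresholds given in the text."""
    run = longest_two_state_oscillation(cn_segments)
    if run >= 7:
        return "high-confidence"
    if 4 <= run <= 6:
        return "low-confidence"
    return "below threshold"


# Hypothetical segment-level copy numbers along one chromosome.
high = confidence_tier([2, 1, 2, 1, 2, 1, 2, 1, 3])
low = confidence_tier([2, 1, 2, 1, 2])
```

The segment count is only one of the criteria; in the approach described here it is combined with the SV-based tests before an event is called.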
We first focused on the 1,427 nearly-diploid genomes (ploidy ≤ 2.1; Supplementary Table 1), in which detection of chromothripsis is more straightforward. We defined as 'canonical' those events in which >60% of the CN segments in the affected region oscillate between two CN states (canonical events in polyploid tumors are described later). The frequency of canonical chromothripsis events is over 40% for multiple cancer types, such as glioblastomas (CNS-GBM, 50%; names in parentheses are the short-hand names agreed upon by the ICGC) and lung adenocarcinomas (Lung-AdenoCA, 40%; Supplementary Note). These numbers provide a lower bound for the frequency of chromothripsis events, but are nevertheless much higher than previous estimates 17,18 .
Overall, these results indicate much greater prevalence of chromothripsis in a majority of human cancers than previously estimated 10,17,18 . The immense variations we observe across tumor types cannot be explained by random fluctuations and must be linked to tumor biology.

Understanding the difference between our frequency estimates and previous ones
In accordance with some recent analyses that found a dramatically higher frequency of chromothripsis in specific tumor types 5,7 , our overall estimates are considerably higher than those in prior pan-cancer studies. This is in large part because previous pan-cancer studies were based on array-based technologies. Additional factors include improvements in SV detection and more refined criteria for defining chromothripsis. In the Supplementary Note, we have compiled the criteria used in 26 major chromothripsis-related studies published to date; many did not describe their approach precisely, and their code was not publicly available.
To better understand the discrepancy between WGS-based studies, we carried out a detailed comparison on the same datasets that have been analyzed previously. For 109 prostate adenocarcinomas in Fraser et al. 21 that were also part of the ICGC dataset, the original authors used ShatterProof 22 (the only publicly available algorithm that uses CNV/SV calls as input) and found chromothripsis in 21% of the tumors (23/109). When we re-applied the same algorithm (with the same parameters) but using our CNV/SV calls, the fraction of chromothripsis cases more than doubled to 45% (49/109). This indicates that a major reason for the lower frequencies in the past may be the lower sensitivity of some previous SV detection approaches. SV detection remains a challenging problem, especially for low-purity tumors, and algorithms differ substantially in their sensitivity and specificity. Note that the SV calls we used were compiled by the ICGC SV group, with each variant requiring consensus from at least two of four algorithms 23 .
Applying our own chromothripsis algorithm (ShatterSeek), we identified 11 additional cases for a total of 55% (60/109). Compared to the 23 cases reported by Fraser et al. 21 , we missed 4. The missed events are focal events comprising fewer than 6 SVs, the minimum number we require in order to avoid a high false positive rate; the detected regions appeared to be hypermutated regions characterized by tandem duplications or deletions. Visual inspection of the cases that we detect but that were missed by Fraser et al. reveals that the differences in the rates are indeed mostly due to the lower sensitivity of their SV calls (see Supplementary Note for an in-depth comparison). ShatterSeek increases sensitivity by incorporating cases that display more complex patterns of oscillations and interchromosomal SVs, while keeping specificity high by imposing additional criteria on breakpoint homology to remove tandem duplications and events arising from breakage-fusion-bridge (BFB) cycles. Lastly, we also compared our method against ChromAL 5 for 76 pancreatic tumors. Both ChromAL and ShatterSeek detect chromothripsis in the same 41 tumors (54%).
Thus, our hypothesis that we have not over-estimated the chromothripsis frequencies is supported by the following: (i) some tumor types such as thyroid, CLL, and pilocytic astrocytomas give no events; (ii) diploid tumors, which give simpler configurations that are easier to reconstruct or verify visually, give high frequencies; (iii) the cases were divided into high- and low-confidence sets, and only the high-confidence ones were used for the final estimates; (iv) more sensitive CNV/SV calls result in higher frequencies for the same datasets; and (v) our estimates are in agreement with very recent analyses of specific tumor types. These results reinforce the high prevalence of chromothripsis in human cancers.

Frequent involvement of interchromosomal SVs
An important feature of our approach is the incorporation of interchromosomal SVs to Distinguishing these scenarios requires a case-by-case analysis that is beyond the scope of this study. However, it is likely that both mechanisms contribute to the generation of chromothripsis involving multiple chromosomes.

Size and complexity of chromothripsis events are highly variable
Chromothripsis events span a wide range of genomic scales, with the number of breakpoints involved varying by three orders of magnitude within some tumor types (Supplementary Fig. 1c). We find tumors in which relatively focal chromothripsis events, usually a few Mb in size, take place against the backdrop of an otherwise quiet genome (lower-right quadrant in Fig. 2d). Although focal, these events can lead to the simultaneous amplification of multiple oncogenes located on different chromosomes (Supplementary Figs. 3c-e, 4a-c). Other focal events co-localize with other complex events in highly rearranged genomes (lower-left quadrant in Fig. 2d).
Those events spanning large genomic regions comprise tens to hundreds of SVs, affecting anywhere from one chromosome arm to more than 10 chromosomes (Fig. 2a-c). We observe 47 tumors harboring >200 rearrangements, of which at least 50% belong to chromothripsis regions (upper-right quadrant in Fig. 2d). Overall, our analysis reveals greater heterogeneity of chromothripsis patterns than previously appreciated, both in terms of the number of SVs and chromosomes involved.

Relationship between chromothripsis and aneuploidy
Newly established polyploid cells have high rates of mitotic errors that generate lagging chromosomes 24,25 , which in turn are linked to chromothripsis in medulloblastomas and in vitro 2,12,14 . However, a direct causal link between polyploidy and chromothripsis has not been established, and the frequency with which polyploidy and chromothripsis are associated in cancer genomes has not been comprehensively assessed. To examine the sequence of events clearly, we focused on the canonical cases in polyploid tumors, where we can infer whether chromothripsis occurred before or after polyploidization 26 .
For example, if the CN oscillation occurs between 2 and 4 copies in a tetraploid tumor, we infer that polyploidization occurred after chromothripsis; on the other hand, if the oscillation occurs between 3 and 4 copies, we infer that polyploidization occurred first 26 (Supplementary Figs. 1-2, 4d, 5 and Supplementary Note). Among the canonical cases (~57% of all chromothripsis events), 68% occur in nearly diploid genomes (n = 1,648) and 32% in polyploid tumors (n = 748; ploidy ≥ 2.5). Of the 163 cases in which we can distinguish the sequence of events, 74% show chromothripsis after polyploidization while the remaining 26% show chromothripsis before polyploidization. This suggests that a larger fraction of the canonical chromothripsis events in polyploid tumors are late events.
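The timing rule for the tetraploid example can be written as a small lookup. The sketch below covers only the two patterns stated above (oscillation between 2 and 4 copies, or between 3 and 4 copies); the function name and the "undetermined" fallback are assumptions for illustration, and real cases would also require allele-specific information.

```python
# Oscillation states in a tetraploid tumor mapped to the inferred order of events.
TETRAPLOID_PATTERNS = {
    (2, 4): "chromothripsis before polyploidization",
    (3, 4): "chromothripsis after polyploidization",
}


def infer_timing(cn_segments):
    """Infer event order from the set of CN states seen in an oscillating region."""
    states = tuple(sorted(set(cn_segments)))
    # Any pattern other than the two described in the text is left unresolved.
    return TETRAPLOID_PATTERNS.get(states, "undetermined")


early = infer_timing([4, 2, 4, 2, 4])  # single-copy loss, then doubling: 1 -> 2, 2 -> 4
late = infer_timing([4, 3, 4, 3, 4])   # doubling first, then single-copy loss: 4 -> 3
```

The rationale is that a segment lost before genome doubling ends up at half the baseline copy number, whereas a segment lost after doubling is only one copy below the baseline.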
After correcting for tumor type using logistic regression, we estimate that the odds of chromothripsis occurring in a polyploid tumor are 1.5 times larger than in a diploid tumor on average (95% CI 1.20-1.85; P < 10^-3; cases with ploidy ≥ 2.5). Although polyploidy is associated with a higher incidence of chromothripsis, this may be primarily due to the presence of increased genomic material in polyploids. Polyploidy, on the other hand, could reduce the sensitivity of CNV and SV detection (due to lower sequence coverage per copy), and might make it easier for the cell to lose the highly rearranged copy when intact copies are present 27 .
Co-localization of APOBEC-mediated kataegis and rearrangements has been reported for multiple cancer types 30,31 , and has been linked to double-strand break resection during break-induced replication 32 . To study the relationship between kataegis and chromothripsis, we examined the presence of clusters of APOBEC-induced mutations within the chromothripsis regions (Online Methods). Excluding melanoma samples (due to the overlap between the APOBEC and ultraviolet light signatures 33 ), we find that 30% Bone-Liposarc tumors, suggesting that they occurred at late stages of tumor development, likely after chromothripsis (Fig. 3e). Overall, although kataegis can co-occur with chromothripsis, this co-occurrence is not common. This is consistent with recent data showing that chromothriptic derivative chromosomes are mostly assembled by end-joining mechanisms that do not involve extensive DNA end resection 34 .

TP53 mutation status and chromothripsis
Inactivating TP53 mutations have previously been associated with chromothripsis in medulloblastomas 8 and in pediatric cancers 35,36 . TP53-deficient cells serve as models to generate chromothripsis in vitro 2,14 . Nevertheless, the relationship between deleterious TP53 mutations and chromothripsis has not been examined comprehensively. In our data, ~38% of the patients with inactivating TP53 mutations show chromothripsis, whereas only 24% of those with wild-type TP53 exhibit chromothripsis (Fig. 2e). After correcting for cancer type, this translates to an odds ratio of 1.54 (95% CI 1.21-1.95; P < 10^-3) for chromothripsis in patients with TP53 mutations compared to TP53 wild-type cases, reinforcing the notion that TP53 mutations are associated with a higher incidence of chromothripsis. However, we note that 398 (58%) of the patients in our cohort exhibiting chromothripsis show neither TP53 mutations nor amplification of MDM2 (the major cellular regulator of TP53 by ubiquitination 37 ), including those with massive cases in diploid genomes, e.g., DO25622 (Fig. 2b). This suggests that, although TP53 malfunction and polyploidy are predisposing factors for chromothripsis, it still occurs frequently in diploid tumors with proficient TP53.
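The reported odds ratio is adjusted for cancer type and therefore cannot be reproduced from the two percentages alone. For comparison, the unadjusted (crude) odds ratio implied by the rounded proportions quoted above, ~38% and 24%, can be computed directly. The sketch below is only a worked calculation on those rounded values; the function names are hypothetical.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1.0 - p)


def crude_odds_ratio(p_group_a, p_group_b):
    """Unadjusted odds ratio between two groups, given event proportions."""
    return odds(p_group_a) / odds(p_group_b)


# Rounded proportions from the text: TP53-mutant vs TP53 wild-type tumors.
crude_or = crude_odds_ratio(0.38, 0.24)
print(round(crude_or, 2))  # about 1.94
```

The crude value is larger than the adjusted estimate of 1.54, which is what one would expect if cancer type is associated with both TP53 status and chromothripsis frequency, although the rounded inputs make this comparison approximate.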

Signatures of repair mechanisms in chromothripsis regions
By examining the sequence homology at the breakpoints, it is possible to infer the predominant mechanisms responsible for the chromothripsis event 38,39 . Although this classification is not precise, it is helpful in providing an overview of mutational signatures. Previously, NHEJ has been implicated in the reassembly of the DNA fragments generated by chromothripsis 2,34 , whereas alternative end-joining (alt-EJ) has been proposed to occur in constitutional chromothripsis and in glioblastomas 15,40 . In addition, short templated insertions suggestive of microhomology-mediated break-induced replication (MMBIR) or alt-NHEJ associated with polymerase theta have been detected in chromothripsis originating from DNA fragmentation in micronuclei 2,41-43 .
We analyzed the breakpoints involved in canonical chromothripsis events showing evidence of interspersed LOH, as most SVs in such cases are chromothripsis-related (Fig. 1b). In 55% of these events, we only detected repair signatures concordant with NHEJ or alt-EJ (Supplementary Fig. 6). In 32%, we identified stretches of microhomology at two or more breakpoint junctions (most of them of 0-6 bp) and short insertions of 10-500 bp that map to distant locations within the region affected by chromothripsis (Supplementary Fig. 6). For instance, in the massive chromothripsis case
We also find features linked to replication-associated mechanisms in more complex rearrangements involving multiple chromosomes (Fig. 4). In a number of these events, LOH is observed in some chromosomes (e.g., Fig. 4b), but it is absent in others, where the oscillations occur at higher CN states without LOH (Fig. 4c,d). For instance, in the case reported in Fig. 4, there is evidence of templated insertions in chromosomes 5 and 13, which are linked to a chromothripsis event showing LOH in chromosome 1. Notably, the minor CN (i.e., the copy number of the allele with the lower number of copies) for the templated insertions in chromosome 13 is 1, whereas for the rest of the chromosome it is 0. This suggests that one parental chromosome served as a template and was later lost from the cell.
Overall, these results hint at the involvement of template switching events in the generation or repair of complex rearrangements, consistent with the observations of replicative processes in the formation of clustered rearrangements in congenital disorders and cancer 15,19,38,23,44 . Although further experimental evidence will be necessary, we postulate that the involvement of replication-associated mechanisms in the assembly of derivative chromosomes in chromothripsis might be substantial.

Oncogene amplification in chromothripsis regions
Evidence of double minute (DM) formation from chromothripsis has been reported for selected cancer types 1,2,8,40 . However, the extent to which chromothripsis contributes to DM formation has not been fully examined on a pan-cancer scale. Although reconstruction of DM structure with appropriate discordant reads would present clear evidence for its extrachromosomal nature, this proves to be too difficult in most cases. Therefore, we rely on copy number to make our inferences. We find that 15 patients (2% of tumors with chromothripsis) show CN oscillations between one low (CN ≤ 4) and one very high (CN ≥ 10) state, consistent with the presence of DMs 8,40 . We detect known cancer drivers in these putative DMs, including MDM2 (4 samples; Supplementary Figs. 3e and 4a, and Supplementary Table 2) and CDK4 (4 samples). These amplifications lead to increased mRNA levels: e.g., a 5 log2-fold increase for MDM2, NUP107, and CDK4 in a GBM sample (DO14049) compared to other GBM tumors. In chromothripsis regions subject to additional rearrangements, it is not always possible to discern using bulk sequencing data alone whether highly amplified segments are part of DMs or correspond to intrachromosomal gene amplification 45 . Furthermore, once a DM has formed, the derivative chromosome showing chromothripsis may be lost if it has no other tumor-promoting mutations. Therefore, the contribution of chromothripsis to the formation of extrachromosomal DNA bodies is likely to be higher than estimated here.
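The copy-number rule used to flag putative DMs can be sketched as follows. The CN thresholds (low state at most 4, high state at least 10) come from the text; the function name, the requirement of exactly two states, and the minimum number of state switches are assumptions made for illustration.

```python
def looks_like_double_minute(cn_segments, low_max=4, high_min=10, min_switches=2):
    """Flag a region whose segments alternate between one low and one very high CN state."""
    states = set(cn_segments)
    if len(states) != 2:
        return False
    low, high = sorted(states)
    if low > low_max or high < high_min:
        return False
    # Count transitions between adjacent segments (hypothetical minimum of two).
    switches = sum(1 for a, b in zip(cn_segments, cn_segments[1:]) if a != b)
    return switches >= min_switches


# Hypothetical segment-level copy numbers.
putative_dm = looks_like_double_minute([2, 25, 2, 25, 2])
moderate_gain = looks_like_double_minute([2, 6, 2, 6, 2])
```

As noted above, such a pattern is consistent with a DM but does not prove it, because bulk copy-number data alone cannot distinguish extrachromosomal from intrachromosomal amplification.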

Chromothripsis-mediated loss of tumor suppressors and DNA repair genes
Tumor suppressor genes either have a direct effect on cell growth or, like those involved in DNA repair, accelerate the rate of acquiring other growth-promoting or survival mutations 48 . Except for a few selected cancer types 5,49 , the extent to which chromothripsis contributes to the loss of these genes has not yet been thoroughly examined. We find that chromothripsis underlies 2.1% and 1.9% of the losses of tumor suppressors and DNA repair genes, respectively. These include MLH1 (9 patients out of 301 harboring Table 2). In 28 of these cases (28 genes in 24 tumors), both alleles were inactivated, one due to chromothripsis and the other due to an SNV. These include SMAD4, APC, TP53, and CDKN2A. In a biliary adenocarcinoma (Fig. 5), for instance, one MLH1 allele was lost due to chromothripsis and the other allele was likely silenced due to promoter hypermethylation, as evidenced by low expression of MLH1 and the microsatellite instability phenotype in an otherwise MMR-proficient tumor 50 . Overall, these data illustrate the way in which chromothripsis can confer tumorigenic potential through the loss of key tumor suppressors and DNA repair genes.

Chromothripsis is prognostic of poor patient survival
Chromothripsis has been associated with poor prognosis for several cancer types 5,8,9,51 . Here, the increased sensitivity of our approach and the larger cohort permit us to evaluate the impact of chromothripsis on patient survival in greater detail, and to determine the biological contexts in which chromothripsis leads to more aggressive tumors. We did not find a significant association between chromothripsis and survival in a multivariate pan-cancer analysis when stratifying the patients into two categories with respect to the presence or absence of chromothripsis (Wald and likelihood ratio tests

DISCUSSION
Our analysis of ca. 2,600 cancer genomes has revealed that chromothripsis plays a major role in shaping the architecture of cancer genomes across diverse human cancers, with prevalence and heterogeneity much higher than previously appreciated and marked variability across cancer types. Our approach enabled us to define more nuanced criteria to detect chromothripsis events, including those that involve multiple chromosomes and those that were hard to detect previously due to the presence of other co-localized rearrangements.
We note that the estimated frequencies of chromothripsis depend on the cut-off values used for statistical significance. We have tested various parameters and chose conservative thresholds, such as at least 7 CN segments oscillating between two copy number states for the high-confidence calls; however, we cannot exclude the possibility that some of these chromothripsis-like patterns might have arisen due to other sources of genomic instability. Conversely, it is also possible that we missed true chromothripsis events that have fewer than the required number of rearrangements; it is worth noting that such small-scale events are seen in experimentally generated chromothripsis 2 .
Cases in which chromothripsis is followed by other complex rearrangements that mask the canonical CN pattern are especially difficult to detect, requiring additional criteria and in-depth manual inspection. Despite these limitations, we believe that our statistical approach based on observed frequencies of various alterations compared to the background is more sensitive than a reassembly-based approach. The latter method attempts to reconstruct the steps that led to the observed SV pattern, but most complex events are too complicated, especially when many breakpoints are entirely missed and some are incorrectly identified due to inherent limitation of short-read data, imperfect SV algorithms, and insufficient sequencing coverage.
A substantial fraction of the chromothripsis events we detect show templated insertions and evidence of MMBIR. Although chromothripsis and chromoanasynthesis are considered to be two different processes, they often lead to similar SV and CN profiles, especially when occurring in aneuploid genomes, as oscillating CN profiles with interspersed LOH might be generated by replicative processes alone if the DNA polymerase skips over segments of the template. Moreover, there is experimental evidence that MMBIR and NHEJ can co-exist in chromothripsis induced in micronuclei 2 . Therefore, further experiments will be required to assess the interplay between DNA repair mechanisms in chromothripsis.
Given the pervasiveness of chromothripsis in human cancers and its association with poorer prognosis, another question that arises is whether chromothripsis per se constitutes an actionable molecular event amenable to therapy. This is particularly interesting given the link between aneuploidy, depleted immune infiltration, and reduced response to immunotherapy 52 . As more WGS data are linked to other data types including clinical information, it will become more feasible to understand the impact of chromothripsis on tumorigenesis and its potential as a biomarker for diagnosis or treatment.

Whole-genome sequencing data
We integrated whole-genome sequencing data from the TCGA and ICGC consortia into a common processing pipeline for 2,658 tumor and matched normal pairs across 39 cancer types; of these, 2,543 pairs spanning 37 cancer types passed our quality-control criteria and were selected for further analysis 53 . The list of samples is provided in Supplementary Table 1. Further information for all tumor samples and patients is given in 54 . Sequencing reads were aligned using BWA-MEM 0.7.8-r455, whereas biobambam 0.0.138 was used to extract unpaired reads and mark duplicates 55,56 .

Mutation calling
We utilized the consensus SNV and indel call sets released by the Pan-Cancer Analysis of Whole Genomes (PCAWG) project, and we used HaplotypeCaller 3.4-46-gbc0262554 to call single nucleotide polymorphisms (SNPs) in both tumor and matched normal samples following the GATK best practices guidelines. We only kept SNPs supported by at least 10 reads. We processed a total of 210,021 non-synonymous somatic mutations, of which 43,548 were predicted to be deleterious using the MetaLR score as implemented in Annovar 57 . To identify APOBEC mutagenesis, we followed the procedure previously described 33 . In brief, we considered as APOBEC-associated mutations those involving a change of (i) G within the sequence motif wGa to C or A (where w is A or T), and (ii) C within the sequence motif tCw to G or T (where w is A or T).
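The two motif rules are reverse complements of each other and can be checked with a few lines of code. The sketch below is a minimal illustration, not the procedure of ref. 33: the function name and its argument layout (the base immediately 5' of the mutated site, the reference base, the base immediately 3', and the alternate base, all on the reference strand) are assumptions.

```python
def is_apobec_mutation(upstream, ref, downstream, alt):
    """True if a substitution matches tCw>G/T or its reverse complement wGa>C/A (w = A or T)."""
    upstream, ref, downstream, alt = (s.upper() for s in (upstream, ref, downstream, alt))
    w = {"A", "T"}
    # Rule (ii): C within tCw mutated to G or T.
    if ref == "C" and upstream == "T" and downstream in w:
        return alt in {"G", "T"}
    # Rule (i): G within wGa mutated to C or A (same event read on the opposite strand).
    if ref == "G" and upstream in w and downstream == "A":
        return alt in {"C", "A"}
    return False


# Hypothetical trinucleotide contexts.
hit_forward = is_apobec_mutation("T", "C", "A", "T")   # tCa, C>T
hit_reverse = is_apobec_mutation("A", "G", "A", "C")   # aGa, G>C
miss_context = is_apobec_mutation("T", "C", "G", "T")  # tCg is not tCw
```

Identifying kataegis additionally requires clustering such mutations by inter-mutation distance, which is not covered by this check.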

Detection of SVs and SCNAs
The SVs were identified by the ICGC SV subgroup, which applied four algorithms and

RNA-seq data analysis
We processed RNA-seq data for a total of 162 normal and 1,268 tumor samples. Sequencing reads were aligned using TopHat2 and STAR 58,59 . HTSeq-count was subsequently used to calculate read counts for the genes encompassed in the PCAWG reference GTF set, namely Gencode v19. Counts were normalized to UQ-

Characterization of chromothripsis events using ShatterSeek
To identify and visualize chromothripsis-like patterns in the cancer genomes using CN and SV data, we extended the set of statistical criteria proposed by 3  Duplication-like SVs, deletion-like SVs, head-to-head and tail-to-tail inversions are depicted in blue, orange, black, and green, respectively. The value for the statistical criteria described above for each event is provided below its representation.

Survival analysis
We performed multivariate analysis using the Cox proportional hazards model corrected for confounding factors known to influence survival rates, namely: age at the time of diagnosis, sex, tumor stage, radiation therapy, the presence of metastasis, and cancer type. Survival analysis was conducted using the coxph function from the R package survival version 2.30. Significance was assessed by the likelihood ratio and Wald tests.
Tumor stages for all cancer types were manually curated and grouped into four categories. In the pan-cancer analysis, we considered only cancer types with clinical data available for at least 20 patients. The clinical data are listed in Supplementary Table 1.

Data availability
The code for calling chromothripsis events is available at https://github.com/parklab/ShatterSeek.

Acknowledgements
The results published here are partly based upon data generated by The Cancer

Competing Financial Interest
The authors declare no competing financial interests.

Samples were stratified into three categories based on the fraction of SVs that map to chromothripsis regions: absent (black; no chromothripsis), moderate ("mod."; red; the fraction of chromothripsis-related SVs is smaller than the median within the same cancer type), and predominant ("pred."; blue; the fraction of chromothripsis-related SVs is higher than the median within the same cancer type). Kaplan-Meier plots and estimated Hazard

Supplementary Tables
Supplementary Table 1