a, Size distribution of deletions per histology group, with tumour types ordered according to total number of events seen. Vertical dashed lines represent the two prominent modes. b, Size distribution of segments of templated insertion per histology group. For each tumour type, the three distributions for cycles, bridges and chains of templated insertions are superimposed. Ins, insertion. c, Associations between a subset of the genomic properties (rows) and classes of structural variant (columns). Each density curve represents the quantile distribution of the genomic property values at observed breakpoints compared to random genome positions. Asterisks indicate a significant departure from uniform quantiles after multiple hypothesis correction on a one-sided Kolmogorov–Smirnov test based on a sample size of 2,559 genomes containing structural variants: *false-discovery rate < 0.01, **false-discovery rate < 0.001, ***false-discovery rate < 10⁻⁶. Cells with significant property associations are shaded by the magnitude of the shift of the median observed quantile above (blue) or below (red) 0.5. The interpretation of each property from left to right is indicated by the axes to the right of the property label. Complex uncl, complex clusters unclassified; cplxy, chromoplexy; del, deletion; inv, inversion; ins, insertion; LAD, lamina-associated domain; recip, reciprocal; TAD, topologically associated domain; TD, tandem duplication; trans, translocation; unbal, unbalanced. d, Rearrangement counts as a function of bases of junction microhomology, fit to three linear functions consistent with different formation mechanisms. NHEJ, non-homologous end joining; MMEJ, microhomology-mediated end joining; SSA, single-strand annealing. e, Enrichment or depletion of breakpoint junctions between regions of the genome with particular annotations, compared with a permuted background that preserves breakpoint positions but swaps breakpoint partners. 
Centre points are the mean fold change over the permuted background; error bars represent three s.d. Analysis is based on a sample size of 2,559 genomes containing structural variants. LTR, long terminal repeat; SINE, short interspersed nuclear element; LINE, long interspersed nuclear element; heterochrom, heterochromatin.
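The permuted background described for panel e can be illustrated with a minimal sketch. It is not the analysis code behind the figure. It assumes each junction is reduced to the pair of annotations at its two breakpoints, and the function names (`pair_counts`, `partner_swap_enrichment`) and the way the fold change and spread are computed are hypothetical choices made for illustration only.

```python
import random
import statistics
from collections import Counter


def pair_counts(left, right):
    """Count junctions per unordered pair of annotations."""
    return Counter(tuple(sorted(pair)) for pair in zip(left, right))


def partner_swap_enrichment(junctions, n_perm=1000, seed=0):
    """Compare observed annotation pairs with a partner-swapped background.

    junctions: list of (annotation_a, annotation_b), one tuple per junction.
    Returns {pair: (fold_change, three_sd)} for pairs seen in the data.
    Assumption: fold change is observed count / mean permuted count, and the
    spread is three s.d. of the permuted counts scaled by that mean.
    """
    rng = random.Random(seed)
    left = [a for a, _ in junctions]
    right = [b for _, b in junctions]
    observed = pair_counts(left, right)

    permuted = {pair: [] for pair in observed}
    for _ in range(n_perm):
        # Every breakpoint keeps its annotation (its position is preserved);
        # only the pairing between breakpoints is shuffled.
        shuffled = right[:]
        rng.shuffle(shuffled)
        counts = pair_counts(left, shuffled)
        for pair in observed:
            permuted[pair].append(counts.get(pair, 0))

    result = {}
    for pair, obs in observed.items():
        background = statistics.mean(permuted[pair])
        if background == 0:
            continue  # fold change is undefined without background counts
        spread = 3 * statistics.stdev(permuted[pair]) / background
        result[pair] = (obs / background, spread)
    return result
```

A fold change above 1 in this sketch indicates that the two annotations are joined more often than expected from random partner assignment, and a value below 1 indicates depletion. Pairs that never occur in the observed data are not reported here, so complete depletion would need separate handling.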