Mutational processes contributing to the development of multiple myeloma

Hoang, Phuc H.; Cornish, Alex J.; Dobbins, Sara E.; Kaiser, Martin; Houlston, Richard S.

doi:10.1038/s41408-019-0221-9

Download PDF

Article
Open access
Published: 06 August 2019

Mutational processes contributing to the development of multiple myeloma

Blood Cancer Journal volume 9, Article number: 60 (2019) Cite this article

5510 Accesses
41 Citations
25 Altmetric
Metrics details

Subjects

Abstract

To gain insight into multiple myeloma (MM) tumorigenesis, we analyzed the mutational signatures in 874 whole-exome and 850 whole-genome data from the CoMMpass Study. We identified that coding and non-coding regions are differentially dominated by distinct single-nucleotide variant (SNV) mutational signatures, as well as five de novo structural rearrangement signatures. Mutational signatures reflective of different principle mutational processes—aging, defective DNA repair, and apolipoprotein B editing complex (APOBEC)/activation-induced deaminase activity—characterize MM. These mutational signatures show evidence of subgroup specificity—APOBEC-attributed signatures associated with MAF translocation t(14;16) and t(14;20) MM; potentially DNA repair deficiency with t(11;14) and t(4;14); and aging with hyperdiploidy. Mutational signatures beyond that associated with APOBEC are independent of established prognostic markers and appear to have relevance to predicting high-risk MM.

Timing the initiation of multiple myeloma

Article Open access 21 April 2020

Functional dissection of inherited non-coding variation influencing multiple myeloma risk

Article Open access 10 January 2022

The mutagenic impact of melphalan in multiple myeloma

Article 19 May 2021

Introduction

Cancers have variable numbers of somatic mutations that have accumulated during the life history of the tumors as a consequence of diverse cellular processes, including defective DNA replication or DNA repair, and exposure to endogenous or exogenous DNA-damaging agents^1,2. Each of these processes results in mutational signatures, which serve as proxy for the cellular processes that have gone amiss. Mathematical deconvolution³ of these mutational signatures in large pan-cancer series has revealed multiple distinct signatures¹, several of which are associated with known etiologies, but many remain unexplained^1,4,5. Hence, studying the mutational signatures of cancers provides a mechanism for gaining insight into the etiological basis of tumor development.

Multiple myeloma (MM) is an incurable malignancy of plasma cells whose pathogenesis is only partially understood⁶. Approximately 40% of MM tumors harbor chromosome translocations leading to over-expression of oncogenes (including CCND1, CCND3, MAF, MAFB, WHSC1/MMSET, and FGFR3) through juxtaposition to the immunoglobulin heavy-chain locus⁶. Other tumors exhibit hyperdiploidy (HD), which is also considered to be an important initiating event⁶. Whole-exome sequencing (WES) and whole-genome sequencing (WGS) studies have so far identified over 40 driver genes that are recurrently altered in MM^6,7,8,9,10. However, the molecular mechanisms giving rise to these mutations are yet to be fully elucidated.

Here we report a comprehensive analysis of the mutation signatures of over 800 MM genomes. We identify major mutational signatures in MM reflective of three known principle mutational processes: aging^1,11,12, DNA repair deficiency^1,12–17, and activation-induced deaminase (AID)/apolipoprotein B editing complex (APOBEC) activity (signature 2, 9, and 13)^1,13,18,19. These mutational signatures show subgroup specificity and are reflective of the molecular mechanisms involved in tumorigenesis. Additionally, we show that information on mutational signatures beyond that associated with APOBEC has relevance to predicting patient prognosis and defining high-risk MM.

Results

Genome sequencing of multiple myeloma

To examine the diversity of mutational signatures, we analyzed overlapping WGS and WES data on 850 and 874 MM tumor-normal pairs, respectively, generated by the Relating Clinical Outcomes in Multiple Myeloma to Personal Assessment of Genetic Profile Study (CoMMpass, IA10 release). The frequency of the MM major subgroups—HD, t(11;14), t(4;14), t(14;16), t(14;20), and t(8;14) MYC translocation—is similar to other unselected series of patients who have been reported⁶ (Supplementary Table 1). We used the high-coverage WES data (120–150×, 136,074 single-nucleotide variants [SNVs]) to analyze coding regions and the low-coverage WGS data (6–12×, 1,348,881 SNVs and 44,155 structural variants [SVs]) to provide genome-wide insights into clonal mutations associated with early processes underlying tumorigenesis^7,20.

Mutational signatures in multiple myeloma

Application of non-negative matrix factorization (NMF)¹¹ (Supplementary Fig. 1) to extract de novo SNV mutational signatures did not identify any novel mutational signatures (Supplementary Figs. 2 and 3), consistent with a recent analysis on CoMMpass exome dataset²¹. Overall, a total 9 of the 30 mutational signatures referenced by COSMIC (Catalog of Somatic Mutations in Cancer) were seen at >1% mutational contribution in the WGS data (Supplementary Table 2); signature 1 related to aging¹; 2 and 13 to activity of the APOBEC family of cytidine deaminases; 9 to polymerase η implicated with the activity of AID during somatic hypermutation^1,18,19; signature 30 reflective of mismatch repair deficiency¹⁷; and signature 16 which has as yet an unknown etiology. We also extracted flat signatures 3, 5, and 8 in tumors - indicative of DNA repair deficiency (homolgous recombination deficiency and nucleotide repair deficiency)^1,13,14,15,^,^12–16. In view of the potential ambiguous assignment of these three signatures^22,23, we considered them collectively thereafter. We did however identify five novel de novo structural rearrangement signatures (RSs) (Fig. 1): RS1 (19% of SVs across samples)—characterized by non-clustered deletions, large-scale tandem duplications and inversions; RS2 (17%)—characterized by clustered translocations; RS3 (13%)—characterized by inversions; RS4 (21%)—characterized by non-clustered small-scale deletions and tandem duplications; RS5 (30%)—characterized by non-clustered translocations. We therefore focused on the nine major SNV and five de novo SV mutational signatures for subsequent analyses.

**Fig. 1: De novo structural rearrangements signatures.**

Following on from this, we examined the contributions of the nine major COSMIC SNV mutational signatures in both WES and WGS datasets. The signature profiles recovered from the analysis of clonal WES and exome-restricted WGS data were highly correlated (r = 1.00, Spearman’s correlation, Supplementary Fig. 4, Supplementary Table 3). Hence, while the average sensitivity to detect clonal SNVs from the WGS data is 20–35%⁷, these findings indicate that the mutational signatures identified by WGS are valid and representative of early mutational processes in MM. We also observed a high concordance of mutational signature in WES data from CoMMpass and that reported by Walker et al.⁸ (r = 0.86, Spearman’s correlation, Supplementary Fig. 5, Supplementary Table 4), reflecting the generalizability of our observations. No significant association between the major COSMIC SNV signatures and those associated with rearrangements was seen (Supplementary Table 5).

Influence of DNA replication and transcription on mutational signatures

The impact of DNA replication and transcription on mutational signatures was broadly consistent with observations previously made in the analyses of other cancers^11,12,24. Specifically, an overall increased mutation rate in late-replicating regions was shown (P < 1.0 × 10⁻⁴) (Fig. 2a, Supplementary Table 6), with the exception of signature 13 having higher mutation rate in early-replicating regions (P < 1.0 × 10⁻⁴, Supplementary Fig. 6, Supplementary Table 7), consistent with generalized replication time-dependent DNA damage mechanisms that operate in other cancers such as those of the breast¹² and liver¹¹. The difference in how replication timing influences mutation rates in signatures 2 and 13, which are both associated with APOBEC activity, suggests an intrinsically different mutational processes linked to DNA replication consistent with the model previously described¹².

**Fig. 2: Relationship between replication and transcription in mutational processes.**

Similarly, as previously documented, strong replicative strand asymmetry (>30% imbalances)¹² was shown with respect to signatures 2 (Q = 4.0 × 10⁻¹⁶) and 13 (Q = 4.0 × 10⁻¹⁶) with higher mutation in the lagging strand (Fig. 2b, Supplementary Table 8). These findings are consistent with APOBEC activity primarily affecting lagging strands.

Overall, increased mutation rate was associated with increased transcription, suggesting the mutagenic role of the transcriptional process in MM (Fig. 2c). This contrasts markedly to hepatocellular carcinoma¹¹, suggesting that transcription-associated mutagenesis may overwhelm transcription-coupled repair in MM²⁵. Moreover, strikingly elevated mutation rates of both SNVs and indels were shown for highly expressed genes (Fig. 2c). A number of these highly expressed genes (i.e., FPKM (fragments per kilobase of exons per million reads) >100), which are also frequently mutated, including EGR1²⁶, XBP1²⁷, BTG2²⁸, DDX5²⁹, and NFKBIA⁹ (Supplementary Table 9), have well-established roles in plasma cell differentiation and MM. The strong replicative, but weak transcriptional mutational asymmetry (Fig. 2d, Supplementary Table 10) seen in MM is consistent with the mutual exclusivity trend of replicative and transcriptional asymmetries shown in many cancers²⁴.

Mutational signatures in coding and non-coding regions

A significant difference in all mutational signatures within coding and non-coding regions was shown (Fig. 3, Supplementary Table 11), implying different genomic regions are subject to specific mutational processes, consistent with earlier observations³⁰. AID-attributed signature 9 predominates in non-coding regions, whereas exonic mutations are dominated by signatures 1, 2, and 13, implicating aging and APOBEC signatures as important.

Relationship between mutational signatures and kataegis

Local hypermutated regions of tumor genomes, or kataegis, have been observed in MM^9,31 and other B cell malignancies¹. We examined COSMIC mutational signatures contributing to kataegis (defined on the basis of average inter-mutation distance ≤1 kb;^3,32 Supplementary Table 12a), which were detected in 9% of samples (71/874). We did not observe significant and consistent enrichment of COSMIC signatures at kataegis foci compared to other mutations in tumors with and without kataegis detected (Supplementary Table 12b). We identified 70 genes disrupted by kataegis (Supplementary Table 13), including CCND1, CCND3, MAF, and FZD2, which are often affected by chromosomal rearrangements^6,33. Globally, 62% of kataegis foci co-localize with 5% of somatic structural arrangement sites (Supplementary Fig. 7), consistent with previous finding that most genomic rearrangements do not feature kataegis in nearby regions¹.

Mutational signatures and myeloma subgroups

We observed significant association between specific mutational signatures and MM subgroups (Table 1). Signature 1 was enriched in HD MM (Q = 3.2 × 10⁻⁴) consistent with the correlation between age and frequency of HD³⁴ (Supplementary Table 14). APOBEC-attributed signatures 2 and 13 were enriched in MAF-translocation subgroups—t(14;16) (Q = 1.7 × 10⁻¹⁵ and Q = 3.5 × 10⁻¹⁹, respectively), t(14;20) (Q = 1.4 × 10⁻³ and Q = 6.4 × 10⁻⁶, respectively)—and to a lesser extent in t(4;14) (only signature 2, Q = 9.3 × 10⁻⁶) consistent with previous reports^7,35. Flat COSMIC signatures, attributable to DNA repair deficiency, were enriched in t(11;14) MM (Q = 3.3 × 10⁻⁴) and t(4;14) MM (Q = 0.033). We observed an enrichment of non-clustered deletions, large-scale tandem duplications, and inversions RS1 (Q = 3.8 × 10⁻⁶); and clustered translocation RS2 (Q = 0.010) signatures in t(4;14) MM (Supplementary Table 15). Although speculative it is possible that the t(4;14) translocation, which leads to up-regulation of histone methyltransferase (MMSET), may affect genomic instability through some as yet undisclosed epigenetic mechanism.

Table 1 Association of major myeloma subgroups and mutational signature (Q < 0.05)

Full size table

We further explored the links between established prognostic mutational events (1p deletion, 1q gain, 17p deletion, and TP53 mutations) with mutational signatures (Supplementary Table 16). Associations between chromosome-arm events at 1p and 1q with COSMIC signatures 2, 13, and RS1 (Q < 0.05) and between TP53 mutations tumors with RS1 (Q = 0.033) and RS2 (Q = 7.4 × 10⁻³) raise the possibility of causal relationships.

Mutational signatures and driver genes

To identify etiological mutational processes underlying driver mutations in MM, we compared mutational contribution in driver genes to other exonic mutations. Overall, the same diversity of processes in driver mutations was seen as in other coding mutations, but with differences: lower contribution of signatures 2 and 13; and higher contribution of signatures 1, 9, 16, 30, and the flat signatures in coding regions of driver genes, compared to other exonic mutations (Fig. 4, Supplementary Table 17). Notably, we observed an over-representation of signatures reflective of aging with CCND1 and DNAH5 mutations, and AID with EGR1 mutations (Fig. 4, Supplementary Table 18). In contrast, a relative under-representation of signatures 2 and 13 suggests that APOBEC mutations are ubiquitous mutational processes and they do not specifically affect driver genes. Driver genes were replicated earlier than other coding genes (P < 2.2 × 10⁻¹⁶, Wilcoxon’s rank-sum test) and we therefore assessed whether this difference could explain enrichment of the signatures. APOBEC signature 2 is enriched in late-replicating regions (Supplementary Fig. 6, Supplementary Table 7); hence, the tendency of driver genes to be replicated early may explain the lower frequency of signature 2 mutations associated with driver genes. Signatures 1, 9, 16, 30, and the flat signatures were also associated with late-replicating regions (Supplementary Fig. 6, Supplementary Table 7), but conversely were more frequently associated with driver gene mutations. To test if the enrichment of mutational processes in driver genes were due to positive selection of certain mutations, we excluded all mutations that occurred at the exact same position in multiple tumors (46% of mutations) and repeated the analysis. Exclusion of recurrent mutations did not change the overall results, inferring that positive selection of specific mutations did not bias the analysis. We did not observe any significant transcriptional strand bias across mutational signatures (Fig. 2d), suggesting that the differences in mutational contribution between driver genes and other exonic mutations are unlikely to be influenced by transcription.

**Fig. 4: Mutational signatures associated with driver genes.**

Prognostic impact of mutational signatures

We next investigated the prognostic impact of mutational signatures using the prospective data from CoMMpass. The APOBEC signature has previously been reported to be associated with a worse patient outcome^21,35. In this study after adjusting for age, sex, translocation status, chromosome-arm events, and TP53 status, no statistically significant association was shown, suggesting that APOBEC status does not represent an independent biomarker of patient outcome: progression-free survival (PFS: hazard ratio [HR] = 2.45, 95% confidence interval [CI] = 0.94–6.37, P = 0.066) and overall survival (OS: HR = 2.81, 95% CI = 0.96–10.10, P = 0.10) (Supplementary Table 19). We next explored whether incorporating information on major SNVs and SV mutational signatures could further enhance the prediction of patient outcome after taking into account of established prognostic factors. Unsupervised hierarchical clustering provided evidence for seven distinct groups (A–G) associated with both PFS (log-rank P = 3.4 × 10⁻⁴) and OS (log-rank P = 0.011) (Fig. 4, Table 2, Supplementary Fig. 8), with group C being enriched for HD MM, group G is featuring tumors with 1p deletion, while group D being characterized by APOBEC mutation, enrichment for MAF-translocation subgroups, 1p deletion, and 1q gain (Supplementary Table 20). Post hoc delineation allowed us to stratify patients in seven groups into low- (A, B, C, and E) and high-risk groups (D, G, and F) (Supplementary Table 21). Classification of MM based on mutational signatures captured by these seven groups is independent prognosis factors (Supplementary Table 22). Notably, group F was independently associated with adverse prognosis (PFS: HR = 1.95, 95% CI = 1.35–2.81, P = 3.3 × 10⁻⁴ ; OS: HR = 1.47, 95% CI = 1.02–2.13, P = 0.039) (Supplementary Table 22), despite not being associated with the high-risk features of APOBEC, t(14;16)/t(14;20), 1p/1q/17p chromosome-arm events or TP53 mutation status, but was typified by non-clustered structural rearrangements (Fig. 5a, Table 2, Supplementary Fig. 8).

Table 2 Summary of characteristics of the seven cluster subgroups

Full size table

**Fig. 5: Integrative clusters based on mutational signatures and patient prognosis.**

Discussion

Our analysis of over 800 myeloma genomes has afforded a global overview of the mutational processes in MM tumorigenesis. A major finding of this study is that a combination of signatures linked to aging, APOBEC/AID, and indicative DNA repair deficiency—account for around 80% of mutations in MM. Despite the difficulty of assigning flat signatures (3, 5, and 8)^22,23, their detection of such profiles in large patient series supports the role of defective DNA repair in MM. By utilizing both WES and WGS data, we were able to extract five novel structural RSs and identify differential prevalent mutational processes in coding (aging and APOBEC) and non-coding regions (AID), consistent with a previous report³⁰. Our work supports previous findings³⁰ in implying an early role for AID in shaping the MM mutational landscape. We also identified new and validated previously reported subgroup associations with mutational signatures, allowing further categorization of MM beyond simple translocation status and providing additional insight in the etiological processes implicated in tumorigenesis (Fig. 6).

Mutations do not occur uniformly over the genome and local mutation rates are modulated by replication, transcription, and chromatin organization¹². We observed an enrichment of somatic mutations in late-replicating regions, as seen across several cancers³⁶, and highly expressed regions. Previous analyses, which have sought to establish the mutational profile of myeloma genomes, have been based on data solely from exome sequencing projects. Here we have sought to provide a more comprehensive analysis; however, we acknowledge that the low coverage of CoMMpass WGS raises the possibility that we may have underestimated the global mutation rate. The strong replicative asymmetry observed is consistent with mutations in MM being predominantly associated with APOBEC family of mutations²⁴. In addition, we identified that coding drivers are likely to be originated from a number of mutational processes, including aging and DNA repair deficiency. In contrast, while APOBEC enzymes appear to act more ubiquitously within coding regions, they do not specifically affect coding drivers.

The different MM translocation subgroups showed striking differences in their mutational signatures, reflective of the cellular processes driving respective clonal expansions (Fig. 6). As previously reported, t(14;16) and t(14;20) MM were enriched with APOBEC signatures 2 and 13^7,35. This is a consequence of the over-expression of APOBEC genes, specifically APOBEC3A and APOBEC3B, mediated through the over-expression of MAF transcription factors³⁵. The t(4;14) subgroup was also enriched with APOBEC mutational patterns, although only for signature 2 and, to a lesser extent, as compared to MAF-translocation subgroups. Since signatures 2 and 13 are reflective of different mutational processes¹², we speculate that the mutational processes associated with t(4;14) are likely to be different from those with MAF-translocation subgroups. In contrast, signatures indicative of DNA repair deficiency were associated with t(11;14) and t(4;14) and aging with HD. DNA breaks unsuccessfully repaired due to defective DNA repair may facilitate the generation of chromosomal translocations³⁷. Because of the flat structure of signatures 3, 5, and 8 robust insight into etiological contribution of DNA repair deficiency to MM tumorigenesis requires assiduous signature fitting and adjustment for confounding covariates²³. The molecular mechanisms responsible for initiating HD in MM are unknown. However, by inference from childhood acute lymphoblastic leukemia³⁸, it is likely it is a consequence of the simultaneous gain of chromosomes in a single abnormal cell division. Cells failing to execute programmed cell death in response to mitotic failure are likely to divide asymmetrically, resulting in the generation of aneuploidy cells³⁹. The association between aging with increased cell division errors⁴⁰ and decreased apoptosis⁴¹ further supports a relationship between HD MM and aging. Signatures defined by large-scale structural aberrations were associated to varying degrees with MM subgroups, but clustered translocations and non-clustered deletions, large-scale tandem duplications, and inversions showed a significant association in t(4;14) MM.

**Fig. 6: Contribution of major mutational processes operative in multiple myeloma.**

The APOBEC mutational signatures are inextricably linked to a high mutation load^7,35 and the adverse t(14;16) and t(14;20) MAF-translocation subgroups. We show that molecular classification based solely on APOBEC signatures do not fully differentiate the underlying genomic complexity in MM relevant to predicting patient outcome. Hence, while APOBEC activity is an adverse prognostic factor in MM^21,35, using it as a sole classifier does not fully capture high-risk MM, which with genetically unstable genome is typified by complex structural variants. Our findings support the need for considering other mutational signatures to refine prediction of patient prognosis.

Our study does, however, suggest that analysis of APOBEC activity together with other molecular features at diagnosis should allow for the identification of high-risk MM patients that may benefit from more intensive treatment. Collectively, these data shed new light on the diversity of cellular processes generating somatic mutations in MM. Moreover, they provide a strong rationale for integration of mutational signatures data in conventional molecular profiling of patient tumors to tailor therapy.

Materials and methods

Samples and datasets

All data analyzed were generated as part of the Multiple Myeloma Research Foundation (MMRF) CoMMpass Study (release IA10). WGS data on 850 matched tumor-normal baseline newly diagnosed bone marrow samples were downloaded from the database of Genotype and Phenotype (dbGaP). Matched tumor RNAseq processed by HTseq were used for gene expression analysis. WES variants (detected by at least two out of three variant callers—MuTect, Seurat, and Strelka) from 874 samples, RNAseq, copy number variation (CNV), clinical data, and sequencing-based fluorescent in situ hybridization (Seq-FISH) data (MMRF IA10 dataset) were downloaded from MMRF web portal (https://research.themmrf.org/). WES and WGS data were available for 824 samples.

Somatic mutation calling

Calling of somatic mutations was performed as described previously⁷. Briefly, raw WGS sequencing data were quality checked using FastQC (v.0.11.4) and aligned using the Burrows-Wheeler Alignment tool⁴² (BWA v0.7.12) to the human genome hg19/GRCh37 assembly. SNV mutations were called using MuTect⁴³ (v1.1.7) according to best practices, utilizing data from dbSNP v147 and COSMIC non-coding variants v77⁴⁴. Mutations were then filtered for oxidation artifacts⁴⁵ and by quality score as described previously⁷. Mutations mapping to immune hypermutated regions (429 immunoglobulin and the major histocompatibility complex loci, each region extended by 50 kb, as defined in Ensembl v73)⁴⁶ were excluded to avoid bias from mutation as a consequence of normal B cell development.

Determination of myeloma karyotype

Translocation status of MM tumors was based on Seq-FISH⁴⁷. HD was defined as amplification of 90% of the chromosome in at least two autosomes⁷. Prognostic chromosome-arm events (>1 Mb) were defined as deleted or amplified with abs(log₂ ratio) ≥0.1613 occurring at 1p12, 1p32.3, 1q21.1, 1q23.3, and 17p13⁶.

Mutational signatures

Characterization of the 30 COSMIC mutational signatures (http://cancer.sanger.ac.uk/cosmic/signatures) and de novo extraction of signatures was performed using Palimpsest^11,48 with default parameters. We compared de novo mutational signatures with 30 pre-defined COSMIC signatures by computing their cosine similarities¹. A de novo mutational signature was assigned to a COSMIC signature if the cosine similarity was >0.75 as previously advocated¹¹. If multiple COSMIC signatures passed this threshold, then the most similar COSMIC signature was assigned to the de novo signature. We compared proportion COSMIC mutational signatures between high-coverage WES clonal mutations (alternate allele ratio >0.9) and low-coverage WGS mutations restricted to exome regions, as well as between CoMMpass exome and Walker et al.⁸ exome mutations. Correlations were tested using Spearman’s correlation. For those signatures with an apparent flat profile we considered these in concert, by combining the respective contributions of signatures 3, 5, and 8.

We used MANTA to identify somatic SVs from the WGS data adopting default settings⁴⁹. We applied the same statistical framework used for signature analysis of SVs implemented in Palimpsest⁴⁸ to extract de novo RSs as previously described¹¹. Correlations between SV signatures and major COSMIC pre-defined SNV signatures (>1% mutational contribution in WGS) were tested using Spearman’s correlation. No significant correlation was seen after adjusting for multiple testing (i.e., Q > 0.05).

We examined the relationship between mutational signatures and clinico-pathological parameters confining our analysis to the major MM subgroups—HD, t(4;14), t(11;14), t(14;16), t(14;20), and t(8;14) MYC. Test of association between each signature and subgroups was based on a two-tailed Fisher’s exact test using Benjamini–Hochberg false discovery rate procedure to address multiple testing.

We compared contribution of each mutational signature to coding and non-coding regions using WGS data. To calculate contribution of a mutational signature to a genomic region, we first estimated the probability that each mutation was due to the process underlying each signature and calculated the cumulative probability of all mutations in each region, as per Letouze et al.¹¹. After computing these probabilities, regional differences in trinucleotide composition were accounted for when comparing the contribution of mutational signatures between two genomic regions (regions X and Y). Such normalization was conducted by changing the number of mutations from each mutational category in region X to that expected if the trinucleotide composition of region X was identical to the trinucleotide composition of region Y, assuming a constant rate of mutation at positions of each trinucleotide context. The normalized number of mutations $U_{\mathrm{norm}}^{C,X}$ of category C in region X was calculated as:

$$U_{\mathrm{norm}}^{{C},{X}} = U^{C,X}\frac{{V^{C,Y}{W}^X}}{{V^{{C},{X}}{W}^{Y}}},$$

where U^C,X is the number of mutations of category C observed in region X, V^C,X is the number of positions at which a mutation of category C can occur in region X, and W^X is the size of region X (in base pairs). As $U_{\mathrm{norm}}^{C,X}$ is not necessarily an integer, it is rounded to the closest integer before comparisons are completed. Mutation numbers were normalized within each tumor. Since small numbers of mutations may impact on normalization, in each comparison the larger region was designated as region X, the smaller region designated as region Y.

Replication timing and replication strand bias

We used replication sequencing (Repli-seq) data generated by the ENCODE consortium for the lymphoblast cell lines GM12878, GM06990, GM12801, GM12812, and GM12813 to define early- and late-replicating regions, as well as leading and lagging DNA strands using Repli-seq signal peaks from GM12801 as previously described^11,12. Mutation rates across deciles of replication timings were estimated globally using WGS data and for each signature, with each mutation assigned to a single signature by Palimpsest^11,48. The replication timing slope was estimated by linear regression model. To test the null hypothesis that the slope gradients equal zero, the replication timing deciles were permuted 10,000 times. Empirical P values were calculated as the fraction of permutations with absolute slope values at least as great as the absolute slope value computed using the true replication timing deciles.

Analysis of mutational replication strand bias between leading and lagging strands was performed across all 30 COSMIC signatures as previously described¹¹, using WGS data. The Wilcoxon’s rank-sum test was used to determine significant difference of mutational contribution from each COSMIC signature between leading and lagging strands. Levels of asymmetry were considered significant if strand imbalances were >30%¹² and Q < 0.05.

Transcriptional levels and transcriptional strand bias

To correlate mutational processes with gene expression, RNAseq data were normalized to FPKM¹¹. For each tumor, genes were partitioned into pentiles based on respective FPKM. Immunoglobulin-related genes and genes known to be highly upregulated in MM as a result of translocations (CCND1, CCND3, FGFR3, MMSET, MAF, MAFB, and MYC)⁶ were excluded to mitigate against bias. Mutation rates of genes within each of the five transcriptional level categories were estimated per tumor based on WES called mutations. Average alignability score for highly expressed genes was based on alignability of 75mers defined by the ENCODE/CRG GEM mappability tool⁵⁰. We examined mutation rates on transcribed and non-transcribed strands globally and for each signature as described previously¹¹ using Palimpsest^11,48. Wilcoxon’s rank-sum tests, corrected for multiple testing, were used to determine significant difference of mutational contribution from each COSMIC signature between transcribed and non-transcribed strands. Levels of asymmetry were again considered significant if strand imbalances were >30%¹² and Q < 0.05.

Kataegis

We restricted our kataegis analysis to high-coverage WES data, where we have sufficient coverage to detect local hypermutation. Kataegis foci were defined as having six or more consecutive mutations with an average mutational distance ≤1 kb, as previously described^3,32. Co-localization of kataegis and structural rearrangements was assessed based on the proportion of SV regions having kataegis foci residing within 10 kb. To examine enrichment of a mutational signature at kataegis regions, we compared mutational contribution of each signature across all mutations at kataegis foci with other mutations in tumors with and without kataegis being detected using Wilcoxon’s rank-sum test, corrected for multiple testing and imposed a threshold of Q < 0.05.

Association of mutational signatures with the mutation of driver genes

For SNV mutational signatures, Wilcoxon’s rank-sum tests were used to compare contribution of each mutational signature in coding drivers^7,8,9,10 and other exonic mutations, with normalizing for trinucleotide composition as described above. For each somatic mutation, we estimated the probability that it was the result of each mutational process considering the trinucleotide context and the number of mutations attributed to each process in the respective tumor as per Letouze et al.¹¹. We then compared, for each driver gene and mutational signature, the probability distribution in mutations affecting the driver gene as compared to all other mutations in tumors with and without the driver gene mutated using Wilcoxon’s rank-sum tests, imposing Benjamini–Hochberg correction for multiple testing. We evaluated all driver genes identified in previous studies^7,8,9,10 with Q < 0.05.

Association of signatures with clinical features

Multivariate Cox regression was performed to adjust for covariates, including age at diagnosis, sex, translocation status, and APOBEC mutational contribution (COSMIC signatures 2 and 13). We used the ConsensusClusterPlus R package⁵¹ to hierarchically cluster patients based on de novo SV and major COSMIC SNV signatures (>1% contribution) extracted from WGS with default settings³². Fisher’s exact test was used to test whether clusters were associated with MM subgroups or driver gene mutations, imposing Benjamini–Hochberg correction for multiple testing. The log-rank test was used to assess the differences in PFS and OS between all cluster groups. To delineate clusters into low- and high-risk groups, pairwise comparisons in survival distributions were performed using the pairwise_survdiff function implemented in the survminer R package³³.

Multivariate Cox regression was performed for each subgroup vs. other subgroups, adjusting for age at diagnosis, sex, translocation status, APOBEC contribution, 1p deletion, 1q gain, 17p deletion, and TP53 non-synonymous mutations.

Data availability

WGS and WES raw fastq data were obtained from dbGaP under the study accession code phs000748.v4.p3. WES somatic variants, RNAseq, CNV, Seq-FISH, and clinical data were obtained from MMRF IA10 (https://research.themmrf.org/). Replication timing data were downloaded from the UCSC Genome Browser (http://hgdownload.cse.ucsc.edu/goldenPath/hg19/encodeDCC/wgEncodeUwRepliSeq/).

References

Alexandrov, L. B. et al. Signatures of mutational processes in human cancer. Nature 500, 415–421 (2013).
Article CAS Google Scholar
Helleday, T., Eshtad, S. & Nik-Zainal, S. Mechanisms underlying mutational signatures in human cancers. Nat. Rev. Genet. 15, 585–598 (2014).
Article CAS Google Scholar
Alexandrov, L. B., Nik-Zainal, S., Wedge, D. C., Campbell, P. J. & Stratton, M. R. Deciphering signatures of mutational processes operative in human cancer. Cell Rep. 3, 246–259 (2013).
Article CAS Google Scholar
Alexandrov, L. B. et al. Mutational signatures associated with tobacco smoking in human cancer. Science 354, 618–622 (2016).
Article CAS Google Scholar
Pfeifer, G. P. Environmental exposures and mutational patterns of cancer genomes. Genome Med. 2, 54 (2010).
Article Google Scholar
Manier, S. et al. Genomic complexity of multiple myeloma and its clinical implications. Nat. Rev. Clin. Oncol. 14, 100–113 (2017).
Article CAS Google Scholar
Hoang, P. H. et al. Whole-genome sequencing of multiple myeloma reveals oncogenic pathways are targeted somatically through multiple mechanisms. Leukemia 32, 2459–2470 (2018).
Article CAS Google Scholar
Walker, B. A. et al. Mutational spectrum, copy number changes, and outcome: results of a sequencing study of patients with newly diagnosed myeloma. J. Clin. Oncol. 33, 3911–3920 (2015).
Article CAS Google Scholar
Bolli, N. et al. Heterogeneity of genomic evolution and mutational profiles in multiple myeloma. Nat. Commun. 5, 2997 (2014).
Article Google Scholar
Lohr, J. G. et al. Widespread genetic heterogeneity in multiple myeloma: implications for targeted therapy. Cancer Cell 25, 91–101 (2014).
Article CAS Google Scholar
Letouze, E. et al. Mutational signatures reveal the dynamic interplay of risk factors and cellular processes during liver tumorigenesis. Nat. Commun. 8, 1315 (2017).
Article Google Scholar
Morganella, S. et al. The topography of mutational processes in breast cancer genomes. Nat. Commun. 7, 11383 (2016).
Article CAS Google Scholar
Nik-Zainal, S. et al. Mutational processes molding the genomes of 21 breast cancers. Cell 149, 979–993 (2012).
Article CAS Google Scholar
Zou, X. et al. Validating the concept of mutational signatures with isogenic cell models. Nat. Commun. 9, 1744 (2018).
Article Google Scholar
Kim, J. et al. Somatic ERCC2 mutations are associated with a distinct genomic signature in urothelial tumors. Nat. Genet. 48, 600–606 (2016).
Article CAS Google Scholar
Jager, M. et al. Deficiency of nucleotide excision repair is associated with mutational signature observed in cancer. Genome Res. 29, 1067–1077 (2019).
Article Google Scholar
Drost, J. et al. Use of CRISPR-modified human stem cell organoids to study the origin of mutational signatures in cancer. Science 358, 234–238 (2017).
Article CAS Google Scholar
Di Noia, J. M. & Neuberger, M. S. Molecular mechanisms of antibody somatic hypermutation. Annu. Rev. Biochem. 76, 1–22 (2007).
Article Google Scholar
Puente, X. S. et al. Non-coding recurrent mutations in chronic lymphocytic leukaemia. Nature 526, 519–524 (2015).
Article CAS Google Scholar
Griffith, M. et al. Optimizing cancer genome sequencing and analysis. Cell Syst. 1, 210–223 (2015).
Article CAS Google Scholar
Maura, F. et al. Biological and prognostic impact of APOBEC-induced mutations in the spectrum of plasma cell dyscrasias and multiple myeloma cell lines. Leukemia 32, 1044–1048 (2018).
Article CAS Google Scholar
Rosenthal, R., McGranahan, N., Herrero, J., Taylor, B. S., Swanton, C. DeconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution. Genome Biol. 17, 31 (2016).
Maura, F. et al. A practical guide for mutational signature analysis in hematological malignancies. Nat. Commun. 10, 2969 (2019).
Haradhvala, N. J. et al. Mutational strand asymmetries in cancer genomes reveal mechanisms of DNA damage and repair. Cell 164, 538–549 (2016).
Article CAS Google Scholar
Park, C., Qian, W. & Zhang, J. Genomic evidence for elevated mutation rates in highly expressed genes. EMBO Rep. 13, 1123–1129 (2012).
Article CAS Google Scholar
Chen, L. et al. Identification of early growth response protein 1 (EGR-1) as a novel target for JUN-induced apoptosis in multiple myeloma. Blood 115, 61–70 (2010).
Article CAS Google Scholar
Reimold, A. M. et al. Plasma cell differentiation requires the transcription factor XBP-1. Nature 412, 300–307 (2001).
Article CAS Google Scholar
Leone, E. et al. Targeting miR-21 inhibits in vitro and in vivo multiple myeloma cell growth. Clin. Cancer Res. 19, 2096–2106 (2013).
Article CAS Google Scholar
Felix, R. S. et al. SAGE analysis highlights the importance of p53csv, ddx5, mapkapk2 and ranbp2 to multiple myeloma tumorigenesis. Cancer Lett. 278, 41–48 (2009).
Article CAS Google Scholar
Maura, F. et al. Analysis of mutational signatures suggest that aid has an early and driver role in multiple myeloma. Blood 128, 116–116 (2016).
Google Scholar
Ashby, C. C. et al. Whole genome sequencing reveals the extent of structural variants in multiple myeloma and identifies recurrent mutational hotspots within the non-coding regions. Blood 130(Suppl. 1), 3032–3032 (2017).
Google Scholar
Nik-Zainal, S. et al. Landscape of somatic mutations in 560 breast cancer whole-genome sequences. Nature 534, 47–54 (2016).
Article CAS Google Scholar
Zhan, F. et al. The molecular classification of multiple myeloma. Blood 108, 2020–2028 (2006).
Article CAS Google Scholar
Ross, F. M. et al. Age has a profound effect on the incidence and significance of chromosome abnormalities in myeloma. Leukemia 19, 1634 (2005).
Article CAS Google Scholar
Walker, B. A. et al. APOBEC family mutational signatures are associated with poor prognosis translocations in multiple myeloma. Nat. Commun. 6, 6997 (2015).
Article CAS Google Scholar
Stamatoyannopoulos, J. A. et al. Human mutation rate associated with DNA replication timing. Nat. Genet. 41, 393–395 (2009).
Article CAS Google Scholar
Nambiar, M. & Raghavan, S. C. How does DNA break during chromosomal translocations? Nucleic Acids Res. 39, 5813–5825 (2011).
Article CAS Google Scholar
Onodera, N., McCabe, N. R. & Rubin, C. M. Formation of a hyperdiploid karyotype in childhood acute lymphoblastic leukemia. Blood 80, 203–208 (1992).
CAS PubMed Google Scholar
Castedo, M. et al. Cell death by mitotic catastrophe: a molecular definition. Oncogene 23, 2825–2837 (2004).
Article CAS Google Scholar
Ly, D. H., Lockhart, D. J., Lerner, R. A. & Schultz, P. G. Mitotic misregulation and human aging. Science 287, 2486–2492 (2000).
Article CAS Google Scholar
Tower, J. Programmed cell death in aging. Ageing Res. Rev. 23(Part A), 90–100 (2015).
Article CAS Google Scholar
Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
Article Google Scholar
Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).
Article CAS Google Scholar
Forbes, S. A. et al. COSMIC: exploring the world’s knowledge of somatic mutations in human cancer. Nucleic Acids Res. 43, D805–D811 (2015).
Article CAS Google Scholar
Costello, M. et al. Discovery and characterization of artifactual mutations in deep coverage targeted capture sequencing data due to oxidative DNA damage during sample preparation. Nucleic Acids Res. 41, e67 (2013).
Article CAS Google Scholar
Weinhold, N., Jacobsen, A., Schultz, N., Sander, C. & Lee, W. Genome-wide analysis of noncoding regulatory mutations in cancer. Nat. Genet. 46, 1160–1165 (2014).
Article CAS Google Scholar
Miller, C. et al. A comparison of clinical FISH and sequencing based FISH estimates in multiple myeloma: an Mmrf commpass analysis. Blood 128, 374–374 (2016).
Google Scholar
Shinde, J. et al. Palimpsest: an R package for studying mutational and structural variant signatures along clonal evolution in cancer. Bioinformatics 34, 3380–3381 (2018).
CAS PubMed Google Scholar
Chen, X. et al. Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications. Bioinformatics 32, 1220–1222 (2016).
Article CAS Google Scholar
Derrien, T. et al. Fast computation and applications of genome mappability. PLoS ONE 7, e30377 (2012).
Article CAS Google Scholar
Wilkerson, M. D. & Hayes, D. N. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics 26, 1572–1573 (2010).
Article CAS Google Scholar

Download references

Acknowledgements

This work was supported by grants from Myeloma UK, Bloodwise, and Cancer Research UK (C1298/A8362). These data were generated as part of the Multiple Myeloma Research Foundation Personalized Medicine Initiatives (https://research.themmrf.org and www.themmrf.org).

Author information

Authors and Affiliations

Division of Genetics and Epidemiology, The Institute of Cancer Research, London, UK
Phuc H. Hoang, Alex J. Cornish, Sara E. Dobbins & Richard S. Houlston
Division of Molecular Pathology, The Institute of Cancer Research, London, UK
Phuc H. Hoang, Martin Kaiser & Richard S. Houlston

Authors

Phuc H. Hoang
View author publications
You can also search for this author in PubMed Google Scholar
Alex J. Cornish
View author publications
You can also search for this author in PubMed Google Scholar
Sara E. Dobbins
View author publications
You can also search for this author in PubMed Google Scholar
Martin Kaiser
View author publications
You can also search for this author in PubMed Google Scholar
Richard S. Houlston
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

P.H.H., S.E.D., and R.S.H. conceived and designed the study; P.H.H performed bioinformatics with contribution from A.J.C. and S.E.D.; P.H.H. and R.S.H wrote the manuscript with contributions from S.E.D, A.J.C., and M.K. All authors reviewed the final manuscript.

Corresponding author

Correspondence to Richard S. Houlston.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Figures

Supplementary Tables

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Hoang, P.H., Cornish, A.J., Dobbins, S.E. et al. Mutational processes contributing to the development of multiple myeloma. Blood Cancer J. 9, 60 (2019). https://doi.org/10.1038/s41408-019-0221-9

Download citation

Received: 14 January 2019
Revised: 26 April 2019
Accepted: 08 May 2019
Published: 06 August 2019
DOI: https://doi.org/10.1038/s41408-019-0221-9

This article is cited by

Accurate and sensitive mutational signature analysis with MuSiCal
- Hu Jin
- Doga C. Gulhan
- Peter J. Park
Nature Genetics (2024)
Multiple myeloma
- Florent Malard
- Paola Neri
- Mohamad Mohty
Nature Reviews Disease Primers (2024)
Imaging flow cytometry-based multiplex FISH for three IGH translocations in multiple myeloma
- Taku Tsukamoto
- Masaki Kinoshita
- Junya Kuroda
Journal of Human Genetics (2023)
Epigenetic regulation in hematopoiesis and its implications in the targeted therapy of hematologic malignancies
- Ailin Zhao
- Hui Zhou
- Ting Niu
Signal Transduction and Targeted Therapy (2023)
Integrative multi-omics identifies high risk multiple myeloma subgroup associated with significant DNA loss and dysregulated DNA repair and cell cycle pathways
- María Ortiz-Estévez
- Fadi Towfic
- Anjan Thakurta
BMC Medical Genomics (2021)