Tumors are characterized by variable numbers of somatic variants that have accumulated during the life history of the cancer cell as a result of abnormal DNA replication and/or DNA repair processes. The classification of such variants into six types based on the nucleotide change was used in the past to differentiate the crude mutation pattern of different cancers.1 Recently, the 5′- and 3′-context of each substitution was included in such analyses, expanding the combinations to 96 possible mutation types. This trinucleotide mutational model represents the combined effect of several mutational signatures, and has enough resolution to allow deconvolution of the underlying mutational processes through the non-negative matrix factorization (NNMF) algorithm.2 To date, more than 30 distinct signatures have been identified, opening the field to the investigation of the biological processes responsible for shaping the genome of cancer, and allowing a deeper understanding of their relative contribution in different cancer types.2, 3
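The 96-type classification described above can be made concrete with a short sketch. It assumes the common convention of reporting each substitution from the pyrimidine of the mutated base pair, so purine references are reverse-complemented; the function names and the `A[C>T]G` label format are illustrative choices, not taken from the cited studies.

```python
from itertools import product

BASES = "ACGT"
# Six substitution classes, written from the pyrimidine of the mutated base pair.
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def mutation_types():
    """Return the 96 trinucleotide mutation types, e.g. 'A[C>T]G'."""
    return [
        f"{five}[{sub}]{three}"
        for sub in SUBSTITUTIONS
        for five, three in product(BASES, repeat=2)
    ]


def classify(five, ref, alt, three):
    """Map one substitution and its flanking bases to one of the 96 types.

    Purine references (A or G) are reverse-complemented so that every type
    is expressed relative to a pyrimidine (C or T).
    """
    if ref in "AG":
        # Reverse complement: flanks swap places and every base is complemented.
        five, ref, alt, three = (
            three.translate(COMPLEMENT),
            ref.translate(COMPLEMENT),
            alt.translate(COMPLEMENT),
            five.translate(COMPLEMENT),
        )
    return f"{five}[{ref}>{alt}]{three}"
```

Six substitution classes times four possible 5′ bases times four possible 3′ bases gives the 96 types; counting the output of `classify` per sample yields the mutational catalog on which signature extraction operates.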

In multiple myeloma (MM), two independent whole-exome sequencing (WES) studies have revealed four mutational signatures. Two are associated with aberrant activity of APOBEC cytidine deaminases (signatures #2 and #13). The other two reflect processes generating mutations at a steady rate, resulting in a mutation load that is often proportional to the cancer age at the time of sampling: these processes are highlighted by signature #1, arising from spontaneous deamination of methylated cytosines, and by signature #5, a less-understood process that exhibits transcriptional strand bias.3, 4, 5 Mutational signatures have not been investigated in other primary plasma cell dyscrasias such as monoclonal gammopathy of undetermined significance (MGUS) or primary plasma cell leukemia (pPCL). Furthermore, human myeloma cell lines (HMCLs) bear a genomic profile that only partially recapitulates that of their primary counterparts,6 and mutational signatures have never been studied in that context. Finally, while APOBEC activity has been correlated with increased mutational burden and poor-prognosis MAF/MAFB translocations in MM at diagnosis,5 this has never been confirmed in multivariate analysis in an independent large series.

To answer these questions, we mined two large public MM WES data sets4, 7 that included six MGUS/Smoldering MM and 255 MM, to which we added 896 MM samples from the IA9 public release of the CoMMpass trial. The CoMMpass data were generated as part of the Multiple Myeloma Research Foundation Personalized Medicine Initiatives (https://research.themmrf.org and www.themmrf.org). Furthermore, we included matched WES data from five previously published pPCL patients.8 Finally, we used WES mutational catalogs from 18 HMCLs available from the COSMIC cell-line project (v81, http://cancer.sanger.ac.uk/cell_lines; Supplementary Table 1). Extraction of mutational signatures was performed using the NNMF algorithm across cumulative catalogs of coding and non-coding mutations as previously described2, 3 (Supplementary Materials and Methods).
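As a rough illustration of the factorization step, the sketch below decomposes a toy mutational catalog V (mutation types by samples) into non-negative signatures W and exposures H using Lee-Seung multiplicative updates. It is a simplified stand-in, not the extraction pipeline used in the cited work, which typically adds resampling and a procedure for choosing the number of signatures. The toy catalog has four mutation types instead of 96, and all names and values are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(row) for row in zip(*a)]


def nmf(v, k, iters=300, seed=0, eps=1e-9):
    """Factor v (m x n) into w (m x k) and h (k x n), all non-negative.

    Uses multiplicative updates that reduce the Frobenius reconstruction error.
    """
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        # Update exposures: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(k)]
        # Update signatures: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(m)]
    return w, h


def frobenius_error(v, w, h):
    """Frobenius norm of the residual v - w h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


# Toy catalog built from two made-up signatures and three made-up samples.
true_signatures = [[0.7, 0.0], [0.2, 0.1], [0.1, 0.3], [0.0, 0.6]]
true_exposures = [[100, 10, 50], [5, 80, 50]]
catalog = matmul(true_signatures, true_exposures)

signatures, exposures = nmf(catalog, k=2)
```

In this layout, each column of `signatures` is a candidate mutational signature and each column of `exposures` gives the per-sample contribution of every signature; absolute and relative contributions such as those reported here would be derived from the exposures.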

We analyzed 203 917 mutations from 1162 whole exomes of primary plasma cell dyscrasias and 18 HMCLs. The global mutation burden increased linearly from MGUS to MM and pPCL. HMCLs showed the highest burden overall, but likely included many residual germline variants despite extensive filtering of these unmatched samples (Supplementary Figure 1). In all three studies, the mutational load of MM was quite heterogeneous, with a minority of hypermutated samples (Figure 1a).

Figure 1

APOBEC contribution in plasma cell dyscrasias. (a, b) Barplot of absolute (a) and relative (b) contribution of mutational signatures in three different MM WES series. (c, d) Extraction of mutational signatures from 18 HMCLs: (c) unsupervised hierarchical clustering, showing two main clusters A and B characterized by different APOBEC contribution. (d) Barplot representing the absolute APOBEC contribution to the mutational load when NNMF was applied considering clusters A and B as independent series. Asterisks (*) highlight cell lines with ‘canonical’ t(14;16) translocations (IGH/MAF). The section (§) and hash (#) signs mark cell lines carrying alternative MAF/MAFB rearrangements in clusters A and B, respectively. (e, f) Boxplot showing the progressive increase of the APOBEC absolute (e) and relative (f) mutation load from MGUS to Cluster A HMCLs.

NNMF extracted four signatures in the whole cohort, pertaining to three distinct mutational processes:2, 3 two are the age-related processes underlying signatures #1 and #5, and the third is aberrant APOBEC activity3 (Figures 1a and b). While the activity of age-related processes was more prominent in the cohort as a whole (median 70%, range 0–100%), APOBEC showed a heterogeneous contribution (Figures 1a and b). The absolute contribution of APOBEC activity to the mutational repertoire correlated with the overall number of mutations (r=0.71, P<0.0001; Supplementary Figure 2). As previously described, APOBEC contribution was significantly enriched among MM patients with t(14;16) and with t(14;20) (P<0.001; Supplementary Figure 3 and Supplementary Table 2).5 However, even after subgrouping patients by main cytogenetic aberrations, the association between absolute APOBEC contribution and mutational load remained significant across all main subgroups (Supplementary Figure 2). In the MGUS/SMM series the APOBEC contribution was generally low, but the limited number of mutations and the presumably low sample purity did not allow any further statistical investigation (Supplementary Figure 4). In the pPCL cohort, APOBEC activity was preponderant in three out of five samples, all of them characterized by the t(14;16)(IGH/MAF); in the remaining two cases, the absolute number of APOBEC mutations was similar to that in MM (Supplementary Figure 5).

In HMCLs, unsupervised clustering based on APOBEC activity highlighted two distinct subgroups: one highly enriched in APOBEC activity (cluster A), and one with virtually absent APOBEC activity (cluster B; Figure 1c, Supplementary Figure 6 and Supplementary Materials and Methods). Interestingly, in cluster A we observed an enrichment of MAF/MAFB translocations (6/8) as compared to cluster B (1/10), and this partially explains the higher activity of APOBEC in the former. However, APOBEC activity was still variable even within cluster A, and its relative contribution was not enriched in MAF/MAFB translocated samples as compared to the other samples in the same cluster (Figures 1c and d and Supplementary Figure 6). Cluster B was instead devoid of APOBEC activity. Some cell lines in this cluster (MC-CAR, IM-9 and ARH-77) are annotated as MM but were found to be compatible with Epstein–Barr virus-transformed lymphoblastoid cells instead (Supplementary Table 1),9, 10 whereas others are of clear MM or PCL origin, underscoring the genomic diversity of HMCLs. Overall, the APOBEC contribution was characterized by a progressive increment from MGUS/SMM to MM and pPCL and ‘cluster A’ HMCLs (Figures 1e and f).

We next investigated the prognostic impact of APOBEC signatures at diagnosis using prospective data from the CoMMpass study (median follow-up 435 days (30–1421)). Patients with an absolute APOBEC contribution in the fourth quartile had shorter 2-year progression-free survival (PFS; 47% vs 66%, P<0.0001) and 2-year overall survival (OS; 70% vs 85%, P=0.0033) than patients in the first to third quartiles (Figures 2a and b). As APOBEC contribution correlates with higher mutational burden and MAF/MAFB translocations, two known poor prognostic factors in MM,5, 11, 12, 13 we performed a multivariate analysis with Cox regression to assess the independent prognostic value of APOBEC activity against these and other prognostic factors such as the International Staging System (ISS)14 and type of treatment (Figures 2c and d, Supplementary Figure 7 and Supplementary Table 3). In this model, variables such as IGH translocations and overall mutational load did not show any independent prognostic significance. Conversely, ISS stage III, as expected, had the highest hazard ratio (HR) and significance as an independent prognostic factor for both PFS and OS. Remarkably, fourth quartile APOBEC had an independent adverse prognostic effect of significant magnitude (PFS HR 2.02, P=0.02; OS HR 2.78, P=0.02; Figures 2c and d and Supplementary Table 3). Despite MAF/MAFB/MAFA translocations being associated with high APOBEC activity,5 such cases accounted for just 23% of patients included in the fourth APOBEC quartile. The remainder of APOBEC-high patients carried neither MAF/MAFB/MAFA translocations nor overexpression of these genes (Supplementary Figure 8 and Supplementary Table 4). Conversely, they were characterized by higher APOBEC (particularly APOBEC3B) gene expression compared to other quartiles (Supplementary Figure 9 and Supplementary Table 5).5 We went on to combine fourth quartile APOBEC activity with ISS stage III in a two-variable prognostic score, and we found that co-occurrence of these two factors identifies a fraction of high-risk patients with 2-year OS of 53.8% (95% confidence interval (CI) 36.6–79%), while their simultaneous absence identifies long-term survivors with 2-year OS of 93.3% (95% CI 89.6–97.2%; Supplementary Figures 10a and b). This was partially explained by a higher proportion of primary refractory cases among patients carrying both risk factors (Supplementary Figures 10c and d).
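To make the quartile-based comparison more tangible, the following minimal sketch flags samples whose APOBEC contribution falls in the fourth quartile and computes a Kaplan–Meier survival estimate for a group of patients. It only illustrates the general technique: the function names are hypothetical, the quartile boundary uses Python's default interpolation rule (which may differ from the one applied in the study), and the log-rank and Cox regression steps are not covered.

```python
import statistics


def fourth_quartile_flags(values):
    """Return True for each value above the 75th percentile of the series."""
    q3 = statistics.quantiles(values, n=4)[2]
    return [v > q3 for v in values]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    times:  follow-up per patient (e.g. in days)
    events: True if the event (progression or death) occurred, False if censored
    """
    at_risk = len(times)
    survival, curve = 1.0, []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Read the estimated survival probability at time t from a KM curve."""
    prob = 1.0
    for time, surv in curve:
        if time <= t:
            prob = surv
    return prob
```

With per-patient APOBEC contributions and follow-up data, one would split patients using `fourth_quartile_flags`, build one curve per group with `kaplan_meier`, and compare `survival_at(curve, 730)` between groups to obtain 2-year estimates of the kind reported above.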

Figure 2

Prognostic role of APOBEC mutations. (a, b) Kaplan–Meier estimated curves of PFS (a) and OS (b) according to APOBEC mutational activity in all patients from the CoMMpass study. (c, d) Forest plot summarizing the results of multivariate analysis for PFS (c) and OS (d).

In this study, we provided a global overview of the contribution of mutational processes in the largest WES series of plasma cell dyscrasias, from MGUS to MM to pPCL, investigated to date by NNMF. Contrary to what was anticipated, we did not identify additional signatures compared to smaller data sets.4, 5, 7 Our data nevertheless suggest that the relative contribution of APOBEC activity may increase during progression through the different phases of MM evolution. Further studies will be necessary to confirm these findings. In primary samples, APOBEC activity showed a continuum of increased contribution that correlated with the overall mutational burden. In HMCLs instead, we found a clear-cut distinction between a cluster that had a much higher APOBEC contribution as compared to primary samples, and a second cluster where APOBEC activity was minimal or absent. Furthermore, in HMCLs the correlation with mutational burden was apparently lost. This observation is independent of the high number of likely residual germline variants observed in cell lines, as such variants are enriched for age-related signatures, while APOBEC mutations are typically of somatic nature.15 Furthermore, both in primary MM and HMCLs, the presence of MAF/MAFB/MAFA translocations explained some but not all cases with high APOBEC activity, suggesting other factors may modulate this aberrant process. Clearly, the low number of HMCLs and their poor annotation represent a potential confounding factor. Nevertheless, our data underscore the heterogeneity of HMCLs and call for comprehensive studies where the signature profile of cell lines is compared to that of the primary disease.6

It was previously shown that a high fraction of APOBEC mutations is associated with adverse prognosis.5 Our findings nevertheless add relevant clinical information. In fact, high APOBEC activity emerged as one of the strongest independent adverse prognostic factors in MM. Furthermore, the combination of APOBEC activity and ISS showed an additive effect on survival that was already evident with a short follow-up, likely due to resistance or early relapse following initial response.

This suggests that analysis of APOBEC activity at diagnosis can help identify a small fraction of high-risk patients that could benefit from more effective treatments. We propose that cases with high APOBEC activity may represent a novel prognostic subgroup that is transversal to conventional cytogenetic classification, advocating for closer integration of next-generation sequencing studies and clinical annotation to confirm this finding in independent series.