Background

Minimal residual disease (MRD) assessment is a known surrogate marker for survival in multiple myeloma (MM), and sustained MRD negativity has become the new therapeutic objective for MM treatment [1, 2]. However, few studies have analyzed the impact of serial MRD assessments to monitor MRD dynamics in real-life practice [3, 4].

Although MRD is a valuable tool for predicting patient outcomes in MM, it has pitfalls: some patients with undetectable MRD relapse quickly, and some MRD-positive patients with extensive follow-up do not relapse. Sustained MRD negativity is defined as the maintenance of MRD negativity in two consecutive samples and is considered a therapeutic objective. However, the evolution of clonal numbers and the dynamics of MRD over time could provide additional relevant information to predict MM outcomes [5]. It is also essential to look for predictors that identify patients who will not achieve sustained MRD negativity, as described by D’Agostino et al. Their study identified amp1q, ≥2 concomitant high-risk cytogenetic abnormalities, circulating tumor cells at baseline, and a longer time to reach first MRD negativity as negative prognostic factors for achieving sustained MRD negativity [6].

The use of complementary information generated by the same procedure (measurement of MRD) to predict outcomes could be of great interest. Partial or full immune reconstitution is known to delay the pathological progression of malignant plasma cells [7]. Clonal diversity is defined as the number of unique immunoglobulin (Ig; H, K, or L) sequences in each sample analyzed for MRD. With next-generation sequencing (NGS) techniques, we can count the number of different Ig gene sequences. We hypothesized that patients with high clonal diversity at the genomic level after therapy might have a more rapid recovery of the immune system and hence longer disease control.

Here, we present a single institution’s experience assessing MRD by NGS of Ig genes and the long-term impact of depth of response as well as that of clonal diversity on the clinical outcome of a large population of MM patients.

Material and methods

Data source

Four hundred eighty-two MM patients at the University of California, San Francisco (UCSF) (304 newly diagnosed and 178 in ≥2nd line) diagnosed from 2008 to 2020 were included in this retrospective analysis, based on the availability of MRD monitoring data during their treatment between 2012 and 2022. Patients were selected because they had a follow-up of more than 12 months after the first MRD test and achieved at least a very good partial response (VGPR). Major characteristics of the patients are summarized in Table 1. Median follow-up from the first MRD assessment was 31.8 months (range, 1–98 months).

Table 1 Main patient characteristics at diagnosis.

Patients received anti-MM therapy per provider preference with the aim of obtaining maximal response by International Myeloma Working Group (IMWG) criteria. MRD was assessed in random patients achieving VGPR or better (90% were in CR) at non-predetermined times. The majority of newly diagnosed patients were either status post autologous stem cell transplantation (ASCT) or were receiving maintenance therapy at the time of MRD assessment. In some cases, treatment was modified according to response status with the intent of achieving a deeper response.

Newly diagnosed patients generally received induction triplet combinations consisting of proteasome inhibitors, immunomodulators (or alkylators), and corticosteroids. ASCT and maintenance therapy were common (Table 2). In summary, 81% of patients received ASCT, 10% received consolidation, and 89% received maintenance therapy until relapse or unacceptable toxicity.

Table 2 Initial therapy for NDMM.

Among the patients who had MRD testing in the relapsed setting, 48% were in the second line, 21% in the third line, and 31% in the fourth line or later. A variety of regimens were administered according to physician preference.

This retrospective study was approved by the UCSF Institutional Review Board (IRB), IRB number 15-17721.

MRD assessments

The response was evaluated following the consensus response criteria of the IMWG [8]. There were no pre-specified time points for MRD assessment, but for most of the newly diagnosed patients, it was performed pre-ASCT, at 2–3 months post-ASCT, or when complete response was achieved. Subsequent assessments were generally performed on approximately an annual basis until sustained MRD negativity was confirmed or until the patient relapsed.

Evaluation of MRD was performed by NGS of immunoglobulin genes (IGH-VDJH and IGK or IGH-VDJH, IGH-DJH, IGK and IGL) [5]. Fresh bone marrow samples from MM patients were sent to Adaptive Biotechnologies (Seattle, WA) for MRD testing after stored or fresh bone marrow specimens had been successfully used to obtain ID sequences. Patient-specific clonal rearrangements were identified at diagnosis and used to track the response. Patients without a high-frequency myeloma clone (<5%) could not be monitored by this method and were excluded from this analysis. Once the absolute amount of total cancer-derived molecules present in a sample was determined, a final MRD measurement was calculated as the number of cancer-derived molecules per 1 million cell equivalents. When 2 or more clones were identified, the clone with the highest MRD value was selected. In most cases, MRD negativity was defined as a level of 10⁻⁶ or lower.
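The MRD calculation described above can be illustrated with a minimal sketch. The function names, the input layout (a mapping from tracked clone to its molecule count), and the use of "≤ 1 molecule per million" as the 10⁻⁶ negativity cutoff are assumptions made for illustration; they are not the commercial assay's implementation.

```python
def mrd_per_million(clone_molecules, total_cell_equivalents):
    """Cancer-derived molecules per 1 million cell equivalents (hypothetical helper)."""
    if total_cell_equivalents <= 0:
        raise ValueError("total_cell_equivalents must be positive")
    return clone_molecules * 1_000_000 / total_cell_equivalents


def sample_mrd(clones, total_cell_equivalents):
    """When 2 or more clones are tracked, report the highest MRD value."""
    if not clones:
        raise ValueError("at least one tracked clone is required")
    return max(mrd_per_million(n, total_cell_equivalents) for n in clones.values())


def is_mrd_negative(mrd_value, threshold_per_million=1.0):
    """Assumed cutoff: 10^-6 equals 1 molecule per million cell equivalents."""
    return mrd_value <= threshold_per_million


# Two tracked clones in a sample of 2 million cell equivalents.
value = sample_mrd({"IGH": 3, "IGK": 12}, 2_000_000)
print(value, is_mrd_negative(value))
```

In this example the IGK clone gives the higher value, so it determines the reported MRD level for the sample.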

Clonal diversity of immunoglobulin (Ig) genes was determined in the same NGS assessment by several parameters. Clonality scores range from 0 to 1, where 0 represents a sample with a completely even distribution of repertoire sequences and 1 represents a monoclonal sample. Both the depth and breadth of the repertoire contained within a sample were considered. In this study, clonality scores were calculated from clonoSEQ MRD data using Shannon clonality. However, for the analyses reported here we used the total number of different Ig sequences for each receptor, as this was more representative. For further analysis, we used the clonal diversity value of each sample at the moment of maximum response.
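As an illustration of the two repertoire measures mentioned above, the sketch below computes a Shannon clonality score and the number of unique sequences from a list of per-sequence read counts. It assumes the commonly used definition of Shannon clonality as 1 minus the normalized Shannon entropy; the function names and the input format are hypothetical, and the exact computation used by the assay vendor may differ.

```python
import math


def shannon_clonality(counts):
    """1 - H / log(N): 0 for a perfectly even repertoire, 1 for a monoclonal one."""
    positive = [c for c in counts if c > 0]
    if not positive:
        raise ValueError("at least one sequence with a positive count is required")
    n = len(positive)
    if n == 1:
        return 1.0  # a single sequence is monoclonal by definition
    total = sum(positive)
    entropy = -sum((c / total) * math.log(c / total) for c in positive)
    return 1.0 - entropy / math.log(n)


def unique_sequences(counts):
    """Clonal diversity as used in this study: number of different Ig sequences."""
    return sum(1 for c in counts if c > 0)


even = [5, 5, 5, 5]        # evenly distributed repertoire
skewed = [97, 1, 1, 1]     # one dominant sequence
print(shannon_clonality(even), shannon_clonality(skewed), unique_sequences(skewed))
```

The two measures answer different questions: the clonality score reflects how unevenly reads are distributed, whereas the unique-sequence count reflects only the breadth of the repertoire.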

Analysis of dynamics of MRD by artificial intelligence (AI)

The AI analysis used machine learning with the Connector R package, which performs clustering analysis of longitudinal, time-dependent data, here based on MRD dynamics. The analysis was unsupervised and used the logarithmic values of the MRD measurements. To determine the optimal number of groups for the clustering analysis, we considered the elbow in the functional Davies–Bouldin (fDB) index and tightness plots.

Statistical analysis

All data were stored in a REDCap database (Vanderbilt University, Nashville, TN) and Microsoft Excel files. Statistical analysis was performed using the Statistical Package for the Social Sciences, version 22.0 (SPSS Inc., Chicago, IL). Progression-free survival (PFS) was calculated either from the first MRD assessment (MRD1; landmark study) or from the start of treatment to disease progression or death from any cause. To reduce bias, we performed a landmark survival analysis at 12 months. Patients who had not experienced an event at the end of follow-up were censored at the last contact date. The Kaplan–Meier method was used to plot the survival curves for PFS, and differences in survival outcomes between groups were evaluated with the log-rank test. Continuous variables were summarized as mean ± standard deviation or median and interquartile range, and qualitative variables were presented as relative and absolute frequencies. The χ2 and Fisher’s exact two-sided tests were used to compare categorical variables, and analysis of variance and Student’s t test were applied to compare continuous variables. An adjusted stepwise Cox proportional hazards regression model was used for multivariable analysis, including the following variables: age, sex, myeloma type, cytogenetic risk, hemoglobin, and creatinine. Analyses reaching two-sided p < 0.05 were considered statistically significant.
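For readers unfamiliar with the survival methods named above, the following is a minimal sketch of the Kaplan–Meier product-limit estimator and of a landmark selection step. It is not the SPSS procedure used in this study; the function names and the toy follow-up data are hypothetical, and the log-rank test and Cox model are not shown.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up in months
    events : 1 if progression or death was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def landmark(times, events, landmark_time):
    """Keep patients still under follow-up after the landmark and reset time zero."""
    kept = [(t - landmark_time, e) for t, e in zip(times, events) if t > landmark_time]
    return [t for t, _ in kept], [e for _, e in kept]


months = [6, 12, 12, 20, 30]
status = [1, 1, 0, 0, 1]
print(kaplan_meier(months, status))
print(landmark(months, status, 12))
```

The landmark step removes patients whose event or censoring occurred at or before the landmark, so that group membership defined at that time point cannot be influenced by earlier events.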

Results

Sequence identification, depth of response, and prediction of outcomes

Overall population

A total of 1098 MRD samples were analyzed at various time points during the disease course. MRD data were available at ≥3 time points for 150 patients. Overall, 184 of 482 patients (38.3%) achieved MRD negativity at 10⁻⁶, and 69 (14.3%) achieved MRD at a level of 10⁻⁵–10⁻⁶ on one or multiple assessments.

Newly diagnosed population

In the newly diagnosed group (n = 304), the median PFS from diagnosis was 86.4 months. Event-free probability at 5 years was 62% for PFS and 87% for OS. In this NDMM group, 119 of 304 patients (39%) achieved undetectable MRD at 10⁻⁶ on at least one occasion. These patients had a prolonged PFS in comparison with patients who were persistently MRD positive (median NR vs. 62 m, p < 0.001, HR 0.52 (0.3–0.8); PFS at 5 years 76% vs. 58%, Fig. 1A). When we separated patients by depth of response, patients with a deeper response had a more prolonged PFS: <10⁻⁶ vs. 10⁻⁶–10⁻⁵ vs. below to 10⁻⁴ (median NR vs. NR vs. 49 m, p < 0.001; HR 0.73 (0.7–0.9); PFS at 5 years 80% vs. 76% vs. 50%; Fig. 1B). Interestingly, in 57 patients with high-risk cytogenetic features, MRD negativity at the level of 10⁻⁶ (n = 23, 40%) identified patients with longer PFS (PFS at 5 years 72% vs. 51%, p < 0.05).

Fig. 1: PFS curves based on MRD levels in different conditions.

A Kaplan–Meier curves showing the PFS of newly diagnosed patients achieving MRD− vs. different levels of MRD+. B Kaplan–Meier curves showing the OS of newly diagnosed patients achieving different levels of MRD. C Kaplan–Meier curves showing the PFS of relapsed refractory patients (RRMM) achieving MRD− vs. different levels of MRD+. D Kaplan–Meier curves showing the OS of RRMM patients achieving different levels of MRD.

Furthermore, patients who were MRD negative or who were MRD positive at a very low level (between 10⁻⁵ and 10⁻⁶) had a better OS than those with higher disease burdens (>10⁻⁵) (OS at 5 years 92% vs. 88% vs. 68%, p < 0.006, HR 0.6 (0.4–0.9), Fig. 1C).

Relapsed population

Overall, 65 of 178 patients (36%) achieved undetectable MRD at 10⁻⁶ on at least one occasion. These patients had a prolonged PFS in comparison with patients who were persistently MRD positive (NR vs. NR, PFS at 53 years 69% vs. 55%, p < 0.02, HR 0.4 (0.3–0.9), Fig. 1D).

MRD monitoring and dynamics

We then analyzed the ability of repeated MRD monitoring to predict PFS in 118 newly diagnosed patients who had at least 3 MRD assessments and in whom a pattern of MRD dynamics could be established. The investigators identified the patterns of dynamic evolution subjectively. Three categories were identified in newly diagnosed patients: (A) patients with ≥3 persistently MRD-negative measurements at 10⁻⁶ (n = 37), (B) patients with continuously declining but detectable clones (n = 36), and (C) patients with either a stable or growing number of clones (n = 45). Groups A and B had a more prolonged PFS than group C, as we have previously shown (NR vs. NR vs. 38 m, p < 0.0001; HR 0.5 (0.3–0.8); Fig. 2A).
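The three investigator-defined categories can be approximated by explicit rules, as in the sketch below. In this study the assignment was made subjectively, so the rules here (strictly decreasing values for group B, and a negativity cutoff of 1 molecule per million cell equivalents for 10⁻⁶) are assumptions chosen only to make the categories concrete.

```python
def classify_mrd_dynamics(values, negative_threshold=1.0):
    """Assign a series of >= 3 MRD values (molecules per million) to A, B, or C.

    A: every measurement is MRD negative
    B: detectable clones that decline at every consecutive assessment
    C: anything else (stable or growing clones)
    """
    if len(values) < 3:
        raise ValueError("at least 3 MRD assessments are required")
    if all(v <= negative_threshold for v in values):
        return "A"
    if all(later < earlier for earlier, later in zip(values, values[1:])):
        return "B"
    return "C"


print(classify_mrd_dynamics([0, 0, 0]))       # persistently negative
print(classify_mrd_dynamics([500, 50, 5]))    # continuously declining
print(classify_mrd_dynamics([5, 5, 6]))       # stable or growing
```

A rule of this kind is reproducible but rigid; for example, a single small rebound moves a patient from B to C, which a clinician reviewing the whole curve might judge differently.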

Fig. 2: Impact of MRD monitoring on MM patient outcomes.

A Kaplan–Meier curves showing the PFS of the different MRD dynamics patterns identified by the investigators. B Connector plots used to choose the optimal number of groups for the clustering analysis, based on the elbow in the functional Davies–Bouldin (fDB) index and tightness plots. Based on this criterion, the optimal number of groups was G = 3. This classification mainly reflected the different patterns of MRD dynamics: MRD stable, MRD negative, or MRD decreasing. C Kaplan–Meier curves showing the PFS of the MRD dynamics patterns identified by Connector, grouped as pattern A vs. patterns B and C. Group A represents patients with negative or decreasing MRD dynamics, and groups B and C represent patients with stable or increasing MRD.

Using AI with Connector, we identified three patterns of MRD evolution in 150 patients with at least 3 MRD samples. To determine the optimal number of groups for the clustering analysis, we considered the elbow in the functional Davies–Bouldin (fDB) index and tightness plots. Based on this criterion, we selected G = 3 as the optimal number of groups; the results are shown in Fig. 2B.

Figure 2C shows the Kaplan–Meier curves for the MRD dynamics analysis. (Note that groups B and C (BuC) were merged because a substantial number of patients in both groups relapsed.) We found that the prediction of outcome improved considerably with this AI approach (NR vs. 44 months, p < 10⁻⁷).

Clonal diversity of Ig genes

Finally, we analyzed clonal diversity at the moment of maximum MRD response: patients who were MRD-positive and who had not relapsed had higher clonal diversity than MRD-positive patients who had relapsed. Moreover, patients who were MRD negative and who had not relapsed had higher clonal diversity than patients who were MRD negative and who did relapse. This was observed independently for the 3 receptors analyzed (IgH p = 0.026; IgK p = 0.036 and IgL p = 0.036, Fig. 3A–C). Patients with more than 66,000 IgH unique sequences at the moment of maximum response had a prolonged PFS (p = 0.05, Fig. 3D).

Fig. 3: Impact of clonal diversity on prognosis.

A–C Differences in clonal diversity of IgH, IgK, and IgL genes between MRD+ and MRD− patients who did or did not relapse. D Kaplan–Meier curves showing the PFS of different patterns of clonal diversity of IgH genes.

Discussion

In this study, we investigated the prognostic value of MRD, MRD dynamics, and clonal diversity in a large cohort of patients with multiple myeloma. We found that achieving a very deep response by Ig gene sequencing (<10⁻⁶) was associated with significantly longer progression-free survival (PFS) and overall survival (OS). We also identified a pattern of MRD dynamics that discriminated better between patients with an excellent outcome and those with a poor outcome than the measurement of MRD at a single time point. Finally, we found that clonal diversity of the Ig sequences was associated with a longer PFS in patients with MM.

Our findings are consistent with those of previous studies showing that achieving a deep MRD response is associated with a better prognosis in MM. For example, a study by the Intergroupe Francophone du Myélome (IFM) found that patients who achieved MRD negativity 6 months after treatment had a significantly longer PFS than patients who had persistent MRD positivity [9]. Similarly, a study by Martinez-Lopez et al. found that patients who achieved MRD negativity 1 year after treatment had a significantly longer OS than patients who had persistent MRD positivity [5].

Our study also adds to the growing body of evidence that MRD dynamics is a prognostic factor in MM. We found that patients with a pattern of MRD that was stable or increasing over time had significantly worse PFS than patients with a pattern of MRD that was decreasing. This finding is consistent with the results of a study by Paiva et al., which found that patients with a pattern of MRD that was stable or increasing over time had a significantly shorter OS than patients with a pattern of MRD that was decreasing [10].

Our study also found that clonal diversity of the Ig sequences was associated with a longer PFS in patients with newly diagnosed MM as proof of concept. This finding is consistent with the results of a study by Yang et al., which found that patients with higher clonal diversity had a longer PFS than patients with lower clonal diversity [11].

The main limitations of our study are its retrospective nature, the heterogeneity of patient treatments, and the different time points of the MRD assessments. However, we believe these limitations are overcome by the large number of patients and samples analyzed.

Future directions in this field should include the use of peripheral blood molecular tracking to monitor MRD. This would allow for more frequent and closer monitoring of MRD, which could lead to earlier intervention and improved outcomes for patients. Additionally, studies are needed to validate the use of clonal diversity as a prognostic factor in MM [12,13,14,15,16,17,18].

In conclusion, our retrospective study found that achieving a very deep and sustained MRD response, a pattern of MRD that is decreasing over time, and higher clonal diversity are all associated with a better prognosis in patients with MM. These findings suggest that MRD results combined with clonal diversity assessment could be used to identify patients who are at risk of relapse and could be targeted for earlier treatment intervention.