Introduction—the unmet need

The incidence and prevalence of cancer among older adults will increase dramatically over the next 30 years, in large part because the elderly population is growing. The global cancer incidence in older people has been estimated to rise from 6.7 million in 2012 to a projected 14 million by 2035 [1], and 70% of cancers are expected to occur in patients older than 65 years by 2030. The decision to treat older adults with cancer should not be based on chronological age alone. Multiple myeloma (MM) is the second most common haematological malignancy, with almost 5000 patients diagnosed in the UK each year and over 35,000 diagnosed in Europe in 2016 alone [2, 3]. MM is predominantly a disease of older people, with two-thirds of patients aged over 70 at diagnosis. As such, the incidence is increasing as the population ages. Furthermore, the increasing age at presentation, together with age-related systemic changes and co-morbidities, has been shown to be related to a higher frequency of treatment discontinuation and non-haematological adverse events [2, 4, 5].

Over the last 10 years the development of novel biological agents for the management of MM (proteasome inhibitors (PIs) and immunomodulatory drugs (IMiDs), amongst others) has improved outcomes for patients with MM such that the median overall survival is more than 7–8 years for younger/fitter patients. However, the impact of these therapies has been less marked in the older/less fit transplant non-eligible (TNE) population, particularly those over the age of 75 years [6]. These patients do not have a greater incidence of molecularly high-risk disease, and so these differences are likely to be accounted for by differences in patient physiology, increased treatment-related toxicity limiting delivery of effective therapy, and less effective or less rigorous anti-myeloma treatment being given (undertreatment) [7]. Life expectancy is extremely variable within the same age group, suggesting that not only chronological age but also the health status of the patient is important [8]. This group therefore has a high unmet need both for new, less toxic treatments and treatment-delivery approaches and for a more appropriate, personalized patient/regimen selection process [9].

It is difficult to fully appreciate the size of the unmet need, as elderly and frail patients are less likely to be included in clinical trials and may receive fewer novel agents, partly as a consequence of comorbidities, polypharmacy and more rapid physiological decompensation associated with both disease- and treatment-related complications [5]. Furthermore, inferior outcomes may also be associated with the lack of a more personalised regimen selection. For example, similar outcomes have been reported in the cohort of patients older than 75 years in both the POLLUX and CASTOR trials [10]. Treatment strategies for patients deemed TNE have evolved, and it is clear that improvements in survival have been less impressive in this group of patients [7, 11]. This is probably an underestimate, owing in part to the under-representation of less fit patients in clinical trials with strict entry criteria (see below), and a reflection of real-world practice, where physicians may be less confident about delivering full doses and schedules in the more frail patient. It is likely that tolerance of full-dose therapies, as mandated per protocol in these studies, is poor. The FIRST/MM020 trial recently demonstrated improved outcomes, including quality of life [12], with the IMiD lenalidomide continued to disease progression, compared with other treatment strategies [13]. Within this study, patients' frailty index was assessed at baseline, but doses were not prospectively adjusted according to frailty. Outcomes tended to improve for RD vs MPT in all frailty groups, but the greatest benefits were seen in fit patients [14, 15]. This suggests that additional strategies are needed to improve outcomes for the unfit and frail groups. Furthermore, the lack of reliable international registry data limits a full-scale appreciation of the problem, though national registries have contributed to our understanding of this unmet need. Current prognostic assessments rely on markers of either tumour burden (e.g. International Staging System; ISS) or genetic risk (FISH, mutational analysis, copy number variants) or a combination of both (R-ISS) [16, 17].

Defining frailty

Healthy ageing, and improving healthspan as well as prolonging lifespan, are hugely important issues for both medicine and society in general. The difference between lifespan and healthspan encompasses the influence not only of disease but also of more rapid physiological deterioration, which predisposes to the consequences of concomitant illness and polypharmacy and further augments physiological decompensation, that is, frailty. Frailty is a functional term that refers to a decline in physiological function, leading to dependency, vulnerability to stressors and a high risk of adverse health-related outcomes (metabolic disorders, infections, cancer), resulting in increased morbidity and mortality [18]. Whilst frailty has been reported in up to 60% of community-dwellers older than 65 years in western societies, frailty of any degree has been reported in two-thirds of MM patients, with severe frailty in at least 40% in some reported series [19, 20]. Two conceptual models of frailty exist: the Frailty Phenotype (often referred to as the frailty syndrome [21]) and the Frailty Index [22], as illustrated in Fig. 1. The frailty phenotype, recognising a link with sarcopenia, is defined by (1) unintentional weight loss, (2) weakness, (3) poor endurance (exhaustion), (4) slowness of gait and (5) low physical activity level [21]. Sarcopenia, a progressive and generalised loss of skeletal muscle mass and function [23], is associated with the frailty phenotype [24]. The widely accepted definition of sarcopenia was updated in 2019 by the European Working Group on Sarcopenia in Older People (EWGSOP2). The definition included cut-offs to identify those who have sarcopenia. Biomarkers, including imaging, to define the at-risk population continue to evolve and may have a particular role in sarcopenic patients with cancer (see below).

Fig. 1: Defining frailty in susceptible populations.

Two main accepted frailty assessment instruments [14, 15].

Clinical frailty scores

The need to identify patients with cancer who are at risk of therapy-related toxicity, and of the poorer outcomes that follow from limited exposure to systemic anti-cancer therapy (SACT), prompted the publication of the American Society of Clinical Oncology guidance on the use of geriatric assessments (GAs) and frailty scores in oncogeriatric practice [25]. In MM, patients deemed to be TNE show considerable heterogeneity, not only in age but also in the complex interplay of age, physical function, cognitive function and comorbidity. Furthermore, performance score (PS) and the ISS have been shown to be less able to discern these key sub-groups in the TNE population and to predict their response to therapy and survivorship [7, 26]. A functional assessment or GA offers the possible advantage of guiding therapeutic decisions, with the potential to account for treatment compatibility, drug-induced side effects and mortality [27, 28]. The use of a GA may complement competent clinical judgement, and indeed GA tools have been postulated to be valuable in a number of different cancers. However, generic GA tools have limitations and may not be applicable to patients with MM, with its well-established disease-related morbidity (Fig. 2). The International Myeloma Working Group proposed a frailty scoring system for patients with MM (IMWG FS) that predicts survival, adverse events and treatment tolerability [29] using age, the Katz Activity of Daily Living (ADL), the Lawton Instrumental Activity of Daily Living (IADL) and the Charlson Comorbidity Index (CCI). The ADL ranks adequacy of performance in six functions (bathing, dressing, toileting, transferring, continence and feeding) using a questionnaire-based tool. The IADL assesses more complex activities (shopping, cooking and managing finances) necessary for functioning in community settings; the capacity to handle these complex functions is normally lost before basic ADL. Both tools take ~10–15 min to complete [30].
A pooled analysis of 869 patients with newly diagnosed MM who were being entered into several clinical intervention studies, and who had a baseline frailty assessment using this scoring system, demonstrated a correlation with survivorship. The IMWG FS may offer an additional clinical evaluation for the measurement of frailty, assisting both in the design and assessment of clinical interventional studies and, perhaps, in day-to-day clinical practice. However, the IMWG FS was tested, but not validated, in the setting of clinical intervention studies, in which patients are selected according to strict inclusion criteria; this may limit “real-world” interpretation of the data (see below). More recently, the IMWG FS was tested in an unselected real-world population outside of a clinical trial setting, demonstrating that it can be used to define a more biologically frail population [31].
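
To make the structure of such an additive score concrete, the following is a minimal Python sketch of how the four components named above (age, ADL, IADL and CCI) might be combined into a categorical frailty grade. The point allocations and cut-offs are recalled from the published IMWG FS and are not stated in this review, so they should be treated as assumptions and checked against [29] before any use; the names `Patient`, `imwg_frailty_points` and `imwg_frailty_category` are hypothetical and purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    age: int   # completed years at assessment
    adl: int   # Katz ADL score (0-6, higher = more independent)
    iadl: int  # Lawton IADL score (0-8, higher = more independent)
    cci: int   # Charlson Comorbidity Index


def imwg_frailty_points(p: Patient) -> int:
    """Sum the component points (cut-offs are assumptions; verify against [29])."""
    if p.age <= 75:
        age_pts = 0
    elif p.age <= 80:
        age_pts = 1
    else:
        age_pts = 2
    adl_pts = 1 if p.adl <= 4 else 0    # assumed cut-off: ADL <= 4 scores 1
    iadl_pts = 1 if p.iadl <= 5 else 0  # assumed cut-off: IADL <= 5 scores 1
    cci_pts = 1 if p.cci >= 2 else 0    # assumed cut-off: CCI >= 2 scores 1
    return age_pts + adl_pts + iadl_pts + cci_pts


def imwg_frailty_category(p: Patient) -> str:
    """Map total points to a grade: 0 = fit, 1 = intermediate, >= 2 = frail."""
    pts = imwg_frailty_points(p)
    if pts == 0:
        return "fit"
    if pts == 1:
        return "intermediate"
    return "frail"


if __name__ == "__main__":
    for patient in (Patient(72, 6, 8, 0), Patient(78, 6, 8, 0), Patient(74, 4, 5, 2)):
        print(patient, imwg_frailty_category(patient))
```

Under these assumed cut-offs, age alone can move a patient into the frail category once they are older than 80 years, which is one reason the weighting of chronological age in such scores deserves scrutiny.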

Fig. 2: The Frailty Spiral. The impact of aging, cellular senescence and comorbidities on the evolution of frailty, with the myeloma clinical features acting as “accelerants”.

The impact of multiple myeloma in accelerating age-related physiological decompensation (frailty).

As a consequence of this seminal work, several scores have been developed and tested in various populations (summarized in Table 1), and a systematic review was conducted by Salazar et al., with some notable omissions [32]. Subsequent scores were developed that did not include the functionality of ADL and IADL, often using PS instead, especially for retrospectively developed scores. Facon et al. reported on a simplified frailty score using a large clinical trial patient cohort (n = 1618), demonstrating that frail patients had inferior outcomes, especially overall survival. This simplified score was subsequently validated in an independent trial population [14, 15]. Engelhardt et al. prospectively assessed the impact of the IMWG FS on clinical outcome in a well-characterized non-trial cohort, comparing it with alternative host-related scoring systems (R-MCI: revised Myeloma Comorbidity Index, CCI, KF index), and demonstrated that the IMWG FS is a clinically useful tool in identifying patients with a host risk profile and is of prognostic value for functional decline and survivorship [33]. The Mayo Clinic developed a score (PS, age and NT-proBNP), assessed in just under 400 consecutive patients [34]. The authors were able to define four clear sub-groups based on outcomes, in particular defining OS differences between the groups ranging from 18 to 54 months (p < 0.0001). One potential criticism of these single-institution studies developing models is that there is a tendency towards patient selection bias resulting from the tertiary nature of the authors' clinical practice, although tertiary multicenter studies have just been prospectively shown to be feasible, with different risk scores being compared [35]. Offidani et al. reported a “vulnerability score” based on PS and CCI, although they did test the impact of ISS, renal insufficiency and bone lesions as disease overlays [36]. Cook et al. generated a more laboratory-based objective risk score incorporating age, PS, CRP and ISS, which was able to discriminate not only therapy-related toxicity and regimen completion but also survivorship and impact on quality of life [37]. Though this is more of a risk score than a traditional frailty score, it nonetheless defined patient populations who are vulnerable in the treatment setting. Similar to other scores reported here, it was tested and validated in clinical trial populations but has since been replicated in a real-world setting [38].

Table 1 Clinical frailty/risk scores in myeloma.

A number of issues remain for clinical frailty assessment developments in MM before we can confidently rely on these. Firstly, there is the role of chronological age in scoring systems. Using age as a weighted factor is important but, to date, this has involved categorizing what is in essence a continuous variable. This can leave individual patients switching from fit to unfit, or unfit to frail, overnight as their age changes, which seems counter-intuitive when the frailty scores are, after all, measuring biological not chronological vulnerability. In addition, categorization can lead to an inflated type-1 statistical error, which increases the risk of a false-positive result in analysis [39]. As such, evolving our scores to use age as a continuous variable seems appropriate. Secondly, in the IMWG FS, two measures that reflect activity are included (ADL and IADL), which in themselves are time-consuming and prone to subjectivity. Many of the other scoring systems have incorporated WHO PS (and one used Karnofsky PS) in place of the IMWG FS measures of activity, often because the models were developed using retrospective data, from which ADL and IADL cannot be derived as they were not collected contemporaneously [14, 33, 37]. Alternative approaches to find substitutes for the ADL/IADL have been described recently: using the FIRST trial database, age, CCI and ECOG PS separated frail and fitter patients, which could define differences in outcomes (PFS, OS and treatment-related toxicities/endurance) [14]. Again, PS was used, but it is more subjective, is prone to intra- and inter-observer bias, and may not be able to detect the disease burden overlay arising from disease-related morbidity such as bone pain [40]. Given the limitations of these activity measures, we have an unmet need to define activity more accurately, and more specifically the inactivity that is at the core of frailty.
Given the development of wearable activity monitors, ironically most frequently used by younger, fitter individuals for sport and leisure, there is an opportunity to research the use of wearable devices to calibrate frailty susceptibility. One such study highlighted the use of stepping, walking and PAB parameters (sedentary and moderate-to-vigorous activity) in detecting pre-frailty [41]. Lastly, the use of the CCI has been core to many of the frailty scoring systems, though it was not designed to be utilised in this setting in MM [20, 33, 35, 42]. Furthermore, given the average age of patients with MM and their documented multi-morbidity, this clinical setting represents the epitome of cluster medicine (Fig. 2). As such, evolving the CCI to be more MM-specific may offer greater sensitivity when clinically assessing the impact of MM as a diagnosis as well as the risk of therapy intolerance. Engelhardt et al. devised a myeloma-specific comorbidity system that included 13 points of disease and organ dysfunction, in a similar design to the haematopoietic cell transplantation-specific comorbidity index [43], with the MCI performing best in the dataset tested [20]. Further work would be useful in this regard [35, 42].
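
The concern raised above about categorizing age can be illustrated with a small Python sketch that contrasts a banded age component with a continuous one on either side of a birthday. The age bands are the assumed IMWG FS bands (to be verified against [29]); the continuous term, its reference age of 75 years and its weight of 0.2 points per year are invented purely for illustration and do not come from any published model.

```python
import math


def categorical_age_points(age_years: float) -> int:
    """Banded age component: points jump at fixed birthdays (assumed bands)."""
    completed_years = math.floor(age_years)
    if completed_years <= 75:
        return 0
    if completed_years <= 80:
        return 1
    return 2


def continuous_age_term(
    age_years: float,
    ref_age: float = 75.0,         # hypothetical reference age
    weight_per_year: float = 0.2,  # hypothetical weight, not a fitted coefficient
) -> float:
    """Continuous age component: contribution grows smoothly with age."""
    return max(0.0, age_years - ref_age) * weight_per_year


if __name__ == "__main__":
    # Roughly the days either side of the 76th birthday.
    for age in (75.99, 76.0):
        print(
            f"age {age:.2f}: banded = {categorical_age_points(age)}, "
            f"continuous = {continuous_age_term(age):.3f}"
        )
```

In this sketch the banded component gains a whole point across the birthday, whereas the continuous term changes only marginally, which is the behaviour argued for when age is treated as a continuous variable.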

As the research evolves into frailty index mining, the key issues of validation before adoption are the need to compare with the accepted standard (IMWG FS), to test prognostication prospectively, and to calibrate in well-defined clinical trial populations before extending predictions to real-world data. Some limited comparative studies and systematic reviews have been performed to date [32, 44, 45]. Salazar et al. presented a critical evaluation in a systematic review and meta-analysis. They identified seven studies, of which they included only three in the meta-analysis component [32]. However, they did not compare the scoring systems with the IMWG FS; rather, they looked at the individual scoring system components and at the efficacy of using frailty scoring in MM. The review provides evidence that the use of such scoring systems is valid, but not which one to adopt in clinical practice. Isaacs et al. [44] compared the IMWG FS, the MCI and a cancer-based frailty deficit score (Carolina Frailty Score [46]) that had not previously been tested in MM. Though this is to be applauded, the number of patients involved is woefully inadequate to make a formal comparison valid. Formal validation of any score needs to include direct comparison with the IMWG FS to show that it provides at least the same level of discrimination (null hypothesis-driven) before formal testing in prospective clinical trials and real-world data. Only then can its adoption in the clinic be warranted. Whether such scores can influence how a clinician delivers therapy (predictive biomarker) needs to be formally tested prospectively in the clinical trial setting.

Recommendations

At this time, the IMWG FS remains the standard approach to defining frail and at-risk populations in MM. Any and all developments in this setting need to be prospectively validated against the IMWG FS. Work needs to be done on refining the suitability, reproducibility and practical use of these systems, as well as on prospective testing to define their prognostic biomarker potential. Only then can the predictive biomarker potential be tested in well-designed clinical trials. Alternative approaches that have been compared with and/or validated against the IMWG FS and include age, CCI, PS or other myeloma-appropriate risk factors could be clinically useful, but such scoring systems need to be robust, reproducible and easy to use in the clinic.

Biomarkers of frailty

Geroscience, the comprehensive multidisciplinary study of ageing and chronic disease in the older person, needs suitable and appropriate biomarkers to assess the ageing population and their risk of frailty syndrome (reviewed in [47]). This is especially important if the field is to expand into determining the efficacy of interventions to improve healthspan and/or lifespan in a more preventative than curative approach to medicine. In the context of MM, however, there are several issues. Firstly, we need to differentiate between biomarkers of ageing and those that reflect frailty. Much work is being done in this area (reviewed in [48]). Secondly, high-throughput tools such as transcriptomics, proteomics and metabolomics, whilst very important in advancing biological understanding, are not in themselves suitable biomarkers for everyday use. However, understanding of the genetic and metabolic pathways may lead to appropriate biomarkers whose value can be studied further in everyday practice [49]. Nonetheless, such biomarkers need to be not only measurable but also reliable, reproducible and feasible before validation in the clinical outcome setting. Thirdly, defining a biomarker as prognostic is not synonymous with prediction, which requires further study in the setting of biomarker-driven clinical interventional studies [50]. Lastly, the co-existing medical conditions that our patients have at diagnosis can present as frailty and have a significant impact on treatment delivery and success. However, given that there is likely to be a disease overlay (MM-related morbidity) on the fitness of patients with age-related frailty, we need a dynamic assessment that highlights changes in frailty as treatment for MM proceeds. The clinical scales highlighted above are used as a static baseline consideration and have yet to prove sensitive enough to be a dynamic measure of susceptibility.
As a consequence of these caveats, there is a need to develop and validate biomarkers of frailty in MM. The field continues to develop and is reviewed in [51]. It is worthy of note that, in developing biomarkers, either prognostic or predictive, we need to be cognisant of whether biomarkers of frailty correspond to clinical measures of frailty, how these biomarkers change with SACT, whether frailty biomarkers correlate with patient-reported outcomes, and whether such biomarkers can predict cancer-specific outcomes from SACT.

Several cellular pathways and biological functions lend themselves as potential areas for biomarker development. Of these, three in particular hold promise for the field of host response biology in MM: cellular senescence, inflammaging/immunosenescence and sarcopenia. Current biomarkers of frailty are listed in Table 2. Nearly 60 years ago, researchers identified the limited ability of human cells to divide, and in the intervening years, the cellular and molecular processes involved have been elucidated [52]. The current definition of senescence is characterized by three main features: arrested cell proliferation, resistance to apoptosis and the production of the senescence-associated secretory phenotype (SASP) [53]. Biomarkers of cellular senescence include markers of DNA damage (γH2AX, ATM, MDC1), telomere length and telomere dysfunction-induced foci (TIF), cell cycle arrest (p16INK4A, p53/p21 axis) and senescence-associated β-galactosidase (SA-βGal) [54, 55].

Table 2 Potential biomarkers of frailty.

Inflammaging is a term coined around 20 years ago to describe the accumulation of age-dependent inflammatory mediators in cells and tissues, resulting in low-grade, sterile, chronic inflammation that is associated with the development of the frailty syndrome [56, 57, 58]. The concept is that pro-inflammatory mediators such as cytokines and chemokines play an essential causative role in the adversity associated with ageing.

Senescence of the immune system (immunosenescence) is one of the causes of inflammaging. Accordingly, a SASP has been described, which includes pro-inflammatory mediators such as IL-6, IL-8, IL-1, TNFα, b-CHE, eHsp72, selenium and microRNAs (reviewed in [59]). In addition, alterations in immune cell subsets have been described, including Th17/TReg cell ratios, reduced recent thymic emigrants, CD8+CD28-KLRG-1+ quantitation and dysfunctional T-cell responses to TCR-mediated signals [60, 61]. The role of these immune component quantification studies and the relevance of measuring the SASP components in MM are yet to be determined, but they may represent peripheral blood-accessible biomarkers for frailty (Table 2).

In accordance with the definition of sarcopenia (see above) as updated by the European Working Group on Sarcopenia in Older People (EWGSOP2), diagnosis requires a combination of appendicular skeletal muscle mass measurement (kg/m²), muscle strength, usually defined by grip strength (kg or Newtons), and performance, most commonly defined in clinic as gait speed (m/s) or the timed “Up and Go” test [23, 62]. Muscle mass can be measured by several techniques, though none is ideal and all have major limitations. The most effective to date seems to be dual-energy X-ray absorptiometry (DXA), but CT (especially at the level of the third lumbar vertebra in cancer patients, including those with MM [63]) and more recently MRI scanning have been advocated, the latter being able to provide a calibration of muscle quality as well as mass [62, 64, 65]. One promising approach is the d3-creatine isotope dilution test, which has generated results that correlate better with sarcopenia than DXA [66]. In the MM context, early results from the HOVON 123 study have defined the predictive value of the IMWG FS and loss of muscle mass in frail patients, with the reported outcomes including treatment tolerability [67]. Interestingly, the authors report that low muscle mass, rather than muscle function, was associated with clinical outcomes.
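
The three-part assessment described above (strength, muscle mass and performance) can be sketched as a simple staging rule in Python. The sex-specific cut-offs and the "probable / confirmed / severe" staging are recalled from the EWGSOP2 consensus and are not given in this review, so they are assumptions to be checked against [23, 62]; the function and constant names are hypothetical.

```python
from typing import Literal

Sex = Literal["male", "female"]

# Assumed EWGSOP2 cut-offs (verify against the consensus paper before use).
GRIP_CUTOFF_KG = {"male": 27.0, "female": 16.0}        # low grip strength below this
ASM_INDEX_CUTOFF_KG_M2 = {"male": 7.0, "female": 5.5}  # low ASM/height^2 below this
GAIT_SPEED_CUTOFF_M_S = 0.8                            # low performance at or below this


def asm_index(asm_kg: float, height_m: float) -> float:
    """Appendicular skeletal muscle mass normalised to height squared (kg/m^2)."""
    return asm_kg / height_m ** 2


def sarcopenia_stage(
    sex: Sex,
    grip_kg: float,
    asm_kg: float,
    height_m: float,
    gait_speed_m_s: float,
) -> str:
    """Stage sarcopenia: strength first, then muscle mass, then performance."""
    if grip_kg >= GRIP_CUTOFF_KG[sex]:
        return "no sarcopenia"
    if asm_index(asm_kg, height_m) >= ASM_INDEX_CUTOFF_KG_M2[sex]:
        return "probable"   # low strength only
    if gait_speed_m_s > GAIT_SPEED_CUTOFF_M_S:
        return "confirmed"  # low strength and low muscle mass
    return "severe"         # low strength, low mass and low performance


if __name__ == "__main__":
    print(sarcopenia_stage("male", grip_kg=22.0, asm_kg=20.0, height_m=1.75, gait_speed_m_s=0.7))
```

A rule of this kind only encodes the published thresholds; it does not address the measurement limitations of DXA, CT or MRI discussed above, nor the finding that muscle mass and muscle function may relate differently to outcomes in MM.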

Blood-based biomarkers of sarcopenia are still in the developmental stages and have thus far proven more complicated to define than perhaps first thought, especially with regard to the relationship between adipocyte metabolism, turnover and senescence and sarcopenia. Potential biomarkers include myostatin and insulin-like growth factor 1, as well as markers of inflammation [68], but imaging currently offers the best measure of sarcopenia and outcomes in cancer patients.

Recommendations

Biomarker discovery to better define the frailty syndrome is evolving in internal medicine but is very much in its infancy in oncogeriatrics. In relation to MM, such biomarkers have yet to be systematically studied. As such, we recommend further study of their importance in clinically defined trial populations before any such biomarkers can be recommended for prognostic determination, let alone function as predictive biomarkers.

Validation of clinical scores and biomarkers

To date, the clinical frailty scores described above have largely been developed using either clinical trial datasets, for both hypothesis testing and model validation, or retrospective single-centre real-world populations. However, clinical trial datasets represent the perfect data in the imperfect population, whereas real-world data represent the imperfect data in the perfect patient with respect to how we deliver everyday care in the clinic. For example, Shah et al. found that, using common randomized controlled trial (RCT) eligibility criteria, only 60% of patients in the real world were eligible for participation in an RCT, in part because of renal function and lower hematopoietic reserve [69]. Patients ineligible for RCTs also had more advanced disease (Connect MM Registry ISS III 22.1% in RCT-eligible versus 40.1% in RCT-ineligible patients; p < 0.001). NCRI Myeloma XI, the largest upfront trial in newly diagnosed MM, had a median age of 74 in the TNE cohort, though only 13.2% were older than 80 years compared with the population rate of MM in the UK, where 54.8% are older than 80 years of age [7]. As such, it is clear that any clinical scoring system or biomarker, whether prognostic or predictive, needs to be evaluated in a real-world population if the true impact on everyday clinical practice is to be measured [20, 33, 35, 42, 70].

Future directions

It is clear from the current evidence and the stated unmet need that GA tools have important potential in assessing TNE MM patients and influencing healthcare delivery. However, which one represents the best and most responsive tool remains to be defined. Given the unmet need and the potential for the clinical utility of a GA tool to predict survivorship, the next step is to test whether such a tool can be used to direct treatment delivery. In this context, the UK Myeloma Research Alliance (UKMRA) has developed the Myeloma XIV: FITNESS study (NCT03720041). In this study, patients will be randomized to a treatment-adaptive arm, where therapy will be dose-reduced in accordance with the IMWG frailty score, or to a conventional treatment-reactive arm, where therapy will be modified in relation to toxicity and tolerance (https://clinicaltrials.gov/ct2/show/NCT03720041?cond=myeloma+XIV&draw=2&rank=1). The aim of the study is the development and assessment of a host biological scoring tool to predict treatment tolerance, prevent treatment discontinuation and reduce early death, as well as to define the impact on PFS and survivorship. Funded by Cancer Research UK, the trial opened for recruitment in March 2020. This is one approach to the use of GA tools: defining who will not benefit from standard of care. Another direction of clinical research is to highlight this population using GA tools and provide alternative treatment regimens and delivery. Some novel agents are more suitable for such frail populations. Furthermore, we need to study how dynamic GA tools are, how frailty may change over time (age-related vs disease-related frailty) and the impact of novel interventions that may be adjuvant to the delivery of primary anti-myeloma therapy.
There are currently seven other frailty-associated trials in MM listed on ClinicalTrials.gov, either recruiting or in set-up (https://clinicaltrials.gov/ct2/results?cond=Myeloma&term=frailty&cntry=&state=&city=&dist=), and key trials of interest are listed in Table 3.

Table 3 Selected prospective clinical trials in MM, adapted from [42].

The inclusion of clinical frailty scores in everyday practice is best facilitated through tumour boards, where case-by-case discussions about treatment pathways and management are held. However, at present, clinical frailty scores have been proven as prognostic biomarkers but have not yet been proven as predictive biomarkers. Only when they are defined as predictive biomarkers can clinical confidence be ensured in patient decision making, and the clinical trials outlined above aim to define this.

Conclusions

As the treatment landscape continues to evolve towards the application of precision medicine in MM, there is a clear need to take stock of the host response biology when designing therapeutic strategies to maximize efficacy whilst minimizing toxicity and ensuring the best possible treatment delivery. An integral aspect of this approach is to define the physiological age and capacity of patients with MM to deal with the burden of their disease and its treatment. Such assessments may include not only functional and clinical assessments but also laboratory-based biomarkers of frailty, aging and senescent cellular burden. Clinical screening scores need to be developed, tested and validated before their adoption into clinical practice. Ongoing research into potential biomarkers of susceptibility and frailty may yield more time-sensitive markers of frailty. Hopefully we are approaching a time when such measurements, once validated prospectively, will be used to direct safe, effective and personalized treatment.