The behavioral and psychological symptoms of dementia (BPSD) are challenging aspects of dementia care. This study used machine learning models to predict the occurrence of BPSD among community-dwelling older adults with dementia. We included 187 older adults with dementia for model training and 35 older adults with dementia for external validation. Demographic and health data and premorbid personality traits were examined at baseline, and actigraphy was utilized to monitor sleep and activity levels. A symptom diary tracked caregiver-perceived symptom triggers and the daily occurrence of 12 BPSD classified into seven subsyndromes. Four prediction models were employed: logistic regression, random forest, gradient boosting machine, and support vector machine. In external validation, the random forest models yielded the highest area under the receiver operating characteristic curve (AUC) values for hyperactivity, euphoria/elation, and appetite and eating disorders; the gradient boosting machine models for psychotic and affective symptoms; and the support vector machine model for sleep and nighttime behaviors. The gradient boosting machine model achieved the best performance in terms of average AUC scores across the seven subsyndromes. Caregiver-perceived triggers demonstrated higher feature importance values across the seven subsyndromes than other features. Our findings demonstrate the possibility of predicting BPSD using a machine learning approach.
The number of people living with dementia is estimated to be more than 50 million worldwide and, by 2050, is expected to triple as the population ages1. Behavioral and psychological symptoms of dementia (BPSD), also known as neuropsychiatric symptoms, are a heterogeneous group of non-cognitive symptoms and behaviors such as agitation, aggression, apathy, and depression that manifest in individuals diagnosed with dementia2. Almost 90% of individuals with dementia at all stages and etiologies are affected by these symptoms3. They are increasingly recognized as the most complex, challenging, and costly aspects of dementia care4, and result in decreased independence in activities of daily living and quality of life5, nursing home placement6, and healthcare utilization7 for individuals with dementia, as well as increased caregiver burden and depression8.
The factors that have been identified as contributing to BPSD can be categorized as (1) those related to individuals with dementia, such as dementia-related neurobiological factors, unmet needs, and premorbid personality; (2) those related to caregivers, such as communication approach; and (3) environmental triggers such as lack of stimuli and environmental change9. These factors cause symptoms independently or in combination with one another9. Disturbances in circadian rhythm, including impaired sleep and inadequate physical activity, have also been identified as stressors that can trigger BPSD10,11. Recent empirical research has identified an association between disruption of circadian rhythms and BPSD, including depression, anxiety, agitation12, and sundowning-related aggressive symptoms13. However, because systematic and continuous observation of sleep and activity levels is difficult with traditional methods that rely on retrospective caregiver reports, only a few methodologically rigorous studies have examined the influence of sleep and activity levels on such symptoms14.
Although a downward trajectory for the cognitive and functional decline over the course of dementia is expected, the manifestation of BPSD varies among individuals9. Current evidence suggests that person-centered non-pharmacological interventions that match the needs of persons with dementia and their abilities are a first-line treatment for managing BPSD. The most appropriate interventions for those with dementia should be selected individually after considering the causes and patterns of BPSD15. Predicting BPSD by identifying contributing factors and monitoring triggers is the first step in selecting and implementing individually customized non-pharmacological interventions to prevent and manage target symptoms15.
Machine learning and wearable technology have immense potential to overcome the methodological limitations of existing dementia research through the precise analysis of clinical information derived from digital devices16. Wearable technology such as actigraphy allows for continuous biometric monitoring, including levels of sleep and activity in everyday conditions, and for connecting them with clinical symptoms16,17. Machine learning facilitates the identification of underlying patterns and relationships between variables directly from data and the development of data-driven prediction models18. Although machine learning has been employed to develop predictive models for the incidence and detection of Alzheimer’s disease19, this analytic technique has rarely been applied to research on BPSD.
Hence, machine learning models are leveraged in this study to predict the occurrence of BPSD among community-dwelling older adults living with dementia based on various factors, including actigraphy-measured sleep and physical activity levels and diary-based caregiver-perceived symptom triggers.
This study utilized a prospective observational design with three-wave data collection to build predictive models for BPSD subsyndromes. The second wave of data collection was conducted after the first wave, with repeated measures from participants in the first wave who had agreed to participate in the second wave. A detailed description of the first and second waves of data collection is reported elsewhere20. In the third wave of data collection, a validation dataset was collected from new participants independently of the first and second wave data. We employed the first and second wave data for model training (i.e., the training dataset) and the third wave data for external validation (i.e., the test dataset).
We employed a standard data mining methodology that comprised four steps: (1) data acquisition, (2) data preprocessing (e.g., data cleaning, class imbalance handling, and dataset class optimization), (3) model learning, and (4) model evaluation21.
Recruitment and data collection
The first wave of data collection was conducted between June 2018 and June 2019. Eligible older adults with dementia living at home were recruited via on-site visits from outpatient neurological clinics at two tertiary hospitals and daycare centers in Seoul and the broader Gyeonggi region in Korea. The second wave of data collection, which involved first-wave participants who agreed to continue the study, was administered between July 2019 and June 2020. For external validation, eligible participants were recruited between July 2020 and November 2020 from an outpatient neurological clinic, where the first and second waves of data collection were conducted. The inclusion criteria applied to the three-wave data collection were (1) being at least 65 years old, (2) having a diagnosis of dementia, and (3) having a score of less than 24 on the Korean version of the Mini-Mental State Examination (K-MMSE)22.
Eligibility screening and data collection were performed by trained research staff.
After eligibility was established, trained research staff collected demographic and health data through interviews with family caregivers and older adults with dementia. Furthermore, chart reviews and standardized scales for physical, functional, and neuropsychological assessments were administered. Following the baseline assessment, the participants wore an actigraphy device on their wrists continuously for two weeks, and primary caregivers logged BPSD in the symptom diary daily for 14 consecutive days.
All procedures performed in this study involving human participants were in accordance with the ethical standards of the institutional or national research committee and with the Declaration of Helsinki (1964) and its later amendments or comparable ethical standards. Institutional review board approval was obtained from the Yonsei University Health System Severance Hospital (IRB 4-2018-0348, 4-2019-0314, 4-2020-0454) and Ilsan Hospital (IRB 2018-10-002-001, 2019-08-012-001). Legal representatives of all the participants provided written informed consent before enrollment after receiving a full explanation of the study procedures. The participants also provided verbal assent and written informed consent was obtained when possible.
Demographic and health data
At baseline, demographic and health data comprised age, sex, marital status, education level, dementia diagnosis, and neurological and psychiatric medications.
Cognitive and functional status
Scores (range 0–30) on the K-MMSE were used to assess cognitive functioning, with lower scores indicating greater cognitive deficits22. For the K-MMSE, Cronbach’s α was 0.91 in older Korean adults with dementia23. The severity of dementia was measured via the Korean version of the expanded Clinical Dementia Rating (CDR) scale, which assesses six functional domains—memory, orientation, judgment and problem-solving, community affairs, home and hobbies, and personal care24. The summed score ranged from 0 (none) to 5 (terminal dementia), and good inter-rater reliability for the overall CDR ratings was confirmed in Korean patients with dementia (kappa value range: 0.86–1.00)25. Functional independence was evaluated using the Korean version of Activities of Daily Living (K-ADL), which consists of seven items rated on a 3-point Likert-type scale, with higher scores indicating more severe levels of dependency26. The K-ADL was validated for older Korean adults with dementia, with good reliability (Cronbach’s α = 0.94)26.
The family caregiver informant-rated premorbid personality traits of older adults with dementia were assessed using the Korean version of the Big Five Inventory (BFI-K)27. The BFI-K comprises 15 items rated on a 5-point Likert-type scale that measure five domains of personality traits: openness, conscientiousness, neuroticism, extraversion, and agreeableness. The internal consistency of the BFI-K was good27, with Cronbach’s α ranging from 0.67 to 0.82.
Actigraphy data: nighttime sleep and physical activity
Older adults with dementia were fitted with a wrist-worn actigraphy device (ActiGraph wGT3X-BT, ActiGraph Corporation, Pensacola, FL, USA), which they wore all day for 14 consecutive days. The participants were instructed to remove the device when bathing or for a few minutes as needed. Previous validity studies have demonstrated that wrist actigraphy is a reliable and suitable method for objectively measuring sleep–wake cycles in older adults with dementia28,29. Raw acceleration data were collected along the three axes. We used ActiLife (version 6.13.3, Pensacola, FL, US) software to export the data and process raw acceleration data to sleep and physical activity parameters using vector magnitude count in 60-s epoch data (i.e., counts per minute). The vector magnitude is calculated as the square root of the sum of the squares of acceleration for each of the three axes (x, y, z). The Cole-Kripke algorithm was applied to score a one-minute epoch as asleep or awake30. Moreover, the previous night’s sleep parameters were employed to predict BPSD the following day. In this study, nighttime sleep was defined as the period between 20:00 (8:00 pm) and 08:00 (8:00 am). The following nighttime sleep parameters were generated: total sleep time, wake time after sleep onset, sleep efficiency, defined as the ratio of sleep duration over the assumed sleep period (total sleep time/[total sleep time + wake time after sleep onset] × 100), number of awakenings, and mean awake length (wake time after sleep onset/number of awakenings). The following physical activity parameters were also generated: energy expenditure (calories burned) in kcal per day, metabolic equivalents per day, total time spent in moderate-to-vigorous physical activity per day, percentage of time spent in moderate-to-vigorous physical activity per day, and the number of steps per day. 
We employed physical activity parameters measured the same day, which reflected the physical conditions during the day when BPSD occurred.
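As an illustration of the formulas above, the following minimal Python sketch computes the vector magnitude of one epoch and the nighttime sleep parameters from a list of per-minute sleep/wake scores. It is not the ActiLife implementation; the function names, the dictionary keys, and the choice of the first asleep epoch as sleep onset are assumptions made for this example, and the epochs are assumed to have been scored beforehand (e.g., by the Cole-Kripke algorithm).

```python
import math


def vector_magnitude(x, y, z):
    """Vector magnitude of one epoch: sqrt(x^2 + y^2 + z^2)."""
    return math.sqrt(x * x + y * y + z * z)


def sleep_parameters(asleep):
    """Summarize one night from per-minute sleep/wake scores.

    asleep: list of booleans, one per 60-s epoch (True = asleep), assumed to
    be already scored by a sleep/wake algorithm. Returns None if no epoch
    is scored as asleep.
    """
    if True not in asleep:
        return None
    # Assumption of this sketch: sleep onset is the first epoch scored asleep.
    after_onset = asleep[asleep.index(True):]
    total_sleep_time = sum(after_onset)          # minutes asleep
    waso = len(after_onset) - total_sleep_time   # wake time after sleep onset
    # Count awakenings as asleep -> awake transitions.
    awakenings = sum(
        1 for prev, cur in zip(after_onset, after_onset[1:]) if prev and not cur
    )
    return {
        "total_sleep_time_min": total_sleep_time,
        "waso_min": waso,
        # total sleep time / (total sleep time + WASO) x 100
        "sleep_efficiency_pct": total_sleep_time / (total_sleep_time + waso) * 100,
        "awakenings": awakenings,
        # WASO / number of awakenings (0.0 when there were no awakenings)
        "mean_awake_length_min": waso / awakenings if awakenings else 0.0,
    }
```

For example, a night with 100 min of sleep, a 5-min awakening, 60 more minutes of sleep, and 5 min awake at the end gives a total sleep time of 160 min, a wake time after sleep onset of 10 min, and two awakenings with a mean length of 5 min.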
Symptom diary data: BPSD and caregiver-perceived symptom triggers
A symptom diary that comprised a structured, easy-to-use checklist modeled on the Neuropsychiatric Inventory (NPI) was developed to assess the daily presence and severity of BPSD (i.e., delusions, hallucinations, agitation/aggression, depression/dysphoria, anxiety, elation/euphoria, apathy/indifference, disinhibition, irritability/lability, aberrant motor behaviors, sleep and nighttime behaviors, and appetite and eating disorders)31. It also included a checklist that assessed caregiver-perceived triggers of BPSD (i.e., hunger/thirst, urination/bowel movement, pain/discomfort, sleep disturbance, noise, light, temperature), interpersonal triggers (i.e., factors related to the person(s) who were present), and changes in the environment. If the perceived trigger could not be categorized under any of the listed options, caregivers were instructed to check “other causes” in the symptom diary and then list the factors. Family caregivers were instructed to check all options that were perceived as triggers of BPSD on the same day when the symptoms had occurred. The symptom diary was designed to overcome recall bias (e.g., the NPI is based on the caregiver’s two-week retrospective rating), enable daily monitoring of the occurrence of BPSD, and link symptoms to triggers daily32.
Recent studies have established that clustering several individual BPSD that are highly correlated and co-occur enhances the clinical utility of the assessment of BPSD, thus allowing for a more meaningful interpretation of the study findings and increasing power by raising the number of participants who endorsed the symptom cluster rather than the individual symptoms alone2,33,34. Based on previous NPI factor analysis studies, we clustered certain individual symptoms into three subsyndromes: psychotic symptoms (hallucination and delusion), affective symptoms (depression, anxiety, and apathy), and hyperactivity symptoms (agitation/aggression, disinhibition, and irritability)34,35,36,37. As prior studies have demonstrated that euphoria/elation, aberrant motor behaviors, sleep and nighttime behaviors, and appetite and eating disorders do not load into any clusters33,34,36,37,38, we analyzed them as individual subsyndromes consisting of only one symptom.
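The clustering described above can be sketched as a simple lookup from the 12 diary symptoms to the seven subsyndromes. The dictionary keys, the function name, and the rule that a subsyndrome counts as present on a day when any of its constituent symptoms is checked are assumptions of this illustration, not details confirmed by the study.

```python
# Mapping of the 12 diary symptoms to the seven subsyndromes.
SUBSYNDROMES = {
    "psychotic": ["hallucinations", "delusions"],
    "affective": ["depression/dysphoria", "anxiety", "apathy/indifference"],
    "hyperactivity": [
        "agitation/aggression",
        "disinhibition",
        "irritability/lability",
    ],
    # Symptoms that do not load into any cluster are kept on their own.
    "euphoria/elation": ["elation/euphoria"],
    "aberrant motor behaviors": ["aberrant motor behaviors"],
    "sleep and nighttime behaviors": ["sleep and nighttime behaviors"],
    "appetite and eating disorders": ["appetite and eating disorders"],
}


def daily_subsyndromes(diary_entry):
    """Convert one day's symptom checklist into subsyndrome labels.

    diary_entry: dict mapping symptom name -> bool (checked that day).
    Assumed rule: a subsyndrome is present if any of its symptoms is checked.
    """
    return {
        name: any(diary_entry.get(symptom, False) for symptom in symptoms)
        for name, symptoms in SUBSYNDROMES.items()
    }
```

Each of the seven resulting labels then serves as the binary outcome for one set of prediction models.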
Missing actigraphy data were encountered for two main reasons: the improper wearing of the device and lack of participant compliance (e.g., constant removal of the device or not wearing the device). The number of participants with missing actigraphy data was 81/225 (36%). The mean number of days per person with missing actigraphy data was 0.9. The occurrence rates of BPSD were similar regardless of missing actigraphy data (Supplementary Table 1). Therefore, multivariate imputation was applied using chained equations39 to address the missing actigraphy data. Before training the models, we applied min–max normalization for continuous features. For categorical features, target encoding was employed instead of one-hot encoding. Target encoding reduces feature dimensions by converting categorical features to numerical values derived from target variables, assuming that a categorical feature is related to the outcomes40. There was an issue of outcome class imbalance for BPSD subsyndromes. While 26.8% of the participants exhibited affective symptoms, only 4.4% and 4.7% exhibited aberrant motor behaviors and euphoria/elation, respectively (Table 2). Researchers in various disciplines have prioritized the class imbalance problem and suggested strategies to address the issues of imbalanced data sets41,42,43. This study applied a synthetic minority oversampling technique to address the outcome class imbalance issue44.
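To make the preprocessing steps concrete, the sketch below shows min–max normalization, target encoding, and the interpolation idea behind synthetic minority oversampling in plain Python. It is a simplified illustration under stated assumptions rather than the study's code: the function names and parameters are hypothetical, the oversampling function is a reduced stand-in for the published technique (it requires at least two minority rows), and the chained-equations imputation step is not shown.

```python
import random


def min_max_normalize(values):
    """Rescale a continuous feature to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def target_encode(categories, targets):
    """Replace each category with the mean binary outcome in that category."""
    sums, counts = {}, {}
    for c, t in zip(categories, targets):
        sums[c] = sums.get(c, 0) + t
        counts[c] = counts.get(c, 0) + 1
    return [sums[c] / counts[c] for c in categories]


def smote_like_samples(minority_rows, n_new, k=2, seed=0):
    """Create synthetic minority rows by interpolating toward near neighbors.

    Simplified stand-in for synthetic minority oversampling: each new row
    lies on the segment between a random minority row and one of its k
    nearest minority neighbors (squared Euclidean distance).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_rows)
        neighbors = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic
```

In practice, the encoding statistics and the oversampling would normally be computed on training data only, so that no information from held-out observations leaks into model fitting.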
Multiple machine learning approaches were selected for this study, including logistic regression, random forest45, gradient boosting machine46, and support vector machine47. We investigated each of these machine learning methods with a specific learning algorithm to gauge their effectiveness, and then selected the best-performing model that could predict each subsyndrome of BPSD18. Using logistic regression, the most common and well-established binary classifier48, as the baseline model, we evaluated the degree to which the machine learning models improved performance over the baseline model.
To avoid overfitting, hyperparameter tuning through random search49 was implemented with five-fold cross-validation for each machine learning method. Binary cross-entropy was employed as the evaluation criterion for five-fold cross-validation. The hyperparameters for tree complexity were considered for the random forest and gradient boosting machine models. The gradient boosting machine model was iteratively trained to minimize the loss function using stochastic gradient boosting. Thus, we also considered the learning rate and number of trees for the gradient boosting machine. Various kernel functions such as linear, polynomial, and radial basis kernels can be utilized for the support vector machine50. Linear kernels, rather than nonlinear kernels such as the radial basis function kernel, were used in this study to prevent overfitting in small datasets and to allow feature importance to be calculated. For the support vector machine models, only the regularization hyperparameter was tuned to determine the optimal model. All the selected hyperparameters are described in Supplementary Table 2. Feature importance analysis was performed to investigate the contribution of a range of features in predicting the seven subsyndromes of BPSD and to rank the top 10 influential features for prediction.
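The tuning procedure can be illustrated with a small, self-contained sketch of random search scored by the mean binary cross-entropy over five folds. The classifier here is a tiny gradient-descent logistic regression with an L2 penalty, used only as a stand-in so that the example runs without external libraries; the search range, the function names, and the default settings are assumptions and do not reproduce the hyperparameters in Supplementary Table 2.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)


def k_fold_indices(n, k=5, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_logistic(X, y, l2, lr=0.1, epochs=200):
    """Stand-in classifier: L2-penalized logistic regression."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gw = [g + err * xj for g, xj in zip(gw, xi)]
            gb += err
        w = [wj - lr * (g / n + l2 * wj) for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict_proba(model, X):
    w, b = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def random_search(X, y, n_iter=10, k=5, seed=0):
    """Return (best_l2, best_mean_loss) over randomly sampled L2 strengths."""
    rng = random.Random(seed)
    folds = k_fold_indices(len(X), k, seed)
    best = None
    for _ in range(n_iter):
        l2 = 10 ** rng.uniform(-4, 1)  # sample on a log scale (assumed range)
        losses = []
        for fold in folds:
            held = set(fold)
            train = [i for i in range(len(X)) if i not in held]
            model = fit_logistic([X[i] for i in train], [y[i] for i in train], l2)
            probs = predict_proba(model, [X[i] for i in fold])
            losses.append(binary_cross_entropy([y[i] for i in fold], probs))
        mean_loss = sum(losses) / k
        if best is None or mean_loss < best[1]:
            best = (l2, mean_loss)
    return best
```

The same loop applies to any classifier: only the sampled hyperparameters (for example, tree depth, learning rate, or the regularization strength of a linear support vector machine) and the fitting function change.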
Categorical variables are summarized as the number of participants with percentages and continuous variables as means with standard deviations. Two-sample independent t-tests and Fisher’s exact tests were used to compare continuous and categorical variables, respectively, between the training and test datasets. The performances of the prediction models were compared and evaluated using several indices: accuracy, precision, sensitivity (recall), specificity, F1 score, and area under the receiver operating characteristic curve (AUC).
Statistical significance was set at p < 0.05, and all analyses were performed using R, version 3.6.3 (R Foundation for Statistical Computing, Vienna, Austria) and Python, version 3.7 (Python Software Foundation, Wilmington, USA).
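For reference, the performance indices listed above can be computed from binary labels and predictions as in the following sketch. The function names are illustrative, and the AUC is calculated through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half), which is equivalent to the area under the ROC curve.

```python
def _ratio(numerator, denominator):
    """Division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity, specificity, and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)  # also called recall
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def auc(y_true, scores):
    """AUC as the share of positive-negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")  # undefined when one class is absent
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Under this interpretation, an AUC close to 0.5 corresponds to chance-level ranking, which is useful to keep in mind for outcomes with very few positive days.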
Table 1 presents the participant characteristics of both the training (N = 187 older adults with dementia) and test sets (N = 35 older adults with dementia) at baseline. The mean age of the participants was 80.4 years (SD 7.4) for the training dataset and 80.7 (SD 5.6) for the test dataset. The majority of participants were women (59% for the training dataset; 63% for the test dataset), married (62% for the training dataset; 63% for the test dataset), and with an educational level of middle school or below (57% for both the training and the test datasets). Table 1 summarizes the prevalence of BPSD among the participants.
Summary of actigraphy and symptom diary data
Symptom diary and actigraphy data were collected over an average of 13.30 days (SD 2.08) per person (training set: mean [SD] = 13.35 [2.11]; test set: mean [SD] = 13.00 [1.91]).
Affective symptoms were the subsyndromes of BPSD that occurred most frequently (training set: 26.8%; test set: 17.1%), followed by hyperactivity symptoms (training set: 17.8%; test set: 12.7%). Aberrant motor behaviors, euphoria/elation, and appetite and eating disorders had lower frequency rates for both the training and test datasets. The average total sleep times per day were 7.7 h (for the training dataset) and 6.3 h (for the test dataset), and the average total nighttime sleep times per day were 5.9 h (training dataset) and 5.1 h (test dataset). The average energy expenditure was 459.9 kcal/day (training dataset) and 442.2 kcal/day (test dataset). The participants averaged 5908.7 steps (training dataset) and 5750.5 steps (test dataset) daily. Sleep disturbance (training dataset: 13.5%; test dataset: 8.4%) and interpersonal triggers (training dataset: 8.5%; test dataset: 10.8%) were the most frequently reported triggers. Table 2 provides the univariate analyses of actigraphy and symptom diary data.
Performance comparison for prediction models
Table 3 presents the prediction performance of all prediction models based on the training dataset with five-fold cross-validation. Gradient boosting machine models showed higher AUC values than the other prediction models for predicting hyperactivity (0.706), affective symptoms (0.747), and appetite and eating disorders (0.816). While the support vector machine model demonstrated the highest AUC value (0.706) for psychotic symptoms, the random forest model exhibited the highest AUC value (0.942) for sleep and nighttime behaviors. The logistic regression models yielded the highest AUC values for aberrant motor behaviors (0.822) and euphoria/elation (0.696).
Table 4 presents the prediction performance of all prediction models based on the test dataset, and Supplementary Fig. 1 illustrates the receiver operating characteristics and precision-recall curves for the test dataset. Compared with the logistic regression models, the machine learning models revealed better performance for all seven subsyndromes. Specifically, the random forest and gradient boosting machine models performed better than the logistic regression and support vector machine models for most subsyndromes. The random forest models exhibited higher AUC values than the other prediction models for predicting hyperactivity (0.835), euphoria/elation (0.968), and appetite and eating disorders (0.888). The gradient boosting machine models presented higher AUC values than the other prediction models for predicting psychotic symptoms (0.801), affective symptoms (0.936), and aberrant motor behaviors (0.498). Moreover, the support vector machine model showed the highest AUC value (0.929) for sleep and nighttime behavior.
The gradient boosting machine model achieved the best performance in terms of the average AUC scores across the seven subsyndromes for both the training and test datasets. Tables 3 and 4 present the findings for the other performance indices.
The top 10 most significant features of the gradient boosting machine models, which achieved the best performance in terms of average AUC scores across the seven subsyndromes, were ranked using the permutation feature importance method. Figure 1 illustrates the relative importance of the predictors included in the seven subsyndromes. Caregiver-perceived triggers reported in the symptom diary, including interpersonal triggers, sleep disturbance, urination/bowel movement, and pain/discomfort, exhibited higher feature importance values than the other features across the seven subsyndromes. The CDR score was the most influential feature for psychotic symptoms and ranked in the top 10 for hyperactivity, euphoria/elation, and appetite and eating disorders. The features for sleep and activity levels, such as waking after sleep onset at night and percentage of moderate-to-vigorous physical activity per day, were in the top ranks of feature importance for aberrant motor behaviors and sleep and nighttime behaviors. Except for two symptoms (aberrant motor behaviors and euphoria/elation), premorbid personality types were among the top 10 influential features.
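The permutation feature importance method mentioned above can be sketched as follows: each feature column is shuffled in turn, and the resulting drop in a performance score is taken as that feature's importance. This is a generic illustration, not the study's implementation; the function names, the use of accuracy as the score, and the number of repeats are assumptions, and rows are assumed to be Python lists.

```python
import random


def accuracy(y_true, y_pred):
    """Share of predictions that match the labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def permutation_importance(predict_fn, X, y, metric_fn, n_repeats=5, seed=0):
    """Mean drop in the metric after shuffling each feature column.

    predict_fn: callable taking a list of rows and returning predictions.
    X: list of rows (each row a list of feature values).
    """
    rng = random.Random(seed)
    baseline = metric_fn(y, predict_fn(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - metric_fn(y, predict_fn(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature that the model never uses receives an importance of exactly zero, whereas shuffling a feature the model depends on lowers the score; ranking features by this drop gives a top-10 list of the kind reported in Figure 1.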
This study constructed predictive models for seven subsyndromes of BPSD, with overall good prediction accuracy, through a supervised machine learning technique based on actigraphy measures of sleep and activity levels, symptom diary entries of caregiver-perceived symptom triggers, standardized measures of cognitive and functional status and premorbid personality type, and a medical chart review. Through the feature importance approach, we also identified the relative importance of the features in predicting each subsyndrome. The machine learning algorithms developed and validated in our study can inform timely and appropriate preventative approaches that identify individuals at high risk for specific subsyndromes of BPSD and thus support customized interventions that address the underlying causes and triggers of the target symptoms.
Although the best-performing machine learning models varied among the subsyndromes and across evaluation metrics, our findings demonstrated the feasibility and usefulness of the machine learning approach for predicting BPSD by accounting for multifaceted factors. Although several studies have applied machine learning algorithms with a focus on predicting the future onset of or detecting undiagnosed dementia19,51, few have used machine learning models to predict the various symptoms observed in older adults with dementia. A recent study employed the deep learning approach to forecast agitation episodes based on environmental stimuli data, including ambient noise level, room temperature, and relative humidity, that were collected up to 30 min before the occurrence of an actual episode52. While it demonstrated the feasibility and efficacy of deep learning models in predicting agitation episodes (accuracy: 98.6%; sensitivity: 84.8%), the models developed in our study were more comprehensive and practical because they examined diverse BPSD, which often co-occur in real care settings. In addition, a machine learning approach can account for a range of multifaceted contributing factors, including actigraphy data and symptom diary data measured over time, which might not have been feasible in conventional statistical modeling owing to highly complex relationships among the variables. In recent years, big healthcare data have emerged along with advances in assistive technologies, including wearable and mobile technologies in healthcare for older adults53, which has increased the number of candidate features54. Machine learning is a promising tool for predicting BPSD arising from interactions among a web of personal, interpersonal, and environmental factors captured by various digital health technologies54.
Moreover, machine learning algorithms that predict BPSD subsyndromes can be incorporated into a mobile app to support healthcare providers’ decision-making in selecting individually customized non-pharmacological interventions that address the underlying causes of the target symptoms54.
Our results concerning feature importance suggested that not all features contributed consistently to the prediction of each subsyndrome. Caregiver-perceived symptom triggers, including interpersonal triggers, sleep disturbance, and pain/discomfort, were consistently ranked within the five most influential features across the seven subsyndromes. The features for sleep and activity levels, such as waking after sleep onset at night and percentage of moderate-to-vigorous physical activity per day, ranked highly in terms of feature importance for aberrant motor behaviors and sleep and nighttime behaviors. Age, severity of dementia, cognitive function, education level, and premorbid personality type were also ranked among the 10 most influential features. While activity data were relatively less influential than other features, metabolic equivalents per day were ranked in the top 10 important features for sleep and nighttime behaviors, as were wake after sleep onset at night and the percentage of moderate-to-vigorous physical activity per day for aberrant motor behaviors. The clinical implication of these results is that healthcare providers and caregivers need to consider BPSD as heterogeneous, with different important predictors for each subsyndrome. Hence, effective interventions to prevent and manage target predictors and symptoms should be employed.
While caregiver-perceived symptom triggers evaluated by the symptom diary were among the most influential features, activity data were less influential than expected for most subsyndromes in our study, except for aberrant motor behaviors. This notion is somewhat consistent with Valembois et al.’s observational study55, which investigated the association between motor activity and sleep duration measured by actigraphy with different types of BPSD. They found a significant increase in motor activity among those with aberrant motor behavior between 21:00 (9:00 pm) and midnight. They explained the phenomenon in terms of the sundown syndrome, referring to an increase in BPSD from late afternoon to night; otherwise, no relationship between sleep duration or night awakening episodes and any type of BPSD was found55. Another study utilized the machine learning approach to predict mobility, cognitive, and depressive symptoms related to Alzheimer’s disease from activity-aware smart home behavior data (e.g., activities of daily living, sleep, outings, and global routines)56. While statistically significant predictors were observed for mobility and cognitive symptoms, depression measured by the Geriatric Depression Scale (GDS) was weakly correlated with the global set of smart home behavioral data56. This outcome somewhat aligns with our finding that affective symptoms were predictable using machine learning models, but its classification (occurrence vs. absence) was mostly not influenced by actigraphy data. Additionally, the inclusion of a variety of predictors in the machine learning model might have decreased the importance of features derived from actigraphy data in our study. For clinical implications, a comprehensive monitoring system that utilizes a symptom diary written by direct caregivers, in addition to wearable sensor technologies, is required for accurate prediction.
A strength of our study is the inclusion of a variety of input data, objectively measured by actigraphy, and caregiver-perceived symptom triggers assessed using a symptom diary to develop the most powerful predictive machine learning models. Furthermore, external validation was performed using an independent test dataset to prevent overestimation of the results57,58. Consistent results from the training and test datasets that differed substantially in terms of severity of functional impairment, characteristics of sleep and activity levels, caregiver-perceived symptom triggers, and frequency of occurrence of symptoms can be regarded as confirmation of generalizability58. Finally, the symptom diary, from which the most influential features were derived, was easy to measure daily in real life. The feature importance results will facilitate the development of a digital diary that tracks BPSD using devices such as smartphones and tablets. A digital diary enables caregivers to log the symptom manifestation and circumstances, including diverse triggers, in real time, and the accumulated data can be analyzed to provide an individualized approach to symptom management.
Our study had certain limitations. First, the symptom diary was written daily, and caregivers checked one or more types of symptoms that were observed and one or more factors that were perceived as triggers for the symptoms each day. Hence, we could not disentangle which caregiver-perceived trigger contributed to which type of individual symptom observed that day. Future research needs to investigate the link between caregiver-perceived triggers and specific symptoms on an episodic rather than a daily basis. Episodic prediction algorithms can provide more precise and accurate information regarding specific symptoms. Second, the prediction performances for certain subsyndromes were poor, particularly in the case of aberrant motor behaviors (AUC = 0.498 in the gradient boosting machine model). The findings regarding the prediction performance need to be interpreted cautiously considering the small sample size of the test dataset. The frequency of occurrence of certain symptoms, particularly aberrant motor behaviors and euphoria/elation, was very low (e.g., 2/871 days of aberrant motor behavior). The overall small sample size of the test dataset might lead to lower power for pattern recognition59. Future replication studies with larger sample sizes are needed to obtain more accurate predictions for the subsyndromes that were less observed in our study and to prevent model overfitting and biased machine learning performance59. Third, as it was challenging to collect accurate data on comorbid conditions for older adults who were recruited from sites other than the outpatient neurological clinics (i.e., daycare centers), we did not account for any comorbidities that might have affected their BPSD. Additionally, caregivers’ daily written reports were used rather than direct observation of BPSD, which could result in observer and recall bias.
While dementia care has relied on caregivers’ subjective assessment reports in most cases4, future studies should employ technologies such as wearable sensors, non-wearable motion sensors, or assistive and smart home technologies to objectively monitor BPSD in real time, which could advance precision dementia care60.
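The caution about rarely observed subsyndromes can be illustrated numerically. The sketch below is not part of the study's analysis. It computes a rank-based AUC and the Hanley and McNeil (1982) approximation of its standard error, assuming that the 2 of 871 symptom days reported for aberrant motor behavior are the positives in the evaluation data and taking the reported AUC of 0.498 as the point estimate. The comparison case with 200 positive days is hypothetical.

```python
import math


def auc_from_scores(pos_scores, neg_scores):
    """Rank-based AUC: share of positive/negative pairs ordered correctly (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil, 1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


# Assumption: 2 positive days and 869 negative days in the evaluation data.
se_rare = hanley_mcneil_se(0.498, n_pos=2, n_neg=869)
# Hypothetical comparison: same total number of days, 200 of them positive.
se_common = hanley_mcneil_se(0.498, n_pos=200, n_neg=671)
print(f"SE with 2 positive days:   {se_rare:.3f}")
print(f"SE with 200 positive days: {se_common:.3f}")

# With only two positives, a single day's ranking moves the AUC a long way.
negatives = [0.5] * 10
auc_both_high = auc_from_scores([0.9, 0.8], negatives)
auc_one_low = auc_from_scores([0.9, 0.0], negatives)
print(auc_both_high, auc_one_low)
```

Under these assumptions the standard error with two positive days is roughly 0.2, so an AUC near 0.5 is compatible with a wide range of true discriminative performance.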
This study developed and validated prediction models for BPSD subsyndromes using a machine learning approach. Overall, four classification models were trained to identify the optimal prediction model for each subsyndrome. To our knowledge, this study is the first to employ machine learning to predict BPSD using a wide variety of data, including actigraphy data. The results suggest that machine learning-based prediction models can classify the manifestations of BPSD subsyndromes. This study also identified influential predictors for specific subsyndromes that can be used to prevent and manage target symptoms. Based on these outcomes, such algorithms can be applied clinically to monitor and anticipate BPSD subsyndromes. Person-centered dementia care can be delivered through the early prediction of target subsyndromes and the provision of individually tailored non-pharmacological interventions that address the underlying causes of BPSD. Machine learning algorithms can also be embedded into smartphone applications to increase their clinical utility. Accordingly, this study is a first step toward personalized care for BPSD management using digital health technologies.
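The model-selection step summarized above, in which four classifiers are trained and compared by AUC on held-out data, can be sketched as follows. This is a minimal illustration using scikit-learn on synthetic data, not the authors' pipeline: the features, sample sizes, default hyperparameters, and the function name `compare_models` are all assumptions, and steps such as imputation, resampling for class imbalance, and hyperparameter search are omitted. In practice the comparison would be repeated once per subsyndrome.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(X_train, y_train, X_test, y_test, seed=0):
    """Fit four classifiers and return their held-out AUC values by name."""
    models = {
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "gradient_boosting": GradientBoostingClassifier(random_state=seed),
        "support_vector_machine": make_pipeline(
            StandardScaler(), SVC(probability=True, random_state=seed)
        ),
    }
    aucs = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Probability of the positive class (symptom present) for each test row.
        scores = model.predict_proba(X_test)[:, 1]
        aucs[name] = roc_auc_score(y_test, scores)
    return aucs


# Synthetic stand-in for one subsyndrome: imbalanced binary outcome, 12 features.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=5, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
aucs = compare_models(X_train, y_train, X_test, y_test)
best_model = max(aucs, key=aucs.get)
for name, value in aucs.items():
    print(f"{name}: AUC = {value:.3f}")
print("best:", best_model)
```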
The datasets generated and/or analysed during the current study are not publicly available due to the data security requirements of the hospital but are available from the corresponding author on reasonable request.
Nichols, E. et al. Estimation of the global prevalence of dementia in 2019 and forecasted prevalence in 2050: An analysis for the Global Burden of Disease Study 2019. Lancet Public Health 7(2), e105–e125. https://doi.org/10.1016/S2468-2667(21)00249-8 (2022).
Cerejeira, J., Lagarto, L. & Mukaetova-Ladinska, E. B. Behavioral and psychological symptoms of dementia. Front. Neurol. 3, 73. https://doi.org/10.3389/fneur.2012.00073 (2012).
Ballard, C. & Corbett, A. Management of neuropsychiatric symptoms in people with dementia. CNS Drugs 24, 729–739. https://doi.org/10.2165/11319240-000000000-00000 (2010).
Kales, H. C., Kern, V., Kim, H. M. & Blazek, M. C. Moving evidence-informed assessment and management of behavioral and psychological symptoms of dementia into the real world: Training family and staff caregivers in the DICE Approach. Am. J. Geriatr. Psychiatry 28, 1248–1255. https://doi.org/10.1016/j.jagp.2020.08.008 (2020).
Dyer, S. M., Harrison, S. L., Laver, K., Whitehead, C. & Crotty, M. An overview of systematic reviews of pharmacological and non-pharmacological interventions for the treatment of behavioral and psychological symptoms of dementia. Int. Psychogeriatr. 30, 295–309. https://doi.org/10.1017/S1041610217002344 (2018).
Toot, S., Swinson, T., Devine, M., Challis, D. & Orrell, M. Causes of nursing home placement for older people with dementia: A systematic review and meta-analysis. Int. Psychogeriatr. 29, 195–208. https://doi.org/10.1017/S1041610216001654 (2017).
Herrmann, N. et al. The contribution of neuropsychiatric symptoms to the cost of dementia care. Int. J. Geriatr. Psychiatry 21, 972–976. https://doi.org/10.1002/gps.1594 (2006).
Baharudin, A. D., Din, N. C., Subramaniam, P. & Razali, R. The associations between behavioral-psychological symptoms of dementia (BPSD) and coping strategy, burden of care and personality style among low-income caregivers of patients with dementia. BMC Public Health 19, 447. https://doi.org/10.1186/s12889-019-6868-0 (2019).
Kales, H. C., Gitlin, L. N. & Lyketsos, C. G. Assessment and management of behavioral and psychological symptoms of dementia. BMJ 350, h369. https://doi.org/10.1136/bmj.h369 (2015).
Richards, K. C. & Beck, C. K. Progressively lowered stress threshold model: Understanding behavioral symptoms of dementia. J. Am. Geriatr. Soc. 52, 1774–1775. https://doi.org/10.1111/j.1532-5415.2004.52477.x (2004).
Volicer, L. & Hurley, A. C. Management of behavioral symptoms in progressive degenerative dementias. J. Gerontol. A Biol. Sci. Med. Sci. 58, 837–845. https://doi.org/10.1093/gerona/58.9.M837 (2003).
Kolberg, E. et al. The effects of bright light treatment on affective symptoms in people with dementia: A 24-week cluster randomized controlled trial. BMC Psychiatry 21, 377. https://doi.org/10.1186/s12888-021-03376-y (2021).
Todd, W. D. Potential pathways for circadian dysfunction and sundowning-related behavioral aggression in Alzheimer’s disease and related dementias. Front. Neurosci. 14, 910. https://doi.org/10.3389/fnins.2020.00910 (2020).
Hjetland, G. J. et al. An actigraphy-based validation study of the Sleep Disorder Inventory in the nursing home. Front. Psychiatry 11, 173. https://doi.org/10.3389/fpsyt.2020.00173 (2020).
Scales, K., Zimmerman, S. & Miller, S. J. Evidence-based nonpharmacological practices to address behavioral and psychological symptoms of dementia. Gerontologist 58, S88–S102. https://doi.org/10.1093/geront/gnx167 (2018).
Cho, C. H. et al. Mood prediction of patients with mood disorders by machine learning using passive digital phenotypes based on the Circadian Rhythm: Prospective observational cohort study. J. Med. Internet Res. 21, e11029. https://doi.org/10.2196/11029 (2019).
Asgari Mehrabadi, M. et al. Sleep tracking of a commercially available smart ring and smartwatch against medical-grade actigraphy in everyday settings: Instrument validation study. JMIR Mhealth Uhealth 8, e20465. https://doi.org/10.2196/20465 (2020).
Mufti, H. N., Hirsch, G. M., Abidi, S. R. & Abidi, S. S. R. Exploiting machine learning algorithms and methods for the prediction of agitated delirium after cardiac surgery: Models development and validation study. JMIR Med. Inform. 7, e14993. https://doi.org/10.2196/14993 (2019).
Escudero, J., Zajicek, J. P. & Ifeachor, E. Early detection and characterization of Alzheimer’s disease in clinical scenarios using Bioprofile concepts and K-means. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc. 2011, 6470–6473. https://doi.org/10.1109/IEMBS.2011.6091597 (2011).
Cho, E. et al. Factors associated with behavioral and psychological symptoms of dementia: Prospective observational study using actigraphy. J. Med. Internet Res. 23, e29001. https://doi.org/10.2196/29001 (2021).
Luo, W. et al. Guidelines for developing and reporting machine learning predictive models in biomedical research: A multidisciplinary view. J. Med. Internet Res. 18, e323. https://doi.org/10.2196/jmir.5870 (2016).
Kang, Y., Na, D. L. & Hahn, S. A validity study on the Korean Mini-Mental State Examination (K-MMSE) in dementia patients. J. Korean Neurol. Assoc. 15, 300–308 (1997).
Song, J., Park, J. & Kim, H. Impact of behavioral and psychological symptoms of dementia on caregiver burden in nursing homes. J. Korean Gerontol. Nurs. 15, 62–74 (2013).
Hughes, C. P., Berg, L., Danziger, W. L., Coben, L. A. & Martin, R. L. A new clinical scale for the staging of dementia. Br. J. Psychiatry 140, 566–572. https://doi.org/10.1192/bjp.140.6.566 (1982).
Choi, S. H. et al. Estimating the validity of the Korean version of expanded Clinical Dementia Rating (CDR) Scale. J. Korean Neurol. Assoc. 19, 585–591 (2001).
Won, C. W., Rho, Y. G., Kim, S. Y., Cho, B. R. & Lee, Y. S. The validity and reliability of Korean Activities of Daily Living (K-ADL) Scale. J. Korean Geriatr. Soc. 6, 98–106 (2002).
Kim, J.-H., Kim, B.-H. & Ha, M.-S. Validation of a Korean version of the Big Five Inventory. J. Hum. Understand. Couns. 32, 47–65 (2011).
Camargos, E. F., Louzada, F. M. & Nóbrega, O. T. Wrist actigraphy for measuring sleep in intervention studies with Alzheimer’s disease patients: Application, usefulness, and challenges. Sleep Med. Rev. 17, 475–488. https://doi.org/10.1016/j.smrv.2013.01.006 (2013).
Figueiro, M. G. et al. Tailored lighting intervention for persons with dementia and caregivers living at home. Sleep Health 1, 322–330. https://doi.org/10.1016/j.sleh.2015.09.003 (2015).
Cole, R. J., Kripke, D. F., Gruen, W., Mullaney, D. J. & Gillin, J. C. Automatic sleep/wake identification from wrist activity. Sleep 15, 461–469. https://doi.org/10.1093/sleep/15.5.461 (1992).
Cummings, J. L. The Neuropsychiatric Inventory: Assessing psychopathology in dementia patients. Neurology 48, S10–S16. https://doi.org/10.1212/WNL.48.5_Suppl_6.10S (1997).
Morganti, F., Soli, A., Savoldelli, P. & Belotti, G. The Neuropsychiatric Inventory-Diary Rating Scale (NPI-Diary) is a method for improving stability in assessing neuropsychiatric symptoms in dementia. Dement. Geriatr. Cogn. Disord. Extra 8, 306–320. https://doi.org/10.1159/000490380 (2018).
Liew, T. M. Symptom clusters of neuropsychiatric symptoms in mild cognitive impairment and their comparative risks of dementia: A cohort study of 8530 older adults. J. Am. Med. Dir. Assoc. 20, 1054.e1–1054.e9. https://doi.org/10.1016/j.jamda.2019.02.012 (2019).
Liew, T. M. Neuropsychiatric symptoms in cognitively normal older persons, and the association with Alzheimer’s and non-Alzheimer’s dementia. Alzheimers Res. Ther. 12, 35. https://doi.org/10.1186/s13195-020-00604-7 (2020).
Aalten, P. et al. Behavioral problems in dementia: Factor analysis of the neuropsychiatric inventory. Dement. Geriatr. Cogn. Disord. 15, 99–105. https://doi.org/10.1159/000067972 (2003).
van der Linde, R. M. et al. Longitudinal course of behavioral and psychological symptoms of dementia: A systematic review. Br. J. Psychiatry 209, 366–377. https://doi.org/10.1192/bjp.bp.114.148403 (2016).
van der Linde, R. M., Dening, T., Matthews, F. E. & Brayne, C. Grouping of behavioral and psychological symptoms of dementia. Int. J. Geriatr. Psychiatry 29, 562–568. https://doi.org/10.1002/gps.4037 (2014).
Canevelli, M. et al. Behavioral and psychological subsyndromes in Alzheimer’s disease using Neuropsychiatric Inventory. Int. J. Geriatr. Psychiatry 28, 795–803. https://doi.org/10.1002/gps.3904 (2013).
Buuren, S. V. & Oudshoorn, C. G. M. Multivariate imputation by chained equations (MICE) V1.0 User’s manual; https://stefvanbuuren.name/publications/MICE%20V1.0%20Manual%20TNO00038%202000.pdf (2000).
Micci-Barreca, D. Preprocessing scheme for high-cardinality categorical attributes in classification and prediction problems. ACM SIGKDD Explor. Newsl. 3, 27–32. https://doi.org/10.1145/507533.507538 (2001).
Choi, S.-H. Estimation of the validity of the Korean version of the Expanded Clinical Dementia Rating (CDR) scale. J. Korean Neurol. Assoc. 19, 585–591 (2001).
Japkowicz, N. Class imbalance problem: Significance and strategies. Proceedings of the International Conference on Artificial Intelligence (2000).
Japkowicz, N. Class imbalances: We focus on the correct issue. Workshop on Learning from Imbalanced Dataset II (2003).
Chawla, N. V., Bowyer, K. W., Hall, L. O. & Kegelmeyer, W. P. SMOTE: Synthetic minority oversampling technique. J. Artif. Intell. Res. 16, 321–357. https://doi.org/10.1613/jair.953 (2002).
Breiman, L. Random forests. Mach. Learn. 45, 5–32. https://doi.org/10.1023/A:1010933404324 (2001).
Friedman, J. H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 29, 1189–1232. https://doi.org/10.1214/aos/1013203451 (2001).
Suthaharan, S. Support vector machine. In Machine learning models and algorithms for big data classification. Integrated Series in Information Systems Vol. 36 207–235 (Springer, 2016). https://doi.org/10.1007/978-1-4899-7641-3_9.
Hyun, S., Moffatt-Bruce, S., Cooper, C., Hixon, B. & Kaewprag, P. Prediction model for hospital-acquired pressure ulcer development: Retrospective cohort study. JMIR Med. Inform. 7, e13785. https://doi.org/10.2196/13785 (2019).
Bergstra, J. & Bengio, Y. Random search for hyperparameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012).
Maji, S., Berg, A. C. & Malik, J. Efficient classification for additive kernel SVMs. IEEE Trans. Pattern. Anal. Mach. Intell. 35, 66–77. https://doi.org/10.1109/TPAMI.2012.62 (2013).
Hane, C. A., Nori, V. S., Crown, W. H., Sanghavi, D. M. & Bleicher, P. Predicting onset of dementia using daily notes and machine learning: A case-control study. JMIR Med. Inform. 8, e17819. https://doi.org/10.2196/17819 (2020).
HekmatiAthar, S., Goins, H., Samuel, R., Byfield, G. & Anwar, M. Data-driven forecasting of agitation for persons with dementia: A deep learning-based approach. SN Comput. Sci. 2, 326. https://doi.org/10.1007/s42979-021-00708-3 (2021).
Sapci, A. H. & Sapci, H. A. Innovative assisted living tools, remote monitoring technologies, artificial intelligence-driven solutions, and robotic systems for aging societies: A systematic review. JMIR Aging 2, e15429. https://doi.org/10.2196/15429 (2019).
Verdonk, C., Verdonk, F. & Dreyfus, G. How can machine learning be used in clinical practice during an epidemic?. Crit. Care 24, 265. https://doi.org/10.1186/s13054-020-02962-y (2020).
Valembois, L. et al. Wrist actigraphy: A simple way to record motor activity in elderly patients with dementia, apathy, or aberrant motor behavior. J. Nutr. Health Aging 19, 759–764. https://doi.org/10.1007/s12603-015-0530-z (2015).
Alberdi, A. et al. Smart home-based prediction of multi-domain symptoms related to Alzheimer’s disease. IEEE J. Biomed. Health Inform. 22, 1720–1731. https://doi.org/10.1109/JBHI.2018.2798062 (2018).
Toll, D. B., Janssen, K. J., Vergouwe, Y. & Moons, K. G. Validation, updating and impact of clinical prediction rules: A review. J. Clin. Epidemiol. 61, 1085–1094. https://doi.org/10.1016/j.jclinepi.2008.04.008 (2008).
Ho, S. Y., Phua, K., Wong, L. & Bin Goh, W. W. Extensions of the external validation for checking learned model interpretability and generalizability. Patterns 1, 100129. https://doi.org/10.1016/j.patter.2020.100129 (2020).
Vabalas, A., Gowen, E., Poliakoff, E. & Casson, A. J. Machine learning algorithm validation with a limited sample size. PLoS One 14, e0224365. https://doi.org/10.1371/journal.pone.0224365 (2019).
Husebo, B. S. et al. Sensing technology to monitor behavioral and psychological symptoms and assess treatment response in people with dementia. A systematic review. Front. Pharmacol. 10, 1699. https://doi.org/10.3389/fphar.2019.01699 (2020).
This work was supported by a National Research Foundation of Korea (NRF) Grant funded by the Government of Korea (Ministry of Science and ICT) (No. NRF-2018R1A2B6003506) and the Basic Science Research Program through the NRF, funded by the Ministry of Education (No. NRF-2020R1A6A1A03041989). The funders had no role in the study design, data collection and interpretation, or manuscript preparation.
The authors declare no competing interests.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cho, E., Kim, S., Heo, SJ. et al. Machine learning-based predictive models for the occurrence of behavioral and psychological symptoms of dementia: model development and validation. Sci Rep 13, 8073 (2023). https://doi.org/10.1038/s41598-023-35194-5