The number of people living with dementia is estimated to be more than 50 million worldwide, and this number is expected to triple by 2050 as the population ages1. Behavioral and psychological symptoms of dementia (BPSD), also known as neuropsychiatric symptoms, are a heterogeneous group of non-cognitive symptoms and behaviors, such as agitation, aggression, apathy, and depression, that manifest in individuals diagnosed with dementia2. Almost 90% of individuals with dementia, across all stages and etiologies, are affected by these symptoms3. They are increasingly recognized as the most complex, challenging, and costly aspects of dementia care4. For individuals with dementia, they result in decreased independence in activities of daily living and quality of life5, nursing home placement6, and greater healthcare utilization7; for caregivers, they increase burden and depression8.

The factors that have been identified as contributing to BPSD can be categorized as (1) those related to individuals with dementia, such as dementia-related neurobiological factors, unmet needs, and premorbid personality; (2) those related to caregivers, such as communication approach; and (3) environmental triggers such as lack of stimuli and environmental change9. These factors cause symptoms independently or in combination9. Disturbances in circadian rhythm, including impaired sleep and inadequate physical activity, have also been identified as stressors that can trigger BPSD10,11. Recent empirical research has identified an association between disruption of circadian rhythms and BPSD, including depression, anxiety, agitation12, and sundowning-related aggressive symptoms13. However, systematic and continuous observation of sleep and activity levels is difficult with traditional methods that rely on caregivers’ retrospective reports, so only a few methodologically rigorous studies have examined the influence of sleep and activity levels on such symptoms14.

Although a downward trajectory for the cognitive and functional decline over the course of dementia is expected, the manifestation of BPSD varies among individuals9. Current evidence suggests that person-centered non-pharmacological interventions that match the needs of persons with dementia and their abilities are a first-line treatment for managing BPSD. The most appropriate interventions for those with dementia should be selected individually after considering the causes and patterns of BPSD15. Predicting BPSD by identifying contributing factors and monitoring triggers is the first step in selecting and implementing individually customized non-pharmacological interventions to prevent and manage target symptoms15.

Machine learning and wearable technology have immense potential to overcome the methodological limitations of existing dementia research through the precise analysis of clinical information derived from digital devices16. Wearable technology such as actigraphy allows for continuous biometric monitoring, including levels of sleep and activity in everyday conditions, and for connecting them with clinical symptoms16,17. Machine learning facilitates the identification of underlying patterns and relationships between variables directly from data and the development of data-driven prediction models18. Although machine learning has been employed to develop predictive models for the incidence and detection of Alzheimer’s disease19, this analytic technique has rarely been applied to research on BPSD.

Hence, machine learning models are leveraged in this study to predict the occurrence of BPSD among community-dwelling older adults living with dementia based on various factors, including actigraphy-measured sleep and physical activity levels and diary-based caregiver-perceived symptom triggers.


Study design

This study utilized a prospective observational design with three-wave data collection to build predictive models for BPSD subsyndromes. The second wave of data collection was conducted after the first wave, with repeated measures from participants in the first wave who had agreed to participate in the second wave. A detailed description of the first and second waves of data collection is reported elsewhere20. In the third wave of data collection, a validation dataset was collected from new participants independently of the first and second wave data. We employed the first and second wave data for model training (i.e., the training dataset) and the third wave data for external validation (i.e., the test dataset).

We employed a standard data mining methodology that comprised four steps: (1) data acquisition, (2) data preprocessing (e.g., data cleaning, class imbalance training, and dataset class optimization), (3) model learning, and (4) model evaluation21.

Recruitment and data collection

The first wave of data collection was conducted between June 2018 and June 2019. Eligible older adults with dementia living at home were recruited via on-site visits from outpatient neurological clinics at two tertiary hospitals and daycare centers in Seoul and the broader Gyeonggi region in Korea. The second wave of data collection, which involved first-wave participants who agreed to continue the study, was administered between July 2019 and June 2020. For external validation, eligible participants were recruited between July 2020 and November 2020 from an outpatient neurological clinic, where the first and second waves of data collection were conducted. The inclusion criteria applied to the three-wave data collection were (1) being at least 65 years old, (2) having a diagnosis of dementia, and (3) having a score of less than 24 on the Korean version of the Mini-Mental State Examination (K-MMSE)22.

Eligibility screening and data collection were performed by trained research staff.

After eligibility was established, trained research staff collected demographic and health data through interviews with family caregivers and older adults with dementia. Furthermore, chart reviews and standardized scales for physical, functional, and neuropsychological assessments were administered. Following the baseline assessment, the participants wore an actigraphy device on their wrists continuously for two weeks, and primary caregivers logged BPSD in the symptom diary daily for 14 consecutive days.

Ethical considerations

All procedures performed in this study involving human participants were in accordance with the ethical standards of the institutional or national research committee and with the Declaration of Helsinki (1964) and its later amendments or comparable ethical standards. Institutional review board approval was obtained from the Yonsei University Health System Severance Hospital (IRB 4-2018-0348, 4-2019-0314, 4-2020-0454) and Ilsan Hospital (IRB 2018-10-002-001, 2019-08-012-001). Legal representatives of all the participants provided written informed consent before enrollment after receiving a full explanation of the study procedures. The participants also provided verbal assent and written informed consent was obtained when possible.


Demographic and health data

At baseline, demographic and health data comprised age, sex, marital status, education level, dementia diagnosis, and neurological and psychiatric medications.

Cognitive and functional status

Scores (range 0–30) on the K-MMSE were used to assess cognitive functioning, with lower scores indicating greater cognitive deficits22. For the K-MMSE, Cronbach’s α was 0.91 in older Korean adults with dementia23. The severity of dementia was measured via the Korean version of the expanded Clinical Dementia Rating (CDR) scale, which assesses six functional domains—memory, orientation, judgment and problem-solving, community affairs, home and hobbies, and personal care24. The summed score ranged from 0 (none) to 5 (terminal dementia), and good inter-rater reliability for the overall CDR ratings was confirmed in Korean patients with dementia (kappa value range: 0.86–1.00)25. Functional independence was evaluated using the Korean version of Activities of Daily Living (K-ADL), which consists of seven items rated on a 3-point Likert-type scale, with higher scores indicating more severe levels of dependency26. The K-ADL was validated for older Korean adults with dementia, with good reliability (Cronbach’s α = 0.94)26.

Personality type

The family caregiver informant-rated premorbid personality traits of older adults with dementia were assessed using the Korean version of the Big Five Inventory (BFI-K)27. The BFI-K comprises 15 items rated on a 5-point Likert-type scale that measure 5 domains of personality traits: openness, conscientiousness, neuroticism, extraversion, and agreeableness. The internal consistency of the BFI-K was good; Cronbach’s α ranged from 0.67 to 0.8227.

Actigraphy data: nighttime sleep and physical activity

Older adults with dementia were fitted with a wrist-worn actigraphy device (ActiGraph wGT3X-BT, ActiGraph Corporation, Pensacola, FL, USA), which they wore all day for 14 consecutive days. The participants were instructed to remove the device when bathing or for a few minutes as needed. Previous validity studies have demonstrated that wrist actigraphy is a reliable and suitable method for objectively measuring sleep–wake cycles in older adults with dementia28,29. Raw acceleration data were collected along the three axes. We used ActiLife (version 6.13.3, Pensacola, FL, US) software to export the data and process raw acceleration data to sleep and physical activity parameters using vector magnitude count in 60-s epoch data (i.e., counts per minute). The vector magnitude is calculated as the square root of the sum of the squares of acceleration for each of the three axes (x, y, z). The Cole-Kripke algorithm was applied to score a one-minute epoch as asleep or awake30. Moreover, the previous night’s sleep parameters were employed to predict BPSD the following day. In this study, nighttime sleep was defined as the period between 20:00 (8:00 pm) and 08:00 (8:00 am). The following nighttime sleep parameters were generated: total sleep time, wake time after sleep onset, sleep efficiency, defined as the ratio of sleep duration over the assumed sleep period (total sleep time/[total sleep time + wake time after sleep onset] × 100), number of awakenings, and mean awake length (wake time after sleep onset/number of awakenings). The following physical activity parameters were also generated: energy expenditure (calories burned) in kcal per day, metabolic equivalents per day, total time spent in moderate-to-vigorous physical activity per day, percentage of time spent in moderate-to-vigorous physical activity per day, and the number of steps per day. 
We employed physical activity parameters measured the same day, which reflected the physical conditions during the day when BPSD occurred.
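The sleep parameter definitions above can be illustrated with a minimal sketch. It assumes the epochs have already been scored asleep or awake (the study used the Cole-Kripke algorithm in ActiLife, which is not reimplemented here), and it assumes that sleep onset is the first epoch scored asleep and that an awakening is any asleep-to-awake transition. The function names are hypothetical and are not taken from the study.

```python
import math


def vector_magnitude(x, y, z):
    """Vector magnitude of one epoch: square root of the summed squared axis counts."""
    return math.sqrt(x * x + y * y + z * z)


def sleep_parameters(epochs):
    """Summarize one night from 1-min epochs scored True (asleep) or False (awake).

    Assumptions (not stated in the source): sleep onset is the first epoch
    scored asleep; an awakening is any asleep-to-awake transition after onset.
    """
    if True not in epochs:
        return {"tst": 0, "waso": 0, "efficiency": 0.0,
                "awakenings": 0, "mean_awake": 0.0}
    after_onset = epochs[epochs.index(True):]
    tst = sum(1 for e in after_onset if e)        # total sleep time (min)
    waso = len(after_onset) - tst                 # wake after sleep onset (min)
    awakenings = sum(1 for prev, cur in zip(after_onset, after_onset[1:])
                     if prev and not cur)
    efficiency = 100.0 * tst / (tst + waso)       # TST / (TST + WASO) x 100
    mean_awake = waso / awakenings if awakenings else 0.0
    return {"tst": tst, "waso": waso, "efficiency": efficiency,
            "awakenings": awakenings, "mean_awake": mean_awake}
```

For the 20:00 to 08:00 window used in this study, `epochs` would hold 720 one-minute values per night.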

Symptom diary data: BPSD and caregiver-perceived symptom triggers

A symptom diary that comprised a structured, easy-to-use checklist modeled on the Neuropsychiatric Inventory (NPI) was developed to assess the presence and severity of BPSD (i.e., delusions, hallucinations, agitation/aggression, depression/dysphoria, anxiety, elation/euphoria, apathy/indifference, disinhibition, irritability/lability, aberrant motor behaviors, sleep and nighttime behaviors, and appetite and eating disorders) daily31. It also included a checklist of caregiver-perceived triggers of BPSD: hunger/thirst, urination/bowel movement, pain/discomfort, sleep disturbance, noise, light, temperature, interpersonal triggers (i.e., factors related to the person(s) who were present), and changes in the environment. Caregivers were also instructed to check “other causes” in the symptom diary if the perceived trigger could not be categorized under any of the listed options, and then to list those factors. Family caregivers were instructed to check all options that were perceived as triggers of BPSD on the same day when the symptoms had occurred. The symptom diary was designed to overcome recall bias (e.g., the NPI is based on the caregiver’s two-week retrospective rating), enable daily monitoring of the occurrence of BPSD, and link symptoms to triggers daily32.

Recent studies have established that clustering several individual BPSD that are highly correlated and co-occur enhances the clinical utility of the assessment of BPSD, thus allowing for a more meaningful interpretation of the study findings and increasing power by raising the number of participants who endorsed the symptom cluster rather than the individual symptoms alone2,33,34. Based on previous NPI factor analysis studies, we clustered certain individual symptoms into three subsyndromes: psychotic symptoms (hallucination and delusion), affective symptoms (depression, anxiety, and apathy), and hyperactivity symptoms (agitation/aggression, disinhibition, and irritability)34,35,36,37. As prior studies have demonstrated that euphoria/elation, aberrant motor behaviors, sleep and nighttime behaviors, and appetite and eating disorders do not load into any clusters33,34,36,37,38, we analyzed them as individual subsyndromes consisting of only one symptom.
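The grouping described above can be expressed as a simple lookup. The sketch below assumes that a subsyndrome counts as present on a given day if any of its constituent symptoms was checked in the diary; the source does not state this rule explicitly, and the key names are hypothetical.

```python
# Mapping of the seven subsyndromes to their constituent diary symptoms.
SUBSYNDROMES = {
    "psychotic": ["hallucinations", "delusions"],
    "affective": ["depression", "anxiety", "apathy"],
    "hyperactivity": ["agitation_aggression", "disinhibition", "irritability"],
    "euphoria_elation": ["euphoria_elation"],
    "aberrant_motor_behaviors": ["aberrant_motor_behaviors"],
    "sleep_nighttime_behaviors": ["sleep_nighttime_behaviors"],
    "appetite_eating_disorders": ["appetite_eating_disorders"],
}


def daily_subsyndrome_labels(diary_day):
    """Return a 0/1 label per subsyndrome for one diary day.

    Assumption: a subsyndrome is present if ANY member symptom was checked.
    `diary_day` maps symptom names to True/False; missing symptoms count as absent.
    """
    return {
        name: int(any(diary_day.get(symptom, False) for symptom in symptoms))
        for name, symptoms in SUBSYNDROMES.items()
    }
```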

Data preprocessing

Missing actigraphy data were encountered for two main reasons: the improper wearing of the device and lack of participant compliance (e.g., constant removal of the device or not wearing the device). The number of participants with missing actigraphy data was 81/225 (36%). The mean number of days per person with missing actigraphy data was 0.9. The occurrence rates of BPSD were similar regardless of missing actigraphy data (Supplementary Table 1). Therefore, multivariate imputation was applied using chained equations39 to address the missing actigraphy data. Before training the models, we applied min–max normalization for continuous features. For categorical features, target encoding was employed instead of one-hot encoding. Target encoding reduces feature dimensions by converting categorical features to numerical values derived from target variables, assuming that a categorical feature is related to the outcomes40. There was an issue of outcome class imbalance for BPSD subsyndromes. While 26.8% of the participants exhibited affective symptoms, only 4.4% and 4.7% exhibited aberrant motor behaviors and euphoria/elation, respectively (Table 2). Researchers in various disciplines have prioritized the class imbalance problem and suggested strategies to address the issues of imbalanced data sets41,42,43. This study applied a synthetic minority oversampling technique to address the outcome class imbalance issue44.
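The three preprocessing steps named above (min–max normalization, target encoding, and synthetic minority oversampling) can be sketched in plain Python as follows. This is an illustration of the general techniques under simplifying assumptions, not the study’s implementation: the target encoder uses unsmoothed category means, the oversampler is a bare-bones version of the synthetic minority oversampling idea, and all function names and defaults are hypothetical.

```python
import random


def min_max_normalize(values):
    """Rescale a continuous feature to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def target_encode(categories, targets):
    """Replace each category with the mean of the binary target within it."""
    sums, counts = {}, {}
    for c, t in zip(categories, targets):
        sums[c] = sums.get(c, 0) + t
        counts[c] = counts.get(c, 0) + 1
    return [sums[c] / counts[c] for c in categories]


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority rows.

    Each synthetic row lies on the segment between a randomly chosen minority
    row and one of its k nearest minority neighbours (squared Euclidean
    distance). Requires at least two minority rows.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()            # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic
```

In practice, the encoding statistics and the oversampling would typically be computed on the training folds only, so that no information from held-out data leaks into model fitting.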

Predictive modeling

Multiple machine learning approaches were selected for this study, including logistic regression, random forest45, gradient boosting machine46, and support vector machine47. We investigated each of these machine learning methods with a specific learning algorithm to gauge their effectiveness, and then selected the best-performing model that could predict each subsyndrome of BPSD18. Using logistic regression, the most common and well-established binary classifier48, as the baseline model, we evaluated the degree to which the machine learning models improved performance over the baseline model.

To avoid overfitting, hyperparameter tuning through random search49 was implemented with five-fold cross-validation for each machine learning method. Binary cross-entropy was employed as the evaluation criterion for five-fold cross-validation. The hyperparameters for tree complexity were considered for the random forest and gradient boosting machine models. The gradient boosting machine model was iteratively trained to minimize the loss function using stochastic gradient boosting. Thus, we considered the learning rate and number of trees for the gradient boosting machine. Various kernel functions such as linear, polynomial, and radial basis kernels can be utilized for the support vector machine50. Linear instead of nonlinear kernels, such as the radial basis function kernel, were used in this study to prevent overfitting in small datasets and calculate feature importance. For the support vector machine models, only the regularization hyperparameter was employed to determine the optimal model. All the selected hyperparameters are described in Supplementary Table 2. Feature importance analysis was performed to investigate the contribution of a range of features in predicting the seven subsyndromes of BPSD and to sort the importance of the top 10 influential features for prediction.
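The tuning procedure described above (random search, five-fold cross-validation, binary cross-entropy as the criterion) can be sketched as follows. To stay dependency-free, the sketch tunes only the L2 penalty of a small logistic regression fitted by gradient descent, as a stand-in for the models tuned in the study; the search range, learning rate, and function names are hypothetical, and the folds are not stratified.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, l2, lr=0.5, epochs=200):
    """Batch gradient descent for L2-regularized logistic regression."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def log_loss(X, y, model, eps=1e-12):
    """Mean binary cross-entropy of the model on (X, y)."""
    w, b = model
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        p = min(max(p, eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(y)


def cv_loss(X, y, l2, k=5, seed=0):
    """Mean held-out binary cross-entropy over k folds."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    losses = []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        model = fit_logistic([X[i] for i in train], [y[i] for i in train], l2)
        losses.append(log_loss([X[i] for i in fold], [y[i] for i in fold], model))
    return sum(losses) / k


def random_search(X, y, n_iter=10, seed=0):
    """Sample n_iter penalties log-uniformly; return (best mean loss, best l2)."""
    rng = random.Random(seed)
    candidates = [10 ** rng.uniform(-4, 0) for _ in range(n_iter)]
    return min((cv_loss(X, y, l2), l2) for l2 in candidates)
```

The same loop applies to any model family: only the sampled hyperparameters and the fitting function change.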

Statistical analysis

Categorical variables are summarized as the number of participants with percentages and continuous variables as means with standard deviations. Differences between the training and test datasets were compared using two-sample independent t-tests for continuous variables and Fisher’s exact tests for categorical variables. The performances of the prediction models were compared and evaluated using several indices: accuracy, precision, sensitivity (recall), specificity, F1 score, and area under the receiver operating characteristic curve (AUC).
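For reference, the performance indices listed above can be computed from binary labels and predictions as in the following sketch. The AUC is computed through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half). The function names are hypothetical, and returning 0.0 for an undefined ratio is a convention chosen here, not one taken from the study.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the ratio is undefined."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity, specificity, and F1 from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = _ratio(tp, tp + fp)
    sensitivity = _ratio(tp, tp + fn)          # also called recall
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": _ratio(tn, tn + fp),
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
    }


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))
```

With a rare outcome, accuracy can look high even when sensitivity is poor, which is one reason to report several indices side by side.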

Statistical significance was set at p < 0.05, and all analyses were performed using R, version 3.6.3 (R Foundation for Statistical Computing, Vienna, Austria) and Python, version 3.7 (Python Software Foundation, Wilmington, USA).


Participant characteristics

Table 1 presents the participant characteristics of both the training (N = 187 older adults with dementia) and test sets (N = 35 older adults with dementia) at baseline. The mean age of the participants was 80.4 years (SD 7.4) for the training dataset and 80.7 (SD 5.6) for the test dataset. The majority of participants were women (59% for the training dataset; 63% for the test dataset), married (62% for the training dataset; 63% for the test dataset), and with an educational level of middle school or below (57% for both the training and the test datasets). Table 1 summarizes the prevalence of BPSD among the participants.

Table 1 Summary statistics of study participants and prevalence of subsyndromes of behavioral and psychological symptoms of dementia (BPSD).

Summary of actigraphy and symptom diary data

Symptom diary and actigraphy data were collected over an average of 13.30 days (SD 2.08) per person (training set: mean [SD] = 13.35 [2.11]; test set: mean [SD] = 13.00 [1.91]).

Affective symptoms were the subsyndromes of BPSD that occurred most frequently (training set: 26.8%; test set: 17.1%), followed by hyperactivity symptoms (training set: 17.8%; test set: 12.7%). Aberrant motor behaviors, euphoria/elation, and appetite and eating disorders had lower frequency rates for both the training and test datasets. The average total sleep times per day were 7.7 h (for the training dataset) and 6.3 h (for the test dataset), and the average total nighttime sleep times per day were 5.9 h (training dataset) and 5.1 h (test dataset). The average energy expenditure was 459.9 kcal/day (training dataset) and 442.2 kcal/day (test dataset). The participants averaged 5908.7 steps (training dataset) and 5750.5 steps (test dataset) daily. Sleep disturbance (training dataset: 13.5%; test dataset: 8.4%) and interpersonal triggers (training dataset: 8.5%; test dataset: 10.8%) were the most frequently reported triggers. Table 2 provides the univariate analyses of actigraphy and symptom diary data.

Table 2 Descriptive statistics for actigraphy and symptom diary.

Performance comparison for prediction models

Table 3 presents the prediction performance of all prediction models based on the training dataset with five-fold cross-validation. Gradient boosting machine models showed higher AUC values than the other prediction models for predicting hyperactivity (0.706), affective symptoms (0.747), and appetite and eating disorders (0.816). While the support vector machine model demonstrated the highest AUC value (0.706) for psychotic symptoms, the random forest model exhibited the highest AUC value (0.942) for sleep and nighttime behavior. The logistic regression models showed the highest AUC values for aberrant motor behaviors (0.822) and euphoria/elation (0.696).

Table 3 Performance comparison of the prediction models for subsyndromes of behavioral and psychological symptoms of dementia (BPSD) for the training dataset with five-fold cross-validation.

Table 4 presents the prediction performance of all prediction models based on the test dataset, and Supplementary Fig. 1 illustrates the receiver operating characteristics and precision-recall curves for the test dataset. Compared with the logistic regression models, the machine learning models revealed better performance for all seven subsyndromes. Specifically, the random forest and gradient boosting machine models performed better than the logistic regression and support vector machine models for most subsyndromes. The random forest models exhibited higher AUC values than the other prediction models for predicting hyperactivity (0.835), euphoria/elation (0.968), and appetite and eating disorders (0.888). The gradient boosting machine models presented higher AUC values than the other prediction models for predicting psychotic symptoms (0.801), affective symptoms (0.936), and aberrant motor behaviors (0.498). Moreover, the support vector machine model showed the highest AUC value (0.929) for sleep and nighttime behavior.

Table 4 Performance comparison of the prediction models for subsyndromes of behavioral and psychological symptoms of dementia (BPSD) for the test dataset.

The gradient boosting machine model achieved the best performance in terms of the average AUC scores across the seven subsyndromes for both the training and test datasets. Tables 3 and 4 present the findings for the other performance indices.

Feature importance

The top 10 most significant features of the gradient boosting machine models, which achieved the best performance in terms of average AUC scores across the seven subsyndromes, were ranked using the permutation feature importance method. Figure 1 illustrates the relative importance of the predictors included in the seven subsyndromes. Caregiver-perceived triggers reported in the symptom diary, including interpersonal triggers, sleep disturbance, urination/bowel movement, and pain/discomfort, exhibited higher feature importance values than the other features across the seven subsyndromes. The CDR score was the most influential feature for psychotic symptoms and ranked in the top 10 for hyperactivity, euphoria/elation, and appetite and eating disorders. The features for sleep and activity levels, such as waking after sleep onset at night and percentage of moderate-to-vigorous physical activity per day, were in the top ranks of feature importance for aberrant motor behaviors and sleep and nighttime behaviors. Except for two symptoms (aberrant motor behaviors and euphoria/elation), premorbid personality types were among the top 10 influential features.
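The permutation feature importance method mentioned above can be sketched in a model-agnostic way: shuffle one feature column at a time and record how much a chosen score drops. The sketch below is an illustration of the general technique, not the study’s code; the function name, the number of repeats, and the choice of score are hypothetical.

```python
import random


def permutation_importance(predict, X, y, score, n_repeats=10, seed=0):
    """Mean drop in `score` when each feature column is shuffled.

    predict: function mapping a list of rows to a list of predictions.
    score:   function (y_true, y_pred) -> float, higher is better.
    X:       list of rows, each row a list of feature values.
    """
    rng = random.Random(seed)
    baseline = score(y, predict(X))
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)                   # break the feature-outcome link
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - score(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature that the model never uses receives an importance of exactly zero, because shuffling it cannot change any prediction.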

Figure 1

Importance of the top 10 features calculated by the gradient boosting machine for each subsyndrome of the behavioral and psychological symptoms of dementia (BPSD). “Other causes” indicates any other caregiver-perceived BPSD trigger that cannot be categorized as one of the options provided in the symptom diary (e.g., medical treatment, hospital visits, and nightmare). ADL activities of daily living; BFI Big Five Inventory; CDR Clinical Dementia Rating scale; MMSE Mini-Mental State Examination; WASO wake after sleep onset; MVPA moderate-to-vigorous physical activity; METs metabolic equivalents.


This study constructed predictive models for seven subsyndromes of BPSD, with overall good prediction accuracy, through a supervised machine learning technique based on actigraphy measures of sleep and activity levels, symptom diary entries of caregiver-perceived symptom triggers, standardized measures of cognitive and functional status and premorbid personality type, and a medical chart review. Through the feature importance approach, we also identified the relative importance of the features in predicting each subsyndrome. The machine learning algorithms developed and validated in our study will inform timely and appropriate preventative approaches that identify individuals at high risk for specific subsyndromes of BPSD, and thus, provide customized interventions that address the underlying causes and triggers of the target symptoms.

Although the predictive machine learning models varied among the subsyndromes with regard to different evaluation metrics, our findings demonstrated the feasibility and usefulness of the machine learning approach for predicting BPSD by accounting for multifaceted factors. Although several studies have applied machine learning algorithms with a focus on predicting the future onset of or detecting undiagnosed dementia19,51, few have used machine learning models to predict the various symptoms observed in older adults with dementia. A recent study employed the deep learning approach to forecast agitation episodes based on environmental stimuli data, including ambient noise level, room temperature, and relative humidity, that were collected up to 30 min before the occurrence of an actual episode52. While it demonstrated the feasibility and efficacy of deep learning models in predicting agitation episodes (accuracy: 98.6%; sensitivity: 84.8%), the models developed in our study were more comprehensive and practical because they examined diverse BPSD, which often co-occur in real care settings. In addition, a machine learning approach can account for a range of multifaceted contributing factors, including actigraphy data and symptom diary data measured over time, which might not have been feasible in conventional statistical modeling owing to highly complex relationships among the variables. In recent years, big healthcare data have emerged along with advances in assistive technologies, including wearable and mobile technologies in healthcare for older adults53, which has increased the number of candidate features54. Machine learning is a promising tool in predicting BPSD arising from interactions among a web of personal, interpersonal, and environmental factors captured by various digital health technologies54.
Moreover, machine learning algorithms that predict BPSD subsyndromes can be incorporated into a mobile app to support healthcare providers’ decision-making in selecting individually customized non-pharmacological interventions that address the underlying causes of the target symptoms54.

Our outcomes concerning feature importance suggested that not all features contributed consistently to the prediction of each subsyndrome. Caregiver-perceived symptom triggers, including interpersonal triggers, sleep disturbance, and pain/discomfort, were consistently ranked within the five most influential features across the seven subsyndromes. The features for sleep and activity levels, such as waking after sleep onset at night and percentage of moderate-to-vigorous physical activity per day, ranked highly in terms of feature importance for aberrant motor behaviors and sleep and nighttime behaviors. Age, severity of dementia, cognitive function, education level, and premorbid personality type were also ranked among the 10 most influential features. While activity data were relatively less influential than other features, metabolic equivalents per day were ranked in the top 10 important features for sleep and nighttime behaviors, as were wake after sleep onset at night and the percentage of moderate-to-vigorous physical activity per day for aberrant motor behaviors. The clinical implication of these results is that healthcare providers and caregivers need to treat BPSD as heterogeneous, with different important predictors for each subsyndrome. Hence, interventions should be selected to address the predictors that matter most for the target symptoms.

While caregiver-perceived symptom triggers evaluated by the symptom diary were among the most influential features, activity data were less influential than expected for most subsyndromes in our study, except for aberrant motor behaviors. This finding is somewhat consistent with Valembois et al.’s observational study55, which investigated the association between motor activity and sleep duration measured by actigraphy with different types of BPSD. They found a significant increase in motor activity among those with aberrant motor behavior between 21:00 (9:00 pm) and midnight. They explained the phenomenon in terms of the sundown syndrome, referring to an increase in BPSD from late afternoon to night; otherwise, no relationship between sleep duration or night awakening episodes and any type of BPSD was found55. Another study utilized the machine learning approach to predict mobility, cognitive, and depressive symptoms related to Alzheimer’s disease from activity-aware smart home behavior data (e.g., activities of daily living, sleep, outings, and global routines)56. While statistically significant predictors were observed for mobility and cognitive symptoms, depression measured by the Geriatric Depression Scale (GDS) was weakly correlated with the global set of smart home behavioral data56. This outcome somewhat aligns with our finding that affective symptoms were predictable using machine learning models, but their classification (occurrence vs. absence) was mostly not influenced by actigraphy data. Additionally, the inclusion of a variety of predictors in the machine learning model might have decreased the importance of features derived from actigraphy data in our study. Clinically, a comprehensive monitoring system that utilizes a symptom diary written by direct caregivers, in addition to wearable sensor technologies, is required for accurate prediction.

A strength of our study is the inclusion of a variety of input data, objectively measured by actigraphy, and caregiver-perceived symptom triggers assessed using a symptom diary to develop the most powerful predictive machine learning models. Furthermore, external validation was performed using an independent test dataset to prevent overestimation of the results57,58. Consistent results from the training and test datasets that differed substantially in terms of severity of functional impairment, characteristics of sleep and activity levels, caregiver-perceived symptom triggers, and frequency of occurrence of symptoms can be regarded as confirmation of generalizability58. Finally, the symptom diary, from which the most influential features were derived, was easy to measure daily in real life. The feature importance results will facilitate the development of a digital diary that tracks BPSD using devices such as smartphones and tablets. A digital diary enables caregivers to log the symptom manifestation and circumstances, including diverse triggers, in real time, and the accumulated data can be analyzed to provide an individualized approach to symptom management.

Our study had certain limitations. First, the symptom diary was written daily, and caregivers checked one or more types of symptoms that were observed and one or more factors that were perceived as triggers for the symptoms each day. Hence, we could not disentangle which caregiver-perceived trigger contributed to which type of individual symptom observed that day. Future research needs to investigate the link between caregiver-perceived triggers and specific symptoms on an episodic rather than a daily basis. Episodic prediction algorithms can provide more precise and accurate information regarding specific symptoms. Second, the prediction performances for certain subsyndromes were poor, particularly in the case of aberrant motor behaviors (AUC = 0.498 in the gradient boosting machine model). The findings regarding the prediction performance need to be interpreted cautiously considering the small sample size of the test dataset. The frequency of occurrence of certain symptoms, particularly aberrant motor behaviors and euphoria/elation, was very low (e.g., 2/871 days of aberrant motor behavior). The overall small sample size of the test dataset might lead to lower power for pattern recognition59. Future replication studies with larger sample sizes are needed to obtain more accurate predictions for the subsyndromes that were less observed in our study and to prevent model overfitting and biased machine learning performance59. Third, as it was challenging to collect accurate data on comorbid conditions for older adults who were recruited from settings other than outpatient neurological clinics (i.e., daycare centers), we did not account for any comorbidities that might have affected their BPSD. Finally, caregivers’ daily written reports were used rather than direct observation of BPSD, which could result in observer and recall bias.
While dementia care has relied on caregivers’ subjective assessment reports in most cases4, future studies need to employ technology such as wearable sensors, non-wearable motion sensors, or assistive and smart home technologies to objectively monitor BPSD in real time, thus revolutionizing precision dementia care60.


This study developed and validated prediction models for BPSD subsyndromes using a machine learning approach. Overall, four classification models were trained to identify the optimal prediction model for each subsyndrome. To our knowledge, this study is the first to employ machine learning to predict BPSD using a wide variety of data, including actigraphy data. It suggests that machine learning-based prediction models can classify the manifestations of subsyndromes of BPSD. This study also identified influential predictors for specific subsyndromes that can be employed to prevent and manage target symptoms. Based on the outcomes of this study, algorithms that predict BPSD can be clinically applied to monitor and predict BPSD subsyndromes. Delivery of person-centered dementia care can be achieved through the early prediction of target subsyndromes and the provision of individually tailored non-pharmacological interventions that address the underlying causes of BPSD. Machine learning algorithms can also be embedded into smartphone applications to increase their clinical utility. Accordingly, this study is the first step toward personalized care for BPSD management using digital health technologies.