Development of a prediction model for the depression level of the elderly in low-income households: using decision trees, logistic regression, neural networks, and random forest

Kim, Kyu-Min; Kim, Jae-Hak; Rhee, Hyun-Sill; Youn, Bo-Young

doi:10.1038/s41598-023-38742-1

Article
Open access
Published: 16 July 2023

Scientific Reports volume 13, Article number: 11473 (2023)

Abstract

Korea has the fastest-aging population in the world, and national interest in the elderly is correspondingly high. Among common chronic diseases, depression has a high incidence in the elderly, so preventing it in advance is vital. This study therefore aims to select the factors related to depression in low-income seniors identified in previous studies and to develop a prediction model. A total of 2975 elderly people from low-income households were extracted from the 13th-year data of the Korea Welfare Panel Study (2018). Among the many data mining techniques, decision trees, logistic regression, neural networks, and random forest were applied to develop the predictive model. In addition, the wrapper’s stepwise backward elimination, which finds the optimal model by removing the least relevant factors, was applied. The models were evaluated by accuracy. First, the final prediction model based on a decision tree showed the highest predictive power, with an accuracy of 97.3%. Second, psychological factors (leisure life satisfaction, social support, subjective health awareness, and family support) ranked higher than demographic factors in influencing depression. Based on these results, an approach focused on psychological support is much needed to manage depression in low-income seniors. Because depression in the elderly depends on numerous influencing factors, a decision tree may be beneficial for establishing a firm prediction model and identifying the vital factors associated with depression in the elderly population.

Introduction

The global effects of population aging are rapidly increasing. According to the World Health Organization, the number of people aged 60 or older has been increasing quickly and is expected to surpass 1.4 billion by 2030 and 2.1 billion by 2050¹. According to Statistics Korea, as of 2022, the elderly population aged 65 or older is expected to account for 17.5% of the total population; more importantly, Korea is expected to become a super-aged society in 2025, when this group will account for 20.6% of the total population². The rapidly increasing elderly population can experience various health problems due to physical and psychological changes from a life cycle point of view³. Because Korea is showing the fastest population aging trend globally, national interest in the elderly is high⁴.

As a result of aging, one in four older adults experiences age-related mental health issues⁵. The most common issue is depression, which may complicate an older adult’s existing health conditions and trigger new concerns⁶. In Korea, the number of patients with depression continues to rise, and it is expected to rank first in 2030⁷. According to the Mental Health Foundation, about 22% of the United Kingdom population aged 65 and older suffered from depression (about 28% of men and 28% of women), and about 85% of the elderly with depression received no help⁸. It has also been reported that people experiencing depression engage in dangerous behaviors such as self-harm to escape their negative emotions⁹. Depression can therefore be related to suicide risk and closely resembles other risk behaviors; careful attention is needed in the elderly community, especially for those who have experienced depression. Previous studies have also demonstrated that depression is closely related to income level. In particular, low-income elderly households are reported to be in worse health than other groups, and even when depression is diagnosed, the symptoms worsen because timely and appropriate treatment is not provided¹⁰. Efforts to detect depression early and to prevent and manage it are therefore more important for low-income elderly households than for other groups.

A study of income inequality, social support, and depression in older European adults found that the lower the income level (corresponding to the fifth quintile), the higher the depression score, indicating that household income plays an essential role in understanding depression¹¹. Depression in the elderly is therefore a societal problem that cannot be overlooked, as it increases the burden of care on the person and on the people around them. Since depression is a common disease in the elderly, the factors that increase it are exceptionally diverse. Previous studies have shown that depression in the elderly is closely related to factors including gender, age, education level, chronic disease, social support, health promotion behavior, leisure life satisfaction, and medical expenses^12,13. In Ethiopia, one of the low-income countries, two out of five elderly people suffer from depression. Among the elderly, those who are female, have no formal education, have chronic diseases, or have no social support are prone to depression¹⁴. Notably, the lower the income level, the higher the level of depression among the elderly; in other words, depression is more severe among the elderly who face a relatively high risk of economic difficulty. Other studies confirmed that income level, education level, emotional support, and subjective health awareness affected depression; moreover, the intensity of depression increased with a low income level, less emotional support, and low subjective health awareness¹⁵. In addition, the level of depression decreased with participation in satisfying leisure activities¹⁶.

A comprehensive review of previous studies shows that only a few have analyzed the differences between the factors affecting depression and the level of depression in the elderly from low-income families. Most existing studies are limited in that they merely enumerate the factors that affect depression. Most importantly, none of the existing studies used data mining techniques to seek models that predict depression in the low-income elderly. In addition, studies using text mining methods have merely identified depressive symptoms in different groups, and only a few were carried out on the elderly^17,18,19.

Hence, this study attempted to develop a prediction model for depression in the elderly from low-income families based on the influencing factors identified in previous studies. Data mining, a type of machine learning, was used because it is increasingly applied in the healthcare field, mainly to develop predictive models such as the prediction of therapeutic effects and the diagnosis of hypertension and diabetes²⁰. According to McKinsey and Company, data mining increases patient management efficiency, can support correct treatment planning and diagnosis, and is of great help in complex cases²¹. The active use of data mining techniques is highly likely to reduce medical costs by up to 17%²². This study is expected to provide the fundamental data for developing an integrated program to prevent and manage depression in the elderly from low-income families.

Methods

Data resource

This study analyzed the original 13th-year (2018) data from the Korea Welfare Panel Study (KOWEPS). The KOWEPS is a nationally representative longitudinal survey of 7000 households in 16 metropolitan cities, including Seoul and Jeju Island, conducted jointly by the Korea Institute for Health and Social Affairs and the Social Welfare Research Center of Seoul National University. The sample was drawn by two-stage stratified cluster systematic sampling based on the household income data of the 2006 National Welfare Survey on Living Conditions. These households were divided into low-income and regular-income households, and 7000 households were selected, with 3500 allocated from each group.

The 13th-year (2018) data for household members were selected for this study because the number of elderly patients with panic disorders, non-organic sleep disorders, eating disorders, and depression increased by 81% from 290,000 in 2010 to 530,000 in 2018²³. Data from those aged 65 or older were extracted, since people aged 65 or older are designated as elderly under the Welfare of Senior Citizens Act of Korea²⁴; after excluding missing values in the factors used in the analysis, a total of 2975 people were analyzed (Fig. 1). The standard median income is the income of the person in the middle when all people are lined up by income, and the KOWEPS considers households at 60% or less of that value to be low income. As mentioned above, the low-income elderly are reported to have poorer health outcomes than other populations and are significantly affected by their surroundings. The 13th-year (2018) data were therefore chosen to exclude any potential unrelated effects of the COVID-19 era; the first case of COVID-19 in South Korea was in January 2020²⁵.

Construction of variables

Target variable

The KOWEPS provides the CES-D 11 (The Center for Epidemiological Studies-Depression Scale) as a measure of depression. The scale was constructed by reducing the 20-item instrument developed by Radloff (1977) to 11 items²⁶. The instrument consists of the following items: I did not feel like eating, my appetite was poor; I felt that I was just as good as other people; I felt depressed; I felt that everything I did was an effort; My sleep was restless; I felt lonely; I enjoyed life; People were unfriendly; I felt sad; I felt that people dislike me; and I could not get “going.”

Responses ranged from 0 (rarely or none of the time) to 3 (most or all of the time). In this study, the total score was converted to the scale of the original 20-item instrument by multiplying it by 20/11 to determine whether or not depression was present. The higher the value, the higher the level of depression. Depression can be suspected if the score is 16 points or more, and a score of less than 16 can be considered normal.
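
The scoring rule above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names are hypothetical, and the sketch assumes that each response is an integer from 0 to 3 and that any positively worded items have already been recoded so that a higher value always means more depressive symptoms (the text does not describe the recoding step).

```python
from typing import Sequence

CESD20_CUTOFF = 16  # cut-off on the 20-item scale, as stated in the text


def cesd11_to_cesd20(item_scores: Sequence[int]) -> float:
    """Rescale a CES-D 11 total to the 20-item range by multiplying by 20/11.

    Assumption: responses are coded 0-3 and already oriented so that
    higher always means more depressive symptoms.
    """
    if len(item_scores) != 11:
        raise ValueError("CES-D 11 needs exactly 11 item responses")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each response must be an integer from 0 to 3")
    return sum(item_scores) * 20 / 11


def is_depression_suspected(item_scores: Sequence[int]) -> bool:
    """Return True when the rescaled score reaches the cut-off of 16."""
    return cesd11_to_cesd20(item_scores) >= CESD20_CUTOFF
```

Under these assumptions, a raw 11-item total of 9 or more is the smallest total that crosses the cut-off, since 9 × 20/11 is the first rescaled value above 16.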

Input variable

Based on the literature review discussed above, the input variables used in this study are as follows. First, gender, age, education level, number of household members, disability, economic activity, and chronic disease were included as demographic factors. Second, social support, family support, and leisure life satisfaction were each measured on a four-point Likert scale; the higher the score, the higher the support and satisfaction. Third, health promotion behavior is a concept that encompasses various factors, such as the beliefs, behaviors, and habits necessary for health promotion and maintenance. However, this study was limited to the health behavior and lifestyle factors provided by the KOWEPS. Drinking was scored from 'the average amount of alcohol consumed per year': 1 point if there was no drinking experience at all, and 0 points for drinking experience at least once. Smoking was scored from 'currently smoking cigarettes': 0 points for smokers and 1 point for nonsmokers. The health checkup was scored as 0 points if it had never been done and 1 point if it had been done once. The higher the total score, the more health behaviors the person had. Fourth, subjective health awareness was measured on a four-point Likert scale; the higher the score, the higher the subjective health awareness. The level of medical expenditure is the average monthly medical expenditure. The factors used in the analysis are summarized in Table 1.
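
As a rough illustration of the health promotion behavior coding described above, the sketch below assigns one point per favorable behavior. The function and parameter names are hypothetical, and summing the three items is an assumption based on the 0 to 3 score range reported later in the Results.

```python
def health_behavior_score(ever_drank: bool, currently_smokes: bool, had_checkup: bool) -> int:
    """Score health promotion behavior on a 0-3 scale (higher = more health behaviors)."""
    drinking = 0 if ever_drank else 1        # 1 point only with no drinking experience
    smoking = 0 if currently_smokes else 1   # 1 point for nonsmokers
    checkup = 1 if had_checkup else 0        # 1 point if a health checkup was done
    return drinking + smoking + checkup
```

For example, a nonsmoker who has had a checkup but has some drinking experience would score 2 under this coding.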

Table 1 Variables and measurements used in the analysis.

Statistical analysis

Frequency analysis, t-tests, and one-way ANOVA were performed to verify whether statistical differences occurred according to the demographic characteristics and depression level of the participants. Then, data mining techniques (logistic regression analysis, decision tree analysis, artificial neural network analysis, and random forest analysis) were used to build a predictive model for depression in the elderly of low-income households. A sensitivity analysis was conducted to ensure that the main outcome was reliable and robust; it was carried out by changing the cut-off score for suspected depression, the dependent variable.

Logistic regression analysis is the most common method used when the target factor is binary, and it is well suited to a target that takes only the values 0 or 1. An artificial neural network is one of the most widely used methodologies; it predicts the category of the target factor by combining the input factors in a nonlinear model, passing them to each hidden unit, and delivering the combination of hidden units to the output node. Decision tree analysis is a technique that classifies the categories of the target factor by charting decision-making rules in the form of a tree structure. Because the result is expressed as a tree, the classification is easy to interpret and yields information on the major predictive factors. In this study, C5.0, one type of decision tree, was used. Random forest is a model that addresses the shortcomings of the decision tree and is reported to perform well because it can prevent overfitting by applying the bagging technique to generate multiple decision trees²⁷. Finally, logistic regression analysis was conducted to identify the predictors of a high risk of depression. For the development and evaluation of the predictive model, tenfold cross-validation was used: the entire dataset was divided into ten folds, with nine used for model creation and one for validation, to support generalization²⁸. The relative importance of the factors that contributed to predicting the depression level of the elderly in low-income households was first examined via Shapley additive explanation analysis; the wrapper's stepwise backward elimination was then applied to find the optimal model by removing the least relevant factors. The models created through this process were evaluated based on accuracy, and the optimal model for this topic was then selected.
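
The tenfold split and the backward elimination step can be sketched as follows. This is a minimal standard-library illustration, not the procedure run in IBM SPSS Modeler or SAS: `k_fold_indices`, `backward_elimination`, and the `score` callback are hypothetical names, and the elimination shown here is a simplified variant that drops factors in a fixed least-important-first order rather than re-ranking after each removal.

```python
import random
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, validation) index pairs; each fold validates exactly once."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    splits = []
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits


def backward_elimination(
    ranked_least_important_first: Sequence[str],
    score: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Drop one factor at a time and keep the best-scoring subset.

    `score` is assumed to return a cross-validated accuracy for a factor subset.
    """
    current = list(ranked_least_important_first)
    best_subset, best_score = list(current), score(current)
    for name in ranked_least_important_first:
        if len(current) == 1:  # always keep at least one factor
            break
        current.remove(name)
        candidate = score(current)
        if candidate > best_score:
            best_subset, best_score = list(current), candidate
    return best_subset, best_score
```

In practice, `score` would train a model on the nine training folds of each split, predict the held-out fold, and average the ten accuracies; that part depends on the modeling tool and is left out of the sketch.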

For each performance index of the developed prediction model, a larger value indicates stronger power to predict the depression level. The model's final evaluation was based on accuracy, and sensitivity and specificity values were also presented. The analysis packages used were IBM SPSS Modeler 18.0 (SPSS Inc., Chicago, Illinois, USA) and SAS 9.4 (SAS Institute Inc., Cary, NC).
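
The three indices can be computed from predicted and observed labels as in the sketch below. The function name is hypothetical, and the sketch assumes binary labels where 1 means suspected depression and 0 means normal.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels (1 = suspected depression)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }
```

Because only about one in seven participants in a sample like this one is in the depressed group, sensitivity and specificity are worth reading alongside accuracy: a model that labels everyone as normal would still reach a high accuracy.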

Ethical approval

This study was approved by the Korea University Institutional Review Board (IRB No. IRB-2022-0385). The IRB of Korea University waived informed consent since this study was retrospective and blinding of the personal information in the data was performed.

Results

The demographic characteristics of the participants and the average differences in depression levels are shown in Table 2. Females (n = 2008, 67.5%) outnumbered males (n = 967, 32.5%). In the age distribution, ‘ages of 80 or older’ was the largest group with 1475 people (49.6%), followed by ‘ages of 75–79’ with 755 people (25.4%) and ‘ages of 70–74’ with 459 people (15.4%). Regarding the level of education, 2107 people (70.8%) had ‘elementary school graduation’ and 471 people (15.8%) had ‘middle school graduation’, indicating that the majority had a low level of education. For the number of household members, ‘two people’ accounted for the largest portion with 1459 people (49.0%), showing that nearly half lived with one other person. As for disabilities, 2425 people (81.5%) answered ‘no’ and 550 people (18.5%) answered ‘yes’. In terms of participation in economic activities, ‘not participating’ accounted for more than half of the participants (n = 2042, 68.6%).

Table 2 General characteristics of the participants and differences in depression level.

In terms of depression, 2553 people (85.8%) reported ‘no’ and 422 people (14.2%) reported ‘yes.’ Lastly, the average difference in depression level according to the sociodemographic characteristics of the participants was evaluated. There were significant differences by gender (t = − 3.547, p < 0.001) and by participation in economic activities (F = 7.326, p < 0.001), but no differences were found for the other factors.

The descriptive statistics of the main factors are shown in Table 3. Considering that the score for health promoting behavior ranges from a minimum of 0 to a maximum of 3, an average of 2.2 points can be regarded as high. With a standard deviation of 0.72, health promotion behavior did not differ greatly among the elderly in low-income households. For subjective health awareness, the average was 2.8 points, indicating that many elderly people rated their health above the midpoint of the scale. Regarding the level of medical expenses, the average monthly expenditure was 158,000 won and the standard deviation was 20.17, indicating a large difference in expenditure among the elderly in low-income households. Family support, social support, and leisure life satisfaction showed average scores of 2.7, 2.6, and 2.3, respectively, which can be considered good given that the scores ranged from 0 to 4.

Table 3 Descriptive analysis of major variables.

The relative importance of the predictive factors that contributed to predicting depression in low-income seniors, obtained through feature selection, is shown in Table 4. The higher a predictor ranks in importance, the greater its influence in predicting the level of depression; the highest-ranked factor was 'leisure life satisfaction.' This result can be interpreted to mean that leisure life satisfaction has a greater effect than the other factors when predicting the level of depression of the elderly in low-income households. Furthermore, subjective health awareness, family support, and social support were found in the upper ranks, whereas the presence or absence of chronic diseases, educational level, disability, and health behavior were in the lower ranks. A SHAP summary plot (Fig. 2) was created to visualize how much each explanatory variable affects the prediction of depression. A yellow bar indicates a positive influence on the occurrence of depression. The red and orange bars indicate a negative influence on the occurrence of depression. The variables with red bars were the most influential. Leisure life satisfaction can be used as either an explanatory or a dependent variable. This study used it as an explanatory variable because the subjects were low-income elderly. The relationship between leisure life satisfaction and depression in the low-income elderly is often reported as causal, with leisure life satisfaction affecting depression²⁹.
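
To make the idea behind Shapley additive explanations concrete, the sketch below computes exact Shapley values by enumerating every coalition of factors. It is not the SHAP implementation used for Fig. 2: the function name, the `value` callback, and the toy factor names are all hypothetical, and exact enumeration is only feasible for a handful of factors because the number of coalitions doubles with each one added.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    features: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley value of each feature for a coalition value function.

    `value(S)` is assumed to return the model's payoff (for example, its
    expected prediction) when only the features in S are known.
    """
    n = len(features)
    result = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                total += weight * (value(coalition | {feature}) - value(coalition))
        result[feature] = total
    return result
```

For a purely additive value function, each factor's Shapley value equals its own contribution, and the values always sum to the difference between the payoff with all factors and the payoff with none. This additivity is what allows a summary plot to rank factors by their average contribution.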

Table 4 The importance of variables that affect the level of depression.

In this study, the classification techniques used to develop the most accurate model for predicting the level of depression of the elderly in low-income households were artificial neural networks, decision trees, logistic regression, and random forest analysis. Table 5 shows the results of the classification analysis obtained by sequentially applying the wrapper's stepwise method according to the relative importance of the factors identified in Table 4. The decision tree algorithm showed higher predictive power than the other three algorithms. The prediction accuracy was 73.2% for logistic regression analysis and 81.8% for the artificial neural network. The decision tree, by contrast, tended to increase in predictive accuracy as the number of factors increased, except when there was only one input factor. When all 13 factors were input, it reached an accuracy of 97.3%, a sensitivity of 100%, and a specificity of 94.6%. When the decision tree was formed, the factor with the greatest impact was subjective health awareness, followed by leisure life satisfaction, family support, and social support. To ensure that the main outcome was reliable and robust, a sensitivity analysis was conducted by dividing the dependent variable, depression incidence, at two thresholds (15 points or less, 16 points or more); the analysis showed that the main outcome did not change (Tables 6, 7).

Table 5 Distribution of accuracy, sensitivity, and specificity by models.

Table 6 Sensitivity analysis results (depression level: 15 point or less).

Table 7 Sensitivity analysis results (depression level: 16 point or more).

Logistic regression analysis was performed to assess the influence of the predictors of a high risk of depression in the elderly from low-income households, and the results are shown in Table 8. The factors that affected the level of depression were gender, number of household members, subjective health awareness, family support, social support, and satisfaction with leisure life. For gender, the odds of depression in women were 1.86 times those in men (OR = 1.861, 95% CI = 1.173–2.954). For each one-level increase in the number of household members, the odds of depression were multiplied by 0.69 (OR = 0.692, 95% CI = 0.513–0.933). For subjective health awareness, each one-level increase was associated with 0.40-fold odds of depression (OR = 0.403, 95% CI = 0.312–0.522). Further, for family support (OR = 0.613, 95% CI = 0.494–0.759), social support (OR = 0.711, 95% CI = 0.552–0.916), and leisure life satisfaction (OR = 0.425, 95% CI = 0.328–0.425), the odds of depression were multiplied by 0.61, 0.71, and 0.42, respectively, for each one-level increase.
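
Odds ratios like these can be read back onto the scale of the logistic regression coefficients, which is sometimes a useful consistency check. The sketch below is illustrative only: the function names are hypothetical, and it assumes the reported intervals are Wald-type 95% intervals, which are symmetric around the coefficient on the log scale.

```python
import math
from typing import Tuple


def odds_ratio_to_log_scale(
    odds_ratio: float, ci_lower: float, ci_upper: float, z: float = 1.96
) -> Tuple[float, float]:
    """Recover the logit coefficient and its standard error from an OR and its CI.

    Assumes a Wald interval: exp(beta - z * se) to exp(beta + z * se).
    """
    beta = math.log(odds_ratio)
    standard_error = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z)
    return beta, standard_error


def percent_change_in_odds(odds_ratio: float) -> float:
    """Express an odds ratio as a percentage change in the odds per one-level increase."""
    return (odds_ratio - 1) * 100
```

Applied to the gender result (OR = 1.861, 95% CI = 1.173–2.954), this gives a coefficient of about 0.62 with a standard error of about 0.24, and the odds ratio of 0.692 for household members corresponds to odds roughly 31% lower per additional level.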

Table 8 The results of logistic analysis according to the level of depression of the elderly in low-income households.

Discussion

This study analyzed the factors affecting the depression of the elderly from low-income families, using the KOWEPS data based on the literature review mentioned above. The study initially determined whether the factors are related to depression in the elderly of low-income families and then developed a prediction model to predict depression. As a result of the analysis, the decision tree had the highest accuracy as a model for predicting depression among the elderly from low-income families, and the factors that greatly influenced the formation of the model were mainly psychological.

The main findings are as follows. First, when the wrapper's stepwise removal method was applied sequentially according to the relative importance of the factors that affect the prediction of depression in the elderly from low-income families, the decision tree analysis showed the highest predictive power (97.3%). This result is consistent with previous studies in which decision trees showed excellent results in developing predictive models. Lee et al. used artificial neural networks, logistic regression analysis, and decision trees (C5.0, CART, QUEST) to develop a model predicting patient satisfaction and revisit intention according to hospital visits; the decision trees showed the highest predictive power, and C5.0 showed excellent results³⁰. Moreover, decision trees (C5.0, CHAID, and QUEST) were used in a model development study predicting whether patients with severe work histories are admitted to the intensive care unit, and C5.0 showed the best predictive power³¹. The decision tree (C5.0), which is widely used in the healthcare field, has the advantage of an algorithm that can more effectively handle complex relationships between predictors. More importantly, it is known as one of the classification techniques of data mining with proven effectiveness³². Effective depression management services can be expected if groups with a high risk of depression are detected at an early stage. Further refinement of the model to include additional community infrastructure and geographic factors related to depression may lead to more diverse measures to prevent depressive problems among the low-income elderly.

Second, when the decision tree (C5.0) was formed, subjective health awareness, leisure life satisfaction, family support, and social support were the factors with a relatively significant influence. This outcome is supported by a study reporting that depressive disorder in the elderly is on the rise worldwide and that psychological factors such as social support and subjective health awareness are key contributing factors³³. Another study reported that life satisfaction and subjective health awareness have the most significant influence³⁴. Depression in the elderly has been shown to have a significant psychological impact, and decision trees are reported to be a highly effective method^35,36. To prevent and manage depression in the elderly, it is necessary to recognize the need for policy support that considers psychological factors (subjective health awareness, leisure life satisfaction, family support, and social support). For example, adequate mental health management can be provided by conducting free quarterly psychological examinations of the low-income elderly at public health centers and local clinics in each region to detect risk groups for depression, while developing and operating programs to increase psychological support in the community service centers.

Third, a logistic regression analysis was conducted to confirm the predictors of depression in the low-income elderly. As a result, gender, the number of household members, subjective health awareness, leisure life satisfaction, family support, and social support were identified as influencing factors. The risk of depression was higher for women and increased with a smaller number of household members, lower leisure life satisfaction, lower family and social support, and lower subjective health awareness. These results are in line with previous studies^18,35,36,37. The level of depression according to income level can also be examined. Muhammad et al. reported that the elderly population in the poorest fifth quintile was 39% more likely to develop depression than the elderly in the first quintile³⁸. Thus, it can be presumed that depression in the elderly is caused not by a single factor but by a combination of various factors. Accordingly, creating activities in the local community that senior citizens can participate in, such as senior universities and clubs, while actively promoting them and encouraging participation, is considered a way to prevent depression in the long run. All activities could be provided free of charge considering the characteristics of the low-income elderly, and if necessary, participation could be encouraged by offering a subsidy. In South Korea, various psychological support programs for the elderly exist in different regions, so it would be more effective to form a network that establishes and manages roles and functions across regions. For example, the community service centers in each region could act as gatekeepers that identify people who are likely to be depressed and encourage them to participate in community-based psychological support programs.

Finally, the limitations of this study are as follows. First, not all of the various factors affecting depression in the elderly were examined. Previous studies have shown that biological, cultural, and environmental factors act in combination to affect depression; however, the current study could not include all of them due to data limitations. Second, this study was cross-sectional, which makes it difficult to identify causal relationships over time. Third, the characteristics of different age groups within the elderly were not considered in assessing influences on depression. Since old age is now commonly divided into early, middle, and late stages with distinct characteristics, it is highly likely that the factors influencing depression and the size of their impact differ across these stages.

Conclusion

This study selected factors related to depression in the elderly from low-income families identified in previous studies and used them to develop a prediction model for depression in this group. Psychological factors (leisure life satisfaction, subjective health awareness, family support, and social support) ranked higher in importance than demographic factors, and the most suitable predictive model was identified as a decision tree. These results suggest that an approach focused on psychological support is needed to manage the level of depression in low-income seniors. More importantly, because the factors influencing depression vary within the elderly population, a decision tree will be beneficial for establishing a more concrete prediction model.

Data availability

The data are available for a specific purpose upon request to the first author of the study.

Abbreviations

CES-D: The Center for Epidemiological Studies-Depression Scale
CI: Confidence Interval
COVID-19: Corona Virus Disease 19
DT: Decision Tree
KHPS: Korea Health Panel Study
LR: Logistic Regression
NN: Neural Network
OR: Odds Ratio
RF: Random Forest
SHAP: Shapley Additive Explanation

References

World Health Organization. Ageing (Accessed 2 December 2022); https://www.who.int/health-topics/ageing#tab=tab_1 (2022).
Statistics Korea. 2022 Statistics on the Aged (Accessed 2 July 2022); http://kostat.go.kr/portal/eng/pressReleases/11/3/index.board (2022).
Shao, M., Chen, J. & Ma, C. Research on the relationship between Chinese elderly health status, social security, and depression. Int. J. Environ. Res. Public Health 19(12), 7496 (2022).
Statistics Korea. 2021 Statistics on the Aged (Accessed 2 December 2022); http://kostat.go.kr/portal/eng/pressReleases/11/3/index.board (2021) (Korean).
WebMD. What to Know About Mental Health in Older Adults (Accessed 3 December 2022); https://www.webmd.com/healthy-aging/mental-health-in-older-adults (2021) (Korean).
American Psychological Association. How to prevent depression as you age (Accessed 3 December 2022); https://www.apa.org/topics/aging-older-adults/depression (2022).
Kim, K. M., Kim, J. H. & Rhee, H. S. A study on the depression levels and influencing factors in the elderly: A comparison between low-income and ordinary-income households. Health Soc. Welfare Rev. 40(3), 286–314 (2020) (Korean).
Google Scholar
Mental Health Foundation. Older people: Statistics (Accessed 3 December 2022); https://www.mentalhealth.org.uk/explore-mental-health/mental-health-statistics/older-people-statistics. (2022).
Sparks. Risky behavior: The roles of depression, openness to experience, and coping (Accessed 3 December 2022); http://www.sparksjournal.org/risky-behavior/ (2020)
Banerjee, A. et al. Depression and loneliness among the elderly in low-and middle-income countries. J. Econ. Perspect. 37(2), 179–202 (2023).
Article Google Scholar
Sánchez-Moreno, E. & Gallardo-Peralta, L. P. Income inequalities, social support and depressive symptoms among older adults in Europe: A multilevel cross-sectional study. Eur. J. Ageing 19(3), 663–675 (2021).
Article PubMed PubMed Central Google Scholar
Anbesaw, T. & Fekadu, B. Depression and associated factors among older adults in Bahir Dar city administration, Northwest Ethiopia, 2020: Cross-sectional study. PLoS ONE 17(8), e0273345 (2020).
Article Google Scholar
Lee, S. H. Moderating effects of interpersonal relation and social network on the relationship between depression and health behavior in elderly. J. Digit. Converg. 15(9), 397–406 (2017) (Korean).
Google Scholar
Kasa, A. S., Lee, S. C. & Chang, H. R. Prevalence and factors associated with depression among older adults in the case of a low-income country, Ethiopia: A systematic review and meta-analysis. BMC Psychiatry 22(1), 675 (2022).
Article PubMed PubMed Central Google Scholar
Zhao, L., Wang, J., Deng, H., Chen, J. & Ding, D. Depressive symptoms and ADL/IADL disabilities among older adults from low-income families in Dalian, Liaoning. Clin. Interv. Aging 17, 733–743 (2022).
Article PubMed PubMed Central Google Scholar
Han, K. et al. Psychosocial factors for influencing healthy aging in adults in Korea. Health Qual. Life Outcomes 13, 31 (2015) (Korean).
Article PubMed PubMed Central Google Scholar
Wang, S. H. et al. Text mining for identifying topics in the literatures about adolescent substance use and depression. BMC Public Health 19(16), 1–8 (2016).
Google Scholar
Li, G., Li, B., Huang, L. & Hou, S. Automatic construction of a depression-domain lexicon based on microblogs: Text mining study. JMIR Med. Inform. 8(6), e17650 (2020).
Article PubMed PubMed Central Google Scholar
Kim, I. H. & Kim, C. S. Leisure life satisfaction: will it have a beneficial impact on depression among older adults in community care settings in Korea?. J. Prev. Med. Public Health 55(4), 398 (2022) (Korean).
Article PubMed PubMed Central Google Scholar
El-Hasnony, I. M., Elzeki, O. M., Alshehri, A. & Salem, H. Multi-label active learning-based machine learning model for heart disease prediction. Sensors (Basel) 22(3), 1184 (2022).
Article ADS PubMed Google Scholar
Velu, S. R., Ravi, V. & Tabianan, K. Data mining in predicting liver patients using classification model. Health Technol. (Berl). 12(6), 1211–1235 (2022).
Article PubMed PubMed Central Google Scholar
USF Health. Data mining in healthcare (Accessed 3 December 2022); https://www.usfhealthonline.com/resources/healthcare-analytics/data-mining-in-healthcare/ (2021)
Health Chosun News. The "crisis of old age" has increased by 81% in 10 years, including depression (Accessed 4 December 2022); https://m.health.chosun.com/svc/news_view.html?contid=2020100801070 (2020) (Korean).
National Assembly Library of Korea. Welfare of Senior Citizens Act. 2019 (Accessed 4 Dec 2022); https://elaw.klri.re.kr/eng_mobile/viewer.do?hseq=49845&type=part&key=38 (2019).
Center for Strategic & International Studies. A Timeline of South Korea’s Response to COVID-19 (Accessed 3 July 2022); https://www.csis.org/analysis/timeline-south-koreas-response-covid-19 (2020).
Radloff, L. S. The CES-D scale: A self-report depression scale for research in the general population. Appl. Psychol. Meas. 1(3), 385–401 (1977).
Article Google Scholar
Li, J. et al. A multicenter random forest model for effective prognosis prediction in collaborative clinical research network. Artf. Intell. Med. 103, 101814 (2020).
Article Google Scholar
Vakharia, V. & Gujar, R. Prediction of compressive strength and Portland cement composition using cross-validation and feature ranking techniques. Constr. Build. Mater. 225, 292–301 (2019).
Article Google Scholar
Stein, M. B. & Heimberg, R. G. Well-being and life satisfaction in generalized anxiety disorder: Comparison to major depressive disorder in a community sample. J. Affect. Discord. 79(1–3), 161–166 (2004).
Article Google Scholar
Lee, J., Liu, M. & Lim, G. G. A study on the revitalization of tourism industry through big data analysis. J. Intell. Inf. Syst. 24(2), 149–169 (2018) (Korean).
Google Scholar
Chang, C. C. et al. Utilization of decision tree algorithms for supporting the prediction of intensive care unit admission of myasthenia gravis: A machine learning-based approach. J. Pers. Med. 12(1), 32 (2022).
Article PubMed PubMed Central Google Scholar
Wang, Q. et al. Predictive analysis of the pro-environmental behaviour of college students using a decision-tree model. Int. J. Environ. Res. Public Health 19(15), 9407 (2022).
Article CAS PubMed PubMed Central Google Scholar
Abdoli, N. et al. The global prevalence of major depressive disorder (MDD) among the elderly: A systematic review and meta-analysis. Neurosci. Biobehav. Rev. 132, 1067–1073 (2022).
Article PubMed Google Scholar
Park, M., Choi, S., Shin, A. M. & Koo, C. H. Analysis of the characteristics of the older adults with depression using data mining decision tree analysis. J. Korean Acad. Nurs. 43(1), 1–10 (2013) (Korean).
Article PubMed Google Scholar
Yun, K. & Lee, Y. J. Factors influencing depression in older adults according to family structure: Data from the 2020 National Older Koreans Data. J. Korean Gerontol. Nurs. 24(1), 1–12 (2022) (Korean).
Article Google Scholar
Kim, H. J. et al. Depression among elderly in long-term care facilities: Focusing on the prevalence and related factors. Korean J. Fam. Pract. 8(3), 455–461 (2018).
Article Google Scholar
Kim, B. Factors influencing depressive symptoms in the elderly: Using the 7th Korea National Health and Nutrition Examination Survey (KNHANES VII-1). J. Health Inf. Stat. 45(2), 165–172 (2020).
Article MathSciNet Google Scholar
Muhammad, T., Skariah, A. E., Kumar, M. & Srivastava, S. Socioeconomic and health-related inequalities in major depressive symptoms among older adults: A Wagstaff’s decomposition analysis of data from the LASI baseline survey, 2017–2018. BMJ Open 12(6), e054730 (2022).
Article CAS PubMed PubMed Central Google Scholar

Download references

Author information

Authors and Affiliations

Department of Health Policy and Management, Graduate School, Korea University, Seoul, Korea
Kyu-Min Kim & Jae-Hak Kim
BK21FOUR R&E Center for Learning Health Systems, Korea University, Seoul, Korea
Kyu-Min Kim
Department of Fitness Promotion and Rehabilitation Exercise, National Rehabilitation Center, Seoul, Korea
Jae-Hak Kim
Department of Health Policy and Management, College of Public Health Science, Korea University, Seoul, Korea
Hyun-Sill Rhee
Department of Preventive Medicine, College of Korean Medicine, Kyung Hee University, Seoul, Korea
Bo-Young Youn

Contributions

K.K. contributed significantly to the analysis and preparation of the manuscript; J.K. contributed; H.R. and B.Y. contributed the concept of the study and helped perform the analysis. All authors have read and approved the manuscript.

Corresponding author

Correspondence to Bo-Young Youn.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Kim, KM., Kim, JH., Rhee, HS. et al. Development of a prediction model for the depression level of the elderly in low-income households: using decision trees, logistic regression, neural networks, and random forest. Sci Rep 13, 11473 (2023). https://doi.org/10.1038/s41598-023-38742-1

Received: 12 March 2023
Accepted: 13 July 2023
Published: 16 July 2023
DOI: https://doi.org/10.1038/s41598-023-38742-1
