Abstract
Frequent emergency department use is associated with many adverse events, such as increased risk of hospitalization and mortality. Frequent users have complex needs, and the factors associated with frequent use are commonly evaluated using logistic regression. However, other machine learning models, especially those exploiting the potential of large databases, have been less explored. This study aims to compare the performance of logistic regression with that of four machine learning models for predicting frequent emergency department use in an adult population with chronic diseases in the province of Quebec (Canada). This is a retrospective population-based study using medical and administrative databases from the Régie de l’assurance maladie du Québec. Two definitions of frequent emergency department use (the outcome to predict) were used: having at least three and at least five visits during a one-year period. Independent variables included sociodemographic characteristics, healthcare service use, and chronic diseases. We compared the performance of logistic regression with gradient boosting machine, naïve Bayes, neural networks, and random forests (binary and continuous outcome) using area under the ROC curve, sensitivity, specificity, positive predictive value, and negative predictive value. Out of 451,775 ED users, 43,151 (9.5%) and 13,676 (3.0%) were frequent users with at least three and five visits per year, respectively. Random forests with a binary outcome had the lowest performance (area under the ROC curve: 53.8 [95% confidence interval 53.5–54.0] and 51.4 [95% confidence interval 51.1–51.8] for frequent users 3 and 5, respectively), while the other models had superior and overall similar performance. The most important variable in prediction was the number of emergency department visits in the previous year. No model outperformed the others. Innovations in algorithms may slightly refine current predictions, but access to other variables may be more helpful in the case of frequent emergency department use prediction.
Introduction
Although definitions may vary, individuals who visit the emergency department (ED) at least three times per year are considered “frequent users”1,2,3. Frequent ED users often display heterogeneous profiles (a combination of mental health disorders, physical comorbidities, and low socioeconomic status1,4,5), leading to complex needs that are not adequately dealt with in an ED6. A significant proportion of frequent ED users have numerous chronic diseases, such as coronary artery disease or chronic obstructive pulmonary disease4,7. Those conditions could be managed in primary care, preventing acute deteriorations that lead to ED use8. Since frequent ED use for complex needs may occur because those needs have not been adequately addressed in a primary care context, this type of ED use is considered suboptimal. As an indicator of unmet needs, it is associated with negative outcomes for patients (e.g., higher hospital admission or mortality rates9,10). Furthermore, ED costs are generally higher than those in a primary care setting, resulting in a socioeconomic burden for the health system7,11,12. In the province of Quebec (Canada), frequent ED users with chronic diseases represent 9.2% of all ED users but account for 28.8% of all ED visits13. Furthermore, a recent Canadian census shows that the prevalence of chronic conditions will increase as the population ages; the burden on the healthcare system is therefore likely to increase14.
Targeted interventions such as case management have been shown to help reduce ED visits and ED costs, while improving patient satisfaction and clinical outcomes2,15,16. In this context, being able to accurately predict frequent ED use is relevant for targeting users who may truly benefit from such interventions. Much work has been done in this direction with statistical models. In particular, logistic regression (LR) is a standard and widely used statistical model17. However, with the constant improvement in the quantity and quality of measurements (electronic health records), statistical models, and computer capacity, modern machine learning (ML) models are becoming increasingly popular. Previous studies have successfully predicted frequent ED use for a specific issue18,19 or in a local hospital20 using ML models other than LR. Yet no study has compared the predictive power of ML models in a general population while considering chronic diseases.
This study aims to compare the performance of logistic regression with that of four ML models for predicting frequent emergency department use in an adult population with chronic diseases in the province of Quebec (Canada).
Methods
All methods in this study were carried out in accordance with the TRIPOD guidelines for model development and validation (see the Supplementary material Table S1)21.
Study design and data sources
This is a population-based retrospective cohort study. We used medico-administrative databases from the health insurance board of the province of Quebec (Régie de l’assurance maladie du Québec), which manages the health insurance plan for Quebec citizens. The following files were used:
1. The patient demographic register, which contains information about the sex, date of birth, date of death (if applicable), and place of residence of the patient;
2. The physician reimbursement claim register, which contains information about medical services provided by a fee-for-service physician in Quebec: date of service, place of service (emergency, medical clinic, etc.), physician specialty, diagnosis (International Classification of Diseases, ninth revision or ICD-9), and the medical act procedure performed by the physician;
3. The hospital register, which contains information about the reasons for hospitalization (main diagnosis and up to 25 secondary diagnoses coded in ICD-10), dates of admission and release from hospital, and all medical procedures performed during the hospitalization.
Selection of participants
The study population included all adults (18 years and older) living in the province of Quebec, with at least one ED visit during the inclusion period, i.e., between the 1st of January 2012 and the 31st of December 2013, diagnosed with at least one chronic condition, and without dementia. Patients with dementia may have special needs compared to cognitively intact patients and were thus not included. In this study, the diseases considered were those from the Canadian Institute for Health Information (see the Supplementary material Table S2): asthma, chronic obstructive pulmonary disease (COPD), congestive heart failure (CHF), coronary artery disease (CAD), diabetes, epilepsy, and high blood pressure (HBP)22. Those specific conditions, also known as ambulatory care sensitive conditions, are a set of chronic diseases for which timely intervention in primary care could reduce the risk of hospitalization or the occurrence of acute episodes for those diseases23,24,25. The index date was randomly assigned as one ED visit among all ED visits occurring during the inclusion period26. The index date is then used as a “starting point” for measuring patient characteristics, such as ED use, age, or diagnoses.
There were two exclusion criteria (Fig. 1). First, patients living in remote areas were excluded (6.8%). Remote areas were defined as municipalities with fewer than 10,000 inhabitants and a weak or no metropolitan influence zone (i.e., less than 5% of the resident employed labour force commutes to work in urban areas). This exclusion ensured that remote residents, who tend to use the ED as an alternative to walk-in clinics (as there are fewer primary care alternatives27,28), were not included. However, patients living in municipalities with fewer than 10,000 inhabitants and a high or moderate metropolitan influence were included. Second, patients who died during the year after their index date (8.3%) were excluded, as they may require specialized healthcare, such as end-of-life care29,30. In addition, that exclusion helped reduce immortal time bias31.
Outcome and independent variables
Frequent ED use was investigated using two different definitions: (1) having at least three visits (“frequent users 3”) and (2) having at least five visits (“frequent users 5”) during the year following the index date (as mentioned in the previous subsection, the index date is an assigned ED visit between 2012 and 2013). Those definitions were chosen amongst the most common ones in order to compare performance in two populations that were different, yet still considered frequent users.
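As a rough illustration of the index date and outcome definitions above, the following sketch assigns a random index visit within the inclusion period and derives both outcome labels. The function names, the 365-day follow-up window, and the choice to exclude the index visit itself from the count are assumptions made for this example; the source does not specify them, and this is not the authors' code.

```python
import random
from datetime import date, timedelta

# Inclusion period stated in the text.
INCLUSION_START = date(2012, 1, 1)
INCLUSION_END = date(2013, 12, 31)


def assign_index_date(visits, rng):
    """Pick one ED visit at random among those in the inclusion period."""
    eligible = [v for v in visits if INCLUSION_START <= v <= INCLUSION_END]
    return rng.choice(eligible) if eligible else None


def count_visits_after(visits, index_date, days=365):
    """Count ED visits in the year following the index date.

    Assumption: the index visit itself is not counted and a year is 365 days.
    """
    window_end = index_date + timedelta(days=days)
    return sum(1 for v in visits if index_date < v <= window_end)


def label_frequent_use(visits, index_date):
    """Return both outcome definitions for one patient."""
    n = count_visits_after(visits, index_date)
    return {"frequent_users_3": n >= 3, "frequent_users_5": n >= 5}


# Hypothetical patient with six ED visits.
visits = [
    date(2012, 3, 1), date(2012, 6, 10), date(2012, 9, 5),
    date(2013, 1, 20), date(2013, 2, 27), date(2013, 8, 1),
]
index_date = assign_index_date(visits, random.Random(42))
print(index_date, label_frequent_use(visits, index_date))
```

With a fixed index date of 1 March 2012, this hypothetical patient has four visits in the following year and would therefore count as a frequent user under the first definition only.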
Independent variables (or predictors) considered at the index date were sex, age, residential area (metropolitan: ≥ 100,000; small town: 10,000–100,000; rural: < 10,000 with high or moderate metropolitan influence), material and social deprivation indices32, public prescription drug insurance plan status (PPDIP, see below for the different statuses), having been hospitalized in the two years before the index date, the number of previous ED visits during the year before the index date (PV), and the combined comorbidity index of Charlson (CCI33). The following diagnoses were considered: chronic disease (one diagnosis for each condition, i.e. asthma, COPD, CHF, CAD, diabetes, epilepsy, and HBP), chronic non-cancer pain (CNCP)34, injury, common mental disorders (CMD)35, serious mental disorders (SMD)35, alcohol abuse, and drug abuse. Each condition was identified using the reported diagnoses in the hospital register (one diagnosis) or in the physician reimbursement claim register (at least two diagnoses), during a two-year period before the index date.
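The diagnosis rule described above (one diagnosis in the hospital register, or at least two in the physician claim register, within two years before the index date) can be sketched as follows. The function name, the 730-day look-back, and the handling of the window boundaries are assumptions for illustration only.

```python
from datetime import date, timedelta


def has_condition(hospital_dx_dates, claim_dx_dates, index_date, lookback_days=730):
    """Flag a condition from diagnosis dates recorded for one patient.

    A condition is flagged with at least one hospital-register diagnosis or
    at least two physician-claim diagnoses in the look-back window.
    Assumption: the window is [index_date - 730 days, index_date).
    """
    start = index_date - timedelta(days=lookback_days)

    def in_window(d):
        return start <= d < index_date

    n_hospital = sum(1 for d in hospital_dx_dates if in_window(d))
    n_claims = sum(1 for d in claim_dx_dates if in_window(d))
    return n_hospital >= 1 or n_claims >= 2


index_date = date(2013, 6, 1)
# One physician claim alone is not enough ...
print(has_condition([], [date(2013, 1, 15)], index_date))
# ... but two claims, or a single hospital diagnosis, are.
print(has_condition([], [date(2012, 2, 1), date(2013, 1, 15)], index_date))
print(has_condition([date(2012, 11, 3)], [], index_date))
```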
Regarding PPDIP status, the Quebec province has four different statuses: “regular recipient of PPDIP”, “admissible to PPDIP and age ≥ 65 years with guaranteed income supplement” (GIS), “not admissible to PPDIP” (individuals with a private insurance plan), or “admissible to PPDIP and being a recipient of last-resort financial assistance” (LRFA)36.
Fewer than 5% of observations had missing values, mainly for the material and social deprivation indices, and those observations were kept.
Statistical analysis
Frequent ED use prediction is a case of supervised learning, meaning that there are explicit labelled classes (i.e., frequent user or not). Along with logistic regression (LR), four ML predictive models amongst the most efficient for predicting a binary outcome38 were evaluated:
1. Gradient boosting machines (GBM) build an ensemble of successive decision trees; each tree is a weak learner that improves on the previous one using the residuals39. Tuning parameters were the learning rate and the tree depth.
2. The naïve Bayes (NB) model is based on Bayes’ theorem and uses a priori probabilities40. The tuning parameter was the Laplace smoothing for probabilities.
3. Neural networks (NN) feed data through interconnected hidden layers of “neurons”, which apply mathematical operations to the inputs (the independent variables)41. Tuning parameters were the number of neurons and the weight decay.
4. Random forests (RF) apply sequential splits to the data such that the separation is maximized with regard to a homogeneity criterion (e.g., the Gini index), resulting in a tree-like structure40. RF were evaluated with a binary (RF1) and a continuous outcome (RF2). Tuning parameters were the number of trees and the homogeneity criterion used.
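To make the homogeneity criterion mentioned for tree-based models more concrete, here is a minimal sketch of the Gini index for a binary outcome and of the weighted Gini index of a candidate split. The function names and toy labels are hypothetical; tree libraries compute this internally, and this is not the implementation used in the study.

```python
def gini(labels):
    """Gini index of a node with binary labels (0 = pure, 0.5 = evenly mixed)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # proportion of frequent users in the node
    return 2 * p * (1 - p)  # equals 1 - p**2 - (1 - p)**2


def split_gini(left, right):
    """Size-weighted Gini index of the two child nodes produced by a split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Toy node: 1 = frequent user, 0 = non-frequent user.
parent = [0, 0, 0, 1, 1, 1]
# A hypothetical split on "number of previous ED visits >= 3".
left, right = [0, 0, 0, 1], [1, 1]

decrease = gini(parent) - split_gini(left, right)
print(f"Gini decrease for this split: {decrease:.3f}")
```

A tree chooses, at each node, the split with the largest decrease; averaging such decreases over all splits that use a given variable is the idea behind the Gini-based variable importance.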
The cohort was randomly divided into a training set (80% of the cohort) for building models and a testing set (the remaining 20%) for evaluating performance18,42. This procedure is commonly used to guard against overfitting, a sensitive issue when dealing with ML algorithms43. Area under the ROC curve (AUC), sensitivity (SEN), specificity (SPE), positive predictive value (PPV), and negative predictive value (NPV) were computed to compare performance. AUC 95% confidence intervals were also computed using DeLong’s method44. Following the same reasoning as Grinspan et al.18, the predictive ability of a model was judged on its AUC using five categories: poor (0.50–0.59), fair (0.60–0.69), good (0.70–0.79), very good (0.80–0.89), and excellent (0.90–1.0). The best cut-off thresholds were selected using Youden’s statistic45 in order to compute sensitivity, specificity, positive predictive value, and negative predictive value. All tuning parameters were optimized by searching for the maximum AUC, but only the results with the selected parameters are presented here for clarity and brevity.
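The evaluation steps described above can be sketched in a few lines: a rank-based AUC, a cut-off chosen with Youden’s statistic (sensitivity + specificity − 1), and the four metrics at that cut-off. The function names and the toy scores are hypothetical, DeLong confidence intervals are not reproduced, and treating AUC values below 0.60 as “poor” is an assumption about how boundary values are binned.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(scores, labels, threshold):
    """SEN, SPE, PPV and NPV when scores >= threshold are classified positive."""
    tp = fp = tn = fn = 0
    for s, y in zip(scores, labels):
        if s >= threshold:
            tp, fp = tp + (y == 1), fp + (y == 0)
        else:
            fn, tn = fn + (y == 1), tn + (y == 0)

    def ratio(a, b):
        return a / b if b else float("nan")

    return {"SEN": ratio(tp, tp + fn), "SPE": ratio(tn, tn + fp),
            "PPV": ratio(tp, tp + fp), "NPV": ratio(tn, tn + fn)}


def youden_threshold(scores, labels):
    """Cut-off maximizing Youden's J = SEN + SPE - 1 over observed scores."""
    def j(t):
        m = confusion_metrics(scores, labels, t)
        return m["SEN"] + m["SPE"] - 1
    return max(sorted(set(scores)), key=j)


def auc_category(value):
    """Map an AUC (0-1 scale) to the five categories used in the text."""
    for bound, name in [(0.90, "excellent"), (0.80, "very good"),
                        (0.70, "good"), (0.60, "fair")]:
        if value >= bound:
            return name
    return "poor"


# Toy predicted probabilities on a hypothetical testing set.
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]

cutoff = youden_threshold(scores, labels)
print(auc(scores, labels), auc_category(auc(scores, labels)))
print(cutoff, confusion_metrics(scores, labels, cutoff))
```

Note that the cut-off is chosen to balance sensitivity and specificity, so PPV and NPV follow from it and from the outcome prevalence rather than being optimized directly.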
Results from ML models (except LR) are not as directly interpretable as those from regression models, which straightforwardly assess the effect of predictive variables on the outcome with quantities such as odds ratios. However, the ML framework allows for the evaluation of variable importance in a prediction model (also called feature importance). It was computed as the mean decrease in the Gini index in the case of GBM and RF, as a combination of the absolute values of the weights for NN, and as the absolute value of the t-statistic for LR43. While it is not possible to compare variable importance directly from one model to another because the models differ in nature, variable importance is still useful as an interpretable and relative quantity describing the contribution of each predictor. In our models, all the variables are categorical; GBM, LR, and NN compute variable importance relative to a baseline category, while RF computes an overall variable importance. Of note, there is no available variable importance measure when using the NB algorithm.
Sensitivity analyses were conducted on a population of frequent users with at least four visits and with a 50/50 split between training and testing sets.
Statistical significance level was set at α = 0.05 and differences in descriptive statistics were evaluated using chi-square tests. All analyses were performed with statistical software programs SAS (version 9.4) and R (version 4.2 with packages e1071, nnet, ranger, and xgboost).
Ethics approval and consent to participate
The research ethics board of the Centre intégré universitaire de santé et de services sociaux de l’Estrie – Centre hospitalier universitaire de Sherbrooke (number MP-31–2017-1571 – Urgences-CPSA) approved this study. The need for informed consent was waived by the aforementioned research ethics board due to the retrospective nature of the study.
Results
Characteristics of participants
Out of 451,775 ED users, 43,151 (9.5%) and 13,676 (3.0%) were frequent users 3 and frequent users 5, respectively (Table 1). For both definitions, differences between frequent users and non-frequent users were statistically significant except for the residential area variable.
Main results
Multiple combinations of explanatory variables were evaluated. The following variables were selected for their clinical interpretation and explanatory power: age, public prescription drug insurance plan status, Charlson comorbidity index, number of previous ED visits, chronic obstructive pulmonary disease, injury, serious mental disorders, common mental disorders, chronic non-cancer pain, alcohol abuse, and drug abuse. No missing values were observed in the variables selected for prediction.
Model performances are shown in Tables 2 and 3 for frequent users 3 and 5, respectively. In both cases, RF1 (binary outcome) had poor performance regarding AUC and SEN, followed by NB (poor or fair). On the other hand, RF1 had the highest SPE and PPV. GBM, LR, NN, and RF2 had similarly good performance (or very good in the case of GBM, LR, and RF2 for frequent users 5). Performance improved as the threshold for frequent use was increased from three to five visits, except for RF1. Overall, SPE was higher than SEN, and NPV was higher than PPV.
Variable importance results are shown in Tables 4 and 5. Those measures are relative, meaning that importance can only be compared between variables in the same model (e.g., variable importance values from LR and GBM are not comparable). However, the ranking of independent variables in each model can still be compared across models, along with the relative magnitude. All models reported the number of previous ED visits as the most important variable for prediction. The magnitude by which it was superior to the other variables varied considerably. CCI and PPDIP were also important, but to a lesser extent (for instance, their importance was respectively 6 and 12 times less than PV for RF2 in the case of frequent users 5). Among chronic diseases, COPD was the most important.
No significant changes were observed in the interpretation of results during sensitivity analyses.
Limitations
Both quantity and quality of data are essential in an ML context. In this study, we had access to an exhaustive medico-administrative database which included hospital and physician data, but it did not include patient-reported outcomes (e.g., perceived health, included in the Canadian Community Health Survey46). The latter could improve the predictive power of models in future work. For instance, studies using national health surveys and telephone interviews found that fair or poor health status and dissatisfaction with treatment outcome were significantly associated with frequent ED use47,48.
Our study focused on frequent ED users with chronic diseases; though results should only be generalized to this population, chronic diseases are common in the frequent ED user population. Better understanding of a population of ED users with chronic diseases is relevant for other healthcare aspects as chronic diseases are linked not only to frequent ED use, but also to hospitalisations, functioning, and deaths24,49,50.
Discussion
This paper aimed to compare four ML prediction models (gradient boosting machine, naïve Bayes, neural networks, and random forests) with logistic regression for predicting frequent ED use in a population with chronic diseases. Those ML models have been successfully used to predict related issues, such as ED revisits, in-hospital mortality, or hospital admissions at ED triage42,51,52. Accurate ML models may help with early identification of frequent ED use, thus improving targeted interventions such as case management2,15,16. To this end, case-finding tools are appropriate, such as CONECT-6, which was derived from LR models53.
Model performance
In our study, no model clearly outperformed the others. Other studies on frequent ED use that applied ML reached a similar conclusion18,20,54, though they either focused on a specific chronic disease such as asthma or epilepsy or used hospital-only data. In fact, a recent systematic review comparing the performance of LR with ML models (including those used in this study) for clinical prediction of a binary outcome showed that there is currently no clear performance benefit of ML models55. However, this review included only studies that used clinical data. Other studies that focused on ED-related issues (e.g., risk of emergency hospital admission, risk for sepsis, heart failure readmission) found improved predictions with ML56,57, although this is not a general rule58. The quantity of variables (58 to 121 variables56) or very discriminative variables57 explained those improved predictions, among other factors. In our models, increasing the threshold for frequent ED use (thus reducing the number of frequent ED users) gave slightly better performance for all models except RF1. A higher threshold increased the homogeneity of the characteristics of frequent users, thus facilitating prediction of their ED use, a result that has already been observed54. Other studies also compared LR to ML models using administrative claims data59,60 and found similar performance, though they did not focus on ED-related outcomes.
In medical studies, the signal-to-noise ratio (i.e., the amount of information contained in the database that is useful for the prediction) is often low61, which may explain in part the modest improvements (if any) of ML models. The type of available variables may also affect performance. For instance, in a study about uncontrolled diabetes prediction, LR was outperformed by NN and GBM62. The authors used data from administrative claims and from the US census, in which they had access to social determinants, such as food insecurity or recreational park access. It is possible to tune ML models more precisely to overcome those limitations. In our study, this fine-tuning would have amounted to evaluating model parameters over broader spaces. As an example, NN is known for its ability to model complex and nonlinear relationships by combining multiple hidden layers41, an ability that is limited in a traditional LR. Broader ranges could also be evaluated; GBM has shown good performance and helped refine clinical tools when allowed to learn slowly57. However, this fine-tuning comes with a high computation cost and added complexity. The latter drawback may result in overfitting issues and limited generalization.
Our models had higher sensitivity than positive predictive value, apart from RF1. This means that most models identified a fair proportion of frequent ED users, but the number of false positives was substantial. This contrasts with another study on frequent ED use among children with asthma20. The authors also applied ML models (LR and RF amongst others) and found higher positive predictive values (48–70) than sensitivity (16–27). However, their threshold choice was guided by a maximization of the AUC rather than by a statistical criterion. In addition, specificity and negative predictive values were high in our study. This is a known issue when dealing with imbalanced health datasets (frequent users 3 and 5 represented 9.5% and 3.0% of the cohort, respectively)63,64. Algorithms learn mostly from the majority class, which introduces a bias towards non-frequent users. There are possible adjustments, such as under-sampling from the majority class or over-sampling from the minority class, but those may not be recommended procedures as they distort prevalence55. Learning from highly imbalanced datasets is an active research area65 and may affect prediction for frequent ED use in the future, especially if combined with multiple different models66.
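The link between class imbalance and low positive predictive value can be illustrated with Bayes’ theorem. In the sketch below, the prevalences (9.5% and 3.0%) come from the text, whereas the sensitivity and specificity values are hypothetical and are not results of this study.

```python
def ppv(sen, spe, prevalence):
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_pos = sen * prevalence
    false_pos = (1 - spe) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sen, spe, prevalence):
    """Negative predictive value from sensitivity, specificity and prevalence."""
    true_neg = spe * (1 - prevalence)
    false_neg = (1 - sen) * prevalence
    return true_neg / (true_neg + false_neg)


# Hypothetical operating point (assumed values, not taken from the paper).
SEN, SPE = 0.70, 0.80

for name, prevalence in [("frequent users 3", 0.095), ("frequent users 5", 0.030)]:
    print(f"{name}: PPV = {ppv(SEN, SPE, prevalence):.3f}, "
          f"NPV = {npv(SEN, SPE, prevalence):.3f}")
```

At a fixed sensitivity and specificity, a rarer outcome mechanically lowers PPV and raises NPV, which is consistent with the pattern of high specificity and NPV alongside modest PPV described above.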
Variable importance
The models developed in our study are also interesting from a clinical point of view (i.e., risk stratification by variable for frequent ED use). In our study, CCI, PPDIP, and the previous number of ED visits were important, though the latter was the most important variable by a large margin. This result is supported by other studies on frequent ED use conducted with LR26,67,68, but also with ML models18,20. In fact, this variable is usually so important that Brennan et al.68 stated that “targeting patients with the most extreme number of ED visits may be the best and most practical option for targeted interventions”, thus allowing for optimal resource allocation. Hudon et al. (2020) also found that an LR including this variable and previous hospitalization performed almost as well as models with more variables such as comorbidities, sociodemographic status, and public prescription drug insurance status26. Even when predicting other ED-related outcomes, the previous number of visits is relevant: Rahimian et al.56 predicted emergency hospital admission (after an ED visit) using RF and GBM and found that it was the most important variable. They also found that other variables are excellent predictors of emergency hospital admission, such as laboratory test results (e.g., cholesterol ratio, haemoglobin, platelets).
Conclusions
Frequent ED use is a major issue in primary and emergency care, and ML models are becoming increasingly popular in medicine and healthcare in general. They are rapidly evolving and offer new opportunities, but while there has been substantial theoretical progress with ML models, the small improvements do not show a clear superiority over simpler models. The latter still display reasonable performance55,69. In our study, LR was as successful in predicting frequent ED use as other ML models, and the number of previous ED visits was the most important variable. Access to other variables, such as patient-reported outcomes or clinical notes, may be more helpful for refining prediction in the case of frequent ED use. Those types of data have been successfully used with machine learning models in a primary care context, although not for ED use prediction70,71. Future work also includes considering complex non-linear interactions, where ML models outperform traditional ones72.
Data availability
The datasets analysed during the current study are not publicly available due to provincial laws about privacy. Although we acknowledge the importance of data availability for reproducible research, our research team is bound by legal reasons to not divulge any part of the data. The Commission de l’accès à l’information du Québec is the provincial organisation that reviews research projects and allows researchers to access health databases. It is also responsible for ensuring their privacy as those databases contain sensitive patient information and it does not legally allow for making any part of them public. Therefore, we are not able to make any part of our data publicly available.
Abbreviations
- AUC: Area under the curve
- CAD: Coronary artery disease
- CCI: Charlson comorbidity index
- CHF: Congestive heart failure
- CMD: Common mental disorders
- CNCP: Chronic non-cancer pain
- COPD: Chronic obstructive pulmonary disease
- ED: Emergency department
- HBP: High blood pressure
- GBM: Gradient boosting machine
- LR: Logistic regression
- ML: Machine learning
- NB: Naïve Bayes
- NN: Neural network
- NPV: Negative predictive value
- PPDIP: Public prescription drug insurance plan
- RF: Random forests
- SEN: Sensitivity
- SMD: Serious mental disorders
- SPE: Specificity
- PPV: Positive predictive value
References
Krieg, C., Hudon, C., Chouinard, M. C. & Dufour, I. Individual predictors of frequent emergency department use: A scoping review. BMC Health Serv. Res. 16(1), 1–10 (2016).
Kumar, G. S. & Klein, R. Effectiveness of case management strategies in reducing emergency department visits in frequent user patient populations: A systematic review. J. Emerg. Med. 44(3), 717–729 (2013).
Soril, L. J., Leggett, L. E., Lorenzetti, D. L., Noseworthy, T. W. & Clement, F. M. Characteristics of frequent users of the emergency department in the general adult population: A systematic review of international healthcare systems. Health Policy 120(5), 452–461 (2016).
Giannouchos, T. V., Kum, H. C., Foster, M. J. & Ohsfeldt, R. L. Characteristics and predictors of adult frequent emergency department users in the United States: A systematic literature review. J. Eval. Clin. Pract. 25(3), 420–433 (2019).
Dufour, I. et al. Frequent emergency department use by older adults with ambulatory care sensitive conditions: A population-based cohort study. Geriatr. Gerontol. Int. 20(4), 317–323 (2020).
Cunningham, A., Mautner, D., Ku, B., Scott, K. & LaNoue, M. Frequent emergency department visitors are frequent primary care visitors and report unmet primary care needs. J. Eval. Clin. Pract. 23(3), 567–573 (2017).
Billings, J. & Raven, M. C. Dispelling an urban legend: frequent emergency department users have substantial burden of disease. Health Aff. (Millwood) 32(12), 2099–2108 (2013).
Atzema, C. L. & Maclagan, L. C. The transition of care between emergency department and primary care: A scoping study. Acad. Emerg. Med. 24(2), 201–215 (2017).
Sun, B. C., Burstin, H. R. & Brennan, T. A. Predictors and outcomes of frequent emergency department users. Acad. Emerg. Med. 10(4), 320–328 (2003).
Ellis, G., Marshall, T. & Ritchie, C. Comprehensive geriatric assessment in the emergency department. Clin. Interv. Aging 9, 2033–2044 (2014).
Mitchell, M. S., Leon, C. L. K., Byrne, T. H., Lin, W. C. & Bharel, M. Cost of health care utilization among homeless frequent emergency department users. Psychol. Serv. 14(2), 193–202 (2017).
LaCalle, E. & Rabin, E. Frequent users of emergency departments: the myths, the data, and the policy implications. Ann. Emerg. Med. 56(1), 42–48 (2010).
Institut canadien d’information sur la santé. SNISA — Nombre de Visites au Service d’Urgence et Durée du Séjour par Province et Territoire, 2018–2019. ICIS (2019).
Statistics Canada. Population Projections for Canada (2018 to 2068), Provinces and Territories (2018 to 2043) (2019).
Hudon, C. et al. Characteristics of case management in primary care associated with positive outcomes for frequent users of health care: A systematic review. Ann. Fam. Med. 17(5), 448–458 (2019).
Sutherland, D. & Hayter, M. Structured review: Evaluating the effectiveness of nurse case managers in improving health outcomes in three major chronic diseases. J. Clin. Nurs. 18(21), 2978–2992 (2009).
Chiu, Y. et al. Statistical tools used for analyses of frequent users of emergency department: A scoping review. BMJ Open 9(5), e027750 (2019).
Grinspan, Z. M. et al. Predicting frequent ED use by people with epilepsy with health information exchange data. Neurology 85(12), 1031–1038 (2015).
Patel, S. J., Chamberlain, D. B. & Chamberlain, J. M. A machine learning approach to predicting need for hospitalization for pediatric asthma exacerbation at the time of emergency department triage. Acad. Emerg. Med. 25(12), 1463–1470 (2018).
Das, L. T. et al. Predicting frequent emergency department visits among children with asthma using EHR data. Pediatr. Pulmonol. 52(7), 880–890 (2017).
Collins, G. S., Reitsma, J. B., Altman, D. G. & Moons, K. G. M. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD statement. BMC Med. 13(1), 1 (2015).
Canadian Institute for Health Information. Ambulatory Care Sensitive Conditions (2019). Available from: http://indicatorlibrary.cihi.ca/display/HSPIL/Ambulatory+Care+Sensitive+Conditions.
Gibson, O. R., Segal, L. & McDermott, R. A. A systematic review of evidence on the association between hospitalisation for chronic disease related ambulatory care sensitive conditions and primary health care resourcing. BMC Health Serv. Res. 13(1), 336 (2013).
Sanmartin, C., Khan, S. & l’équipe de Recherche de l’Initiative sur les Données Longitudinales Administratives et sur la Santé. Hospitalisations Pour des Conditions Propices aux Soins Ambulatoires (CPSA): Les Facteurs qui Importent (2011).
Hsieh, V. C., Hsieh, M. L., Chiang, J. H., Chien, A. & Hsieh, M. S. Emergency department visits and disease burden attributable to ambulatory care sensitive conditions in elderly adults. Sci. Rep. 9(1), 3811 (2019).
Hudon, C. et al. Risk of frequent ED utilization among an ambulatory care sensitive condition population: a population-based cohort study. Med. Care 58(3), 248–256 (2020).
Rechel, B. et al. Hospitals in rural or remote areas: An exploratory review of policies in 8 high-income countries. Health Policy 120(7), 758–769 (2016).
Haggerty, J. L., Roberge, D., Pineault, R., Larouche, D. & Touati, N. Features of primary healthcare clinics associated with patients’ utilization of emergency rooms: Urban–rural differences. Healthc Policy 3(2), 72 (2007).
Rosenwax, L. K. et al. Hospital and emergency department use in the last year of life: A baseline for future modifications to end-of-life care. Med. J. Aust. 194(11), 570–573 (2011).
Barbera, L., Taylor, C. & Dudgeon, D. Why do patients with cancer visit the emergency department near the end of life?. Can. Med. Assoc. J. 182(6), 563–568 (2010).
Lévesque, L. E., Hanley, J. A., Kezouh, A. & Suissa, S. Problem of immortal time bias in cohort studies: Example using statins for preventing progression of diabetes. BMJ 340, b5087 (2010).
Pampalon, R., Hamel, D. & Gamache, P. The Quebec Index of Material and Social Deprivation: Methodological Follow-up, 1991 Through 2006 (Institut National de Santé Publique du Québec, 2011).
Simard, M., Sirois, C. & Candas, B. Validation of the combined comorbidity index of charlson and elixhauser to predict 30-day mortality across ICD-9 and ICD-10. Med. Care 56(5), 441–447 (2018).
Lacasse, A., Ware, M. A., Dorais, M., Lanctôt, H. & Choinière, M. Is the Quebec provincial administrative database a valid source for research on chronic non-cancer pain? Pharmacoepidemiol. Drug Saf. 24(9), 980–990 (2015).
Gaulin, M., Simard, M., Candas, B., Lesage, A. & Sirois, C. Combined impacts of multimorbidity and mental disorders on frequent emergency department visits: A retrospective cohort study in Quebec, Canada. CMAJ 191(26), E724–E732 (2019).
Éducaloi. The Public Drug Insurance Plan. Québec 2020 [Available from: https://educaloi.qc.ca/en/capsules/the-public-drug-insurance-plan/].
Huang, J. A., Weng, R. H., Lai, C. S. & Hu, J. S. Exploring medical utilization patterns of emergency department users. J. Formos. Med. Assoc. 107(2), 119–128 (2008).
Alanazi, H. O., Abdullah, A. H. & Qureshi, K. N. A critical review for developing accurate and dynamic predictive models using machine learning methods in medicine and health care. J. Med. Syst. 41(4), 69 (2017).
Friedman, J. H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 1189–1232 (2001).
James, G., Witten, D., Hastie, T. & Tibshirani, R. Introduction to Statistical Learning with Applications in R (Springer, 2013).
Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer, 2009).
Hong, W. S., Haimovich, A. D. & Taylor, R. A. Predicting hospital admission at emergency department triage using machine learning. PLoS ONE 13(7), e0201016 (2018).
Kuhn, M. & Johnson, K. Applied Predictive Modeling (Springer, 2013).
Sun, X. & Xu, W. Fast implementation of DeLong’s algorithm for comparing the areas under correlated receiver operating characteristic curves. IEEE Signal Process. Lett. 21(11), 1389–1393 (2014).
Le, C. T. A solution for the most basic optimization problem associated with an ROC curve. Stat. Methods Med. Res. 15(6), 571–584 (2006).
Statistics Canada. Canadian Community Health Survey - Annual Component (CCHS) 2020 [Available from: https://www23.statcan.gc.ca/imdb/p2SV.pl?Function=assembleDESurv&DECId=113674&RepClass=591&Id=1263799&DFId=180541].
Vinton, D. T., Capp, R., Rooks, S. P., Abbott, J. T. & Ginde, A. A. Frequent users of US emergency departments: Characteristics and opportunities for intervention. Emerg. Med. J. 31(7), 526–532 (2014).
Huang, J. A., Tsai, W. C., Chen, Y. C., Hu, W. H. & Yang, D. Y. Factors associated with frequent use of emergency services in a medical center. J. Formos. Med. Assoc. 102(4), 222–228 (2003).
Rizzuto, D., Melis, R. J. F., Angleman, S., Qiu, C. & Marengoni, A. Effect of chronic diseases and multimorbidity on survival and functioning in elderly adults. J. Am. Geriatr. Soc. 65(5), 1056–1060 (2017).
Statistics Canada. Table 13-10-0800-01 Deaths and mortality rate (age standardization using 2011 population), by selected grouped causes (2020) [Available from: https://doi.org/10.25318/1310080001-eng].
Hao, S. et al. Risk prediction of emergency department revisit 30 days post discharge: A prospective study. PLoS ONE 9(11), e112944 (2014).
Taylor, R. A. et al. Prediction of in-hospital mortality in emergency department patients with sepsis: A local big data-driven, machine learning approach. Acad. Emerg. Med. 23(3), 269–278 (2016).
Hudon, C. et al. CONECT-6: A case-finding tool to identify patients with complex health needs. BMC Health Serv. Res. 21(1), 1–9 (2021).
Pereira, M. et al. (eds) Predicting Future Frequent Users of Emergency Departments in California State (Association for Computing Machinery, Inc, 2016).
Christodoulou, E. et al. A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models. J. Clin. Epidemiol. 110, 12–22 (2019).
Rahimian, F. et al. Predicting the risk of emergency admission with machine learning: Development and validation using linked electronic health records. PLoS Med. 15(11), e1002695 (2018).
Delahanty, R. J., Alvarez, J., Flynn, L. M., Sherwin, R. L. & Jones, S. S. Development and evaluation of a machine learning model for the early identification of patients at risk for sepsis. Ann. Emerg. Med. 73(4), 334–344 (2019).
Frizzell, J. D. et al. Prediction of 30-day all-cause readmissions in patients hospitalized for heart failure: Comparison of machine learning and other statistical approaches. JAMA Cardiol. 2(2), 204–209 (2017).
Desai, R. J., Wang, S. V., Vaduganathan, M., Evers, T. & Schneeweiss, S. Comparison of machine learning methods with traditional models for use of administrative claims with electronic medical records to predict heart failure outcomes. JAMA Netw. Open 3(1), e1918962 (2020).
MacKay, E. J. et al. Application of machine learning approaches to administrative claims data to predict clinical outcomes in medical and surgical patient populations. PLoS ONE 16(6), e0252585 (2021).
Ennis, M., Hinton, G., Naylor, D., Revow, M. & Tibshirani, R. A comparison of statistical learning methods on the Gusto database. Stat. Med. 17(21), 2501–2508 (1998).
Basu, S. & Narayanaswamy, R. A prediction model for uncontrolled type 2 diabetes mellitus incorporating area-level social determinants of health. Med. Care 57(8), 592–600 (2019).
Liu, Y.-Q., Wang, C. & Zhang, L. (eds) Decision tree based predictive models for breast cancer survivability on imbalanced data. In 2009 3rd International Conference on Bioinformatics and Biomedical Engineering (IEEE, 2009).
Dubey, R., Zhou, J., Wang, Y., Thompson, P. M., Ye, J. & Alzheimer’s Disease Neuroimaging Initiative. Analysis of sampling techniques for imbalanced data: An n = 648 ADNI study. Neuroimage 87, 220–241 (2014).
Klement, W., Wilk, S., Michalowski, W. & Matwin, S. (eds) Classifying severely imbalanced data. In Canadian Conference on Artificial Intelligence (Springer, 2011).
Huang, F., Wang, S. & Chan, C.-C. (eds) Predicting disease by using data mining based on healthcare information system. In 2012 IEEE International Conference on Granular Computing (IEEE, 2012).
Okuyemi, K. S. & Frey, B. Describing and predicting frequent users of an emergency department. J. Assoc. Acad. Minor. Phys. 12(1–2), 119–123 (2001).
Brennan, J. et al. Predicting frequent use of emergency department resources. Ann. Emerg. Med. 64(4), S118–S119 (2014).
Hand, D. J. Classifier technology and the illusion of progress. Stat. Sci. 21, 1–14 (2006).
Verma, D., Bach, K. & Mork, P. J. (eds) Application of machine learning methods on patient reported outcome measurements for predicting outcomes: A literature review. In Informatics (MDPI, 2021).
Hylan, T. R. et al. Automated prediction of risk for problem opioid use in a primary care setting. J. Pain 16(4), 380–387 (2015).
Orfanoudaki, A. et al. Machine learning provides evidence that stroke risk is not linear: The non-linear Framingham stroke risk score. PLoS ONE 15(5), e0232414 (2020).
Funding
This study was supported by the Fonds de recherche du Québec – Santé, the Fonds de recherche du Québec – Nature et technologie, the Québec SPOR SUPPORT Unit, and the Centre de recherche du Centre hospitalier de l’université de Sherbrooke.
Author information
Contributions
C.H. and A.V. acquired the funding. Y.M.C., J.C., and C.H. were involved in data analysis. Y.M.C. and C.H. wrote the original draft. Y.M.C., A.V., J.C., I.D., and C.H. contributed to the study conceptualization, interpreted and validated the results, and reviewed and edited the submitted manuscript. All data used in this study were fully anonymized and no individual detail was used; therefore, consent for publication is not applicable.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Chiu, Y.M., Courteau, J., Dufour, I. et al. Machine learning to improve frequent emergency department use prediction: a retrospective cohort study. Sci Rep 13, 1981 (2023). https://doi.org/10.1038/s41598-023-27568-6
DOI: https://doi.org/10.1038/s41598-023-27568-6