Age-adjusted suicide rates in the US have increased over the past two decades across all age groups. The ability to identify risk factors for suicidal behavior is critical to selected and indicated prevention efforts among those at elevated risk of suicide. We used widely available statewide hospitalization data to identify and test the joint predictive power of clinical risk factors associated with death by suicide for patients previously hospitalized for a suicide attempt (N = 19,057). Twenty-eight clinical factors from the prior suicide attempt were found to be significantly associated with the hazard of subsequent suicide mortality. These risk factors and their two-way interactions were used to build a joint predictive model via stepwise regression, in which the predicted individual survival probability was found to be a valid measure of risk for later suicide death. A high-risk group with a four-fold increase in suicide mortality risk was identified based on the out-of-sample predicted survival probabilities. This study demonstrates that the combination of state-level hospital discharge and mortality data can be used to identify suicide attempters who are at high risk of subsequent suicide death.
Suicide is a serious public health concern in the United States resulting in over 47,000 deaths each year1. Recent analysis indicates that the overall age-adjusted suicide rates have increased in the United States from 1999 to 2016, with increases reported among men and women and across all age groups2 despite the fact that suicide screening questions are a standard component of the clinical psychiatric interview. The ability to identify demographic and health event-related risk factors is critical to selected and indicated prevention efforts among those at elevated risk of suicide3.
The expanded use of electronic health records (EHR) in the US has stimulated efforts to identify patients at risk of suicide in different populations. A handful of studies using EHR and claims data have employed data mining and machine learning approaches to predict suicidal behavior and suicide mortality among patients in large healthcare systems4,5,6,7,8. Such studies have not only confirmed the importance of prominent clinical risk factors for suicide attempts and death identified in prior research (e.g., mental health diagnoses, particularly depressive disorders9, substance use disorders10, adverse childhood experiences11, HIV and sleep disorders12), but have also identified myriad other characteristics and features in their predictive algorithms that lead to greatly improved predictive accuracy compared to previous efforts3,5,14. In addition, follow-up of patients completing suicide risk assessments has found that predictive models applied to EHR data achieved higher sensitivity and specificity in identifying suicidal behavior than clinical assessments15. Similar findings were observed in a study utilizing veterans’ health data, providing additional evidence that, while clinicians may identify a state of risk using traditional clinical assessment techniques, predictive models are capable of identifying higher-risk patients who are missed during clinical assessments and are most likely to complete a lethal suicidal act4,16.
Despite the promise of using large healthcare databases to identify patients at risk of suicide, a critical challenge still remains: how to incorporate such models into clinical practice in diverse healthcare settings. Although there are many aspects to this challenge, including the alteration of clinical workflows, training providers and staff to respond appropriately to suicide risk17, and creating access to behavioral health treatment resources18, perhaps the most daunting among these is the limited data available to most healthcare providers. The rich datasets from which the most comprehensive and accurate algorithms have been generated are derived from large and sophisticated integrated delivery systems, health plans, and research networks4,6,19. Health system-wide medical records data of this nature are not, and may likely never be, available to the vast majority of healthcare providers in the US.
This study seeks to address this limitation by using available state-wide data to predict suicide mortality among patients previously hospitalized for a suicide attempt. There is strong evidence that individuals who have previously attempted suicide are at substantially elevated risk for subsequent death by suicide20, and risk stratification among patients in this vulnerable cohort could inform all aspects of care. The hospital discharge data used in this analysis are widely available in the US. Moreover, All-Payer Claims Databases (APCDs), which contain both inpatient and outpatient claims along with information related to pharmacy utilization, imaging, and laboratory data, are currently deployed in 27 states covering two-thirds of the US population21,22. If such data can be used to generate accurate models of subsequent suicide risk, their widespread availability would allow them to be employed in healthcare settings throughout the US.
We analyzed data from adult patients (≥ 18 and ≤ 70 years) hospitalized for suicide attempts in Connecticut acute care hospitals between October 1, 2004 and September 30, 2012. Patients under age 18 and over age 70 were omitted from the analysis due to concerns related to both bias and generalizability. When modeling the likelihood of death due to suicide, deaths from other causes may result in substantial bias in model estimation among elderly patients. In addition, many studies show that risk factors for suicide vary across different age groups, especially in children and the elderly23,24. To address these concerns, we limited the study to adult patients under 70 years of age at the time of their last admission. Patients with hospitalizations for suicide attempts were identified using both E-codes and other ICD-9 code combinations indicative of suicidal behaviors (supplemental digital content Table 1)16,17,18.
We obtained de-identified discharge data from the Connecticut Hospital Inpatient Discharge Database and mortality data indicating cause of death from the Office of Connecticut Medical Examiner. Both contained a unique identifier within each dataset, although in the case of the discharge data this identifier was only consistent within hospitals. To detect multiple admissions for the same patient across hospitals and to integrate the hospitalization and mortality data, a unique patient identifier was generated using the patient’s date of birth, sex, race, and ethnicity, based on previous work indicating that such characteristics can be used to accurately link individuals across databases25. For each patient, multiple admission records were aggregated to the time of the most recent nonfatal attempt. Patients who died during their only hospitalized attempt were excluded (~ 1%). Of the 571 matches between the 2 datasets, 93.7% were unique; the remaining 6.3% involved the linkage of a hospitalization for suicide attempt with multiple suicide death events. For these cases the time of death was randomly assigned from one of the two matching records.
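The linkage step described above can be illustrated with a minimal sketch. It assumes each record is a dictionary with hypothetical field names (`id`, `dob`, `sex`, `race`, `ethnicity`); the toy records are invented for illustration and are not drawn from the study data, and the authors' actual linkage code is not described in the text.

```python
from collections import defaultdict

def link_key(rec):
    """Composite key built from the four demographic fields named in the text."""
    return (rec["dob"], rec["sex"], rec["race"], rec["ethnicity"])

def link_records(discharges, deaths):
    """Match each discharge record to death records sharing the same key.

    Returns (unique, ambiguous): both are dicts keyed by discharge id.
    `unique` holds the single matching death record; `ambiguous` holds
    the list of candidates when more than one death record matches.
    """
    deaths_by_key = defaultdict(list)
    for d in deaths:
        deaths_by_key[link_key(d)].append(d)

    unique, ambiguous = {}, {}
    for rec in discharges:
        candidates = deaths_by_key.get(link_key(rec), [])
        if len(candidates) == 1:
            unique[rec["id"]] = candidates[0]
        elif len(candidates) > 1:
            ambiguous[rec["id"]] = candidates
    return unique, ambiguous

# Toy records (hypothetical values, for illustration only).
discharges = [
    {"id": "h1", "dob": "1970-01-01", "sex": "M", "race": "W", "ethnicity": "N"},
    {"id": "h2", "dob": "1982-05-17", "sex": "F", "race": "B", "ethnicity": "N"},
    {"id": "h3", "dob": "1965-11-30", "sex": "M", "race": "W", "ethnicity": "H"},
]
deaths = [
    {"id": "d1", "dob": "1970-01-01", "sex": "M", "race": "W", "ethnicity": "N"},
    {"id": "d2", "dob": "1965-11-30", "sex": "M", "race": "W", "ethnicity": "H"},
    {"id": "d3", "dob": "1965-11-30", "sex": "M", "race": "W", "ethnicity": "H"},
]
unique, ambiguous = link_records(discharges, deaths)
```

In this toy example `h1` links to exactly one death record, `h2` has no match, and `h3` matches two death records. Ambiguous cases of that kind are the ones for which the text reports assigning the time of death at random from the matching records.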
The Connecticut Department of Public Health Human Investigations Committee approved this research project. This project was ruled as non-human subjects research by the University of Connecticut Health Center Institutional Review Board. This research involved no interaction with human subjects.
Our analysis included sociodemographic variables including patient’s age, sex, race, and Hispanic ethnicity; the frequency and duration of hospitalizations including number of suicide-related admissions, and average length-of-stay across admissions; primary and secondary ICD-9 diagnosis codes, procedure codes, and discharge status. The first three digits of the ICD-9 codes were used as indicator variables. The primary outcome variable was time to death by suicide.
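As a rough illustration of how three-digit ICD-9 categories can be turned into indicator variables, the sketch below assumes codes are stored in dotted form (for example `296.32`) and takes everything before the decimal point as the category. How the authors handled E-codes, V-codes, or undotted codes is not stated, so that choice is an assumption, and the patient data are invented.

```python
def icd9_category(code):
    """Return the part of a dotted ICD-9 code before the decimal point."""
    return code.strip().upper().split(".")[0]

def indicator_row(codes, vocabulary):
    """One 0/1 indicator per category in `vocabulary` for a patient's codes."""
    categories = {icd9_category(c) for c in codes}
    return [1 if v in categories else 0 for v in vocabulary]

# Toy diagnosis lists (hypothetical patients, for illustration only).
patients = {
    "p1": ["296.32", "E950.0"],
    "p2": ["296.20", "780.52"],
}
vocabulary = sorted({icd9_category(c) for codes in patients.values() for c in codes})
matrix = {pid: indicator_row(codes, vocabulary) for pid, codes in patients.items()}
```

Here `296.32` and `296.20` collapse into the same category `296`, which is the effect of truncating codes to their leading digits.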
For each patient, the follow-up period for survival modeling began at the most recent nonfatal hospitalization for suicide attempt and continued until death or the end of the study period on September 30, 2012. Since there were a large number of factors (> 400), a marginal variable screening procedure was performed. We tested the association of each variable with survival time using a Cox proportional hazards regression model that controlled for race/ethnicity, sex and age. Variables with a p-value less than 0.05 were kept for predictive modeling analysis. Subsequently, a stepwise Cox model was used for variable selection and model estimation, with the main effects and two-way interactions of variables passing the screening included as candidate predictors. The final model was adjusted to include both main effects whenever an interaction term was selected. We tested the proportional hazards assumption for each selected variable as well as for the overall model; all tests indicated that the assumption was not violated (p = 0.34). For ease of interpretation, we chose the estimated 5-year survival probability as the risk measure.
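The bookkeeping around the screening and selection steps can be sketched as follows. This is not the Cox fitting itself: the p-values are assumed to come from separately fitted marginal Cox models, the variable names are hypothetical, and the functions only show how screened variables expand into candidate main effects and two-way interactions and how the hierarchy rule (both main effects accompany any selected interaction) could be enforced.

```python
from itertools import combinations

def screen(p_values, alpha=0.05):
    """Keep variables whose marginal p-value is below alpha."""
    return sorted(name for name, p in p_values.items() if p < alpha)

def candidate_terms(screened):
    """Main effects as 1-tuples plus all two-way interactions as 2-tuples."""
    main = [(v,) for v in screened]
    interactions = list(combinations(screened, 2))
    return main + interactions

def enforce_hierarchy(selected):
    """Add both main effects for every selected interaction term."""
    terms = set(selected)
    for term in selected:
        if len(term) == 2:
            terms.update({(term[0],), (term[1],)})
    return sorted(terms)

# Hypothetical marginal p-values, for illustration only.
p_values = {"var_a": 0.01, "var_b": 0.20, "var_c": 0.03, "var_d": 0.049}
screened = screen(p_values)
candidates = candidate_terms(screened)
final_terms = enforce_hierarchy([("var_a", "var_c"), ("var_d",)])
```

With 28 screened variables, as reported in this study, the same expansion yields 28 main effects and 378 two-way interactions, or 406 candidate terms for the stepwise procedure.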
To objectively determine a survival probability cut-off to identify high-risk patients and to assess the predictive power of the Cox model, we conducted an out-of-sample random-splitting procedure. The data were randomly split into 80% for training and 20% for testing. The Cox model was fitted using the training data, and the fitted model was then used to estimate the 5-year survival probabilities of the patients in the testing data. A high-risk group was identified as patients whose estimated survival probabilities fell below a certain cut-off value. For each candidate cut-off, we computed (1) the risk ratio between the high-risk group and the testing cohort, defined as the ratio of the observed rates of death within 5 years, and (2) the relative size of the high-risk group among the test subjects. The accuracy of risk classification was then assessed by the Area Under the ROC Curve (AUC). This random-splitting procedure was repeated 300 times, and results were averaged.
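A minimal sketch of the repeated random-splitting loop is given below. It makes several simplifying assumptions: the outcome is treated as a binary `died` flag (ignoring censoring), the model is supplied through hypothetical `fit` and `predict_risk` callables rather than an actual Cox fit, and the AUC is computed with the rank-based (Mann-Whitney) formula. Splits whose test set contains only one outcome class are skipped.

```python
import random

def auc(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when only one class is present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def repeated_split_auc(records, fit, predict_risk,
                       n_repeats=300, test_frac=0.2, seed=0):
    """Average out-of-sample AUC over repeated random train/test splits."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_repeats):
        shuffled = records[:]
        rng.shuffle(shuffled)
        n_test = int(round(test_frac * len(shuffled)))
        test, train = shuffled[:n_test], shuffled[n_test:]
        model = fit(train)  # fitted on training data only
        scores = [predict_risk(model, r) for r in test]
        labels = [r["died"] for r in test]
        value = auc(scores, labels)
        if value is not None:
            aucs.append(value)
    return sum(aucs) / len(aucs) if aucs else None

# Toy data where the risk score separates outcomes perfectly (illustration only).
toy_records = [{"x": i, "died": int(i >= 40)} for i in range(50)]
toy_mean_auc = repeated_split_auc(
    toy_records,
    fit=lambda train: None,               # placeholder for model fitting
    predict_risk=lambda model, r: r["x"]  # higher score means higher risk
)
```

In the study's setting the risk score would be one minus the predicted 5-year survival probability from the fitted Cox model.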
"Novel Predictors Of Suicide Mortality: A Statewide Analysis" presented at the Mental Health Services Research Conference organized by the National Institute of Mental Health, Washington, D.C., August 1–4, 2018.
Risk factor identification
Table 1 presents the composition of the study population by age group, sex, race/ethnicity and median household income of the patient’s residential zip code. We observed 571 suicide deaths among 19,057 patients hospitalized for suicide attempts by the end of the study period. Men, non-Hispanic Whites, those aged 45–59 years, and those living in zip codes with higher median incomes were at highest risk for suicide mortality. Table 2 presents further information on the method used for the (most recent) prior suicide attempt, the number of previous attempts, and psychiatric diagnoses at the prior attempt. Multiple previous attempts were associated with subsequent mortality, and while more than half of all patients had a mental health diagnosis, there was no association between these mental health conditions and death by suicide (χ² test; p = 0.15).
We present data from our analyses of the risk factors for later suicide mortality in Table 3. This table combines the results of two separate analyses. First, since there were a large number of potential predictive factors (> 400), a marginal variable screening procedure was performed. Table 3 (model 1) presents the marginal effects of 28 risk factors that were significantly associated with suicide death after controlling for age, sex and race. The significant marginal effects and their 2-way interactions were then used to build a Cox proportional hazards model, with the final terms selected using a stepwise estimation method. Detailed information about the coefficients, confidence intervals and significance levels for all the factors in the selected Cox proportional hazards model are included in Table 3 (model 2), although we caution against the interpretation of individual parameter significance in this specification of the model since the computation of the p-values does not account for the uncertainty in predictor selection.
Results presented in Tables 3 and 4 (model 2) indicated that the socio-demographic factors positively associated with suicide deaths included being male, older age, White race and higher median household income. Diagnosis codes including organic sleep disorders, seizure without major comorbidities, other psychosocial circumstances and other persons seeking consultations were positive predictors of suicide deaths. Diagnostic codes related to method of suicide attempt including accidental poisoning by drugs, medicinal substance and biological, injury undetermined whether accidentally or purposely inflicted and other unspecified disorders of back were also significant predictors of suicide deaths. Many medical procedures that were likely due to the method and severity of the suicide attempt, such as procedures on the esophagus, suture of the tongue, and surgeries on bones particularly tibia and fibula, were positive predictors of suicide deaths. In addition, suicide attempts accompanied by operations on the penis were associated with subsequent suicide death.
Several significant interaction terms were also selected into the final Cox model. Patients who were discharged or transferred to a psychiatric hospital or a psychiatric unit of the same hospital had higher suicide risk, and this effect was much stronger for non-Whites compared to Whites. A similar interaction was observed between race and operations on the bone, with the sign of this effect indicating a lesser impact of such operations among Whites. Among patients with multiple suicide attempts and aortic and heart assistant procedures, the mortality risk was higher. Coexistence of multiple sclerosis with other non-organic psychoses also increased the risk of later suicide death. Non-organic psychoses interacted with open wound of other and unspecified sites to increase the risk of suicide deaths. A number of other interactions were observed among diagnostic codes related to methods of suicide including poisoning, back disorders, open wounds and hanging.
The estimated 5-year survival probability was used as a risk measure to identify high-risk patients. Figure 1 demonstrates the relationships between the probability cut-off, the size of the high-risk group relative to the general cohort, and the increase in suicide risk, based on the out-of-sample random splitting procedure. As expected, the lower the cut-off value, the higher the overall risk level of the identified high-risk group, and the smaller the size of the high-risk group. Our results show that if the high-risk group is defined as consisting of subjects whose 5-year survival probabilities were smaller than 0.90, then this group comprised 4.9% (90% CI: [3.9, 5.8]) of the general cohort, and the risk of death in this group was on average 3.71 (90% CI: [2.371, 5.435]) times that in the general cohort.
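The two quantities reported for each cut-off, the relative size of the high-risk group and its risk ratio against the whole test cohort, can be computed as in the sketch below. It assumes a binary 5-year death indicator per test subject (censoring is ignored for simplicity) and uses invented toy values, not study data.

```python
def high_risk_summary(pred_survival, died, cutoff):
    """Relative size and risk ratio of the group with survival below `cutoff`.

    pred_survival: predicted 5-year survival probabilities for test subjects
    died:          1 if the subject died within 5 years, else 0
    Returns (relative_size, risk_ratio), or None if either is undefined.
    """
    flagged = [d for s, d in zip(pred_survival, died) if s < cutoff]
    if not flagged or sum(died) == 0:
        return None
    relative_size = len(flagged) / len(died)
    rate_high = sum(flagged) / len(flagged)  # death rate in high-risk group
    rate_all = sum(died) / len(died)         # death rate in whole test cohort
    return relative_size, rate_high / rate_all

# Toy test set (hypothetical values, for illustration only).
toy_survival = [0.85, 0.88, 0.95, 0.97, 0.99, 0.92, 0.96, 0.98, 0.94, 0.93]
toy_died = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
toy_size, toy_ratio = high_risk_summary(toy_survival, toy_died, cutoff=0.90)
```

In the toy data, two of ten subjects fall below the 0.90 cut-off and one of them died, so the high-risk group is 20% of the cohort with a risk ratio of 2.5. Lowering the cut-off shrinks the group, which matches the trade-off described for Figure 1.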
In Fig. 2, we present the out-of-sample mean ROC curve and its 90% confidence bands computed from the random splitting procedure. At 80% sensitivity our model achieves 55.2% specificity (90% CI: [48.9, 61.7]), at 50% sensitivity it achieves 79.6% specificity (90% CI: [75.8, 83.3]), and the mean AUC is 73.4% (90% CI: [70.6, 76.7]). The positive predictive value (PPV) is 7.1% (90% CI: [6.1%, 8.5%]) at a sensitivity of 50% and 5.26% (90% CI: [4.6%, 6.0%]) at a sensitivity of 80%, making this one of the best performing suicide prediction models published to date19.
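As a rough consistency check on these figures, the PPV implied by a given sensitivity, specificity, and outcome prevalence follows from Bayes' rule. The sketch below assumes the prevalence equals the overall cohort proportion of suicide deaths (571 of 19,057); the per-split prevalence of 5-year deaths in the test sets may differ, so only approximate agreement should be expected.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Assumption: prevalence taken as the cohort-wide proportion of suicide deaths.
prevalence = 571 / 19057

ppv_at_50 = ppv(0.50, 0.796, prevalence)  # sensitivity 50%, specificity 79.6%
ppv_at_80 = ppv(0.80, 0.552, prevalence)  # sensitivity 80%, specificity 55.2%
```

Under this assumption the implied values are about 7.0% and 5.2%, close to the reported 7.1% and 5.26%.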
In this study, we used widely available healthcare data to develop an interpretable model to predict suicide mortality following a prior attempt. In addition to augmenting the small but growing body of research on suicide mortality in high-risk populations5,20,21,22, our findings are relevant to the substantial portion of eventual suicides who first come to the attention of mental health clinicians through a prior suicide attempt, and show that clinical and contextual features from the prior attempt can be harvested from data to create a predictive model with good sensitivity and specificity. In fact, in comparison to other suicide mortality prediction models26, the sensitivity, specificity, and positive predictive value reported above make our model among the best performing suicide mortality models published to date.
Several novel risk factors emerged from our analysis, including non-malignant pancreatic disorders and medical procedures associated with the prior attempt that could be indicative of the severity of injury, such as non-operative intubation and irrigation, aortic and heart assistant procedures, operations on bones, and operations on the penis. Regarding markers of injury severity, the clinical management of highly lethal suicide methods such as hanging often involves aggressive resuscitation and treatment of post-anoxic brain injury requiring intubation of attempters27,28. In the case of suicide attempts by jumping, research evidence has demonstrated that lethal attempts have a very high probability of fractures of the upper limb, which drastically increase surgical and inpatient workload due to the need for operations on bones29. Although operations on the penis were observed in a very small number of cases, they were associated with an eight-fold risk of later suicide death and may be indicative of the very high risk of suicide in patients with severe psychosis, which has been associated with genital mutilation26.
While the identification of specific markers for suicide mortality in psychiatric practice is important, the major contribution of this analysis lies in using data available in healthcare settings to identify the highest risk members of this already high-risk cohort. The cohort of patients hospitalized for suicide attempts included in this analysis accounted for approximately 25% of all suicide deaths among adults in Connecticut from 2005–2012 (571 out of 2,219)30. Our AUC analysis showed that 50% of the deaths in this cohort occurred among 21% of patients deemed at highest risk based on our model. In other words, our model identified approximately 4,000 high-risk patients, of whom nearly 300 would die by suicide within 5 years of their attempt. Also, because all of the information used in the model is available at the time of discharge following a prior attempt, the elevated risk of particular patients could be incorporated into discharge planning and care transitions and could inform long-term approaches to treatment, making the model more implementable than past modeling efforts.
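The headline numbers in this paragraph follow from simple arithmetic on figures already stated in the text, as the short check below shows. The variable names are illustrative only.

```python
cohort_size = 19057    # patients hospitalized for a suicide attempt
cohort_deaths = 571    # suicide deaths observed in the cohort
state_deaths = 2219    # adult suicide deaths in Connecticut, 2005-2012

share_of_state_deaths = cohort_deaths / state_deaths  # roughly one quarter
high_risk_patients = 0.21 * cohort_size               # 21% flagged as highest risk
deaths_in_high_risk = 0.50 * cohort_deaths            # 50% of cohort deaths
death_rate_in_high_risk = deaths_in_high_risk / high_risk_patients
```

These give about 4,000 flagged patients and about 286 deaths among them, a rate of roughly 7%, which is in line with the PPV reported at 50% sensitivity.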
In terms of limitations, our analysis used discharge data; access to a broader array of healthcare data, such as ambulatory visits or pharmacy data, could improve the predictive power of the model. At the time of this analysis we were limited to a combined dataset linking discharges and mortality through 2012. While we had no direct way of assessing the accuracy of the linkages in the absence of a shared unique identifier present in both databases, there are several reasons to have confidence in the accuracy of linkages derived from the demographic characteristics we used for matching. First, research has shown that basic demographic characteristics such as those used in our analysis can be successfully used to connect individuals across very large, generic databases25, and in our case we had the additional advantage of linking very particular, related databases. Since both datasets were related to suicidal behavior, the accuracy of any match was likely to be much higher than what it would have been for a generic population of a larger size. Second, the potential for mismatches was limited by the presence of a unique identifier in both databases (noting that in the case of the hospital dataset this was only true within hospitals). Third, the datasets contained all hospitalizations and all suicide deaths in the state; absent data errors (such as an incorrect date of birth) or hospitalizations/deaths occurring outside the state, there was very limited potential for incomplete data in either database to result in matching failures. Finally, unless matching errors were systematic, their effect would be to introduce noise into the analysis. The fact that our final model was highly interpretable and had good out-of-sample predictive power suggests that the linkage between hospitalization and death records was highly accurate.
An additional limitation is that the relatively small number of suicide deaths precluded investigation of alternative model specifications, particularly sex-specific risk models. Finally, this study is limited to hospitalizations and deaths within Connecticut, which has one of the lowest suicide rates and is one of the most affluent states in the US. However, Connecticut’s proportion of non-White residents makes it slightly more diverse than the nation as a whole31.
Despite these limitations, the results from this study have major implications for clinical practice. Although there is robust literature showing that a prior attempt is a very strong risk factor for subsequent suicidal behavior and death by suicide, our work has shown that the risk of later mortality is confined to a relatively small subset of these patients, thus increasing opportunities to focus attention and resources on a smaller and more manageable patient population. In addition, it is important to emphasize that deploying suicide risk algorithms during the psychiatry consults at the time of hospitalization may substantially enhance clinical suicide risk assessments.
The data used in this study were obtained from the Connecticut Department of Public Health and the Office of the Connecticut Medical Examiner under terms that do not permit the authors to disclose or make this information publicly available. Requests for access to these datasets must be made directly to these agencies.
Centers for Disease Control and Prevention. Fatal Injury Data. Injury Prevention and Control 2019. https://www.cdc.gov/injury/wisqars/fatal.html (2020).
Stone, D. M. et al. Vital signs: trends in state suicide rates—United States, 1999–2016 and circumstances contributing to suicide—27 states, 2015. MMWR Morb. Mortal. Wkly. Rep. 67, 617. https://doi.org/10.15585/mmwr.mm6722a1 (2018).
Silverman, M. M. Preventing suicide: a call to action. World Psychiatry 3, 152–153 (2004).
Kessler, R. C. et al. Predicting suicides after psychiatric hospitalization in US Army soldiers: the Army Study to Assess Risk and Resilience in Service members (Army STARRS). JAMA Psychiatry 72, 49–57. https://doi.org/10.1001/jamapsychiatry.2014.1754 (2015).
Barak-Corren, Y. et al. Predicting suicidal behavior from longitudinal electronic health records. Am. J. Psychiatry 174, 154–162. https://doi.org/10.1176/appi.ajp.2016.16010077 (2016).
Simon, G. E. et al. Predicting suicide attempts and suicide deaths following outpatient visits using electronic health records. Am. J. Psychiatry 175, 951–960. https://doi.org/10.1176/appi.ajp.2018.17101167 (2018).
Poulin, C. et al. Predicting the risk of suicide by analyzing the text of clinical notes. PLoS ONE 9, e85733. https://doi.org/10.1371/journal.pone.0085733 (2014).
Ben-Ari, A. & Hammond, K. Text mining the EMR for modeling and predicting suicidal behavior among US veterans of the 1991 Persian Gulf War. In 2015 48th Hawaii International Conference on System Sciences (HICSS), 3168–3175. IEEE (2015).
Ribeiro, J. D., Huang, X., Fox, K. R. & Franklin, J. C. Depression and hopelessness as risk factors for suicide ideation, attempts and death: meta-analysis of longitudinal studies. Br. J. Psychiatry 212, 279–286. https://doi.org/10.1192/bjp.2018.27 (2018).
Harford, T. C., Yi, H. Y., Chen, C. M. & Grant, B. F. Substance use disorders and self-and other-directed violence among adults: results from the national survey on drug use and health. J. Affect. Disord. 225, 365–373. https://doi.org/10.1016/j.jad.2017.08.021 (2018).
Felitti, V. J. et al. Relationship of childhood abuse and household dysfunction to many of the leading causes of death in adults: the adverse childhood experiences (ACE) study. Am. J. Prev. Med. 14, 245–258. https://doi.org/10.1016/s0749-3797(98)00017-8 (1998).
Ahmedani, B. K. et al. Major physical health conditions and risk of suicide. Am. J. Prev. Med. 53, 308–315. https://doi.org/10.1016/j.amepre.2017.04.001 (2017).
Walkup, J. T., Townsend, L., Crystal, S. & Olfson, M. A systematic review of validated methods for identifying suicide or suicidal ideation using administrative or claims data. Pharmacoepidemiol. Drug. Saf. 21, 174–182. https://doi.org/10.1001/jama.2016.17324 (2012).
Platt, R. et al. The US food and drug administration’s mini-sentinel program: status and direction. Pharmacoepidemiol. Drug. Saf. 21, 1–8. https://doi.org/10.1002/pds.2343 (2012).
Tran, T. et al. Risk stratification using data from electronic medical records better predicts suicide risks than clinician assessments. BMC Psychiatry 14, 76. https://doi.org/10.1186/1471-244X-14-76 (2014).
McCarthy, J. F. et al. Predictive modeling and concentration of the risk of suicide: implications for preventive interventions in the US Department of Veterans Affairs. Am. J. Public Health 105, 1935–1942. https://doi.org/10.2105/AJPH.2015.302737 (2015).
Smith, A. R., Silva, C., Covington, D. W. & Joiner, T. E. Jr. An assessment of suicide-related knowledge and skills among health professionals. Health Psychol. 33, 110–119. https://doi.org/10.1037/a0031062 (2014).
Baraff, L. J., Janowicz, N. & Asarnow, J. R. Survey of California emergency departments about practices for management of suicidal patients and resources available for their care. Ann. Emerg. Med. 48, 452–458. https://doi.org/10.1016/j.annemergmed.2006.06.026 (2006).
Ahmedani, B. K. et al. Health care contacts in the year before suicide death. J. Gen. Intern. Med. 29, 870–877 (2014).
Bostwick, J. M., Pabbati, C., Geske, J. R. & McKean, A. J. Suicide attempt as a risk factor for completed suicide: even more lethal than we knew. Am. J. Psychiatry 173, 1094–1100 (2016).
All-Payer Claims Database Council. Interactive State Report Map 2018. https://www.apcdcouncil.org/state/map (2019).
United States Census Bureau PD. Annual estimates of the resident population: April 1, 2010 to July 1, 2017 2018. https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=PEP_2017_PEPANNRES&src=pt (2019).
Parellada, M. Is attempted suicide different in adolescent and adults? Psychiatry Res. 157, 131–137. https://doi.org/10.1016/j.psychres.2007.02.012 (2008).
Wu, W. H. & Bond, M. H. National differences in predictors of suicide among young and elderly citizens: linking societal predictors to psychological factors. Arch. Suicide Res. 10, 45–60. https://doi.org/10.1080/13811110500318430 (2006).
Sweeney, L. Simple demographics often identify people uniquely. Health 671, 1–34 (2000).
Belsher, B. E. et al. Prediction models for suicide attempts and deaths: a systematic review and simulation. JAMA Psychiatry 76, 642–651. https://doi.org/10.1001/jamapsychiatry.2019.0174 (2019).
Krol, L. V. & Wolfe, R. The emergency department management of near-hanging victims. J. Emerg. Med. 12, 285–292. https://doi.org/10.1016/0736-4679(94)90268-2 (1994).
Runeson, B., Tidemalm, D., Dahlin, M., Lichtenstein, P. & Långström, N. Method of attempted suicide as predictor of subsequent successful suicide: national long term cohort study. BMJ 13, c3222 (2010).
Rocos, B., Acharya, M. & Chesser, T. J. The pattern of injury and workload associated with managing patients after suicide attempt by jumping from a height. Open Orthop. J. 31, 395–398. https://doi.org/10.2174/1874325001509010395 (2015).
Annual Estimates of the Resident Population: April 1, 2010 to July 1, 2017 2018. https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=PEP_2017_PEPANNRES&src=pt (2019).
Patrick, A. R. et al. Identification of hospitalizations for intentional self-harm when E-codes are incompletely recorded. Pharmacoepidemiol. Drug. Saf. 19, 1263–1275. https://doi.org/10.1002/pds.2037 (2010).
This work was supported through grant funding from the National Institutes of Health (R01-MH112148; PI: Robert Aseltine). Hospitalization data were obtained from the Connecticut Department of Public Health which does not endorse or assume any responsibility for any analyses, interpretations or conclusions based on the data. The authors assume full responsibility for all such analyses, interpretations and conclusions.
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Doshi, R.P., Chen, K., Wang, F. et al. Identifying risk factors for mortality among patients previously hospitalized for a suicide attempt. Sci Rep 10, 15223 (2020). https://doi.org/10.1038/s41598-020-71320-3