A machine learning approach for predicting suicidal thoughts and behaviours among college students

Suicidal thoughts and behaviours are prevalent among college students, yet little is known about screening tools to identify students at higher risk. We aimed to develop a risk algorithm to identify the main predictors of suicidal thoughts and behaviours among college students within one year of baseline assessment. We used data collected in 2013–2019 from the French i-Share cohort, a longitudinal population-based study including 5066 volunteer students. To predict suicidal thoughts and behaviours at follow-up, we used random forests models with 70 potential predictors measured at baseline, including sociodemographic and familial characteristics, mental health and substance use. Model performance was measured using the area under the receiver operating characteristic curve (AUC), sensitivity, and positive predictive value. At follow-up, 17.4% of girls and 16.8% of boys reported suicidal thoughts and behaviours. The models achieved good predictive performance: AUC, 0.8; sensitivity, 79% for girls, 81% for boys; and positive predictive value, 40% for girls and 36% for boys. Among the 70 potential predictors, four showed the highest predictive power: 12-month suicidal thoughts, trait anxiety, depression symptoms, and self-esteem. We identified a parsimonious set of mental health indicators that accurately predicted one-year suicidal thoughts and behaviours in a community sample of college students.

Measures. The one-year follow-up questionnaire included questions about suicidal thoughts and suicide attempts during the last 12 months. Participants who reported having occasional or frequent suicidal thoughts and/or suicide attempts were coded as positive for STB.
We considered baseline assessments of 70 potential predictors (Supplementary Table 2). These variables included socio-demographic characteristics (e.g., age, year of study, scholarship, and accommodation type), lifestyle habits (e.g., time spent on screens and sleep quality), familial characteristics (e.g., perceived parental support, parental divorce, and parental history of depression), physical health (e.g., handicap and perceived health), and substance use (e.g., tobacco and alcohol use). Baseline characteristics also included history of diagnosed psychiatric disorders, lifetime suicide attempts, and suicidal thoughts during the 12 months preceding inclusion (hereafter called baseline STB). We measured several mental health parameters with validated scales: depression symptoms using the 9-item Patient Health Questionnaire (PHQ-9) 20; trait anxiety using the Spielberger State-Trait Anxiety Inventory (STAI-YB) 21; self-esteem using the Rosenberg scale 22; perceived stress using the Perceived Stress Scale (PSS-4) 23; and impulsivity using the Barratt Impulsivity Scale (BIS-11) 24.
Childhood adversities were not investigated in the baseline questionnaire. To take these important potential predictors into account in our models, a subsample of 1911 participants was administered a supplementary questionnaire adapted from the Childhood Trauma Questionnaire 25. This questionnaire included 17 variables assessing experiences of sexual abuse, physical or psychological maltreatment, or neglect (Supplementary Table 2).
Statistical analyses. We first described the study sample overall and by gender. Continuous variables are expressed as mean ± standard error. Categorical variables are described as proportions.
www.nature.com/scientificreports/
Prediction of one-year STB. To predict STB we used a random forests model, which is a non-parametric ensemble machine learning method applicable to both classification and regression 26. This technique is broadly used due to its high performance and robustness, and because it enables the use of variables independently of type and distribution 27. Random forests are based on the aggregation of a set of decision trees created through repeated bootstrap resampling of the initial sample 28. In each bootstrap sample, a decision tree is created using roughly two-thirds of the observations. The remaining one-third, termed the out-of-bag sample, is used to obtain an unbiased performance measure of the created algorithm. This evaluation of prediction performance yields a measure termed the out-of-bag error, which represents the overall error of the algorithm in terms of outcome prediction. The out-of-bag sample is also used to calculate the relative importance of each variable for the prediction. To this end, the values of a given variable are randomly permuted in the out-of-bag sample, and any resulting change in the out-of-bag error reflects the variable's importance in the prediction. Finally, all individual decision trees are aggregated to create the final predictor algorithm. To carry out these analyses we used the randomForest and caret packages in SAS and R. Missing data on the predictors (2%) were handled using the R missForest algorithm 29, which is specifically designed to deal with missing data in random forest models.
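The bootstrap, out-of-bag, and permutation steps described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the toy data, the stand-in `predict` rule, and all function names are hypothetical, and a single fixed rule replaces the fitted trees so that the example stays self-contained.

```python
import random

def bootstrap_split(n, rng):
    """Draw n row indices with replacement; rows never drawn are out-of-bag."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in drawn]
    return in_bag, out_of_bag

def error_rate(predict, rows, labels):
    """Share of rows whose predicted class differs from the true class."""
    return sum(predict(r) != y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, column, rng):
    """Increase in error after shuffling one column across the given rows."""
    values = [r[column] for r in rows]
    rng.shuffle(values)
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, values)]
    return error_rate(predict, permuted, labels) - error_rate(predict, rows, labels)

# Hypothetical toy data: column 0 determines the outcome, column 1 is pure noise.
rng = random.Random(42)
n = 1000
rows = [[rng.random(), rng.random()] for _ in range(n)]
labels = [int(r[0] > 0.5) for r in rows]

# Stand-in for a fitted tree (assumption): a fixed rule on column 0.
predict = lambda r: int(r[0] > 0.5)

in_bag, out_of_bag = bootstrap_split(n, rng)
oob_fraction = len(out_of_bag) / n

oob_rows = [rows[i] for i in out_of_bag]
oob_labels = [labels[i] for i in out_of_bag]
importance_signal = permutation_importance(predict, oob_rows, oob_labels, 0, rng)
importance_noise = permutation_importance(predict, oob_rows, oob_labels, 1, rng)

print(f"out-of-bag fraction: {oob_fraction:.3f}")
print(f"importance of column 0: {importance_signal:.3f}")
print(f"importance of column 1: {importance_noise:.3f}")
```

For large samples the expected out-of-bag fraction is 1 − 1/e ≈ 0.368, which matches the "one-third" described in the text. Shuffling the informative column raises the error, whereas shuffling the noise column leaves it unchanged.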
Predictors. For the main analyses, the 70 potential predictors were included in the model. We then performed two secondary analyses. First, we re-estimated our models in a subsample of participants who did not report STB at baseline to better identify new cases 30. Second, we re-estimated our models in the subsample including data on childhood adversity.
Evaluation of model performance. We evaluated the prediction quality of our models in the testing sample using the following performance metrics: (1) out-of-bag error, obtained using the out-of-bag sample of the training set, which represents the overall error in the prediction (ranges from 0%, indicating that all individuals are correctly classified, to 100%, indicating that no individual is correctly classified); (2) area under the curve (AUC) 31, which measures discrimination performance as the true positive rate plotted against the false positive rate (ranges from 0.5, indicating prediction by chance, to 1, indicating perfect prediction); (3) sensitivity, representing the proportion of actual cases (i.e., students reporting STB) identified by the algorithm; and (4) the positive predictive value, describing the proportion of algorithm-predicted cases that are actual cases. To prevent these performance estimates from being overfitted and to increase the generalizability of the prediction model, we estimated these indices through cross-validation. We therefore randomly split the initial dataset into 10 folds, created the model using 9 of the 10 folds, and tested it on the remaining fold. We repeated this process until every fold had been used as the test set. All the values needed for the prediction, i.e., the real outcome, the predicted outcome, and the probabilities of belonging to each class of the outcome, were calculated in each test sample and stored in an independent file; the final prediction metrics were then obtained from all the stored values, and we reported the mean value of the out-of-bag error across the 10 models.
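The pooled cross-validation procedure and the metrics defined above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the toy data, the cut-off "model", and every function name are hypothetical, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than by tracing the curve.

```python
import random

def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def sensitivity_ppv(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); positive predictive value = TP / (TP + FP)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tp / (tp + fp)

def auc(y_true, scores):
    """Chance that a random case outscores a random non-case (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: one risk score per student, 10 cases and 30 non-cases.
x = [0.60 + 0.03 * i for i in range(10)] + [0.10 + 0.015 * j for j in range(30)]
y = [1] * 10 + [0] * 30

rng = random.Random(0)
folds = k_fold_indices(len(y), 10, rng)

# (index, true outcome, predicted outcome, score), pooled over all test folds.
stored = []
for test_idx in folds:
    held_out = set(test_idx)
    train = [i for i in range(len(y)) if i not in held_out]
    # Stand-in model (assumption): cut-off halfway between the training class means.
    mean_pos = sum(x[i] for i in train if y[i] == 1) / sum(y[i] for i in train)
    mean_neg = sum(x[i] for i in train if y[i] == 0) / sum(1 - y[i] for i in train)
    cutoff = (mean_pos + mean_neg) / 2
    for i in test_idx:
        stored.append((i, y[i], int(x[i] > cutoff), x[i]))

# Metrics are computed once, from the pooled out-of-fold predictions.
y_true = [row[1] for row in stored]
y_pred = [row[2] for row in stored]
scores = [row[3] for row in stored]
sens, ppv = sensitivity_ppv(y_true, y_pred)
pooled_auc = auc(y_true, scores)
print(f"sensitivity={sens:.2f} ppv={ppv:.2f} auc={pooled_auc:.2f}")
```

Each individual is predicted exactly once, by a model that never saw that individual during fitting, so the pooled metrics are not inflated by training-set performance.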
All models were developed and reported in accordance with the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) statement for prediction model development 32.
The main baseline characteristics did not significantly differ according to gender (Table 1). The mean participant age was 20.7 years (SD 2.6). Over one-third of the sample (n = 1932; 38.1%) was in their first year of university education. The majority of the students lived alone in an apartment (n = 1544; 30.5%) or at their parents' home (n = 1495; 29.5%), and 17.5% (n = 884) described their current economic situation as difficult or very difficult. The most prevalent indicators of childhood adversity were maternal depression history (n = 1536; 30.3%) and parental divorce or separation (n = 1484; 29.3%). At baseline, one in five students reported 12-month suicidal thoughts (n = 1072; 21.2%) and 5.4% (n = 275) reported a lifetime suicide attempt.

Prediction of suicidal thoughts and behaviours.
Among girls, the predictive model had an out-of-bag error of 24.6%, suggesting the overall misclassification of a quarter of the female participants. Among boys, the out-of-bag error was 28.1%. The model showed an AUC of 0.84 (95% CI 0.83-0.86) for girls, indicating a discrimination 68% better than chance, and 0.82 (95% CI 0.79-0.86) for boys (Fig. 1). The sensitivity was 0.79 for girls and 0.81 for boys, indicating that the model correctly predicted 79-81% of the actual cases (Table 2). The positive predictive values were 0.40 and 0.36 for girls and boys, respectively, meaning that 40% and 36% of predicted cases were actual cases. Analysis of the variables' importance for the prediction, as measured by the mean decrease in accuracy, revealed that the following four variables were the most predictive in both girls and boys: 12-month suicidal thoughts at baseline, self-esteem, trait anxiety, and depression symptoms (Fig. 2).
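The arithmetic behind these figures can be checked with a short Python sketch. The "68% better than chance" statement corresponds to rescaling the AUC so that 0.5 maps to 0 and 1 maps to 1. The second part is a back-of-envelope illustration only: specificity is not reported in this section, so the value below is merely what Bayes' rule would imply if the rounded sensitivity, positive predictive value, and 17.4% follow-up prevalence for girls all applied to the same sample (an assumption). All function names are hypothetical.

```python
def improvement_over_chance(auc):
    """Rescale AUC so that 0.5 (chance) maps to 0 and 1.0 (perfect) maps to 1."""
    return (auc - 0.5) / (1.0 - 0.5)

def ppv_from_rates(sensitivity, specificity, prevalence):
    """Bayes' rule: true positives divided by all positive predictions."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

def implied_specificity(sensitivity, ppv, prevalence):
    """Invert ppv_from_rates for specificity (illustrative, not a reported result)."""
    false_pos = sensitivity * prevalence * (1.0 - ppv) / ppv
    return 1.0 - false_pos / (1.0 - prevalence)

# Figures reported for girls: AUC 0.84, sensitivity 0.79, PPV 0.40, prevalence 17.4%.
gain_girls = improvement_over_chance(0.84)
gain_boys = improvement_over_chance(0.82)
spec_girls = implied_specificity(0.79, 0.40, 0.174)

print(f"girls: {gain_girls:.0%} better than chance")
print(f"boys: {gain_boys:.0%} better than chance")
print(f"implied specificity for girls (assumption-based): {spec_girls:.2f}")
```

Under these assumptions the implied specificity is about 0.75. This illustrates why a positive predictive value of 40% can coexist with a sensitivity of 79%: when fewer than one in five students are cases, even a moderate false positive rate produces more false alarms than true detections.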

Secondary analyses.
We repeated these analyses in a subsample of participants who did not report STB at baseline, and found that the predictive performances were lower than in the main analyses. For girls (n = 3114) and boys (n = 832), respectively, the AUC was 0.72 and 0.74, and the sensitivity was 0.63 and 0.62. Variable importance for the prediction differed between girls and boys. For girls, the main predictive variables were depression symptoms, self-esteem, trait anxiety, and academic stress (Fig. 3). For boys, the main predictor was self-esteem, followed by trait anxiety (Fig. 3). We then fitted our random forests models among the 1497 girls and 414 boys who answered the childhood adversity questionnaire. The predictive performances were similar for girls (AUC 0.82; sensitivity of 79%) and boys (AUC 0.75; sensitivity of 76%). In girls, the four main predictive variables were baseline suicidal thoughts, depression symptoms, self-esteem, and trait anxiety. In boys, the four top predictors were 12-month suicidal thoughts, perceived stress, trait anxiety, and self-esteem (Supplementary Figure 1). Thus, in both genders, childhood adversity variables did not contribute to STB prediction.

Discussion
Using random forests models in this large sample of college students, we found that four main baseline variables predicted STB at 12 months: suicidal thoughts at baseline, trait anxiety, depression symptoms, and self-esteem. The model including these variables showed good predictive performance (AUC = 0.8) estimated using cross-validation. In secondary analyses in a subsample excluding participants who reported STB at baseline, the main predicting variables were depressive symptoms, self-esteem, and academic stress for girls and mainly self-esteem for boys. The predictors differed by gender only among participants who did not report STB at baseline. Finally, childhood adversity variables did not contribute to STB prediction. To our knowledge, only two prior studies have developed STB predictive models in students, and both reported predictive performances comparable to ours. One study used the random forests method to predict suicide attempts among medical students, using a cross-sectional design 33. The other study used a logistic regression model to develop a risk-screening algorithm for persistence of suicidal behaviours during college 34.
STB prediction was not influenced by childhood trauma or perceived parental support, which are usually strongly associated with STB in young adults 35,36. These results are in line with previous studies 34,37. This finding highlights that association does not necessarily mean prediction 11, and that proximal risk factors of STB may be better than distal or early-life ones for predicting one-year STB 38. We can also assume that important predictors such as depression symptoms are the downstream consequences of higher adversity during childhood and that, being more recent, they could overshadow the importance of early adversity in STB prediction. Furthermore, following the diathesis-stress model of suicide, the predictors we found (anxiety, depression) might affect more vulnerable individuals who have experienced childhood adversities 39.
Table 1. Sample characteristics at baseline. All data presented as n (%) unless otherwise noted.
We identified a small number of major predictors that ensured high accuracy in STB prediction. These predictors, derived from short and commonly used questionnaires, may help in developing a large-scale screening tool for university students. For example, they could be integrated into a short online screening administered upon college entrance. An online questionnaire may prove acceptable to students, and would provide an alternative to mental health assessment by a physician for students who are often reluctant to disclose sensitive personal information in face-to-face interviews 40,41.
The quantitatively most important predictor was suicidal thoughts at baseline 34,42. Likewise, anxiety and depression are often comorbid with STB in students 43. Interestingly, self-esteem emerged as one of the main predictors of STB. Low self-esteem is known to be a part of social anxiety, and to overlap with depression, both of which are associated with STB 44. Self-esteem, which is an important marker of psychological vulnerability in young adults 45–47, has also been found to be associated with suicidality 48. Our study showed that self-esteem is an independent and prominent predictive marker of STB and should therefore be used in a screening tool.
Overall, our results suggest that baseline suicidal ideation combined with three validated psychological scales (Rosenberg scale for self-esteem, STAI-YB Spielberger scale for trait anxiety, PHQ-9 for depression) is informative enough to identify students who will present STB at the one-year assessment.
Key strengths of this study are the large sample of students and the longitudinal design. Since there are many different paths to STB, accurate STB prediction requires the consideration of a complex combination of a large number of factors 13. The i-Share baseline questionnaire includes a large number of variables, which enabled analyses with a large number of potential STB predictors (70 in the main analyses and 87 for the secondary analyses). Our analyses were conducted following the current recommendations and best methods for prediction analysis, especially the use of different samples for creating the predictors and then for calculating the predictive performance, which prevents the performance measures from being overfitted 10,11. The variables identified as main predictors of STB were consistent across main and secondary analyses, suggesting robust and consistent findings. Some limitations should nevertheless be acknowledged when interpreting the results. First, the follow-up response rate (33.5%) was moderate, as is common in longitudinal studies with students 49, and differences were observed between respondents and non-respondents at follow-up. These differences were not major (proportions were similar) and should have a limited impact when identifying STB predictors. Nevertheless, caution is needed regarding the external validity of our results and the possibility of generalizing conclusions to all students and to all settings. Second, girls were over-represented in our sample (79%) compared to the 50-60% of female students in France 50, and our sample might not be representative of the whole student population. Third, the self-reported questionnaires could lead to information and recall bias, particularly if participants underreported their frequency of STB due to concerns about social desirability. However, such under-reporting is likely to be reduced by the use of an online questionnaire.
Additionally, and more importantly, relying on other data (e.g., clinical assessment) would defeat our aim of finding easily assessable predictors of STB in large university student samples. Fourth, given the adaptation of the CTQ used in this study, we could not create subsamples 'with' and 'without' childhood adversity. Thus we could not explore in depth whether the identified predictors had a greater effect in individuals with childhood adversities. Finally, we could not strictly separate analyses between suicidal ideation and suicide attempts due to the small number of one-year suicide attempts in our sample, even after combining the genders (n = 35).
In conclusion, we identified a parsimonious number of predictors that can be used to accurately identify students who will present STB within one year of the predictor assessment. Pending replication of these results in other studies, these predictors may be used to develop a screening tool to be routinely used among university students. For example, a web-based screening tool could represent a promising approach for identifying students at suicide risk and referring them to counselling and mental health services.

Data availability
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.