Introduction

Available evidence suggests that one in eight men and one in sixteen women will commit a serious criminal offense after release from a psychiatric facility [1]. This phenomenon is not confined to particular regions or generations: in a systematic review comprising 33,588 individuals from 24 countries and 109 datasets, high rates of mental illness in prisoners were found in both high- and low-income countries over four decades [2].

Additionally, results from a large Swedish registry study comprising 98,082 individuals with a history of hospitalization suggest that one in every twenty violent crimes is committed by someone with severe mental illness [3]. Given the high prevalence of criminal acts committed across cultures by individuals with severe mental illness, there has been a concerted effort to identify predictors of prospective criminal risk following discharge from psychiatric facilities.

In response, actuarial assessments, which use statistical algorithms to estimate prospective patient risk, usually at the group level, became increasingly widespread [4]. However, there is little evidence that actuarial risk estimates can accurately determine whether a specific patient will reoffend or commit subsequent acts of violence [5]. This is largely because most risk estimates have been developed statistically to assess group-based risk and perform poorly when making individualized predictions [5]. Altogether, this illustrates the limitations of current methods and the importance of a more precise, effective, and personalized approach to risk assessment in forensic settings. Given the ethical, psychiatric, and legal ramifications of mischaracterizing the prospective risk of any given patient, and the resulting consequences for the individual, their families, and broader society, there is growing interest in the use of artificial intelligence and predictive analytics to facilitate clinical decision making at an individual level [6]. This can potentially pave the way for tailor-made tools for the diagnosis, assessment, and treatment of patients [7, 8]. While predictive machine learning models have already shown promise in other fields of medicine [9, 10], there is a growing effort towards predicting criminal outcomes in psychiatric patients at an individual level. Incorporating such models into routine clinical care has the potential to facilitate personalized and targeted rehabilitation strategies to decrease prospective criminal outcomes. To the best of our knowledge, there are no systematic reviews describing the diagnostic accuracy of machine learning models in predicting criminal and violent outcomes in psychiatry. Therefore, this systematic review and meta-analysis aims to assess the diagnostic accuracy of studies using machine learning techniques to predict criminal outcomes in psychiatry.

Methods

This study has been registered on PROSPERO with the registration number PROSPERO CRD42019127169.

Search strategy

We searched three electronic databases (PubMed, Scopus, and Web of Science) for articles published up until April 2022. To identify relevant studies, the following structure for the search terms was used: (Artificial Intelligence OR Supervised Machine Learning AND crime-related outcomes in psychiatry). The complete search filter is available in the supplementary material. We also screened references from included articles to search for potentially missed articles.

Eligibility criteria

This systematic review was performed according to the PRISMA statement [11]. We selected original articles that used supervised machine learning models to predict crime-related outcomes in mental illness. We excluded review articles and studies using unsupervised learning, since methods such as clustering are not outcome-oriented. Furthermore, studies that predicted crime- or violence-related outcomes in individuals without psychiatric disorders were excluded, although further information regarding these studies can be found in Supplementary Table 2.

Data collection and extraction

Potential articles were independently screened in a blinded standardized manner for title and abstract contents by two researchers (DW and DLG). Following this, the full texts of screened articles were obtained and evaluated according to the inclusion and exclusion criteria. A third author (PB) provided a final decision in cases of disagreement. Criminal outcomes were operationalized as rearrest, reconviction of crimes, or prediction of the type of crime committed. Violent outcomes involved recorded violent incidents during inpatient stay or following hospital discharge.

Quality assessment

We created a machine learning quality assessment table based on experts’ opinion to evaluate the reproducibility and reliability of the included studies. Our assessment provides a quick way to evaluate published papers and can also serve as a checklist for future studies. Briefly, the instrument comprises nine methodological considerations, including representativeness of the sample, confounding variables, outcome assessment, algorithm selection, feature selection, class imbalance (where applicable), missing data, performance/accuracy, and testing/validation. The instrument can be found in Supplementary Table S1, and further details can also be found in the Supplementary Material.

Statistical analysis

A bivariate meta-analysis was performed for crime-related and violent outcomes using the mada [6], meta [12], and dmetatools packages in R [6]. Since we anticipated considerable between-study heterogeneity, a random effects model was used to pool effect sizes. Additionally, an adjusted profile restricted maximum likelihood estimator was used to calculate the heterogeneity variance tau-squared (τ²). This metric was selected since the heterogeneity statistic I² can be biased in meta-analyses with small sample sizes [13]. Using the reitsma function in ‘mada’ [6], a linear mixed model with random effects was selected to produce summary estimates of sensitivity and specificity, and to calculate the AUC and partial AUC of the summary receiver operating characteristic (ROC) curve, as described elsewhere [14]. The 95% confidence intervals for the summary AUC were generated using 2000 iterations of parametric bootstrapping with the ‘dmetatools’ package in R. Additionally, using the metamean function in ‘meta’ [12], mean accuracy across models was pooled alongside the standard error of model accuracy, as detailed in Supplementary Table S3. A random effects model was again selected for this pooling, and the restricted maximum likelihood estimator [15] was used to calculate the heterogeneity variance τ². Knapp-Hartung adjustments [16] were also used to calculate the confidence interval around the pooled effect. Finally, we pooled the diagnostic odds ratio and the positive and negative likelihood ratios within a random effects model with a DerSimonian-Laird estimator [17].
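To make the final pooling step concrete, the following is a minimal Python sketch of a DerSimonian-Laird random effects model. It is not the R code used in this review: the function name `dersimonian_laird`, the input values, and the use of a simple normal-approximation confidence interval are all illustrative assumptions.

```python
import math

def dersimonian_laird(effects, variances):
    """Pool study-level effects with a DerSimonian-Laird random effects model."""
    k = len(effects)
    w = [1.0 / v for v in variances]                      # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                    # between-study variance, truncated at 0
    w_star = [1.0 / (v + tau2) for v in variances]        # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical log diagnostic odds ratios and their variances (NOT data from this review).
log_dor = [1.2, 2.9, 2.1, 3.4]
var_log_dor = [0.20, 0.35, 0.15, 0.50]

pooled, se, tau2 = dersimonian_laird(log_dor, var_log_dor)
lower, upper = pooled - 1.96 * se, pooled + 1.96 * se     # normal-approximation 95% CI
print(f"pooled log DOR = {pooled:.3f} (95% CI: {lower:.3f} to {upper:.3f}), tau^2 = {tau2:.3f}")
```

When the studies agree closely, the estimated τ² is truncated at zero and the result coincides with a fixed-effect pooled estimate; larger disagreement between studies increases τ² and widens the interval.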

Four studies were excluded from the meta-analysis, as the authors did not report the sensitivity and specificity of their models.

Results

We found 12,420 potential titles/abstracts and included 20 studies that met the inclusion criteria. A list of the included studies and their most relevant characteristics and findings is provided in Table 1, while Table 2 details the diagnostic accuracies, odds ratios, and likelihood ratios of studies contained within the meta-analysis. Additionally, a schematic of the meta-analytic diagnostic accuracy of predicting criminal recidivism and physical violence is shown in Fig. 1. Furthermore, a machine learning quality assessment, additional figures related to model performance, and a table comprising twenty-one studies assessing criminal outcomes in non-psychiatric individuals can be found in the supplementary material. Additional information about machine learning algorithms [18], including methodological considerations, common problems, and limitations, can be found elsewhere [19].

Table 1 Predicting criminal and violent outcomes in psychiatry.
Table 2 Performance Metrics: Accuracies, AUC, diagnostic odds ratio, and likelihood ratios.
Fig. 1: Paired Forest plot of model accuracy for criminal and violent outcomes in psychiatry.

A linear mixed model with random effects was selected to produce summary estimates of sensitivity and specificity using the reitsma function in mada. The average sensitivity across studies was 73.33% (95% CI: 64.09–79.63) and average specificity was 72.90% (95% CI: 63.98–79.66). As such, the balanced accuracy across models, defined as (sensitivity + specificity)/2, is 73.11%.
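As a brief illustration of how these quantities relate, the Python sketch below derives sensitivity, specificity, and balanced accuracy from a confusion matrix. The counts and function names are hypothetical and serve only to show the arithmetic; the last line applies the same formula to the summary estimates reported above.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # proportion of true cases flagged by the model
    specificity = tn / (tn + fp)   # proportion of non-cases correctly cleared
    return sensitivity, specificity

def balanced_accuracy(sensitivity, specificity):
    """Average of sensitivity and specificity."""
    return (sensitivity + specificity) / 2

# Hypothetical counts for a single study (NOT data from this review).
tp, fn, tn, fp = 30, 10, 45, 15
sens, spec = sensitivity_specificity(tp, fn, tn, fp)
print(sens, spec, balanced_accuracy(sens, spec))

# Same formula applied to the pooled percentages reported in the text.
print(balanced_accuracy(73.33, 72.90))
```

Unlike raw accuracy, balanced accuracy weights both classes equally, which matters when the positive class (criminal or violent outcome) is much rarer or more common than the negative class.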

Of the studies included in the systematic review, six assessed predictors of criminal recidivism [20,21,22,23,24,25], two assessed predictors of the type of criminal offence [26, 27], three assessed predictors of physical violence during inpatient stay [28,29,30], and nine assessed predictors of violent offending and aggression following discharge [31,32,33,34,35,36,37,38, 40]. All studies, apart from two [21, 30], used clinical input features, including socio-demographic information, questionnaires, and psychometric measures to derive predictions.

Studies assessing criminal outcomes

Eight studies used machine learning models to predict criminal outcomes in patients with psychiatric disorders [20,21,22,23,24,25,26,27]. Delfin and colleagues conducted the first 10-year follow-up of a cohort of forensic psychiatry patients, including 44 individuals, who underwent a single-photon emission CT scan. These data, alongside eight evidence-based clinical risk factors, were used in a random forest model to predict criminal recidivism, resulting in an accuracy of 82% and an AUC of 0.81. Of note, when clinical risk factors were used alone, model performance degraded, with an accuracy of 64% and an AUC of 0.69, emphasizing the importance of combining clinical and biological features to predict criminal recidivism. The top features reflecting neuronal activity included the right and left parietal lobe, left temporal lobe, and right cerebellum [21].

Kirchebner and colleagues used 653 clinical features to predict recidivism in 344 individuals with schizophrenia. Patients who had a criminal record prior to their current offence were considered as recidivists. Following imputation, the best performance was observed using Boosted Trees, with an accuracy of 67.6%. Without imputation, a Naive Bayes classifier achieved an accuracy of 79.4%. Important variables included amisulpride prescription prior to offence, recent stressors, recent legal complaints, and number of prior offences [24].

Sonnweber et al. developed a model to differentiate between violent and non-violent offenders in patients with schizophrenia. The best performance was observed using a gradient boosting machine, resulting in a balanced accuracy (operationalized as the average of sensitivity and specificity, as defined elsewhere [39]) of 67%. The most important variables included time spent in hospitalization, age at diagnosis, daily olanzapine at discharge, PANSS score at discharge, and social isolation in adulthood [26].

Furthermore, Watts and colleagues developed a machine learning model to predict the type of criminal offence committed in a large transdiagnostic sample of 1240 psychiatric patients. Using multiclass classification, they showed that sexual crimes could be discriminated from violent and nonviolent crimes at an individual level with an accuracy of 71.22%. Moreover, following recursive feature elimination, a reduced model with 36 variables resulted in an accuracy of 71.58%. The most important features for the model included previous absolute discharge, previous sexual convictions, cluster A personality disorder, and female gender [27]. Other studies predicted rearrest after release from jail [20, 22], reconviction for a violent crime [23], and risk of general criminal recidivism [25]. A summary of these findings can be found in Table 1 and Supplementary Table S2.

Studies assessing violent outcomes

Twelve studies used machine learning techniques to predict violent outcomes in patients with psychiatric disorders [28,29,30,31,32,33,34,35,36,37,38, 40]. Linaker and colleagues predicted violent incidents in psychiatric patients using behavioral symptoms recorded in health records during the preceding 24 h. Overall, 48 acts of violence were recorded from 32 patients, and following feature selection using correlation coefficients, six variables were used as predictors in a logistic regression model. The authors reported a sensitivity of 81.3% and specificity of 100%; however, it was unclear how class imbalance was addressed, since only 34.7% of patients committed an act of violence during the study [32].

Kirchebner and colleagues used a series of known stressors to predict violent offending in 370 patients with schizophrenia. The overarching goal was to determine whether accumulated stressors precipitated violent outcomes in patients. Using boosted classification trees, they reported an accuracy of 76.4%. However, no external validation or testing set was used; instead, performance was assessed using 5-fold cross-validation [40].

Furthermore, Menger et al. used text analysis of doctor and nurse notes to predict violent incidents in psychiatric inpatients. Four feature extraction methods were used, comprising binary bag of words, term frequency-inverse document frequency (tf-idf) bag of words, document embeddings, and word embeddings, as described elsewhere. An AUC of 0.788 was observed using document embeddings with recurrent neural networks. The worst performance occurred with the Naive Bayes algorithm, one of the most classical and widely used algorithms for text classification [28].
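For readers unfamiliar with tf-idf weighting, the following is a minimal Python sketch of one common unsmoothed variant. It does not reproduce the preprocessing used by Menger et al.; the function name `tfidf`, the whitespace tokenizer, and the example notes are illustrative assumptions.

```python
import math
from collections import Counter

def tfidf(documents):
    """Return one {term: weight} dictionary per document (unsmoothed tf-idf)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Invented clinical notes for illustration only.
notes = [
    "patient calm and cooperative",
    "patient angry and shouting at staff",
    "patient calm after medication",
]
for vector in tfidf(notes):
    print({term: round(weight, 3) for term, weight in vector.items()})
```

Terms that occur in every note, such as "patient" here, receive a weight of zero, whereas terms concentrated in a few notes are weighted more heavily. A binary bag of words would instead record only whether each term is present.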

Monahan and colleagues classified patients according to high and low risk of violence following discharge from psychiatric facilities. Decision trees were used in a binary classification task, and features were selected using a stepwise model, where the threshold of statistical significance between the feature and outcome was set at P < 0.05. The model correctly identified 72.6% of the sample as either low or high risk. Important variables included seriousness of prior arrests, motor impulsiveness, paternal drug use, and recurrent violent fantasies. It is important to mention that 27.4% of the total sample remained unclassified, meaning the model could find no combination of risk factors to classify these patients into high- or low-risk groups [33].

Additionally, Suchting and colleagues used saliva FK506 binding protein 5 (FKBP5) polymorphisms alongside demographic and psychometric variables to predict state aggression, which resulted in an R² of 0.66 [30]. Other studies identified predictors of violent risk following discharge [37, 38] and aggression in patients [29, 31, 34,35,36], which are further described in Table 1.

Meta-analysis of diagnostic accuracy

A forest plot detailing model performance can be observed in Figs. 1 and 2, while Table 2 details the diagnostic accuracies, odds ratios, and likelihood ratios across studies. Additional details related to the standard error of model accuracy, 95% CI, and the true/false positives and negatives, can be found in Supplementary Table S3. Nine studies were pooled, comprising 2,428 patients (the same dataset of 370 patients was used across two studies [26, 40]).

Fig. 2: Pooled effects of model accuracy.

Pooled accuracy of criminal and violent models in psychiatry across 2428 patients (two studies used the same sample, n = 370) within a random effects model using a restricted maximum likelihood estimator to calculate the heterogeneity variance τ². Reported mean accuracy across models was used, in conjunction with standard deviation, calculated by multiplying the standard error by the square root of the sample size (SD = SE × √n). Knapp-Hartung adjustments were used to calculate the confidence interval around the pooled effect. The average accuracy across models was 71.45% (95% CI: 60.88–83.86), with a heterogeneity variance τ² of 0.0424.

Additionally, nine studies which did not report the sensitivity and specificity of models [20, 22, 23, 28, 29, 31, 33,34,35], and one regression-based model [30], were excluded from the meta-analysis. Overall, the pooled accuracy across models was 71.45% (95% CI: 60.88–83.86), with a sensitivity ranging from 54.4% to 87.3% (average: 73.33%, 95% CI: 64.09–79.63) and specificity ranging from 60.5% to 96.6% (average: 72.90%, 95% CI: 63.98–79.66). The heterogeneity variance τ² for pooled model accuracy was 0.0424 (95% CI: 0.0184–0.1553). A plot of the false positive rate against sensitivity for all studies can be found in Supplementary Fig. S1.

The diagnostic odds ratio (DOR) across studies was 9.757 (95% CI: 4.035–22.72; τ² = 1.505), as detailed in Table 2. Similarly, the positive likelihood ratio (posLR) was 3.083 (95% CI: 1.954–4.866), with a τ² of 0.437 (95% CI: 0.000–0.897), and the negative likelihood ratio (negLR) was 0.342 (95% CI: 0.201–0.583), with a τ² of 0.566 (95% CI: 0.000–3.476). Additionally, the log DOR across studies was 2.466 (95% CI: 1.534–3.397). The average prevalence of the positive class (presence of criminal and violent outcomes) was 43.435% of the sample across studies. Furthermore, the AUC across studies was 0.816 (95% CI: 0.745–0.875) in predicting criminal and violent outcomes, with a partial AUC of 0.773. Spearman’s rho indicated a weak association, with a wide confidence interval, between the sensitivities and false positive rates of included studies (rho = 0.150, 95% CI: −0.571–0.740).
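For reference, the likelihood ratios and DOR of a single study follow directly from its sensitivity and specificity, as sketched below in Python. The function name and the example values are hypothetical. Note that the pooled estimates reported above were obtained from random effects models across studies, so they need not equal the values obtained by plugging the summary sensitivity and specificity into these formulas.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR, diagnostic odds ratio) for one study."""
    pos_lr = sensitivity / (1 - specificity)      # how much a positive result raises the odds
    neg_lr = (1 - sensitivity) / specificity      # how much a negative result lowers the odds
    dor = pos_lr / neg_lr                         # equivalent to (TP * TN) / (FP * FN)
    return pos_lr, neg_lr, dor

# Hypothetical single-study values (NOT taken from the included studies).
pos_lr, neg_lr, dor = likelihood_ratios(sensitivity=0.80, specificity=0.70)
print(f"posLR = {pos_lr:.3f}, negLR = {neg_lr:.3f}, DOR = {dor:.3f}")
```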

Discussion

To the best of our knowledge, this is the first systematic review comprising studies using supervised machine-learning techniques to predict criminal or violent outcomes in individuals with psychiatric disorders. Throughout our review, we have identified recurrent features and algorithms used, as well as current methodological challenges. In this section, we detail key aspects of these models, showcasing their limitations as well as our perspectives on best practices for developing machine learning models with clinical utility. Further details regarding common methodological issues in machine learning models can be observed in the supplementary material.

Model interpretability, model performance, and confidence intervals

More recent machine learning algorithms that use regularization parameters to account for common issues such as multicollinearity tended to show higher accuracy in predicting outcomes. However, greater model complexity comes at the cost of reduced interpretability and explainability [41].

Recently, new local explanation methods have been developed, including SHapley Additive exPlanations (SHAP), to explain variable contributions at the individual level [42]. Adaptations of this, such as TreeExplainer, leverage the internal structure of tree-based models to efficiently compute local explanations using Shapley values [43]. Moreover, SHAP dependence plots can be used to showcase the effect that a single feature has on predictions made by the model [43]. In two studies included in the current review, feature importance metrics were not reported [28, 35]. It is argued that future studies may benefit from an increased focus on model interpretability, which may aid in the generalizability and replicability of such work.
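To illustrate the idea behind Shapley-value explanations, the Python sketch below computes exact Shapley values for a toy two-feature risk score by averaging each feature's marginal contribution over all feature orderings. This brute-force approach is only an illustration and is not how the SHAP library or TreeExplainer computes its values in practice; the function `shapley_values`, the toy model, its coefficients, and the feature names are all hypothetical.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    features = list(instance)
    totals = {feature: 0.0 for feature in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)              # start from the reference input
        previous = predict(current)
        for feature in order:
            current[feature] = instance[feature]   # switch this feature to the patient's value
            updated = predict(current)
            totals[feature] += updated - previous
            previous = updated
    return {feature: total / len(orderings) for feature, total in totals.items()}

def toy_model(x):
    """Hypothetical risk score with an interaction term (illustration only)."""
    return (0.1 + 0.3 * x["prior_offences"] + 0.2 * x["substance_use"]
            + 0.1 * x["prior_offences"] * x["substance_use"])

patient = {"prior_offences": 1, "substance_use": 1}
reference = {"prior_offences": 0, "substance_use": 0}
phi = shapley_values(toy_model, patient, reference)
print(phi)
```

The contributions sum to the difference between the prediction for the patient and the prediction for the reference input, which is the property that makes such explanations additive at the individual level. The number of orderings grows factorially with the number of features, which is why efficient algorithms are needed for realistic models.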

Furthermore, it is important to highlight that model performance can be over-optimistic when assessed using internal cross-validation alone, in the absence of separate training and testing sets. Of the twenty studies contained in the present review, only seven (35%) incorporated training and testing sets in model development. Among the studies that evaluated model performance using internal cross-validation alone, the majority (76.9%) [25, 28,29,30,31, 33,34,35,36, 38] had sample sizes well over 100 patients. As mentioned elsewhere, several other fields use cross-validation to tune regularization parameters in model development, rather than taking performance estimates at face value [44]. Similarly, uncertainty estimates should be considered when evaluating model performance and its potential clinical utility. Of the nine studies comprising the meta-analysis, only four (44.4%) [21, 26, 27, 37] reported uncertainty around their accuracy estimates, for example as 95% confidence intervals.
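The distinction between internal cross-validation and a held-out test set can be sketched as follows in Python. This is a schematic of the data partitioning only, with no model fitted; the function names, the 80/20 split, and the choice of five folds are illustrative assumptions rather than a description of any included study.

```python
import random

def train_test_split(indices, test_fraction, seed=0):
    """Shuffle indices and hold out a fraction as an untouched test set."""
    rng = random.Random(seed)
    shuffled = indices[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def k_fold(indices, k):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, folds[i]

patients = list(range(100))                       # hypothetical patient identifiers
train, test = train_test_split(patients, test_fraction=0.2)

for fold_train, fold_val in k_fold(train, k=5):
    # Hyperparameters would be tuned here, using fold_train and fold_val only.
    pass

# Final performance would be reported once, on `test`, after tuning is complete.
print(len(train), len(test))
```

Because the test set never informs tuning decisions, the performance measured on it is less prone to the optimism that arises when the same cross-validation folds are used both to select and to evaluate a model.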

Model performance and clinical predictors

Overall, eighteen models assessed clinical predictors of criminal and violent outcomes [20, 22,23,24,25,26,27,28,29, 31,32,33,34,35,36,37,38, 40]. In criminal prediction models, accuracy was generally high, ranging from 67.83% to 82%.

With respect to criminal behavior, common predictors across models included age at first crime, substance use disorder, cluster B personality disorder, prior criminality, a high number of stressors, and childhood trauma. Future work may benefit from developing a standardized evidence-based risk battery for use in prospective models.

Furthermore, the accuracy of models predicting violent behavior was more variable, ranging from 58.25% to 92.1%, with five of the twenty studies (25%) [22, 23, 28, 35] in the systematic review reporting only AUC. As such, several were excluded from the meta-analysis. Nonetheless, important clinical features included confusion, irritability, threats, recently attacking objects, child abuse, physical neglect, and callous affect. Important search terms included aggressive, offered, angry, door, walk, arrest, offer emergency medication, and walked.

With respect to the meta-analysis comprising nine studies (n = 2428 patients), the pooled accuracy was 71.45% (95% CI: 60.88–83.86) in predicting criminal and violent outcomes. Moreover, as detailed in Table 2, the DOR was 9.757 (95% CI: 4.035–22.72; τ² = 1.505) and the log DOR was 2.466 (95% CI: 1.534–3.397). As discussed elsewhere, the DOR is a measure of the effectiveness of a diagnostic test that is independent of prevalence [45]. A DOR of 9.757 represents a high ratio of the odds of the test being positive if the individual will go on to commit criminal or violent acts relative to the odds of the test being positive if the individual will not. However, the 95% CI was wide, and the log DOR suggests a more conservative test effectiveness. Similarly, the posLR was 3.083 (95% CI: 1.954–4.866), suggesting a small increase in the likelihood of committing violent and criminal outcomes in patients with a positive test. In addition, the negative likelihood ratio was 0.342 (95% CI: 0.201–0.583), suggesting a 20–25% decrease in the odds of committing violent and criminal outcomes in patients with a negative test result.
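One way to read these likelihood ratios is to convert a pre-test probability into a post-test probability through the odds form of Bayes' theorem. The Python sketch below does this using the pooled posLR and negLR reported above. Treating the average prevalence of the positive class (43.435%) as the pre-test probability is an assumption made purely for illustration, and the function name is hypothetical.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio via the odds form of Bayes' theorem."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Assumption: the average prevalence across studies stands in for the pre-test probability.
prevalence = 0.43435

after_positive = post_test_probability(prevalence, 3.083)   # pooled posLR
after_negative = post_test_probability(prevalence, 0.342)   # pooled negLR
print(f"after a positive test: {after_positive:.3f}")
print(f"after a negative test: {after_negative:.3f}")
```

In a setting with a lower base rate of offending, the same likelihood ratios would yield much lower post-test probabilities, which is one reason group-level accuracy metrics should be interpreted cautiously for individual patients.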

Model performance and biological predictors

Furthermore, two models assessed biological predictors, pertaining to saliva SNPs [30] and resting-state regional cerebral blood flow (rCBF) [21]. Although they contained small sample sizes and lacked external validation, both showed promising performance, corresponding to an R² of 0.66 and an accuracy of 82%, respectively. Important features included FKBP5_14 (rs1460780), FKBP5_92 (rs9296158), and FKBP5_94 (rs9470080), as well as right and left parietal lobe rCBF, left temporal lobe rCBF, and right cerebellum rCBF. Subsequent studies may benefit from replicating these findings and incorporating additional biological and physiological variables.

Limitations

Currently, the field of predicting crime- and violence-related outcomes using machine learning techniques remains in its infancy. As such, there is a lack of studies validating model performance using independent cohorts. Furthermore, it is important to note that model accuracy should be considered alongside several other factors, such as the input features used, the preprocessing pipeline, feature selection method, model optimization strategy, and the validation procedure. Data-driven approaches to feature selection can be useful in many cases, since they do not require knowledge derived from pre-existing literature to manually select important variables [46,47,48]. Of note, a subset of studies lacked a formalized feature selection strategy.

There are several available feature selection methods, with varying degrees of appropriateness depending on the application, as described elsewhere [47]. Furthermore, feature selection can be useful to improve the generalizability of models when applied to independent datasets [49]. Considering that predictive models applied to forensic healthcare can have significant legal repercussions (such as incorrectly identifying individuals as not criminally responsible when in fact they are, or the inability to detect malingering), it is paramount that we use the most appropriate methods available for these purposes.

Additionally, only two studies developed separate models to assess potential differences in performance between men and women using the same variables, as described in the supplementary material. Rosellini et al. reported an AUC of 0.74 for men and an AUC of 0.82 for women in predicting violent crime [50]. The same authors also investigated predictors of major violent crime and reported an AUC of 0.81 for both models in men, and an AUC of 0.80–0.82 for both models in women. Based on these studies, it is still unclear whether biological sex or gender plays a key role in deciding which features should be included within a predictive machine-learning model.

Future directions

Moving forward, a further refinement of predictive models in forensic risk prediction is required. Potentially, this may be facilitated by using a wider framework when selecting the input data in our models. Considering that our model performance is directly dependent on the available input data, an exploratory data-driven approach may be warranted in predictive models.

Most machine learning studies in forensic psychiatry thus far focus purely on clinical and administrative data, given the widespread availability of such data. However, other modalities, such as neuroimaging (MRI, fMRI, DTI), electrophysiology (EEG, MEG, ERG), various sensors (actigraphy, heart rate variability), and genomic features (whole genome sequencing, whole exome sequencing, and RNA sequencing), may improve model performance when used in conjunction with clinical data. Moreover, longitudinal studies with larger multicentric samples and adequate external validation are needed to translate proof-of-concept predictive models into applications to be used in clinical and legal settings. We hypothesize that such models may facilitate a more personalized approach to patient evaluation and risk management, provide greater precision in deriving a tailored treatment plan, and aid clinicians and the legal system in the decision-making process as it pertains to mentally disordered offenders. Ultimately, they may become critical tools to assist in prison sentencing, to determine fitness to stand trial, and to optimize the progress of individuals in the forensic system towards rehabilitation.