Explainable artificial intelligence model to predict acute critical illness from electronic health records

Acute critical illness is often preceded by deterioration of routinely measured clinical parameters, e.g., blood pressure and heart rate. Early clinical prediction is typically based on manually calculated screening metrics that simply weigh these parameters, such as early warning scores (EWS). The predictive performance of EWSs yields a tradeoff between sensitivity and specificity that can lead to negative outcomes for the patient. Previous work on artificial intelligence (AI) systems trained on electronic health records (EHR) offers promising results, with high levels of predictive performance in the early, real-time prediction of acute critical illness. However, without insight into the complex decisions made by such systems, clinical translation is hindered. Here, we present an explainable AI early warning score (xAI-EWS) system for early detection of acute critical illness. xAI-EWS potentiates clinical translation by accompanying each prediction with information on the EHR data that explain it.


Field-specific reporting

Life sciences

Life sciences study design
The authors accessed the data by applying to the CROSS-TRACKS cohort, a newer Danish cohort that combines primary and secondary sector data. As with most register, EHR, and health data, we, as investigators, do not own the data and are not able to share them. This is due to EU regulations, national law, and the GDPR. However, all researchers can apply for access to the data by following the instructions on this page: http://www.tvaerspor.dk/.
In this study, we analyzed the secondary healthcare data of all residents of four Danish municipalities (Odder, Hedensted, Skanderborg, and Horsens) who were 18 years of age or older during the period 2012-2017. The data contained information from the electronic health record (EHR), including biochemistry, medicine, microbiology, and procedure codes, and were extracted from the "CROSS-TRACKS" cohort, which covers a mixed rural and urban multi-center population with four regional hospitals and one larger university hospital. Each hospital comprises multiple departmental units, such as emergency medicine, intensive care, and thoracic surgery. We included all 163,050 available inpatient admissions (45.86% male) during the study period and excluded only outpatient admissions. The included admissions were distributed across 66,288 unique residents. The prevalence of sepsis, AKI, and ALI among these admissions was 2.44%, 0.75%, and 1.68%, respectively (see Table 2).
No data were excluded.
To quantify reproducibility, all analyses were cross-validated: data were randomly divided into 5 portions of 20% each. For each fold, four portions (80%) were used to fit the xAI-EWS model parameters during training. The remaining 20% was split into two portions of 10% each for validation and test. This allowed us to report mean values along with confidence intervals to indicate the variation between experiments/folds. All patients in the test set were randomly selected and were not correlated in any way. Results were consistent between folds and can be observed in Figure 2. The cross-validation scheme is illustrated in Supplementary Figure 1. The raw output of the evaluation is listed in Supplementary Tables 1 and 2.
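The text above does not state how the confidence intervals across folds were computed. As a minimal sketch only, assuming a two-sided 95% interval based on Student's t distribution with 4 degrees of freedom (five folds), the summary of one metric could look as follows. The function name `mean_with_ci` and the example scores are hypothetical placeholders, not values from the study.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 4 degrees of freedom.
# Assumption: five folds; use a different value for another number of folds.
T_CRIT_95_DF4 = 2.776


def mean_with_ci(fold_scores, t_crit=T_CRIT_95_DF4):
    """Return (mean, lower, upper) for a list of per-fold metric values."""
    n = len(fold_scores)
    if n < 2:
        raise ValueError("at least two folds are needed to estimate variation")
    mean = statistics.mean(fold_scores)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(fold_scores) / math.sqrt(n)
    half_width = t_crit * sem
    return mean, mean - half_width, mean + half_width


# Hypothetical per-fold scores, for illustration only (not study results).
example_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
mean, lower, upper = mean_with_ci(example_scores)
print(f"mean={mean:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

With only five folds, the t-based interval is noticeably wider than a normal-approximation interval, which is one reason it is a common choice for cross-validation summaries.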
Data were randomly divided into 5 portions of 20% each. For each fold, four portions (80%) were used to fit the xAI-EWS model parameters during training. The remaining 20% was split into two portions of 10% each for validation and test. The validation data were used to perform an unbiased evaluation of the model fit during training, and the test data were used to provide an unbiased evaluation of the final model. For each fold, the data were shifted such that a new portion was used for testing. All data for a single patient were assigned to only one of the training, validation, or test sets.
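The splitting scheme described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: it assumes the 20% portions are formed over unique patients (the text does not say whether portions are counted in patients or admissions), and the function name `patient_level_folds`, the seed, and the patient identifiers are hypothetical.

```python
import random


def patient_level_folds(patient_ids, n_folds=5, seed=0):
    """Yield (train, validation, test) sets of patient IDs, one per fold.

    Every patient lands in exactly one of the three sets within a fold,
    so all admissions of that patient stay together.
    """
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    # Deal the shuffled patients into n_folds portions of near-equal size.
    portions = [unique_ids[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        # Shift the held-out portion for each fold.
        held_out = portions[k]
        half = len(held_out) // 2
        validation = set(held_out[:half])  # about 10% of patients
        test = set(held_out[half:])        # about 10% of patients
        train = {
            pid
            for j, portion in enumerate(portions)
            if j != k
            for pid in portion
        }
        yield train, validation, test


# Hypothetical example: 100 patients with made-up identifiers.
example_ids = [f"patient-{i:03d}" for i in range(100)]
for fold, (train, validation, test) in enumerate(patient_level_folds(example_ids)):
    print(fold, len(train), len(validation), len(test))
```

Admissions are then routed to a set by looking up their patient ID, which guarantees that no patient contributes to both training and evaluation within a fold.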
All data for a single patient were randomly assigned to only one of the training, validation, or test splits. After random assignment, patient identification keys were removed, leaving a cleaned dataset in which only parameters and labels were visible to the investigators. The entire process of random assignment, training, and testing was carried out in an automatic pipeline with no human interaction until the test results were output.