Deep learning-based prediction of in-hospital mortality for sepsis

Sepsis is a serious blood infection characterized by a high mortality risk and many complications. Accurate assessment of the mortality risk of patients with sepsis can help intensive care unit (ICU) physicians make better clinical decisions, which in turn can save patients' lives. However, most current clinical models for assessing mortality risk in sepsis patients are based on conventional indicators, some of which have been shown to be poorly suited to accurate clinical diagnosis today. Moreover, traditional evaluation models draw on only a small amount of individual patient data, which can cause misdiagnosis of sepsis patients. We refine the core indicators for mortality risk assessment of sepsis from massive clinical electronic medical records with machine learning, and propose a new mortality risk assessment model, DGFSD, for sepsis patients based on deep learning. The DGFSD model not only learns individual clinical information about unassessed patients, but also obtains information about the structure of the similarity graph between diagnosed patients and patients to be assessed. Experiments show that the DGFSD model is more accurate than the baseline methods and can improve the efficiency of clinical auxiliary diagnosis.

underestimated high-risk septic patients, and SAPS II is slightly negative. Perng et al. 19 showed through extensive experiments that a convolutional neural network (CNN) with a softmax model outperforms autoencoder, principal component analysis (PCA), and machine learning methods such as K-nearest neighbor (KNN), support vector machines (SVM), and RF.
Although existing studies have yielded some results, they still have drawbacks. For obtaining core indicators for sepsis mortality risk assessment, statistical methods can identify only a small number of indicators. In addition, although statistical methods can determine that certain indicators have a significant impact on the risk of death in patients with sepsis, they cannot rank the impact of multiple highly significant indicators, and so cannot determine which indicators are the most important. For mortality risk assessment in septic patients, machine learning approaches still focus only on the individual clinical records of septic patients, which are limited in number, and this restricts assessment accuracy.
In this paper, we extract core indicators for mortality risk assessment of sepsis with machine learning, and propose a new mortality risk assessment model, DGFSD, for sepsis patients based on deep learning. First, we use a machine learning model, XGBoost, together with the recommendations of clinical experts, to filter out the core indicators required for risk assessment from a massive amount of internationally available EMRs of sepsis patients. Then, we construct a patients similarity graph that connects sepsis patients with similar indicators. After that, we propose a mortality risk prediction model, DGFSD, for sepsis patients by combining a deep neural network (DNN) and a graph convolutional network (GCN). The DGFSD model not only learns individual clinical information about unassessed patients, but also obtains information about the structure of the similarity graph between diagnosed patients and patients to be assessed.

Method

Dataset
MIMIC-III 20 is a large-scale public dataset jointly released by the Computational Physiology Laboratory of Massachusetts Institute of Technology, the Beth Israel Deaconess Medical Center, and Philips Medical. MIMIC-III collects ICU patient records from 2001 to 2012 at Beth Israel Deaconess Medical Center and includes data from many types of ICUs, such as the Surgical Care Unit, Medical Care Unit, and Trauma Surgical Care Unit. It contains patients' vital signs trend data and clinical data. According to the relevance of the record content, the data are divided into four major categories: patients' basic information and transfer information, hospital outpatient related information, ICU related information, and auxiliary information. The four categories comprise 26 data tables, such as the hospitalization table, discharge table, date-type schedule, medical staff table, and monitoring situation table. Researchers must pass a qualifying test to gain approval from the dataset's administrators before using the data. We were approved to extract data from MIMIC-III for research purposes after completing the required testing through the CITI Program.

Clinical Characteristics
We selected patients according to the following criteria: (1) patients older than 18 years of age; (2) patients diagnosed with sepsis according to the third international consensus definition of sepsis and septic shock; (3) each admission of a sepsis patient was analyzed as an independent sample.
A total of 9432 patients with sepsis were included in the study, of whom 1926 (about 20.4%) died. Following Hu et al. 12 , and with reference to established scoring tools such as SAPS II and APACHE III, we used indicators such as the international normalized ratio (INR) to construct the mortality risk prediction model. The indicators contained both laboratory indicators and vital signs. Laboratory indicators included maximum values of serum creatinine, anion gap, lactate, blood urea nitrogen (BUN), pH, white blood cell, bicarbonate, ionized calcium, serum calcium, serum chloride, serum sodium, serum potassium, blood glucose, INR, prothrombin time (PT), partial thromboplastin time (PTT), alanine aminotransferase (ALT), alkaline phosphatase (ALP), aspartate aminotransferase (AST), total bilirubin, creatine kinase MB, and lactate dehydrogenase, as well as minimum values of hematocrit and albumin. Vital signs included age, mean values of heart rate, respiratory rate, and body temperature, and minimum values of oxygen saturation and the Glasgow Coma Scale (GCS) score (Supplementary Table S1).
To generate a usable dataset, we excluded indicators with more than 30% missing values among septic patients (Supplementary Figure S1), and split the data into training and test sets in a 7:3 ratio. Furthermore, we filled in the remaining missing values using the reference values of the indicators 21 .
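The preprocessing steps described above can be sketched as follows. This is a minimal illustration with NumPy, not the authors' pipeline: the function name `preprocess`, the toy matrix, and the placeholder reference values are all assumptions, and the order of the operations (filter, fill, then shuffle and split) is one reasonable reading of the text.

```python
import numpy as np

def preprocess(X, y, reference_values, max_missing=0.30, train_frac=0.7, seed=0):
    """Drop indicators with too many missing values, fill the rest, split 7:3."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    reference_values = np.asarray(reference_values, dtype=float)

    # Fraction of missing (NaN) entries for each indicator (column).
    missing_frac = np.isnan(X).mean(axis=0)
    keep = missing_frac <= max_missing

    X_kept = X[:, keep]
    ref_kept = reference_values[keep]

    # Replace each remaining NaN with the reference value of its indicator.
    X_filled = np.where(np.isnan(X_kept), ref_kept, X_kept)

    # Shuffle once, then cut at the 70% mark.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X_filled))
    n_train = int(round(train_frac * len(order)))
    train_idx, test_idx = order[:n_train], order[n_train:]
    return X_filled[train_idx], y[train_idx], X_filled[test_idx], y[test_idx], keep

# Toy data (hypothetical values): 10 admissions, 4 indicators.
X_toy = np.arange(40, dtype=float).reshape(10, 4)
X_toy[0, 1] = np.nan       # 10% missing in indicator 1 -> kept and filled
X_toy[2, 2] = np.nan       # 10% missing in indicator 2 -> kept and filled
X_toy[:4, 3] = np.nan      # 40% missing in indicator 3 -> dropped
y_toy = np.array([0, 1] * 5)
ref_toy = np.array([1.0, 2.0, 3.0, 4.0])   # placeholder reference values

X_train, y_train, X_test, y_test, kept = preprocess(X_toy, y_toy, ref_toy)
```

In practice the reference values would come from the clinical reference ranges cited in the text rather than from placeholders.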

Baseline
The goal of mortality risk assessment in septic patients is to accurately categorize patient outcomes, so it is essentially a binary classification task. For this reason, we choose accuracy (ACC) as the evaluation metric to assess the performance of the models. Considering that the dataset is unbalanced, we use the SMOTE 22 algorithm to oversample the dataset, so that accuracy reflects the actual utility of the models. ACC can be described as follows:

$$\mathrm{ACC} = \frac{S_r + D_r}{S_r + D_r + S_f + D_f}$$

where $S_r$ denotes the number of actual surviving sepsis patients judged to be alive by the model, and $D_r$ is the number of sepsis patients who actually died and were judged as dead by the model. $S_f$ represents the number of sepsis patients who actually survived but were judged as dead by the model, and $D_f$ is the number of sepsis patients who actually died but were judged as surviving by the model.
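As a small worked example of the accuracy definition, the sketch below counts the four outcomes from label lists and applies the ratio. The label encoding (0 = survived, 1 = died) and the helper names are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (S_r, D_r, S_f, D_f) for labels 0 = survived, 1 = died."""
    pairs = list(zip(y_true, y_pred))
    s_r = sum(1 for t, p in pairs if t == 0 and p == 0)  # survived, judged alive
    d_r = sum(1 for t, p in pairs if t == 1 and p == 1)  # died, judged dead
    s_f = sum(1 for t, p in pairs if t == 0 and p == 1)  # survived, judged dead
    d_f = sum(1 for t, p in pairs if t == 1 and p == 0)  # died, judged alive
    return s_r, d_r, s_f, d_f

def accuracy(s_r, d_r, s_f, d_f):
    """ACC = (S_r + D_r) / (S_r + D_r + S_f + D_f)."""
    total = s_r + d_r + s_f + d_f
    if total == 0:
        raise ValueError("no patients to evaluate")
    return (s_r + d_r) / total

# Five hypothetical patients: three are classified correctly.
counts = confusion_counts([0, 0, 1, 1, 0], [0, 1, 1, 0, 0])
acc = accuracy(*counts)
```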
To verify the effectiveness of the DGFSD model, we compared it with the following baseline models:
• Decision tree classification (DT): DT employs a tree structure and uses hierarchical reasoning to reach the final classification. A decision tree generally consists of a root node, internal nodes, and leaf nodes. We define the root node as the full sample of septic patients, the internal nodes as the septic patient feature attributes, and the leaf nodes as the final decision result for the patient. For prediction, each internal node tests a feature value and, based on the outcome, selects the branch to follow until a leaf node is reached, which gives the classification result.
• KNN: KNN predicts new data points by searching for the K most similar instances in the entire dataset and summarizing their output variables. We used KNN to predict the in-hospital outcome of a particular septic patient by searching for information about similar patients.
• Logistic regression (LR): LR is mainly used to solve binary classification problems. LR takes the features of a sepsis patient's data and calculates the probability of the patient's outcome. In particular, LR outperforms clinical scoring methods 12 .
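To make the KNN baseline concrete, here is a minimal sketch of a majority-vote nearest-neighbor classifier. It assumes Euclidean distance and an odd K, neither of which is stated in the text, and the function name and toy points are illustrative only.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3):
    """Predict a binary outcome for each query patient by majority vote."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    X_query = np.asarray(X_query, dtype=float)

    # Squared Euclidean distance from every query to every training patient.
    diffs = X_query[:, None, :] - X_train[None, :, :]
    dists = (diffs ** 2).sum(axis=2)

    # Indices of the k closest training patients for each query.
    nearest = np.argsort(dists, axis=1)[:, :k]

    # Majority vote over the neighbors' labels (0 = survived, 1 = died).
    votes = y_train[nearest]
    return (votes.mean(axis=1) >= 0.5).astype(int)

# Two well-separated toy groups (hypothetical values).
X_known = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_known = [0, 0, 0, 1, 1, 1]
predictions = knn_predict(X_known, y_known, [[0.2, 0.1], [5.5, 5.2]], k=3)
```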
Moreover, we perform ablation experiments by removing some of the modules in the DGFSD model. The comparison methods are:
• DGFSD-D-LR: The structure of the similarity graph between patients is not considered; only the individual clinical information of the patients is used.
• DGFSD-G: The individual clinical information of the septic patients is not considered; only the structure of the similarity graph between patients is used.

Problem definition
The clinical records of patients with sepsis can be represented as $S \in \mathbb{R}^{n \times d}$, where $S = \{s_1, s_2, s_3, \ldots, s_n\}$ and $s_i$ denotes the $i$th sample; $d = \{\text{albumin}, \text{alp}, \text{alt}, \ldots, \text{age}\}$ indicates the physical indicators, such as albumin, AST, and age. To compensate for the insufficient individual clinical data of septic patients, we use similarity calculation to obtain, for each patient, a large amount of information on patients with similar indicators, denoted as the similarity matrix $A \in \mathbb{R}^{n \times n}$. The problem of assessing the risk of death in septic patients can be formalized as follows: given individual clinical data $S$ on septic patients and information on similar patients $A$, the decision objective is to calculate the probability of the risk of death in septic patients through the DGFSD model, $P_h = \mathrm{DGFSD}\{S, A\}$.

Model description
To assess the risk of mortality in patients with sepsis more accurately, we first construct a patients similarity graph based on the original sepsis patient records. Then we input the patients' data and the patients similarity graph into the autoencoder and the GCN, respectively. The DGFSD model connects each layer of the autoencoder to the corresponding GCN layer, so that the representation specific to the autoencoder can be integrated into the structurally aware representation of the GCN, and finally outputs the prediction result through the GCN (Fig. 1).

Patients similarity graph
For each septic patient, we locate the top-k similar patients and set up edges to connect them. We calculate the similarity between every pair of patients $i$ and $j$ to obtain the similarity matrix $X$, then select the top-k similarities of each patient and construct an undirected patients similarity graph. Finally, the patients' adjacency matrix $A$ can be obtained from the non-graph data of the septic patients.
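The construction of the top-k graph can be sketched as below. Because the similarity formula itself is not available here, the sketch assumes cosine similarity as a stand-in; the function name and the toy patient vectors are also assumptions.

```python
import numpy as np

def topk_similarity_graph(S, k):
    """Build an undirected 0/1 adjacency matrix from top-k pairwise similarities."""
    S = np.asarray(S, dtype=float)
    n = S.shape[0]

    # Cosine similarity between patients (assumed; the paper's formula may differ).
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = S / norms
    X = unit @ unit.T

    # A patient is never its own neighbor.
    np.fill_diagonal(X, -np.inf)

    # Indices of the k most similar patients for each row.
    topk = np.argsort(-X, axis=1)[:, :k]

    A = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    A[rows, topk.ravel()] = 1.0

    # Keep an edge if either endpoint selected the other, so the graph is undirected.
    return np.maximum(A, A.T)

# Four toy patients forming two similar pairs (hypothetical values).
S_toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
A_toy = topk_similarity_graph(S_toy, k=1)
```

Symmetrizing with the element-wise maximum means a patient can end up with more than k neighbors; that is one common convention, not necessarily the one used in the paper.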

DNN module
Variations of the basic autoencoder include masked autoencoders, convolutional autoencoders, LSTM encoder-decoders, adversarial autoencoders, and deep autoencoders 23–27 . We opted for the basic autoencoder to learn the clinical data representation of septic patients. We suppose there are $L$ layers in the autoencoder and $l$ denotes the layer index. Then the septic patient clinical data representation $H^{(l)}$ learned by the $l$th layer of the encoder can be represented as follows:

$$H^{(l)} = \phi\left(W_e^{(l)} H^{(l-1)} + b_e^{(l)}\right)$$

where $\phi$ is the activation function of the fully connected layer, and $W_e^{(l)}$ and $b_e^{(l)}$ denote the weight matrix and bias of the $l$th layer of the encoder, respectively. In addition, the input $H^{(0)}$ of the encoder is the septic patient clinical records $S$.
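The decoder reconstructs the input data as follows:

$$H^{(l)} = \phi\left(W_d^{(l)} H^{(l-1)} + b_d^{(l)}\right)$$

where $W_d^{(l)}$ and $b_d^{(l)}$ are the weight matrix and bias of the $l$th layer of the decoder, respectively.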
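A forward pass through such an encoder and decoder can be sketched in NumPy as follows. The layer sizes, the ReLU activation, the random weights, and the mean squared reconstruction error are assumptions for illustration; no training is performed. Rows are patients here, so the layer update is written as `H @ W + b` rather than the column-vector form used in the text.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def init_layers(sizes, rng):
    """Create (W, b) pairs for consecutive layer sizes, e.g. [12, 8, 4]."""
    return [(rng.normal(0.0, 0.1, size=(d_in, d_out)), np.zeros(d_out))
            for d_in, d_out in zip(sizes[:-1], sizes[1:])]

def forward(H, layers, activation=relu):
    """Apply H <- activation(H W + b) layer by layer and keep every H^(l)."""
    representations = []
    for W, b in layers:
        H = activation(H @ W + b)
        representations.append(H)
    return representations

rng = np.random.default_rng(0)
S_demo = rng.random((5, 12))              # 5 patients, 12 indicators (random placeholders)
encoder = init_layers([12, 8, 4], rng)    # hypothetical layer sizes
decoder = init_layers([4, 8, 12], rng)

H_enc = forward(S_demo, encoder)          # encoder representations H^(1), H^(2)
H_dec = forward(H_enc[-1], decoder)       # decoder representations
S_hat = H_dec[-1]                         # reconstruction of S_demo

# Mean squared reconstruction error (an assumed choice of loss).
recon_error = float(np.mean((S_demo - S_hat) ** 2))
```

Keeping every intermediate representation matters because each encoder layer's output is later merged into the corresponding GCN layer.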

GCN module
We enable the GCN to learn both kinds of information by integrating the clinical data representation learned by the DNN into the GCN. The representation $G^{(l)}$ learned by the $l$th layer of the GCN module can be described as follows:

$$G^{(l)} = \phi\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} G^{(l-1)} W^{(l)}\right)$$

where $W^{(l)}$ is the weight matrix, $\tilde{A} = A + I$, $I$ is the identity matrix that adds a self-loop to each node of the adjacency matrix $A$, and $\tilde{D}$ is the degree matrix of $\tilde{A}$. To combine the individual information of septic patients learned by the autoencoder into the GCN, we merge $H^{(l-1)}$ with $G^{(l-1)}$ in the following way:

$$\tilde{G}^{(l-1)} = (1 - \epsilon)\, G^{(l-1)} + \epsilon\, H^{(l-1)} \tag{8}$$

where $\epsilon$ is an equilibrium coefficient. To reduce the hyperparameter search in DGFSD, $\epsilon$ is set to 0.5, making the representations of the GCN module and the DNN module equally important 28 . We combine the autoencoder and the GCN layer by layer through Eq. (8) and use $\tilde{G}^{(l-1)}$ as the input to the $l$th layer of the GCN. At this point, the new data representation is as follows:

$$G^{(l)} = \phi\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} \tilde{G}^{(l-1)} W^{(l)}\right) \tag{9}$$

From Eq. (9), it can be seen that the individual clinical information of septic patients learned by the autoencoder, $H^{(l-1)}$, is propagated in the GCN through the normalized adjacency matrix $\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$. At the start of the GCN module, we input the individual data $S$ of the septic patients into the first GCN layer, which is then represented as follows:

$$G^{(1)} = \phi\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} S\, W^{(1)}\right)$$

The final layer of the GCN module is a binary classification layer with ReLU functionality. Its result $g_{i,j} \in G$ indicates the probability that septic patient $i$ is assessed as outcome $j$, at which point $G$ is considered as a probability distribution.
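One merged propagation step of the kind described above can be sketched in NumPy as follows. The ReLU activation, the function names, and the tiny two-patient example are assumptions for illustration; the sketch covers a single hidden layer and not the final classification layer.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D~^(-1/2) (A + I) D~^(-1/2), the self-loop normalized adjacency."""
    A = np.asarray(A, dtype=float)
    A_tilde = A + np.eye(A.shape[0])          # add a self-loop to every patient
    degree = A_tilde.sum(axis=1)              # diagonal of the degree matrix
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_hat, G_prev, H_prev, W, eps=0.5):
    """Merge the GCN and autoencoder representations, then propagate."""
    # Weighted merge of the two representations with coefficient eps.
    G_merged = (1.0 - eps) * G_prev + eps * H_prev
    # Propagate over the graph and apply ReLU (assumed activation).
    return np.maximum(A_hat @ G_merged @ W, 0.0)

# Two connected patients (hypothetical values).
A_pair = np.array([[0.0, 1.0], [1.0, 0.0]])
A_hat = normalize_adjacency(A_pair)           # every entry equals 0.5 here

G_prev = np.array([[2.0, 0.0], [0.0, 2.0]])   # previous GCN representation
H_prev = np.array([[0.0, 2.0], [2.0, 0.0]])   # matching autoencoder representation
W_demo = np.eye(2)                            # identity weights for readability

G_next = gcn_layer(A_hat, G_prev, H_prev, W_demo, eps=0.5)
```

With `eps=0.5` the merged input is the plain average of the two representations, which is why both modules contribute equally.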

Results
We evaluate the DGFSD model on records from a large number of septic patients in MIMIC-III and compare its results with those of the baseline models. Through extensive experimental comparison and analysis, we reach the following three main conclusions.
RC1. The results of our experiments indicate that serum sodium, serum potassium, and BUN are not central to the assessment of mortality risk in patients with sepsis for accurate clinical diagnosis.
RC2. We compare DGFSD with baseline models and find that the DGFSD model is more accurate than the baseline models, indicating that the DGFSD model can be effectively applied in clinical auxiliary diagnosis.
RC3. The DGFSD model can not only learn individual clinical information of undiagnosed sepsis patients, but also obtain similarity graph structure information between diagnosed and undiagnosed patients, thereby improving the evaluation accuracy of the model.

Core indicators analysis (RC1)
The baseline indicators we used are shown in Table 1. We first balance the dataset using the SMOTE algorithm and then use XGBoost to obtain the importance ranks of the baseline indicators (Fig. 2). Incorporating the recommendations of clinical experts, we finalize the top 12 indicators ranked by importance as the core indicators for assessing the risk of death in patients with sepsis.
The core indicators for assessing the risk of death in patients with sepsis are shown in Table 2. Surviving patients have higher albumin, bicarbonate, and pH. In contrast, ALP, total bilirubin, serum chloride, creatinine, lactate, serum calcium, white blood cell, blood glucose, and age are higher in the deceased patients.
Zhang et al. 29 believe that serum sodium has a strong relationship to in-hospital mortality in patients with sepsis; SAPS II considers serum potassium, serum sodium, and BUN to be core indicators for assessing the risk of death in patients with sepsis; Hu et al. 12 consider BUN to be a core indicator for assessing the risk of death in patients with sepsis. However, Fig. 2 shows that the importance of these three indicators, serum sodium, serum potassium, and BUN, is not as high as previous research suggested. Therefore, we do not consider the three indicators to be core indicators for mortality risk assessment in septic patients.
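The balancing step can be illustrated with a simplified sketch of the idea behind SMOTE: each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbors. This is not the reference implementation; the function name, the neighbor count, and the toy points are assumptions.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by neighbor interpolation."""
    X_min = np.asarray(X_min, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(X_min)

    # Pairwise Euclidean distances among minority samples.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)               # a sample is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]  # k nearest minority neighbors

    # Pick a base sample and one of its neighbors for every synthetic point.
    base = rng.integers(0, n, size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position between the base and the neighbor.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])

# Five toy minority-class patients with two indicators (hypothetical values).
X_minority = np.array([[1.0, 1.0], [1.5, 1.2], [2.0, 0.8], [1.2, 2.0], [0.8, 1.6]])
X_synthetic = smote_like_oversample(X_minority, n_new=10, k=2)
```

Because every synthetic point is a convex combination of two real samples, it always stays inside the bounding box of the minority class.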

Model comparison (RC2)
The results of the comparison experiments between the DGFSD model and the baseline models are shown in Fig. 3. The DGFSD model outperforms the baseline models. DGFSD, as a deep learning model, has an accuracy of 82.78%, which is superior to LR (78.80%), DT (75.78%), and KNN (76.07%). This shows that the DGFSD model can be used for accurate clinical diagnosis.

Conclusions and discussions
We refine the core indicators for mortality risk assessment of sepsis so that they are more relevant to accurate clinical diagnosis. At the same time, we incorporate graph neural networks into the task of mortality risk assessment in septic patients, and propose a deep learning-based mortality risk assessment model, DGFSD.
Specifically, we extract importance rankings of indicators for mortality risk assessment of septic patients with the XGBoost model, and then select the core indicators for assessing the risk of death from sepsis, taking into account the recommendations of clinical experts. Finally, we perform multiple experiments with the DGFSD model on MIMIC-III, an internationally recognized open medical dataset, and compare the performance of DGFSD with other classical machine learning methods. The experimental results show the superiority of the DGFSD model in predicting the risk of death from sepsis, and the DGFSD model can meet the criteria for clinical auxiliary diagnosis of sepsis.
https://doi.org/10.1038/s41598-023-49890-9


Figure 1. DGFSD, a mortality risk assessment model for sepsis patients based on deep learning. (A) represents the construction of a patients similarity graph between similar sepsis patients by the similarity formula; (B) denotes the GCN module, through which the DGFSD model learns graph structure information between evaluated and unevaluated patients; (C) represents the DNN module, through which DGFSD learns data information from unassessed patients. After integrating the information learned by the DNN into the GCN, the DGFSD model is able to learn both representations of the information, resulting in a more accurate assessment of the risk of death.
The results of the ablation experiments of the DGFSD model are shown in Fig. 4. The DGFSD model can not only learn individual clinical information about unassessed patients, but also obtain information about the structure of the graph between diagnosed patients and patients to be assessed. As shown in Fig. 4, the DGFSD model has the best performance. DGFSD-G only learns the similarity graph structure information between septic patients, and the experimental result shows that it is not as accurate as the DGFSD model. Meanwhile, DGFSD-D-LR only obtains individual clinical information of sepsis patients, and the experimental result shows that it has the weakest performance. The ablation experiment shows that this multi-representation learning mode, DGFSD, can indeed improve the performance of the model in assessing the risk of death in sepsis patients.

Figure 3. Performance evaluation of model comparison.