Machine learning: a prediction model of outcome of SARS-CoV-2 pneumonia

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused thousands of deaths worldwide. Information about prediction models for the prognosis of SARS-CoV-2 infection is scarce. We used machine learning to process the laboratory findings of 110 patients with SARS-CoV-2 pneumonia (51 non-survivors and 59 discharged patients). The maximum relevance minimum redundancy (mRMR) algorithm and the least absolute shrinkage and selection operator (LASSO) logistic regression model were used to select laboratory features. Seven laboratory features were selected: prothrombin activity, urea, white blood cell count, interleukin-2 receptor, indirect bilirubin, myoglobin, and fibrinogen degradation products. The signature constructed from these seven features had 98% [93%, 100%] sensitivity and 91% [84%, 99%] specificity in predicting the outcome of SARS-CoV-2 pneumonia. Thus, it is feasible to establish an accurate prediction model of the outcome of SARS-CoV-2 pneumonia with machine learning.


Introduction
Most human coronavirus infections are mild. However, several betacoronaviruses can cause serious diseases or even death.1,2 The mortality rates of severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) were 10% and 37% respectively. SARS-CoV-2 is the pathogen for 2019 novel coronavirus disease (COVID-19),3,4 which has resulted in thousands of deaths in the world since the beginning of 2020.
The diagnosis of SARS-CoV-2 infection must be confirmed by real-time reverse transcriptase polymerase chain reaction (RT-PCR) [...] high-sensitivity cardiac troponin I and procalcitonin.11,5,7 Ruan et al12 retrospectively analyzed the laboratory findings of 68 non-survivors and 82 discharged patients, and found significant differences in lymphocytes, platelets, albumin, TB, urea nitrogen, creatinine, myoglobin, C-reactive protein and interleukin-6 between the two groups. These laboratory findings seemed useful in predicting the outcome of SARS-CoV-2 infection. However, their accuracy needs to be validated by other studies, and more laboratory tests need to be assessed for their predictive value. In addition, an advanced prediction model involving multiple laboratory parameters is urgently needed for use in clinical decision support systems to improve predictive and prognostic accuracy.
As a branch of artificial intelligence, machine learning (ML) is helpful for establishing accurate prediction models.13,14,15 However, few publications report prediction of the outcome of SARS-CoV-2 pneumonia using ML methods based on laboratory findings. We therefore retrospectively collected the laboratory findings of discharged patients and non-survivors. These data were processed with an ML method similar to that used in radiomics.16,17 We aimed to establish a prediction model of the outcome of SARS-CoV-2 pneumonia based on laboratory data.

Methods
All methods were carried out in accordance with relevant guidelines and regulations.

Study design and participants
This study was approved by the Ethics Commission of Hospital (TJ-2020-075). Written informed consent was waived by the Ethics Commission of hospital.
The author's center was the designated hospital for severe and critical SARS-CoV-2 pneumonia.
Specialists from the top hospitals in the country gathered here and formed a consensus on the assessment and treatment of patients. Patients underwent repeated RT-PCR tests to confirm the presence of SARS-CoV-2. Laboratory tests for SARS-CoV-2 pneumonia included: routine blood test, serum biochemical test (including glucose, renal and liver function, creatine kinase, lactate dehydrogenase, and electrolytes), coagulation profile, cytokine test, markers of myocardial injury, infection-related markers, and other enzymes. Tests were repeated every three to six days to monitor the patient's condition.
Oxygen support (from nasal cannula to invasive mechanical ventilation) was administered to patients according to the severity of hypoxaemia. All patients received empirical antibiotic treatment and antiviral therapy. Most patients improved after regular treatment. Fitness for discharge was based on abatement of fever for at least 10 days, significantly improved respiratory function, and two consecutive negative RT-PCR results for SARS-CoV-2. However, the condition of a few critical patients continued to deteriorate, and these patients eventually died.
The presence of SARS-CoV-2 in respiratory specimens was detected by real-time RT-PCR methods.
Primers and a probe targeting the envelope gene of CoV were used, with the following sequences: forward primer 5′-TCAGAATGCCAATCTCCCCAAC-3′; reverse primer 5′-AAAGGTCCACCCGATACATTGA-3′; and probe 5′CY5-CTAGTTACACTAGCCATCCTTACTGC-3′BHQ1. Amplification conditions were 50°C for 15 min and 95°C for 3 min, followed by 45 cycles of 95°C for 15 s and 60°C for 30 s.

Data collection
Fifty-eight fatal cases of SARS-CoV-2 pneumonia (39 male, median age 66 years) were identified through the electronic medical record system. Sixty-eight discharged patients with SARS-CoV-2 pneumonia whose age and gender matched the non-survivors were selected (46 male, median age 66 years). These patients were admitted between Feb 16, 2020 and Mar 20, 2020. We reviewed all laboratory findings for each patient. Results of repeated tests were carefully compared to find the greatest deviation from the normal value. In general, the maximum of the series of values was recorded. However, for platelets, red blood cells, lymphocytes, hemoglobin, calcium, total protein, albumin, estimated glomerular filtration rate (eGFR), and prothrombin activity (PTA), the minimum was recorded. These recorded laboratory findings were considered the features of a patient. An initial data set of the features of 126 patients (58 non-survivors, 68 discharged) was thus built.
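The rule for reducing repeated tests to one value per feature can be sketched as follows. This is a minimal illustration, not the authors' code: the feature keys and the example measurements are invented, and only the split between "keep the minimum" and "keep the maximum" comes from the text above.

```python
# Features whose lowest value is recorded, as listed in the text.
# The key names are hypothetical labels chosen for this sketch.
MIN_FEATURES = {
    "platelets", "red_blood_cells", "lymphocytes", "hemoglobin", "calcium",
    "total_protein", "albumin", "egfr", "pta",
}

def summarize_patient(repeated_tests):
    """Collapse repeated measurements into one recorded value per feature.

    repeated_tests maps a feature name to the list of values measured during
    the stay. The minimum is kept for features in MIN_FEATURES and the
    maximum for every other feature.
    """
    features = {}
    for name, values in repeated_tests.items():
        if not values:
            continue  # never measured: the feature stays missing
        features[name] = min(values) if name in MIN_FEATURES else max(values)
    return features

# Invented measurements for one hypothetical patient
example = {"urea": [5.1, 9.8, 7.2], "platelets": [210, 95, 130], "pta": [88, 61]}
summary = summarize_patient(example)
```

Patients for whom a feature was never measured keep that feature missing, which is what the later exclusion of incomplete records relies on.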
Sixteen patients lacked one or more laboratory features, so their data were removed from the dataset. The remaining data of 110 patients (51 non-survivors, 59 discharged) were analyzed by machine learning. Age and gender were added to the data set for statistical comparison. The patients were randomly divided into a training cohort and a validation cohort at a ratio of 8:2, giving 88 patients in the training cohort and 22 in the validation cohort.
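The cohort arithmetic above (126 records, 16 incomplete, 110 analyzed, 88 plus 22 after an 8:2 split) can be reproduced with a short sketch. The record layout, the function names and the fixed seed are assumptions made for illustration; the text does not state whether the original split was stratified by outcome, so a plain random shuffle is used here.

```python
import random

def complete_cases(patients, feature_names):
    """Keep only patients that have a value for every listed feature."""
    return [p for p in patients
            if all(p.get(name) is not None for name in feature_names)]

def split_cohort(patients, train_fraction=0.8, seed=0):
    """Randomly split patients into training and validation cohorts."""
    shuffled = list(patients)
    random.Random(seed).shuffle(shuffled)  # fixed seed only for reproducibility
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder records: 126 patients, 16 of them missing one feature
patients = [{"id": i, "urea": 5.0, "pta": 90.0} for i in range(126)]
for record in patients[:16]:
    record["pta"] = None

kept = complete_cases(patients, ["urea", "pta"])
training, validation = split_cohort(kept)
```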

Statistical analysis and modeling for training cohort
First, all laboratory features were compared between non-survivors and discharged patients using the Mann-Whitney U test for non-normally distributed features or the independent t-test for normally distributed features.16,17 Features with p < 0.05 were considered significant and were selected.16,17 Second, Spearman's correlation coefficient was used to compute the relevance and redundancy of the features.16,17 Third, we applied the maximum relevance minimum redundancy (mRMR) algorithm to assess the relevance and redundancy of the features.16,17 The features were ranked according to their mRMR scores.16,17 Fourth, the top 15 features with high relevance and low redundancy were entered into the least absolute shrinkage and selection operator (LASSO) logistic regression model. The LASSO logistic regression model with 5-fold cross-validation was adopted for further feature selection and construction of the signature.16,17 The coefficients of some candidate features were shrunk to zero, and the remaining variables with non-zero coefficients were finally selected.16,17 The selected features and their coefficients were used to calculate the signature for each patient. The Mann-Whitney U test and receiver operating characteristic (ROC) analysis were used to compare the signature between the two groups.16,17
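The LASSO step described above can be illustrated with a small, self-contained sketch. It is not the glmnet implementation used in the study: glmnet fits the model by coordinate descent, whereas this sketch swaps in plain proximal gradient descent, and the toy data, penalty grid, step size and fold assignment are all invented for illustration. It shows the two behaviours the text relies on: an uninformative feature receives a coefficient of exactly zero, and the signature is the intercept plus the coefficient-weighted sum of the remaining features.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes exactly 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def signature(row, intercept, coef):
    """Linear predictor: intercept plus the coefficient-weighted sum of features."""
    return intercept + sum(c * x for c, x in zip(coef, row))

def fit_lasso_logistic(X, y, lam, step=0.1, n_iter=500):
    """Minimise mean logistic loss + lam * sum(|coef|) by proximal gradient descent."""
    n, p = len(X), len(X[0])
    intercept, coef = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = [sigmoid(signature(row, intercept, coef)) - label
                 for row, label in zip(X, y)]
        intercept -= step * sum(resid) / n  # the intercept is not penalised
        for j in range(p):
            grad_j = sum(r * row[j] for r, row in zip(resid, X)) / n
            coef[j] = soft_threshold(coef[j] - step * grad_j, step * lam)
    return intercept, coef

def cv_deviance(X, y, lam, k=5):
    """Mean held-out binomial deviance over k interleaved folds."""
    n, total = len(X), 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in held_out]
        y_tr = [y[i] for i in range(n) if i not in held_out]
        intercept, coef = fit_lasso_logistic(X_tr, y_tr, lam)
        for i in held_out:
            prob = min(max(sigmoid(signature(X[i], intercept, coef)), 1e-12), 1 - 1e-12)
            total -= 2.0 * (y[i] * math.log(prob) + (1 - y[i]) * math.log(1.0 - prob))
    return total / n

# Invented toy data: column 0 tracks the outcome, column 1 is uninformative by construction
X, y = [], []
for x0, label in [(-2.0, 0), (-1.5, 0), (-1.0, 0), (-0.5, 0), (0.5, 0),
                  (-0.5, 1), (0.5, 1), (1.0, 1), (1.5, 1), (2.0, 1)]:
    for x1 in (1.0, -1.0):
        X.append([x0, x1])
        y.append(label)

lam_grid = [0.3, 0.1, 0.03, 0.01]  # hypothetical penalty grid
best_lam = min(lam_grid, key=lambda lam: cv_deviance(X, y, lam))
intercept, coef = fit_lasso_logistic(X, y, best_lam)
signatures = [signature(row, intercept, coef) for row in X]
```

In this toy fit the coefficient of column 1 stays at zero while column 0 keeps a positive weight, so a larger signature corresponds to the class labelled 1.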

Statistical analysis for validation cohort
The model derived from training cohort was used for the validation cohort. 16 The signature was calculated for each patient, and compared between non-survivors and discharged patients using a Mann-Whitney U test. ROC analysis was used to determine AUC, sensitivity and specificity.
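The ROC quantities named above can be computed directly from the signatures, as in the following sketch. The signature values, the labels (1 for non-survivor, 0 for discharged) and the cutoff of zero are invented for illustration; the cutoff actually used in the study is not stated in this section. The AUC is computed as the proportion of non-survivor/discharged pairs in which the non-survivor has the higher signature, which is the Mann-Whitney U statistic divided by the number of pairs.

```python
def roc_auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

def sensitivity_specificity(scores, labels, cutoff=0.0):
    """Classify score > cutoff as positive and return (sensitivity, specificity)."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s > cutoff)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s <= cutoff)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s <= cutoff)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Invented signatures for eight hypothetical patients
scores = [2.1, 0.8, -0.3, 1.5, -1.2, -0.7, 0.4, -2.0]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(scores, labels)                                      # 15 of 16 pairs ranked correctly
sensitivity, specificity = sensitivity_specificity(scores, labels)  # 3/4 and 3/4
```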

The following R packages were used: "corrplot" to calculate Spearman's correlation coefficient, "mRMRe" to implement the mRMR algorithm, "glmnet" to fit the LASSO logistic regression model, and "pROC" to construct the ROC curve.16,17

Results
Nine features were eliminated in the first step of feature selection because they did not differ significantly between the groups. The remaining thirty-eight features differed significantly between the two groups (P < 0.05), and mRMR scores were obtained for them. Seven features had non-zero coefficients after 5-fold cross-validation of the LASSO algorithm and were selected to construct a new signature. Table 1 shows the fifteen features with the highest mRMR scores. Figure 1 shows the correlation matrix heatmap of the thirty-eight significant features. Figure 2 shows the feature selection process with the LASSO algorithm. Figure 3 shows the contribution of the seven features to the new signature. Figures 4 and 5 show the signatures of all patients in the training and validation cohorts, respectively. Non-survivors (n=51) and discharged patients (n=59) did not differ in age or gender (median age 67 vs. 66, P=0·75; percentage of males, 66% vs. 64%, P=0·66). The comparisons of laboratory findings between non-survivors and discharged patients in the training cohort (n=88) are shown in Table 2.

Blood routine test
WBC and neutrophils were significantly higher in the non-survivor group than in the discharge group. Lymphocytes, platelets and red blood cells were significantly lower in the non-survivor group than in the discharge group. Their AUCs ranged from 0·646 to 0·910.

Electrolyte
Potassium, chlorine and sodium were significantly higher in the non-survivor group than in the discharge group. Calcium was significantly lower in the non-survivor group than in the discharge group. Their AUCs ranged from 0·634 to 0·652.

Serum biochemical test
Glucose and globulin were significantly higher in the non-survivor group than in the discharge group. Albumin and total protein were significantly lower in the non-survivor group than in the discharge group. Their AUCs ranged from 0·649 to 0·736.

Renal function
Urea and creatinine were significantly higher in the non-survivor group than in the discharge group. The eGFR was significantly lower in the non-survivor group than in the discharge group. Their AUCs ranged from 0·672 to 0·907.

Liver function
Total bilirubin, direct bilirubin, indirect bilirubin (IB) and glutamic oxaloacetic transaminase were significantly higher in the non-survivor group than in the discharge group. Their AUCs ranged from 0·647 to 0·806.

Coagulation profile
Prothrombin time, activated partial thromboplastin time (APTT), D-dimer, international normalized ratio (INR), fibrinogen and fibrinogen degradation products (FgDP) were significantly higher in the non-survivor group than in the discharge group. PTA was significantly lower in the non-survivor group than in the discharge group. Their AUCs ranged from 0·847 to 0·886.

Infection-related markers and myocardial injury markers
Procalcitonin, high-sensitivity C-reactive protein, ferritin and N-terminal pro-brain natriuretic peptide (NT-proBNP) were significantly higher in the non-survivor group than in the discharge group. Myoglobin, MB isoenzyme of creatine kinase and high-sensitivity cardiac troponin I were also significantly higher in the non-survivor group than in the discharge group. Their AUCs ranged from 0·843 to 0·915.

Discussion
Non-survivors and discharged patients with SARS-CoV-2 pneumonia differed significantly in thirty-eight laboratory findings. Using a machine learning method, we established a prediction model involving seven laboratory features. The model was highly accurate in distinguishing non-survivors from discharged patients. The seven features selected by artificial intelligence also indicated that dysfunction of multiple organs or systems correlated with the prognosis of SARS-CoV-2 pneumonia.
SARS-CoV-2 spreads and invades through the respiratory mucosa, triggers a series of immune responses and induces a cytokine storm in vivo, resulting in changes in immune components.18,5 A dysregulated immune response results in excessive inflammation and can even cause death.19,7 We found that leukocyte and neutrophil counts were significantly higher in non-survivors than in survivors. Excessive neutrophils may contribute to acute lung damage and are associated with fatality.20 The absolute lymphocyte count was reduced in SARS-CoV-2 non-survivors, suggesting depletion of lymphocytes caused by a strong innate inflammatory immune response. Higher serum levels of proinflammatory cytokines (IL-2r and IL-6) and C-reactive protein were found in non-survivors, also indicating an excessive immune response. In addition, a high leukocyte count in SARS-CoV-2 patients may also be due to secondary bacterial infection.21,5 Elevated procalcitonin was seen in fatal cases, representing more prominent inflammation.22 All of the laboratory parameters mentioned above may be associated with the prognosis of SARS-CoV-2 pneumonia.
Lung lesions have been considered the major damage caused by SARS-CoV-2 infection. Severe cases may develop acute respiratory distress syndrome (ARDS) and respiratory failure. However, liver injury has also been reported to occur during the course of the disease,23,24 and is associated with disease severity. Abnormal transaminase levels accompanied by decreased serum albumin and increased serum bilirubin levels were observed in fatal cases. The levels of liver function-associated markers were significantly higher in non-survivors than in survivors. Acute kidney injury could have been related to direct effects of the virus, hypoxia, or shock.25,26 Blood urea and creatinine levels continued to increase until death occurred. Non-survivors had lower eGFR and higher blood urea than survivors. Myocardial injury was seen in non-survivors, as suggested by elevated levels of myoglobin, high-sensitivity cardiac troponin I, or MB isoenzyme of creatine kinase.
The pathologic mechanisms of multiple organ dysfunction or failure may be associated with the death of patients with SARS-CoV-2 pneumonia. Some patients with SARS-CoV-2 infection progressed rapidly to septic shock, which is well established as one of the most common causes of disseminated intravascular coagulation (DIC).27 Conventional coagulation parameters during the disease course may also be associated with the prognosis of SARS-CoV-2 pneumonia. The non-survivors in our cohort had significantly longer prothrombin time and APTT than survivors. At the late stages of SARS-CoV-2 infection, levels of fibrin-related markers (D-dimer and FgDP) were markedly elevated in most cases, suggesting secondary hyperfibrinolysis in these patients.
A number of laboratory features were compared between non-survivors and discharged patients with SARS-CoV-2 pneumonia. The two groups differed significantly in as many as thirty-eight features. However, none of the features alone provided adequate accuracy in predicting the outcome of SARS-CoV-2 pneumonia. Thus, a novel, accurate prediction model involving multiple features was established in this study. With machine learning methods previously used in radiomics, a prediction model combining seven of the thirty-eight laboratory features was highly accurate in predicting the outcome of SARS-CoV-2 pneumonia in both the training cohort and the validation cohort.
The mRMR algorithm was used to assess the significant features and to avoid redundancy between features. The features were ranked according to their relevance-redundancy scores. The mRMR score of a feature is defined as the mutual information between the status of the patients and this feature minus the average mutual information between this feature and the previously selected features.28,29,17 The top fifteen features with high mRMR scores were selected for the next step of modeling. The least absolute shrinkage and selection operator logistic regression model was used to process the features selected by the mRMR algorithm. LASSO is a regression analysis method that improves the prediction accuracy and interpretability of the model.30 The coefficients of some candidate features were shrunk to zero, and the remaining variables with non-zero coefficients were selected. After LASSO, a new signature could be calculated from the selected features and their coefficients. The signature used for prediction of outcome can be a positive or a negative number, corresponding to poor and good prognosis respectively.
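The mRMR score defined above lends itself to a greedy ranking, sketched below under stated assumptions. Mutual information is approximated here from Spearman's correlation as -0.5 * ln(1 - rho^2), which is an assumption of this sketch rather than a confirmed detail of the study or of the mRMRe package. The feature names and values are invented, and the helper names are hypothetical.

```python
import math

def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return result

def spearman(a, b):
    """Spearman's rho: Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(var_a * var_b)

def mutual_information(a, b):
    """Correlation-based approximation of mutual information (sketch assumption)."""
    rho = min(abs(spearman(a, b)), 0.999999)  # cap to keep the logarithm finite
    return -0.5 * math.log(1.0 - rho * rho)

def mrmr_rank(features, status, n_select):
    """Greedily pick features by relevance minus mean redundancy with earlier picks."""
    selected, remaining = [], list(features)

    def score(name):
        relevance = mutual_information(features[name], status)
        if not selected:
            return relevance
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < n_select:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Invented data: feat_a_near_copy almost duplicates feat_a, feat_b adds new information
status = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "feat_a": [1, 2, 3, 4, 5, 6, 7, 8],
    "feat_a_near_copy": [1, 2, 3, 5, 4, 6, 7, 8],
    "feat_b": [5, 3, 2, 1, 8, 7, 6, 4],
}
ranking = mrmr_rank(features, status, n_select=3)
```

In this toy example the near-duplicate feature is ranked last even though it is as relevant to the outcome as `feat_b`, because its redundancy with the already selected `feat_a` is high.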
Our results showed that the signature discriminates survivors from non-survivors very efficiently. Both sensitivity and specificity were excellent. The AUC of the signature was 10% to 40% higher than the AUC of any single laboratory feature. As this prediction model was established by artificial intelligence, all we did was match the age and gender of discharged patients and non-survivors before providing the laboratory findings to the computer. Although the modeling process is a black box to us, the choice of features seems reasonable. PTA reflects coagulation function more accurately than prothrombin time, and can also reflect the degree of liver injury. Urea is a good index of the degree of renal function damage. WBC not only reflects immune status, but can also be used to evaluate secondary infection. IL-2r is an indicator of inflammation and immune response.20 IB is related to both liver function and possible hemolysis.
Myoglobin reflects the degree of myocardial injury. The increase of FgDP is related to coagulation disorders including DIC. Thus the current model involves multiple important systems closely related to the prognosis. Based on the high accuracy of the prediction model, it seems that we can deduce the following conclusions: liver, kidney, myocardial damage, coagulation disorder and excess immune response all contribute to the outcome of SARS-CoV-2 pneumonia.
One limitation of this model is that it did not cover all laboratory tests. Some important laboratory tests, such as lymphocytes, albumin or creatinine, were not included. Fortunately, there are moderate to high correlations between the unselected and selected features, as confirmed by our statistical analysis. Furthermore, models involving too many features are not easy for clinicians to use. Another limitation of the model is that it did not involve clinical variables, because we focused on maximizing the predictive value of objective laboratory variables.
Our study has some limitations. First, this is a single-center retrospective study with a relatively small sample size. There were only 88 patients in the training cohort and 22 patients in the validation cohort. Multicenter studies with large samples are required to validate our prediction model. Second, because instruments differ among centers, the same patient may have different values for the same laboratory test in different hospitals. Our model, which is based on laboratory data from the author's center, may not be directly applicable in other centers. However, other centers could establish a prediction model from their own data with the same machine learning method. Third, age and gender were matched between discharged patients and non-survivors in the current study. It is well established that age and gender influence the results of laboratory tests. Because we eliminated the interference of age and gender, the differences in laboratory features can be attributed to disease severity. This study focused on the real predictive value of laboratory tests and aimed to improve prediction accuracy by combining multiple laboratory findings. However, a more complex model combining laboratory features and clinical variables should be constructed in future studies. Fourth, it is difficult for general clinicians to understand artificial intelligence methods. As artificial intelligence is used more widely in medical diagnosis, this prediction model is likely to receive more attention.
In conclusion, it is feasible to establish an accurate prediction model of the outcome of SARS-CoV-2 pneumonia using a machine learning method. Injury of the liver, kidney and myocardium, coagulation disorder and excess immune response all correlate with the outcome of SARS-CoV-2 pneumonia.

Data availability
After publication, the data will be made available to others on reasonable request to the corresponding author.

Figure 1
Correlation matrix heatmap of the 38 significant features. Spearman's correlation coefficient was used to compute the relevance and redundancy of the features.

Figure 2
The 5-fold cross-validation (A) of the least absolute shrinkage and selection operator algorithm used in the feature selection process. A vertical line was drawn at the optimal value. The coefficients of some candidate features were shrunk to zero (B), and the remaining seven variables with non-zero coefficients were finally selected to construct the signature.