A novel severity score to predict inpatient mortality in COVID-19 patients

COVID-19 is commonly mild and self-limiting, but in a considerable portion of patients the disease is severe and fatal. Determining which patients are at high risk of severe illness or mortality is essential for appropriate clinical decision making. We propose a novel severity score specifically for COVID-19 to help predict disease severity and mortality. A total of 4711 patients with confirmed SARS-CoV-2 infection were included. We derived a risk model using the first half of the cohort (n = 2355 patients) by logistic regression and bootstrapping methods. The discriminative power of the risk model was assessed by calculating the area under the receiver operating characteristic curve (AUC). The severity score was validated in the second half of the cohort (n = 2356 patients). Mortality incidence was 26.4% in the derivation cohort and 22.4% in the validation cohort. A COVID-19 severity score ranging from 0 to 10, consisting of age, oxygen saturation, mean arterial pressure, blood urea nitrogen, C-reactive protein, and the international normalized ratio, was developed. ROC curve analysis yielded an AUC of 0.824 (95% CI 0.814–0.851) in the derivation cohort and an AUC of 0.798 (95% CI 0.789–0.818) in the validation cohort. Furthermore, based on the risk categorization, the probability of mortality was 11.8%, 39% and 78% for patients with low (0–3), moderate (4–6) and high (7–10) COVID-19 severity scores, respectively. This developed and validated novel COVID-19 severity score will aid physicians in predicting mortality during surge periods.

The first confirmed case of COVID-19 in New York City was on March 1st, 2020. Within a few short weeks all of the hospitals in the area were overwhelmed, with a peak of 6,377 confirmed positive cases on April 6th, 2020. As of July 3rd, 2020, there have been 18,535 deaths, 55,110 hospitalizations and a total of 213,212 cases in this city 1 . New York City is an international travel hub with a high population density and a heavy reliance on mass transportation, which provided the permissive substrate for rapid viral spread 2 . As such, this region was one of the earliest areas in the United States to encounter the full impact of the pandemic 3 . Over these first few months much has been learned about the disease, its deadliness, and those who are at higher risk of dying.
In many people the disease is mild and self-limiting, but in a considerable portion of patients the disease is severe and fatal. Determining which patients are at high risk of severe illness or mortality is an essential part of understanding this illness. Prior reports from Wuhan found that patients with certain comorbidities, such as diabetes, hypertension and coronary artery disease, were more likely to present to their hospital 4 . They also found that older age, higher Sequential Organ Failure Assessment (SOFA) score, and elevated d-dimers were significantly associated with inpatient mortality 4 . Further reports have shown other predictors of poor outcome such as acute kidney injury, acute hepatic injury, the need for mechanical ventilation, elevated C-reactive protein (CRP), interleukin-6 (IL-6), lymphocyte count, and procalcitonin levels [5][6][7][8] .
COVID-19 is unique in its ability to cause not only sepsis and multi-system organ failure, but also a severe inflammatory response that can lead to systemic multi-vascular thrombosis 9,10 . While the SOFA score is also predictive of mortality for COVID-19, it does not address the additional thrombotic mitigators of severe illness 11 . [...] development of disseminated intravascular coagulation (DIC), and now being used to help guide the use of anticoagulation for patients with COVID-19 [12][13][14] . We propose a novel score specifically for COVID-19 in-hospital mortality, combining elements of both of these scores to help predict disease severity and mortality.

Methods
After approval of this study by the Montefiore Medical Center/Albert Einstein College of Medicine Institutional Review Board, information on demographics, comorbidities, admission laboratory values, admission medications, admission supplemental oxygen orders, discharge and mortality was identified through a healthcare surveillance software package (Clinical Looking Glass [CLG]; Streamline Health, Atlanta, Georgia) and review of the primary medical records. The Montefiore Medical Center/Albert Einstein College of Medicine Institutional Review Board approved waiver of patient informed consent due to the retrospective design of the study. To our knowledge, a description of the entire cohort of patients, as in the current manuscript, has not been reported in other submissions. In the interest of transparency, anonymized data will be made available at https://figshare.com/s/79827c396af7df42b3d7 .
All methods were carried out in accordance with relevant guidelines and regulations. Laboratory measures were extracted by identifying those obtained on admission. Comorbidities were identified based on the International Classification of Diseases, 10th Revision (ICD-10) coding system. The comorbidities chosen for this study are those used in the Charlson Comorbidity Index. Each patient's medical record was queried for any diagnosis occurring within 5 years of his or her index admission. We included the laboratory markers that were part of the routine tests on admission at our institution during the study period. Among the available markers, we selected those reported to be commonly altered according to recent studies (Online Appendix 1).
This study is an observational cohort study validating a novel, simple COVID-19 in-hospital mortality score to predict inpatient mortality risk in 4711 patients with confirmed SARS-CoV-2 infection using a combination of presentation vital signs, and basic admission laboratory values. This model was created on patients presenting from March 1st to April 16th. We used the first numeric half of patients during this period (n = 2355) as the "derivation cohort" in which the severity score was developed and internally validated. The second numeric half of our cohort (n = 2356) was used to confirm the power of the prediction score; this part of the cohort was considered the "validation cohort".
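The split into derivation and validation cohorts can be sketched as follows. This is a minimal illustration, assuming the patients are held in a list already ordered by their sequence in the cohort; the function name `split_cohort` and the integer placeholders are hypothetical and not part of the study's software.

```python
from typing import List, Sequence, Tuple


def split_cohort(patients: Sequence) -> Tuple[List, List]:
    """Split an ordered cohort into its first and second numeric halves.

    With an odd cohort size the extra patient falls in the second half,
    which matches the 2355 / 2356 split reported for 4711 patients.
    """
    midpoint = len(patients) // 2
    return list(patients[:midpoint]), list(patients[midpoint:])


# Placeholder identifiers standing in for the 4711 patient records.
derivation, validation = split_cohort(range(4711))
print(len(derivation), len(validation))
```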
The inclusion criteria were defined as all patients admitted to a hospital within a large healthcare network who were positive by detection of SARS-CoV-2 RNA using real-time reverse transcriptase-polymerase chain reaction (RT-PCR) assay testing, performed within the hospital system or documented at an outside system prior to transfer. Patients evaluated in the emergency room but not admitted, or those who died in the emergency room, were excluded from the analysis, given the relative paucity of data. Most patients had only one admission, and we only considered the last hospitalization for those who had multiple admissions during this period.

Statistical analysis.
Continuous values were represented using mean ± standard deviation (SD), or median and interquartile range (IQR). Categorical variables were described using frequencies and proportions. Comparisons were performed using Student's t test, the nonparametric Mann-Whitney test or χ2 tests as appropriate. No imputation was made for missing data. The primary outcome of this study was in-hospital mortality. Hence, all the following statistical steps were done with in-hospital mortality as the only dependent variable.
Candidate predictors with P < 0.10 in univariate analyses were included in a multiple logistic regression. In addition, a backward stepwise bootstrap regression model, in which 1000 random samples of patients were generated with replacement, was also performed to investigate the relative importance of each variable included in our model 15 . Frequencies of occurrence of each covariate in the final model were noted; if predictors occurred in 70% or more of the bootstrap models, they were retained in the final multiple regression model. Beta coefficients and odds ratios (OR) were calculated with 95% confidence intervals (CI). The multiple regression coefficients of the predictive factors were used to assign integer points for the prediction score. However, for the simplicity of the score we allocated points in sequential order for variables with multiple categories (e.g., age in years < 60, ≥ 60, ≥ 70, and ≥ 80 would equal 0, 1, 2 and 3 points in the score, respectively).
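The two rules described above, retention of predictors by bootstrap selection frequency and sequential point allocation for age, can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the selection counts in the example are made-up placeholders, not results from the study.

```python
from typing import Dict, List


def retained_predictors(selection_counts: Dict[str, int],
                        n_bootstrap: int = 1000,
                        threshold: float = 0.70) -> List[str]:
    """Keep predictors selected in at least `threshold` of the bootstrap models."""
    return [name for name, count in selection_counts.items()
            if count / n_bootstrap >= threshold]


def age_points(age_years: float) -> int:
    """Sequential points for the age bands < 60, >= 60, >= 70 and >= 80."""
    if age_years >= 80:
        return 3
    if age_years >= 70:
        return 2
    if age_years >= 60:
        return 1
    return 0


# Hypothetical selection counts out of 1000 bootstrap models.
example_counts = {"predictor_a": 912, "predictor_b": 700, "predictor_c": 455}
print(retained_predictors(example_counts))
print([age_points(a) for a in (45, 60, 75, 83)])
```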
As described in previous validation methods 13 , we assessed the discriminative power of the prediction score by calculating the area under the receiver operating characteristic (ROC) curve (AUC). A predictor with an AUC above 0.7 was considered to be useful, while an AUC between 0.8 and 0.9 indicated good diagnostic accuracy. Risk categories were determined using classification and regression tree (CART) analysis. The CART algorithm builds a decision tree using Gini's impurity index as the splitting criterion; the score was iteratively subdivided to find the cut-off point that produces the greatest reduction of impurity, meaning that it measures how often a random patient who died will be incorrectly labeled as low-risk and, vice versa, how often a patient who survived will be labeled as high-risk 16 . Calibration of the risk score, reflecting the link between predicted and observed risk, was evaluated by the Hosmer-Lemeshow goodness of fit test. A P value < 0.05 was considered statistically significant for all analyses. Data were analyzed using STATA version 12 and IBM SPSS version 24.
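To make the two measures above concrete, the following is a minimal pure-Python sketch of an AUC computed by pairwise comparison (equivalent to the Mann-Whitney formulation) and of a single CART-style split chosen by weighted Gini impurity. It is not the STATA or SPSS procedure used in the study; the function names and the toy data are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], died: Sequence[int]) -> float:
    """Probability that a patient who died scores higher than a survivor.

    Ties between a death and a survivor count as half a concordant pair.
    """
    deaths = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not deaths or not survivors:
        raise ValueError("both outcomes must be present")
    concordant = sum(1.0 if x > y else 0.5 if x == y else 0.0
                     for x in deaths for y in survivors)
    return concordant / (len(deaths) * len(survivors))


def gini(labels: List[int]) -> float:
    """Gini impurity of a binary outcome: 1 - p^2 - (1 - p)^2 = 2p(1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_cutoff(scores: Sequence[float], died: Sequence[int]) -> Tuple[float, float]:
    """Return the cut-off (score >= cut) with the lowest weighted Gini impurity."""
    candidates = sorted(set(scores))[1:]
    if not candidates:
        raise ValueError("at least two distinct score values are required")
    n = len(scores)
    best = None
    for cut in candidates:
        below = [d for s, d in zip(scores, died) if s < cut]
        above = [d for s, d in zip(scores, died) if s >= cut]
        weighted = (len(below) * gini(below) + len(above) * gini(above)) / n
        if best is None or weighted < best[1]:
            best = (cut, weighted)
    return best


# Toy data: six hypothetical patients, perfectly separated at a score of 7.
toy_scores = [1, 2, 3, 7, 8, 9]
toy_died = [0, 0, 0, 1, 1, 1]
print(auc(toy_scores, toy_died), best_cutoff(toy_scores, toy_died))
```

Applying `best_cutoff` again within each resulting subgroup would give further cut-offs, which is how a tree arrives at more than two risk categories.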
The bootstrap analysis revealed that, out of the 10 independent predictors of mortality, age, oxygen saturation, MAP, BUN, CRP, INR and procalcitonin were reproducibly selected in more than 70% of the bootstrap models. Due to the large amount of missing data for procalcitonin (44%), this variable was excluded in order to avoid noise predictors. Allocation of points for the COVID-19 severity score was made based on Beta coefficients and BCa 95% CI; however, for the simplicity of the score we allocated points 1 to 3 in subcategorized variables (age and MAP) (Table 4). The total prediction score ranges between 0 and 10, with a high score indicating high risk of in-hospital mortality.
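The composition of the total score can be sketched as a simple sum of per-variable points. The per-variable maxima below are an assumption inferred from the text, not taken from Table 4: age and MAP are stated to carry up to 3 points each, and a total range of 0 to 10 then leaves 1 point for each of the remaining four variables. The cut-off values that map raw measurements to points are not reproduced here, so the function takes already-assigned points as input; all names are hypothetical.

```python
from typing import Dict

# Assumed maximum points per variable (inferred from the 0-10 range, see lead-in).
MAX_POINTS: Dict[str, int] = {
    "age": 3,
    "oxygen_saturation": 1,
    "map": 3,
    "bun": 1,
    "crp": 1,
    "inr": 1,
}


def total_score(points: Dict[str, int]) -> int:
    """Sum the per-variable points after checking each is within its range."""
    for name, max_points in MAX_POINTS.items():
        if name not in points:
            raise KeyError(f"missing points for {name}")
        if not 0 <= points[name] <= max_points:
            raise ValueError(f"{name} points must be between 0 and {max_points}")
    return sum(points[name] for name in MAX_POINTS)


# Hypothetical patient: 2 age points, 1 MAP point, and abnormal BUN and CRP.
example = {"age": 2, "oxygen_saturation": 0, "map": 1, "bun": 1, "crp": 1, "inr": 0}
print(total_score(example))
```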
A ROC curve analysis was performed in the derivation cohort (Fig. 1). The novel COVID-19 severity score achieved an AUC of 0.824 (95% CI 0.814-0.851), indicating good discrimination for patients with higher risk of in-hospital mortality. Furthermore, the Hosmer-Lemeshow goodness of fit test of tenfold cross-validation did not reach statistical significance (P = 0.244), indicating a good match of predicted risk over observed risk. Finally, we applied the score to the 2356 patients in the validation cohort. The ROC curve analysis showed an AUC of 0.798 (95% CI 0.789-0.818), still indicating useful discrimination for our model (Fig. 2A). Then, we determined that low risk patients (0-3 points) had an 11.8% risk of mortality, moderate risk patients (4-7 points) had a 39% risk of mortality and high-risk patients (> 7 points) had a 78% risk of mortality (Fig. 2B).
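The mapping from a total score to a risk category and its reported mortality can be sketched as a lookup. Note that the band boundaries are stated in two ways in this article: the abstract gives low (0–3), moderate (4–6) and high (7–10). The sketch below follows the abstract's bands as an assumption, and the function name is hypothetical.

```python
from typing import Tuple


def risk_category(score: int) -> Tuple[str, float]:
    """Return the risk band and the reported mortality proportion for a score.

    Bands follow the abstract (0-3, 4-6, 7-10); this choice is an assumption.
    """
    if not 0 <= score <= 10:
        raise ValueError("score must be between 0 and 10")
    if score <= 3:
        return ("low", 0.118)
    if score <= 6:
        return ("moderate", 0.39)
    return ("high", 0.78)


for s in (2, 5, 9):
    print(s, risk_category(s))
```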

Discussion
We propose a novel scoring system to aid in the prediction of inpatient mortality for patients presenting with SARS-CoV-2 infection to hospital emergency rooms. The score is based on simple, pragmatic demographic data and presenting biomarker values. This score incorporates the unique constellation of presentations in which COVID-19 can manifest as severe illness. We avoided incorporating mechanical ventilation use into the score, as this was tied to a clinical decision, and the approach to that decision changed over time as more was learned. While IL-6 also seems to predict mortality, we avoided incorporating this biomarker, as it is a non-routine test and was not available in a large percentage of our patient population. As yet, no scoring systems have been created that are specific to the elements of COVID-19 illness manifestations and that can predict mortality. The limitations of this study are its retrospective design, its cohort, which is primarily a minority urban population, and the epoch at which the data was acquired. Since the data and outcomes were recorded during the highest surge of the pandemic, which placed a great strain on treating hospitals at the time, the results may be biased towards higher mortality. Prior reports have also shown increased mortality in racial and ethnic minority patients 17 . Given the sociodemographic background of our patient population, the score may again be biased towards higher mortality risk. While the design of the study may limit its generalizability to other populations, these findings are meaningful in that they are specifically applicable to minority urban centers with large surge populations of infected patients, which suffered the most in the first wave of the pandemic across the United States of America.
The encountered mortality rate is certainly high, but is most likely the result of the high comorbidity burden in our population, the fact that all of these patients had enough symptom severity to warrant admission, and the fact that the study period was early in the pandemic, when there was limited understanding of the disease. Nonetheless, given the diverse patient population of the Bronx, it is possible that this score can be generalized to other large inner-city populations. Future research is needed to validate this score in other populations, as well as to compare this score to the SOFA and ISTH DIC scores. The health network from which this data was captured comprises 3 major hospitals in the Bronx in New York City, one of which is a large quaternary care facility accepting transfers of complex and severely ill patients from the region beyond the Bronx into Westchester County. The mortality rates reported here are for hospitalized patients, who tended to be older and more severely affected than others infected with the virus. Hence the mortality rate for hospitalized patients is higher than the more commonly reported case-fatality rate, which reflects the number of deaths per documented infection. In any case, the rates reported here are broadly comparable to mortality rates for hospitalized patients in other countries at comparable time points in their respective pandemic outbreaks: China (48%) 4 , Italy (26%) 18 , and New York (21%) 19 .
The mortality rates were slightly different between the training set (26.4%) and the testing set (22.4%). This is likely secondary to the temporal difference between the sets. During the first 3 weeks of the pandemic surge, there was still little known about optimal management strategies for severely ill patients. As time went on, mortality rates decreased. In addition, there was more community awareness of the potential impact of the virus, and it is possible that patients were more likely to seek medical attention sooner and arrived in less severe states. Despite these differences in mortality rate, the severity score itself remained valid. There were also variances in racial distribution between the two cohorts. Despite these differences in race, the severity score remained valid in predicting in-hospital mortality. Although there have been reports of racial disparity in outcomes in other metropolitan areas outside of New York City, we found no difference in mortality rates between races 17,20 . There are a number of possible reasons why. The Bronx is uniquely diverse in its racial and ethnic populations, but it is also one of the poorest regions in the United States of America, with a median income of $38,085 and 27.3% of persons living in poverty 21 . One reason could be that other social determinants of health, including poverty level, are more powerful predictors of mortality than race alone.
While mortality prediction is neither perfect nor absolute, having a simple score to predict how severe a patient's illness and hospital course will be can help admitting and emergency room physicians triage severity and predict prognosis during surge periods. This can also be used to guide recommendations for palliative care consultation early in a patient's hospital course.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.