A predictive score for progression of COVID-19 in hospitalized persons: a cohort study

Accurate prediction of the risk of progression of coronavirus disease (COVID-19) is needed at the time of hospitalization. Logistic regression analyses are used to interrogate clinical and laboratory co-variates from every hospital admission from an area of 2 million people with sporadic cases. Of a total of 98 subjects, 3 had severe COVID-19 on admission. Of the remaining subjects, 24 developed severe/critical symptoms. The predictive model includes four co-variates: age (>60 years; odds ratio [OR] = 12 [2.3, 62]); blood oxygen saturation (<97%; OR = 10.4 [2.04, 53]); C-reactive protein (>5.75 mg/L; OR = 9.3 [1.5, 58]); and prothrombin time (>12.3 s; OR = 6.7 [1.1, 41]). The cutoff value is two factors, with a sensitivity of 96% and a specificity of 78%. The area under the receiver-operator characteristic curve is 0.937. This model is suitable for predicting which unselected newly hospitalized persons are at risk of developing severe/critical COVID-19.


INTRODUCTION
Approximately 10 percent of persons with SARS-CoV-2-infection are hospitalized because they develop severe/critical coronavirus disease (COVID-19) [1][2][3]. According to the interim guidance of the Centers for Disease Control and Prevention (CDC) 5, in areas with sustained community-level outbreaks 4, typically urban, most hospital admissions are for persons with severe/critical disease and/or comorbidities. This selection bias reflects limited acute care resources but is not representative of the global epidemiology of either SARS-CoV-2-infection or COVID-19. Consequently, most prognostic and predictive scores using admission co-variates address death from COVID-19 in persons with severe/critical COVID-19 rather than disease progression in persons with less severe disease 6,7. For example, in a report from Italy of 1591 subjects, all subjects were admitted to an intensive care unit (ICU), 99 percent of evaluable subjects required respiratory support, and the case fatality rate for ICU subjects was 26 percent 8.
Understandably, most reports of large series of persons with COVID-19 are from urban centers where SARS-CoV-2-infection is epidemic. However, as the numbers of confirmed cases and endemic areas increase every day, it is highly probable that more areas that currently have sporadic or clustered cases will eventually become areas with sustained community-level outbreaks, with hospitalizations for less severe disease under current containment measures. This pattern was seen in historical influenza epidemics according to CDC's pandemic interval framework 9. Under these circumstances, a precise and convenient triage strategy would be especially important in allocating health care capacity. To address this issue, we studied outcomes of 98 consecutive subjects with COVID-19 in a region of 2 million persons where most admissions were for persons with mild or moderate COVID-19. We used these data to develop a predictive model of the risk of progression to severe/critical COVID-19. These data may help physicians prioritize use of medical resources accordingly.

RESULTS
From January 17 to February 13, 2020, 98 patients with COVID-19 in Zhuhai were admitted to the hospital (Table 1); 46 subjects were male. Their median age was 47 years (interquartile range [IQR], 34-62; range, 10 months to 80 years). Of these patients, 77 had traveled to an epidemic area and 18 had contact with a SARS-CoV-2-infected person; 45 subjects had comorbidities on admission, including hypertension (N = 17), diabetes (N = 7), cancer (N = 5), tuberculosis (N = 2), and chronic kidney disease (N = 2). The median duration from symptom onset to admission was 3 days (IQR, 1.0-5.3). On admission, 13 subjects were classified as having mild disease, 79 as moderate, and 3 as severe; 3 subjects were not classified on admission. None of the subjects were critical.
During hospitalization, four subjects received mechanical ventilation; ten subjects in the moderate severity cohort were administered corticosteroids. Further, 17 subjects (15 in the moderate cohort) were administered chloroquine, 12 (11 in the moderate cohort) subjects were administered lopinavir/ritonavir (LPV/r), and 13 subjects were administered intravenous immunoglobulin.
At the time of final follow-up (median 55 days from admission; IQR, 52-58; range, 37-79 days), the highest COVID-19 severity scores were mild in 8 subjects, moderate in 66, severe in 19, and critical in 5. Severity grade shifts are summarized in Table 2. Among the 92 patients with mild or moderate disease on admission, 21 (22.8%) progressed during hospitalization, including 3 with critical illness. The progression rates were 15.4% and 24.1% in the mild and moderate groups, respectively. The median duration of hospitalization or interval to death was 18 days, which was not significantly different among the severity cohorts. The median duration of virus shedding was 8 days (IQR, 4-10 days; range, 1-19 days) and was similar among the severity cohorts. Detailed laboratory results on admission by severity are shown in Table 3.
In subjects with mild or moderate severity disease on admission (N = 95), we interrogated co-variates associated with risk of progression to severe/critical disease. Some binary co-variates were excluded because of low sensitivity and specificity. Continuous variables were tested in receiver-operator characteristic (ROC) curves to identify cutoff values (Fig. 1 and Supplementary Fig. 1), transformed into categorical variables, and entered in a multivariate backward stepwise logistic regression analysis together with clinical co-variates significantly associated with the risk of progression (as shown in Tables 1 and 3). Several duplicate and co-linear co-variates, such as International Normalized Ratio and CD4- and CD8-positive cell concentrations, were excluded. C-reactive protein (CRP) > 5.75 mg/L, prothrombin time (PT) > 12.3 s, age > 60 years, and blood oxygen saturation (SpO2) < 97% were associated with more severe disease and were of clinical value in identifying patients at higher risk of progression to severe/critical disease. Co-variates, odds ratios and 95% confidence intervals are shown in Fig. 2. Points were assigned as relative weights according to the regression coefficient of each categorical co-variate, namely 1 point for each. The area under the ROC curve (AUROC) of the score was 0.937. The cutoff value dividing subjects into high- and low-risk groups for progression to severe/critical disease was 2, with a sensitivity of 96% and a specificity of 78%. The hazard ratio of progression to severe/critical COVID-19 in subjects with a score ≥2 was 42 (11-164) compared with subjects whose score was <2. Fifty-nine percent (43, 73%) of subjects with a score ≥2 developed severe/critical COVID-19 compared with 2 percent (0.3, 9.0%; P < 0.001) of subjects with a score <2.
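The scoring rule described above can be sketched in a few lines. This is a minimal illustration only, assuming the four published thresholds and the cutoff of 2; the names `Admission`, `progression_score` and `is_high_risk` are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass

# Thresholds and cutoff as reported in the text. All identifiers below are
# illustrative assumptions, not the authors' implementation.
AGE_YEARS = 60.0
SPO2_PERCENT = 97.0
CRP_MG_PER_L = 5.75
PT_SECONDS = 12.3
HIGH_RISK_CUTOFF = 2


@dataclass
class Admission:
    """Admission values for one subject (hypothetical container)."""
    age_years: float
    spo2_percent: float
    crp_mg_per_l: float
    pt_seconds: float


def progression_score(a: Admission) -> int:
    """Count how many of the four risk factors are present (1 point each)."""
    return sum([
        a.age_years > AGE_YEARS,
        a.spo2_percent < SPO2_PERCENT,
        a.crp_mg_per_l > CRP_MG_PER_L,
        a.pt_seconds > PT_SECONDS,
    ])


def is_high_risk(a: Admission) -> bool:
    """A score of 2 or more places the subject in the high-risk group."""
    return progression_score(a) >= HIGH_RISK_CUTOFF


if __name__ == "__main__":
    # Made-up example subject: only the age criterion is met.
    subject = Admission(age_years=72, spo2_percent=98.0,
                        crp_mg_per_l=3.1, pt_seconds=11.8)
    print(progression_score(subject), is_high_risk(subject))
```

The comparisons are strict, following the thresholds as written (for example, CRP of exactly 5.75 mg/L scores no point); how boundary values were handled in the study is not stated.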

DISCUSSION
In our study of 98 consecutive, unselected subjects with SARS-CoV-2-infection and COVID-19 (including all hospitalized persons) in an area of 2 million people with sporadic or clustered cases, we identified four admission co-variates that were significantly associated with progression to severe/critical disease. We used these co-variates to develop a predictive score that identified subjects with a 40-fold increased risk of progression of COVID-19 to a severe/critical stage with a sensitivity of 96% and a specificity of 78%.
There are several reports of prognostic and predictive scores of outcomes of COVID-19, although most studies have important biases and are not representative of real-world experience with the SARS-CoV-2 pandemic and with COVID-19 10. For example, in regions where authorities imposed home isolation and social distancing, most persons with mild/moderate COVID-19 were not hospitalized 11. On the other hand, in urban regions with large numbers of cases of COVID-19 and limited intensive care resources such as mechanical ventilation, most hospital admissions were for persons with severe/critical COVID-19 alone [12][13][14]. There were also obvious selection biases as to why and where people were hospitalized in these studies, and persons hospitalized at one or a few centers were unlikely to be representative of the distribution of cases of COVID-19 in a region, especially a region with sporadic or clustered cases of SARS-CoV-2-infection 15,16. In several studies, subjects still in hospital were censored, which biased the interpretation of results 12,17. In contrast, we were able to identify every patient with COVID-19 in our area, all of whom were hospitalized. We also conducted a complete follow-up of all the subjects. These biases are obvious when we consider the 1 death in our study versus an average mortality of 10-20 percent in other studies 14,18.
Our subjects are more likely to resemble those in a typical non-epidemic setting of exposure to SARS-CoV-2; therefore, our prognostic score is more likely to be widely useful 1,19,20.
Similar to other studies, we found that age, CRP, and SpO2 on admission correlated with outcomes 12,21-23. However, unlike other prediction tools 12,17,21,22, our study identified a new risk factor, PT. Previous studies indicate that coagulation disorders are common in patients with severe COVID-19 24,25 and are associated with an increased risk of acute respiratory distress syndrome 26. We suggest early monitoring of PT to predict the likelihood of progression.
Our study has limitations, including its retrospective design, relatively few subjects, and the lack of a validation cohort. We also lacked detailed data on post-admission interventions. However, other than oxygen supplementation and mechanical ventilation, no other intervention has proved effective 27. Also, our aim was to predict outcomes from admission to better allocate medical resources. In conclusion, our score is easily implemented and should help physicians identify persons with COVID-19 who, on admission, are at the greatest risk of developing severe/critical disease.

METHODS
Subjects
This retrospective observational study was conducted at the Fifth Affiliated Hospital of Sun Yat-sen University, the largest tertiary academic hospital in Zhuhai. The Institutional Review Board of the Fifth Affiliated Hospital of

Data
Demographic data, clinical symptoms, and laboratory results were collected and extracted from the hospital electronic medical records. Two investigators independently coordinated and integrated the data with discordances adjudicated by reviewing original records. Subject identifiers were deleted, thereby creating an anonymized dataset. Laboratory assessments included complete blood count with differential, liver and kidney function tests, coagulation tests, and C-reactive protein and lymphocyte subsets.

Definitions
Criteria for the diagnosis of COVID-19 followed the interim guidelines of the National Health Commission, China 29. Confirmation of a case was based on exposure history (including exposure to suspected cluster outbreaks), clinical manifestations (fever and/or respiratory symptoms), chest computed tomography imaging, and results of qRT-PCR for SARS-CoV-2 and of anti-SARS-CoV-2 IgM and IgG antibodies measured by enzyme-linked immunosorbent assay. Classification of COVID-19 severity was based on the interim or 7th edition guidelines of the National Health Commission 29. Severity was stratified on admission and revised based on disease progression during hospitalization. Outcomes were evaluated at the date of last follow-up, discharge or death, and by whether the subject required continued hospitalization. The highest severity during hospitalization was designated as the primary outcome used to develop the predictive model for the likelihood of progression.

Statistics
Descriptive statistics were used to summarize demographic data. Results were reported as medians and IQRs, means with standard deviations, or counts and frequencies. Continuous variables were compared using the t-test or one-way analysis of variance (ANOVA). Categorical dependent parameters were compared using the chi-square test and Fisher's exact test. Cut-off values were identified using the Youden index of the ROC curve. All tests were two-sided, and a P-value < 0.05 was considered significant.
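As an illustration of cut-off selection by the Youden index, the sketch below scans each observed value of a continuous variable as a candidate threshold and keeps the one that maximizes J = sensitivity + specificity - 1. It is a generic, assumption-laden example: the function name `youden_cutoff`, the inclusive threshold rule, and the toy data are hypothetical and do not reproduce the study's software or data.

```python
from typing import Sequence, Tuple


def youden_cutoff(values: Sequence[float],
                  progressed: Sequence[bool],
                  higher_is_risk: bool = True) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Assumption: a subject is flagged when value >= cutoff (or <= cutoff if
    higher_is_risk is False). Ties in J keep the lowest candidate cutoff.
    """
    if len(values) != len(progressed):
        raise ValueError("values and progressed must have the same length")
    positives = sum(1 for p in progressed if p)
    negatives = len(progressed) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")

    best_cutoff, best_j = None, float("-inf")
    for candidate in sorted(set(values)):
        if higher_is_risk:
            flagged = [v >= candidate for v in values]
        else:
            flagged = [v <= candidate for v in values]
        true_pos = sum(1 for f, p in zip(flagged, progressed) if f and p)
        true_neg = sum(1 for f, p in zip(flagged, progressed)
                       if not f and not p)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = candidate, j
    return best_cutoff, best_j


if __name__ == "__main__":
    # Toy data only, not study data: a marker where higher values are riskier.
    marker = [2.0, 4.0, 6.0, 8.0]
    outcome = [False, True, False, True]
    print(youden_cutoff(marker, outcome))
```

For a variable where lower values indicate risk, such as SpO2, the same function can be called with `higher_is_risk=False`.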

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.