A rapid screening model for early predicting novel coronavirus pneumonia in Zhejiang Province of China: a multicenter study

Novel coronavirus pneumonia (NCP) has spread widely in China and several other countries. Identifying this pneumonia early among large numbers of suspected cases is a major challenge for clinicians. The aim of the study was to develop a rapid screening model for the early prediction of NCP in a Zhejiang population and to assess its utility in other areas. A total of 880 participants who were initially suspected of NCP from January 17 to February 19, 2020 were included. Potential predictors were selected via stepwise logistic regression analysis. The model was established based on epidemiological features, clinical manifestations, white blood cell count, and pulmonary imaging changes, with an area under the receiver operating characteristic curve (AUROC) of 0.920. At a cut-off value of 1.0, the model could identify NCP with a sensitivity of 85% and a specificity of 82.3%. We further developed a simplified model by combining the geographical regions and rounding the coefficients, with an AUROC of 0.909, as well as a model without epidemiological factors, with an AUROC of 0.859. The study demonstrated that the screening model is a helpful and cost-effective tool for the early prediction of NCP and has great clinical significance given the high activity of NCP.


Scientific Reports
| (2021) 11:3863 | https://doi.org/10.1038/s41598-021-83054-x www.nature.com/scientificreports/
currently named 2019 novel coronavirus (2019-nCoV) 1 . The disease has swept across China rapidly through human-to-human transmission [2][3][4] . As of February 27, 2020, more than 78,000 people had been confirmed to be infected and more than 2700 had died in China 5 . As the number of patients soared, scholars summarized the clinical characteristics of NCP [6][7][8] . Symptoms at disease onset included fever, cough, headache, vomiting, and diarrhea. A normal or decreased leukocyte count was common. Radiologic abnormalities such as ground-glass opacity and patchy shadowing on chest X-ray or computed tomography (CT) were marked characteristics. Acute respiratory distress syndrome, arrhythmia and shock could also occur in severe cases. To date, detection of 2019-nCoV by real-time reverse transcription polymerase chain reaction (RT-PCR) has been regarded as the diagnostic gold standard 9 .
Nevertheless, false-negative results on initial RT-PCR testing occurred in a number of cases 10 . In addition, the time-consuming process, the short supply of kits, and the difficulty of obtaining qualified samples hindered early diagnosis and treatment as well as prompt isolation of patients. It was therefore necessary to establish a rapid diagnostic model to screen patients at high risk of 2019-nCoV infection.
In this study, we aimed to develop a novel screening scale to identify highly suspected subjects based on epidemiological data, clinical manifestations, and laboratory and radiological examinations. Given the evolution of the pandemic, we also combined the epidemic regions into one parameter, and then dropped the epidemiological factors from the model altogether, to establish two further simplified but still effective models. It is the first study for screening and predicting NCP in Zhejiang Province, China, and the approach could be applied nationwide and even worldwide.

Results
Clinical characteristics of the study participants. Of the 880 subjects enrolled in the study, 21 were excluded due to missing data, leaving 859 participants eligible for evaluation. Of these, 339 were diagnosed with NCP on the basis of positive detection of 2019-nCoV by real-time RT-PCR, while the other 520 were ruled out after at least two negative RT-PCR results. The 21 excluded subjects declined chest X-ray or CT because of pregnancy or planned pregnancy; their RT-PCR tests were all negative.
The characteristics of the participants are shown in Table 1. Among the 339 NCP patients, 188 (55.46%) were male, and the mean age was 46.88 ± 14.65 years. NCP patients were significantly older than those without NCP (P < 0.001). Of the confirmed patients, 33.63% had a history of travel or residence in Wuhan within 14 days, and 27.43% had been in contact with patients with fever or respiratory symptoms from Wuhan within 14 days. In addition, 35.10% of the confirmed cases were related to cluster outbreaks in families or workplaces.
Predictors associated with NCP. We performed both univariate and multivariate logistic regression analyses to assess predictors of NCP (Table 2). In the univariate analysis, the following were associated with higher odds of NCP: age; co-existing diseases; travel or residence history within 14 days in Wuhan, in neighboring areas of Wuhan in Hubei Province, or in other areas with persistent local transmission or communities with definite cases; contact within 14 days with patients with fever or respiratory symptoms from Wuhan, from neighboring areas of Wuhan in Hubei Province, or from other areas with persistent local transmission or communities with definite cases; relationship with a cluster outbreak; presence of sputum, fatigue, dyspnea, diarrhea or abdominal pain, and muscle soreness; absence of nasal congestion or sore throat; decreased WBC, lymphocyte, and neutrophil counts; and imaging changes on chest X-ray or CT.
The above characteristics were entered into the subsequent multivariate analysis, which revealed that the following nine characteristics were independent risk factors for NCP: travel or residence history within 14
The AUROC of the model was 0.920 (95% CI 0.902-0.938), indicating a greater capability to discriminate NCP than WBC count (AUROC 0.727, 95% CI 0.692-0.762) or chest imaging score (AUROC 0.795, 95% CI
Table 1. Clinical characteristics of the study participants. NCP novel coronavirus pneumonia, COPD chronic obstructive pulmonary disease, CT computed tomography. a One valvular heart disease, one atrial fibrillation, one HIV infection, two cases of ankylosing spondylitis, and one anxiety disorder. b Three cases of chronic nephritis, two cases of cerebral infarction, one depression, one schizophrenia, one rheumatoid arthritis, one gout, one hypothyroidism, and one trauma. c Fever is defined as body temperature > 37.5 °C. d Normal range of white blood cell count: 3.50-9.50 × 10^9/L. e Normal range of lymphocytes: 1.10-3.20 × 10^9/L. f Normal range of neutrophil cells: 1.80-6.30 × 10^9/L.
Derivation of a simplified model. We further combined all the geographical regions into one parameter, "epidemic areas with persistent local transmission or communities with definite cases". The results of the univariate and multivariate logistic analyses are shown in Table 4. To develop a simplified model, we rounded the coefficients, which gave the following model.
Simplified model score (model 2) = 1 (if contacting patients with fever or respiratory symptoms from areas with persistent local transmission or community with definite cases within 14 days) + 3 (if relating to a cluster outbreak) + 1 (if having fatigue) + 2 (if having dyspnea) − 1 (if having nasal congestion) + 3 (if feeling muscle soreness) − 0.3 × WBC count + 2 × pulmonary imaging score.
The AUROC was 0.909 (95% CI 0.889-0.929) (Fig. 2).
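Because the simplified score is a plain weighted sum, it can be evaluated by hand or in a few lines of code. The following is a minimal sketch only: the function and argument names are hypothetical, the WBC count is assumed to be expressed in units of 10^9/L, and the imaging score is assumed to follow the coding given in Methods.

```python
def simplified_model_score(contact_epidemic_area, cluster_outbreak, fatigue,
                           dyspnea, nasal_congestion, muscle_soreness,
                           wbc_count, imaging_score):
    """Model 2 score as written in the text (argument names are hypothetical).

    Binary predictors are 1 for "yes" and 0 for "no". wbc_count is assumed
    to be in 10^9/L; imaging_score is assumed to use the Methods coding.
    """
    return (1 * contact_epidemic_area
            + 3 * cluster_outbreak
            + 1 * fatigue
            + 2 * dyspnea
            - 1 * nasal_congestion
            + 3 * muscle_soreness
            - 0.3 * wbc_count
            + 2 * imaging_score)


# Hypothetical patient: exposure contact, fatigue, WBC 5.0 x 10^9/L,
# bilateral multiple ground glass opacity (imaging score 1).
example_score = simplified_model_score(1, 0, 1, 0, 0, 0,
                                       wbc_count=5.0, imaging_score=1)
print(example_score)
```

The patient values above are invented for illustration and are not taken from the study data.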
In fivefold cross-validation, the average AUROC was 0.862, with a standard deviation of 0.028. The Hosmer-Lemeshow χ² was 11.962 (P = 0.153). The optimal cut-off value was 0.7, with a sensitivity of 82.3% (95% CI 76.3-87.0%), a specificity of 86.2% (95% CI 82.9-88.9%), a diagnostic accuracy of 84.6% (95% CI 82.2-87.0%), and a Youden index of 0.685.
Table 5. Consequently, a predictive model without epidemiological history was established.
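The sensitivity, specificity, diagnostic accuracy and Youden index reported above are related by simple arithmetic on a 2 × 2 table. The sketch below shows that arithmetic. The counts are illustrative: they were back-calculated to be consistent with the reported percentages for 339 NCP and 520 non-NCP participants and are not reported in the source. The function name is likewise an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic metrics from the four cells of a 2 x 2 table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    youden = sensitivity + specificity - 1
    # Likelihood ratios are undefined at the extremes; report infinity there.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "youden": youden,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Illustrative counts only (not reported in the source): 339 NCP, 520 non-NCP.
metrics = diagnostic_metrics(tp=279, fn=60, tn=448, fp=72)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the Youden index is sensitivity + specificity − 1 = 0.823 + 0.862 − 1 = 0.685, which agrees with the value given in the text.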

Discussion
In this study, we compared the characteristics of NCP patients with those of suspected individuals in whom NCP was finally ruled out. Having analyzed the clinical and epidemiological features, we developed a rapid screening model for predicting NCP in a Zhejiang population. The model included four epidemiological features (travel or residence history in Wuhan within 14 days; contact within 14 days with patients with fever or respiratory symptoms from Wuhan; contact with patients from other areas with persistent local transmission or communities with definite cases; and relationship with a cluster outbreak) and five clinical manifestations (fatigue, dyspnea, muscle soreness, decreased WBC count, and imaging changes on chest X-ray or CT). The diagnostic performance of the established scale was excellent, with an AUROC of 0.920. At a cut-off value of > 1.0, the model could detect NCP with a sensitivity of 85% and a specificity of 82.3%. Because NCP is a communicable disease, the cost of a false negative is high, so it is essential to avoid missed diagnoses, particularly during a surging outbreak. When the score was higher than 4.0, subjects were very likely to have NCP (specificity 98.3%); they should be isolated immediately, and further tests are highly recommended. Conversely, during the outbreak, large numbers of frightened patients with flu-like symptoms crowded into hospitals, putting clinicians under great pressure. A model score of < −0.5 indicated a very low probability of 2019-nCoV infection (sensitivity 97.9%). Clinicians can choose the cut-off value that best suits actual demands.
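The three cut-off values discussed above (−0.5, 1.0 and 4.0) divide the score range into four bands. A minimal sketch of that banding follows. The band labels and the function name are assumptions made for illustration; the thresholds refer to the full model score, whose coefficients are not reproduced in this text, and the defaults can be changed to suit local demands.

```python
def triage_band(score, very_low=-0.5, cut_off=1.0, high=4.0):
    """Map a full-model score to an illustrative triage band.

    Thresholds come from the text; band labels are hypothetical.
    """
    if score > high:
        return "high"            # specificity 98.3%: isolate and test further
    if score > cut_off:
        return "above cut-off"   # screen-positive at the > 1.0 cut-off
    if score < very_low:
        return "very low"        # sensitivity 97.9% at this threshold
    return "below cut-off"       # screen-negative, but not in the lowest band


for s in (-1.0, 0.0, 2.0, 4.5):
    print(s, triage_band(s))
```

Scores exactly equal to a threshold fall into the more conservative neighbouring band here, because the text uses strict inequalities (> 1.0, higher than 4.0, < −0.5).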
With 2019-nCoV continuing to spread, the Zhejiang model established in this study, as the first rapid screening diagnostic model for NCP, is of great significance. Unlike virus isolation or RT-PCR testing, the screening scale is economical, simple and fast, and can be used to select potential patients for further RT-PCR examination.
Nevertheless, the model has several limitations. Firstly, the enrolled participants were limited to Zhejiang Province, which restricts the application of the screening model in other regions; in particular, the epidemiological characteristics may not hold in another area. Secondly, our research was confined to early and rapid screening, without adequate information on disease progression and prognosis. Last but not least, as the epidemic develops, the weights of certain characteristics, especially the epidemiological ones, should be modified to increase the scope and accuracy of the diagnostic model.
To eliminate the effects of location-specific factors such as the Wuhan-related criteria, which might no longer apply as the pandemic evolves, a simplified model was developed by combining all the epidemic regions. Furthermore, we repeated the analysis after dropping the epidemiological data. Both of the resulting models proved effective. In addition, fivefold cross-validation was repeated for each model during internal validation to quantify any optimism in the predictive performance, and the Hosmer-Lemeshow χ² test was used to measure calibration. Further nationwide and even worldwide studies are needed to assess the utility of this model, which may be subject to further adjustment and calibration if necessary.
According to the recent literature, most patients with NCP present with fever, cough, fatigue, and myalgia in the initial stage 11 . Atypical symptoms include diarrhea, nausea, headache, and sore throat. As the illness progresses, a proportion of patients gradually develop dyspnea, especially those with impaired immune function 12 . Complications such as acute respiratory distress syndrome (ARDS), arrhythmia and shock are probably associated with a poor prognosis 7 .
The most common laboratory abnormalities observed are leukopenia and lymphocytopenia. Moreover, hypoalbuminemia, elevated CRP and lactate dehydrogenase (LDH), and decreased CD8 count have been reported in some cases 6,13 . The most frequent imaging manifestation is patchy or punctate ground-glass opacity involving single or multiple pulmonary lobes 14 . Alterations on chest CT can reflect the severity and progression of NCP 15 . However, 2019-nCoV infection can also present with normal pulmonary imaging, particularly in the early stage, which suggests that epidemiological information, clinical manifestations and imaging need to be combined in screening and diagnosis 16 .
At present, RT-PCR remains the confirmatory criterion for the diagnosis of 2019-nCoV infection. RT-PCR is a technology combining reverse transcription (RT) of RNA with polymerase chain reaction (PCR) amplification of cDNA. It has been widely used in the laboratory to detect different coronaviruses (such as SARS-CoV and MERS-CoV) because of its high specificity and sensitivity. However, the RT-PCR test can be time-consuming, and a short supply of test kits may not meet the needs of a growing infected population. Furthermore, RT-PCR for 2019-nCoV may be falsely negative due to unstable kits or unstandardized sampling 17 . Xiao et al. reported that some patients who met the diagnosis of NCP based on clinical and imaging findings had negative results for viral RNA 18 . In another study, initially negative RT-PCR results turned positive on repeated testing in a number of patients 10 . For the purpose of timely isolation and early treatment, it is necessary to establish a rapid screening diagnostic model to identify patients highly suspected of NCP. The authors are currently trying to develop a procedure for fast scoring in clinical application based on this model.
Table 5. Predictors associated with NCP (dropping the epidemiological history). NCP novel coronavirus pneumonia, β regression coefficient, SE standard error, OR odds ratio, CT computed tomography. Bold text indicates statistical significance (P < 0.05).
In conclusion, the study established a rapid screening model for predicting NCP in a Zhejiang population. In addition, we developed a simplified model by combining the epidemic regions and rounding the coefficients, as well as a model without any epidemiological factor. The models can be used as a simple, fast, and cost-effective tool for screening NCP with significant clinical value.

Methods
Patients. From January 17 to February 19, 2020, a total of 880 patients who were suspected of 2019-nCoV infection were recruited from hospitals in Hangzhou, Wenzhou, Shaoxing, Taizhou, Ningbo and Jiaxing in Zhejiang Province. The study was approved by the Ethics Committee of Zhejiang Provincial People's Hospital (2020KY006); in addition, all research was performed in accordance with relevant guidelines 19 . A waiver of informed consent was approved by the Ethics Committee of Zhejiang Provincial People's Hospital because the subjects would not be exposed to any risk in this observational study and the subjects' information was anonymized at collection and prior to analysis.
Epidemiological history and clinical manifestations were collected for each individual. Age, gender, region, coexisting diseases, body temperature, results of routine blood tests and chest X-ray or CT were recorded for all participants. Throat swab, sputum, blood, or stool samples were collected to test for 2019-nCoV nucleic acid using real-time RT-PCR. If the first RT-PCR test was negative, samples were collected again after 24 h for a repeat test.
Eligibility. Patients admitted to the fever clinics who were initially suspected of NCP were included in the study. Suspected or confirmed cases were diagnosed according to the 5th edition of the Chinese recommendations for diagnosis and treatment of pneumonia caused by 2019-nCoV 19 .
Suspected cases. NCP should be suspected if subjects conform to any one of the criteria in the epidemiological history and any two of the standards in clinical presentations. If there is no epidemiological history, suspected cases should meet three of the criteria in clinical presentations.
Epidemiological history: (1) Subjects with a travel or residence history in Wuhan or its neighboring areas, or other areas with persistent local transmission, or communities with definite cases within 14 days; (2) Subjects with a history of contacting confirmed cases with 2019-nCoV infections (positive nucleic acid detection) within 14 days; (3) Subjects with a history of contacting patients with fever or respiratory symptoms who have a travel or residence history in Wuhan or its neighboring areas, or in other areas with persistent local transmission, or communities with definite cases within 14 days; (4) Subjects who are associated with a cluster outbreak, which is defined as one definite case with NCP in family or place of work within 14 days, along with other patients with fever or respiratory symptoms.
Clinical presentations: (1) Fever and/or respiratory symptoms; (2) Typical chest imaging features of NCP, such as ground-glass opacity, infiltrating shadows, and pulmonary consolidation; (3) Normal or decreased white blood cell (WBC) count, or decreased lymphocyte count in the early stage of the disease.
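The suspected-case definition above reduces to a counting rule: at least one epidemiological criterion together with at least two clinical presentations, or, when there is no epidemiological history, all three clinical presentations. A minimal sketch of that rule is given below; the function name and the list-of-booleans input format are assumptions made for illustration.

```python
def is_suspected_case(epidemiological, clinical):
    """Apply the suspected-case counting rule.

    epidemiological: four booleans, one per epidemiological criterion (1)-(4).
    clinical: three booleans, one per clinical presentation (1)-(3).
    """
    n_epi = sum(bool(flag) for flag in epidemiological)
    n_clinical = sum(bool(flag) for flag in clinical)
    if n_epi >= 1:
        return n_clinical >= 2
    # Without any epidemiological history, three clinical criteria are needed.
    return n_clinical >= 3


# Travel history plus fever and typical imaging, normal blood count not met.
print(is_suspected_case([True, False, False, False], [True, True, False]))
# No epidemiological history and only two clinical presentations.
print(is_suspected_case([False, False, False, False], [True, True, False]))
```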
Confirmed cases. Suspected cases that meet any one of the following criteria: (1) Positive 2019-nCoV nucleic acid in throat swab, sputum, blood, or stool samples by real-time RT-PCR; (2) Genetic sequencing of samples showing high homology with the known 2019-nCoV.
Establishment of the rapid screening scale. Based on the epidemic situation in Zhejiang Province, we included age, gender, co-existing diseases, the epidemiological parameters, clinical symptoms, body temperature, WBC count, lymphocyte count, neutrophil count, and chest imaging to establish a novel diagnostic model of NCP. The epidemiological features and symptoms were treated as binary variables and were scored as "1" if "yes" and "0" if "no". Chest radiologic changes were classified as "normal", "unilateral local patchy shadowing", "bilateral multiple ground glass opacity", "bilateral diffuse ground glass shadowing with pulmonary consolidation", and "other imaging alterations such as pulmonary nodule or pleural effusion", and were scored as "0", "0.5", "1", "2" and "0.3", respectively.
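The coding scheme described above can be written as a small lookup table. The sketch below is illustrative: the scores are those listed in the text, while the short category keys, the function names, and the record layout are hypothetical.

```python
# Imaging categories and scores as listed in the text; keys are hypothetical.
IMAGING_SCORE = {
    "normal": 0.0,
    "unilateral_local_patchy_shadowing": 0.5,
    "bilateral_multiple_ggo": 1.0,
    "bilateral_diffuse_ggo_with_consolidation": 2.0,
    "other_alterations": 0.3,  # e.g. pulmonary nodule or pleural effusion
}


def encode_yes_no(answer):
    """Score a binary feature or symptom: 1 if "yes", 0 if "no"."""
    answer = answer.strip().lower()
    if answer not in ("yes", "no"):
        raise ValueError(f"expected 'yes' or 'no', got {answer!r}")
    return 1 if answer == "yes" else 0


def encode_record(record):
    """Encode a raw record (hypothetical dict layout) into numeric predictors."""
    encoded = {}
    for name, value in record.items():
        if name == "imaging":
            encoded[name] = IMAGING_SCORE[value]
        elif isinstance(value, str):
            encoded[name] = encode_yes_no(value)
        else:
            encoded[name] = value  # continuous values such as WBC count
    return encoded


example_record = encode_record({
    "fatigue": "yes",
    "dyspnea": "no",
    "wbc_count": 4.2,
    "imaging": "bilateral_multiple_ggo",
})
print(example_record)
```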
The samples were classified as NCP (339 individuals) or non-NCP (520 individuals) according to their real-time RT-PCR results, since detection of 2019-nCoV nucleic acid by real-time RT-PCR was considered the gold standard. After the derivation of the screening model, the diagnostic performance of the established scale was also verified. We further combined all the geographical regions into one parameter, "epidemic areas with persistent local transmission or communities with definite cases", and developed a simplified model by rounding the coefficients. Moreover, we dropped the epidemiological parameters that are likely to become outdated given the evolution of the pandemic, and repeated the analysis.
Statistical analysis. Statistical analyses were conducted using SPSS software (version 22.0) for Windows (SPSS, Chicago, IL). Continuous variables are presented as mean ± standard deviation and were compared using Student's t-test; categorical variables were compared using the chi-squared test. For multiple comparisons, one-way analysis of variance (ANOVA) was performed. Univariate logistic regression analyses were conducted to assess the factors associated with NCP. The parameters with statistical significance were entered into a multivariate logistic regression model to further identify independent predictors of NCP. To identify candidate predictors, we performed a stepwise logistic regression analysis (P value to enter = 0.05 and P value to remove = 0.10). A model based on the results of the multiple logistic regression analysis was established to screen for NCP; furthermore, fivefold cross-validation was repeated 10 times to evaluate the performance of the model and to examine whether the model was overfitted. Model calibration was evaluated using the Hosmer-Lemeshow χ² test.
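Repeating fivefold cross-validation 10 times produces 50 train/test splits, each obtained from a fresh shuffle of the participants. The sketch below only illustrates how such splits might be generated. It is not the SPSS procedure used in the study; the function name and seed are assumptions, and the splits are not stratified by outcome because the source does not state whether stratification was used.

```python
import random


def repeated_kfold(n_samples, k=5, repeats=10, seed=0):
    """Yield (train_indices, test_indices) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # new random partition for every repeat
        folds = [indices[i::k] for i in range(k)]
        for i in range(k):
            test = folds[i]
            train = [j for m, fold in enumerate(folds) if m != i for j in fold]
            yield train, test


# 859 evaluable participants, as in the study.
splits = list(repeated_kfold(859))
print(len(splits))  # 10 repeats x 5 folds = 50
```

In each split the model would be refitted on the training indices and its AUROC computed on the held-out fold; the mean and standard deviation of those AUROCs then summarize the cross-validated performance.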
The area under the receiver operating characteristic curve (AUROC) with its 95% CI was used to assess the predictive accuracy of the screening model for determining NCP 20 . Bootstrapping with 500 resamples was applied in R (version 4.0.3) to overplot the point-wise 95% CIs of the ROC curves (not to re-estimate the regression coefficients). Optimal cut-off values were set, and the corresponding sensitivities, specificities, diagnostic accuracies, positive likelihood ratios, and negative likelihood ratios of the model were calculated. A two-sided P value < 0.05 was considered statistically significant.
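To make the AUROC and the bootstrap step concrete, the sketch below computes the AUROC as the probability that a randomly chosen NCP case scores higher than a randomly chosen non-NCP case (ties count one half), and then derives a percentile bootstrap interval from 500 resamples. This is an illustration under stated assumptions: the study used R for point-wise confidence bands of the ROC curves, whereas the percentile interval for the AUROC itself, the function names, and the toy data are assumptions introduced here.

```python
import random


def auroc(scores, labels):
    """AUROC via pairwise comparison of positive (1) and negative (0) cases."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(scores, labels, n_resamples=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (resampling cases with replacement)."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    while len(estimates) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the classes
            estimates.append(auroc([scores[i] for i in idx], ys))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Toy data, invented for illustration only.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(auroc(toy_scores, toy_labels))
print(bootstrap_auroc_ci(toy_scores, toy_labels))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred participants; a rank-based formulation would be preferable for much larger data sets.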