Predictive value of neutrophil-to-lymphocyte ratio for the fatality of COVID-19 patients complicated with cardiovascular diseases and/or risk factors

Previous studies have reported that a high neutrophil-to-lymphocyte ratio (NLR) is associated with disease severity and poor prognosis in COVID-19 patients. We aimed to investigate the clinical implications of NLR in patients with COVID-19 complicated with cardiovascular diseases and/or their risk factors (CVDRF). In total, 601 patients with known NLR values were selected from the CLAVIS-COVID registry for analysis. Patients were categorized into quartiles (Q1, Q2, Q3, and Q4) according to baseline NLR values, and demographic and clinical parameters were compared between the groups. Survival analysis was conducted using the Kaplan–Meier method. The diagnostic performance of the baseline and follow-up NLR values was tested using receiver operating characteristic (ROC) curve analysis. Finally, two-dimensional mapping of patient characteristics was conducted using t-distributed stochastic neighbor embedding (t-SNE). In-hospital mortality increased significantly with the baseline NLR quartile (Q1, 6.3%; Q2, 11.0%; Q3, 20.5%; and Q4, 26.6%; p < 0.001), and cumulative mortality rose accordingly. The pairwise log-rank test revealed significant differences in survival for Q1 vs. Q3 (p = 0.017), Q1 vs. Q4 (p < 0.001), Q2 vs. Q3 (p = 0.034), and Q2 vs. Q4 (p < 0.001). However, baseline NLR was not identified as an independent prognostic factor in a multivariate Cox proportional hazards regression model. The area under the curve for predicting in-hospital death based on baseline NLR was only 0.682, whereas that of follow-up NLR was 0.893. The two-dimensional patient map with t-SNE showed a cluster characterized by high mortality with high NLR at follow-up, but this cluster did not necessarily overlap with the population with high NLR at baseline. NLR may have prognostic implications in hospitalized COVID-19 patients with CVDRF, but its significance depends on the timing of data collection.

As evidence accumulates, it has become clear that the presence of cardiovascular disease (CVD) is closely related to the prognosis of COVID-19 patients [1,2]. Early studies reported that patients who required intensive care were more likely to have CVD [3], and myocardial injury was associated with fatal outcomes in COVID-19 [4]. More recently, Cereda et al. reported that the coronary calcium score contributes to stratifying the risk of complications in COVID-19 patients [5]. Similarly, virus-related cardiac injury has also been highlighted through investigations of hospitalized patients [6]. Viral infections are often associated with lymphopenia, and COVID-19 is no exception. Several studies have found a correlation between disease severity and lymphopenia [7–9]. On the other hand, the role of neutrophils in COVID-19 is also attracting increasing attention [10]. A recent study by Parackova et al. demonstrated that neutrophils from COVID-19 patients induced T-cell polarization, leading to a reduction in the percentage of Th1 cells [11]. In this context, it is suggested that the balance between neutrophils and lymphocytes reflects the disease activity of COVID-19. The neutrophil-to-lymphocyte ratio (NLR) is a biomarker of systemic inflammatory status that can easily be obtained from the differential white blood cell count [12]. NLR, calculated as a simple ratio between the neutrophil and lymphocyte counts, reflects the balance between acute and chronic inflammation and is predictive of mortality, even in the general population [13]. A series of recent studies have shown that NLR is an independent risk factor for critical illness and hospital mortality in COVID-19 patients [14,15]. In 2020, Qin et al. reported that plasma T lymphocyte levels were significantly reduced, while neutrophil levels were augmented in patients with severe COVID-19 compared with those in patients with mild symptoms [7].
Importantly, there were significantly more cases of CVD in patients with severe symptoms. Therefore, we believe that the comorbidity of CVDs should be considered when investigating the clinical significance of NLR in COVID-19 patients. In this study, we sought to investigate the prognostic significance of NLR in patients with COVID-19 complicated with CVD and/or its risk factors (CVDRF).

Methods
Study design and population. This was a retrospective analysis conducted using the Clinical Outcomes of COVID-19 Infection in Hospitalized Patients with Cardiovascular Diseases and/or Risk Factors (CLAVIS-COVID) registry. This study was approved by the Ethics Committee of Ehime Prefectural Central Hospital (no. 02-22), and the study protocol complied with the tenets of the Declaration of Helsinki. The CLAVIS-COVID registry is a retrospective, observational, national, multicenter study that included an adult population with CVDRF hospitalized for COVID-19 in Japan. The main aim of the registry was to evaluate the characteristics and clinical outcomes of hospitalized COVID-19 patients with CVDRF. The study protocol including the opt-out consent method was approved by the review board of each institution, and all patients provided informed consent to participate in the study. This clinical study was registered with the University Hospital Medical Information Network Clinical Trial Registry (UMIN-ID: UMIN000040598; further details accessible at https://upload.umin.ac.jp/cgi-open-bin/ctr_e/ctr_view.cgi?recptno=R000046132) before the first patient was enrolled, in accordance with the recommendations of the International Committee of Medical Journal Editors. Detailed inclusion/exclusion criteria, decision of hospitalization/discharge, and definition of collected data are described in the original article [16]. Briefly, a total of 1518 patients were recruited from 49 hospitals from January 1 to May 31, 2020. Among all participants, 693 were complicated with CVDRF. Cardiovascular risk factors were defined as hypertension, diabetes mellitus, and dyslipidemia.
Pre-existing CVD was defined as a history and/or manifestations upon admission of any of the following: heart failure, coronary artery disease, myocardial infarction, peripheral artery disease, valvular heart disease, cardiac arrhythmia, pericarditis, myocarditis, congenital heart disease, pulmonary hypertension, deep vein thrombosis, pulmonary embolism, aortic dissection, aortic aneurysm, cerebral infarction/transient ischemic attack, heart transplantation, cardiac arrest, and the use of cardiac devices (e.g., a pacemaker, implantable cardioverter-defibrillator, cardiac resynchronization therapy device, or left ventricular assist device). COVID-19 was diagnosed based on a positive polymerase chain reaction test of nasal or pharyngeal swab specimens in all patients. All patients admitted to the participating hospitals and enrolled in this study were discharged by November 8, 2020, the deadline for data transfer. Clinical data, including symptoms, demographics, medical history, home medications, baseline comorbidities, physical findings, laboratory test results, radiography and chest computed tomography findings, electrocardiography and echocardiography results, treatment information, and outcomes, were obtained from electronic medical records using data collection forms. All laboratory and imaging data were obtained at the time of admission. We defined the data at "follow-up" as the results of the final blood test performed before hospital discharge, regardless of the form of discharge. According to this definition, baseline and follow-up values are the same in cases wherein only one blood test was performed during hospitalization.
A list of all studied and excluded variables is available through a Mendeley data repository (https://doi.org/10.17632/66djg6mmzf.2). As the first step of the analysis, patients were divided into four groups according to the quartiles of baseline NLR values, as previously reported [13,14], to outline the characteristics of the cohort. Patients whose NLR data were not available were excluded from the analysis. A schematic of the study population is shown in Fig. 1. The contribution of NLR to in-hospital mortality was analyzed as the primary endpoint using the statistical methods described below.
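The quartile grouping described above can be sketched in a few lines of Python. This is a minimal illustration, not the registry's actual code: the function names, the example counts, and the choice of the "inclusive" quantile method are assumptions, and the exact cut-point convention used in the study is not stated.

```python
from statistics import quantiles

def nlr(neutrophils, lymphocytes):
    """Neutrophil-to-lymphocyte ratio from two counts in the same unit."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes

def quartile_labels(values):
    """Label each value Q1..Q4 using the sample's own quartile cut points."""
    # Assumption: "inclusive" interpolation; other conventions shift the cut points slightly.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    labels = []
    for v in values:
        if v <= q1:
            labels.append("Q1")
        elif v <= q2:
            labels.append("Q2")
        elif v <= q3:
            labels.append("Q3")
        else:
            labels.append("Q4")
    return labels

# Hypothetical (neutrophil, lymphocyte) counts, e.g. in 10^3 cells per microliter
counts = [(3, 2), (5, 2), (5, 1), (6, 1), (8, 1), (10, 1), (15, 1), (20, 1)]
baseline_nlr = [nlr(n, l) for n, l in counts]
groups = quartile_labels(baseline_nlr)
print(list(zip(baseline_nlr, groups)))
```

Patients without an NLR value would be dropped before this step, as in the exclusion rule stated above.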
www.nature.com/scientificreports/
Conventional statistical analysis. Data are shown as percentages for categorical variables and medians (interquartile ranges, IQR) for continuous variables. For between-group comparisons, one-way ANOVA or Kruskal-Wallis tests were used for continuous variables, depending on data distribution, and χ² tests were used for categorical variables. Kaplan-Meier survival analysis was performed to compare 30-day in-hospital mortality among individuals in each NLR quartile. Multivariable Cox proportional hazards models were used to determine the hazard ratios and 95% confidence intervals (CI) for each factor. The validity of the proportional hazards assumption was verified using scaled Schoenfeld residuals. Receiver operating characteristic (ROC) curves were generated to obtain the area under the curve (AUC) as a measure of how well NLR values predict mortality. The optimal cut-off value for predicting 30-day mortality was determined based on the Youden index. Statistical significance was defined as a P-value of < 0.05. All statistical analyses were performed using SciPy, a Python library, and the SPSS statistical package (Version 12, SPSS Inc, Chicago IL, USA).
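To make the ROC-based steps concrete, the following is a minimal pure-Python sketch of the AUC and of a Youden-index cut-off. It is not the authors' SciPy/SPSS workflow: the function names and the toy data are hypothetical, and a patient is assumed to be classified as high risk when the NLR is at or above the threshold.

```python
def auc(scores, labels):
    """AUC as the probability that a random death scores higher than a
    random survivor (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(scores, labels):
    """Return the threshold maximizing J = sensitivity + specificity - 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        # Assumption: score >= t is a positive (high-risk) call
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t

# Hypothetical NLR values and outcomes (1 = in-hospital death)
nlr_values = [1.5, 2.5, 5.0, 6.0, 8.0, 10.0, 15.0, 20.0]
died = [0, 0, 0, 0, 1, 0, 1, 1]
print(auc(nlr_values, died), youden_cutoff(nlr_values, died))
```

The pairwise-comparison form of the AUC is equivalent to the area under the empirical ROC curve and avoids numerical integration, which keeps the sketch short.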

Clustering analysis and data visualization.
We used t-distributed stochastic neighbor embedding (t-SNE) [17] to visualize patients' clinical characteristics. Scikit-learn, a Python machine learning library, was used for the analysis. All demographic and clinical variables at baseline and follow-up were included in the analysis, except for those that were missing in > 30% of patients. Missing values were replaced with the item mean, as previously described [18]. After confirming that the major clusters were consistently identified across different values of perplexity and iteration numbers (Supplementary Fig. S1), the default parameters of scikit-learn (dimension = 2, perplexity = 30, learning rate = 200, and iteration number = 1000) were used to create a 2D map. Parameters of interest (mortality, intubation, baseline NLR, and follow-up NLR) were displayed on each data point as a heat map.
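The preprocessing described above (dropping variables missing in more than 30% of patients, then mean imputation) might look like the following pure-Python sketch. The variable names and values are hypothetical, and `None` is assumed to mark a missing entry; the registry's actual data layout is not described here.

```python
from statistics import mean

def drop_sparse_columns(table, max_missing=0.30):
    """Keep variables whose fraction of missing (None) entries is at most max_missing."""
    kept = {}
    for name, column in table.items():
        missing = sum(v is None for v in column) / len(column)
        if missing <= max_missing:
            kept[name] = column
    return kept

def impute_mean(column):
    """Replace each None with the mean of the observed values in that column."""
    fill = mean(v for v in column if v is not None)
    return [fill if v is None else v for v in column]

# Hypothetical variables for five patients; None marks a missing value
table = {
    "age": [70, 65, None, 80, 75],
    "crp": [1.0, None, 3.0, 5.0, 7.0],
    "bnp": [None, None, 120.0, None, 90.0],  # 60% missing, so it is excluded
}
filtered = drop_sparse_columns(table)
imputed = {name: impute_mean(col) for name, col in filtered.items()}
print(imputed)
```

The imputed matrix would then be handed to a t-SNE implementation, such as scikit-learn's `TSNE` class with two output dimensions and a perplexity of 30, to obtain the 2D coordinates.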
Results
Survival analysis according to NLR quartiles. Kaplan-Meier curves of in-hospital mortality according to baseline NLR quartile are shown in Fig. 2. Briefly, cumulative mortality increased as the quartile of baseline NLR increased. The pairwise log-rank test revealed significant differences in survival for Q1 vs. Q3 (p = 0.017), Q1 vs. Q4 (p < 0.001), Q2 vs. Q3 (p = 0.034), and Q2 vs. Q4 (p < 0.001). There were no significant differences in survival between Q1 and Q2 (p = 0.706) or between Q3 and Q4 (p = 0.121). Multivariable Cox regression analysis revealed that older age, male sex, higher BMI, higher creatinine, and higher CRP values were significantly associated with 1-month mortality, while the baseline NLR was not (Table 3).
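For readers unfamiliar with how such cumulative mortality curves are built, the Kaplan-Meier product-limit estimator can be sketched as below. The data are hypothetical, and treating patients discharged alive as censored is an assumption of this sketch; the handling of censoring in the registry analysis is not detailed here.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times: follow-up in days; events: 1 = death, 0 = censored
    (assumed here to mean discharged alive).
    """
    survival = 1.0
    curve = []
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk  # product-limit update
        curve.append((t, survival))
    return curve

# Hypothetical follow-up for six patients in one NLR quartile
days = [5, 8, 12, 12, 20, 30]
events = [1, 0, 1, 0, 1, 0]
curve = kaplan_meier(days, events)
for t, s in curve:
    print(f"day {t}: survival {s:.4f}, cumulative mortality {1 - s:.4f}")
```

Computing one such curve per quartile and comparing them pairwise with a log-rank test corresponds to the analysis reported above.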
Prognostic value of the baseline and follow-up NLR for disease fatality. As shown in Fig. 3, the AUC for predicting in-hospital death based on baseline NLR was only 0.682. In contrast, the AUC for predicting in-hospital death by the follow-up NLR was 0.893. The optimal cut-off value of the baseline NLR for predicting in-hospital death was 5.39.
In Fig. 4, we identified two distinct clusters in the fourth quadrant (lower right), one of which was characterized by high in-hospital mortality. The population that required mechanical ventilation was also seen in the fourth quadrant and seemed to overlap with the population with a high baseline NLR. Patients who survived in the fourth quadrant had a lower NLR at the follow-up.

Discussion
To date, several studies have reported that severe systemic inflammation is associated with a higher incidence of CVD [12,19–22]. The results of the current study illustrate the clinical implications of the NLR in COVID-19 patients with CVDRF. The main findings were that in-hospital mortality was higher among patients with higher baseline NLR, but the predictive performance of baseline NLR for fatality was insufficient. Similarly, survival analysis showed that baseline NLR was not significantly associated with 30-day mortality, unlike other parameters, including age and sex. In contrast, NLR at follow-up was significantly higher in deceased patients than in those who survived, leading to a high area under the ROC curve for the prediction of mortality. Further analysis using unsupervised machine learning-based visual mapping revealed that a cluster with a high baseline NLR tended to require mechanical ventilation but did not necessarily show high mortality. Previous reports have revealed an association between NLR and disease severity or mortality in COVID-19 patients. However, the cut-off values for prognosis prediction vary across studies, probably due to the heterogeneity of the cases studied [23–26]. In addition, the cut-off value calculated in this study (baseline NLR > 5.39) was higher than any of the previously reported values. According to a study by Caillon et al., NLR was not selected as an important variable in their mortality prediction model [26,27]. A possible factor for the varying results could be the timing of data collection. A recent study by Jimeno et al. showed that the peak NLR value and the rate of NLR increase, but not the NLR value at hospital admission, are significantly associated with mortality in COVID-19 patients [28]. Although the prevalence of comorbidities has not been reported in detail, their findings are consistent with our results that later data are more reflective of disease severity.
Interestingly, the median number of days from symptom onset to hospital admission in the data of Jimeno et al. was the same as in ours (7 days). During the study period, the Japanese government mandated the hospitalization of all patients with COVID-19 regardless of disease severity [29]. Therefore, laboratory data at the time of hospitalization may have been collected before the onset of the disease or at a relatively mild stage compared with reports from other countries. In any case, it should be noted that the follow-up blood tests in our study were collected at the end of hospitalization; therefore, they are not clinically useful in predicting disease severity.
Another unique aspect of our research lies in our application of the cluster analysis method using unsupervised machine learning (specifically, t-SNE). Analysis using machine learning has become a trend in the field of cardiovascular medicine owing to its advantages over existing statistical methods [30]. A nonlinear dimensionality reduction technique, t-SNE is a manifold learning algorithm that is commonly used for the visualization of high-dimensional data in genomic analysis. The use of this algorithm is not limited to genomic data but also includes the analysis of electronic medical record data and posturography data in neurodegenerative diseases [31,32]. Recently, De Canniere et al. reported the efficacy of two-dimensional (2D) visualization of 6-min walking test data using t-SNE-based mapping [33]. The 2D map allows for the simultaneous assessment of the relative similarity of all subjects in our dataset, along with the distribution of their clinical characteristics. In this study, we visualized the different distributions of NLR quartiles according to the time at which the data were obtained (admission or follow-up). We believe that this method is useful for phenotyping patient groups, as it can represent high-dimensional data in a human-interpretable (2D) way. In summary, we have shown that while NLR clearly reflects disease activity, it does not necessarily predict future disease severity based on values at admission. Further validation with therapeutic intervention is required to confirm the usefulness of the NLR as a biomarker.

Table 3. Factors associated with 30-day mortality in hospitalized patients with COVID-19. Columns: Covariates; Hazard ratio (95% CI); P-value. BMI body mass index, Cre creatinine, CRP C-reactive protein.

Limitations
The current study has several limitations. First, this was a retrospective study, and there were considerable missing values for baseline serum biomarkers, especially cTn and BNP/NT-proBNP. The missing data might result in our univariate and multivariate analyses of cardiac biomarkers differing from the results of previous studies. In addition, patients with mild disease had only one blood test performed during hospitalization, which may have led to bias due to missing data. The second limitation was the study population. As mentioned earlier, this study used a registry that enrolled patients with relatively mild disease, which may not necessarily reflect the current status of most hospitalized patients. Recently, the proportion of new variant strains of SARS-CoV-2 has increased in hospitalized patients, and the variant strains have been reported to differ from conventional strains in infectivity and severity of disease [34,35]. Therefore, the influence of the variant strains is a major limitation that cannot be addressed in this study. Third, we were unable to show a clear benefit of using the ratio of neutrophils to lymphocytes rather than either count alone. As shown in Supplementary Fig. S2, the advantage of using NLR only emerged at the time of follow-up. Fourth, when we excluded CRP from the covariates in the multivariable Cox proportional hazards regression model, the baseline NLR became an independent factor for predicting 1-month mortality (Supplementary Table S1). Therefore, multicollinearity was suspected between CRP level and NLR. Lastly, the CLAVIS-COVID registry analyzed in this study focuses on patients with CVDRF, and no blood test data are available for the control group (subjects without CVDRF). Therefore, it is unclear whether patients with CVDRF exhibit a higher NLR than those without.
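One simple way to probe the suspected multicollinearity between CRP and NLR is to compute their correlation and the corresponding variance inflation factor (VIF). The sketch below is illustrative only: the data and function names are hypothetical, and the closed form VIF = 1 / (1 - r²) holds only when exactly two predictors are considered; with the full covariate set, each VIF would come from regressing one covariate on all the others.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x, y):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1 / (1 - r * r)

# Hypothetical CRP (mg/dL) and baseline NLR values for five patients
crp = [1, 2, 3, 4, 5]
nlr_baseline = [2, 4, 5, 4, 5]
print(pearson_r(crp, nlr_baseline), vif_two_predictors(crp, nlr_baseline))
```

A VIF well above 1 indicates that the two covariates carry overlapping information, which is one possible explanation for NLR losing significance once CRP enters the model.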