C-reactive protein-to-albumin ratio as a biomarker in patients with sepsis: a novel LASSO-COX based prognostic nomogram

This study aimed to develop a C-reactive protein-to-albumin ratio (CAR)-based nomogram for predicting the risk of in-hospital death in sepsis patients. Sepsis patients were selected from the MIMIC-IV database. Independent predictors were determined by multivariate Cox analysis and then integrated to predict survival. The performance of the model was evaluated using the concordance index (C-index), receiver operating characteristic (ROC) curve analysis, and calibration curves. Risk stratification and subgroup analyses of overall survival (OS) were assessed using Kaplan–Meier (K–M) curves. A total of 6414 sepsis patients were included. The C-index of the CAR-based model was 0.917 [standard error (SE): 0.112] for the training set and 0.935 (SE: 0.010) for the validation set. ROC curve analysis showed that the area under the curve (AUC) of the nomogram was 0.881 in the training set and 0.801 in the validation set. The calibration curves showed that the nomogram performed well in both the training and validation sets. K–M curves indicated that patients with high CAR had significantly higher in-hospital mortality than those with low CAR. The CAR-based model has high accuracy for predicting the OS of sepsis patients.

www.nature.com/scientificreports/
At present, diagnosis using a combination of multiple biochemical markers is a much-pursued topic in sepsis research 5-7. The development of sepsis is known to be related to an imbalance between pro-inflammatory and anti-inflammatory responses 8,9. In the early stage of sepsis, pro-inflammatory factors are released in large quantities by activated immune cells, which leads to a hyperimmune response and the generation of a cytokine storm 10. As the disease progresses to the advanced stage, immune suppression or immune paralysis gradually occurs as anti-inflammatory cytokine levels increase in the presence of macrophage inactivation 11. Clinically, serum C-reactive protein (CRP) is one of the most sensitive indicators of tissue damage and infection 12. In addition, the relationship between inflammation and malnutrition is complex 13. Inflammation during sepsis can lead to malnutrition, as it decreases appetite and alters metabolism 14,15. On the other hand, malnutrition can predispose patients to infections, leading to exaggerated inflammation 16. In general, malnourished individuals are at greater risk of developing serious infections, further exacerbating inflammation 17,18. Serum albumin (ALB) has been widely used as an indicator of malnutrition in the clinical setting 19. Recent studies have reported that the C-reactive protein-to-albumin ratio (CAR) can serve as a new predictor for pneumonia, stroke, and cancers 20-23.
Nevertheless, the role of CAR in predicting in-hospital mortality in sepsis patients is currently unclear. Therefore, this study aimed to elucidate the association between CAR and 30-day mortality in sepsis patients in the ICU and to develop a model for predicting the risk of in-hospital death in sepsis patients.

Materials and methods
Data collection. All data, including demographic characteristics, laboratory indicators, drugs, and complications, were extracted from the MIMIC-IV database. Patients confirmed with sepsis were eligible for enrollment. Sepsis was diagnosed according to the Third International Consensus Definitions for Sepsis (Sepsis-3). Briefly, patients with a suspected infection plus a Sequential Organ Failure Assessment (SOFA) score of ≥ 2 were diagnosed with sepsis. According to the criteria, a final total of 6414 eligible sepsis patients were included in this retrospective cohort study. The first author (XZ) obtained access to the database and was responsible for data extraction (certification number 53012064).
Data extraction. In this study, CAR is the primary study variable. The following potential confounders were extracted: demographics (Age, Gender, Race, Insurance, Marital status), drug treatment (Omeprazole, Cefazolin, Amoxicillin Clavulanic Acid, Ceftazidime, Pantoprazole, Miconazole, Linezolid, Metronidazole Flagyl, Heparin, Ciprofloxacin HCl, Erythromycin, Daptomycin, Dobutamine, Dopamine, Dexamethasone, etc.), comorbidities (Acute liver failure, Acute Kidney Injury, Acute Respiratory Failure, Heart Failure, Septic Shock, Venous Thrombosis, Stress Ulcer, Toxic Encephalopathy, Disseminated Intravascular Coagulation), laboratory indicators (Red Blood Cells, White Blood Cells, Platelets, Hemoglobin, Neutrophil, Lymphocyte, Lymphocyte percentage, Monocyte, Reticulocyte, Erythrocyte Specific Volume, Alanine Aminotransferase, Aspartate Transaminase, Creatine Kinase, Creatine Kinase-MB, Alkaline Phosphatase, C-reactive Protein, Serum Albumin, Globulin, Total Protein, Total Bilirubin, Direct Bilirubin, Indirect Bilirubin, Serum Glucose, Serum Creatinine, Blood Urea Nitrogen, Thrombin, Serum Potassium, Serum Sodium, Serum Chloride, Homocysteine, D-dimer, NT-proBNP (N-Terminal Pro-Brain Natriuretic Peptide), Troponin T, Lactate, Ferritin, Total Cholesterol, Triglyceride, High-density Lipoprotein, Low-density Lipoprotein), primary outcome (Admission time, Discharge time, Death time), Simplified Acute Physiology Score II (SAPS II), and Sequential Organ Failure Assessment (SOFA) score. The endpoint event of this study was in-hospital death. All laboratory data were measured on the day of admission. If a variable was recorded more than once in the first 24 h, the value associated with the most severe disease was used. All codes used for comorbidities were based on the recorded International Classification of Diseases (ICD-9 and ICD-10) codes. The data mentioned above were extracted and cleaned using pgAdmin 4 and RStudio. To avoid bias, continuous variables with over 50% missing data and categorical variables with less than 1% positive events were excluded, and missing values were imputed using multiple imputation. Finally, the main study variable CAR was generated.
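The screening rules and the CAR definition described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pgAdmin/R pipeline: the field names (`crp`, `albumin`), the helper names, and the toy records are hypothetical, and the text does not state the measurement units behind the reported CAR values, so the ratio is simply computed on whatever units the inputs carry. Missing inputs are returned as `None` here, whereas the study imputed them before generating CAR.

```python
from typing import Optional


def missing_fraction(values: list) -> float:
    """Share of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def keep_continuous(values: list, max_missing: float = 0.5) -> bool:
    """Keep a continuous variable unless more than 50% of it is missing."""
    return missing_fraction(values) <= max_missing


def keep_categorical(flags: list, min_positive: float = 0.01) -> bool:
    """Keep a binary variable unless fewer than 1% of patients are positive."""
    return sum(flags) / len(flags) >= min_positive


def car(crp: Optional[float], albumin: Optional[float]) -> Optional[float]:
    """C-reactive protein-to-albumin ratio; None if either input is unusable."""
    if crp is None or albumin is None or albumin <= 0:
        return None
    return crp / albumin


# Hypothetical toy records (not MIMIC-IV data).
patients = [
    {"crp": 120.0, "albumin": 3.0},
    {"crp": 45.0, "albumin": 4.5},
    {"crp": None, "albumin": 2.8},
]
car_values = [car(p["crp"], p["albumin"]) for p in patients]
print(car_values)  # [40.0, 10.0, None]
```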

Construction of prediction nomogram.
The 6414 patients were randomly divided into the training set and validation set in a 7:3 ratio. The training set was used to determine the survival-related factors and to establish the nomogram. The validation set was used to verify the nomogram. A total of 43 variables related to the survival of sepsis patients were identified by univariate Cox regression. Least Absolute Shrinkage and Selection Operator (LASSO) regression was subsequently performed to further narrow down the above variables to 30. Variables with statistical significance were then included in the multivariate Cox regression analysis to determine independent predictors. A nomogram was established based on the independent risk factors affecting the prognosis of sepsis patients. The reliability of the model was verified by the C-index, ROC curve, and calibration curve. Finally, patients were divided into high-risk and low-risk groups based on the constructed nomogram. Survival curves were generated using the K-M estimator, and the prognostic difference was determined by the log-rank test. The cut-off value of CAR was set at the median, and this was used to divide patients into high-risk and low-risk groups by CAR. A subgroup analysis was performed to investigate the effect of CAR in different subgroups.
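As a rough sketch of the two simplest steps above, the following Python snippet performs a random 7:3 split and a median-based dichotomization. The seed, the function names, and the toy CAR values are assumptions made for illustration; the paper does not report how the random split was seeded, and the original analysis was run in R and SPSS.

```python
import random
from statistics import median


def split_train_validation(ids, train_fraction=0.7, seed=2023):
    """Shuffle patient ids and cut them into training and validation sets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    n_train = int(len(shuffled) * train_fraction)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


def dichotomize_by_median(values):
    """Label each value 'high' if it exceeds the median, otherwise 'low'."""
    cutoff = median(values)
    return cutoff, ["high" if v > cutoff else "low" for v in values]


train_ids, valid_ids = split_train_validation(range(6414))
print(len(train_ids), len(valid_ids))  # 4489 1925

# Hypothetical CAR values, not taken from the cohort.
cutoff, groups = dichotomize_by_median([3.2, 8.1, 17.5, 25.0, 60.4])
print(cutoff, groups)  # 17.5 ['low', 'low', 'low', 'high', 'high']
```

Rounding the training size down reproduces the 4489/1925 split reported for the 6414 patients.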

Statistical analysis.
Continuous variables with normal distribution are presented as mean ± standard deviation (Mean ± SD), and those without normal distribution are expressed as median and interquartile range [M (Q1, Q3)]. Variables with normal distribution and homogeneity of variance were compared between the two groups using the Student's t-test; otherwise, they were compared using the Mann-Whitney U-test. Categorical data are expressed as total number and composition ratio [n (%)], and the difference between groups was compared using the Pearson χ² test or Fisher's exact test. All tests were two-tailed, and P < 0.05 was considered statistically significant. All statistical analyses were performed using R 4.2.1 and SPSS 22.0.
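To make the categorical comparison concrete, here is a minimal Python sketch of a Pearson χ² test on a 2 × 2 table, applied to the death counts reported in the Results for the training and validation cohorts (135/4489 and 73/1925). It is an illustration only: it uses no continuity correction, the function name is hypothetical, and it is not claimed to be the exact procedure the authors ran in R or SPSS.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the chi-square tail probability reduces to erfc.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: training and validation cohorts; columns: deaths and survivors.
chi2, p = pearson_chi2_2x2(135, 4489 - 135, 73, 1925 - 73)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

For small expected cell counts, Fisher's exact test would be the usual choice instead, as the text notes.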

Results
General patient data. The patient selection flow chart is shown in Fig. 1. Of the 6414 sepsis patients (3500 males and 2914 females) included in this study, 4489 were assigned to the training set and 1925 were assigned to the validation set. The vast majority of patients were Caucasian and married or single, and only a minority had Medicaid insurance. During hospitalization, 61.6% of the patients developed acute renal failure, and 44.81% of the patients developed acute respiratory failure. In the training set, the mean patient age was 62.53 years and the average length of hospital stay was 9.8 days. The mean values of CAR were 17.5 and 17.56 in the training and validation sets, respectively. The in-hospital mortality rate was 3.24% (208/6414), 3.01% (135/4489), and 3.79% (73/1925) in the total, training, and validation cohorts, respectively (log-rank test, training vs. validation cohort: P = 0.104). There were no significant differences in most clinical characteristics between the training and the validation cohorts. The baseline characteristics of all patients are summarized in Table 1 and Supplement Table 1.
CAR is an independent prognostic predictor. A total of 88 covariates with significant differences (P < 0.05) were identified by univariate Cox regression analysis (Table 2). The results indicated that CAR was significantly associated with in-hospital mortality (HR, 1.01; 95% CI, 1.01-1.01; P < 0.001). In addition, 30 predictors were identified by LASSO regression analysis (Fig. 2). These predictors included age, race, insurance, marital status, neutrophil, lymphocyte, lactate, ferritin, CAR, CK-MB, HDL, NT-proBNP, Cefazolin, Ceftazidime, Ceftriaxone, Ciprofloxacin HCl, Daptomycin, Dexamethasone, Metronidazole Flagyl, Amoxicillin Clavulanic Acid, Linezolid, Erythromycin, Miconazole, Dobutamine, Dopamine, Heparin, Pantoprazole, Omeprazole, SAPS II, and SOFA. These variables were included in the multivariate Cox regression analysis, and the results showed that CAR was positively correlated with the risk of in-hospital death among sepsis patients. Notably, CAR might be an independent predictor of in-hospital death in sepsis patients (HR, 1.01; 95% CI, 1.00-1.01; P = 0.0071).

Nomogram construction and validation.
Based on the results of multivariate Cox regression analysis, 23 independent prognostic factors, including age, race, insurance, marital status, NT-proBNP, HDL, lactate, CAR, SOFA, Amoxicillin Clavulanic Acid, Cefazolin, Ceftazidime, Ceftriaxone, Ciprofloxacin HCl, Daptomycin, Dobutamine, Dopamine, Erythromycin, Heparin, Metronidazole Flagyl, Miconazole, Omeprazole, and Pantoprazole, were included in the construction of the nomogram for predicting the in-hospital death of sepsis patients, and the predictive model was also validated using the validation cohort (Fig. 3). The C-index of the model was 0.917 (SE: 0.112) for the training set and 0.935 (SE: 0.010) for the validation set, indicating that the nomogram has a good discriminative ability. Furthermore, 30-day ROC analysis showed that the AUC was 0.881 (95% CI: 0.840-0.924) in the training set and 0.798 (95% CI: 0.709-0.887) in the validation set (Fig. 4). Calibration curve analysis indicated that the nomogram has high prognostic accuracy and clinical value in both the training and validation cohorts (Fig. 5).
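For readers unfamiliar with the C-index, the sketch below shows one common way to compute Harrell's concordance index for right-censored data in plain Python. The function name and the toy data are hypothetical, and the paper does not state which software routine produced its C-index values, so this is an illustration of the metric rather than a reproduction of the analysis.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair is comparable when the patient with the shorter follow-up had an
    observed event. It is concordant when that patient also has the higher
    predicted risk; tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in days, 1 = in-hospital death, 0 = censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.3, 0.2, 0.4]
print(harrell_c_index(times, events, risk))  # 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of predicted risk against observed survival.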
Nomogram-based risk classification system. The risk scores of sepsis patients were predicted by the nomogram, and the cutoff value was determined by the median. Patients were divided into the low-risk (total score ≤ 361.97) and high-risk (total score > 361.97) groups. K-M analysis showed that the prognosis was significantly different between the two groups (P < 0.0001) (Fig. 6A). These data demonstrate that the risk stratification system based on the nomogram can accurately distinguish the survival of sepsis patients.
Subgroup analysis. Subgroup analysis was performed to determine whether the correlation between CAR and OS in sepsis patients was stable across subgroups. When the stratified analysis was performed for CAR, the K-M curve showed that higher CAR values were associated with lower OS in each subgroup population (P < 0.0001) (Fig. 6B).
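The stratification step can be illustrated with a small Kaplan-Meier estimator in plain Python. Only the cutoff of 361.97 comes from the text; the patient scores, follow-up times, and function name are hypothetical, and the log-rank comparison between groups is not implemented here.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival_probability)] for right-censored data."""
    curve = []
    survival = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy cohort: nomogram total points, follow-up days, death flag.
points = [410.0, 300.5, 395.2, 250.0, 380.1, 340.0]
days = [2, 30, 3, 30, 5, 12]
died = [1, 0, 1, 0, 1, 0]

CUTOFF = 361.97  # median total score reported in the text
for label, keep in (("low", lambda p: p <= CUTOFF), ("high", lambda p: p > CUTOFF)):
    idx = [i for i, p in enumerate(points) if keep(p)]
    curve = kaplan_meier([days[i] for i in idx], [died[i] for i in idx])
    print(label, curve)
```

A group with no observed deaths yields an empty list, meaning its estimated survival stays at 1.0 throughout follow-up.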

Discussion
In this study, we showed that CAR was an independent prognostic factor for sepsis patients in the MIMIC-IV database and established a CAR-based risk-prediction nomogram. Our results demonstrated that age, race, insurance, marital status, NT-proBNP, HDL, lactate, CAR, SOFA, Amoxicillin Clavulanic Acid, Cefazolin, Ceftazidime, Ceftriaxone, Ciprofloxacin HCl, Daptomycin, Dobutamine, Dopamine, Erythromycin, Heparin, Metronidazole Flagyl, Miconazole, Omeprazole, and Pantoprazole were independent prognostic factors for sepsis patients.
Analyses of the C-index, ROC curve, and calibration curve showed that the CAR-based nomogram has good predictive performance in the prognosis of sepsis. In addition, the risk stratification system based on the nomogram further indicated that sepsis patients with high CAR had significantly higher in-hospital mortality than those with low CAR, which was further supported by the findings of the subgroup analysis. Taken together, these data indicate that our prediction nomogram has high clinical application value and can help clinicians accurately predict the prognosis of patients and make individualized treatment plans.
Sepsis has always been a major challenge in the clinical treatment of patients with severe infection 24. Due to limited treatment methods, early identification and early intervention are the key to the treatment of sepsis 25. Previous studies have investigated the predictive performance of several biomarkers in the prognosis of sepsis patients, including platelet 26, neutrophil/lymphocyte ratio (NLR) 27, lactate/albumin ratio 28, plasminogen activator inhibitor-1 (PAI-1) 29, signal peptide-CUB-epidermal growth factor-like domain-containing protein 1 (SCUBE-1) 30, and vitamin D receptor 31. CAR has been recently used as a new prognostic marker in COVID-19 32, severe fever with thrombocytopenia syndrome (SFTS) 33, and cardiac arrest 34. However, the predictive performance of CAR in the prognosis of sepsis patients has not been reported.
Currently, the prevailing mechanism for sepsis pathophysiology is the balance theory of host response 35. In sepsis, the loss of balance in immune responses may lead to excessive inflammatory response characterized by the release of inflammatory factors, which in turn triggers systemic inflammation and multiple organ failure. Alternatively, immune imbalance may also result in anti-inflammatory hyperactivity and the production of anti-inflammatory cytokines, which can affect immune cell functions and cause the body to be in a state of imbalance. Thus, inflammatory factors are the most prominent markers throughout the development of sepsis. CRP is an acute phase reaction protein and a commonly used inflammatory marker in clinical practice 36. It was found that the CRP concentration is almost positively correlated with the degree of inflammation and tissue injury 37. Many studies have reported that CRP is helpful in the diagnosis of inflammatory diseases, such as neonatal sepsis 38,39, pneumonia 40, and influenza 41. However, CRP alone cannot distinguish sepsis from SIRS due to its elevated level in both conditions 42. ALB is a protein synthesized by the liver and a conventional indicator for assessing the nutritional status of the body. In recent years, ALB has been shown to play an emerging role in human innate immunity by acting as a host defense agent during infection 43. It was reported that ALB can serve as an independent predictor for in-hospital mortality in hospitalized patients over 90 years of age with acute infectious diseases 44. Similarly, a retrospective cohort study reported that ALB was a predictor for the severity of abdominal sepsis in adult patients 45. Hypoalbuminemia is also commonly present in neonatal sepsis 46, which is more likely to be associated with enhanced clearance from circulation rather than impaired synthesis by the liver 47.
A growing number of studies have pointed out that CAR plays a certain role in the evaluation and prognosis of heart disease 48, cancer 49, and infectious diseases. Elevated CAR level has been shown to be a reliable marker for infective endocarditis (IE) 50 and is associated with increased morbidity of spinal epidural abscess (SEA) 51. Moreover, the CAR level is also higher in complicated appendicitis than in simple appendicitis 52. A recent study has concluded that CAR is an independent risk factor for pneumonia in middle-aged and older Finnish men 22. In the case of neonatal sepsis, it is reported that CAR was an independent predictor for the presence and severity of neonatal sepsis, as well as the presence of sepsis in neonates with pneumonia 53,54. In addition, CAR is an independent predictor for mortality in adult patients with severe sepsis and septic shock receiving early goal-directed therapy in the emergency department. Collectively, these findings indicate that CAR is a readily available and objective hematological biomarker for systemic inflammation.
To our knowledge, this is the first study to examine the relationship between CAR and sepsis in a relatively large population. Nevertheless, the study has several limitations. First, this is a retrospective study with inherent biases, so it cannot characterize the association between CAR and mortality in sepsis patients as well as a prospective study could. Therefore, further large-cohort multicenter prospective studies are warranted. Second, many clinical data that may be related to prognosis were unavailable due to the limitations of public databases. Third, we examined the relationship between mortality and the first measured CAR after admission, which did not allow us to assess the impact of dynamic CAR on prognosis. Last, our model was based on the MIMIC-IV database, which primarily contains U.S. patients. Therefore, the generalizability of the model to the global population remains unclear.

Conclusion
The CAR-based model we developed and validated has good predictive performance for the risk of in-hospital death in sepsis patients. This model can be used as a simple and practical tool for clinicians to identify sepsis patients at high risk of in-hospital death in a timely and accurate manner. Moreover, the model can provide a reference for disease risk stratification and support the development of prognostic treatment strategies and follow-up strategies for prolonging the survival of sepsis patients.

Figure 1. The flow diagram of study sample selection. CRP C-reactive protein, ALB serum albumin.

Figure 2. The LASSO logistic regression-based model for screening predictors: screening retention variables with the optimal value of lambda. (A) Tuning parameter (λ = 0.004832) selection using LASSO-type penalized logistic regression with tenfold cross-validation using minimum criteria. The partial likelihood deviance (binomial deviance) curve was plotted versus log (λ). Dotted vertical lines were drawn at the optimal values by using the minimum criteria and the one SE of the minimum criteria (the 1-SE criteria). (B) LASSO coefficient profiles of the thirty variables of the radiomic features. A coefficient profile plot was plotted versus the log (λ). Each colored line represents the coefficient of each feature. LASSO least absolute shrinkage and selection operator.

Figure 3. The nomogram for predicting 30-day survival for sepsis patients. The left column shows the points bar (top) and 21 parameters, with each to be scored with a vertical line to the points bar based on the different parameter values. The sum of the points is calculated (total points range, 0-650), and a vertical line is drawn from the total points bar to the 30-day survival probability below, to obtain the survival probability of the patient. CAR C-reactive protein-to-albumin ratio, HDL high-density lipoprotein, NT-proBNP N-Terminal Pro-Brain Natriuretic Peptide, SOFA Sequential Organ Failure Assessment.

Figure 4. The 30-day receiver operating characteristic (ROC) analysis in the training (A) and validation sets (B) for the nomogram. AUC area under the curve.

Figure 5. Calibration curve analysis for 30-day survival in the training (A) and validation cohorts (B). The abscissa (x-axis) represents the predicted survival rate and the ordinate (y-axis) represents the actual survival rate. The red dotted line is the reference line (predicted value equals the actual value), the solid grey line is the curve fitting line, and the error bars represent 95% confidence intervals. The calibration curves depict the agreement between predicted probabilities and observed outcomes.

Figure 6. Kaplan-Meier curves for OS stratified by three risk groups in the sepsis patients (A) and subgroup analysis of the correlation between CAR and OS in sepsis patients (B). Kaplan-Meier curves show the cumulative probability of OS according to groups at 30 days. OS overall survival, CAR C-reactive protein-to-albumin ratio.

Table 1 .
Basicindicate that our prediction nomogram has high clinical application value and can help clinicians accurately predict the prognosis of patients and make individualized treatment plans.