Introduction

The novel biomarker neutrophil gelatinase-associated lipocalin (NGAL) shows great promise as a prognostic indicator in COVID-19 patients in the emergency department1,2 and intensive care unit (ICU)3,4,5 as well as in other critically ill patients6,7,8,9,10,11. Being present in many tissues, NGAL is a marker for inflammation and kidney injury12,13. As opposed to the currently used and accepted markers of kidney function, such as creatinine and cystatin c, NGAL is not a marker of filtration but rather of the injury itself. Therefore, NGAL is detectable before any elevation in creatinine occurs following kidney injury14,15, reporting tubular stress in real time in animal models16. A similar quick response is seen in clinical studies where the time of renal insult is known, such as cardiopulmonary bypass during surgery17,18,19,20 and in conjunction with contrast administration during percutaneous coronary intervention21. A link between NGAL and the clinically important outcomes of renal replacement therapy (RRT) initiation and in-hospital death has also been shown in cardiac surgery patients22, as has the potential to improve existing clinical scoring systems23.

In the outpatient setting, NGAL helps detect Kidney Disease Improving Global Outcomes (KDIGO) stage progression in chronic kidney disease24. In the emergency department, NGAL has been associated with subsequent RRT and 90-day mortality for patients with COVID-191,2 and in-hospital mortality for patients without COVID-1914.

In critically ill patients, NGAL predicts acute kidney injury (AKI)6,7,8,9,10, and predicts mortality in patients with systemic inflammatory response syndrome (SIRS)11.

In patients undergoing RRT, NGAL predicts mortality25,26. NGAL has also been suggested to indicate disease progression in patients with cardio-renal syndrome or with AKI and underlying cirrhosis27,28.

NGAL is elevated in asymptomatic SARS-CoV-2 infection21. Furthermore, NGAL has been associated with histopathologic kidney injury, loss of kidney function, dialysis, shock, prolonged hospitalization, and in-hospital death in COVID-19 patients presenting to the emergency department1, as well as with disease severity, critical illness, AKI, in-hospital mortality, 6-week mortality3,4,29,30, and subsequent RRT in non-critical hospitalized COVID-19 patients3,29. Urinary NGAL has shown potential to predict AKI and mortality31,32 in critically ill COVID-19 patients. Length of mechanical ventilation has also been linked to NGAL32.

Plasma NGAL levels are comparable in patients dying without AKI to those diagnosed with AKI. It has therefore been hypothesized that NGAL measured in plasma mainly indicates general disease severity rather than kidney injury33. Plasma and urinary NGAL levels have been shown to have similar discriminative performance and sensitivity for AKI requiring dialysis34, although they exhibit different temporal profiles: urinary NGAL performs better at predicting mortality or dialysis when measured as a peak value during the first 24 h following admission, and the optimal temporal window for sampling both urinary and plasma NGAL is approximately 6–24 h after emergency department presentation35.

As blood is more frequently sampled during ICU care, determining whether these results also hold in plasma would be of clinical value, especially as the utility of plasma NGAL has previously been questioned as a marker of AKI in ICU patients due to it being of mainly neutrophil rather than renal origin36. Even though elevation of urinary NGAL in AKI is associated with a neutrophil component in animal models37, the opposite might not be true in plasma.

Therefore, we aimed to investigate the prognostic utility of plasma NGAL in critically ill COVID-19 patients. Our primary objective was to assess NGAL as a predictor of subsequent initiation of continuous renal replacement therapy (CRRT). As NGAL is an early marker associated with kidney injury, we hypothesized that it would improve prediction compared to conventional tests as part of a multivariable model. The secondary objective was to assess NGAL as a predictor of mortality.

Methods

This prospective multicenter cohort study used data from the Swecrit-Covid-IR database, described in more depth by Didriksson et al.38. The study is part of the more comprehensive Swecrit project, as summarized by Frigyesi et al.39.

Data was collected from patients admitted between May 11, 2020 and May 10, 2021. Except for two patients discharged to other geographical regions and lost to follow-up and one patient whose last follow-up occurred 9 days after intensive care unit (ICU) admission, all survivors were monitored for at least 101 days after ICU admission with a review of vital status. Additional blood samples were also collected at set time intervals, where possible.

Participants were recruited from ICUs in four hospitals in Skåne in southern Sweden, out of which two were university hospitals. Eligible for inclusion were adults admitted for laboratory-confirmed COVID-19 during the study period. The study was conducted in accordance with the Declaration of Helsinki, relevant guidelines, and regulations. Written informed consent was obtained from study participants at admission or at three or twelve-month follow-up. Due to the minimal patient risk presented by study participation, since no intervention was carried out, informed consent from non-survivors who had not been able to consent on admission, due to their illness, was waived. In these cases, the next of kin were informed and allowed to opt out for their late relatives. The study protocol was approved by the Swedish Ethical Review Authority (Dnr 2020-011955).

Blood samples were drawn from patients on ICU admission and on days two and seven in the ICU. Further blood samples were collected at three and twelve-month follow-up visits. Data on demographics, CRRT initiation, and survival until hospital discharge were collected during each patient’s ICU stay and later compiled from medical records. The 90-day survival of discharged patients was determined using the national Swedish population register.

Samples were collected in ethylenediaminetetraacetic acid (EDTA) treated test tubes, aliquoted following centrifugation and stored at -80 degrees Celsius. They were batch-analyzed for NGAL, creatinine, and cystatin c following the end of the data collection period. NGAL was analyzed using a commercial sandwich ELISA (DY1757, R&D Systems, Minneapolis, MN, USA) according to the manufacturer’s recommendations. The total coefficient of variation for the NGAL assay was approximately 6%. Creatinine and cystatin c were analyzed on a Mindray BS380 chemistry analyzer (Mindray Medical International, Shenzhen, China) using IDMS traceable enzymatic creatinine reagents from Abbott Laboratories (Abbott Park, IL, USA) and particle-enhanced turbidimetric cystatin c reagents from Gentian (Moss, Norway).

Since tests were analyzed after the study period had ended, treating clinicians were blinded to all NGAL results, but only partially to creatinine and cystatin c, as these are used in clinical practice. Laboratory staff analyzing tests were blinded to clinical state.

Cases with missing data were excluded from the analysis. Due to their skewness, the creatinine, NGAL, and cystatin c levels were log10 transformed. The difference in NGAL measured on day zero and day two of the ICU stay was calculated by subtracting the former from the latter. The calculated value is referred to as \(\Delta _{\textrm{NGAL}}\) in this report.
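The preprocessing steps above can be sketched in a few lines. The study itself used R; the following is only an illustrative Python sketch with hypothetical function and field names (`complete_cases`, `ngal_day0`, `ngal_day2`). It assumes that \(\Delta _{\textrm{NGAL}}\) is computed on the untransformed concentrations, which the text does not state explicitly.

```python
import math


def complete_cases(records, required):
    """Keep only records where every required field is present (not None)."""
    return [r for r in records if all(r.get(k) is not None for k in required)]


def log10_transform(values):
    """Return log10 of each strictly positive concentration."""
    return [math.log10(v) for v in values]


def delta_ngal(day0, day2):
    """Day-two NGAL minus day-zero NGAL; a positive value means NGAL rose.

    Assumption: the subtraction is done on raw concentrations (ng/ml).
    """
    return day2 - day0


# Hypothetical example records, not study data.
records = [
    {"ngal_day0": 150.0, "ngal_day2": 210.0},
    {"ngal_day0": 120.0, "ngal_day2": None},  # excluded: missing day-two sample
]
usable = complete_cases(records, ["ngal_day0", "ngal_day2"])
deltas = [delta_ngal(r["ngal_day0"], r["ngal_day2"]) for r in usable]
log_ngal = log10_transform([r["ngal_day0"] for r in usable])
```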

Statistical analyses were performed using R40. For hypothesis tests, a level of significance of 5% was used. Several binomial linear candidate models were fitted with and without splines and interactions between the independent variables, using R’s built-in stats package. The area under the curve (AUC) for the receiver operating characteristic curves (ROC) and Brier scores were calculated for comparison, using the pROC41 and yardstick42 packages respectively. The areas under the ROC curves were compared using the method described by DeLong et al.43 using the pROC package. Models with a larger area under the ROC curve were considered superior when selecting candidate models. To further describe the difference between selected models, the integrated discrimination improvement (IDI)44 was calculated, comparing the best models including NGAL to the best models excluding NGAL.
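For readers unfamiliar with the comparison metrics named above, the sketch below shows how the AUC, the Brier score, and the IDI are defined. It is a plain Python illustration with hypothetical function names, not the R code (pROC, yardstick) used in the study, and it does not reproduce DeLong's test. The IDI is written as the difference in discrimination slopes between the new and the old model, which is the usual definition.

```python
def auc(y_true, y_score):
    """Area under the ROC curve as the probability that a randomly chosen
    event has a higher score than a randomly chosen non-event (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def discrimination_slope(y_true, y_prob):
    """Mean predicted risk among events minus mean predicted risk among non-events."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    nonevents = [p for y, p in zip(y_true, y_prob) if y == 0]
    return sum(events) / len(events) - sum(nonevents) / len(nonevents)


def idi(y_true, p_old, p_new):
    """Integrated discrimination improvement of the new model over the old one."""
    return discrimination_slope(y_true, p_new) - discrimination_slope(y_true, p_old)


# Hypothetical outcomes and predicted risks, not study data.
y = [1, 1, 0, 0]
p_without_ngal = [0.6, 0.4, 0.5, 0.3]
p_with_ngal = [0.8, 0.6, 0.4, 0.2]
```

With these toy values, the model with the extra predictor separates events from non-events better on all three measures; an IDI of 0.3 means that the gap in mean predicted risk between events and non-events widened by 30 percentage points.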

In spline models, knots set to the first quartile, the median, and the third quartile were compared to algorithmically chosen knots with four degrees of freedom. Whichever gave the best apparent performance was used in the final model.

Since the study analyzed subsequent CRRT initiation and 90-day mortality, patients with missing data required for one analysis could be included in the other analysis, given that sufficient data were available.

To characterize the patient population, percentages were calculated for categorical variables; similarly, means and standard deviations (SD) were calculated for continuous variables. To compare the characteristics of patients receiving CRRT versus those not receiving CRRT and survivors versus non-survivors, the \(\chi ^{2}\) test was used for categorical variables and the Welch test was used for continuous variables.
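As a reminder of what the Welch test computes for continuous variables, the following minimal Python sketch derives the test statistic and the Welch-Satterthwaite degrees of freedom from summary statistics. The function name and input values are hypothetical, and the p-value step (a lookup in the t distribution, done by R in the study) is deliberately left out.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's unequal-variance t-test from group summaries."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical group summaries, not study data.
t_stat, dof = welch_t(10.0, 2.0, 4, 8.0, 2.0, 4)
```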

To illustrate a possible correlation between NGAL and creatinine in plasma in the study population, locally estimated scatterplot smoothing (LOESS) regression was calculated using the ggplot2 package45 after removing all values under the \(1{\textrm{st}}\) and over the \(99{\textrm{th}}\) percentile of either measurement to reduce the influence of outliers.
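The trimming step that precedes the LOESS fit can be illustrated as follows. This is a Python sketch with a hypothetical function name; the percentile interpolation rule (`method="inclusive"`) is an assumption, since the text does not state which quantile definition was used, and the LOESS fit itself is not reproduced here.

```python
import statistics


def trim_pairs(ngal, creatinine, lower=1, upper=99):
    """Drop every (NGAL, creatinine) pair in which either value lies below the
    lower or above the upper percentile of its own measurement."""

    def bounds(values):
        # 99 cut points; index 0 is the 1st percentile, index 98 the 99th.
        cuts = statistics.quantiles(values, n=100, method="inclusive")
        return cuts[lower - 1], cuts[upper - 1]

    n_lo, n_hi = bounds(ngal)
    c_lo, c_hi = bounds(creatinine)
    return [
        (n, c)
        for n, c in zip(ngal, creatinine)
        if n_lo <= n <= n_hi and c_lo <= c <= c_hi
    ]


# Hypothetical paired measurements, not study data.
ngal_values = list(range(1, 102))
creatinine_values = list(range(101, 0, -1))
trimmed = trim_pairs(ngal_values, creatinine_values)
```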

To illustrate a possible correlation between plasma NGAL and mortality, 90-day survival was recoded as 0 and 90-day non-survival as 1, after which LOESS regression was calculated for plasma NGAL vs survival. Mean plasma NGAL concentration on admission was compared between survivors and non-survivors using a one-sided Student’s t-test.

To illustrate the performance of our mortality model, the population was stratified into tertiles based on predicted risk of death, after which a Kaplan-Meier plot for each tertile was plotted using the survival package46. The net reclassification improvement (NRI)44 for the same risk strata was calculated, comparing it to the best model not using NGAL.
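The stratification, reclassification, and survival-curve steps can be sketched as below. This is an illustrative Python sketch with hypothetical function names, not the R code (survival package) used in the study. It assumes that both models are classified against the same cut points and that the categorical NRI is the sum of the net proportion of events moving to a higher stratum and the net proportion of non-events moving to a lower stratum.

```python
import statistics
from bisect import bisect_right


def tertile_cutpoints(risks):
    """Two cut points splitting predicted risks into three equally sized groups."""
    return statistics.quantiles(risks, n=3, method="inclusive")


def stratum(risk, cutpoints):
    """Index of the risk stratum (0 = lowest) that a predicted risk falls into."""
    return bisect_right(cutpoints, risk)


def categorical_nri(y_true, p_old, p_new, cutpoints):
    """Net reclassification improvement for fixed risk strata."""
    up_e = down_e = up_n = down_n = 0
    for y, old, new in zip(y_true, p_old, p_new):
        move = stratum(new, cutpoints) - stratum(old, cutpoints)
        if y == 1:
            up_e += move > 0
            down_e += move < 0
        else:
            up_n += move > 0
            down_n += move < 0
    n_events = sum(y_true)
    n_nonevents = len(y_true) - n_events
    return (up_e - down_e) / n_events + (down_n - up_n) / n_nonevents


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each observed death time.

    events[i] is 1 for death at times[i] and 0 for censoring at times[i].
    """
    survival, at_risk, curve = 1.0, len(times), []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= sum(1 for ti in times if ti == t)
    return curve


# Hypothetical values, not study data.
nri = categorical_nri([1, 1, 0, 0], [0.3, 0.6, 0.3, 0.1], [0.6, 0.6, 0.1, 0.1], [0.2, 0.5])
curve = kaplan_meier([5, 10, 10, 90], [1, 1, 0, 0])
```

In the toy example, one of two events moves up a stratum and one of two non-events moves down, giving an NRI of 1.0; applying `kaplan_meier` separately to each stratum would give one curve per tertile.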

Results

Of 607 patients screened, 65 were in the ICU for reasons other than their laboratory-confirmed COVID-19, 25 were missed for inclusion, and 19 did not consent. The remaining 498 patients were eligible for inclusion; after removing incomplete cases, 494 patients could be analyzed for CRRT initiation and 399 patients could be analyzed for survival. See Fig. 1 for the flow chart.

Figure 1

Flow chart. COVID-19, Coronavirus Disease 2019. CRRT, Continuous Renal Replacement Therapy. SAPS, Simplified Acute Physiology Score.

Of the patients missing NGAL data, 16 did not survive until the second sample could be obtained. An additional 36 patients had been discharged from the ICU before the second sample, possibly causing the second sample to be missed in the ward receiving the patient after ICU discharge. Three of the patients missing SAPS 3 data were admitted from ICUs at hospitals outside the study region, and the rest have no data registered on the exact route of admission to the ICU.

Of the 494 patients in the CRRT analysis, 70 received CRRT in the ICU. Of the 399 patients in the mortality analysis, 154 did not survive beyond 90 days. Patient characteristics are summarized in Tables 1 and 2.

Table 1 Characteristics of patients analyzed for initiation of continuous renal replacement therapy (CRRT).
Table 2 Characteristics of patients analyzed for survival.

In our analysis of subsequent CRRT initiation, the group receiving CRRT had a higher prevalence of preexisting hypertension (71% vs. 52%), complicated diabetes (31% vs. 11%), and chronic kidney disease (14% vs. 2%). On admission, the CRRT recipients had a higher mean plasma NGAL level (243 ng/ml, SD 215 ng/ml vs. 148 ng/ml, SD 144 ng/ml). Their mortality was higher both in the ICU (57% vs. 26%) and over 90 days (61% vs. 35%). (See Table 1.)

In our analysis of mortality, the non-survivors had a higher mean clinical frailty scale value (3.1, SD 1.1 vs. 2.8, SD 1.0) and Charlson comorbidity index (4, SD 2 vs. 2, SD 2). On admission, the non-survivors had a higher mean plasma NGAL level (184 ng/ml, SD 172 ng/ml vs. 141 ng/ml, SD 117 ng/ml). They received CRRT to a larger degree than the survivors (25% vs. 9%). (See Table 2.)

Our models for predicting subsequent CRRT initiation yielded AUCs ranging from 0.87 to 0.95. The model achieving the largest area under the ROC curve used NGAL combined with creatinine and cystatin c as independent variables (AUC 0.95). The IDI of this model, compared to the best model excluding NGAL (which used creatinine and cystatin c), was 7.4%. The differences in the areas under the curves were statistically significant, apart from NGAL alone (AUC 0.92) versus creatinine alone (AUC 0.91, p = 0.271) and NGAL alone versus creatinine combined with cystatin c (AUC 0.92, p = 0.500). Figure 2 reports areas under the ROC curves, including their confidence intervals. All p-values are reported in the caption of Fig. 2.

Figure 2

Receiver operating characteristics curves for predicting subsequent continuous renal replacement therapy. The areas under the curves were compared using one-sided DeLong’s test, yielding the following p-values: cystatin c vs. creatinine < 0.001, creatinine and cystatin c < 0.001, NGAL 0.002, NGAL, creatinine, and cystatin c < 0.001; creatinine vs. creatinine and cystatin c 0.021, NGAL 0.271, NGAL, creatinine, and cystatin c 0.001; creatinine and cystatin c vs. NGAL 0.500, NGAL, creatinine, and cystatin c 0.004; NGAL vs. NGAL, creatinine, and cystatin c 0.007. AUC, area under curve. CI, 95% confidence interval. NGAL, neutrophil gelatinase-associated lipocalin.

Our models for predicting 90-day mortality yielded AUCs ranging from 0.63 to 0.83. The model achieving the largest area under the ROC curve used age, sex, NGAL on admission, and \(\Delta _{\textrm{NGAL}}\) between day zero and two in the ICU as independent variables (AUC 0.83). The IDI of this model, when compared to the model using age and sex (the best model excluding NGAL), was 2.0%. When predicted risk was split into tertiles, the NRI calculated between the same two models was 8.1%. The model with the next largest AUC used age, sex, and NGAL as independent variables (AUC 0.80, p = 0.005 for the comparison with the best model). Figure 3 reports areas under the ROC curves, including their confidence intervals. All curves differed significantly. All p-values are reported in the caption of Fig. 3.

Figure 3

Receiver operating characteristics curves for predicting 90-day mortality. The areas under the curves were compared using one-sided DeLong’s test, yielding the following p-values: NGAL vs. age and sex < 0.001, age, sex, and NGAL < 0.001, age, sex, NGAL, and \(\Delta _{\textrm{NGAL}}\) < 0.001; age and sex vs. age, sex, and NGAL 0.030, age, sex, NGAL, and \(\Delta _{\textrm{NGAL}}\) 0.001; age, sex, and NGAL vs. age, sex, NGAL, and \(\Delta _{\textrm{NGAL}}\) 0.005. AUC, area under curve. CI, 95% confidence interval. NGAL, neutrophil gelatinase-associated lipocalin.

Figure 4

Correlation between neutrophil gelatinase-associated lipocalin (NGAL) and creatinine concentrations between the \(1{\textrm{st}}\) and \(99{\textrm{th}}\) percentiles. Blue line fitted to data using locally estimated scatterplot smoothing (LOESS) with the gray band showing the 95% confidence interval.

The relationships of NGAL with creatinine and with 90-day mortality are plotted in Figs. 4 and 5, respectively. Regression lines in the figures were computed using LOESS. In Fig. 5, means were compared using a one-sided Student’s t-test.

Figure 5

Admission neutrophil gelatinase-associated lipocalin (NGAL) vs. 90-day mortality. The dotted line represents the means of respective groups (difference in means p < 0.001, calculated using one-sided Student’s t-test). Blue line fitted to data using locally estimated scatterplot smoothing (LOESS) with the gray band showing the 95% confidence interval.

Figure 6

Kaplan-Meier plot showing actual 90-day survival stratified by predicted risk of death calculated using the best performing model in our study. The model uses age, sex, neutrophil gelatinase-associated lipocalin (NGAL), and \(\Delta _{\textrm{NGAL}}\) between day zero and two in the intensive care unit. Bands show 95% confidence intervals. Table 3 shows the net reclassification improvement for the model, compared to the best model excluding NGAL, using the same risk strata.

Figure 6 presents Kaplan-Meier curves showing 90-day survival of patients stratified into tertiles of predicted mortality risk. The bands in the figure show 95% confidence intervals. Table 3 shows reclassification for the model, compared to the best model excluding NGAL, using the same risk strata.

Table 3 Predicted risks and reclassification of 90-day mortality risk using the best performing model in our study.

Discussion

Our study shows that plasma NGAL, in addition to conventional tests for kidney function, can improve the prediction of subsequent initiation of CRRT compared to conventional biomarkers alone in critical COVID-19. Furthermore, NGAL on ICU admission and an early increase in NGAL can improve mortality prediction compared to age and sex alone.

This is an important finding, as the rapid rise of NGAL in response to renal stress enables clinicians to start treatment promptly and avoid unnecessary deterioration—something essential in the critically ill whose physiologic reserves are already strained47.

When compared to models not using NGAL, the discriminatory ability for subsequent CRRT initiation of our model utilizing NGAL, creatinine, and cystatin c, as measured by the area under the ROC curve, increased less in relative terms than the discriminatory ability of our model utilizing age, sex, NGAL, and \(\Delta _{\textrm{NGAL}}\) for prediction of mortality. In absolute terms, our CRRT model did, however, perform better than our mortality model, which implies that NGAL might be more useful for CRRT prediction in clinical practice.

Previous research has shown that NGAL in urine predicts mortality31 and subsequent initiation of CRRT32 in critical COVID-19. In the initial resuscitative phase of intensive care, patients do not always produce urine, making reliance on urine tests for rapid diagnostics unreliable. The finding that plasma NGAL can be used instead may also simplify testing, as blood samples are more routinely handled in the ICU, and NGAL can potentially be analyzed in the same samples as other tests. This increases its utility.

Our findings align with previous research in the area, showing the potential of NGAL as a diagnostic tool. Elevation in NGAL has been documented even in asymptomatic SARS-CoV-2 infection21. Despite this, NGAL has been found useful in assessing risk for dialysis2 and poor outcome1 in patients presenting to the emergency department with COVID-19. Severe inflammation does not seem to diminish its discriminatory capability for AKI10, although it may make tailoring of cut-off values and models necessary48.

A strength of our study was that we could follow up a large proportion of the surviving patients for more than three months, which enabled us to use 90-day mortality rather than less patient-centered, shorter-term survival as an outcome. The multicenter design included university and community hospitals, which increases the generalizability of our results. The fact that we acquired sequential blood tests also enabled the analysis of dynamics in NGAL. A weakness was, however, the frequency of sampling. Our study sampled NGAL on days zero, two, seven, and 90, yielding a limited temporal resolution. Because elevation of NGAL is detectable up to 48 hours before AKI diagnosis8 and dynamics over 48 hours have been linked to mortality48, it is possible that more frequent testing would have provided data for even better prediction models. Furthermore, our zero-day sample is from the day of ICU admission, i.e. when the patients had already developed critical illness. NGAL sampled before ICU admission should be considered for future studies, as this would also allow for the prediction of ICU need altogether.

Moreover, the analysis methods used in our study did not allow us to distinguish between monomeric NGAL, of renal origin, and dimeric NGAL, of neutrophil origin, which has previously been proposed as a way of improving diagnostics when measuring NGAL in urine49. Measurement of the different forms of NGAL and their ratio in blood might be valuable, and it is an area that warrants further investigation.

NGAL varies significantly with age and sex, with higher levels seen in females and older individuals24. Even though our mortality model considered this, as age is a strong predictor of mortality in COVID-1938, our CRRT model did not. Stratifying patients in different age and sex groups or looking at relative change or rate of change in NGAL rather than absolute levels could result in a superior model. However, the uneven male-to-female distribution in our study population did not lend itself to any far-reaching conclusions about the topic. A weakness of our study is also the fact that creatinine is used both to define AKI and as a guide when deciding to start CRRT, which can introduce biases.

The fact that NGAL did not significantly outperform creatinine individually but substantially improved diagnostics as part of a multivariable model implies that constructing such models may be more fruitful than finding a perfect individual prognostic marker. Even though NGAL is a good predictor of AKI, so are several other markers6,7. In an era where statistical models with the help of machine learning and artificial intelligence are ever more capable of complex pattern recognition, looking at interactions between several factors promises to become more and more relevant in the intensive care unit of tomorrow, e.g. combining NGAL with radiological data4.

It is also possible that improved diagnostics and more sensitive tests will require us to revisit the definition of conditions such as AKI15 or open up new therapeutic pathways to target13. However, as our study was mainly exploratory, further development and validation of clinical prediction models are needed.

In summary, our study shows that plasma NGAL predicts subsequent initiation of CRRT and mortality in critical COVID-19. Plasma NGAL may be valuable for future clinical scoring systems and prediction models.