Introduction

Acute kidney injury (AKI) is reported in 52.9–57.3% of critically ill patients and is strongly associated with increased morbidity and mortality1,2,3. In addition, AKI survivors have an increased risk of developing chronic kidney disease (CKD), CKD progression, and end-stage renal disease (ESRD). Factors associated with an increased risk of long-term complications include diabetes, older age, number of AKI episodes, AKI severity, and serum albumin4,5,6. Recently, AKI duration was identified as another prognostic factor for short-term and long-term mortality7,8,9,10.

Novel biomarkers measured at the time of intensive care unit (ICU) admission have been reported to predict short-term and long-term outcomes. Examples include urinary neutrophil gelatinase-associated lipocalin (uNGAL), urinary interleukin-18 (IL-18), urinary kidney injury molecule-1 (KIM-1), and the combination of urinary tissue inhibitor of metalloproteinases-2 and insulin-like growth factor-binding protein-7 (TIMP-2xIGFBP-7)11,12,13. uNGAL is a 25-kDa protein belonging to the lipocalin family. It is produced in renal epithelia and leukocytes in response to tubular injury and systemic inflammation. High uNGAL has been used to predict AKI14,15,16,17, to discriminate intrinsic AKI from pre-renal AKI18,19, and to predict renal non-recovery and in-hospital mortality20,21,22 as well as long-term CKD progression, ESRD, and mortality11,12,23. Previous predictive models have utilized uNGAL to predict AKI, in-hospital renal replacement therapy (RRT), or death21,24,25. However, no studies have assessed the clinical utility of uNGAL across the overall spectrum of outcomes, ranging from in-hospital mortality, RRT, and renal non-recovery to survival and renal function after hospital discharge.

King Chulalongkorn Memorial Hospital has employed the uNGAL test as a routine measurement in AKI patients since 2015. This study first aimed to evaluate the role of uNGAL in predicting persistent AKI in an adult population. Second, we aimed to assess, using multivariable logistic regression, the usefulness of uNGAL measurement in combination with standard clinical covariates for prediction of 30-day and 365-day major adverse kidney events in a heterogeneous adult population.

Results

Population

During the study period, there were 1,607 patients with at least one uNGAL measurement; after excluding duplicates, 1,322 AKI patients were included in the analysis. Among AKI patients, 23.1% had transient AKI and 76.9% had persistent AKI. At 30 days, MAKE30 occurred in 45.1%. At 365 days, 1,307 patients were available for MAKE365 analysis; 61.7% had MAKE365 (Fig. 1).

Figure 1

Study flow chart.

Most patients were male (55%), and the median age was 66 (IQR, 55–80) years. Baseline SCr was obtained from pre-admission values in 947 (71.6%) patients. We used MDRD equation back-estimation in 5.6% and the first SCr on admission in 22.8%. uNGAL was requested in the ward, ICU, and emergency department in 70.7%, 24.4%, and 4.9% of all patients, respectively. Indications for uNGAL request are presented in Supplementary Table 1.

Urine chemistry and uNGAL for prediction of persistent versus transient AKI

Most of the AKI population (76.9%) had persistent AKI. These patients more frequently presented with AKI stage 3, sepsis, and CKD, and had a higher median SCr at diagnosis and a higher peak SCr. They were less likely than transient AKI patients to have AKI from an ischemic cause (p < 0.001). Those with persistent AKI had a higher proportion of MAKE30 and MAKE365, including their individual components. At 30 days, renal recovery occurred in 55.3% of those with persistent AKI versus 77.6% of those with transient AKI (p < 0.001) (Table 1).

Table 1 Comparison between urine NGAL, FENa, FEuric, and FEurea in patients with transient AKI versus persistent AKI.

Median uNGAL concentrations were 603 (IQR 124, 2004) ng/mL in persistent AKI versus 103 (IQR 72, 273) ng/mL in transient AKI (p < 0.001, Fig. 2A). Univariate logistic regression analysis revealed AKI staging, sepsis, chronic kidney disease, FENa, FEuric, and logₑ uNGAL to be associated with persistent AKI (p < 0.001), while ischemic cause was a protective factor (p < 0.001). After adjusting for these covariates in a multivariate logistic regression analysis, logₑ uNGAL remained significant (Table 2).

Figure 2

Box plots showing the distribution of uNGAL concentration in persistent versus transient AKI (A), MAKE30 versus non-MAKE30 (B), and MAKE365 versus non-MAKE365 (C). The box represents the 25th and 75th percentiles and the line within the box represents the median. Pairwise comparisons were made with the Mann-Whitney U test.

Table 2 Univariate and multivariate logistic regression analysis for persistent AKI.

To compare the performance of each individual biomarker in the prediction of persistent AKI, subjects with missing values for one or more biomarkers were excluded from the ROC curve analysis. Overall, there were 446 patients with available uNGAL, uNGAL/Cr, FENa, FEuric, and SCr collected on the same day. All studied biomarkers were significant predictors of persistent AKI. The uNGAL/Cr ratio showed the best discrimination of persistent versus transient AKI with an AROC of 0.76 (95% CI = 0.72–0.81), comparable to uNGAL (AROC 0.75, 95% CI = 0.70–0.80) but superior to the other biomarkers (Fig. 3).

Figure 3

Area under the receiver operating characteristic curve for urine NGAL, urine NGAL to Cr ratio, FENa, FEuric, and Cr at AKI diagnosis for prediction of persistent AKI; NGAL, neutrophil gelatinase-associated lipocalin; FENa, fractional excretion of sodium; FEuric, fractional excretion of uric acid; Cr, creatinine; AKI, acute kidney injury.

MAKE outcomes at 30 days

The overall MAKE30 incidence was 45.1%. The incidence of each endpoint contributing to MAKE30 was 28.3% for death, 12.0% for RRT, and 4.8% for persistent renal dysfunction. Median uNGAL concentrations in patients with and without MAKE30 were 846 (IQR 176, 2907) and 179 (IQR 96, 704) ng/mL, respectively (p < 0.001, Fig. 2B). To evaluate the performance of our multivariate logistic models for predicting MAKE30 and MAKE365, we randomized the entire dataset into training and validation datasets, representing 70% and 30% of the observations, respectively; patient characteristics were comparable between these groups (Supplementary Table 2). In the training dataset, 417/925 patients developed MAKE30. Table 3 shows the results from the multivariate model for MAKE at 30 days. An increased risk of MAKE30 was associated with higher uNGAL concentration, male sex, ICU location versus other wards, AKI stage 3 versus stages 1 and 2, sepsis, ischemic cause of AKI, malignancy, and chronic liver disease.

Table 3 Results from multivariable logistic regression model of factors associated with major adverse kidney events at 1 month (MAKE30) and 1 year (MAKE365).

We used the estimates from the multivariate logistic regression model to generate a MAKE30 predictive score for every individual in the training dataset. The overall mean score was −0.24 ± 1.11. For those who developed MAKE30 the mean was 0.33 ± 1.0 compared with −0.71 ± 0.98 in those who did not (mean difference 1.04 (95%CI 0.91–1.17); P < 0.001). The AROC from this model was 0.77 (95%CI 0.73–0.80), indicating a reasonable ability of the score to discriminate between those who developed MAKE30 and those who did not. Furthermore, this AROC was significantly better than that of a model which only included uNGAL concentration (AROC 0.72 (95%CI 0.68–0.75); P for difference in AROC < 0.001) (Fig. 4A, Table 4). We stratified patients according to deciles of scores; Table 5 shows the observed versus expected number of patients who developed MAKE30 in each stratum. The Hosmer-Lemeshow test statistic was 5.04 (P = 0.75), indicating no evidence for lack of fit (Table 5).

Figure 4

ROC curves from a multivariate clinical model, uNGAL, and the model with uNGAL for prediction of MAKE30 in the training dataset (A) and validation dataset (B).

Table 4 AUC of clinical model with uNGAL for prediction of MAKE30 and MAKE365.
Table 5 Expected and observed numbers of major adverse kidney events at 1 month (MAKE30) and 1 year (MAKE365), stratified by deciles of logistic regression score (1 = lowest, 10 = highest), in the training and validation datasets.

When the same predictive score was calculated for the 397 patients in the validation sample (overall mean score −0.23 ± 1.10), we noted similar results. For those who developed MAKE30 the mean was 0.27 ± 1.11 compared with −0.64 ± 0.94 in those who did not (mean difference 0.91 (95%CI 0.71–1.11); P < 0.001). The AROC was 0.74 (95%CI 0.69–0.79), and the Hosmer-Lemeshow test was not significant, indicating a level of performance similar to that observed in the training sample.

Moreover, in the validation dataset, the multivariate model that included uNGAL achieved a significantly higher AROC (P < 0.001) than uNGAL alone, for which the AROC was 0.66 (95%CI 0.61–0.71) (Fig. 4B, Table 4).

MAKE outcomes at 365 days

The overall MAKE365 incidence was 61.7%. The MAKE365 incidence after excluding the patients who already had MAKE30 was 32.4%; the additional endpoints were mortality (27.9%), RRT (2.7%), and persistent renal dysfunction (1.8%). Median uNGAL concentrations in patients with and without MAKE365 were 625 (IQR 136, 2265) and 137 (IQR 93, 612) ng/mL, respectively (p < 0.001, Fig. 2C).

Results from the multivariate model for MAKE365 in the training dataset are shown in Table 3. An increased risk of MAKE365 was associated with increasing uNGAL concentration, increasing age, ICU admission, AKI stage 3 versus stages 1 and 2, sepsis, malignancy, ischemic heart disease, persistent AKI, and chronic liver disease. Sex and ischemic cause of AKI, which were significant in the MAKE30 model, were not significant in the MAKE365 model; conversely, age and ischemic heart disease, which were not significant in the MAKE30 model, were important in the MAKE365 model. The mean predicted MAKE365 score in the training sample was 0.60 ± 1.2 (1.0 ± 1.0 in those who developed MAKE365 and −0.07 ± 1.0 in those who did not; mean difference 1.08 (95%CI 0.94–1.22); P < 0.001). The AROC was 0.77 (95%CI 0.74–0.80), which was significantly better than the AROC with uNGAL alone (0.70 (95%CI 0.67–0.74); P < 0.001) (Fig. 5A, Table 4). The Hosmer-Lemeshow test statistic was 4.73 (P = 0.79) (Table 5).

Figure 5

ROC curves from a multivariate clinical model, uNGAL, and the model with uNGAL for prediction of MAKE365 in the training dataset (A) and validation dataset (B).

As found with the MAKE30 outcomes, the equation derived from the multivariate model in the training dataset also performed well in the validation dataset. The mean difference in MAKE365 scores between those with and without the outcome was 0.84 (95%CI 0.62–1.05); P < 0.001. The AROC was 0.72 (95%CI 0.67–0.77), suggesting reasonable discriminative ability. This was also a significant improvement (P < 0.001) on the AROC for uNGAL alone in predicting MAKE365 (AROC = 0.64 (95%CI 0.59–0.70)) (Fig. 5B). The Hosmer-Lemeshow test statistic was 4.30 (P = 0.93), suggesting good calibration. The observed and expected numbers of MAKE365 events in the training and validation datasets are shown in Table 5.

New-onset CKD and CKD progression

We further explored associations of uNGAL with new-onset CKD and CKD progression in patients who were alive at 1 year (n = 659). Among patients without baseline CKD who were alive at 1 year (n = 370), the incidence of new CKD (stage 3–5 and ESRD) was 68.9%. CKD at 1 year was present in 69.1% of patients with transient AKI versus 79.3% of those with persistent AKI (p = 0.006). Univariate and multivariate logistic regression revealed logₑ uNGAL, age, and nephrotoxic cause of AKI to be associated with CKD (Supplementary Table 5). In patients with baseline CKD (n = 289), there was a 42.9% incidence of CKD progression. Univariate logistic regression revealed logₑ uNGAL, persistent AKI, ischemic heart disease, AKI stage 3, and SCr at AKI diagnosis to be associated with CKD progression. Multivariate logistic regression, however, showed only uNGAL and ischemic heart disease as associated factors (Supplementary Table 6). In total, 18.2% of all patients progressed to ESRD.

Discussion

To our knowledge, this is the first report using a urine biomarker in a large cohort of heterogeneous patients for prediction of persistent AKI, MAKE30, and MAKE365. We found that uNGAL measured at the onset of AKI was associated with persistent AKI, MAKE30, MAKE365, new-onset CKD, and CKD progression. Urinary NGAL performed better than other markers for prediction of persistent AKI. For MAKE30 and MAKE365, predictive models combining uNGAL with clinical covariates predicted these outcomes significantly better than uNGAL or clinical models alone.

Persistent AKI is associated with increased ICU length of stay, time on the ventilator, readmission, and mortality8,9,10. Persistent AKI has also been associated with significantly more adverse long-term outcomes, with a cumulative incidence of >60% compared with 40% in transient AKI26. Coca et al. first described urinary injury markers as predictors of AKI duration in post-cardiac surgery patients7. Our data showed that patients with persistent AKI had higher uNGAL, uNGAL/Cr ratio, FENa, FEuric, and SCr than those with transient AKI. Among these biomarkers, uNGAL and the uNGAL/Cr ratio were superior to the conventional biomarkers, including SCr, by ROC curve analysis. These findings support the concept that a rise in creatinine lags behind the onset of injury and adds little value in risk stratification. Thus, identification of persistent AKI by uNGAL may be beneficial for timely intervention and more careful monitoring in hospital and after discharge.

The 30-day and 1-year mortality rates in our study were 28.5% and 50%, similar to those reported in western cohorts2,27,28,29. Previous studies have shown an association of uNGAL levels with adverse in-hospital outcomes including mortality, which was likewise demonstrated in our study18,30,31,32,33,34,35. However, there is a paucity of data regarding associations between uNGAL and long-term renal outcomes after AKI. Coca et al. reported that the highest values of uNGAL and L-FABP were independently associated with long-term mortality in an AKI cohort12, but did not evaluate associations with renal function. Singer et al. showed that uNGAL measured at the time of AKI diagnosis was superior to creatinine levels and AKI staging in predicting long-term ESRD or death23. Our study included the largest number of patients to date, and showed an independent association of uNGAL with short-term and long-term death, ESRD, and persistent renal dysfunction. Furthermore, a few studies have investigated the prognostic significance of uNGAL for progression of CKD in CKD cohorts36,37,38. In AKI patients, only one study explored uNGAL on ICU admission as a predictor of CKD progression in MAKE-free ICU survivors11. Our study shows similar findings, with uNGAL as an independent factor associated with new-onset CKD and CKD progression in AKI patients who survived to 1 year.

There have been few predictive models using urine biomarkers for prediction of AKI and AKI outcomes. McMahon et al. demonstrated that the addition of urinary NGAL/albumin to the clinical model modestly improved the prediction of AKI, in particular severe stage 3 AKI, and the prediction of 30-day RRT or death24. The TRIBE study showed that postoperative urine IL-18 and plasma NGAL improved AKI prediction over the clinical model alone, from an AUC of 0.69 to 0.76 and 0.75, respectively21. In the SAPPHIRE and TOPAZ studies39, a multivariate clinical model with TIMP-2xIGFBP-7 helped predict moderate-to-severe AKI within 12 hours with an AROC of 0.88. Our study is the first to develop predictive models of MAKE30 and MAKE365 using clinical models and uNGAL. The addition of uNGAL to the clinical model significantly improved the risk prediction of MAKE30 and MAKE365 in both the training and validation cohorts. It must be emphasized that the clinical models alone performed better than uNGAL alone. The modest performance of uNGAL in our study may be due to the heterogeneous timing of uNGAL testing and the presence of confounding factors including sepsis and CKD.

Our study has several limitations. First, because uNGAL is ordered at the discretion of the treating physician, the timing between AKI onset and uNGAL measurement may be heterogeneous. To mitigate this issue, we restricted inclusion to patients whose uNGAL results were collected within 72 hours of suspected AKI, which reflects real-world practice. Moreover, uNGAL request indications were predominantly for prediction of persistent AKI, implying that uNGAL tests were requested as early biomarkers for outcome prediction. Second, we did not have information on urine output, so outcomes might be underestimated by using creatinine criteria alone40. Third, we did not include patients without AKI, so the increased risk for adverse events with AKI staging was more prominent in AKI stage 3 compared with stage 1/2. Lastly, we used a random split-sampling method for internal validation, so external validation of the MAKE30 and MAKE365 predictive models is still needed. Despite these limitations, ours is a large observational study assessing the utility of uNGAL for prognostic information in real-world practice, where physicians routinely use uNGAL to facilitate their clinical decisions. In addition, our study evaluates uNGAL measurements in the emergency room, general ward, and intensive care units, providing evidence on the usefulness of uNGAL in diverse clinical settings.

In conclusion, our study demonstrates a significant association of uNGAL with the risks of persistent AKI, MAKE30, and MAKE365. Adding uNGAL to clinical models incorporating AKI severity and other predictors modestly improved prediction of MAKE30 and MAKE365. Determining uNGAL levels at the time of AKI diagnosis, in addition to improving diagnostic assessment, may help in early selection of patients who require close monitoring during hospital admission and after their stay, to prevent CKD progression, incident RRT, and death in AKI survivors.

Materials and Methods

Study design

The present study was conducted retrospectively on prospectively collected data from the uNGAL registry at King Chulalongkorn Memorial Hospital. Inclusion criteria were patients aged ≥18 years with at least one uNGAL measurement within 72 hours of AKI onset between January 2016 and September 2017. Two nephrologists independently adjudicated acute kidney injury (AKI) status and onset using creatinine criteria from the KDIGO guideline without knowledge of the uNGAL value41. Patients were excluded if they had end-stage kidney disease, missing AKI status due to a single measurement of serum creatinine, or unavailable outcomes at 365 days. The study protocol was approved by the Ethics Committee of Chulalongkorn University (IRB No. 037/61). Informed consent was waived by the Institutional Review Board of the Faculty of Medicine, Chulalongkorn University due to the retrospective observational nature of the study and extraction of the patients’ data by an independent researcher with data deidentification. All research was performed in accordance with international guidelines for human research protection, including the Declaration of Helsinki42, The Belmont Report43, and the International Conference on Harmonization in Good Clinical Practice (ICH-GCP)44. The design, execution, and reporting of this study are in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE)45 and Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) criteria46.

Variables

Baseline characteristics of the patients were obtained from electronic medical records. AKI cause was adjudicated by two nephrologists by reviewing medical charts. Indication for uNGAL order was obtained from each physician at the time of request. Daily serum creatinine within the first week, at day 30, and at day 365 were obtained. For short-term outcomes, 30-day all-cause mortality, RRT, hospital length of stay, and renal recovery were recorded. For long-term outcomes, 365-day all-cause mortality, incident RRT, and readmission from all causes were collected. Mortality and renal replacement therapy were obtained from electronic in-house and national databases.

Definition

AKI was defined and staged according to KDIGO criteria41. Baseline serum creatinine (SCr) was defined as the pre-admission SCr (obtained not more than 365 days prior) available from the hospital electronic records. When this value was not available, first admission SCr or the Modification of Diet in Renal Disease (MDRD) equation back-estimation was used, whichever one was lower47,48.
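The baseline rule above can be sketched in a few lines. This is an illustrative sketch only: the text does not state which MDRD variant or target eGFR was used, so the four-variable MDRD equation with the constant 186, the assumed baseline eGFR of 75 ml/min/1.73 m², and all function names are assumptions made here for illustration.

```python
def mdrd_factor(age_years, female, black=False):
    """Age/sex/race part of the 4-variable MDRD equation (constant 186 is an assumed variant)."""
    factor = 186.0 * age_years ** -0.203
    if female:
        factor *= 0.742
    if black:
        factor *= 1.212
    return factor


def mdrd_egfr(scr_mg_dl, age_years, female, black=False):
    """eGFR in ml/min/1.73 m^2 from serum creatinine in mg/dL."""
    return mdrd_factor(age_years, female, black) * scr_mg_dl ** -1.154


def back_estimated_scr(age_years, female, black=False, assumed_egfr=75.0):
    """Invert the MDRD equation for SCr at an assumed baseline eGFR (75 is an assumption)."""
    return (assumed_egfr / mdrd_factor(age_years, female, black)) ** (-1.0 / 1.154)


def baseline_scr(pre_admission_scr, admission_scr, age_years, female, black=False):
    """Pre-admission SCr if available; otherwise the lower of admission SCr and back-estimate."""
    if pre_admission_scr is not None:
        return pre_admission_scr
    return min(admission_scr, back_estimated_scr(age_years, female, black))
```

For example, `baseline_scr(None, 5.0, 66, False)` falls back to the back-estimated value because it is lower than the admission SCr of 5.0 mg/dL.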

We defined transient AKI as an increase of SCr to ≥1.5 times baseline within 7 days that decreased to <1.5 times baseline within 3 days after AKI onset. Persistent AKI was defined as an increase of SCr to ≥1.5 times baseline that remained elevated for 3 days or more, or non-recovery before death49.

Major adverse kidney events at 30 days (MAKE30) comprised death, incident dialysis, and doubling of SCr within 30 days. Major adverse kidney events at 365 days (MAKE365) comprised death, incident dialysis, and persistent renal dysfunction (doubling of SCr or estimated glomerular filtration rate (eGFR) <50% of baseline) within 365 days. Renal recovery was defined as SCr <1.5 times baseline and being free from RRT for 14 days50. Sepsis was diagnosed in accordance with the Third International Consensus Definitions for Sepsis and Septic Shock (SEPSIS-3)51. Chronic kidney disease (CKD) was defined as an eGFR of less than 60 ml/min/1.73 m² on at least 2 occasions over more than 3 months. CKD stages were defined according to KDIGO guidelines and based on eGFR levels52. Progression of CKD stage in patients not on dialysis at study entry was determined from the last measured eGFR at 1 year in surviving patients, and was defined as reclassification to a more advanced stage or ESRD37.
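The composite MAKE endpoint can be expressed as a simple logical OR of its components. The sketch below is a minimal illustration, assuming that "doubling" means a follow-up SCr at least twice the baseline; the function and argument names are hypothetical, and the eGFR criterion that also applies to MAKE365 is not shown.

```python
def has_make(died, started_dialysis, baseline_scr, followup_scr=None):
    """Composite event: death, incident dialysis, or doubling of SCr.

    Doubling is taken here as follow-up SCr >= 2 x baseline (an assumption).
    A missing follow-up SCr (None) contributes no event on its own.
    """
    doubled = followup_scr is not None and followup_scr >= 2.0 * baseline_scr
    return bool(died or started_dialysis or doubled)
```

The same function could be evaluated at 30 days or at 365 days by passing the outcome flags and SCr recorded at the corresponding time point.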

Urine chemistry and urinary NGAL

Urinary NGAL was measured at the time of AKI diagnosis at the physicians’ discretion using a commercially available chemiluminescence method (Abbott, USA). Urinary Na, uric acid, and urea were measured at the same time as uNGAL. The fractional excretion of sodium (FENa), expressed as a percentage, was calculated by the formula (SCr × UNa)/(SNa × UCr) × 100, where S and U denote serum and urine concentrations. The fractional excretion of uric acid (FEuric) and fractional excretion of urea (FEurea) were calculated using the same formula.
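As a worked illustration of the fractional excretion formula, the sketch below computes FE for any solute from paired urine and serum values. Reporting the result as a percentage (the factor of 100) follows the usual convention and is an assumption here, as are the function name and the example values.

```python
def fractional_excretion(urine_x, serum_x, urine_cr, serum_cr):
    """Fractional excretion (%) = (U_x * S_Cr) / (S_x * U_Cr) * 100.

    Urine and serum values of the solute must share one unit, and
    urine and serum creatinine must share one unit.
    """
    return (urine_x * serum_cr) / (serum_x * urine_cr) * 100.0


# Hypothetical example: UNa 20 mmol/L, SNa 140 mmol/L, UCr 100 mg/dL, SCr 2.0 mg/dL
fena = fractional_excretion(urine_x=20.0, serum_x=140.0, urine_cr=100.0, serum_cr=2.0)
```

The same call with uric acid or urea concentrations in place of sodium gives FEuric or FEurea.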

Outcomes

The primary outcome was the predictive value of uNGAL for persistent AKI compared with conventional markers including serum creatinine, FENa, FEuric, and FEurea at the onset of AKI. The secondary outcomes were the association and predictive performance of uNGAL alone and in combination with clinical models for MAKE30 and MAKE365. Associations of uNGAL with new-onset CKD in patients without baseline CKD and with CKD progression in patients with baseline CKD were also explored.

Statistical analysis

Formal comparisons of continuous variables between groups were made by an unpaired t-test or by the Mann–Whitney (Wilcoxon rank sum) test, and categorical data were compared using a Chi-square or Fisher’s exact test, as appropriate. Univariate and multivariate logistic regression was used to identify biomarkers associated with AKI persistence, new-onset CKD, and CKD progression, in the presence of clinical characteristics. We used logistic regression in preference to survival analysis techniques because there was no censoring in our dataset. Variables were entered into the multivariate model if they were related to persistent AKI at P < 0.1 on univariate analysis, and were then selected using a backwards stepwise procedure. AROC was compared between biomarkers for prediction of persistent AKI.

Thereafter, we identified factors independently associated with MAKE30 and MAKE365 with multivariate regression models. For this purpose, the study cohort was randomized into a training dataset (n = 925) to generate a model, and a validation dataset (n = 397) for assessing the adequacy of model fit. We used the training set to identify factors independently associated with MAKE30 and MAKE365, using a backwards stepwise procedure and assessing Akaike’s information criterion (AIC) after each step. Model selection using AIC has better statistical properties than P-value based selection, and avoids arbitrary and inefficient selection rules based on P values53. Factors assessed in univariate models included uNGAL concentration, sex, age, AKI stage, persistent AKI, the ward in which the patient had the uNGAL sample collected, whether the etiology of AKI was ischemic, and comorbidities including malignancy, diabetes, ischemic heart disease, chronic liver disease, and sepsis. We assessed the linearity of continuous covariates against the logit function, and if this assumption was not met, the variable was transformed or modelled in quartiles. Adjacent categories were collapsed together if the odds ratios and 95%CI were similar. A logarithmic transformation was applied to uNGAL to linearize the covariate against the logit function.
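The random split into training and validation datasets can be illustrated as follows. This is a sketch under stated assumptions, not the authors' procedure (the analysis was run in Stata): the seed, the function name, and the use of integer patient identifiers are hypothetical.

```python
import random


def split_train_validation(patient_ids, train_fraction=0.7, seed=2020):
    """Shuffle identifiers reproducibly and split them into two disjoint sets."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (arbitrary) so the split can be reproduced
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]


# With the 1,322 included patients, a 70/30 split gives 925 and 397 patients.
train_ids, validation_ids = split_train_validation(range(1322))
```

Fixing the seed keeps the training and validation membership stable across re-runs, which matters because the model is derived in one set and evaluated in the other.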

We used the coefficients from the final multivariate models (the natural logarithms of the odds ratios) to derive scores for every individual that can be used to estimate the probability of developing MAKE30 and MAKE365. For each individual, the score is calculated by multiplying each coefficient by the corresponding covariate value and summing the products together with the model constant. These MAKE30 and MAKE365 scores can then be used to estimate the probability of that individual developing a major adverse kidney event within 30 or 365 days, respectively, using the following equation:

$$\mathrm{Probability}=\frac{e^{\mathrm{Score}}}{1+e^{\mathrm{Score}}}$$
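The conversion from score to probability can be sketched as below, assuming the score is the usual logistic linear predictor (coefficient times covariate, summed, plus an intercept). The coefficient values, covariate names, and intercept in the example are hypothetical and are not the fitted values from Table 3.

```python
import math


def linear_score(coefficients, covariates, intercept=0.0):
    """Sum of coefficient x covariate value over all model terms, plus the intercept."""
    return intercept + sum(coefficients[name] * covariates[name] for name in coefficients)


def probability_from_score(score):
    """Logistic transform; algebraically equal to e^Score / (1 + e^Score)."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical coefficients and patient, for illustration only
example_coefficients = {"ln_ungal": 0.3, "sepsis": 0.5}
example_patient = {"ln_ungal": 6.0, "sepsis": 1}
example_score = linear_score(example_coefficients, example_patient, intercept=-2.0)
example_probability = probability_from_score(example_score)
```

Applied to the mean training-set MAKE30 scores reported in the Results, a score of 0.33 corresponds to a probability of about 0.58 and a score of −0.71 to about 0.33.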

Model adequacy was tested in several ways. First, we compared the mean scores in those who developed versus those who had not developed MAKE30 and MAKE365 using an unpaired t-test. Second, we calculated the area under the Receiver Operating Characteristics Curve (AROC), as a measure of the ability of the model to discriminate between those with and without clinical outcomes54, comparing the AROC for uNGAL alone with that from the full multivariate model. Lastly, we assessed the calibration of the models in the training and validation datasets, by stratifying patients in deciles according to the MAKE30 and MAKE365 scores, and comparing the observed and expected number of events in each group using the Hosmer-Lemeshow statistic55. Since the fit of any model is always better in the dataset used to derive the model than in the general population, we also calculated these statistics in the validation dataset to provide unbiased estimates of the model adequacy. Statistical analysis was conducted using Stata 16.0 (StataCorp, College Station, TX, USA).
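The discrimination and calibration measures described above can be illustrated with a short sketch. It is not the Stata routine used in the study: the AROC is computed here as the proportion of event/non-event pairs ranked correctly (ties counted as one half), and the Hosmer-Lemeshow statistic is summed over equal-sized groups ordered by predicted probability. Function names are hypothetical, and the P value (a chi-square tail probability) is not computed.

```python
def aroc(scores, outcomes):
    """Probability that a random event case scores higher than a random non-event case."""
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((e > n) + 0.5 * (e == n) for e in events for n in non_events)
    return wins / (len(events) * len(non_events))


def hosmer_lemeshow(probabilities, outcomes, groups=10):
    """Sum over groups of (observed - expected)^2 / (expected * (1 - expected / size)).

    Assumes every group is non-empty and has an expected count strictly
    between 0 and the group size; otherwise the term is undefined.
    """
    pairs = sorted(zip(probabilities, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(y for _, y in chunk)
        expected = sum(p for p, _ in chunk)
        statistic += (observed - expected) ** 2 / (expected * (1.0 - expected / size))
    return statistic
```

An AROC of 0.5 indicates no discrimination and 1.0 perfect discrimination, while a small Hosmer-Lemeshow statistic indicates that observed and expected event counts agree closely across groups.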