Introduction

Research on acute kidney injury (AKI) in critically ill neonates has lagged relative to research in pediatric and adult populations. Recently, several small, single-center studies in select neonatal groups (congenital heart disease,1,2,3,4 sepsis,5 hypoxic injury,6,7,8,9 receipt of extracorporeal membrane oxygenation,10,11,12 and very-low birthweight infants13,14,15,16,17,18,19) have suggested that AKI is common and that neonates with AKI have higher mortality. We, the Neonatal Kidney Collaborative (NKC), recently published on the epidemiology of neonatal AKI (nAKI) using a 24-center retrospective cohort of 2022 infants across the gestational age (GA) spectrum in the Assessment of Worldwide Acute Kidney Epidemiology in Neonates (AWAKEN) database.20,21 Using the Kidney Disease: Improving Global Outcomes (KDIGO) AKI definition, which defines AKI as a rise in serum creatinine (SCr) of ≥0.3 mg/dl and/or a ≥50% rise in SCr from baseline, we found that 30% of the cohort had nAKI. As shown in older children and adults, nAKI was associated with higher mortality and longer length of stay (LOS), even after controlling for multiple confounders.

Although we found a strong association between nAKI, as defined by KDIGO, and outcomes, this definition was adapted from an adult patient-driven definition and may not be optimal for use in neonates. In 2013, an NIH-NIDDK Neonatal AKI definition working group22 concluded that the KDIGO definition was the best currently available approach to define nAKI, as it allows for consistency across studies and neonatal populations. They strongly suggested that this definition (and other approaches) should be critically interrogated against meaningful clinical endpoints in large neonatal cohorts. However, they identified several major potential problems with the validity of the KDIGO definition in neonates. One issue was that the definition does not consider the changes that normally occur in SCr concentrations in the first postnatal week of life. This trajectory varies by GA for many reasons, including that an infant's SCr at birth reflects maternal SCr and that the quantity and developmental stage of nephrons differ by GA. Given the variable physiologic trajectory of SCr across GA groups, using the same change in SCr to ascertain AKI in all neonates may not be appropriate. Furthermore, the extent to which the criterion of percent change in SCr adds to the use of an absolute change in SCr in neonates is unknown. Finally, the cutoffs that will achieve the most valid and parsimonious definition for nAKI need to be determined.

Tailoring a nAKI definition to neonatal-specific physiology should enable a more accurate description of nAKI epidemiology, a better understanding of the risk factors associated with AKI, better prognostication of outcomes, and the development and implementation of specific therapies designed to improve AKI outcomes. Using data from our international AWAKEN database of patients admitted to the neonatal intensive care unit (NICU), we explored our hypothesis that the optimal SCr thresholds differ by GA group. To address this hypothesis, we (a) compared the performance of the KDIGO absolute and percent SCr rise thresholds to predict mortality for 3 GA groups, (b) determined the optimal absolute SCr rise, percent SCr rise, and maximum SCr thresholds to maximize AUC and specificity for mortality for each of the 3 GA categories, (c) compared the performance of these newly derived SCr thresholds to predict mortality against each other and against the KDIGO definition, and (d) evaluated whether the addition of a percentage rise in SCr provides any added value to an absolute rise in SCr.

Methods

Patient population

A complete description of the formation of the NKC and the methods for developing the AWAKEN database has been published elsewhere.16 The University of Alabama at Birmingham Institutional Review Board (IRB) approved this collaborative study, and each center received approval from its respective IRB. The study was registered at ClinicalTrials.gov (NCT02443389).

The main criteria for AWAKEN study inclusion were admission to the NICU within the first 14 days of life and receipt of intravenous fluid for at least 2 days of hospitalization. Exclusion criteria included: (1) admission at ≥14 days of life, (2) congenital heart disease requiring surgical repair at <7 days of life, (3) lethal chromosomal anomaly, (4) death within 48 h of NICU admission, and (5) severe congenital kidney and urinary tract abnormalities. For the current analysis, infants needed to meet 2 additional criteria: (a) infants had to have at least 1 SCr drawn on day 1 or 2 (day of birth = day 1), which needed to be at least 0.5 mg/dl, and (b) infants had to have at least 1 SCr drawn on days 3–8 (so that we could ascertain changes in SCr). Fig. 1 shows the reasons for exclusion from the AWAKEN cohort for this analysis. The analysis was performed on 3 GA groups: ≤29 weeks, 29–36 weeks, and ≥36 weeks. This GA classification was used in previous AWAKEN publications and was aimed at highlighting differences across GA while preserving sample size in the lowest GA group.

Fig. 1 Flow diagram of infants and reasons for exclusion

The KDIGO definition of nAKI

All clinically measured SCr values available from the medical record were used. As previously described, KDIGO defines AKI as a rise in SCr of ≥0.3 mg/dL and/or ≥50% from baseline SCr. For the purposes of this study, when describing the KDIGO methods, we needed only one baseline and one peak value to define an absolute and a percent SCr rise. We used the lowest SCr on days 1 or 2 as the baseline SCr and the highest SCr on postnatal days 3–8 as the peak. The median number of SCr values during days 1–2 was 2 (IQR = 2–3). The median number of SCr values on days 3–7 was 3 (IQR = 2–5). Notably, we do not address how urine output (UOP) cutoffs, alone or in combination with SCr cutoffs, are associated with mortality because a very large number of patients did not have quantifiable UOP documented in the medical record.
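
The baseline and peak rules above can be illustrated with a short sketch. This is not the study's code (the analyses were run in SAS); the function name `scr_rise`, the input layout (a mapping from postnatal day to the SCr values drawn that day, in mg/dL), and the rounding to two decimals are assumptions made for illustration only.

```python
from typing import Dict, List, Optional


def scr_rise(scr_by_day: Dict[int, List[float]]) -> Optional[dict]:
    """Compute baseline, peak, and KDIGO-style SCr rise flags for one infant.

    scr_by_day maps postnatal day (day of birth = 1) to SCr values in mg/dL.
    Returns None when the infant lacks a value on days 1-2 or on days 3-8.
    """
    baseline_values = [v for day in (1, 2) for v in scr_by_day.get(day, [])]
    peak_values = [v for day in range(3, 9) for v in scr_by_day.get(day, [])]
    if not baseline_values or not peak_values:
        return None

    baseline = min(baseline_values)  # lowest SCr on days 1 or 2
    peak = max(peak_values)          # highest SCr on days 3-8

    # Round to 2 decimals (an assumption) so that float noise such as
    # 0.30000000000000004 does not change which side of a cutoff we land on.
    abs_rise = round(peak - baseline, 2)
    pct_rise = 100.0 * abs_rise / baseline

    kdigo_abs = abs_rise >= 0.3
    kdigo_pct = pct_rise >= 50.0
    return {
        "baseline": baseline,
        "peak": peak,
        "abs_rise": abs_rise,
        "pct_rise": pct_rise,
        "kdigo_abs": kdigo_abs,
        "kdigo_pct": kdigo_pct,
        "kdigo_either": kdigo_abs or kdigo_pct,
    }
```

For example, an infant with values of 0.7 and 0.6 mg/dL on day 1 and a highest later value of 1.0 mg/dL would have a baseline of 0.6, a peak of 1.0, an absolute rise of 0.4 mg/dL, and a percent rise of about 67%, meeting both KDIGO SCr criteria.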

Statistical analysis

Categorical variables were compared between groups using the Chi-square test or Fisher exact test (where appropriate). Continuous variables were tested for normality using the Shapiro–Wilk Test. For normally distributed continuous variables the mean ± standard deviation (SD) were reported and compared between groups using a Student t-test. For non-normally distributed variables, the median and interquartile range (IQR) were reported, and groups were compared using the Wilcoxon Signed Ranks test.

The primary outcome was hospital mortality. Nonparametric tests were used to compare the areas under the receiver operating characteristic curves, as previously described by DeLong et al.23 Differences in specificity were compared using McNemar's test, as previously described.24 SAS 9.4 (Cary, North Carolina) was used for all analyses.
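
To make the specificity comparison concrete, the following sketch shows one way McNemar's test can be applied to paired classifications. Specificity depends only on survivors, so the test is run on the survivors that two definitions classify differently. The exact (binomial) variant shown here, the function names, and the input layout are assumptions for illustration; the text does not state which variant of the test was run in SAS.

```python
from math import comb
from typing import Sequence, Tuple


def discordant_among_survivors(
    flag_a: Sequence[bool], flag_b: Sequence[bool], died: Sequence[bool]
) -> Tuple[int, int]:
    """Count survivors flagged by definition A only (b) and by B only (c)."""
    b = c = 0
    for a_pos, b_pos, dead in zip(flag_a, flag_b, died):
        if dead:
            continue  # specificity is computed among survivors only
        if a_pos and not b_pos:
            b += 1
        elif b_pos and not a_pos:
            c += 1
    return b, c


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

Under this sketch, if 10 survivors were flagged by one definition only and none by the other, `mcnemar_exact(0, 10)` would give p = 2/1024 ≈ 0.002.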

Results

Patient population

First, we evaluated differences in demographics by GA groups (Table 1). Second, as a sensitivity analysis, we compared demographics between subjects enrolled in AWAKEN who were included and those who were excluded based on the additional criteria (Table 2). Third, we calculated KDIGO nAKI incidence and mortality rates separately for the 3 GA categories (≤29 weeks, 29–36 weeks, and ≥36 weeks). The sensitivity, specificity, negative predictive value (NPV), and positive predictive value (PPV) of the absolute and percent SCr rise cutoffs to predict mortality were calculated. The positive likelihood ratio (LR+), the negative likelihood ratio (LR−), and the area under the receiver operating characteristic curve (AUC) for predicting mortality were then calculated using the sensitivity and specificity values. We compared differences between absolute SCr rise (reference), percent SCr rise, and absolute and/or percent SCr rise for all three GA groups for the AUC and specificity to predict mortality (Table 3).
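
The performance measures listed above all follow from a 2×2 table of threshold status against mortality. The sketch below is illustrative only. In particular, taking the AUC of a single dichotomous cutoff as the mean of sensitivity and specificity is a standard identity for a one-point ROC curve, but it is an assumption here that this is the formula that was used; the function and argument names are likewise hypothetical.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Performance of a binary threshold for predicting mortality.

    tp: died and met the threshold      fn: died, did not meet it
    fp: survived and met the threshold  tn: survived, did not meet it
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    lr_pos = sens / (1 - spec) if spec < 1 else float("inf")
    lr_neg = (1 - sens) / spec if spec > 0 else float("inf")
    # A one-point ROC curve joins (0, 0), (1 - spec, sens) and (1, 1);
    # the area under it is (sens + spec) / 2.
    auc = (sens + spec) / 2
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "auc": auc,
    }
```

Under this convention, a sensitivity of 0.92 with a specificity of 0.5 would correspond to an AUC of 0.71.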

Table 1 Demographic variables by gestational age groups
Table 2 Demographic differences between those included and excluded
Table 3 Performance of KDIGO thresholds for absolute SCr rise, percent SCr rise, and combination to predict mortality

Fourth, we derived the optimal thresholds that would yield the highest possible AUC for mortality using the absolute SCr rise from baseline, the percent SCr rise from baseline, and the maximum SCr value during the first postnatal week for each of the three GA groups. Using these optimized thresholds, we report the incidence, mortality, sensitivity, specificity, NPV, PPV, LR+, LR−, and AUC to predict mortality. We compared differences in AUC between absolute SCr rise (reference), percent SCr rise, absolute and/or percent SCr rise, and max SCr for all three GA groups (Table 4).
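
One simple way to derive an AUC-maximizing cutoff is to try every observed value as a candidate threshold and keep the one with the highest single-cutoff AUC. The sketch below illustrates that idea only; the exact search procedure is not described in the text, and the function name, the "value ≥ threshold" rule, and the use of (sensitivity + specificity) / 2 as the AUC are assumptions.

```python
from typing import Sequence, Tuple


def best_threshold_by_auc(
    values: Sequence[float], died: Sequence[bool]
) -> Tuple[float, float, float, float]:
    """Return (threshold, auc, sensitivity, specificity) maximizing the AUC.

    values: one SCr metric per infant (e.g., absolute rise in mg/dL).
    died:   hospital mortality for the same infants.
    An infant is "positive" when value >= threshold.
    """
    n_died = sum(1 for d in died if d)
    n_survived = len(died) - n_died
    if n_died == 0 or n_survived == 0:
        raise ValueError("need at least one death and one survivor")

    best = None
    for t in sorted(set(values)):  # each observed value is a candidate cutoff
        tp = sum(1 for v, d in zip(values, died) if d and v >= t)
        tn = sum(1 for v, d in zip(values, died) if not d and v < t)
        sens = tp / n_died
        spec = tn / n_survived
        auc = (sens + spec) / 2
        # Strict ">" keeps the lowest threshold when several tie on AUC.
        if best is None or auc > best[1]:
            best = (t, auc, sens, spec)
    return best
```

In practice this search would be run separately within each GA group and for each metric (absolute rise, percent rise, and max SCr).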

Table 4 Optimized thresholds and performance of absolute SCr rise, percent SCr rise and max SCr to predict mortality using thresholds designed to optimize AUC

Fifth, we derived absolute SCr rise, percent SCr rise, and maximum SCr value thresholds that yield the highest specificity without significantly affecting the overall AUC for each of the three GA groups. Using these optimized thresholds, we report the incidence, mortality, sensitivity, specificity, NPV, PPV, LR+, LR−, and AUC to predict mortality. We compared differences in the specificity to predict mortality between absolute SCr rise (reference), percent SCr rise, absolute and/or percent SCr rise, and max SCr for all 3 GA groups (Table 5).

Table 5 Optimized thresholds and performance of absolute SCr rise, percent SCr rise, and max SCr to predict mortality using thresholds designed to optimize specificity

Figure 1 shows that 990/2162 (45.8%) infants enrolled in AWAKEN were included in this study. Of the 1172 who were excluded, 423 did not have a SCr on postnatal days 1 or 2 and 847 did not have at least one SCr measured on days 3–8. The specific neonatal and maternal demographics, reasons for hospital admission, birth complications, and maternal medications and disposition are shown in Table 1, stratified by GA categories.

Differences between included and excluded subjects

Table 2 compares the AWAKEN subjects included in the current analysis with those excluded. The included group had more infants in the ≤29 weeks GA group and fewer infants in the ≥36 week GA group. The mortality rate of the included group was higher than that of the excluded group [49/990 (5%) vs. 34/1172 (2.9%); p < 0.01]. Compared to those excluded, included infants were more often Hispanic and born in-center, had lower Apgar scores, were more often admitted for prematurity, respiratory failure, HIE, or omphalocele, and were more often born to mothers with pre-eclampsia/eclampsia or to mothers who received steroids or anti-hypertensives. Included infants were less likely to be large for GA or to be admitted for sepsis, hypoglycemia, hyperbilirubinemia, or necrotizing enterocolitis.

Counts and mortality rates by absolute and percent SCr rise thresholds

The mortality rate for the cohort was 49/990 (5.0%). The median (IQR) day of death was postnatal day 21 (10, 47). The mortality rate was 14.1% for the ≤29 weeks GA group, 2.8% for the 29–36 week GA group, and 3.4% for the ≥36 week GA group. Fig. 2 shows the number of subjects attaining each absolute and percent SCr rise threshold, and the associated mortality rates. Approximately half (46%) of the cohort had either no rise or a decrease in SCr from baseline; those subjects had a mortality rate of 1.3%. As expected, as the absolute or percent SCr rise thresholds increase, so do the mortality rates.

Fig. 2 Incidence of AKI and mortality by absolute and percent changes in SCr

Performance of KDIGO thresholds for absolute SCr rise, percent SCr rise, and combination to predict mortality

Table 3 shows the performance of the different components of the KDIGO thresholds for the 3 GA groups. p-values are shown for differences in specificity and AUC using the ≥0.3 mg/dl threshold as the reference. The KDIGO absolute SCr rise threshold of ≥0.3 mg/dl provides excellent sensitivity (0.92) but poor specificity (0.5) to predict mortality in those ≤29 weeks. In contrast, for the 29–36 week and ≥36 week groups, the ≥0.3 mg/dl cutoff provided poor sensitivity but excellent specificity. Similar trends were seen when using the percent SCr rise threshold of ≥50%, but the overall performance was worse than with the absolute SCr rise threshold. When the KDIGO criteria were combined (≥0.3 mg/dl rise and/or ≥50% rise), the performance did not improve. In fact, compared to the ≥0.3 mg/dl criterion alone, the specificity was worse for the 29–36 week and ≥36 week groups, and the AUC was significantly worse for the ≤29 weeks group.

Association of AUC-driven optimal thresholds with mortality, using absolute and percent SCr rise and maximal (max) SCr concentration

The optimal absolute SCr rise thresholds (which maximized AUC to predict mortality) were ≥0.3, ≥0.1, and ≥0.1 mg/dl rise in the three increasing GA groups, respectively; the optimal percent SCr thresholds were ≥37%, ≥14%, and ≥8%, respectively; and the optimal max SCr thresholds were >1.13, 0.98, and 0.91 mg/dl, respectively (Table 4). In the 29–36 weeks GA group only, the AUC to predict mortality was significantly higher using the optimal percent SCr rise compared to the optimal absolute SCr rise method (AUC = 0.73 vs. 0.75; p-value = 0.002, Table 4). There was no statistically significant difference in AUC values between the optimal absolute SCr rise and the optimal max SCr value methods (all p > 0.05, Table 4). In no GA group were the AUCs for predicting mortality significantly higher when using the optimal absolute and/or percent SCr rise methods than when using only the optimal absolute SCr rise threshold (Table 4). Rather, in the ≤29 weeks GA group, adding the optimal percent SCr rise to the optimal absolute SCr rise method was associated with a significantly lower AUC vs. using only the optimal absolute SCr rise (AUC = 0.68 vs. 0.71; p = 0.004).

Association of specificity-driven optimal thresholds with mortality, using absolute and percent SCr rise and maximal (max) SCr concentration

The optimal absolute SCr rise thresholds (which maximized specificity for mortality) were ≥ 0.6, ≥ 0.3, and ≥ 0.3 mg/dl rise in the three increasing GA groups, respectively; the optimal percent SCr thresholds were ≥ 93, 66, and 40%, respectively; and the optimal max SCr thresholds were ≥ 1.59, 1.24, and 1.51 mg/dl, respectively (Table 5).

Compared to the absolute SCr rise threshold, the percent SCr rise threshold gave a lower specificity for the ≥36 week GA group (p < 0.001), and the combination of the absolute and/or percent SCr rise thresholds gave a lower specificity for all GA groups (all p < 0.001). The max SCr value gave a higher specificity for the ≤29 week and ≥36 week cohorts (p < 0.001), but no significant difference was evident in the 29–36 week GA cohort.

Differences between KDIGO and optimal SCr thresholds

Several insights emerge when comparing the KDIGO (Table 3) and the optimized absolute SCr rise thresholds (Tables 4 and 5). For the ≤29 weeks GA group, the absolute SCr rise threshold that maximized AUC was identical to the cutoff used in KDIGO, while the AUC-optimized absolute rise thresholds were lower for both the 29–36 week and ≥36 weeks GA groups. In contrast, the absolute SCr rise threshold that maximized specificity was ≥0.6 mg/dl for the ≤29 weeks GA group, while for both the 29–36 week and ≥36 weeks GA groups it happened to be the same threshold used in KDIGO.

Comparing the KDIGO nAKI definition thresholds to determined optimal thresholds

Several insights emerge when comparing the KDIGO (Table 3) and the optimized absolute SCr rise thresholds (Tables 4 and 5). In the ≤29 weeks GA group, the AUC-driven optimal absolute SCr rise was the same as the threshold used in the KDIGO definition (i.e., ≥0.3 mg/dl), and a higher threshold (≥0.6 mg/dl) was found to optimize the specificity. For the 29–36 week and ≥36 weeks GA groups, the specificity-driven optimal absolute SCr rise threshold was the same as the KDIGO cutoff, while optimization of AUC required a lower SCr threshold of ≥0.1 mg/dl for both groups.

Discussion

Diagnosing AKI using SCr-based definitions during the first postnatal week is complex and challenging. Using a meaningful clinical outcome (mortality), we tested the hypothesis that the ideal cutoffs for SCr changes in the first week of life differ by GA group. Indeed, we found that the optimal absolute and percent SCr rise cutoffs are higher in ≤29 week GA neonates. We found that, overall, the absolute SCr change has a better ability to predict mortality than the percent SCr change, and we did not find a major benefit to incorporating both a percent and an absolute change. In fact, incorporating the percent SCr change worsened both AUC and specificity, suggesting that the use of percent SCr rise adds little value and increases complexity. Interestingly, the max SCr was as good as a SCr rise at predicting mortality by AUC, and had better specificity for mortality in some of the GA groups.

Despite their limitations, SCr changes continue to be used in neonatal cohorts as a metric to diagnose nAKI. We have previously shown that the KDIGO nAKI definition is independently associated with meaningful outcomes across the GA spectrum.21 Despite its use by us and others to diagnose AKI in neonates, it is important to recognize that this empiric definition was adapted from the adult literature25 and has not been validated in neonates. In 2013, an expert working group from the NIH-sponsored Neonatal AKI Workshop recommended that researchers use the KDIGO definition to define nAKI because it would allow comparison between studies. However, this working group emphasized that, in order to have true meaning, the KDIGO empiric definition or a novel nAKI definition would need to be tested against hard, meaningful endpoints.22 Without this critical step, the field of nAKI could waste decades using an incorrect definition to describe the natural history of kidney disease, which could lead to failed nAKI intervention studies.

The use of SCr to define nAKI is particularly challenging due to the obligate cardiovascular and renal adaptations during the transition into the extra-uterine environment. At birth, neonatal SCr reflects maternal SCr. Over the next days, the neonate establishes their own SCr steady-state, which is primarily determined by innate kidney function. As kidney function varies greatly by GA, so will the SCr trajectories and steady-state levels.26 In this context, it may be normal for an extremely premature infant without nAKI to have a slight rise in SCr after birth. This was shown nicely in a study of extremely low birthweight infants in which 89% of infants born <26 weeks GA had a rise in SCr ≥0.3 mg/dL.14 In contrast, a healthy neonate with normal kidney function should have a steady decline in SCr after birth. The absence of this normal trajectory could signify a substantial injury. This was recently shown in a study of term asphyxiated neonates by Gupta et al., which compared clinical parameters (i.e., receipt of hemodynamic support) and kidney injury biomarkers between 3 groups of infants: (1) infants with nAKI (SCr ≥0.3 mg/dL), (2) infants whose SCr trajectories did not drop as expected but did not meet nAKI criteria, and (3) infants with normal SCr trajectories. They clearly showed that infants with abnormal SCr trajectories were more similar to those with nAKI than to those who had a normal SCr trajectory. This suggests that term infants without the normal expected SCr decline may have a significant kidney injury.27 For these reasons, it would not be surprising that the nAKI thresholds differ by GA.

Our analysis only explores the SCr criteria to define nAKI during the first postnatal week. Similar studies to determine the optimal SCr cutoff values after the first postnatal week will need to be performed. We do not explore the ability of the KDIGO UOP criteria to predict mortality, nor which UOP cutoffs are most predictive of mortality. Furthermore, SCr trends, the impact that fluid adjustment of SCr has on SCr cutoffs, and combinations of SCr and UOP thresholds are not explored in this manuscript. We acknowledge that the associations we make between SCr thresholds and mortality do not imply a causal relationship, but instead allow for a better understanding of the associations between SCr changes and mortality. These limitations should be addressed in large multi-center studies that have adequate data for their analysis.

The strengths of our study include a large sample size of infants, which allows us to test how different SCr metrics (absolute rise vs. percent rise vs. max value) predict mortality in different GA groups. Despite these strengths, we acknowledge several important limitations. The main limitation of the study is our limited ability to generalize across all infants. The main inclusion criterion for AWAKEN was that NICU infants must have received intravenous fluids for at least 48 h. One of the main reasons for this decision was to capture infants who were more likely to have ample SCr values to determine SCr-AKI status. Furthermore, in order to gain as much internal validity as possible for this analysis, we excluded infants who did not have at least one SCr in the first two postnatal days and at least one on postnatal days 3–8. These criteria selected infants who tended to be more ill (see Table 2). Studies that incorporate large cohorts with systematically collected SCr data on all NICU infants are needed to corroborate our findings and allow for generalizability to the entire NICU. We also acknowledge that although we chose patients who had at least 1 SCr in the early part and 1 in the later part of the week (and most had multiple values at both of these timepoints), not all infants had daily SCr values. In addition, choosing to delineate infants across only 3 GA groups could influence our findings, and we acknowledge that, for example, infants born at 24 weeks GA could have different optimal SCr thresholds than infants born at 28 weeks GA. Finally, we did not adjust for fluid status, and it is possible that fluid balance could alter the SCr values and the performance of the changes in SCr,28 independent of kidney functional changes.

In conclusion, this study provides insights into one of the most elusive, yet fundamental, questions in neonatal nephrology: how to best delineate a significant change in kidney function during the first week of life. This analysis suggests that optimal SCr thresholds that predict meaningful outcomes differ across GA groups, that absolute SCr rise outperforms percent SCr rise, and that incorporation of both metrics does not provide much added value. Maximum SCr value can provide very good specificity for mortality. In infants ≤29 weeks GA, the absolute SCr threshold of ≥0.3 mg/dL predicts mortality with high sensitivity (but poor specificity); in contrast, in infants ≥29 weeks GA, this threshold predicts mortality with high specificity (but poor sensitivity).

We caution investigators and researchers against changing the KDIGO nAKI definition based on this study alone. Instead, we encourage similar evaluation of other cohorts, at different timepoints. Studies designed to understand how UOP cutoffs and fluid-adjusted corrections can be used to better define neonatal AKI are greatly needed. Furthermore, multi-center prospective studies that capture SCr and UOP systematically and evaluate other outcome metrics (e.g., CKD and hypertension during childhood) are needed to corroborate and advance these findings. Similar studies on neonates who develop AKI after the first week of age are also greatly needed. Once these data are available, expert panels should be convened to make cohesive, evidence-based recommendations on the most meaningful nAKI definition for clinicians and researchers to use. Finally, although the term “AKI” has gained wide acceptance as synonymous with changes in SCr, patients can have functional changes in kidney function without true injury (e.g., dehydration), while others may have true tubular injury without a functional SCr change. Studies that also incorporate tubular injury biomarkers are greatly needed to better understand specific etiologies (acute tubular injury with an acute decrease in kidney function vs. acute decrease in kidney function without tubular injury).