Acute injury to the kidneys occurs in one in five patients in US hospitals1. It is a common condition in hospital patients because it can be caused by a number of factors, including abnormal blood pressure or blood volume. But the ability to predict whether or when acute kidney injury will happen is limited. For people who are at high risk of developing this condition, the standard clinical approach is daily assessment of their laboratory test results, including the concentration of creatinine in their blood, because high levels of this molecule are a hallmark of kidney problems.
Writing in Nature, Tomašev et al.2 report that an approach involving artificial intelligence makes it possible to identify impending acute kidney injury, for most patients, one or two days before the condition would be diagnosed using standard clinical tests. Kidney injury is usually spotted only at a late stage, when irreversible damage has occurred that could lead to death or the need for temporary or long-term dialysis. Being able to catch the condition early would be a major step forward in enabling effective treatment.
In the artificial-intelligence method known as deep learning, an algorithm is developed to identify patterns in the data that are associated with an outcome of interest, in this case the development of acute kidney injury. The authors used this approach on data collected between 2011 and 2015 from more than 700,000 adults treated in 172 hospitals and 1,062 outpatient clinics run by the US Department of Veterans Affairs, a health-care provider for military veterans and their families. The anonymized data for these individuals included demographics, electronic health records, laboratory test results, medications prescribed and records of procedures undergone. Tomašev and colleagues used these cumulative data to train their model by running a time-series analysis of around 6 billion data points and more than 600,000 recorded features. They chose a method for deep learning called a recurrent neural network, which is well suited to assessing sequential data inputs that are obtained over time.
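To make the idea of a recurrent network concrete, here is a minimal sketch in Python of a single recurrent unit that carries a hidden state from one time step to the next. It is an illustration only and not the authors' architecture: the function name `risk_trace`, the scalar weights and the input values are all hypothetical, and a real model would learn its weights from data rather than have them fixed by hand.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def risk_trace(
    inputs: List[float],
    w_x: float = 1.0,   # hypothetical weight on the current input
    w_h: float = 0.5,   # hypothetical weight on the previous hidden state
    b: float = 0.0,     # hypothetical bias of the hidden unit
    w_o: float = 2.0,   # hypothetical output weight
    c: float = -1.0,    # hypothetical output bias
) -> List[float]:
    """Return one risk-like value in (0, 1) per time step.

    The hidden state is updated from the current input and the previous
    hidden state, so each output depends on the whole history so far.
    """
    hidden = 0.0
    risks = []
    for x in inputs:
        hidden = math.tanh(w_x * x + w_h * hidden + b)
        risks.append(sigmoid(w_o * hidden + c))
    return risks


# The same measurements in a different order give different final outputs,
# which is the property that makes recurrent models suited to sequences.
early_spike = risk_trace([1.0, 0.0, 0.0])
late_spike = risk_trace([0.0, 0.0, 1.0])
```

In this toy setting, a spike in the most recent input yields a higher final value than the same spike two steps earlier, because the influence of older inputs decays as the hidden state is updated.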
The authors tested the system using data for individual patients that had been set aside for this purpose. They obtained computer-generated probability values that continuously traced the likelihood over time that any individual would develop acute kidney injury within the next 48 hours. If the probability exceeded a threshold value, the prediction was considered positive (Fig. 1). Checking whether the patient was subsequently diagnosed with the condition revealed the accuracy of the prediction. The authors’ model also provided an indication of the level of uncertainty for the probability value, enabling a doctor to assess the strength of the predictive signal.
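The thresholding and checking steps described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than the authors' evaluation code: the function names, the hourly time grid, the threshold of 0.3 and the risk values are all hypothetical, and an alert raised at or after the diagnosis is simply counted as a false positive here.

```python
from typing import List, Optional


def first_alert_hour(risk_by_hour: List[float], threshold: float) -> Optional[int]:
    """Return the first hour at which the predicted risk exceeds the threshold."""
    for hour, risk in enumerate(risk_by_hour):
        if risk > threshold:
            return hour
    return None  # the risk never crossed the threshold


def classify_alert(
    alert_hour: Optional[int],
    diagnosis_hour: Optional[int],
    window_hours: int = 48,
) -> str:
    """Compare an alert with the later clinical diagnosis, if there was one."""
    if alert_hour is None:
        return "false negative" if diagnosis_hour is not None else "true negative"
    # A positive prediction counts as correct only if the diagnosis
    # follows the alert within the prediction window.
    if diagnosis_hour is not None and 0 < diagnosis_hour - alert_hour <= window_hours:
        return "true positive"
    return "false positive"


# Hypothetical risk trace for one patient, one value per hour.
risk = [0.05, 0.08, 0.12, 0.31, 0.44]
alert = first_alert_hour(risk, threshold=0.3)       # alert raised at hour 3
outcome = classify_alert(alert, diagnosis_hour=40)  # diagnosed 37 hours later
```

Raising the threshold would produce fewer alerts and so fewer false positives, at the cost of missing more cases. Choosing the threshold is therefore a clinical trade-off and not a purely technical setting.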
Tomašev and colleagues’ approach is more accurate than other statistical or machine-learning methods that have been proposed for identifying impending kidney damage3,4. As might be expected, the prediction accuracy of the authors’ system was highest for people in hospital settings, where acute kidney injury is more frequent, has a more rapid onset and occurs within a shorter window of time than is typical in outpatient clinics. For all patients and any type of acute kidney injury, including less severe forms, the system correctly predicted 56% of cases. For more serious forms of the condition, it correctly predicted 84% and 90% of cases in people who subsequently required dialysis treatment within 30 and 90 days, respectively. The model’s accuracy was similar across the different health-care sites and throughout the time period studied.
The authors used a method called ablation analysis to determine the factors linked to the risk of developing acute kidney injury. They found many contributory factors, which might explain why trying to determine this risk has been a vexing task in the past.
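The general idea behind an ablation analysis is to remove one group of inputs at a time and see how much the model's performance drops. The sketch below illustrates that idea in Python. It does not reproduce the authors' procedure, which is not detailed here: the feature-group names, their contributions and the additive `toy_score` function are hypothetical stand-ins for retraining and re-evaluating a real model.

```python
from typing import Callable, Dict, List


def ablation_drops(
    score_fn: Callable[[List[str]], float],
    groups: List[str],
) -> Dict[str, float]:
    """Return the drop in score when each feature group is left out in turn."""
    baseline = score_fn(groups)
    drops = {}
    for group in groups:
        remaining = [g for g in groups if g != group]
        drops[group] = baseline - score_fn(remaining)
    return drops


# Hypothetical contributions, chosen only to make the example concrete.
HYPOTHETICAL_CONTRIBUTION = {
    "lab_tests": 0.25,
    "medications": 0.125,
    "demographics": 0.0625,
}


def toy_score(groups: List[str]) -> float:
    """Stand-in for 'train and evaluate a model on these feature groups'."""
    return 0.5 + sum(HYPOTHETICAL_CONTRIBUTION[g] for g in groups)


drops = ablation_drops(toy_score, list(HYPOTHETICAL_CONTRIBUTION))
# Rank feature groups from largest to smallest drop in score.
ranked = sorted(drops, key=drops.get, reverse=True)
```

In a real model the contributions are not additive, so removing two groups together can cost more or less than the sum of removing each alone. That interdependence is one reason many factors can each appear to matter a little.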
Consideration of the case of a hypothetical patient (Fig. 1) underscores the potential usefulness of the system developed by Tomašev and colleagues. This patient’s daily creatinine values gave no indication of acute kidney injury until their fourth day in hospital. By contrast, the authors’ system predicted organ damage two days earlier, giving more time for treatment interventions such as increasing the patient’s fluid intake, or avoiding the use of drugs that could cause kidney toxicity.
However, the authors’ system generated many ‘false positive’ predictions — predictions of injury that did not occur. For each accurate prediction, there were two false positives. Most of these occurred in people who had chronic kidney disease, which would make superimposed acute kidney injury more difficult to predict.
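The ratio of two false positives for each accurate prediction can be restated as the fraction of alerts that turn out to be correct, often called precision or positive predictive value. The short Python sketch below does that arithmetic; the function name is hypothetical, and the only input taken from the text is the ratio of two to one.

```python
def precision_from_false_alert_ratio(false_per_true: float) -> float:
    """Fraction of alerts that are correct, given false alerts per true alert.

    precision = TP / (TP + FP) = 1 / (1 + FP / TP)
    """
    return 1.0 / (1.0 + false_per_true)


# Two false positives per accurate prediction: one alert in three is correct.
precision = precision_from_false_alert_ratio(2.0)
```

On these figures, about one in three alerts would be followed by acute kidney injury, so the burden of unnecessary alerts on clinical staff is a practical consideration.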
A limitation of the authors’ work is that it is a retrospective study. There are examples of the use of artificial intelligence in retrospective studies of medical data in which the model’s accuracy declined when it was tested prospectively5. Such declines probably occur because dealing with data in a real-world clinical environment is more complicated than dealing with a ‘cleaned’, pre-existing data resource.
Prospective studies are essential for determining the true clinical value of a predictive system. Moreover, successful prediction is not the only factor that should be assessed. One way to determine whether these predictive warnings result in a reduction in acute kidney injury would be to carry out a clinical trial using a randomized design, in which only half of the predictions of impending injury are relayed to doctors. The authors’ model should also be tested to determine how well it works in other groups of patients. Moreover, less than 7% of Tomašev and colleagues’ study group were women. Whether the model’s ability to predict acute kidney injury differs depending on gender thus needs further investigation.
Although the authors’ system included a variety of data types, other data sources might also be valuable for inclusion. For example, it is possible that written notes in medical records, or continuous monitoring of vital signs, such as heart rate, from wearable sensors, might provide relevant information.
For patients who are not in an intensive-care unit, the standard monitoring approach is to take their vital signs once daily. However, all too often, the daily doctors’ rounds can reveal a patient who has suddenly become critically ill. Tomašev and colleagues’ study shows the benefit of being able to anticipate serious organ damage well before it occurs. Most predictive studies using artificial intelligence in a clinical context have previously focused on patient outcomes such as deaths, readmissions or the time spent in hospital6. The work by Tomašev et al. stands out by providing a prediction that might enable effective clinical intervention.
The use of deep learning has considerable promise as a way of alerting doctors to concerns about any organ. Its implementation will probably require a change in the medical mindset. But moving from infrequent, one-off tests to relying more on systems that allow continuous assessment might provide a better way of predicting what lies ahead for a patient.
Nature 572, 36-37 (2019)
Competing Financial Interests
E.J.T. is a member of the Verily Advisory Board.