Personalized predictions of patient outcomes during and after hospitalization using artificial intelligence

Hospital systems, payers, and regulators have focused on reducing length of stay (LOS) and early readmission, with uncertain benefit. Interpretable machine learning (ML) may assist in transparently identifying the risk of important outcomes. We conducted a retrospective cohort study of hospitalizations at a tertiary academic medical center and its branches from January 2011 to May 2018. A consecutive sample of all hospitalizations in the study period was included. Algorithms were trained on medical, sociodemographic, and institutional variables to predict readmission, LOS, and death within 48–72 h. Prediction performance was measured by area under the receiver operating characteristic curve (AUC), Brier score loss (BSL), which measures how well predicted probability matches observed probability, and other metrics. Interpretations were generated using multiple feature extraction algorithms. The study cohort included 1,485,880 hospitalizations for 708,089 unique patients (median age of 59 years, first and third quartiles (QI) [39, 73]; 55.6% female; 71% white). There were 211,022 30-day readmissions for an overall readmission rate of 14% (for patients ≥65 years: 16%). Median LOS, including observation and labor and delivery patients, was 2.94 days (QI [1.67, 5.34]), or, if these patients are excluded, 3.71 days (QI [2.15, 6.51]). Predictive performance was as follows: 30-day readmission (AUC 0.76/BSL 0.11); LOS > 5 days (AUC 0.84/BSL 0.15); death within 48–72 h (AUC 0.91/BSL 0.001). Explanatory diagrams showed factors that impacted each prediction.
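As a reading aid for the two headline metrics, the following is a minimal sketch of how a Brier score and an AUC can be computed from outcome labels and predicted probabilities. The function names and the toy values are illustrative assumptions and are not taken from the study.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Chance that a random positive case is scored above a random negative case
    (ties count as half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities (illustrative values, not study data).
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

toy_brier = brier_score(y_true, y_prob)
toy_auc = auc(y_true, y_prob)
```

Because a constant prediction equal to the outcome prevalence p scores p(1 - p), the Brier score is small for any rare outcome, so it is best read alongside a discrimination measure such as AUC.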


Supplementary Materials
) and for each predictive task (Supplementary Table 2).

Readmission
Readmission was defined as any new Cleveland Clinic (CCF) hospitalization starting at least 4 hours after any CCF discharge. For prediction of readmission, patients whose discharge disposition was "expired" were removed. Patients with an admission class of "observation" were retained, as it has been suggested that readmission reduction programs have resulted in increased use of the observational setting. 2,3 The 4-hour cutoff removed patients who were simply transferring from one CCF department or hospital to another, and was selected based on histogram analysis of first-day readmissions.
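The labeling rule above can be sketched as follows. This is a minimal illustration under stated assumptions: the record fields (`patient_id`, `admit`, `discharge`, `disposition`) are hypothetical names, the 30-day window is taken from the abstract, and an admission that begins less than 4 hours after a discharge is simply not counted as a readmission (the source does not say whether such transfers were merged into a single stay).

```python
from datetime import datetime, timedelta

MIN_GAP = timedelta(hours=4)   # shorter gaps are treated as transfers
WINDOW = timedelta(days=30)    # 30-day readmission window


def label_readmissions(stays):
    """Return {(patient_id, admit): bool} for every stay that did not end in death."""
    by_patient = {}
    for s in stays:
        by_patient.setdefault(s["patient_id"], []).append(s)

    labels = {}
    for pid, rows in by_patient.items():
        rows.sort(key=lambda s: s["admit"])
        for i, s in enumerate(rows):
            if s["disposition"] == "expired":
                continue  # deaths are removed from the readmission cohort
            # Readmitted if any later admission starts 4 hours to 30 days after discharge.
            labels[(pid, s["admit"])] = any(
                MIN_GAP <= later["admit"] - s["discharge"] <= WINDOW
                for later in rows[i + 1:]
            )
    return labels


# Hypothetical records, for illustration only.
stays = [
    {"patient_id": "A", "admit": datetime(2018, 1, 1, 8), "discharge": datetime(2018, 1, 3, 12), "disposition": "home"},
    {"patient_id": "A", "admit": datetime(2018, 1, 20, 9), "discharge": datetime(2018, 1, 22, 15), "disposition": "home"},
    {"patient_id": "B", "admit": datetime(2018, 2, 1, 10), "discharge": datetime(2018, 2, 2, 10), "disposition": "home"},
    {"patient_id": "B", "admit": datetime(2018, 2, 2, 11), "discharge": datetime(2018, 2, 4, 10), "disposition": "expired"},
]
labels = label_readmissions(stays)
```

Patient A's first stay is flagged as readmitted, while patient B's first stay is not, because the next admission begins only one hour after discharge.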

Length of Stay
Length of stay was defined as the time between the admission date and the discharge date for each hospitalization. Only variables available within 24 hours of admission were considered, with the exception of primary diagnosis code, and patients with an admission class of "observation" were removed.
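A minimal sketch of this outcome definition follows. It assumes admission and discharge are stored as timestamps so that LOS can be fractional (consistent with the non-integer medians reported in the abstract); the field names and the example records are hypothetical.

```python
from datetime import datetime


def length_of_stay_days(admit, discharge):
    """Elapsed time from admission to discharge, in (fractional) days."""
    return (discharge - admit).total_seconds() / 86400


def los_targets(stays, threshold_days=5):
    """Drop observation stays, then return (numeric LOS, LOS > threshold) per stay."""
    targets = []
    for s in stays:
        if s["admission_class"] == "observation":
            continue  # observation patients are excluded from LOS prediction
        los = length_of_stay_days(s["admit"], s["discharge"])
        targets.append((los, los > threshold_days))
    return targets


# Hypothetical records, for illustration only.
los_stays = [
    {"admit": datetime(2018, 3, 1, 6), "discharge": datetime(2018, 3, 4, 0), "admission_class": "inpatient"},
    {"admit": datetime(2018, 3, 1, 6), "discharge": datetime(2018, 3, 8, 6), "admission_class": "inpatient"},
    {"admit": datetime(2018, 3, 1, 6), "discharge": datetime(2018, 3, 2, 6), "admission_class": "observation"},
]
los_result = los_targets(los_stays)
```

The first stay lasts 2.75 days and falls below the 5-day threshold, the second lasts 7 days and exceeds it, and the observation stay is dropped.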

Death
Death within 48-72 hours of admission was defined as a recorded EHR, Social Security, or Ohio Death Index death date, or a discharge disposition of "expired," within the given time frame. Only variables available within 24 hours of admission were considered with the exception of primary diagnosis code, and patients with an admission class of "observation" were removed.
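The composite death definition can be sketched as a single labeling function. This is an illustration under assumptions: all dates are treated as timestamps (registry death dates are often date-only and would need conversion first), and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def died_within(admit, hours, death_dates=(), disposition=None, discharge=None):
    """True if any recorded death date, or an 'expired' discharge, falls within
    `hours` of admission."""
    cutoff = admit + timedelta(hours=hours)
    # Death dates may come from several sources; missing ones are passed as None.
    recorded = [d for d in death_dates if d is not None]
    if any(d <= cutoff for d in recorded):
        return True
    return disposition == "expired" and discharge is not None and discharge <= cutoff


# Hypothetical example: a death recorded 58 hours after admission.
admit_time = datetime(2018, 5, 1, 10)
registry_death = datetime(2018, 5, 3, 20)
within_48 = died_within(admit_time, 48, death_dates=[None, registry_death])
within_72 = died_within(admit_time, 72, death_dates=[None, registry_death])

# Hypothetical example: no death date on file, but an 'expired' discharge at 30 hours.
expired_48 = died_within(admit_time, 48, disposition="expired",
                         discharge=datetime(2018, 5, 2, 16))
```

Calling the function once per horizon (48 and 72 hours) yields one binary target per time frame.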

Machine Learning Models
We used Gradient Boosting Machines (GBMs) to predict binary and numeric outcomes of interest. Gradient boosting machines function by consecutively training decision trees to predict the outcome of interest. Each consecutive tree learns from the ensemble of predictions that came before it, and attempts to minimize the error of the current prediction. 4 The GBM implemented by LightGBM contains several optimizations that allow it to obtain robust models quickly. The chief optimization is on-the-fly binning of continuous variables into discrete buckets, in order to allow for more straightforward splitting of the decision tree. 5 As mentioned in the main text, Gradient Boosting Machines, and LightGBM in particular, allow for heterogeneous data input, including variables with a large number of categories, missing values, and zero values. They therefore do not require imputation, which is advantageous when the lack of a variable for a particular case is important (as in a patient who has never been admitted before and therefore has a "Length of stay of last admission" of "missing," rather than "zero," which may indicate something entirely different). Additionally, not every patient has the same set of labs drawn, and it would be inappropriate to impute values for these, especially if the imputation was based on the average or median value across the cohort, considering that many patients are likely to have abnormal values if the lab warranted being drawn.
GBMs also do not require scaling, rendering the output of the explanations more human-readable. Lastly, because nonlinear combinations of variables are probable in a large healthcare dataset, a tree-based method such as GBM may be more interpretable than a linear model. This is primarily because a linear model is likely to be more sensitive to "model mismatch," wherein its high-bias nature cannot adequately represent the underlying structure of the data, and so it may be more likely to report spurious associations even at a comparable accuracy. 6 We used Bayesian hyperparameter optimization, as available in the Python package hyperopt, to select hyperparameters for the main predictive targets.
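To make the boosting idea concrete, the following is a minimal pure-Python sketch and not LightGBM itself: each round fits a one-split tree (a stump) to the residuals of the current ensemble under squared error, and missing values (`None`) are routed to whichever side of the split fits better, so no imputation or scaling is needed. Exact split search is used in place of histogram binning, and all function names, feature meanings, and data values are illustrative assumptions.

```python
def fit_stump(X, residuals):
    """Best single split (feature, threshold, side for missing values) on the
    residuals, by squared error; each leaf predicts its mean residual."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X if row[j] is not None})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for missing_left in (True, False):
                left, right = [], []
                for row, r in zip(X, residuals):
                    v = row[j]
                    goes_left = missing_left if v is None else v <= thr
                    (left if goes_left else right).append(r)
                if not left or not right:
                    continue
                lm, rm = sum(left) / len(left), sum(right) / len(right)
                sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
                if best is None or sse < best[0]:
                    best = (sse, j, thr, missing_left, lm, rm)
    return best[1:]


def stump_predict(stump, row):
    j, thr, missing_left, lm, rm = stump
    v = row[j]
    goes_left = missing_left if v is None else v <= thr
    return lm if goes_left else rm


def fit_gbm(X, y, n_trees=50, learning_rate=0.1):
    """Start from the mean outcome, then add shrunken stumps fit to residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row) for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def predict(model, row):
    base, stumps, lr = model
    return base + lr * sum(stump_predict(s, row) for s in stumps)


# Toy data: [LOS of last admission in days (None = never admitted), age]; not study data.
X = [[None, 40], [0.0, 55], [2.0, 60], [8.0, 70], [None, 30], [10.0, 65]]
y = [1.0, 3.0, 3.5, 7.0, 1.2, 8.0]
model = fit_gbm(X, y)
```

Here a `None` in the first feature is kept distinct from `0.0`, which mirrors the "missing" versus "zero" distinction drawn above.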
We also used several comparator models, including a deep neural network within the PyTorch framework as implemented by fast.ai and several standard ML models from scikit-learn (sklearn). Standard data imputation and scaling techniques were applied to the data to allow ingestion by these models.
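The source does not state which imputation and scaling techniques were applied. As one common possibility, the sketch below assumes median imputation followed by z-score standardization of a single numeric column; the function name and values are hypothetical.

```python
from statistics import mean, median, pstdev


def impute_and_scale(column):
    """Fill missing values (None) with the observed median, then standardize
    to zero mean and unit variance."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    filled = [fill if v is None else v for v in column]
    mu, sigma = mean(filled), pstdev(filled)
    # Guard against a constant column, which has zero standard deviation.
    return [(v - mu) / sigma if sigma else 0.0 for v in filled]


# Hypothetical lab values with one missing entry.
scaled = impute_and_scale([1.0, None, 3.0, 5.0])
```

In a real pipeline the median, mean, and standard deviation would typically be estimated on the training split only and then reused for validation and test data.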
Interpretation of the final model
Interestingly, the primary diagnosis (Bipolar disorder) decreased the likelihood of readmission.
Another patient with a high probability of readmission within 30 days, primarily due to a diagnosis of hepatic failure, in addition to their BlockGroup GeoID, recent admission, and cancer diagnosis. Number of past admissions played a role, but a less extreme one compared to the example above.
Patient with a very low probability of readmission within 30 days, largely due to a diagnosis of osteoarthritis, a single prior admission that was over a year before the current admission, a short length of stay, and other variables as shown.
Another patient with a very low probability of readmission within 30 days, largely due to a very short length of stay for angioneurotic edema, no prior admissions or listed comorbidities, and a low number of listed medications on the day of discharge (all prescribed treatments are counted in this number).
b. Length of stay > 5 days
Young patient with MRSA-related sepsis transferred to our facility, with a nearly 100% probability of LOS > 5 days.
Patient with alcoholic liver cirrhosis (K7031) transferred to our hospital with a pressure ulcer and a lengthy prior admission, with a non-recorded BMI and low systolic blood pressure, but no ICU admission. Assigned a ~90% probability of long LOS.
Relatively young patient with Type 2 Diabetes admitted for ketoacidosis, who had never been admitted before. Assigned a ~15% probability of LOS > 5 days.