Prediction model for early graft failure after liver transplantation using aspartate aminotransferase, total bilirubin and coagulation factor

This study was designed to build models predicting early graft failure after liver transplantation. A Cox regression model for predicting early graft failure after liver transplantation using post-transplantation aspartate aminotransferase, total bilirubin, and international normalized ratio of prothrombin time was constructed based on data from both living donor (n = 1153) and deceased donor (n = 359) liver transplantations performed from 2004 to 2018. The model was compared with the Model for Early Allograft Function Scoring (MEAF) and the early allograft dysfunction (EAD) criteria using the C-index and time-dependent area under the curve (AUC). The C-index of the model for living donor (0.73, CI = 0.67–0.79) was significantly higher than those of both MEAF (0.69, P = 0.03) and EAD (0.66, P = 0.001), while the C-index for deceased donor (0.74, CI = 0.65–0.83) was significantly higher only than that of EAD (0.66, P = 0.002). Time-dependent AUCs at 2 weeks for living donor (0.96, CI = 0.91–1.00) and deceased donor (0.98, CI = 0.96–1.00) were significantly higher than those of EAD (both 0.83, P < 0.001 for living donor and deceased donor). Time-dependent AUC at 4 weeks for living donor (0.93, CI = 0.86–0.99) was significantly higher than those of both MEAF (0.87, P = 0.02) and EAD (0.84, P = 0.02). Time-dependent AUC at 4 weeks for deceased donor (0.94, CI = 0.89–1.00) was significantly higher than those of both MEAF (0.82, P = 0.02) and EAD (0.81, P < 0.001). The prediction model for early graft failure after liver transplantation showed high predictability and validity, with higher predictability than traditional models for both living donor and deceased donor liver transplantation.

Statistical analysis. The prediction models were built using variables that are clinically familiar and relevant. Two models were constructed, one each for LDLT and DDLT. After the models were built, they were compared with the MEAF score and the EAD criteria in terms of C-index and time-dependent area under the curve (AUC) at 2 weeks and 4 weeks 14,15. The MEAF score was calculated as described in the previous study by Pareja et al. 15 The comparisons were performed using the R packages 'compareC' and 'timeROC'. The chosen modeling process was then validated. Internal validation was performed with 20 repetitions of fivefold cross-validation, using the R package 'survAUC' to calculate the C-statistic and the AUC estimator proposed by Uno et al. 16 Calibration plots were drawn to validate the models using 1000 bootstrap resamples of the same size as the original data. Decision curve analysis was performed to evaluate the clinical usefulness of the models: a decision curve of the net benefit was drawn, and the range of positive net benefit was analyzed.
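To make the C-index used in these comparisons concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is not the 'compareC' or 'survAUC' implementation and it is not Uno's estimator; the function name, the handling of tied times, and the toy data are assumptions made only for illustration.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (simplified sketch).

    times       -- follow-up time for each recipient
    events      -- 1 if graft failure was observed, 0 if censored
    risk_scores -- model output; a higher score means a higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b]:
            continue  # tied times are skipped in this simplified sketch
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier failure had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from the study cohort).
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_c = harrell_c_index(toy_times, toy_events, [0.9, 0.1, 0.4, 0.7])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of failure times by predicted risk.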
Statistical analyses were performed using SPSS 20.0 (IBM, Chicago, IL, USA), SAS v9.4 (SAS Institute Inc, Cary, NC, USA), and R 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria) using packages 'rmda' for decision curve analysis and 'rms' for drawing a calibration plot.
Ethical approval. This study was approved by the Institutional Review Board (IRB) of Samsung Medical Center (IRB No. 2020-02-013).
Informed consent. The need for informed consent was waived by the IRB of Samsung Medical Center due to the retrospective nature of this study. Investigational methods used in this study were implemented in accordance with the relevant guidelines and regulations of the IRB. www.nature.com/scientificreports/

Results
Characteristics of the patient group.
The predicted survival probabilities from a Cox proportional hazards model for a set of covariates X may be estimated by the equation below, where S_0(t) is the Breslow estimator of the baseline survival function and β is the vector of estimated regression coefficients.

$$ S(t \mid X) = S_0(t)^{\exp(X\beta)} $$

Calibration plot. Calibration plots of the ABC models at 2 weeks and 4 weeks were drawn using 1000 bootstrap resamples. Figure 2 shows the calibration plots of the ABC models for both LDLT and DDLT. The predicted and actual survival probabilities showed reasonably good agreement for the ABC models for both LDLT and DDLT.
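As an illustration of how a predicted survival probability follows from a Cox model's baseline survival and linear predictor, the sketch below raises the baseline survival to the power exp(Xβ). The function name, the baseline survival value, and the coefficient are hypothetical placeholders and are not the fitted values of the ABC model; whether covariates are centered before fitting depends on the software and is also an assumption here.

```python
import math


def predicted_survival(baseline_survival, coefficients, covariates):
    """Cox predicted survival: S(t | X) = S0(t) ** exp(X . beta).

    baseline_survival -- S0(t) at the time point of interest (e.g. 2 weeks)
    coefficients      -- dict of regression coefficients (hypothetical values)
    covariates        -- dict of covariate values with the same keys
    """
    if not 0.0 <= baseline_survival <= 1.0:
        raise ValueError("baseline survival must lie in [0, 1]")
    linear_predictor = sum(
        coefficients[name] * covariates[name] for name in coefficients
    )
    return baseline_survival ** math.exp(linear_predictor)


# Hypothetical example: a single covariate whose hazard ratio is 2.
example_survival = predicted_survival(
    baseline_survival=0.9,
    coefficients={"x": math.log(2)},
    covariates={"x": 1.0},
)
```

With a hazard ratio of 2, the baseline survival of 0.9 is squared, giving a predicted survival of about 0.81; the predicted probability of graft failure is one minus this value.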

Decision curve analysis.
To evaluate the clinical usefulness of the ABC model, decision curves were computed to calculate the net benefit. Figure 3 shows the decision curves of the ABC models for LDLT and DDLT. At both 2 weeks and 4 weeks, and for both LDLT and DDLT, the decision curve remained above the zero-benefit line, indicating an expected benefit from using the models.
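The net benefit plotted in a decision curve can be sketched as follows. This is a simplified binary-outcome version that ignores censoring, not the 'rmda' implementation used in the study; the function names and the toy data are assumptions for illustration only.

```python
def net_benefit(y_true, predicted_risk, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    net benefit = TP/n - FP/n * threshold / (1 - threshold)
    y_true holds 1 for graft failure by the time point and 0 otherwise.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    pairs = list(zip(y_true, predicted_risk))
    true_pos = sum(1 for y, p in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in pairs if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference curve: net benefit if every patient is treated as high risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data (hypothetical, not from the study cohort).
toy_outcomes = [1, 0, 0, 1, 0]
toy_risks = [0.8, 0.1, 0.3, 0.6, 0.7]
toy_nb_model = net_benefit(toy_outcomes, toy_risks, threshold=0.5)
toy_nb_all = net_benefit_treat_all(toy_outcomes, threshold=0.5)
```

A decision curve repeats this calculation over a range of thresholds; a model is useful at a given threshold when its net benefit exceeds both zero (treat none) and the treat-all reference.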
Time-dependent AUC curves of ABC model. Time-dependent AUC curves of the ABC models are illustrated in Fig. 4. With the reference line set at an AUC of 0.75, the time-dependent AUC remained above the reference line until 1 year in LDLT and until around 250 days in DDLT.

Discussion
Due to improvements in surgical skills, optimization of immunosuppression, and postoperative intensive care, the outcome of LT has improved over the decades, and the graft failure rate has significantly decreased. However, there are still recipients who experience graft dysfunction and require timely decision making regarding re-transplantation. New, well-functioning liver grafts for those experiencing graft dysfunction are not always available, creating an urgent need for re-transplant criteria. The OPTN criteria are used as guidance in allocating deceased donor livers, although they are limited in allocating new grafts to patients with potential graft failure. Several studies have built prediction models for graft failure. Although such studies showed improvement in prediction, there is no consensus on a definitive model for predicting graft failure. This study was designed to build a prediction model for graft survival using simplified variables in one of the largest cohorts studied.

Nonfunctioning livers usually show a similar pattern of laboratory values. AST and ALT peak at days 1 and 2 post-LT, respectively, and gradually decrease thereafter; there can be additional peaks when the graft is injured by mechanisms such as hypotension. The pattern is similar in successful grafts, but the maximum AST and ALT indicate the extent of graft injury. Since AST and ALT show a similar trend during graft dysfunction, we decided to include only one of them in the model. On the other hand, the TB level changes slowly and gradually increases along the clinical course in failing grafts. The initial TB level depends on the pre-LT TB level and on transfusions, which are performed intensively during the initial post-LT period. Therefore, both successful and failing LTs show a decreasing pattern in the initial period, while failing LTs then show a gradual increase.
Patterns of INR level are most similar between successful and failing grafts, although the levels are higher in nonfunctioning grafts and remain higher during the post-LT course. However, the time point and level of the peak may vary among LT cases. Therefore, peak AST after LT and maximum TB and INR after the early post-LT period are important regardless of day. This is why we built a model that uses the maximum AST of the post-LT period and the maximum values of TB and INR starting from day 3 post-LT.

The ABC model was built based on LT data from 1153 LDLTs and 359 DDLTs. Separate analyses were performed for LDLT and DDLT because of their different clinical characteristics. While LDLT uses a partial graft with less ischemic injury, DDLT usually uses a whole graft with a considerable amount of ischemic injury. The laboratory values after LT also differ between LDLT and DDLT, as presented in Table 1. AST, TB and PT/INR are higher in DDLT than in LDLT in the initial period. The resulting prediction models had AUCs of 0.96 and 0.98 for predicting graft failure within 2 weeks and 0.93 and 0.94 for predicting graft failure within 4 weeks, for LDLT and DDLT, respectively. The ABC model is also intuitive, as it uses the maximum values of AST, TB, and INR during the first week to predict early graft failure.

Table 2. Comparisons of C-index, time-dependent AUC at 2 weeks, and time-dependent AUC at 4 weeks between Cox proportional hazard regression models using MEAF score, EAD criteria and newly developed multivariable model for predicting graft survival of recipients who underwent living donor liver transplantation. HR hazard ratio, AUC area under the curve, CI confidence interval, MEAF modeling early allograft function, EAD early allograft dysfunction, AST aspartate aminotransferase, TB total bilirubin, INR international normalized ratio.
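The variable selection described in the Discussion (maximum AST over the post-LT period, maximum TB and INR from day 3 onward) can be sketched as a small summarizing function. The data layout, the function name, the 7-day window, and the laboratory values below are assumptions for illustration; they are not taken from the study's supplementary calculator.

```python
def abc_model_inputs(daily_labs, tb_inr_start_day=3, last_day=7):
    """Summarize daily post-LT labs into the three peak values.

    daily_labs maps post-LT day (int) to a dict with keys "ast", "tb", "inr".
    The 7-day window and the dict layout are assumptions for illustration.
    """
    # Keep only measurements inside the assumed first-week window.
    window = {day: labs for day, labs in daily_labs.items() if day <= last_day}
    # TB and INR are only considered from the start day onward.
    late = [labs for day, labs in window.items() if day >= tb_inr_start_day]
    if not window or not late:
        raise ValueError("need labs in the window and on or after the start day")
    return {
        "max_ast": max(labs["ast"] for labs in window.values()),
        "max_tb": max(labs["tb"] for labs in late),
        "max_inr": max(labs["inr"] for labs in late),
    }


# Hypothetical laboratory course for one recipient (not study data).
example_labs = {
    1: {"ast": 1500, "tb": 9.0, "inr": 2.5},
    2: {"ast": 900, "tb": 7.0, "inr": 2.0},
    3: {"ast": 600, "tb": 4.0, "inr": 1.6},
    4: {"ast": 400, "tb": 4.5, "inr": 1.5},
    5: {"ast": 300, "tb": 5.0, "inr": 1.7},
    6: {"ast": 250, "tb": 5.5, "inr": 1.4},
    7: {"ast": 200, "tb": 6.0, "inr": 1.3},
}
peaks = abc_model_inputs(example_labs)
```

In this toy course the early TB and INR values on days 1 and 2 are ignored, so the summary reflects the later rise in TB rather than the initial, transfusion-influenced level.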
The model was compared with previously published models, namely the MEAF score and the EAD criteria. In terms of C-index and time-dependent AUC at 2 weeks and 4 weeks, the ABC model showed superior performance compared to the other two models. The ABC model differs from other models in that it is optimized for both LDLT and DDLT. While the EAD criteria and MEAF score were modeled on DDLT, our model consists of two versions using the same variables. The prediction probability can be calculated easily if the clinician knows the maximum AST, TB, and INR during the post-LT period, by entering the values into our supplementary Excel document, which is well calibrated to the retrospective cohort of our institution. Our prediction calculator provides not only the probability of graft failure at a certain time point after LT but also the graft survival curve, which gives visual information useful to both clinicians and patients. The limitation of our study is that it is based on data from a single institution. The model was based on a cohort of predominantly LDLT cases, and the DDLT model included 359 cases. The EAD criteria and MEAF score were based on DDLT, which is performed more commonly worldwide. The EAD criteria have been extensively validated, while the ABC model is only at its starting point. Nevertheless, our study showed high validity during internal validation; therefore, good results are expected in external validation with other cohorts. The use of two separate models built with the same statistical approach is also a strength of our study. Although many countries perform LT mainly as DDLT, there are still many countries with a significant number of LDLTs. The ABC model will serve as a good tool for predicting early graft failure after LDLT.
Whether it is advantageous to use the ABC model instead of traditional measures is up to the clinicians. While our statistical data showed superior performance of the ABC model compared to the two models, some clinicians might consider the two traditional measures more useful, since they also showed good statistical performance and were validated by other investigators. Our model was based on single-institution data consisting of Korean patients, which is expected to differ from the cohorts used for other models. Therefore, we encourage other investigators to externally validate the ABC model. The currently applied criteria for primary nonfunction suggested by OPTN have served as a good decision tool. However, the criteria are quite restrictive; in countries like the Republic of Korea, where donation from deceased donors is less common than in other countries, many patients with graft failure are unable to undergo re-LT with a liver from a deceased donor. Our prediction model provides objective data on the probability of graft survival, which can guide patient selection among those requiring urgent re-LT, even within the first week after LT. In the future, the ABC model should be validated in other cohorts.

Table 3. Comparisons of C-index, time-dependent AUC at 2 weeks, and time-dependent AUC at 4 weeks between Cox proportional hazard regression models using MEAF score, EAD criteria and newly developed multivariable model for predicting graft survival of recipients who underwent deceased donor liver transplantation. HR hazard ratio, AUC area under the curve, CI confidence interval, MEAF modeling early allograft function, EAD early allograft dysfunction, AST aspartate aminotransferase, TB total bilirubin, INR international normalized ratio.