Validation of a survival benefit estimator tool in a cohort of European kidney transplant recipients

Pre-transplant prognostic scores help to optimize donor/recipient allocation and to minimize organ discard rates. Since most of these scores come from the US, direct application in non-US populations is not advisable. The Survival Benefit Estimator (SBE), built upon the Estimated Post-Transplant Survival (EPTS) and the Kidney Donor Profile Index (KDPI), has not been externally validated. We aimed to examine SBE in a cohort of Spanish kidney transplant recipients. We designed a retrospective cohort study of deceased-donor kidney transplants carried out in two Spanish hospitals. Unadjusted and adjusted Cox models were applied for patient survival. Predictive models were compared using Harrell’s C statistic. SBE, EPTS and KDPI were independently associated with patient survival (p ≤ 0.01 in all models). Model discrimination measured with Harrell’s C statistic ranged from 0.57 (KDPI) to 0.69 (SBE) and 0.71 (EPTS). After adjustment, SBE presented calibration and discrimination power similar to those of EPTS. SBE tended to underestimate actual survival, mainly among high EPTS recipients/high KDPI donors. SBE performed acceptably well at discriminating post-transplant survival in a cohort of Spanish deceased-donor kidney transplant recipients, although its use as the main allocation guide, especially for high KDPI donors or high EPTS recipients, requires further testing.

Survival estimation and actual survival. Five-year patient survival was calculated using the prognostic score developed by Bae et al. 12, available at https://www.transplantmodels.com/kdpi-epts/. Briefly, SBE estimates the survival benefit (i.e. the absolute reduction in mortality risk expressed in percentage points) that a given recipient would obtain from receiving a kidney from a specific donor, using a combination of EPTS and KDPI scores.
Actual post-transplant patient survival was defined as the time from KT to death, with censoring at the end of the study. All patients were followed up long enough to establish 60-month survival, regardless of allograft status (functioning or not).
The following donor variables were used to calculate KDPI: age, race, height, weight, hypertension, diabetes, serum creatinine, hepatitis C seropositivity and cause of death, using the methods described on the OPTN site. KDPI scores range from 0 to 100% and indicate the potential longevity and quality of a kidney graft compared to a reference population 9. EPTS was calculated using recipient variables including age, diabetes, time on dialysis and previous solid organ transplant, following the methods described on the OPTN site. EPTS scores vary from 0 to 100%. Patients with lower EPTS scores should experience more years of graft function from high-longevity kidneys compared to subjects with higher EPTS scores 11. To examine the association of SBE score with patient survival, we stratified our sample by SBE quintiles. Adjusted models were built using SBE, EPTS or KDPI and other recipient variables of clinical relevance (sex, hypertension, ischemic heart disease, congestive heart failure, peripheral vascular disease, stroke and hepatitis C status). Variables already considered in the calculation of KDPI or EPTS scores were not included in this last analysis.
Scientific Reports | (2020) 10:17109 | https://doi.org/10.1038/s41598-020-74295-3 www.nature.com/scientificreports/
Statistical analysis. Data are expressed as mean and standard deviation (SD), or as median and range for non-normal distributions. Qualitative variables are described as frequencies and percentages. Continuous variables were compared using t tests and analyses of variance. The chi-square test was used to compare categorical data between groups.
To address missing values, we first applied Little's missing completely at random (MCAR) test to the multivariate quantitative data 22; the test was non-significant (P > 0.05), consistent with the MCAR assumption. Multiple imputation by fully conditional specification was then applied to the dataset. A total run length of 1000 iterations was used, creating 10 imputed data sets to account for the uncertainty associated with missing data. All variables needed to calculate EPTS and KDPI scores were included in the procedure (donor age, race, height, weight, hypertension, diabetes, serum creatinine, hepatitis C seropositivity and cause of death; recipient age, diabetes, time on dialysis and previous solid organ transplant). Final results were averaged across the created sets.
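The pooling step can be illustrated with a minimal sketch. The text only states that results were averaged across the 10 imputed sets; the variance combination shown here follows Rubin's rules, which is an assumption about how the uncertainty was propagated, not a description of the software actually used. The function name `pool_rubin` and the input values are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets (Rubin's rules).

    estimates: the parameter estimate from each imputed data set
    variances: the squared standard error from each imputed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching inputs from at least two imputed sets")
    pooled = mean(estimates)        # point estimate: plain average across sets
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (n - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Hypothetical log hazard ratios and squared standard errors from 10 imputed sets
log_hr = [0.010, 0.012, 0.011, 0.009, 0.013, 0.011, 0.010, 0.012, 0.011, 0.011]
se_squared = [0.000004] * 10
pooled_log_hr, pooled_se = pool_rubin(log_hr, se_squared)
```

The pooled standard error is always at least as large as the average within-set standard error, which is how the extra uncertainty due to missing data enters the final confidence intervals.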
To compare the original SBE cohort with our study cohort, means and standard deviations were estimated from the reported medians and ranges according to the method developed by Hozo et al. 23. Quintiles of the SBE score (Q1: 0-75; Q2: 75.1-83.5; Q3: 83.6-88.8; Q4: 88.9-93; Q5: 93.1-100) were used to obtain groups of comparable size. Overall survival probabilities for each group were estimated using the Kaplan-Meier method and compared by the log-rank test. Hazard ratios (HRs) with 95% confidence intervals (CIs) were calculated using unadjusted and adjusted Cox proportional hazards regression modelling. To ensure adequate confidence interval coverage and type I error rate, as defined by Vittinghoff et al. 24, the study sample included at least ten events per predictor variable. Model discrimination was estimated using Harrell's C statistic 25, a concordance index that estimates the proportion of comparable patient pairs in which the predicted and observed survival orderings agree. Harrell's C ranges from 0.5 (no discrimination) to 1 (perfect discrimination). The Benjamini-Hochberg procedure was used to control the false discovery rate 26. We also examined the Akaike information criterion (AIC), a measure based on in-sample fit that estimates how well a model is likely to predict future values. The statistical significance level was set at p < 0.05. Statistical analysis was performed using IBM SPSS v.
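As an illustration of the concordance index described above, the following is a minimal sketch of Harrell's C for right-censored data. It is not the SPSS implementation used in the study: the function name `harrell_c` is hypothetical, pairs with tied follow-up times are simply skipped, and tied risk scores receive half credit. The function expects a risk score in which higher means worse prognosis, so a predicted survival percentage such as SBE would first need to be reversed (for example, 100 minus the predicted survival).

```python
def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up time for each subject
    events: 1 if death was observed, 0 if the subject was censored
    risk_scores: higher value means higher predicted risk of death
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            # Comparable pair: subject i died strictly before subject j's time
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk died first: concordant
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical example: follow-up in months, event indicator, predicted risk
example_c = harrell_c(
    times=[10, 20, 30, 40, 60],
    events=[1, 1, 0, 1, 0],
    risk_scores=[0.9, 0.4, 0.5, 0.3, 0.1],
)
```

In this toy data set 7 of the 8 comparable pairs are ordered correctly, giving a C of 0.875.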

Results
Baseline characteristics. A total of 1200 KT were performed in two Spanish hospitals between January 2000 and July 2015. The final analysis included 935 KT patients. The patient flow chart is shown in Fig. 1.
A comparison of baseline characteristics between Bae's cohort and our study cohort is summarized in Table 1.
Spanish recipients and donors were older and predominantly Caucasian (92.2%). Spanish recipients showed a lower prevalence of diabetes, less time on dialysis before KT and lower rates of re-transplantation. The proportion of pre-sensitized patients was higher in Bae's cohort, while KDPI scores were higher and EPTS scores were lower in the Spanish cohort.
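Because the original SBE cohort was reported as medians and ranges, this comparison depends on converting those summaries to approximate means and standard deviations (reference 23). A minimal sketch of that conversion follows. The sample-size cutoffs are the commonly cited recommendations from that paper and should be checked against the reference; the function name `hozo_mean_sd` and the example values are hypothetical.

```python
def hozo_mean_sd(minimum, median, maximum, n):
    """Approximate mean and SD from the median, range and sample size.

    Cutoffs (assumed from the recommendations of Hozo et al.):
    - mean: (min + 2 * median + max) / 4 for n <= 25, otherwise the median
    - SD: variance formula for n <= 15, range / 4 for n <= 70, else range / 6
    """
    if not minimum <= median <= maximum:
        raise ValueError("expected minimum <= median <= maximum")
    if n <= 25:
        mean = (minimum + 2 * median + maximum) / 4
    else:
        mean = median
    value_range = maximum - minimum
    if n <= 15:
        var = ((minimum - 2 * median + maximum) ** 2 / 4 + value_range ** 2) / 12
        sd = var ** 0.5
    elif n <= 70:
        sd = value_range / 4
    else:
        sd = value_range / 6
    return mean, sd


# Hypothetical large cohort: ages with median 52 and range 18 to 78
approx_mean, approx_sd = hozo_mean_sd(18, 52, 78, n=1000)
```

For large samples the approximation reduces to the median and one sixth of the range, so it is sensitive to extreme values at either end of the reported range.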
The study sample was divided into five groups according to SBE value distribution. Table 2 summarizes recipient, donor and transplant related data.
Patients with worse estimated post-transplant survival were older, had longer dialysis vintage and suffered more frequently from diabetes or cardiovascular disease. These patients received kidneys from older donors with a higher prevalence of hypertension and diabetes. There was a higher percentage of KT from cardiac-death donors among recipients with better estimated post-transplant survival.

Distribution of KDPI and EPTS scores in the Spanish cohort. In the Spanish cohort, 36.6% of donors presented with KDPI > 80%, while a small proportion of recipients (10.3%) had high EPTS scores (EPTS > 80). The majority of recipients with EPTS > 85 (62.9%) received a kidney from a KDPI > 85 donor. The distribution of KDPI and EPTS scores in our cohort is shown in Fig. 2.
Comparison of 5-year predicted vs actual survival. SBE was used to estimate 5-year patient survival at the time of transplantation. The study sample was divided into quintiles using the results of the SBE score. Figure 3 compares the average estimated survival rate with the actual survival in each SBE quintile.
Actual survival rates in our cohort were higher than predicted ones in the first and second highest and lowest SBE quintiles.
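The comparison within a quintile can be sketched as the difference between the observed Kaplan-Meier survival at 60 months and the mean SBE-predicted survival. This is a minimal illustration under the assumption that observed survival was read from the Kaplan-Meier estimate at 5 years; the function names and data are hypothetical and do not come from the study.

```python
def km_survival_at(times, events, horizon):
    """Kaplan-Meier survival probability at `horizon` (same units as times)."""
    survival = 1.0
    # Distinct times at which at least one death occurred, up to the horizon
    death_times = sorted({t for t, e in zip(times, events) if e == 1 and t <= horizon})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk
    return survival


def calibration_gap(times, events, predicted_pct, horizon=60):
    """Observed minus mean predicted survival, in percentage points.

    A positive value means the score underestimated actual survival.
    """
    observed_pct = 100 * km_survival_at(times, events, horizon)
    mean_predicted_pct = sum(predicted_pct) / len(predicted_pct)
    return observed_pct - mean_predicted_pct


# Hypothetical quintile: 8 recipients, follow-up in months, 2 deaths
months = [12, 30, 60, 60, 60, 45, 60, 60]
died = [1, 1, 0, 0, 0, 0, 0, 0]
predicted = [70.0] * 8  # hypothetical SBE-predicted 5-year survival (%)
gap = calibration_gap(months, died, predicted)
```

In this toy example observed survival is 75% against a predicted 70%, a gap of 5 percentage points in the direction of underestimation.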
Although patients with the worst SBE-estimated 5-year survival (1st quintile) had a median EPTS score 23 percentage points higher and a median KDPI score 14 percentage points higher than the 2nd quintile group, actual survival did not differ significantly between these two groups (log-rank: 1.665; P = 0.197). The difference remained non-significant after adjustment for multiplicity.
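The multiplicity adjustment named in the Statistical analysis section is the Benjamini-Hochberg procedure. A minimal sketch of how adjusted p-values are obtained is given here; the function name and the family of p-values are hypothetical, since the full set of comparisons entering the adjustment is not listed in the text.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical family of five pairwise log-rank p-values
raw_p = [0.001, 0.008, 0.039, 0.041, 0.2]
adjusted_p = benjamini_hochberg(raw_p)
```

A comparison is declared significant when its adjusted p-value falls below the chosen false discovery rate, here 0.05 to match the significance level stated in the methods.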

Discrimination and calibration of the SBE, EPTS and KDPI models are summarized in Table 3 and were tested with Harrell's C statistic for Cox models. Discrimination was poor when using the KDPI model, with a Harrell's C of 0.57. Using the SBE and EPTS scores improved discrimination compared to the KDPI model, with Harrell's C values of 0.69 and 0.71, respectively. Finally, additional adjustment for clinically significant variables further improved discrimination, to maximum C-statistics of 0.72 and 0.74 for the adjusted SBE and EPTS models, respectively.

Discussion
Clinical tools that aim to optimize the selection of donor-recipient pairs and to estimate the potential survival benefit of a specific organ for a specific recipient have recently been developed.
The present study evaluated the predictive performance of SBE in terms of 5-year patient survival after kidney transplant, including patients who lost the graft during that time. In our sample, SBE, EPTS and KDPI scores were associated with 5-year patient survival, and they remained significant after multivariable adjustment for recipient comorbidities (Table 3).
In our study SBE proved to be an independent predictor of post-transplant survival in both unadjusted and adjusted analyses. SBE showed acceptable discriminatory performance, with a C-statistic of 0.69 that was close to that of EPTS (0.71) and higher than that of KDPI (0.57). Surprisingly, the calculated C-statistic for the SBE score in our sample was higher than that obtained in its development sample (0.69 vs 0.64) 12. However, in this European population SBE tended to underestimate actual survival, mainly in the subsets of patients with either the highest or the lowest SBE scores, reflecting limited utility of this tool in these specific populations. Subjects within the worst and second-worst quintiles according to SBE-estimated 5-year survival did not demonstrate significant differences in actual survival (Fig. 3). This is not an isolated finding; in a sample of KT performed in a Eurotransplant center, Schulte et al. described a mortality rate 16 percentage points higher in recipients with an EPTS score between 61 and 80 compared to those with an EPTS score > 80 27. The absence of differences in survival between the worst and second-worst quintiles in our sample could be due to a higher prevalence of cardiovascular disease (i.e. congestive heart failure or peripheral vascular disease) in the second-worst quintile compared with the worst (Table 2), which is not considered when calculating the EPTS or SBE scores. Conceivably, donor and recipient risk factors may affect post-transplant survival differently in non-US populations, with older age possibly being the predominant risk trait in our Spanish cohort. Additionally, these two groups were too small to provide adequate power to detect a difference of 5.2 percentage points in mortality rates between the worst and second-worst quintiles.
EPTS score showed moderate discrimination in the original US data, with a C-statistic of 0.69 28 . Clayton et al. evaluated this score in 4983 kidney transplant recipients from the Australia and New Zealand Dialysis and Transplant (ANZDATA) Registry. The authors analyzed three different Cox models (EPTS only; EPTS plus donor age, hypertension status and HLA-DR mismatch; EPTS plus log-Kidney Donor Risk Index) and reported a Harrell's C-statistic of 0.67, 0.68 and 0.69 for each model respectively 18 . We found similar discrimination power with a C-statistic of 0.71 in unadjusted analysis that improved to 0.74 when relevant clinical recipient variables were added to the model. EPTS was also tested as a prognostic tool after deceased-donor KT in Mexican patients, with an area under the receiver operating characteristic curve of 0.64 29 .
KDPI showed the lowest survival predictive power among the tools. This was an expected finding, as KDPI and Kidney Donor Risk Index were essentially designed as tools to estimate graft durability, but their potential application as predictors of post-transplant patient survival has been tested before. We found a 1.1% higher risk of mortality per percentage point of KDPI increase, and poor 5-year patient survival prediction capacity with a C-statistic of 0.57 in unadjusted analysis that improved to 0.65 when additional recipient variables were added to the analysis. Similar results were obtained by Peters-Sengers et al., reporting 5-year mortality C-statistic for Kidney Donor Risk Index (including deaths after graft loss) of 0.68 30 . Calvillo-Arbizu et al. also suggested that KDPI could constitute a potential indicator of patient survival, especially for recipients older than 60 years 31 . However, in another study with Spanish transplant patients no relationship was found between KDPI score and recipient death 17 .
The observed disparities between our results and American studies could be explained by several relevant demographic differences between the US and Spain. For instance, more than 90% of our donors and recipients were Caucasian, whereas the cohort used in the development of SBE was much more ethnically diverse 12. In addition, Spanish recipients were older but had a lower prevalence of diabetes, shorter dialysis vintage and a lower degree of sensitization. Cold ischemia time was also shorter in our sample. These differences translated into a lower EPTS score compared to that of Bae's study. On the other hand, our donors were older and more frequently had diabetes and hypertension, which translated into a higher KDPI compared to that registered in the SBE development sample 12.
Differences in transplant era may also explain discrepancies in score performance, as pointed out by Rose et al. 32. The Cox proportional hazards regression model used to develop the Kidney Donor Risk Index was obtained from a sample of US kidney recipients from 1995-2005. Donor and recipient characteristics have changed significantly over the last 20 years, both in the US and in Europe 33,34, as have other factors such as dialysis-associated mortality or waitlist removal 35. In addition, the increasing prevalence of chronic kidney disease, the growing demand for organ donations and the incorporation of expanded criteria donors as a viable organ source have further modified the transplant landscape in recent decades.
Differences also affect transplant policies and organ allocation. Between 2004 and 2015 more than 50% of the retrieved kidneys with KDPI > 85% were discarded in the US, even after the initiation of the KDPI era 36,37. Although there are no official data in Spain regarding KDPI-associated discard rates, because KDPI is not routinely used, Arias-Cabrales et al. reported that 35.5% of patients in their study received a kidney from a KDPI > 85 donor 17, a percentage that was slightly lower (30.6%) in our current sample.
Results may also depend on other donor, recipient and procedure characteristics not accounted for in the calculation of SBE, KDPI or EPTS scores, such as graft damage or abnormalities, excessive first warm ischemia time, additional recipient comorbidities or the likelihood of disease transmission during transplantation. Additionally, differences in health care systems between the US and Spain could explain some of the observed discrepancies. More than 70% of US kidney transplant patients report financial difficulty in accessing immunosuppressive medication 38, with Medicare coverage for those drugs being lost three years after transplantation 39. In comparison, the transplant procedure, follow-up monitoring and all associated medication are completely covered in Spain and other European countries 40, regardless of transplant duration.
The present analysis has some limitations. Extensive efforts were undertaken to adjust for potential confounding, but residual confounding due to unmeasured variables is still possible. Moreover, most of our sample consisted of Caucasians. Therefore, our results cannot be extrapolated to subjects of other ethnic groups. Likewise, most of the donors in this study were brain-dead so the studied scores may perform differently in other settings such as donation after cardiac death. Additionally, we only tested the predictive capabilities of the SBE score at the 5-year time point. As a consequence, our results should not be extrapolated to other time points. Finally, organ allocation strategies and policies may diverge considerably between countries and, to a lesser extent, between transplant centers, which may affect the predictive potential of scores such as SBE.
The present study also has several strengths. To our knowledge, this is the first study that offers an external validation analysis of the SBE score in a European population. SBE scoring considers both donor and recipient characteristics to build a post-transplant survival estimation. Although the discrimination power was similar to that of EPTS in our sample, because recipient traits are the most relevant determinants of recipient survival, the inclusion of donor-associated variables provides a small but important improvement to survival prediction, thanks to their significant effect on future graft function. Our sample was built using data from two different transplant centers and included survival information regardless of actual graft function.
In sum, this is one of the first studies to provide external validation of the use of the SBE score as a standalone score in a non-US sample of kidney transplant recipients. Further analysis should be performed to adequately characterize its potential to produce accurate and individualized post-transplant survival predictions.

Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.