Development and validation of a risk prediction model for end-stage renal disease in patients with type 2 diabetes

The aim of this study is to develop a prediction model for ESRD in patients with type 2 diabetes. A retrospective cohort study was conducted, consisting of 24,104 Chinese patients with type 2 diabetes. We adopted the procedures proposed by the Framingham Heart Study to develop a prediction model for ESRD. Participants were randomly assigned to the derivation and validation sets at a 2:1 ratio. The Cox proportional hazard regression model was used for model development. A total of 813 and 402 subjects (5.06% and 5.00%, respectively) developed ESRD in the derivation and validation sets over a mean follow-up period of 8.3 years. The risk-scoring systems included age, gender, age of diabetes onset, combined statuses of blood pressure and anti-hypertensive medication use, creatinine, variation in HbA1c, variation in systolic blood pressure, diabetes retinopathy, albuminuria, anti-diabetes medications, and combined statuses of hyperlipidemia and anti-hyperlipidemia medication use. The area under curves of 3-year, 5-year, and 8-year ESRD risks were 0.90, 0.86, and 0.81 in the derivation set, respectively. This risk score model can be used as screening for early prevention. The risk prediction for 3-year, 5-year, and 8-year period demonstrated good predictive accuracy and discriminatory ability.

nephropathy. One study targeting participants with type 2 diabetes without diabetic nephropathy focused on the composite outcome of major kidney-related event, including doubling of serum creatinine to 2.26 mg/dl, renal replacement therapy, or renal death 15 . Composite measure has the advantage of being more powerful due to low incidence rates of events, but it has the disadvantage of being less specific. Because the prevention strategy for kidney disease, ESRD, or renal death was different, there is a need to develop prediction model for ESRD. No prior model has been developed for patients with type 2 diabetes on ESRD outcome or in the Chinese population. In addition, prior studies reported that visit-to-visit variation in plasma glucose and blood pressure were associated with ESRD and these two factors haven't been considered in prediction models of ESRD 16,17 . This deficiency results in a need to construct a prediction model of ESRD risk in Chinese patients with type 2 diabetes. With a large cohort of 24,104 participants enrolled from clinical centers in the entire country, the Taiwan Diabetes Study (TDS) cohort represents a significant opportunity to examine whether the characteristics of patients at baseline were associated with ESRD incidence. This prediction model considered risk factors that are generally available in clinical practice and are precisely measured to ensure that this score system is acceptable in clinical practice. In the present study, we develop a point-based risk prediction model for incident ESRD using TDS data.

Results
Taiwan Diabetes study included 24,104 type 2 diabetic patients aged 30-84 years old. These patients were free of ESRD at baseline and had been followed-up for 8.3 years. We randomly assigned 16,070 patients into the derivation set and 8,034 patients into the validation set. A total of 813 (5.06%) and 402 (5.00%) ESRD incident cases were found in the derivation and validation sets, respectively. Table 1 shows the baseline characteristics of participants according to derivation and validation sets. The mean age of participants in the derivation and validation sets was both about 61 years; 47% of the samples were male. All baseline socio-demographic, clinical, and laboratory variables had standardized effect size values less than 0.1, demonstrating that variables in the two sets were distributed similarly.
In crude Cox's proportional hazard analysis, significant factors were observed for age groups of 60-64 years    Table 2). Table 3 shows the adjusted regression coefficients, and means or proportions of significant risk factors that remained in the final multivariate Cox's proportional hazards model. ESRD risk score of each factor was defined as 5 times the regression coefficient of age. The total risk score ranged from −16 to 97. The 3-, 5-, and 8-year risks of ESRD were estimated for total scores followed by the algorithm developed in the Framingham heart study ( Table 4).
The areas under receiver operating curves (AUCs) of the final prediction model for 3-, 5-, and 8-year ESRD risks were 0.91, 0.86, and 0.81, respectively in the derivation set and were 0.92, 0.87, and 0.80, respectively in the validation set ( Fig. 1). Our prediction model demonstrated excellent discrimination ability (i.e., AUCs ≥ 0.80) both in the derivation and validation sets. Figure 2 shows the calibration plots, showing the actual and predicted ESRD events according to deciles of 3-, 5-, and 8-year risks in the validation set. The ESRD events increased steadily from the sixth to 9 th deciles and then increased sharply at 10 th decile. The results of Hosmer-Lemeshow x 2 tests for 3-, 5-, and 8-year risk revealed no significant differences between the observed and predicted ESRD events in the validation set (all p > 0.05). The internal validation of the present model performance was assessed based on 1000 samples from bootstrap resampling. The optimism corrected calibration intercept was 0.009 with a mean absolute error of 0.00008 and the corresponding slope was 0.92 with a mean absolute error of 0.00191. These statistics indicate pretty good calibration for the present model and there is no need for shrinkage of regression coefficients in the prediction model.
We use multiple imputation method to handle missing data in HbA1c-CV and SBP-CV based on information on age, age of diabetes onset, and duration of type 2 diabetes in the sensitivity analysis. A total of 52,528 patients were included for this sensitivity analysis. The adjusted regression coefficients estimated from dataset with multiple imputation were similar to those from dataset with complete dataset (Supplementary Table 2). We used our risk score to predict ESRD risk for this imputed dataset. We found that the AUC values were 0.90 (95% CI: 0.88−0.91), 0.84 (0.83-0.85), and 0.79 (0.78-0.80) for 3-, 5-, and 8-year periods, respectively. These results were similar to those in our original risk model. Our risk prediction model was robust and not sensitive to missing data.

Discussion
Our study established an ESRD risk prediction model in a nationwide cohort of patients with type 2 diabetes. We identified independent factors, including age, age of diabetes onset, combined statuses of blood pressure and anti-hypertensive medication, creatinine, HbA1c-CV, SBP-CV, diabetes retinopathy, albuminuria, anti-diabetes medication, and combined statuses of hyperlipidemia and antihyperlipidemia medications. The dominance of creatinine, diabetic retinopathy, combined statuses of blood pressure and antihypertensive medication, combined statuses of hyperlipidemia and anti-hyperlipidemia medication, and age in predicting risk is evident. The subsequent factors in terms of dominance were age at diabetes onset, albuminuria, and the clinical management of diabetes (such as HbA1c-CV) and blood pressure (such as SBP-CV). These predictors reflect both the intensity and duration of pathology associated with diabetes and its vascular complications. Utilizing a nationwide cohort from all clinical settings and backgrounds enabled us to confirm the importance of eGFR, albuminuria, other glucose and blood pressure variation indicator and complications in predicting the risk of ESRD events.
Creatinine is a clinical measurement of renal function and albuminuria is a measure of kidney damage. They are also major factors in our prediction model. Several studies have reported that serum albumin, serum creatinine, and eGFR are strongly associated with ESRD 10-14, 18-28 . Most of the prediction models for ESRD included these important factors; these factors were considered essential in predicting ESRD 10-14, 24, 28 . In addition to the assessment of markers of renal function, our model examined the contribution of socio-demographic factors, laboratory variables, diabetes-related factors, and medication use to increase prediction ability.
HbA1c and BP measurements were performed routinely for diabetes care. HbA1c reflects the average of blood glucose levels, but not glucose fluctuations. Visit-to-visit variation in HbA1c measured by HbA1c-CV could provide long-term fluctuation of blood glucose. Monitoring HbA1c during follow-up visits facilitated the availability of this glucose variation measure. Prior studies demonstrated that glucose variation is an independent predictor of ESRD 16,29 . High BP may increase pressure on the kidney, which may cause kidney damage. Existing literature reports that poor blood pressure control is associated with increased ESRD risk 30,31 . Similar results are observed in our study. We found that variation in HbA1c and SBP, and baseline BP are strong predictors of ESRD risk. Both variation in HbA1c and SBP were the novel predictors that were first considered in the risk score prediction model.
The ADVANCE study enrolled 11,140 type 2 diabetes patients aged 55 years or older in 20 counties for 5 years. This study predicted two kidney-related outcomes, namely, doubling of serum creatinine to >= 2.26 mg/dL, renal replacement therapy, or renal death and new-onset albuminuria 15 . The TREAT study was a prospective cohort study nested within a randomized clinical trial to establish ESRD risk and composite of death or ESRD risk models using serum levels of the cardiac biomarkers TnT and NT-pro-BNP in 4038 patients with type 2 diabetes, anemia, and CKD 14 . The RENAAL study was also a prospective cohort study nested within a multinational randomized control trial to develop ESRD risk scores in 1,513 patients with type 2 diabetes and nephropathy 13 . The former two prediction models demonstrated good discrimination ability (C statistic of 0.84-0.85) 14,15 , whereas the RENAAL study did not report statistics for discriminatory ability. The predictors identified in these three studies, such as age, creatinine, blood pressure, albuminuria, diabetic retinopathy, and insulin use, were similar to those in our model. In addition to these factors, our prediction model also considers HbA1c variation, one of key indicator for diabetes care. Besides blood pressure level, our prediction model integrates information of anti-hypertensive medication and variation in SBP. Our prediction model showed excellent discrimination ability (AUCs of 0.90). The frequency of 3-, 5-and 8-year observed and predicted events were similar in the derivation and validation sets and in the Hosmer-Lemeshow x 2 test. Our prediction models fit the data with good prediction ability.
Risk prediction models can provide individual estimates of risks to specific diseases and serve as guidance for the clinical management of high risk patients. The use of the risk scores for risk stratification can be helpful to clinicians or health policy providers to provide interventions such as diets to modify behaviors to reduce ESRD risk and health cost. It has been reported that dietary intervention improved renal function measured by eGFR in moderately obese persons with moderate impaired renal function 35 . Several disease risk models have been developed and validated in Taiwan; these models are targeted for diseases including stroke in general population 36 , type 2 diabetes 37 , cirrhosis and hepatocellular carcinoma [38][39][40] , gastric cancer 41 , and periodontal disease 42 . There was no risk prediction model being established to determine ESRD in patients with type 2 diabetes in Taiwan. The present study developed a risk score to predict the 3-, 5-, and 8-year ESRD risk in patients with type 2 diabetes in Taiwan.

Strengths and limitations.
Our study has the advantages of a relatively large group of patients with type 2 diabetes and a long-term follow-up period of 8.3 years. Our study took into account a wide range of potential risk factors. Baseline macro-and microvascular diseases were evaluated and chosen in our prediction model. Medical management for the control of diabetes, hypertension, and hyperlipidemia were also considered in our model. Baseline renal function, albuminuria, and visit-to-visit variation in HbA1c and SBP play principal roles in our prediction model. The present study showed excellent discrimination ability (AUCs of 0.90) for ESRD risk score in Chinese patients with type 2 diabetes.
Our research has five limitations. First, we obtained data on renal function, HbA1c-CV, SBP-CV, medication data, and risk factors from 2001 to 2003, but we did not had data after 2003. Exposure status may have changed during the follow-up period. In addition, the ESRD risk score was not able to predict ESRD risk longer than 8 years because of unavailability of data for ESRD. Second, the NDCMP database did not contain any physical activity, diet, genetic risks, and cardiac biomarkers that may be associated with ESRD risk. Third, although these patients were prospectively followed up by care managers, we had no control over the nature and the quality of the measurements that were made because the study design was retrospective. In addition, the diagnosis of comorbidity is based on ICD codes, which were dependent on the diagnostic accuracy of our database. The accuracy of diagnoses in the database had been improved by the routine check on samples of medical charts conducted by the bureau of NHI in Taiwan 43 . The punishment on every false diagnostic report was severe. Further, to ensure accurate diagnosis of comorbidity, we included only those cases with at least three outpatient visits or at least one hospitalization. Therefore, the prevalence of comorbidity could be somewhat underestimated. This kind of underestimation might be random. Because there is no evidence indicates that undiagnosed comorbidity is associated with ESRD. Thus, this error results in the effect may be toward the null, a lesser threat to validity. Fourth, our study had NDCMP participants as study subjects, which may result in selection bias. In order to assess the possibility of selection bias, we assessed age and gender distributions between NDCMP participants and population with type 2 diabetes in Taiwan. We found similar distributions and non-differential distributions in age and sex demonstrated this kind of selection error might be random, resulting in the effect toward the null, a lesser threat to results' validity. Last, the primary outcome measure, ESRD, was ascertained by ICD-9-CM code in catastrophic illness certification from the Registry for Catastrophic Illness database of NHI program. Because each individual registered in the catastrophic illnesses database is exempted from any co-payment for treatment, the process for evaluating applicants' eligibility for this registry is strict and comprehensive. For the case of ESRD, the catastrophic illness certification was issued by a nephrologist and confirmed by another nephrologist. Under this condition, the possibility of ESRD cases identified in this study being true positive are high and there is a likelihood of underestimation of ESRD incidence. If a true association exists between each factor and ESRD incidence, then this type of underestimation would result in reduction in the power of the study for detecting such an association. Even though the power of the present study may be lessened, the prediction model has very good predictive and discriminatory ability.
In conclusion, our study developed 3-, 5-, and 8-year ESRD risk scores with good prediction accuracy and discriminatory ability. Our study need additional sample for further external validation. Our results demonstrate the risk prediction model is a useful screening tool for preventing ESRD in Chinese patients with type 2 diabetes. This tool can guide clinicians in planning intervention and providing information for policy makers to set up public health strategies to prevent ESRD and reduce healthcare costs. Study subjects. A total of 63,084 enrolled diabetic patients were diagnosed with type 2 diabetes based on the criteria of American Diabetes Association in the NDCMP from 2002 to 2004. We included patients who had at least 1 year of follow-up for calculation of visit-to-visit variation in glycated hemoglobin (HbA1c) and blood pressure, and without ESRD at baseline or missing information regarding baseline characteristics of comorbidities and laboratory blood test results. Figure 3 shows the flowchart of the recruitment procedure in the study. Baseline factors were com_pared between patients included and those excluded using standardized mean differences. We observed most of standardized mean differences were less than 0.1 standard deviations (SD), indicating a negligible difference in proportions or means between included and excluded patients. Although the standardized mean differences of fasting plasma glucose, eGFR, and variation in HbA1c were greater than 0.  Table 3. ESRD risk score based on the final multivariate Cox's proportional hazards model. set (n = 16, 070) and a validation set (n = 8,034) at a 2:1 ratio. This study was approved by the Human Research Committee of China Medical University Hospital and all methods were performed in accordance with the relevant guidelines and regulations. Informed consent of the study participants was not required because the dataset used in this study consists of de-identified secondary data released for research purposes.
Variables. Independent variables included sociodemographic factors, lifestyle behaviors such as smoking and alcohol drinking, diabetes-related factors and biomarkers, comorbidity, and medication use. Three types of personal information (sociodemographic factors, lifestyle behaviors, and diabetes related factors and biomarkers) were obtained at baseline from NDCMP database by case managers who did not know the objectives of the present study objective. The coefficient of variation (CV) for HbA1c and SBP measurements from outpatient visits within the first year of index date for each patient were calculated for those with more than two HbA1c and SBP measurements in the first year. To adjust for the possibility that the number of visits might affect variation, the CV of HbA1c and SBP was divided by square root of the ratio of total visits divided by total visits minus 1. Comorbidity and medication use of individual patients were retrieved from NHIRD database for 1-year period after index date. Comorbidity was determined by at least three service claims for outpatient care, or one service claim for inpatient care.   , and albuminuria (ICD-9-CM code 719.0 or urinary albumin-to-creatinine ratio ≥30 mg g −1 creatinine) from NDCMP were identified. History of medication included anti-diabetes medications (no medication, oral anti-diabetes drug (OAD), insulin monotherapy, and insulin plus OAD), hypertension medications (ACE inhibitors, ARBs, β-blockers, calcium channel blockers, and diuretics), cardiovascular medication (antiarrhythmic, anticoagulants, antiplatelet, digoxin, and nitrates) and hyperlipidemia medications (statins and fibrates). The primary outcome was ESRD incidence, which was ascertained from catastrophic illness certification (ICD-9-CM code 585 with V45.1) database. In Taiwan, the diagnosis of ESRD is based on tests and exams including a discussion of health history, physical exam, blood and urine tests, and imaging test for assessment of kidney's structure and size to look for abnormalities. The classification of chronic kidney disease adopts the National Kidney Foundation's (NKF) Kidney Disease Outcomes Quality Initiative (KDOQI) staging criteria according to estimated glomerular filtration rate (eGFR) and the presence of kidney damage 44,45 . The Modified Diet in Renal Disease (MDRD) formula is used as the equation for estimating GFR 44 . An individual is defined as ESRD case if his/her level of eGFR is less than 15 mL/min/1.73 m 2 , accompanying by signs and symptoms of uremia, or he/she needs for initiation of kidney replacement therapy (dialysis or transplantation) for treatment for complications of decreased GFR. We identified individuals who experienced ESRD from 1 year after their enrollment in NDCMP  to 31 December 2011. This approach enabled the elimination of potential inverse causality. To enable patients with ESRD to obtain certificates from NHI and to be registered in the catastrophic illness certification database, they had to be diagnosed by at least two nephrologists who will assert their ESRD status. Given this requirement, patients who have been diagnosed with ESRD were less likely to be false positives. In addition, the validity of administrative claim data was assessed by the expert reviews performed by NHI Bureau on random samples of every 50 to 100 outpatient and inpatient claims in each hospital and clinic. Because the outcome ascertainment had been done in clinical setting, health professionals assessing the outcome status were not aware of the objective of the study, i.e., outcome assessment being blind.
Statistical analysis. Proportions were presented for categorical variables. Means and standard deviations (SDs) were presented for continuous variables. The effect sizes were calculated to describe baseline characteristics of derivation and validation sets. The hazards ratios of predictor variables were estimated using Cox's proportional hazards models to develop a prediction model of ESRD in the derivation set and to assess the model's predictive accuracy in the validation set. We have four steps to select independent variables that result in a "best" model 46 . First, we conducted a careful univariable analysis of each variable. Second, we selected variables with univariable test of a p-value < 0.25 47, 48 as a candidate for our multivariable model. Third, we constructed a multivariable model with candidate variables without collinearity. In addition, only variables with p-value < 0.05 were retained. Lastly, after refining a main effects model, we checked assumption of Cox's proportional hazard model for all variables in our multivariate model. We verified the proportional hazards assumption by the graph of the log (−log(survival)) versus the log of the survival time graph and by comparing the regression coefficients from models censored at 3, 5, and 8 years.
The steps for predictive model development were based on Framingham Heart study in the determination of the ESRD risk score 49 . The seven steps were as follows: (1) estimating the parameters of the multivariable Cox's proportional hazards model with backward elimination approach for model building strategy; (2) grouping the risk factors into categories and determine their reference values W ij ; (3) assigning a score for each category to determine the referent risk factor profile; a base category for each risk factor has a 0 score; (4) determining the distance from the base category to each category in regression units; (5) setting the constant B, which is the number of regression units that reflect 1 point in the final points system; (6) calculating the number of points for each category of each risk factor, where Point ij = (W ij − W iREF )/B; and (7) determining the prediction risks for all possible total scores. The risk of ESRD was calculated by the following equation is the baseline disease-free probability, β i is the regression coefficient for X i , and the X i is the mean level of X i . The calculation of constant B depends on age considered in multivariate model. If we consider eGFR as a risk factor in the prediction model, it will result in collinearity issue due to overlapped information of age and eGFR. Thus, serum creatinine is considered in our study instead of eGFR. There were six continuous variables being categorized in step 2. Age was classified into categories by 5 years for an interval, the same as Framingham study; variations in HbA1c and systolic blood pressure were based on their tertiles; total cholesterol was based on NCEP ATP III criteria; systolic and diastolic blood pressure were based on classification of blood pressure for WHO/ISH reports 50 ; serum creatinine was according to proposed classification/staging system for acute kidney injury based on modification of RIFLE criteria 51 . The receiver operating characteristic (ROC) curve analysis was applied to assess the predictive accuracy, and area under curve (AUC) was used to assess the discriminatory ability of the predictive model. Goodness-of-fit was performed by comparing the observed and predicted events of ESRD based on risk groups of deciles using the Hosmer-Lemeshow x 2 test. Internal validation was carried out to correct the potential for overfitting or "optimism" by using 1000 times bootstrap resampling 52 . Model calibration was conducted to evaluate the agreement between model-predicted probabilities and observed probabilities. We used calibration-in-large approach to calculate the intercept for evaluation of the extent whether predictions are systematically too low or too high. When the value of intercept is close to zero, it indicates that there is no systematic deviation of estimation of predicted probabilities. In addition, we calculate the calibration slope, an estimate of extremeness of predicted probabilities to test whether its value deviates from the ideal of 1.0. If the value of slope is close to one, it would reflect there is no overfitting of a model, i.e., there is no condition that the probability of an ESRD event tends to be underestimated in low risk patients and overestimated in high risk patients. Furthermore, the mean absolute error in calibration for slope and intercept were reported for assessing calibration, with error referring to the difference between the observed values and the bias-corrected calibrated values. We used a multiple imputation method to impute missing data for sensitivity analysis. A total of 28,424 subjects were imputed for missing data of variation in HbA1c and systolic blood pressure because of too short duration to calculate these two measurements. The method we used is multiple imputation method with a fully conditional specification (FCS) method, assuming the existence of a joint distribution for baseline variables of age, age of diabetes onset, duration of type 2 diabetes, and variation in HbA1c and systolic blood pressure, and regression analysis as imputation method. We performed statistical analysis using SAS version 9.4 (SAS Institute Inc., Cary, NC). Two-tailed p < 0.05 denotes statistical significance.