A prediction nomogram for the 3-year risk of incident diabetes among Chinese adults

Identifying individuals at high risk for incident diabetes could help achieve targeted delivery of interventional programs. We aimed to develop a personalized prediction nomogram for the 3-year risk of diabetes among Chinese adults. This retrospective cohort study included 32,312 participants without diabetes at baseline. All participants were randomly stratified into a training cohort (n = 16,219) and a validation cohort (n = 16,093). The least absolute shrinkage and selection operator (LASSO) model was used to construct a nomogram and derive a formula for diabetes probability. Receiver operating characteristic (ROC) curve analysis and decision curve analysis, each with 500 bootstrap resamples, were used to assess the nomogram's discrimination and clinical use, respectively. In total, 155 and 141 participants developed diabetes in the training and validation cohorts, respectively. The area under the curve (AUC) of the nomogram was 0.9125 (95% CI, 0.8887–0.9364) and 0.9030 (95% CI, 0.8747–0.9313) for the training and validation cohorts, respectively. In an external validation with 12,545 Japanese participants, the AUC was 0.8488 (95% CI, 0.8126–0.8850). The internal and external validation showed that our nomogram had excellent prediction performance. In conclusion, we developed and validated a personalized prediction nomogram for the 3-year risk of incident diabetes among Chinese adults, which identifies individuals at high risk of developing diabetes.


Materials and methods
Study design and participants. The data were obtained from a public, non-profit computerized database established by the Rich Healthcare Group in China, namely, the 'DATADRYAD' database (www.Datadryad.org). We downloaded the raw data shared by Chen et al. 29 from: Association of body mass index and age with incident diabetes in Chinese adults: a population-based cohort study. Dryad Digital Repository. http://dx.doi.org/10.1136/bmjopen-2018-021768. The raw data are publicly available for use. The original study enrolled 685,277 participants ≥ 20 years old with at least two routine health checks from 2010 to 2016 across 32 sites and 11 cities in China (Shanghai, Beijing, Nanjing, Suzhou, Shenzhen, Changzhou, Chengdu, Guangzhou, Hefei, Wuhan, Nantong).
The original study initially included all participants at least 20 years old with at least two routine health checks between 2010 and 2016. Participants were excluded at baseline in the original study for the following reasons: (1) no available information on weight, height, or gender; (2) extreme BMI values (< 15 kg/m² or > 55 kg/m²); (3) visit intervals < 2 years; (4) no available fasting plasma glucose value; (5) diabetes diagnosed at baseline (by self-report or by a fasting plasma glucose ≥ 7.0 mmol/L) or undefined diabetes status at follow-up. A total of 211,833 participants remained after applying the exclusion criteria in the original study. To predict the 3-year risk of incident diabetes, our study further excluded participants with missing values for baseline variables. Figure 1 depicts the participant selection process. Finally, our study included 32,312 subjects (20,995 male and 11,317 female) for secondary analysis.
The study was conducted in accordance with the Declaration of Helsinki and patient consent was not required, referencing the original study article 30 . www.nature.com/scientificreports/ glucose oxidase method. The clinical measurements of FPG, TC, TG, LDL-C, HDL-C, BUN, Scr, and ALT were performed on an autoanalyzer (Beckman 5800). The data were collected under standardized conditions and conducted following uniform procedures. Laboratory methods also were carefully standardized through stringent internal and external quality controls.
Definitions. Incident diabetes was defined as a fasting blood glucose ≥ 7.00 mmol/L and/or self-reported diabetes during follow-up. Patients were censored either at the time of diagnosis or at the last visit, whichever came first.
Statistical analysis. All participants were randomly stratified into the training cohort and the validation cohort. Baseline characteristics were expressed as means ± standard deviations (normal distribution) or medians (quartiles) (skewed distribution) for continuous variables and as frequencies and percentages for categorical variables. Differences between the training cohort and the validation cohort were analyzed with two-sample t-tests for normally distributed continuous variables, Wilcoxon rank-sum tests for non-normally distributed continuous variables, and chi-square tests for categorical variables. A standardized difference of less than 0.10 for a given covariate indicates a relatively small imbalance 31. We also showed the baseline characteristics of the training and validation cohorts stratified by incident diabetes status. After collinearity screening, logistic regression models were used to assess the significance of each variable and to identify the independent risk factors for developing diabetes. Risk factors reported in the literature as associated with incident diabetes were candidates for the multivariate analysis 26-28,32-35.
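The balance check between the training and validation cohorts can be illustrated with a short sketch. The article does not state which formula was used for the standardized difference, so the version below assumes the common pooled-standard-deviation form; the function names and the toy values are hypothetical and are not taken from the study data.

```python
import math


def standardized_difference_continuous(x_a, x_b):
    """Absolute standardized difference of means between two groups.

    Assumes the pooled-SD form: |mean_a - mean_b| / sqrt((var_a + var_b) / 2).
    """
    mean_a = sum(x_a) / len(x_a)
    mean_b = sum(x_b) / len(x_b)
    # Sample variances with n - 1 in the denominator.
    var_a = sum((v - mean_a) ** 2 for v in x_a) / (len(x_a) - 1)
    var_b = sum((v - mean_b) ** 2 for v in x_b) / (len(x_b) - 1)
    pooled_sd = math.sqrt((var_a + var_b) / 2)
    return abs(mean_a - mean_b) / pooled_sd


def standardized_difference_binary(p_a, p_b):
    """Absolute standardized difference between two proportions (0 < p < 1)."""
    pooled_sd = math.sqrt((p_a * (1 - p_a) + p_b * (1 - p_b)) / 2)
    return abs(p_a - p_b) / pooled_sd


# Toy BMI values (hypothetical) for two small groups.
bmi_training = [22.1, 24.3, 23.0, 25.8, 21.9]
bmi_validation = [22.4, 24.0, 23.3, 25.1, 22.2]
d = standardized_difference_continuous(bmi_training, bmi_validation)
is_balanced = d < 0.10  # threshold quoted in the text
```

Unlike a P value, the standardized difference does not shrink or grow with sample size alone, which is one reason it is often reported alongside hypothesis tests in large cohorts.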
To find a simple and reliable risk prediction model, we established four models for comparison. First, we applied all risk factors to build a full model. Second, we conducted a backward step-down selection process according to the Akaike information criterion (AIC) to establish a parsimonious model (stepwise model) 36. Third, using the multivariable fractional polynomials (MFP) algorithm, we iteratively determined the significant variables and their functional forms by backward elimination to establish a stable model (MFP model) 37. Fourth, the least absolute shrinkage and selection operator (LASSO) method, which is suitable for reducing high-dimensional data, was applied to select the most useful prediction candidates 24,25. Candidates with non-zero coefficients were selected to establish the LASSO model 38. Because the LASSO model retained fewer variables while its prediction performance remained relatively good, we chose the LASSO model for further analysis. To evaluate and compare the discriminatory power of these prediction models, we plotted the receiver operating characteristic (ROC) curve and calculated the area under the ROC curve (AUC) with 95% confidence intervals (CI) in the training cohort and validation cohort, respectively. We also presented the sensitivity, specificity, accuracy, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR) of these four models, calculated according to standard definitions. Sensitivity  39 . The effect of the variable with the highest β coefficient (absolute value) is assigned 100 points. The points are added across independent variables to derive total points, which are converted to predicted probabilities of developing diabetes. The nomogram score is a numeric value representing the prediction model score of the individual patient.
Sensitivity and specificity for predicting diabetes differ across cut-off values of the nomogram score. In addition, we compared the predicted risk with the observed 3-year incidence across deciles of predicted diabetes risk for the training cohort. The predicted and actual risks in each decile were compared by the Hosmer-Lemeshow χ² test 40. Decision curve analysis was conducted to determine the clinical use of the risk prediction model for diabetes: the proportion of persons with a false-positive result, weighted by the relative harm of false-positive versus false-negative results, was subtracted from the proportion of persons with a true-positive result to obtain the net benefit of making a decision 41. Bootstrapping with 500 resamples was applied to the ROC curve, nomogram, and decision curve analysis to decrease overfitting bias 27,42. We also used ROC curves to analyze the performance and optimal cut-off value of each risk factor for incident diabetes in the LASSO model. In addition, we used a cohort of 12,545 Japanese participants from the NAGALA (NAfld in the Gifu Area, Longitudinal Analysis) database for external validation. The data were also extracted from the 'DATADRYAD' database (www.Datadryad.org), shared by Okamura et al. 43 from: Ectopic fat obesity presents the greatest risk for incident type 2 diabetes: a population-based longitudinal study. Dryad Digital Repository. https://doi.org/10.1038/s41366-018-0076-3. We also performed a sensitivity analysis on the overall population of the original study (n = 211,833), using multiple imputation to replace the missing values. All results are reported according to the TRIPOD statement 44.
All analyses were performed with the statistical software package R (http://www.R-project.org, The R Foundation) and EmpowerStats (http://www.empowerstats.com, X&Y Solutions, Inc, Boston, MA). The tests were 2-tailed, and P < 0.05 was taken as statistically significant.
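The diagnostic measures listed above all follow from a 2 × 2 table of predicted versus observed status. The sketch below uses the textbook definitions; the function name and the example counts are hypothetical and are not taken from the study, and the code does not guard against empty cells (which would cause division by zero).

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic measures from the four cells of a 2 x 2 table.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "plr": sensitivity / (1 - specificity),   # positive likelihood ratio
        "nlr": (1 - sensitivity) / specificity,   # negative likelihood ratio
        "dor": (tp * tn) / (fp * fn),             # diagnostic odds ratio
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=80, fp=20, fn=20, tn=880)
```

The diagnostic odds ratio equals PLR divided by NLR, which gives a quick internal consistency check on any reported set of these measures.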
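The net benefit used in decision curve analysis can be written compactly. The sketch below assumes the widely used formulation in which false positives are weighted by the odds of the threshold probability, pt / (1 − pt); the function names and toy data are hypothetical, and the article's own R implementation may differ in detail.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of classifying as high risk everyone with y_prob >= threshold.

    Net benefit = TP/n - (FP/n) * threshold / (1 - threshold).
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1 - threshold)  # relative harm of a false positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Reference curve: everyone is considered to develop diabetes."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy outcomes and predicted probabilities (hypothetical).
y = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
p = [0.9, 0.2, 0.6, 0.4, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
curve = [(t, net_benefit(y, p, t)) for t in (0.05, 0.10, 0.20, 0.30)]
```

The "treat none" reference has a net benefit of zero at every threshold, so a model is clinically useful over the range of thresholds where its curve lies above both reference curves.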
Ethical approval. In the previously published article 29

Results
The present study included 32,312 eligible participants (64.98% men and 35.02% women) (Figure 1).
Baseline characteristics of participants. Table 1 shows the basic demographic, anthropometric, and clinical information of the eligible participants. We divided all participants into the training cohort (n = 16,219) and the validation cohort (n = 16,093). During a median follow-up of 2.66 years, 155 and 141 participants developed diabetes in the training and validation cohorts, respectively. For all baseline characteristics, the difference between the training cohort and the validation cohort was not statistically significant (all P > 0.05). Table 2 shows the baseline characteristics of the two cohorts by incident diabetes status. Participants with incident diabetes had higher age, BMI, SBP, DBP, FPG, TG, ALT, BUN, and Scr, and higher rates of ever or current smoking, in both the training and validation cohorts (all P < 0.05). There was no statistically significant difference in family history of diabetes (P > 0.05).
Univariate and multivariate analysis. Table 3 Table S1). The AUCs of the four models were relatively close. Given that the LASSO model incorporated fewer risk factors and could predict the 3-year diabetes risk relatively well, we chose the LASSO model as the final risk prediction model for diabetes and constructed a corresponding nomogram (Fig. 3). The total nomogram score was converted into the predicted probability of incident diabetes. (Table 4). At the best threshold, the sensitivities were 89.03% and 85.11%, and the specificities were 80.11% and 82.30% for the training cohort and the validation cohort, respectively. Notably, the AUC of the prediction nomogram was internally confirmed to be relatively stable through bootstrap validation (AUC = 0.909) (Fig. 4). The differences in AUC, sensitivity, specificity, and accuracy between the four models were relatively small in both the training cohort and the validation cohort. The results of the other three models are shown in the Supplemental Appendix (Table 4, Table S1, Fig S1).
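A bootstrap check of AUC stability, of the kind reported above, can be sketched as follows. This is a simple percentile bootstrap, which is an assumption: the article does not specify the exact bootstrap variant used for internal validation. The function names and the toy data are hypothetical, and the pairwise AUC computation is quadratic in sample size, so it is meant for illustration rather than for cohorts of this size.

```python
import random


def auc(y_true, scores):
    """AUC as the probability that a random case scores higher than a
    random non-case, with ties counted as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_resamples=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both cases and non-cases
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_resamples)]
    upper = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Toy data (hypothetical): 1 = incident diabetes, scores = predicted risk.
y = [0, 0, 1, 1, 0, 1, 0, 1]
s = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.50, 0.60]
point_estimate = auc(y, s)
ci_low, ci_high = bootstrap_auc_ci(y, s, n_resamples=500, seed=1)
```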
We also evaluated how close the predicted risk was to the observed 3-year incidence across deciles of predicted diabetes risk in the training cohort. Figure 5 illustrates the fraction of individuals in each decile of predicted risk in the training cohort. Our nomogram underestimated the 3-year risk of diabetes. However, the Hosmer-Lemeshow χ² test showed no statistically significant difference between the predicted diabetes risk and the observed incidence of diabetes (P > 0.05). We also showed the prediction performance of each risk predictor in the nomogram, including age, BMI, SBP, FPG, and TG (Table S2, Fig S2). The AUC of the prediction nomogram was greater than the AUC of each individual risk factor for incident diabetes. The predictive ability of other similar risk prediction models for diabetes in China is summarized in Table S3.
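The decile-based calibration check described above can be sketched as follows. The code computes only the Hosmer-Lemeshow statistic from groups of equal size ordered by predicted risk; the P value would then be obtained from a χ² distribution, conventionally with (number of groups − 2) degrees of freedom. The function name and toy data are hypothetical, and the sketch assumes at least one observation per group and group mean probabilities strictly between 0 and 1.

```python
def hosmer_lemeshow_statistic(y_true, y_prob, n_groups=10):
    """Hosmer-Lemeshow statistic over groups ordered by predicted risk.

    Returns the statistic and a per-group table of
    (group size, observed events, expected events).
    """
    pairs = sorted(zip(y_prob, y_true))  # order by predicted probability
    n = len(pairs)
    statistic = 0.0
    table = []
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        size = len(group)
        observed = sum(y for _, y in group)
        expected = sum(p for p, _ in group)
        mean_p = expected / size
        statistic += (observed - expected) ** 2 / (size * mean_p * (1 - mean_p))
        table.append((size, observed, expected))
    return statistic, table


# Toy data (hypothetical), split into two risk groups instead of deciles.
probs = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75, 0.75, 0.75]
events = [1, 0, 0, 0, 1, 1, 1, 0]
hl_stat, hl_table = hosmer_lemeshow_statistic(events, probs, n_groups=2)
```

Comparing observed and expected counts in the per-group table also shows the direction of miscalibration: observed events exceeding expected events across groups corresponds to a model that underestimates risk.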
Optimal cut-off value for nomogram score. Table 5 shows the sensitivity and specificity for predicting diabetes at different cut-off values. At a cut-off value of 0.05, the specificity was 95.61% and the sensitivity was 61.29%. When the cut-off value increased to 0.3, the specificity increased to 99.78%, while the sensitivity dropped to 12.26%. In summary, although higher cut-off values resulted in higher specificity, the sensitivity rapidly fell to a relatively low point.
Clinical use of the nomogram. Figure 6 shows the result of the LASSO model's decision curve analysis in the training and validation cohorts. The black line represents the net benefit when none of the participants are considered to develop diabetes. In contrast, the light gray line represents the net benefit when all participants are considered to develop diabetes. The area between the "no treatment line" (black line) and "all treatment line" (light gray line) in the model curve indicates the clinical utility of the model. The farther the model curve is from the black and light gray lines, the better the nomogram's clinical application. Specifically, in the training cohort, if the threshold probability of a patient was 4% in the LASSO model, the net benefit was about 50%, which was equivalent to performing 50 additional diabetes screenings (such as the oral glucose tolerance test) per 100 Chinese adults without a significant change in the incidence of diabetes.
External validation. The external validation was performed on a cohort of 12,545 Japanese participants. The mean age, BMI, SBP, and FPG of the participants were 43.56 ± 8.68 years, 22.11 ± 3.11 kg/m², 114.42 ± 14.89 mmHg, and 5.15 ± 0.41 mmol/L, respectively. The median TG was 0.75 (0.50-1.12) mmol/L (Table S4). The AUC of the external validation was 0.849 (Fig. 7A). At the best threshold, the specificity and sensitivity were 81.46% and 75.25%, respectively (Table S5). The external validation revealed that our nomogram had excellent prediction performance.

Sensitivity analysis. To perform the LASSO model's sensitivity analysis, we used multiple imputation to replace the missing values of variables in the overall population of the original study (n = 211,833) (Table S4). The AUC was 0.918 (Fig. 7B). At the best threshold, the specificity and sensitivity were 86.17% and 83.90%, respectively (Table S5).

Discussion
In this retrospective cohort study, we developed and validated a personalized prediction nomogram for the 3-year risk of incident diabetes among Chinese adults using cost-effective and readily available parameters, helping clinicians identify individuals at high risk of developing diabetes. The nomogram included five parameters: age, BMI, SBP, FPG, and TG. The internal and external validation showed that our nomogram had excellent prediction performance. We also summarized the sensitivity and specificity of the nomogram for predicting diabetes at different cut-off values. Decision curve analysis illustrated the clinical use of the nomogram. Although many diabetes risk prediction models based on demographic, anthropometric, and clinical information have been established, they are mainly used in European 45-47 and American populations 48-50. Only a limited number of reliable diabetes prediction models have been established in the Chinese population, each of which included different risk predictors. Moreover, their prediction performance and clinical usefulness varied greatly. In 2019, Zeyin Lin et al. 51 performed Cox proportional hazards regression analysis to develop a nomogram to predict the 5-year incidence of type 2 diabetes mellitus based on age, sex, BMI, hypertension, dyslipidemia, smoking status, and family history of diabetes. The C-index of the model was 0.815 (95% CI, 0.797-0.834). However, they did not conduct a decision curve analysis to evaluate the clinical usefulness of the model. Additionally, they did not compare alternative methods to select the most suitable risk prediction model for incident diabetes. Moreover, age, BMI, TC, TG, HDL-C, and LDL-C are continuous risk predictors, and categorizing them causes detrimental information loss and reduces the ability to detect real relationships 52,53. In 2019, Kun Wang et al. 
54 developed a nomogram to predict the 3-year risk of T2DM in healthy mainland China residents based on age, BMI, FPG, LDL-C, HDL-C, and TG. The AUCs were 0.847 (95% CI, 0.801-0.892) and 0.755 (95% CI, 0.717-0.794) for females and males, respectively. Consistent with our nomogram, their nomogram incorporated continuous predictors. In addition, they established a full model, an MFP model, and a stepwise model, and chose an appropriate model after comparison. However, they did not take into account family history of diabetes, smoking, and drinking history. Although our nomogram did not include them, we considered them in the variable selection process. Furthermore, they did not measure how closely the predicted risk fit the actual risk. In 2015, Carlos et al. 55 developed simple non-laboratory-based and laboratory-based risk assessment algorithms and nomograms to predict undiagnosed diabetes in Hong Kong. The AUCs were 0.686 (95% CI, 0.650-0.722) for the non-laboratory-based algorithm and 0.696 (95% CI, 0.661-0.731) for the laboratory-based algorithm. They produced two different nomograms based on anthropometric and biochemical assessments, respectively. Each nomogram included relatively few risk predictors, which may lead to insufficient accuracy and prediction performance. Thus, their models' predictive ability was relatively low (AUC = 0.686 and 0.696), which suggests that relatively more risk factors need to be incorporated when developing a risk prediction model to ensure prediction performance. Furthermore, this was a single-center study based on a professional driver community project. The cohort's inappropriate selection and relatively small sample size made it insufficient to represent the Chinese population. It is worth mentioning that none of these studies performed external validation. Compared with the similar studies mentioned above, our nomogram filled those gaps. Our sample size was considerable (n = 32,312), and participants were from multiple centers, so our findings may be better applied to the Chinese population. Unlike most previous Chinese diabetes risk scores, which use integer points or segmented values, our nomogram uses continuous variables to provide more precise and personalized risk prediction.
Figure 3. Nomogram to predict the risk of diabetes for Chinese adults. The patient's score for each risk predictor is plotted on the appropriate scale and vertical lines are drawn from that value to the top Points scale to obtain the corresponding scores. All scores are summed to obtain the total points score. The total points score is plotted on the bottom Total Points scale. The corresponding value shows the predicted probability of incident diabetes.
Scientific Reports | (2020) 10:21716 | https://doi.org/10.1038/s41598-020-78716-1
It is worth mentioning that we constructed four models and selected the simplest reliable one, the LASSO model, to ensure clinical practicality. Given that a nomogram can provide accurate and individualized risk prediction, we constructed the corresponding nomogram from the LASSO model, which makes up for the deficiencies of many other similar Chinese studies. Notably, our nomogram has excellent prediction performance (AUC = 0.9125, 95% CI, 0.8887-0.9364). In addition, we found no statistically significant difference between the predicted diabetes risk and the observed incidence of diabetes. Diabetes can cause various complications, bring severe physical and psychological distress to patients, and place a huge burden on the healthcare system. It also tends to go undiagnosed because of the lack of specific symptoms. However, screening for diabetes with the oral glucose tolerance test may increase the yield and economic efficiency of screening 56. In this study, we used the LASSO model, with its relatively good predictive performance, to construct the nomogram. We also provided a corresponding formula to calculate the risk of diabetes from the risk predictors, which could help clinicians accurately identify individuals at high risk for diabetes, guide them toward timely diabetes screening, and avoid the costs and efforts of prevention and treatment in low-risk groups. Because our nomogram underestimated the 3-year risk of diabetes, the individuals it identifies as being at high risk of developing diabetes are indeed at higher risk. The nomogram items are routine clinical variables readily available to clinicians, which allows the nomogram to be easily adopted in practice. Furthermore, the nomogram's predictive performance was high in both the internal and external validation, which suggests high generalizability. Notably, there were subtle differences between the AUC of our model and those of the internal and external validation models. 
The AUC of the external validation model was slightly smaller than the AUC of our nomogram (AUC = 0.849 vs. AUC = 0.913). The difference may have the following causes: (1) The study populations differed: our study was performed in Chinese adults, whereas the validation dataset came from Japanese participants. (2) Participants with FPG ≥ 6.1 mmol/L were excluded from the external validation cohort. (3) The outcome of the external validation cohort was T2DM, whereas we could not distinguish between type 1, type 2, and other diabetes types in our model. (4) Diabetes was diagnosed as HbA1c ≥ 6.5%, FPG ≥ 7 mmol/L, or self-reported diabetes in the external validation cohort, whereas the definition of diabetes in our nomogram did not include HbA1c ≥ 6.5%. In the sensitivity analysis, the AUC for the original study's overall population was close to that of our nomogram (AUC = 0.918 vs. AUC = 0.913), which suggests that our study participants could represent the general population.
The risk predictors included in our nomogram were age, BMI, SBP, FPG, and TG, which were also included in previous diabetes risk prediction models. Advanced age is a nonmodifiable risk factor for developing diabetes 57. Aging of pancreatic β cells results in a decline in glucose sensitivity and in insulin secretory defects 58. Age-related glucose intolerance is usually accompanied by insulin resistance and β-cell dysfunction 59. Obesity can increase the fat content of the liver and pancreas, which affects the function of pancreatic β cells 60. In addition, obesity leads to metabolic derangements and adipose organ dysfunction, leading to insulin resistance 61. Hypertension and diabetes are often concurrent. The substantial mediators could involve inflammation, oxidative stress, endothelial dysfunction, and insulin resistance 62. FPG is an independent risk factor for the onset of diabetes, and people with relatively high FPG had a higher diabetes risk score in our nomogram. This may be because FPG is closely related to insulin response and insulin sensitivity 63. Dyslipidemia and diabetes often co-exist in the same individual.
As an endocrine organ, adipose tissue can affect glucose and lipid metabolism, and TG is the most abundant lipid in adipose tissue 64. Excess fatty tissue can release many lipid metabolites and proinflammatory cytokines and induce cellular stress, which mediate insulin resistance 65. Therefore, the application of the five risk predictors in our models is well-founded. The present study has several strengths: (1) The study has a large sample size, and participants were from multiple centers. (2) We established four prediction models: the LASSO model, the full model, the stepwise model, and the MFP model. We selected the simplest model, the LASSO model, which had relatively good prediction performance, to construct the nomogram and ensure clinical practicability. (3) We provided a formula to calculate the risk of diabetes from the risk predictors, which helps clinicians quickly and accurately calculate an individual's risk of developing diabetes and provides external validation information for other similar studies. (4) Our decision curve analysis demonstrated the nomogram's clinical use, and the nomogram could avoid additional diabetes screenings (such as OGTT) for individuals at low risk of diabetes. (5) We performed both internal and external validation to ensure the reliability of the results. (6) As this was a retrospective cohort study, it decreased the risk of selection bias and information bias.
Although our nomogram performed well, the study still has some potential limitations. First, this is a secondary retrospective study. The raw data did not provide other risk factors that affect the onset of diabetes, such as waist/hip ratio, medical history, and lifestyle factors. However, our nomogram had excellent prediction performance in both internal and external validation, suggesting that the nomogram based on the existing five risk factors has high generalizability. Second, the database did not distinguish between type 1, type 2, and other diabetes types, and the risk factors of different kinds of diabetes are somewhat different. However, type 2 diabetes is the most common kind of diabetes, accounting for over 90% of diabetes cases 66. The nomogram is therefore used approximately to predict the 3-year risk of developing type 2 diabetes. Third, the researchers did not conduct an oral glucose tolerance test or measure glycosylated hemoglobin. A study showed that 55% of diabetic patients were diagnosed by testing fasting blood glucose alone in Asians 67. Thus, the diagnostic criteria for diabetes in our study may underestimate the true prevalence of diabetes. In other words, the development and validation datasets included only very small numbers of diabetes cases, which may be related to the diagnostic criteria for diabetes in our study. However, a 2-h oral glucose tolerance test for all participants was not feasible in such a large cohort. Fourth, we excluded participants with incomplete records and built the models with a complete-case analysis, which may introduce selection bias. However, we used multiple imputation to replace missing values in a sensitivity analysis, and the results suggested that our study participants could well represent the overall population. 
Therefore, in the future, we can consider designing our own studies or cooperating with other researchers to collect as many variables as possible, reduce missing values, and distinguish the types of diabetes. Fifth, interactions between the covariates were not included in the full model, which may bias the results of the full model. However, we focused predominantly on the LASSO model, which has the fewest variables and is more convenient for clinical application, rather than the full model.

Conclusion
We developed and validated a personalized prediction nomogram for the 3-year risk of incident diabetes among Chinese adults, based on age, BMI, SBP, FPG, and TG. The nomogram had excellent prediction performance in both the training and validation cohorts for estimating the risk of developing diabetes, and it has high generalizability. The nomogram is a simple and reliable tool to help clinicians accurately identify individuals at high risk of diabetes.