Introduction

Type 2 diabetes is a chronic metabolic disease caused by a combination of genetic, dietary and environmental factors. It is characterised by insufficient insulin secretion or an inability to use insulin efficiently, resulting in persistently elevated blood glucose1. According to the World Health Organization (WHO), diabetes was the direct cause of 1.5 million deaths in 2019, and 48% of these deaths occurred before the age of 70 years2. According to data released by the International Diabetes Federation (IDF) in 2021, 140 million people in China have diabetes, a prevalence of approximately 10.6%; both the number of cases and the prevalence are rising, and type 2 diabetes accounts for more than 90% of the Chinese diabetic population3. In addition, long-term poor glycaemic control may lead to complications such as blindness4, kidney failure5 and hypertension6.

Hyperuricaemia is the greatest risk factor for gout7 and is mainly due to excessive production or poor excretion of uric acid, the main source of which is purines8. Past studies have shown that hyperuricaemia is a risk factor for diabetes mellitus, cardiovascular disease, metabolic syndrome and other diseases9,10,11. A meta-analysis showed that the prevalence of hyperuricaemia in the Chinese population was 16.4%12.

Studies have shown that type 2 diabetes and hyperuricaemia can interact. On the one hand, people with type 2 diabetes are often insulin resistant, which may increase tubular reabsorption of uric acid and thereby lead to hyperuricaemia13,14. On the other hand, epidemiological studies have shown that hyperuricaemia is a risk factor for insulin resistance, prediabetes and diabetes9,11,15. In addition, recent evidence suggests that high uric acid levels interfere with insulin signalling in endothelial cells at both the receptor and post-receptor levels; at the post-receptor level, both proximal (IRS and PI3K-Akt components) and distal (eNOS-NO system) steps of the pathway are affected16. Risk predictors of high uric acid levels in patients with type 2 diabetes mellitus have also been explored, including hip circumference, total cholesterol and high-density lipoprotein17,18.

Previous studies have applied Cox regression and machine learning methods to build risk models of hyperuricaemia based on sociodemographic data, routine physical examination markers, dietary risk factors, blood biomarkers and alterations of the gut microbiome19,20,21,22,23,24. However, these studies modelled hyperuricaemia only in healthy populations, and one further study developed a hyperuricaemia risk model for patients with diabetic kidney disease25. To the best of our knowledge, only one study has established a predictive model of hyperuricaemia in the type 2 diabetic population26.

In recent years, artificial neural networks (ANNs) have become popular and useful models for classification, clustering, pattern recognition and prediction in many disciplines, and they are widely applied to complex real-world problems27. Their popularity lies in their information-processing characteristics, including learning ability, high parallelism, fault tolerance, nonlinearity, noise tolerance and generalisation27,28. Dalakleidi et al.29 also reported that an ANN outperformed other machine learning algorithms. Despite these advantages, previous studies have not used ANN algorithms to model the risk of hyperuricaemia.

In this study, we constructed a risk model for hyperuricaemia in patients with type 2 diabetes mellitus based on an ANN algorithm and assessed its validity. Such a model can help clinicians to distinguish high-risk individuals and identify risk factors, which in turn may help to alleviate disease symptoms, reduce the risk of death and reduce the healthcare burden.

Methods

Study participants

This was a retrospective cross-sectional survey. Between June and December 2022, we randomly recruited patients with type 2 diabetes from one community in each of the six urban areas of Fuzhou City. All participants completed a face-to-face survey using a standardised, self-designed questionnaire and underwent a physical examination, both conducted by trained primary care professionals.

Patients with malignancy, a history of gout, hyperuricaemia occurring before type 2 diabetes mellitus, type 1 diabetes mellitus, gestational diabetes mellitus or other specific types of diabetes mellitus were excluded from this study. After exclusion of participants with incomplete physical examination data, a total of 8243 cases were included.

All respondents completed an informed consent form, and the ethical review board of the Fuzhou Center for Disease Control and Prevention approved the research (approval number: 2022002). In addition, all participants and/or their legal guardians consented to the use of their medical data in this study. The study was carried out in accordance with the Declaration of Helsinki.

Data measurements

Basic personal information and medical history were investigated through questionnaires, including gender, age, history of smoking and alcohol consumption, duration of diabetes, medication history, etc.

Physical examination was performed to obtain data covering height, weight, waist circumference, blood pressure, etc.

Biochemical indicators were obtained through laboratory tests. Patients fasted overnight for at least 10 h and abstained from water for at least 8 h, and fasting venous blood was taken between 7:00 and 9:00 am the following morning. Fasting plasma glucose (FPG), uric acid (UA), alanine aminotransferase (ALT), total bilirubin (TBil), serum creatinine (SCr), γ-glutamyl transpeptidase (GGT), blood urea nitrogen (BUN), total cholesterol (TC), triglycerides (TGs), low-density lipoprotein cholesterol (LDL-C) and high-density lipoprotein cholesterol (HDL-C) were measured using a fully automated biochemistry analyser (Model 7100, Hitachi, Japan).

Description of variables

(1) Men with uric acid < 420 μmol/L and women with uric acid < 360 μmol/L formed the normal uric acid group (4,477 cases); men with uric acid > 420 μmol/L and women with uric acid > 360 μmol/L formed the high uric acid group (3,766 cases) (Fig. 1).

(2) According to the Chinese Comprehensive Diabetes Control Objectives (2019)30, the reference ranges were as follows: FPG, 4.4 to 7.0 mmol/L; blood pressure, < 130/80 mm Hg; TC, < 4.5 mmol/L; TGs, < 1.7 mmol/L; LDL-C, < 2.6 mmol/L (without atherosclerotic cardiovascular disease) or < 1.8 mmol/L (with atherosclerotic cardiovascular disease); HDL-C, > 1.0 mmol/L (men) or > 1.3 mmol/L (women); uric acid, < 420 μmol/L for men and < 360 μmol/L for women; SCr, 55–133 μmol/L for men and 44–97 μmol/L for women; BUN, 2.9–7.5 mmol/L; ALT, 5 to 40 U/L; GGT, < 40 U/L; TBil, 1.71 to 17.10 μmol/L. Central obesity was defined as a waist circumference ≥ 90 cm for men and ≥ 85 cm for women. Body mass index (BMI) was calculated as weight (kg)/height² (m²): BMI < 18.5 kg/m² was considered underweight, 18.5 kg/m² ≤ BMI < 24.0 kg/m² normal, 24.0 kg/m² ≤ BMI < 28.0 kg/m² overweight and BMI ≥ 28.0 kg/m² obese.

(3) Estimated glomerular filtration rate (eGFR): the eGFR was calculated using the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) formula31, with SCr in mg/dL and age in years. For men with SCr ≤ 0.9 mg/dL, \(\mathrm{eGFR} = 141 \times (\mathrm{SCr}/0.9)^{-0.411} \times 0.993^{\mathrm{age}}\); for men with SCr > 0.9 mg/dL, \(\mathrm{eGFR} = 141 \times (\mathrm{SCr}/0.9)^{-1.209} \times 0.993^{\mathrm{age}}\); for women with SCr ≤ 0.7 mg/dL, \(\mathrm{eGFR} = 144 \times (\mathrm{SCr}/0.7)^{-0.329} \times 0.993^{\mathrm{age}}\); for women with SCr > 0.7 mg/dL, \(\mathrm{eGFR} = 144 \times (\mathrm{SCr}/0.7)^{-1.209} \times 0.993^{\mathrm{age}}\). The Chinese guidelines for the prevention and treatment of type 2 diabetes mellitus define eGFR < 60 mL/min/1.73 m² as a decrease in the GFR32.

(4) WHO defines smokers as "those who have smoked continuously or cumulatively for ≥ 6 months in their lifetime"; alcohol drinkers were defined as those who had consumed alcohol at least once a week for ≥ 6 months; and adequate exercise was defined as moderate-intensity exercise lasting ≥ 30 min per session with a frequency of ≥ 3 times per week.
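The variable definitions above can be expressed as a minimal Python sketch. The function names are hypothetical and not taken from the study. Three assumptions are made: a uric acid value exactly at the cutoff is treated as normal (the text does not say how such values were handled), the eGFR uses the standard CKD-EPI creatinine coefficients (141 for men, 144 for women) with no further adjustment terms, and SCr reported in μmol/L is converted to mg/dL with the usual factor of 88.4.

```python
def is_high_uric_acid(uric_acid_umol_l: float, male: bool) -> bool:
    """True for the high uric acid group (> 420 umol/L men, > 360 umol/L women).

    Assumption: a value exactly at the cutoff counts as normal.
    """
    cutoff = 420.0 if male else 360.0
    return uric_acid_umol_l > cutoff


def bmi_category(weight_kg: float, height_m: float) -> str:
    """BMI categories as listed in the text (kg/m^2)."""
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 24.0:
        return "normal"
    if bmi < 28.0:
        return "overweight"
    return "obese"


def scr_umol_to_mg_dl(scr_umol_l: float) -> float:
    """Convert serum creatinine from umol/L to mg/dL (assumed factor 88.4)."""
    return scr_umol_l / 88.4


def egfr_ckd_epi(scr_mg_dl: float, age_years: float, male: bool) -> float:
    """CKD-EPI creatinine eGFR in mL/min/1.73 m^2, sex-specific piecewise form."""
    kappa = 0.9 if male else 0.7            # sex-specific creatinine knot
    low_exponent = -0.411 if male else -0.329
    scale = 141.0 if male else 144.0
    ratio = scr_mg_dl / kappa
    # Below the knot the sex-specific exponent applies, above it -1.209.
    exponent = low_exponent if ratio <= 1.0 else -1.209
    return scale * ratio ** exponent * 0.993 ** age_years


def has_decreased_gfr(egfr: float) -> bool:
    """Decreased GFR as defined in the text: eGFR < 60 mL/min/1.73 m^2."""
    return egfr < 60.0
```

Under these assumptions, a 60-year-old man with an SCr of 1.5 mg/dL has an eGFR of about 50 mL/min/1.73 m² and would be flagged as having a decreased GFR.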

Figure 1 Flow chart of the study.

Statistical methods

Data were double entered using EpiData (version 3.1) and analysed using IBM SPSS (version 22.0) and RStudio (version 4.2.3). Measurement data conforming to a normal distribution were expressed as mean ± standard deviation (\(\bar{x} \pm s\)) and compared between groups using a t test; count data were compared using a chi-square test. Univariate and multivariate logistic regression analyses were conducted with uric acid group (normal or high) as the dependent variable and sociodemographic characteristics and physiological and biochemical indicators as independent variables; the significance level for variable entry and removal was 0.05. The variance inflation factor (VIF) was used to examine collinearity among the independent variables included in the multivariate logistic regression analyses. Data management and statistical analysis were conducted using R version 4.3.2.
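To illustrate the collinearity check, the sketch below computes VIFs in plain Python. It relies on the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, which is equivalent to \(1/(1 - R_j^2)\). The function names are hypothetical and this is not the authors' R code; it assumes no predictor is constant.

```python
from statistics import mean, pstdev


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equal-length numeric columns."""
    n = len(columns[0])
    standardised = []
    for col in columns:
        m, s = mean(col), pstdev(col)  # assumes s > 0 (no constant column)
        standardised.append([(v - m) / s for v in col])
    return [
        [sum(a * b for a, b in zip(zi, zj)) / n for zj in standardised]
        for zi in standardised
    ]


def invert(matrix):
    """Gauss-Jordan matrix inversion with partial pivoting."""
    n = len(matrix)
    aug = [
        list(row) + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Variance inflation factor of each predictor column."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]
```

With uncorrelated predictors every VIF equals 1; values below the threshold of 5 used in this study would be read as no substantial multicollinearity.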

Development and validation of the classification models

We utilized multivariable stepwise logistic regression analysis for variable selection. The ANN algorithm was used to build models for three different data scenarios (baseline data only, biochemical indicators only, and baseline data and biochemical indicators).

The data were divided into a training–testing set (80%) and an independent validation set (20%) using stratified sampling. We used grid search to explore the hyperparameter space and find the optimal combination of hyperparameters for each of the three ANN models. To avoid overfitting and improve generalisation, we applied tenfold cross-validation within the training–testing set and then evaluated the best models on the independent validation set.
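The partitioning and tuning procedure described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' R implementation: the function names, the random seeds and the `cv_score` callback (which would train a model on nine folds and score it on the tenth) are all hypothetical.

```python
import random
from collections import defaultdict
from itertools import product


def stratified_split(labels, holdout_fraction=0.2, seed=0):
    """Split indices so each class keeps the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, holdout = [], []
    for members in by_class.values():
        rng.shuffle(members)
        n_holdout = round(len(members) * holdout_fraction)
        holdout.extend(members[:n_holdout])
        train.extend(members[n_holdout:])
    return sorted(train), sorted(holdout)


def stratified_folds(indices, labels, k=10, seed=0):
    """Deal each class round-robin into k folds for cross-validation."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        for position, idx in enumerate(members):
            folds[position % k].append(idx)
    return folds


def grid_search(param_grid, cv_score):
    """Try every hyperparameter combination; keep the best mean CV score.

    cv_score is a hypothetical callback: params dict -> mean validation score.
    """
    names = sorted(param_grid)
    best_params, best_score = None, None
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cv_score(params)
        if best_score is None or score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

Keeping the 20% validation set outside both the grid search and the cross-validation loop is what makes it an independent check of the tuned models.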

The areas under the curves (AUCs) of the three ANN models in the training–testing set were evaluated to assess model performance. In addition, we calculated accuracy, recall, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), precision, negative predictive value (NPV), kappa and F1-score. The optimal ANN model selected after comparing these metrics is shown in Fig. 2. Finally, calibration was assessed using the slope and intercept of the calibration curve and the Brier score (ideal value 0; a value > 0.3 indicates poor calibration).
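For reference, the performance metrics listed above follow from the confusion matrix and the predicted probabilities by their standard definitions. The sketch below is a plain-Python illustration with hypothetical function names; it does not reproduce the confidence intervals, and the pairwise AUC is written for clarity rather than speed.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    n = tp + fp + tn + fn
    recall = tp / (tp + fn)            # sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)         # positive predictive value
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / n
    # Cohen's kappa: agreement beyond what the marginal totals give by chance.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "PLR": recall / (1 - specificity),
        "NLR": (1 - recall) / specificity,
        "precision": precision,
        "NPV": npv,
        "kappa": (accuracy - expected) / (1 - expected),
        "F1": 2 * precision * recall / (precision + recall),
    }


def auc(scores, outcomes):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in positives for q in negatives)
    return wins / (len(positives) * len(negatives))


def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)
```

A model that always predicts a probability of 0.5 has a Brier score of 0.25, which is a useful reference point when interpreting the 0.3 threshold.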

Figure 2 Artificial neural network model.

All models were built using R version 4.3.2.

Ethics approval and consent

This study was approved by the Ethics Committee of the Fuzhou Center for Disease Control and Prevention (approval number: 2022002). Informed consent was obtained from all participants and/or their legal guardians for this study. There is no conflict of interest in this study.

Results

Demographic characteristics

A total of 8243 patients with diabetes were included in this survey. Table 1 shows descriptive statistics of the sample characteristics, including age, gender, tobacco use, alcohol use, exercise, waist circumference, BMI, disease duration, diabetes medication use, systolic blood pressure (SBP) and diastolic blood pressure (DBP).

Table 1 Demographic characteristics of the patients in this survey (N = 8243).

Univariate and multivariate analyses of baseline information

Each baseline variable was entered into a separate univariate logistic regression analysis, which identified eight variables: gender, tobacco use, alcohol use, exercise, waist circumference, BMI, diabetes medication use and DBP (P < 0.05). The VIFs of these eight variables were all less than 5, indicating no substantial multicollinearity (Table 5). In the multivariate logistic regression analysis, gender, exercise, waist circumference, DBP and diabetes medication use remained influential factors (P < 0.05). Details are presented in Table 2.

Table 2 Univariate and multivariate logistic regression analysis of baseline information.

Univariate and multivariate analyses of biochemical indicators

Univariate logistic regression analysis of the biochemical indicators identified nine variables: GGT, TBil, BUN, TGs, TC, LDL-C, HDL-C, FPG and the eGFR (P < 0.05). There was no substantial multicollinearity among these nine variables (all VIF < 5; Table 5). In the multivariate logistic regression analysis, all nine variables remained influential factors for hyperuricaemia (P < 0.05, Table 3).

Table 3 Univariate and multivariate logistic regression analysis of biochemical indicators.

Univariate and multivariate analyses of baseline & biochemical indicators

Univariate logistic regression analysis of the baseline and biochemical indicators together identified seventeen variables: gender, tobacco use, alcohol use, exercise, waist circumference, BMI, diabetes medication use, DBP, GGT, TBil, BUN, TGs, TC, LDL-C, HDL-C, FPG and the eGFR (P < 0.05). There was no substantial multicollinearity among these 17 variables (all VIF < 5; Table 5). In the multivariate logistic regression analysis, gender, waist circumference, diabetes medication use, DBP, GGT, BUN, TGs, LDL-C, HDL-C, FPG and the eGFR remained influential factors for hyperuricaemia (P < 0.05) (Table 4).

Table 4 Univariate and multivariate logistic regression analysis of baseline & biochemical indicators.

Model performance

As described above, we built three models using baseline data, biochemical indicators, and baseline & biochemical indicators, respectively (Table 5). After grid search, the hyperparameters of the optimal ANN model were: hidden = c(8), threshold = 0.01, stepmax = 1e+05, rep = 1, learningrate.factor = list(minus = 0.5, plus = 1.2), lifesign.step = 1000, algorithm = "rprop+", err.fct = "sse" and act.fct = "logistic". Figure 3 and Table 6 show the performance of all three models. The baseline & biochemical model performed best, with a cutoff of 0.488, AUC of 0.744 (0.721–0.768), accuracy of 0.689 (0.689–0.690), recall of 0.625 (0.591–0.658), specificity of 0.749 (0.720–0.778), PLR of 2.489 (2.191–2.828), NLR of 0.501 (0.454–0.553), precision of 0.697 (0.663–0.731), NPV of 0.684 (0.654–0.713), kappa of 0.375 (0.331–0.420) and F1-score of 0.659 (0.625–0.693). In addition, its Brier score was 0.169, and the calibration curve showed good agreement between predicted and observed values (Fig. 4).
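As a quick arithmetic check, the likelihood ratios and F1-score reported for the baseline & biochemical model can be recomputed from its reported recall, specificity and precision. The short Python sketch below uses only the point estimates given above; the variable names are illustrative, and small differences are expected because the inputs are rounded to three decimals.

```python
# Point estimates reported for the baseline & biochemical model.
reported = {
    "recall": 0.625,
    "specificity": 0.749,
    "precision": 0.697,
    "PLR": 2.489,
    "NLR": 0.501,
    "F1": 0.659,
}

# Derived values from the standard definitions.
derived = {
    "PLR": reported["recall"] / (1 - reported["specificity"]),
    "NLR": (1 - reported["recall"]) / reported["specificity"],
    "F1": 2 * reported["precision"] * reported["recall"]
    / (reported["precision"] + reported["recall"]),
}

for name, value in derived.items():
    print(f"{name}: derived {value:.3f}, reported {reported[name]:.3f}")
```

The derived values agree with the reported ones to within rounding, which suggests the reported metrics are internally consistent.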

Table 5 Collinearity diagnostics of the independent variables included in the three multivariate logistic regression analyses in this study.

Figure 3 Areas under curves (AUCs) of the three ANN models developed.

Table 6 Performance comparison of the three models developed.

Figure 4 Calibration curves for testing the stability of three models in the study.

Discussion

Although ANNs have been widely used in predictive modelling of diseases, to our knowledge no study has modelled the risk of hyperuricaemia in a large sample of the type 2 diabetic population; previous studies have modelled and analysed only type 2 diabetes33,34 or hyperuricaemia19. After comparing the performance of the three models and validating them, we confirmed that the baseline and biochemical model was the optimal model. Interestingly, the model based on baseline information alone was superior to that based on biochemical indicators alone. Baseline information is relatively stable and reflects the patient's condition over a long period, whereas biochemical indicators are one-off test results that reflect the patient's biochemical status only around the time of testing. In other words, in a cross-sectional study, baseline information may be more informative than biochemical indicators. This hypothesis needs to be tested in future studies.

Hyperuricaemia has long been identified as the greatest risk factor for gout7, and men are usually more likely to develop hyperuricaemia than premenopausal women35. However, in our type 2 diabetic population, the detection rate of hyperuricaemia was higher in women (33.86%) than in men (11.83%), and the risk of hyperuricaemia in women was 4.15 times that in men (95% CI 3.70–4.66). We also noted that the mean age of women in this population was 67.16 ± 7.48 years (P < 0.001), an age at which most women are postmenopausal. The higher risk may therefore be due to the decline in hormone levels in postmenopausal women, who lack the protection of, for example, progesterone36 and oestrogen37.

Obesity has been shown to be associated with hyperuricaemia. Firstly, obese individuals tend to have higher uric acid levels than normal-weight individuals because of their higher urinary excretion and reduced clearance of uric acid38. Secondly, weight loss in obese individuals is accompanied by reductions in uric acid levels and xanthine oxidoreductase (XOR) activity39; XOR is responsible for the breakdown of hypoxanthine and xanthine into uric acid. Finally, animal experiments suggest that elevated uric acid in obesity may result from dysregulation of adipocytokines and chronic low-grade inflammation in adipose tissue40,41.

A meta-analysis42 showed that sodium-glucose cotransporter 2 (SGLT-2) inhibitors might prevent gout-related events in patients with type 2 diabetes mellitus, and recent studies43,44 have shown a reduction in blood uric acid levels in diabetic patients on glucose-lowering drugs. This may be related to the renal protective effects of hypoglycaemic agents45,46,47, such as SGLT-2 inhibitors, which not only promote anti-inflammatory and antifibrotic pathways, improve renal oxygenation, and reduce glomerular hypertension and hyperfiltration but also reduce the renal hypoxia characteristic of diabetes, thus exerting effects similar to those of β-blockers in the heart. However, in this survey, not taking glucose-lowering medication was negatively associated with hyperuricaemia in the type 2 diabetic population. Because the specific names of the glucose-lowering medications taken by this population were not available, further research is needed to confirm this result.

The mechanism by which lowering blood pressure reduces serum uric acid is still under investigation. In a large trial of 10,617 hypertensive patients, therapeutic control of blood pressure significantly reduced the prevalence of hyperuricaemia48, which is consistent with the present investigation. It has also been shown that appropriate systemic blood pressure control may increase uric acid excretion by modulating glomerular and tubular function, which in turn reduces serum uric acid and may ameliorate various forms of renal damage in the long term49,50.

Previous studies have shown that the kidneys eliminate 70% of uric acid daily4; therefore, renal function also influences the development and progression of hyperuricaemia. Consistent with previous studies51,52, a decrease in the eGFR indicates a decline in renal function, which can lead to serum uric acid retention and thus increase the risk of developing hyperuricaemia53. In the type 2 diabetic population, eGFR < 60 mL/min/1.73 m² is generally defined as diabetic nephropathy (DN)24; therefore, approximately 7.44% (n = 613) of the patients in this population may have had DN, and further deterioration may lead to end-stage renal disease54. Additionally, some studies55,56 showed that type 2 diabetes mellitus combined with hyperuricaemia was associated with a higher risk of all-cause mortality and end-stage renal disease. Our model also suggests that patients with BUN ≥ 7.5 mmol/L are at increased risk of developing hyperuricaemia. For these reasons, emphasis should be placed on early screening and management of renal function in the type 2 diabetic population.

When glycaemic control is poor in diabetic patients, uric acid levels fall because increased urinary glucose excretion competitively inhibits uric acid reabsorption57, which is consistent with the present findings. Recent studies have found that abnormal liver function is also a risk factor for the development of hyperuricaemia58, which may be related to the source of uric acid production. In our study, however, only elevated γ-glutamyl transpeptidase (GGT) was positively associated with hyperuricaemia in type 2 diabetic patients; this investigation therefore cannot yet identify abnormal liver function as a risk factor for hyperuricaemia, and further studies are needed.

Abnormalities in TC, TGs, HDL-C or LDL-C are generally diagnosed as dyslipidaemia, and dyslipidaemia is increasingly shown to be a risk factor for many diseases59,60,61. Previous studies62,63 have confirmed the positive correlation between TG levels and hyperuricaemia, and Nakanishi et al.64 found that baseline TGs remained an independent predictor of new-onset hyperuricaemia even when long-term medicated patients with diabetes mellitus were excluded, which is consistent with our study. Some studies have also attempted to explain the mechanism linking elevated TGs and hyperuricaemia: as TGs rise, the production and utilisation of free fatty acids in the body increase and the catabolism of adenosine triphosphate is accelerated, leading to an increase in uric acid production65. Our study found a positive association between elevated LDL-C and hyperuricaemia, which may be related to the role of LDL-C in inducing vascular inflammation, atherogenesis, calcification and thrombosis66. In agreement with Xu et al.67, we found that low HDL-C levels were associated with hyperuricaemia. HDL-C has anti-inflammatory, antioxidant and anti-apoptotic effects68, and it has also been found to reduce inflammation induced by urate crystals, suggesting that HDL-C is involved in uric acid-induced inflammatory responses69.

Clinical and public health potential

Our study identified eleven factors affecting hyperuricaemia in the type 2 diabetes population, which could provide theoretical support for clinical decision-making and give physicians ideas for treating type 2 diabetes combined with hyperuricaemia. In the health management of this population, female patients should pay special attention to their uric acid level, and the monitoring and management of risk factors such as abdominal obesity, elevated blood pressure, decreased liver and renal function, and dyslipidaemia should be strengthened in order to reduce the risk posed by type 2 diabetes mellitus combined with hyperuricaemia. This is of far-reaching significance for preventing progressive deterioration of the disease, improving quality of life and reducing medical costs.

Strengths and Limitations

The strengths of this study are its large cross-sectional sample of patients with type 2 diabetes mellitus, used to explore the risk factors for hyperuricaemia, and its models based on logistic regression and ANN algorithms, which were developed and validated. Our study also has several limitations. Firstly, the AUC of our ANN model is not outstanding and does not reach the desired level. Secondly, although we applied strict inclusion and exclusion criteria, the cross-sectional design means that causal relationships with hyperuricaemia cannot be established, and further prospective studies are needed. Thirdly, it is difficult to control the various biases in a survey, so the accuracy of some of the data is uncertain. Finally, our study did not include enough variables, in particular omitting the effect of dietary factors on hyperuricaemia. We will address these limitations in future studies to validate and refine the risk model.

Conclusion

The ANN model built in this study based on eleven variables showed acceptable discrimination (AUC 0.744) and calibration (Brier score 0.169) and can provide theoretical support for clinical decision-making and self-care of type 2 diabetes mellitus patients to mitigate the harm caused by type 2 diabetes mellitus combined with hyperuricaemia.