Introduction

With rapid economic development, lifestyles have changed dramatically worldwide, and the prevalence and incidence of type 2 diabetes mellitus (T2DM) are rising quickly. In 2015, the International Diabetes Federation (IDF) estimated that 415 million people aged 20–79 years had diabetes worldwide and that this number will increase to 642 million by 2040 [1]. In China, the prevalence of T2DM was 11.6% in 2010 [2]. The prevalence in rural areas of China followed a similar trend, rising sharply from less than 1% in 1980 to 10.3% in 2010 [2,3]. Although urbanization has accelerated in recent years, the Chinese rural population is still very large: as of 2015, 44% of the population lived in rural areas. The prevention and control of T2DM in rural areas is therefore urgent. In addition, diabetes is a major risk factor for cardiovascular diseases, including ischemic heart disease and stroke, which accounted for an estimated 12.9 million deaths globally in 2010 [4,5]. Screening high-risk individuals, taking effective preventive measures, and avoiding the risk factors for T2DM are good strategies for preventing or delaying T2DM and its cardiovascular complications.

Personalized intervention through lifestyle change and pharmacological treatment helps to prevent or delay T2DM [6]. Fasting plasma glucose (FPG), the oral glucose tolerance test (OGTT) and HbA1c are commonly used to identify T2DM in clinical and epidemiological studies [7]. However, these tests have limitations: they cannot readily identify high-risk individuals, and they are impractical for on-site screening of large populations. Many risk factors associated with diabetes can be used to recognize high-risk individuals for early intervention [8,9]. Risk scores based on risk factors that require no laboratory tests have been shown to be an effective, low-cost and noninvasive tool for identifying individuals at high risk of T2DM [10,11,12,13,14]. With an incomplete health care system and an underdeveloped economy, rural areas of China already have a high and continuously increasing prevalence of T2DM [2,3]. Establishing a suitable risk score should therefore be useful for identifying high-risk individuals and for the prevention and control of T2DM in rural areas.

A risk score for T2DM has been developed from the data of a nationwide study in China [14]. However, because the prevalence of T2DM has increased quickly and the levels of risk factors differ in the rural population of China, we sought to establish a rural risk assessment tool for T2DM (the RuralDiab risk score) based on data from the Rural Diabetes, Obesity and Lifestyle (RuralDiab) study. A separate prospective study from Henan Province was used to validate the RuralDiab risk score and to compare its performance with that of previous risk scores.

Results

Population characteristics

The characteristics of the establishment population are shown in Table 1. The crude prevalence of undiagnosed T2DM was 4.29% (234 of 5453 individuals). Age, marital status, family history of diabetes, higher vegetable and fruit intake, treatment with anti-hypertensive medication and body mass index (BMI) did not differ by sex. The percentages of high fat intake, current smoking, hypertension and dyslipidemia were higher in men than in women, whereas physical activity was lower. Detailed characteristics of the validation population are presented in Supplementary Table 1. A total of 249 cases of T2DM were detected in the validation population over the 6-year follow-up.

Table 1 Population characteristics of establishment population from the RuralDiab study for developing the RuralDiab risk score.

Establishment of risk score

Table 2 describes the results of the multivariate logistic regression analysis. The characteristics of the establishment population that were significantly associated with undiagnosed T2DM were sex, age, family history of diabetes, physical activity, waist circumference, history of dyslipidemia, diastolic blood pressure (DBP), and BMI. The Hosmer-Lemeshow test indicated a good fit (χ² = 5.25, P = 0.731): the observed prevalence of undiagnosed T2DM matched the prevalence predicted by the multivariate logistic regression model well. Guided by net reclassification improvement (NRI) analysis, BMI, DBP and history of dyslipidemia were added to a multivariate logistic regression model that already contained sex, age, family history of diabetes, physical activity and waist circumference. DBP contributed more to the risk score for T2DM than systolic blood pressure (SBP), and DBP and SBP were strongly collinear in their effect on T2DM. In the sensitivity analysis of predicting T2DM, the area under the curve (AUC) for DBP was larger than that for SBP in the multivariate logistic regression model. DBP was therefore incorporated into the risk score. Adding BMI, DBP and history of dyslipidemia improved the predicted probabilities, with NRI = 0.2192 (Z = 4.67, P < 0.001); details of the NRI are presented in Supplementary Table 2. In the establishment population, the AUC of the model with BMI, DBP and history of dyslipidemia (AUC = 0.718) was significantly higher than that of the model without them (AUC = 0.684; P = 0.010). A simple score was derived from the coefficients (β) of the logistic regression model: β < 0.3 was assigned one point, 0.3 ≤ β < 0.6 two, 0.6 ≤ β < 0.9 three, 0.9 ≤ β < 1.2 four, 1.2 ≤ β < 1.5 five, 1.5 ≤ β < 1.8 six, 1.8 ≤ β < 2.1 seven, and 2.1 ≤ β < 2.4 eight. The resulting RuralDiab risk score ranges from 0 to 36.
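The banding rule for converting coefficients into points can be written as a small lookup. The following is a minimal sketch, not the authors' code: the function name and constant are hypothetical, and the rule is applied literally to whatever β is passed in (the text does not say how reference categories or coefficients of 2.4 and above are handled, so the sketch rejects the latter).

```python
from bisect import bisect_right

# Upper bounds of the eight bands given in the text (0.3-wide steps up to 2.4).
BAND_UPPER_BOUNDS = [0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4]


def beta_to_points(beta):
    """Map a logistic regression coefficient to 1..8 points using the bands in the text."""
    if beta >= BAND_UPPER_BOUNDS[-1]:
        # Assumption: values at or above 2.4 fall outside the published bands.
        raise ValueError("beta is outside the bands listed in the text")
    # bisect_right counts how many band bounds are <= beta, so a value
    # exactly on a bound (e.g. 0.3) moves into the next band.
    return bisect_right(BAND_UPPER_BOUNDS, beta) + 1


# Invented coefficients, for illustration only.
for example_beta in (0.25, 0.3, 1.0, 2.39):
    print(example_beta, beta_to_points(example_beta))
```

An individual's total score would then be the sum of the points for the categories that apply to that person; treating each reference category as zero points is an assumption consistent with a score that starts at 0.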

Table 2 Logistic regression model with undiagnosed T2DM for the RuralDiab risk score in the establishment population.

Validation of the RuralDiab risk score and its advantages compared with others

Table 3 presents the validation of the RuralDiab risk score for predicting the risk of T2DM in an external prospective study. The AUCs of the RuralDiab risk score were 0.723 (95% CI: 0.710–0.735) in the total population, 0.711 (95% CI: 0.688–0.732) in men and 0.726 (95% CI: 0.709–0.742) in women. The optimal cutoff value was 17 in the total population. In the total population and in men, the AUC of the RuralDiab risk score was higher than those of the American Diabetes Association (ADA) score (AUC: 0.636 in total, 0.628 in men), the Inter99 score (AUC: 0.669 in total, 0.618 in men) and the Oman risk score (AUC: 0.675 in total, 0.659 in men). In women, the AUC differed significantly only between the RuralDiab risk score and the ADA score (AUC: 0.648). Compared with the New Chinese Diabetes Risk Score, the RuralDiab risk score significantly improved reclassification, with a net reclassification improvement (NRI) of 6.33% in the total population, 3.86% in men and 9.23% in women.
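An NRI of this kind comes from cross-tabulating each participant's risk category under two scores. The sketch below assumes the usual categorical definition of NRI (the net proportion of cases moving to a higher category plus the net proportion of non-cases moving to a lower one); the function name and the toy data are hypothetical and are not taken from the study.

```python
def categorical_nri(old_category, new_category, is_case):
    """Categorical NRI: net upward moves among cases plus net downward moves among non-cases."""
    up_case = down_case = n_case = 0
    up_non = down_non = n_non = 0
    for old, new, case in zip(old_category, new_category, is_case):
        if case:
            n_case += 1
            up_case += new > old      # correctly moved to a higher risk category
            down_case += new < old    # wrongly moved to a lower risk category
        else:
            n_non += 1
            up_non += new > old       # wrongly moved to a higher risk category
            down_non += new < old     # correctly moved to a lower risk category
    if n_case == 0 or n_non == 0:
        raise ValueError("need at least one case and one non-case")
    return (up_case - down_case) / n_case + (down_non - up_non) / n_non


# Toy data (invented): risk categories 0..2 under an old and a new score.
old = [0, 1, 1, 2, 1, 1, 2, 0, 2]
new = [1, 1, 2, 1, 0, 1, 1, 0, 2]
case = [1, 1, 1, 1, 0, 0, 0, 0, 0]
nri = categorical_nri(old, new, case)
print(round(nri, 2))
```

A positive value means the new score reclassifies participants in the right direction more often than in the wrong one.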

Table 3 Performance of the RuralDiab risk score and comparison with previously published risk scores for predicting T2DM in validation population.

Figure 1 compares the AUCs of the RuralDiab risk score and the previously published risk scores.

Figure 1

The comparison of the AUC for different risk scores to predict T2DM in validation population ((a) Total, (b) Men, (c) Women). RuralDiab = the RuralDiab risk score; CHN = the New Chinese Diabetes Risk Score; ADA = the American Diabetes Association score; Inter99 = the Inter99 score; Thai = the Thai risk score; Oman = the Oman risk score; T2DM = type 2 diabetes mellitus; AUC = area under the curve.

Discussion

The RuralDiab risk score, developed from a large-scale rural population study, is the first risk assessment tool for T2DM based on noninvasive factors in a rural population. The score was also validated in an external prospective study of T2DM prediction, which showed some advantages of the RuralDiab risk score over previous risk scores.

The Framingham Offspring Study reported that incident T2DM occurred mainly in middle-aged adults [15]. The RuralDiab risk score was therefore established in Chinese adults aged 30–59 years living in rural areas. Previous reports showed that T2DM is a multifactorial metabolic disorder in which environmental factors and lifestyle play important roles [16,17,18,19,20,21]. In the present analysis, sex, age, family history of diabetes, physical activity, waist circumference, history of dyslipidemia, DBP and BMI were included in the RuralDiab risk score. Compared with the previously published New Chinese Diabetes Risk Score, the RuralDiab risk score adds physical activity and history of dyslipidemia, and replaces SBP with DBP adjusted for “treated with anti-hypertensive medication”.

With some advantages over previous risk scores, especially in the validity of T2DM risk prediction, the RuralDiab risk score is a reliable and inexpensive health check tool that could be used to screen for diabetes in large populations. Although it may miss some individuals at risk of T2DM [22,23], it has clinical value. Firstly, using the RuralDiab risk score to predict T2DM may spare individuals the discomfort of invasive procedures. Secondly, the score allows both the general population and health care providers to quickly identify individuals at high risk of T2DM in rural areas. Finally, wide application of the RuralDiab risk score could improve public awareness of T2DM and help people recognize the relevant risk factors.

Although the RuralDiab risk score is the first rural assessment tool for T2DM in China based on large-scale, population-based data (the RuralDiab study), it has some limitations. Firstly, cases of undiagnosed T2DM were ascertained by fasting glucose level without OGTT or HbA1c, which might have missed some individuals with T2DM; OGTT or HbA1c will be considered in future studies. Secondly, some important covariates, such as diet and lifestyle, might be subject to reporting bias, although potential covariates were adjusted for as far as possible. Thirdly, the current performance might not be good enough for risk prediction in practice; new indicators or biomarkers, especially hereditary factors, could improve the performance of the risk score in the future. Finally, data from only one province were used to establish and validate the RuralDiab risk score, which might limit its generalizability and application. In addition, the performance of the risk tool needs to be further confirmed in multi-center prospective studies.

In conclusion, the current study developed the RuralDiab risk score, which includes sex, age, family history of diabetes, physical activity, waist circumference, history of dyslipidemia, DBP and BMI, for predicting T2DM. Compared with previously published risk scores, the RuralDiab risk score was more suitable for the rural population; it might help rural health care practitioners to assess the risk of T2DM and thereby improve awareness of disease prevention in the rural population. However, its potential clinical application remains to be determined.

Methods

Study design and participants

The establishment population of the RuralDiab risk score was derived from the Rural Diabetes, Obesity and Lifestyle (RuralDiab) study. In brief, participants were selected by stratified random cluster sampling from eligible candidates listed in the residential registration record. Firstly, 3 townships were selected from 22 rural areas of Yuzhou County, taking into account adherence and local medical conditions. Secondly, all permanent residents who satisfied the inclusion criteria and signed informed consent were selected as subjects. Ultimately, a total of 11032 participants aged 18 years and older were recruited between July and August 2015 from Yuzhou County in Henan Province, China. Participants were excluded if they (1) had previously diagnosed diabetes (n = 818); (2) were younger than 30 or older than 59 years (n = 4725); or (3) had incomplete information (n = 36). Finally, the information of 5453 participants aged 30–59 years was used to establish the RuralDiab risk score for T2DM in the present study.

An external population from a prospective study was used as the validation population to evaluate the RuralDiab risk score. The baseline study was conducted from 2007 to 2008, and 10009 participants aged 18 years and above who had lived in their current location for at least 10 years were recruited from Xinan County in Henan Province, China. Participants were then followed up during 2013 and 2014. Individuals who were lost to follow-up (n = 1280), had died (n = 580), were younger than 30 or older than 59 years (n = 1627), had diagnosed diabetes at baseline (n = 654), or had incomplete information at baseline or follow-up (n = 1215) were excluded. Ultimately, 4653 participants aged 30 to 59 years were included in the current study.
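The participant flow for both cohorts can be checked with simple arithmetic. The sketch below uses only the counts reported in the text; the function name and the dictionary layout are hypothetical conveniences, and the last line recomputes the crude prevalence of undiagnosed T2DM reported in the Results (234 of 5453).

```python
def remaining_after_exclusions(recruited, exclusions):
    """Subtract each exclusion count from the number recruited."""
    return recruited - sum(exclusions.values())


# Establishment cohort (Yuzhou County), counts as reported in the text.
establishment = remaining_after_exclusions(11032, {
    "previously diagnosed diabetes": 818,
    "aged <30 or >59 years": 4725,
    "incomplete information": 36,
})

# Validation cohort (Xinan County), counts as reported in the text.
validation = remaining_after_exclusions(10009, {
    "lost to follow-up": 1280,
    "died": 580,
    "aged <30 or >59 years": 1627,
    "diagnosed diabetes at baseline": 654,
    "incomplete information": 1215,
})

# Crude prevalence of undiagnosed T2DM in the establishment cohort, in percent.
crude_prevalence_pct = round(100 * 234 / establishment, 2)

print(establishment, validation, crude_prevalence_pct)
```

Both totals agree with the analysed sample sizes stated in the text (5453 and 4653).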

The two surveys were approved by the Zhengzhou University Medical Ethics Committee, and written informed consent was obtained from all participants. The studies were conducted in accordance with the principles of the Declaration of Helsinki.

Data collection and laboratory measurement

Under standardized methods with stringent quality control, well-trained public health workers and physicians administered a standard questionnaire to each participant in a face-to-face interview to collect information on demographics (age, sex, income status, educational level and marital status), family and individual disease history (diabetes, hypertension, coronary heart disease and stroke), and dietary intake and lifestyle (smoking, alcohol drinking, intake of fat, vegetables and fruit, and physical activity). Age was classified into three categories: ≥30 and <40, ≥40 and <50, and ≥50 and <60 years. Educational level was classified into four categories: illiterate, primary school, secondary school, and college and above. Marital status was classified into two categories: married/cohabiting and unmarried/divorced/widowed. Family history was defined as a history of the disease in the participant’s parents or siblings.

A food frequency method was used to estimate the daily intake of fat, vegetables and fruit over the past year according to the China Food Composition Table [24]. Based on the Chinese Dietary Guidelines, appropriate consumption of vegetables and fruit was defined as more than 500 g daily, and high fat intake was defined as an average of more than 75 g per day [25]. Physical activity for each participant was classified as low, moderate or high based on the International Physical Activity Questionnaire (IPAQ) [26]. Participants with a moderate or high level of physical activity were defined as physically active. Smoking status was classified as current smoking or not current smoking. In accordance with the World Health Organization definition, participants who smoked at least one cigarette per day for 6 consecutive or cumulative months were defined as current smokers [27].

Following a standard procedure, body weight, waist circumference and height were measured twice, to the nearest 0.1 kg for weight and 0.1 cm for waist circumference and height, and the average values were used. Blood pressure and heart rate were measured in the sitting position using a standardized protocol [28]. Waist circumference (in centimeters) was classified into five categories: <80, ≥80 but <90, ≥90 but <100, ≥100 but <110, and ≥110 in men; and <70, ≥70 but <80, ≥80 but <90, ≥90 but <100, and ≥100 in women. Body mass index (in kg/m²) was classified into six categories: <22, ≥22 but <24, ≥24 but <28, ≥28 but <30, ≥30 but <32, and ≥32. Diastolic blood pressure (in mmHg) was classified into four categories: <70, ≥70 but <80, ≥80 but <90, and ≥90 or treated with anti-hypertensive medication.
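The category boundaries above translate directly into cut-point lookups. The following is a minimal sketch under stated assumptions: the function and constant names are hypothetical, categories are numbered from 0 (lowest band) upwards purely for illustration, and the numbering is not the point value used in the published score.

```python
from bisect import bisect_right

# Lower bounds of every category after the first, as listed in the text.
WAIST_CUTS_CM = {"men": [80, 90, 100, 110], "women": [70, 80, 90, 100]}
BMI_CUTS = [22, 24, 28, 30, 32]
DBP_CUTS_MMHG = [70, 80, 90]


def waist_category(sex, waist_cm):
    """Return 0..4, using the sex-specific waist circumference bands."""
    return bisect_right(WAIST_CUTS_CM[sex], waist_cm)


def bmi_category(bmi):
    """Return 0..5 for the six BMI bands."""
    return bisect_right(BMI_CUTS, bmi)


def dbp_category(dbp_mmhg, on_antihypertensives=False):
    """Return 0..3; treated participants go to the top band regardless of DBP."""
    if on_antihypertensives:
        return len(DBP_CUTS_MMHG)
    return bisect_right(DBP_CUTS_MMHG, dbp_mmhg)


print(waist_category("men", 85), waist_category("women", 85))
print(bmi_category(24), dbp_category(65, on_antihypertensives=True))
```

Using `bisect_right` places a value that sits exactly on a boundary (for example a waist of 80 cm in men) in the higher band, matching the "≥" in the definitions.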

Blood specimens were collected after overnight fasting in vacuum tubes containing ethylenediaminetetraacetic acid (EDTA)-K2 and were centrifuged at 4 °C and 3000 rpm for 10 min. The plasma was transported under cold-chain conditions and stored at −80 °C for biochemical analyses. Plasma glucose was measured using a modified hexokinase enzymatic method.

Definitions

Undiagnosed T2DM was defined as a fasting plasma glucose level ≥7.0 mmol/L without previously diagnosed diabetes, based on the American Diabetes Association (ADA) diagnostic criteria [29]. After excluding type 1 diabetes mellitus, gestational diabetes mellitus and other specific types of diabetes, T2DM was defined as self-reported diagnosed diabetes or undiagnosed T2DM. All participants brought their prescribed medications to the investigation, and a self-reported history of diabetes was confirmed by the use of insulin or oral hypoglycemic agents. In addition, the charts of hospitalized patients with diabetes were reviewed.
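These case definitions reduce to two boolean rules. The sketch below is illustrative only: the function and parameter names are hypothetical, and it assumes that type 1, gestational and other specific types of diabetes have already been excluded upstream, as the text describes.

```python
FPG_THRESHOLD_MMOL_L = 7.0  # fasting plasma glucose threshold stated in the text


def is_undiagnosed_t2dm(fpg_mmol_l, previously_diagnosed):
    """Fasting plasma glucose at or above 7.0 mmol/L with no previous diabetes diagnosis."""
    return fpg_mmol_l >= FPG_THRESHOLD_MMOL_L and not previously_diagnosed


def has_t2dm(fpg_mmol_l, previously_diagnosed):
    """T2DM is either self-reported diagnosed diabetes or undiagnosed T2DM."""
    return previously_diagnosed or is_undiagnosed_t2dm(fpg_mmol_l, previously_diagnosed)


print(is_undiagnosed_t2dm(7.0, False), has_t2dm(5.6, True), has_t2dm(6.9, False))
```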

Selection of previous risk scores

This study selected representative, previously published risk scores for T2DM that use noninvasive measures and come from varied regions and ethnicities: the American Diabetes Association score (ADA) [10] from Americans, the Inter99 score (Inter99) [11] from Europeans, the Thai risk score (Thai) [12] from Thais, the Oman risk score (Oman) [13] from Arabians and the New Chinese Diabetes Risk Score (CHN) [14] from Chinese. These scores were compared with the RuralDiab risk score (RuralDiab) in the validation population.

Statistical analysis

The participants’ characteristics were compared using the Chi-square test for categorical variables and the t-test for continuous variables. For this analysis, we re-categorized these parameters and used logistic regression analysis to select factors and derive the risk score. The forward stepwise likelihood ratio method of multivariate logistic regression analysis was used to identify significant risk factors for the RuralDiab risk score. Net reclassification improvement analysis was used to determine whether adding risk factors improved the classification of the predicted probabilities from the multivariate logistic regression model [30]. Based on quintiles, the predicted probabilities of having diabetes from the model comprising sex, age, family history of diabetes, physical activity and waist circumference were classified into five categories: ≤2.1%, >2.1% and ≤2.9%, >2.9% and ≤3.9%, >3.9% and ≤5.8%, and >5.8%. The risk score was calculated from the coefficients (β) of the model. Receiver operating characteristic curves were then plotted for the RuralDiab risk score, with sensitivity on the y-axis and the false-positive rate (1 − specificity) on the x-axis. The area under the curve reflected the discriminating accuracy of the different combinations of predictors [31], and the optimal cutoff point was the peak of the curve. Sensitivity, specificity, likelihood ratio, predictive value and the AUC were used to compare the performance of the different risk scores.
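To make the ROC quantities concrete, the sketch below computes an AUC and a cutoff on invented data in plain Python. It is not the authors' SAS analysis. The AUC is computed through its rank interpretation (the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half), and the cutoff is chosen by maximizing Youden's index (sensitivity + specificity − 1), which is an assumption about what "the peak of the curve" means. All names and data are hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a case outscores a non-case (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in non_cases)
    return wins / (len(cases) * len(non_cases))


def youden_cutoff(scores, labels):
    """Return (cutoff, sensitivity, specificity) maximizing sensitivity + specificity - 1.

    A participant is classified as high risk when score >= cutoff.
    """
    n_cases = sum(labels)
    n_non_cases = len(labels) - n_cases
    best = None
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sensitivity = true_pos / n_cases
        specificity = true_neg / n_non_cases
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best[1], best[2], best[3]


# Invented risk scores and outcomes (1 = developed T2DM), for illustration only.
toy_scores = [4, 8, 11, 13, 15, 19, 21, 23, 26, 29]
toy_labels = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

toy_auc = auc(toy_scores, toy_labels)
toy_cutoff, toy_sens, toy_spec = youden_cutoff(toy_scores, toy_labels)
print(round(toy_auc, 3), toy_cutoff, toy_sens, round(toy_spec, 3))
```

The pairwise AUC is quadratic in sample size, which is fine for a demonstration; a rank-based formula or a statistics package would be preferable for cohorts of several thousand participants.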

A two-tailed P-value < 0.05 was deemed statistically significant. Statistical analyses were performed using SAS 9.3 (SAS Institute, USA).

Additional Information

How to cite this article: Zhou, H. et al. Development and evaluation of a risk score for type 2 diabetes mellitus among middle-aged Chinese rural population based on the RuralDiab Study. Sci. Rep. 7, 42685; doi: 10.1038/srep42685 (2017).
