A diabetes risk score for Qatar utilizing a novel mathematical modeling approach to identify individuals at high risk for diabetes

We developed a diabetes risk score using a novel analytical approach and tested its diagnostic performance to detect individuals at high risk of diabetes, by applying it to the Qatari population. A representative random sample of 5,000 Qataris selected at different time points was simulated using a diabetes mathematical model. Logistic regression was used to derive the score using age, sex, obesity, smoking, and physical inactivity as predictive variables. Performance diagnostics, validity, and potential yields of a diabetes testing program were evaluated. In 2020, the area under the curve (AUC) was 0.79 and sensitivity and specificity were 79.0% and 66.8%, respectively. Positive and negative predictive values (PPV and NPV) were 36.1% and 93.0%, with 42.0% of Qataris being at high diabetes risk. In 2030, projected AUC was 0.78 and sensitivity and specificity were 77.5% and 65.8%. PPV and NPV were 36.8% and 92.0%, with 43.0% of Qataris being at high diabetes risk. In 2050, AUC was 0.76 and sensitivity and specificity were 74.4% and 64.5%. PPV and NPV were 40.4% and 88.7%, with 45.0% of Qataris being at high diabetes risk. This model-based score demonstrated comparable performance to a data-derived score. The derived self-complete risk score provides an effective tool for initial diabetes screening, and for targeted lifestyle counselling and prevention programs.

Development of the Qatari risk score. The model simulated the incidence and progression of T2DM and risk factors in the total Qatari population, that is, it generated a representation of the T2DM epidemic in the entire Qatari population in silico (i.e., through computer simulations). Subsequently, this in silico population was utilized by randomly sampling from it a total of 5,000 Qataris aged 15-79 years, i.e., the sampling frame was the entire simulated Qatari population. Sampling was implemented using a Monte Carlo sampling method. We used model classifications, outcomes, and projections, at four time points: 2012 (the year of the STEPwise survey, for validation), 2020, 2030, and 2050. Sex, age, obesity, smoking, and physical inactivity were the covariables incorporated for generating the risk score.
To derive a risk score for each time point, a multivariable logistic regression was performed, and each covariable was assigned a score based on the regression model's β-coefficient, using established methodology 12 . For each covariable, the regression β-coefficient was multiplied by 10 and rounded to the nearest integer. The aggregate risk score for each individual in the sample was obtained by adding up the scores, thus ranging between 0 and 49. No interaction terms between the covariables were considered, to keep the score easy to use 27,28 .
Assessment of the Qatari risk score performance. For each time point, performance of the risk score was evaluated by estimating the area under the receiver operating characteristic curve (AUC) and the probabilities of: a T2DM diagnosis given the individual has T2DM (sensitivity), a no-T2DM diagnosis given the individual does not have T2DM (specificity), having T2DM given a T2DM diagnosis (positive predictive value; PPV), and not having T2DM given a no-T2DM diagnosis (negative predictive value; NPV). The PPV was estimated as the proportion of individuals who were truly living with T2DM among those who were "identified" by the risk score as having T2DM. The NPV was estimated as the proportion of individuals who truly did not have T2DM among those who were "identified" by the risk score as not having T2DM.
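The scoring rule and the four diagnostic probabilities described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation (the analyses were run in SPSS): the β-coefficients, the variable names, and the helper functions `points_from_betas`, `risk_score`, and `diagnostics` are all hypothetical, and the coefficients are not the values reported in Table 1.

```python
# Hypothetical beta-coefficients for illustration only; NOT the Table 1 values.
HYPOTHETICAL_BETAS = {
    "age_35_54": 1.12,
    "age_55_plus": 2.31,
    "male": 0.18,
    "obese": 0.94,
    "smoker": 0.27,
    "inactive": 0.33,
}


def points_from_betas(betas):
    """Multiply each beta-coefficient by 10 and round to the nearest integer."""
    return {name: round(beta * 10) for name, beta in betas.items()}


def risk_score(person, points):
    """Sum the points of every risk factor the person has (no interaction terms)."""
    return sum(pts for name, pts in points.items() if person.get(name, False))


def diagnostics(scores, has_t2dm, cutoff):
    """Sensitivity, specificity, PPV and NPV when 'score >= cutoff' flags T2DM.

    Assumes each of the four denominators is non-zero.
    """
    tp = fp = tn = fn = 0
    for score, diseased in zip(scores, has_t2dm):
        flagged = score >= cutoff
        if flagged and diseased:
            tp += 1
        elif flagged:
            fp += 1
        elif diseased:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


points = points_from_betas(HYPOTHETICAL_BETAS)
person = {"age_55_plus": True, "obese": True, "inactive": True}
print(points, risk_score(person, points))
```

Because each covariable contributes a fixed integer, the score can be completed by hand, which is the reason given above for omitting interaction terms.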
The optimal cut-off score was chosen by maximizing the sum of the sensitivity and specificity. Consequently, the proportion of individuals who have a score greater than or equal to the cut-off was estimated, thus determining the proportion of individuals needing to be biochemically tested for T2DM.
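As a rough sketch of this selection rule, the snippet below scans candidate cut-offs and keeps the one with the largest sum of sensitivity and specificity, then reports the share of individuals at or above it. The function names, the toy data, and the choice of candidate cut-offs halfway between consecutive observed scores are assumptions made for illustration; the source does not describe how candidates were enumerated.

```python
def best_cutoff(scores, has_t2dm):
    """Return (cutoff, sensitivity, specificity) maximizing sensitivity + specificity.

    Assumes at least one individual with and one without T2DM, and at least
    two distinct score values.
    """
    n_pos = sum(has_t2dm)
    n_neg = len(has_t2dm) - n_pos
    distinct = sorted(set(scores))
    # Assumption: candidate cut-offs lie halfway between consecutive distinct scores.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for c in candidates:
        tp = sum(1 for s, d in zip(scores, has_t2dm) if d and s >= c)
        tn = sum(1 for s, d in zip(scores, has_t2dm) if not d and s < c)
        sens, spec = tp / n_pos, tn / n_neg
        # Strict '>' keeps the lowest cut-off when several candidates tie.
        if best is None or sens + spec > best[1] + best[2]:
            best = (c, sens, spec)
    return best


def proportion_flagged(scores, cutoff):
    """Share of individuals with a score >= cutoff, i.e. referred for biochemical testing."""
    return sum(1 for s in scores if s >= cutoff) / len(scores)


# Toy data, not study data.
toy_scores = [5, 12, 14, 26, 30, 35]
toy_status = [False, False, True, True, False, True]
cutoff, sens, spec = best_cutoff(toy_scores, toy_status)
print(cutoff, sens, spec, proportion_flagged(toy_scores, cutoff))
```

The same scan could instead pick the lowest cut-off that reaches a target specificity or sensitivity, which is the kind of trade-off a specific testing program might prefer.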
Two sensitivity analyses were conducted for the 2020 sample. In the first, the cut-off value was chosen to raise the specificity to 90%, reducing false positives and thereby the fraction of individuals needing to be biochemically tested for T2DM. In the second, the cut-off value was chosen to raise the sensitivity to 90%, increasing true positives, i.e., the proportion of individuals with T2DM detected by the risk score.
Validation of the model-derived Qatari diabetes risk score. Using the methodology described above for deriving risk scores, we derived an independent diabetes risk score directly from the 2012 Qatar STEPwise Survey data 20 , that is, a data-derived risk score that does not use the model outcomes. To assess the validity of the model-derived Qatari diabetes risk score, we applied this risk score (for the year 2012) to the (empirical) sample of the 2012 Qatar STEPwise Survey 20 , and compared its performance to that of the data-derived risk score as applied to this same sample.
Comparison with regional and international diabetes risk scores. Performance of the developed Qatari risk score was compared with validated regional and international risk scores that employ similar variables, especially obesity, which is critically important in the case of Qatar 19 . The regional risk scores were the Omani 12 , Emirati 13 , and Saudi 14 scores, while the international and widely discussed scores 4 were the American 29 , Danish 30 , Dutch 28 , Finnish 27 , Taiwanese 31 , and Thai 32 scores. Each score was reanalyzed to evaluate its performance on the Qatari population in 2020 by including only the covariables available in the Qatari sample (i.e., risk factors included in both the present study and the other published risk-score studies). For each applied risk score, we recalculated the cut-off, maximizing the sum of sensitivity and specificity for the Qatari population.
All statistical analyses were conducted using IBM SPSS Statistics 25 33 .

Results
Characteristics of simulated samples. Of
Yields of a diabetes testing program. Figure 1 and Table S2 of SM show the yields of a T2DM testing program, for each targeted subpopulation stratum. In 2020, numbers of obese, smoking, and physically inactive women that needed to be tested to identify one T2DM case ranged from 26.3, 52.4, and 64.8, respectively, for those 15-19 years old, to 2.7, 4.8, and 3.8, respectively, for those 75-79 years old (Fig. 1A). Similarly, numbers of obese, smoking, and physically inactive men that needed to be tested to identify one T2DM case ranged from 11.0, 20.8, and 23.0, respectively, for those 15-19 years old, to 2.3, 3.7, and 3.2, respectively, for those 75-79 years old (Fig. 1B). For individuals with none of these risk factors, the testing yield for women and men ranged from 67.6 and 23.3, respectively, for those 15-19 years old, to 4.9 and 3.9, respectively, for those 75-79 years old (Fig. 1). The yields in 2030 were relatively similar to those in 2020, while the yields in 2050 were superior to those in 2020 (Table S2).
Univariable and multivariable logistic regression. Table 1 shows the univariable and multivariable logistic regression results for 2020, 2030, and 2050, and the specific risk score for each variable. All considered covariables were significantly associated with T2DM in univariable-level analyses and remained so at the multivariable level (Table 1). Overall, in the multivariable analysis, age and obesity were the strongest predictors for T2DM and contributed most to the risk score (Table 1). Individuals aged ≥ 55 were at substantially higher risk of T2DM compared to younger individuals. The specific risk score for age decreased with time, while the specific risk score for sex, obesity, smoking, and physical inactivity remained largely stable (Table 1).
For illustration, the 2020 Qatari diabetes risk score was expressed using the formula illustrated in Box 1.
In the sensitivity analysis in which the cut-off value was 34.5, a value chosen to achieve a specificity of 90%, 15.8% (95% CI 14.8-16.9%) of Qataris aged 15-79 years old were at high risk of having undiagnosed T2DM in 2020, and would therefore be recommended for glycemia testing. By maximizing the specificity, 59.1% of T2DM cases would be missed.
In the sensitivity analysis in which the cut-off value was 18.5, a value chosen to achieve a sensitivity of 90%, 59.7% (95% CI 58.3-61.0%) of Qataris aged 15-79 years old were at high risk of having undiagnosed T2DM in 2020, and would therefore be recommended for glycemia testing. By maximizing sensitivity, only 10.0% of T2DM cases would be missed.
Validation of the model-derived Qatari diabetes risk score. Table S3 shows the data-derived risk score using the 2012 Qatar STEPwise Survey data. Table S4 shows the 2012 model-derived risk score, as derived using the model outcomes. Table S5 shows the performance of these two risk scores when both are applied to the 2012 STEPwise survey sample. For the model-derived risk score, the AUC was 0.69 (95% CI 0.66-0.72), similar to the AUC of the data-derived risk score of 0.70 (95% CI 0.68-0.73). Diagnostic performance was similar for the two scores.
Comparison with regional and international diabetes risk scores. Table 3 shows the performance of regional (Emirati, Omani, and Saudi) and international (American, Danish, Dutch, Finnish, Taiwanese, and Thai) risk scores as applied to the 2020 Qatari sample. For all risk scores, the AUC ranged between 0.71 and 0.77, lower than the AUC of the Qatari risk score (0.79). Of the regional risk scores, the Emirati score had the largest AUC at 0.76 (95% CI 0.74-0.78) with a sensitivity of 62.5% (95% CI 59.4-65.5%) and a specificity of 77.4% (95% CI 76.1-78.7%). Of the international risk scores, the Danish score had the largest AUC at 0.77 (95% CI 0.76-0.79) with a sensitivity of 76.1% (95% CI 73.3-78.7%) and a specificity of 66.7% (95% CI 65.3-68.2%).

Discussion
In addition to existing published methodologies for deriving diabetes risk scores, which are mostly based on logistic regression analyses of cross-sectional or prospective data 2-4 , we demonstrated a new methodology with broad utility and application. There are two major advantages to this new approach compared to existing methods. First, it can be applied to countries with limited or insufficient nationally-representative population-based survey data. Second, it dynamically factors in the temporal evolution of T2DM epidemics and T2DM risk factors; thus, it can provide risk scores at variable time points in the future. The presented approach is especially suited for countries with inconsistent or apparently conflicting survey data, as well as for countries where data are limited or sparse (such as in MENA, Africa, or other low-income countries), but where conducting T2DM modeling informed by a global understanding of T2DM epidemiology is possible. Many countries may have different population-based surveys, but the data are difficult to reconcile due to variations in survey quality, time, design, geographic coverage, and methods to ascertain T2DM and risk factors, in addition to inconsistent definitions of outcomes and differences in response rates, among others 18,34-36 . By using the introduced modeling approach, model fitting will ensure that the best fit to the data is reached, factoring in all existing survey data, adjustments/corrections to these data, and weights for the level of confidence in data from each survey, irrespective of discrepancies and limitations in available data.
Table 2. Performance of the Qatari diabetes risk score at three different time points: 2020, 2030, and 2050. Columns: Year; AUC (95% CI); Sensitivity (%; 95% CI); Specificity (%; 95% CI); PPV (%; 95% CI); NPV (%; 95% CI); Risk score cut-off €. AUC area under the curve; CI confidence interval; PPV positive predictive value; NPV negative predictive value. € The risk score cut-off was chosen based on the maximum sum of sensitivity and specificity. £ Proportion of individuals who had a risk score greater or equal to the cut-off value.
Table 3. Performance of three regional and four international diabetes risk scores in predicting diabetes mellitus among Qataris in 2020. AUC area under the curve; CI confidence interval. *For each risk score, the cut-off was recalculated to maximize sum of sensitivity and specificity for the Qatari sample.
Here we applied this methodology to Qatar, one of the most T2DM-burdened nations worldwide. T2DM prevalence in this nation was projected to reach 24.0% by 2050, with a relative increase of 43% between 2012 and 2050 19 . Close to one-third of national health expenditure in 2050 was predicted to be spent on tackling T2DM and its complications 19 . These figures highlight the urgency of cost-effective interventions for early detection of undiagnosed T2DM cases, such as the use of risk scores. This approach might also be useful as a tool for screening campaigns and programs, and for raising awareness and knowledge about T2DM and its risk factors. Our model-derived risk score, though simple to implement and non-invasive, demonstrated adequately high diagnostic accuracy with a PPV of 36% and an NPV of 93% in 2020 (Table 2). Importantly, its application to empirical survey data demonstrated a performance similar to that of a data-derived risk score (Table S5), affirming the reliability of this approach.
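The PPV and NPV quoted here depend on how common T2DM is in the screened population, not only on sensitivity and specificity. The sketch below applies Bayes' rule to show that dependence. The sensitivity and specificity are the 2020 values reported in the abstract; the prevalence of 0.19 is an illustrative assumption and not a figure reported in this section, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV, NPV and share flagged as high risk, from Bayes' rule.

    All inputs are proportions between 0 and 1.
    """
    tp = sensitivity * prevalence                # true positives per person screened
    fp = (1 - specificity) * (1 - prevalence)    # false positives
    tn = specificity * (1 - prevalence)          # true negatives
    fn = (1 - sensitivity) * prevalence          # false negatives
    return {
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "flagged": tp + fp,
    }


# 2020 sensitivity and specificity from the abstract; prevalence is an assumed value.
result = predictive_values(sensitivity=0.790, specificity=0.668, prevalence=0.19)
print({k: round(v, 3) for k, v in result.items()})
```

With this assumed prevalence, the outputs land close to the reported PPV, NPV, and proportion at high risk for 2020. Raising the prevalence raises the PPV and lowers the NPV, which matches the direction of change reported for 2050.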
Results of this model-derived score showed that a large proportion of the adult Qatari population (> 42%) has a score at or above the cut-off value and hence needs to be tested for glycaemia on a regular basis (Table 2). The model-derived risk score indicated that virtually any Qatari older than 55 years of age, or any Qatari living with obesity and older than 35 years of age, is at high risk of having undiagnosed T2DM (Table 1), and should be regularly tested for it. Similarly, recent results from developing a risk score in Jeddah, Saudi Arabia showed that everyone aged 50 years or older should be tested for glycaemia, since more than half of people in this age group have it 15 . The presented results also demonstrated large variations in the yields of T2DM testing by sex, age, and T2DM risk factors as well as over time (Fig. 1 and Table S2). The best yields of T2DM testing were attained for those older than 50 years of age, or those living with obesity, where generally well under 10 tests are needed to diagnose an individual living with T2DM.
Findings of the model-derived score indicated that despite some variation, the structure and coefficients of the risk score were only minimally variable over time (Table 1). The same was true for the proportion of Qataris that needed to be regularly tested for T2DM, which only varied between 42% in 2020 and 45% in 2050 (Table 2). Though the model-derived Qatari diabetes risk score demonstrated superior performance on the Qatari population compared to that of other regional and international risk scores (Table 3), the other risk scores still showed good diagnostic accuracy, suggesting the universality of some aspects of the global T2DM epidemic, in particular the effects of age, ageing cohorts, and obesity.
This study has some limitations. Even though the risk score was derived from a sample generated directly from the model outcomes, it did not have a perfect performance compared to the model outcomes (Table 2 and Fig. 2). By design, a risk score has to be simple in structure for ease of use; therefore, it cannot fully represent the rich modelled T2DM dynamics, such as overlap and interactions of the different T2DM risk factors 19 . We developed the risk score from model-simulated, population-based samples, akin to how risk scores are derived using samples recruited through cross-sectional, population-based surveys 12-14,28,30,32 , thereby yielding an accessible risk score that can be used broadly, both in health facilities and by the general population. The approach was also validated by comparing the model-derived risk score to that of a data-derived risk score (Table S5). Yet, the 2012 Qatar STEPwise Survey data used to validate the score were also part of the input data used to calibrate the model. Preferably, validation of the score should be based on fully independent data such as those of the next planned STEPwise Survey.
Limitations in the input data have affected the number of factors that could be included in the risk score, as well as its application to non-Qataris residing in Qatar. However, given that diabetes risk scores developed in other populations also showed good accuracy in detecting T2DM (Table 3), this provides assurance that the risk score developed here could be of utility to non-Qatari residents. Variables originally included in the mathematical model also affected the factors that could be included in the risk score. For instance, the risk score did not include family history of diabetes as this factor was not part of the original mathematical model. However, as more population-based data become available, there will be opportunities to expand the mathematical model and to refine this score by including other factors such as family history among others.
The score cut-off value was chosen by maximizing the sum of sensitivity and specificity, but other approaches could have been used, as required by any specific program, such as the need to maximize sensitivity or specificity, presented here in sensitivity analyses. Clearly, maximizing specificity will always be more efficient, but has a "cost" of missing many people with undetected T2DM. We compared our risk score with some regional and international scores, but we could not compare with other scores due to insufficient overlap with the variables used in our risk score 11,37-39 . Finally, this novel method of deriving risk scores remains to be further tested and validated by applying it to different populations and learning from these experiences.
In conclusion, a diabetes risk score for Qataris, based on a set of non-invasive and easy-to-capture variables, was derived using an innovative approach of broad utility and application, and it can account for temporal variation in T2DM epidemiology. The model-derived score demonstrated diagnostic accuracy and comparable performance to that of a data-derived score. It also identified population strata that should be prioritized for testing for glycaemia and preventive interventions. With the above findings, the developed self-complete score can be easily implemented as part of awareness campaigns and initial screening programs to determine the need for invasive biochemical testing, or to prioritize individuals for lifestyle counselling and T2DM prevention programs.

Data availability
MATLAB codes for the model can be obtained from the authors.