Efficiency score from data envelopment analysis can predict the future onset of hypertension and dyslipidemia: A cohort study

Primary prevention focuses on ensuring that healthy people remain healthy. As it is practically difficult to provide intervention for an entire healthy population, it is essential to identify and target the at risk of risks population. We aimed to distinguish at risk of risks population using data envelopment analysis (DEA). Efficiency score was calculated from the DEA using a cohort sample and its association with the onset of hypertension and dyslipidemia was analyzed. A stratification analysis was performed according to the number of conventional risk factors in participants. The adjusted odds ratios (aORs) of the incidence of hypertension and dyslipidemia according to a 0.1-point increase in efficiency score were 0.66 (90% confidence interval [CI] 0.55–0.78, p < 0.0001) and 0.84 (90% CI 0.75–0.94, p = 0.01), respectively. In the stratification analysis, aOR of the incidence of hypertension according to a 0.1-point increase in efficiency score was 0.57 (90% CI 0.37–0.89, p = 0.04) in participants with no conventional risk factors. Participants with lower efficiency score were suggested to be at high risk for future onset of hypertension and dyslipidemia. The DEA might enable us to identify the risk of hypertension where conventional methods might fail.


Data envelopment analysis
We performed DEA for hypertension using two methods of estimating salt intake: intake reported in a brief self-administered diet history questionnaire, and intake estimated from urine sodium and urine creatinine at baseline. A comparison of the two efficiency scores is shown in Supplementary Figure S3.
Results of the data envelopment analysis are shown in Supplementary Tables S1 and S2 for hypertension and dyslipidemia, respectively. There were 12 efficient (efficiency score = 1) participants (A to L DMU) in hypertension, and 10 (O to X DMU) in dyslipidemia. The five least efficient DMUs are shown in Additional Tables 1 and 2: M to Q DMU in hypertension and Y to AC DMU in dyslipidemia. Among these, we selected the four least efficient DMUs (P, Q, AB, and AC) and calculated their excess use from the lambda values (weights of the peers) shown in Supplementary Tables S1 and S2. Lambda values are the variables associated with the constraints that limit the efficiency of each unit to be no greater than 1. These results are shown in Supplementary Tables S3 and S4; for example, although Q DMU had a lower salt intake by 8.5 g, a lower energy intake by 1814.7 kcal, and more physical activity by 26.0 METs-h/day, he was at risk of hypertension (inefficient) due to his predisposition.
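
The link between efficiency scores, lambda values, and excess use can be illustrated with a minimal sketch. It assumes an input-oriented, constant-returns-to-scale (CCR) envelopment model; this excerpt does not state the orientation, the returns-to-scale assumption, or the exact input and output variables of the study, so the model choice, the function names `dea_ccr_input` and `excess_use`, and the toy data are all hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def dea_ccr_input(X, Y, o):
    """Input-oriented CCR envelopment model for DMU `o` (assumed model).

    X: array (n_dmu, n_inputs), Y: array (n_dmu, n_outputs).
    Returns (theta, lambdas): the efficiency score and the peer weights.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = X.shape
    s = Y.shape[1]

    # Decision variables: [theta, lambda_1, ..., lambda_n]; minimise theta.
    c = np.zeros(n + 1)
    c[0] = 1.0

    # Inputs:  sum_j lambda_j * x_ij - theta * x_io <= 0
    A_in = np.hstack([-X[o].reshape(m, 1), X.T])
    b_in = np.zeros(m)

    # Outputs: -sum_j lambda_j * y_rj <= -y_ro
    A_out = np.hstack([np.zeros((s, 1)), -Y.T])
    b_out = -Y[o]

    res = linprog(
        c,
        A_ub=np.vstack([A_in, A_out]),
        b_ub=np.concatenate([b_in, b_out]),
        bounds=[(0, None)] * (n + 1),
        method="highs",
    )
    return res.x[0], res.x[1:]


def excess_use(X, lambdas, o):
    """Observed inputs of DMU `o` minus the peer-weighted target inputs."""
    X = np.asarray(X, dtype=float)
    return X[o] - lambdas @ X


# Hypothetical toy data: three DMUs, one input, one output.
X_toy = np.array([[2.0], [4.0], [8.0]])
Y_toy = np.array([[2.0], [2.0], [4.0]])
theta, lam = dea_ccr_input(X_toy, Y_toy, o=1)
print(theta, lam, excess_use(X_toy, lam, o=1))
```

In this toy example the second DMU obtains a score of approximately 0.5 with the first DMU as its only peer, and its excess use is approximately 2 input units: it consumes 4 units where its peer reaches the same output with 2.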

Logistic regression analysis
In the logistic regression analysis, BMI was entered as a categorical variable in the unstratified model for dyslipidemia. The relationship between age and the logit of the incidence of dyslipidemia appeared to be non-linear in participants with three or more conventional risk factors; however, using categorized age in this model inflated the VIF value to 10.5.
Hence, we used uncategorized age. The largest VIF value among all the models was 2.29, indicating no problematic collinearity in the models. Interaction terms were not added to the models, because none of them were significant and they inflated the VIF value to an unacceptable range. Results of the two sensitivity analyses are shown in Supplementary Table S5: 1) using the efficiency score based on salt intake estimated from urinary analysis, and 2) adjusting the hypertension model for potassium intake.
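
For readers unfamiliar with the variance inflation factor (VIF), the following is a minimal sketch of how it is conventionally computed: each predictor is regressed on the remaining predictors, and VIF = 1 / (1 - R²). The function name `vif` and the toy data are hypothetical and are not taken from the study; values around 10 or above are a commonly cited rule of thumb for problematic collinearity, which this excerpt itself does not specify.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of the predictor matrix X.

    Column j is regressed (with an intercept) on all other columns by
    ordinary least squares; VIF_j = 1 / (1 - R_j^2).
    Perfectly collinear columns give R_j^2 = 1 and an undefined VIF.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        centered = y - y.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        out[j] = 1.0 / (1.0 - r2)
    return out


# Hypothetical toy predictors: two uncorrelated columns.
X_toy = np.array([[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])
print(vif(X_toy))
```

With uncorrelated predictors, as in the toy data, every VIF equals 1; the value grows as a predictor becomes better explained by the others.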
ROC curves of the efficiency scores for the onset of hypertension and dyslipidemia after the multivariate logistic regression analysis are shown in Supplementary Figure S4.
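
As a brief illustration of what an ROC curve summarizes, the sketch below computes the area under the curve (AUC) as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, counting ties as one half. The function name `roc_auc` and the toy values are hypothetical; the AUC values of the study are not given in this excerpt.

```python
import numpy as np


def roc_auc(y_true, score):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    y_true: 1 for participants who developed the outcome, 0 otherwise.
    score:  predicted risk, e.g. the fitted probability from a logistic model.
    """
    y = np.asarray(y_true).astype(bool)
    s = np.asarray(score, dtype=float)
    pos, neg = s[y], s[~y]
    # Compare every case with every non-case.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two non-cases followed by two cases.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In the toy data, three of the four case/non-case pairs are ranked correctly, giving an AUC of 0.75. An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.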

DISCUSSION
The sensitivity analyses yielded results comparable to those of the primary models shown in the main manuscript.
As described above and in Supplementary Tables S3 and S4, inefficient DMUs had better lifestyles than the efficient DMUs; however, their health status (blood pressure and serum cholesterol levels) was poorer than that of the efficient DMUs. Factors such as genetic and socio-economic background, lean body mass, and renal function could be affecting this difference in efficiency; we assume that the combined effect of these and other confounding factors, including unknown ones, is equivalent to the excess use observed in each DMU. As an example, Q DMU is predisposed to the risk of hypertension that equals the intake of 8.

Odds ratio calculated using logistic regression analysis. Efficiency score calculated using data envelopment analysis.
a Variables in the models are efficiency score, conventional risk score, age, sex, and body mass index at baseline.
b Variables in the models are efficiency score, conventional risk score, age, sex, body mass index, and potassium intake at baseline.