Towards precision cardiometabolic prevention: results from a machine learning, semi-supervised clustering approach in the nationwide population-based ORISCAV-LUX 2 study

Fagherazzi, Guy; Zhang, Lu; Aguayo, Gloria; Pastore, Jessica; Goetzinger, Catherine; Fischer, Aurélie; Malisoux, Laurent; Samouda, Hanen; Bohn, Torsten; Ruiz-Castell, Maria; Huiart, Laetitia

doi:10.1038/s41598-021-95487-5

Article
Open access
Published: 06 August 2021

Guy Fagherazzi^1,2,
Lu Zhang³,
Gloria Aguayo¹,
Jessica Pastore¹,
Catherine Goetzinger^1,4,
Aurélie Fischer¹,
Laurent Malisoux¹,
Hanen Samouda¹,
Torsten Bohn¹,
Maria Ruiz-Castell¹ &
Laetitia Huiart^1,4

Scientific Reports volume 11, Article number: 16056 (2021)

Abstract

Given the rapid increase in the incidence of cardiometabolic conditions, there is an urgent need for better approaches to prevent as many cases as possible and to move from a one-size-fits-all approach to a precision cardiometabolic prevention strategy in the general population. We used data from ORISCAV-LUX 2, a nationwide, cross-sectional, population-based study. Among the 1356 participants, we applied a machine learning semi-supervised clustering method guided by body mass index (BMI) and glycated hemoglobin (HbA1c), and a set of 29 cardiometabolic variables, to identify subgroups of interest for cardiometabolic health. Cluster stability was assessed with the Jaccard similarity index. We observed 4 clusters with very high stability (ranging between 92 and 100%). Based on distinctive features that deviate from the overall population distribution, we labeled Cluster 1 (N = 729, 53.76%) as “Healthy”, Cluster 2 (N = 508, 37.46%) as “Family history—Overweight—High Cholesterol”, Cluster 3 (N = 91, 6.71%) as “Severe Obesity—Prediabetes—Inflammation” and Cluster 4 (N = 28, 2.06%) as “Diabetes—Hypertension—Poor CV Health”. Our work provides an in-depth characterization and thus a better understanding of cardiometabolic health in the general population. Our data suggest that such a clustering approach could now be used to define more targeted and tailored strategies for the prevention of cardiometabolic diseases at a population level. This study provides a first step towards precision cardiometabolic prevention and should be externally validated in other contexts.

Introduction

Globally, the epidemic of cardiometabolic diseases, such as type 2 diabetes and hypertension, is growing, and there is an urgent need for better tools to manage the crisis and prevent as many cases as possible¹. Primary prevention has been shown to be possible: lifestyle intervention, medication or bariatric surgery strategies have been shown to effectively reduce the incidence of type 2 diabetes or hypertension in at-risk individuals^2,3,4. However, these strategies may be sub-optimal and do not rely on a complete understanding of the detailed cardiometabolic profiles of the general population⁵. Most screening and prevention strategies identify eligible people from only a few factors, such as age, body mass index, metabolic syndrome, hyperglycemia or a risk score such as FINDRISC⁶. We tend to overlook a potentially high variability among individuals at a given level of risk, for instance in terms of genetic profiles⁷, inflammation, oxidative stress⁸, insulin resistance⁹ and hepatic gluconeogenesis¹⁰, that could open a window of opportunity for more relevant strategies.

Cluster analyses are useful approaches to identify subgroups with different cardiometabolic profiles. Such an approach has recently been developed among people with diabetes, where the analysis revealed 5 subgroups with different clinical profiles and risks of diabetes-related complications¹¹, but it has never been investigated in the general population at large scale¹². Moreover, the clustering approaches used in the literature so far were mostly unsupervised, i.e. they assume that there is no outcome variable and that nothing is known about the relationships between the observations in the data set, which is not a realistic assumption with respect to cardiometabolic prevention. Semi-supervised clustering techniques may therefore be better suited to derive meaningful groups¹³, similarly to what has recently been suggested in people with type 1 diabetes¹⁴, and to redefine the way we consider, prevent and treat cardiometabolic diseases in the general population, not as independent entities but rather with a more comprehensive, patient-centered approach.

Therefore, based on the unique set of cardiometabolic data available in the nationwide population-based ORISCAV-LUX 2 study, our objective was to stratify the general population in terms of cardiometabolic profiles with a high level of granularity, guided by two key factors to assess cardiometabolic health, namely (1) body mass index (BMI), the most frequently used indicator to evaluate adiposity in large populations and an established risk factor for numerous cardiometabolic disorders, highly correlated with various cardiometabolic and cardiovascular risk factors, and (2) glycated hemoglobin (HbA1c), a reliable and documented biomarker of glycemic control that is also correlated with many cardiometabolic conditions and surrogate markers^15,16,17,18. This new clustering will help to provide a better understanding of the cardiometabolic health of the general population and might eventually help to tailor and target early prevention strategies to the people who would benefit the most, thereby representing a first step towards precision prevention for cardiometabolic diseases.

Materials and methods

ORISCAV-LUX 2 study

The “Observation of cardiovascular risk factors in Luxembourg” (ORISCAV-LUX) 2 is the second wave of the nationwide cross-sectional, population-based ORISCAV-LUX study. The ORISCAV-LUX 1 survey, conducted between November 2007 and January 2009, was the first nationwide cross-sectional survey of cardiovascular health monitoring in Luxembourg with the objective of describing baseline information on the prevalence of “traditional” cardiovascular risk factors, including obesity, hypertension, diabetes mellitus, lipid disorders, smoking and physical inactivity among the general adult population in Luxembourg¹⁹. The second wave of ORISCAV-LUX was initiated in 2016 to update and monitor the evolution of cardiometabolic parameters in the general population. An extended set of health indicators, new clinical examinations and self-reported information were then integrated in this second round of data collection. The data collection workflow has already been detailed extensively elsewhere²⁰. Informed consent was obtained from all participants. The study design and information collected were approved by the National Research Ethics Committee (CNER, No 201505/12) and the National Commission for Private Data Protection (CNPD). All methods were carried out in accordance with the Declaration of Helsinki, 2008.

Study population

We included participants from the second wave of the ORISCAV-LUX study (2016–2018), where more detailed information on cardiometabolic health was available. We initially included 1558 participants, then excluded participants who only filled in the self-administered questionnaire (n = 120), who did not get a lab test (n = 51) or who had no body composition measures available (n = 30), as well as one outlier in the HbA1c distribution (HbA1c = 109 mmol/mol, n = 1). Therefore, we finally considered N = 1356 participants in the present analysis (see flow chart, Fig. 1).

Clinical and laboratory data assessment

HbA1c was measured on an HPLC analyser, Tosoh G8™. Heart rate, pulse wave velocity, central pressure, arterial age and blood pressure in the lying position were measured with Complior™. ECGs were read and interpreted by a cardiologist and then categorized as normal or abnormal. Bioimpedance measures of body fat percentage in the trunk, muscle mass in the trunk, total fat and fat free mass in the trunk were assessed with a Tanita™ digital scale. Insulin was measured on an Abbott immunology analyser (chemiluminescence technique). Insulin resistance was assessed with the HOMA-IR index, calculated as Insulin (mIU/l) × Glucose (mmol/l)/22.5. Insulin sensitivity was estimated with the QUICKI index, calculated as 1/[log (Insulin, mIU/l) + log (Glucose, mg/dl)]. Glomerular filtration rate was estimated with the MDRD formula.
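
As an illustration of the two indices defined above, the following minimal Python sketch computes HOMA-IR and QUICKI for a single participant. The function names, the example values and the glucose conversion factor (18.016 mg/dl per mmol/l) are assumptions introduced here, and the logarithm is taken in base 10, which is the usual convention for QUICKI but is not stated explicitly in the text.

```python
import math

# Assumption: 1 mmol/l of glucose corresponds to about 18.016 mg/dl.
GLUCOSE_MG_DL_PER_MMOL_L = 18.016


def homa_ir(insulin_miu_l: float, glucose_mmol_l: float) -> float:
    """HOMA-IR = insulin (mIU/l) x glucose (mmol/l) / 22.5."""
    return insulin_miu_l * glucose_mmol_l / 22.5


def quicki(insulin_miu_l: float, glucose_mg_dl: float) -> float:
    """QUICKI = 1 / [log10(insulin, mIU/l) + log10(glucose, mg/dl)].

    Base-10 logarithms are assumed here.
    """
    return 1.0 / (math.log10(insulin_miu_l) + math.log10(glucose_mg_dl))


# Hypothetical participant (values chosen for illustration only).
insulin = 5.8          # mIU/l
glucose_mg_dl = 86.0   # mg/dl
glucose_mmol_l = glucose_mg_dl / GLUCOSE_MG_DL_PER_MMOL_L

example_homa_ir = homa_ir(insulin, glucose_mmol_l)
example_quicki = quicki(insulin, glucose_mg_dl)
print(f"HOMA-IR = {example_homa_ir:.2f}, QUICKI = {example_quicki:.3f}")
```

Note that the two formulas expect glucose in different units (mmol/l for HOMA-IR, mg/dl for QUICKI), which is why the conversion is made explicit.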

Cluster analysis

We performed a semi-supervised cluster analysis guided by BMI and HbA1c to identify subgroups of interest¹³. Five measures, i.e. the means and variances of BMI and HbA1c, as well as the covariance between BMI and HbA1c, were predicted for each individual using reinforcement learning trees (RLT), a type of tree-based machine learning technique²¹. The five clustering variables (RLT-predicted means and variances of BMI and HbA1c and their covariance) were standardized and a k-means clustering algorithm²² with Euclidean distance was applied. Clustering was tested with and without taking the covariance between BMI and HbA1c into account.
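
The standardization and k-means step can be sketched as follows in plain Python. This is an illustration under stated assumptions rather than the analysis code of the study: the RLT prediction step is not reproduced, the six rows of "predicted" measures are hypothetical, and the deterministic farthest-first seeding of the cluster centers is a simplification chosen here so that the example is reproducible.

```python
import statistics


def standardize(rows):
    """Z-score every column so that it has mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    sds = [statistics.pstdev(col) for col in columns]  # assumes no constant column
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first_centers(rows, k):
    """Deterministic seeding (an illustrative choice): start from row 0, then
    repeatedly add the row that is farthest from the centers chosen so far."""
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(sq_dist(r, c) for c in centers)))
    return centers


def kmeans(rows, k, max_iter=100):
    """Lloyd's algorithm with Euclidean distance; returns (labels, centers)."""
    centers = farthest_first_centers(rows, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(r, centers[j])) for r in rows]
        if new_labels == labels:  # assignments no longer change: converged
            break
        labels = new_labels
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:  # keep the previous center if a cluster becomes empty
                centers[j] = [statistics.fmean(col) for col in zip(*members)]
    return labels, centers


# Hypothetical predicted measures per individual:
# [mean BMI, mean HbA1c, variance BMI, variance HbA1c, covariance BMI-HbA1c]
predicted = [
    [23.0, 34.0, 4.0, 2.0, 0.5],
    [24.0, 35.0, 4.5, 2.2, 0.6],
    [22.5, 33.5, 3.8, 1.9, 0.4],
    [35.0, 40.0, 9.0, 5.0, 2.0],
    [36.0, 41.0, 9.5, 5.5, 2.2],
    [34.5, 39.5, 8.8, 4.8, 1.9],
]
z_scores = standardize(predicted)
labels, centers = kmeans(z_scores, k=2)
print(labels)
```

Dropping the last column of `predicted` before standardizing corresponds to the variant in which the covariance between BMI and HbA1c is not taken into account.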

A set of 51 cardiometabolic factors was available in ORISCAV-LUX 2. The body fat and muscle mass measures from different body parts were highly correlated (Pearson coefficient > 0.95), so we only kept the body fat and muscle mass from the trunk for further analysis to increase clustering stability. Overall, a subset of 31 factors was included in the cluster analysis (the remaining factors were only used a posteriori for illustrative purposes, see Table 1). RLT prediction was performed based on the following set of cardiometabolic factors: demographic (age and sex), clinical (ECG interpretation, heart rate, carotid-femoral pulse wave velocity, central pressure, arterial age, defined as the average age for a given carotid-femoral pulse wave velocity²³, lying position blood pressure), anthropometric (waist circumference, hip circumference, thigh circumference, waist to hip ratio, anthropometrically predicted visceral adiposity²², body fat percentage in the trunk, muscle mass in the trunk, total fat and fat free mass in the trunk), and laboratory (insulin, insulin resistance, insulin sensitivity, glomerular filtration rate, creatinine, total cholesterol, LDL cholesterol, HDL cholesterol, triglycerides, CRP) measures. A missing at random mechanism was assumed and missing values were imputed using multiple imputation by chained equations (mice R package²⁴).
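
The removal of redundant body composition measures can be illustrated with a simple correlation filter. The sketch below is an assumption about how such a filter might be written, not the procedure actually run in the study: the column names and values are hypothetical, and the greedy rule (keep a variable unless its Pearson coefficient with an already kept variable exceeds 0.95) relies on the trunk measure being listed first so that it is the one retained.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_redundant(columns, threshold=0.95):
    """Greedy filter: go through the columns in insertion order and keep one
    only if its correlation with every already kept column is <= threshold."""
    kept = []
    for name, values in columns.items():
        if all(pearson(values, columns[other]) <= threshold for other in kept):
            kept.append(name)
    return kept


# Hypothetical data for five participants; the trunk measure comes first.
columns = {
    "trunk_fat_pct": [20.0, 25.0, 30.0, 35.0, 40.0],
    "arm_fat_pct": [10.5, 13.0, 15.5, 18.0, 20.5],  # linear in trunk fat
    "crp_mg_l": [3.0, 1.0, 4.0, 1.0, 3.0],           # unrelated to trunk fat
}
kept_columns = drop_redundant(columns)
print(kept_columns)
```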

Table 1 Study characteristics in the overall population. (ORISCAV-LUX 2 study, N = 1356).

Clustering stability was assessed using the clusterboot function from the fpc R package. The data are resampled 100 times using bootstrap and the Jaccard similarities²⁵ of the original clusters to the most similar clusters in the resampled data are computed. The mean over these similarities is used as an index of the stability of a cluster. The assessment was applied to clusterings with a number of clusters ranging from 3 to 8. We chose the clustering with the highest mean Jaccard similarity index across clusters, subject to the smallest cluster containing more than 20 participants. Clusters were ordered by increasing HbA1c median. Each cluster was then described according to the variables used for the clustering, but also with additional illustrative variables: lifestyle factors (physical activity assessed with the International Physical Activity (IPAQ) questionnaire, time spent in seated position and smoking status categorized into never, former and current smoker), equivalised disposable income, sedentary occupation and other health factors such as self-perceived health (five categories from excellent to poor), family history of diabetes, hypertension, hypercholesterolemia and personal history of diabetes, cancer and hypertension.
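
The bootstrap stability index described above can be sketched in a few lines of Python. This is a simplified illustration in the spirit of clusterboot, not a reimplementation of the fpc function: the handling of bootstrap samples that contain no member of a given cluster (they are skipped here), the toy BMI values and the threshold rule used as a stand-in for the real clustering function are all assumptions.

```python
import random


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def members_by_label(labels):
    """Map each cluster label to the set of positions carrying that label."""
    groups = {}
    for position, label in enumerate(labels):
        groups.setdefault(label, set()).add(position)
    return groups


def bootstrap_stability(data, cluster_fn, n_boot=100, seed=0):
    """Mean, over bootstrap samples, of the Jaccard similarity between each
    original cluster (restricted to the drawn points) and its most similar
    cluster in the re-clustered bootstrap sample."""
    rng = random.Random(seed)
    n = len(data)
    original = members_by_label(cluster_fn(data))
    totals = {label: 0.0 for label in original}
    counts = {label: 0 for label in original}
    for _ in range(n_boot):
        drawn = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        present = set(drawn)
        boot_groups = members_by_label(cluster_fn([data[i] for i in drawn]))
        # Express bootstrap clusters in terms of original indices.
        boot_sets = [{drawn[pos] for pos in grp} for grp in boot_groups.values()]
        for label, members in original.items():
            restricted = members & present
            if not restricted:  # cluster absent from this sample: skip it
                continue
            totals[label] += max(jaccard(restricted, s) for s in boot_sets)
            counts[label] += 1
    return {label: totals[label] / counts[label] for label in original if counts[label]}


# Toy data: hypothetical BMI values; a fixed threshold stands in for k-means.
bmi_values = [21.0, 22.5, 23.0, 24.1, 22.0, 23.6, 34.0, 35.5, 36.2, 33.8, 37.0, 35.0]
threshold_clustering = lambda values: [0 if v < 30 else 1 for v in values]
stability = bootstrap_stability(bmi_values, threshold_clustering)
print(stability)
```

In practice `cluster_fn` would be the k-means step applied to the standardized clustering variables, and values close to 1 indicate clusters that are recovered almost identically in the resampled data.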

Data are presented in Table 1 as n [%] and median [min, max] for categorical and continuous variables, respectively, in the entire population. In Table 2, study participants’ characteristics are displayed according to their clusters. In Table 2, we also computed the average 10-year cardiovascular risk [%] per cluster, based on either the SCORE²⁶ (validated for people < 70 years and no previous cardiovascular disease or type 2 diabetes mellitus) or the ADVANCE²⁷ (validated for people with type 2 diabetes) risk score, whichever was most appropriate. We used the median values of the continuous variables, and considered that the binary variables were present if more than 50% of the cluster were concerned. In Fig. 2, a scatter plot of body mass index and HbA1c distribution was computed and stratified by cluster group. In Fig. 3, we have plotted the distribution of the clusters in radar diagrams according to 35 key characteristics grouped in 5 themes (Diabetes-related factors, Anthropometry, Lipids & Biomarkers, Cardiovascular Health, Sociodemographic, Lifestyle and other Health Factors). For each feature, we computed the relative difference, expressed in percentage, between the median value (or frequency for categorical variables) in the cluster and the median value (or frequency for categorical variables) in the overall population.
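
The two summary rules used for the risk score inputs and the radar diagrams can be written compactly. In the minimal sketch below, the function names are introduced for illustration and the overall-population median of 36.0 mmol/mol is a hypothetical value, since the overall medians are only reported in Table 1; the cluster-level figures are those given in the Results.

```python
def relative_difference_pct(cluster_value: float, overall_value: float) -> float:
    """Relative difference (%) between a cluster summary (median or frequency)
    and the corresponding summary in the overall population."""
    return 100.0 * (cluster_value - overall_value) / overall_value


def binary_present(frequency_pct: float) -> bool:
    """A binary variable counts as present when more than 50% of the cluster
    members are concerned."""
    return frequency_pct > 50.0


# Cluster 4 median HbA1c (54.50 mmol/mol) against a HYPOTHETICAL overall median.
hba1c_relative_difference = relative_difference_pct(54.50, 36.0)
print(f"{hba1c_relative_difference:.1f}%")

# Diabetes frequency: 89.29% in Cluster 4 and 1.10% in Cluster 1.
print(binary_present(89.29), binary_present(1.10))
```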

Table 2 Study characteristics by cluster. (ORISCAV-LUX 2 study, N = 1356).

Results

Population study characteristics

The RLT model that did not take the covariance between BMI and HbA1c into account provided the most stable clusters. We iteratively tested clustering with k = 3 to 8 and defined the final number of clusters as the one that maximized the stability index while ensuring a sufficient number of individuals in each group, with at least 20 individuals. The optimal number of clusters appeared to be 4, and the analysis revealed a very high level of stability, with Jaccard similarity index values of 100%, 100%, 94% and 92% for clusters 1, 2, 3 and 4 respectively (Table 1). Based on the extensive description of characteristics of individuals in each cluster, Cluster 1 was labeled “Healthy”, Cluster 2 was labeled “Family history—Overweight—High Cholesterol”, Cluster 3 was labeled “Severe Obesity—Prediabetes—Inflammation” and Cluster 4 was labeled “Diabetes—Hypertension—Poor CV Health”.
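
The selection rule for the number of clusters can be sketched as follows. Only the k = 4 entry uses figures reported in this article (cluster sizes and Jaccard indices); the k = 3 and k = 5 entries are hypothetical placeholders added to make the example runnable, and the minimum size is applied as "at least 20", following the wording of this section.

```python
def select_clustering(candidates, min_size=20):
    """Among candidate clusterings whose smallest cluster has at least
    `min_size` members, return the one with the highest mean Jaccard index."""
    eligible = [c for c in candidates if min(c["sizes"]) >= min_size]
    return max(eligible, key=lambda c: sum(c["jaccard"]) / len(c["jaccard"]))


candidates = [
    # HYPOTHETICAL values for k = 3 and k = 5 (not reported in the article).
    {"k": 3, "sizes": [800, 500, 56], "jaccard": [0.95, 0.93, 0.90]},
    # Reported values for the retained solution.
    {"k": 4, "sizes": [729, 508, 91, 28], "jaccard": [1.00, 1.00, 0.94, 0.92]},
    {"k": 5, "sizes": [700, 500, 91, 50, 15], "jaccard": [1.00, 1.00, 0.99, 0.98, 0.97]},
]
best = select_clustering(candidates)
print(best["k"])
```

In this toy example the k = 5 candidate has the highest mean Jaccard index but is excluded because its smallest cluster has fewer than 20 members, which shows why the size constraint matters.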

Cluster 1 “Healthy” encompassed a total of N = 729 participants (53.76% of the total population). Compared to the overall population (Table 1), members of Cluster 1 were younger (median, m = 46.69 years old), with a low median HbA1c level (m = 34.00 mmol/mol) and low BMI (m = 23.36 kg/m²) (Fig. 2). They also had the lowest values for anthropometric features such as waist-to-hip ratio (m = 0.85), fat mass percentage (m = 24.30%) or predicted visceral adiposity (m = 6.00 cm²). In terms of lipids and biomarkers, they had the highest level of HDL cholesterol (m = 60.00 mg/dl), a high percentage of family history of hypercholesterolemia (42.39%) and the best renal function (GFR = 84.88 ml/min/1.73 m²). Regarding diabetes-related factors, Cluster 1 members had the lowest values for fasting blood glucose (m = 86.00 mg/dl), diabetes diagnosis (1.10%) and HOMA-IR (m = 1.24). Conversely, they had the highest insulin sensitivity (QUICKI index m = 0.37). Cluster 1 can be considered the healthiest cluster in terms of cardiovascular health, as its members had the lowest values of vascular age (m = 43.00 years), central pulse pressure (m = 38.00 mmHg), pulse wave velocity (m = 7.50 m/s), abnormal ECG reading (10.70%), and systolic blood pressure (m = 120.00 mmHg). Finally, they were also more frequently non-smokers (62.83%), had a higher income (3750.00 €/month), a higher median time spent sitting (m = 360.00 min/day) and more frequently a sedentary occupation (59.26%) (Table 1, Fig. 3). The average 10-year cardiovascular risk for Cluster 1 was 0%.

Cluster 2 “Family history—Overweight—High Cholesterol” encompassed N = 508 participants (37.46% of the total population). The vast majority of Cluster 2 members were overweight (m = 28.48 kg/m²) with low HbA1c levels (m = 37.00 mmol/mol). Overall, they had intermediate values for all considered anthropometric features. They were characterized by elevated total (m = 205.00 mg/dl) and LDL cholesterol levels (m = 128.50 mg/dl). They also had a high frequency of family history of diabetes (25.00%) and of family history of high blood pressure (43.70%). The average 10-year cardiovascular risk for Cluster 2 was 2%.

Cluster 3 “Severe Obesity—Prediabetes—Inflammation” encompassed N = 91 participants (6.71% of the total population). Cluster 3 included individuals with obesity or severe obesity with a higher BMI (m = 35.69 kg/m²) and a higher HbA1c level (m = 39.00 mmol/mol) than those in Cluster 2. Cluster 3 was characterized by the highest values for all considered anthropometric features—except waist-to-hip ratio—with elevated waist circumference (m = 114.00 cm), hip circumference (m = 118.30 cm) or fat mass percentage (m = 42.20%). They had the highest level of inflammation, based on CRP levels (m = 3.03 mg/l). Cluster 3 members had intermediate values for all diabetes-related factors and cardiovascular health factors. There was an over-representation of women in Cluster 3 (61.54%), with a relatively high level of physical activity (3558.00 MET-minutes/week). The average 10-year cardiovascular risk for Cluster 3 was 1%.

Cluster 4 “Diabetes—Hypertension—Poor Cardiovascular Health” encompassed N = 28 participants (2.06% of the population). Members of Cluster 4 were mainly individuals with overweight or obesity (BMI, m = 29.20 kg/m²) with elevated HbA1c levels (m = 54.50 mmol/mol). Cluster 4 was characterized by an elevated waist-to-hip ratio (m = 1.00). Members of Cluster 4 had the highest triglyceride levels of all clusters (m = 149.00 mg/dl). Regarding diabetes-related factors, most Cluster 4 members had diabetes (89.29%), and they had the highest levels of fasting blood glucose (m = 149.50 mg/dl), insulin (m = 16.05 μIU/ml) and insulin resistance (HOMA-IR, m = 6.54). Most of them had hypertension (89.29%) and they had the highest values for vascular age (m = 61.00 years), central pulse pressure (m = 43.00 mmHg), pulse wave velocity (m = 9.55 m/s), systolic blood pressure (m = 136.50 mmHg) and percentage of abnormal ECG readings (32.14%). When compared to the overall population, Cluster 4 members were the oldest participants (m = 63.24 years) and had an elevated time spent sitting (m = 360 min/day) but the lowest frequency of sedentary occupation (46.43%). The average 10-year cardiovascular risk for Cluster 4 was 15%.

Discussion

In this large, nationwide population-based study, we have observed 4 stable clusters of individuals from the general population with diverse cardiometabolic health profiles. Our study suggests that this classification could help disentangle the heterogeneity in the general population in terms of cardiometabolic health and be used to tailor prevention strategies. Whereas a first group of more than 50% of the total population (Cluster 1 “Healthy”) was characterized by healthy cardiometabolic features and could benefit from a general prevention strategy, the other 3 groups (Clusters 2–4) may benefit from a more personalized and intensive approach to improve their health. Individuals in Cluster 2 “Family history—Overweight—High Cholesterol” may benefit from a more comprehensive strategy regarding overweight/obesity management and cholesterol with a personalized treatment (e.g. through diet, physical activity, psychology or pharmacological treatment), starting from an early age for individuals with a family history of cardiometabolic diseases. This could delay or prevent them from transitioning from Cluster 2 to Clusters 3 or 4²⁸. People in Cluster 3 “Severe Obesity—Prediabetes—Inflammation” may benefit from an intense lifestyle management strategy adapted to individuals with moderate obesity^29,30, or bariatric surgery for those with severe obesity^31,32, with close monitoring of the impact on low-grade inflammation levels and the reversal of prediabetes to a normoglycemic status^33,34. Individuals in Cluster 4 “Diabetes—Hypertension—Poor Cardiovascular Health” are often in a multimorbid state, with diabetes and hypertension simultaneously and, for a third of them, an abnormal ECG reading or elevated triglyceride levels. Therefore, they could benefit from an intensive combined approach, personalized according to the socioeconomic profile and occupation, with nutritional/dietary³⁵ or lifestyle³⁶ interventions, smoking cessation³⁷, medication or surgery strategies, targeting both high blood pressure and diabetes with the ultimate objective of reducing arterial stiffness, preventing the occurrence of cardiovascular disease and improving general health status^38,39.

Overall, these groups may benefit from more efficient prevention and therapeutic strategies. If externally validated, this profiling could one day help general practitioners to obtain a better picture of a new patient when limited information is available and to try to optimize several cardiometabolic parameters simultaneously. Metabotypes⁴⁰, i.e. metabolomic profiles or combinations of specific metabolites used to classify individuals into groups, have previously been proposed to advance cardiometabolic prevention⁴¹. These approaches, along with other recent technologies (big data analysis of gut microbiota, integration of real-time data from wearables), are still complex and not yet cost-effective to implement in practice⁴², and our approach could help to fill the gap and move towards precision cardiometabolic prevention.

These findings are also an opportunity to rethink the strategies that can be offered, for instance to people with obesity⁴³, with new models developed according to a more refined definition of the targeted sub-population. Cardiometabolic health relies on complex, intricate physiological relationships between all the parameters considered in this work. These results imply a move from a “one-size-fits-all” vision to a precision cardiometabolic prevention approach to tackle cardiometabolic diseases according to the variety of phenotypes observed in the general population¹⁴.

Strengths and limitations

This study has numerous strengths. First, the large population size, combined with a unique set of cardiometabolic features or lifestyle and demographic factors, enabled us to extensively and deeply phenotype the general population in terms of cardiometabolic health. It has been shown that the ORISCAV-LUX 2 population was representative of the Luxembourgish adult population in terms of geographical district, but not with respect to sex and age distribution, young and elderly individuals being slightly under-represented and women over-represented. Nonetheless, it has been demonstrated that ORISCAV-LUX 2 is a reliable tool for epidemiological research and for cardiometabolic health monitoring in the adult residents of Luxembourg²⁰. We also used a semi-supervised clustering approach, guided by two main features of cardiometabolic health, which seems better suited than totally unsupervised clustering to the current state of knowledge on cardiometabolic health¹³.

This study also has some limitations. Cluster labelling is always subject to interpretation. We used, to the best of our ability, a systematic approach and relied on the most distinctive characteristics in each cluster to label them. Changing the choice of the key factors that guide the semi-supervised clustering (here BMI and HbA1c) could yield different distributions, but they were chosen as they are frequently assessed in large populations and are valid surrogates of the overall cardiometabolic health status^15,16,17,18. The relatively low number of individuals in clusters 3 and 4 could limit the inference that can be made from these groups.

The stability of the clusters has been evaluated internally, but this approach now needs to be replicated in other large nationwide population-based studies to assess the external validity of this grouping. Some factors used to describe the clusters, such as physical activity, are self-reported, and therefore could be reported differently across the clusters. In addition, neither mental health nor sleep-related factors were included in the descriptive analysis. In future replication studies, wearable devices could be used to collect objective measures of physical activity and sleep quality, which may be valuable information to add to the cluster description.

Conclusion

In conclusion, our work provides an in-depth characterization and thus, a better understanding of the general population in terms of cardiometabolic health. Our data suggest that such a clustering approach could now be used to define more targeted and tailored strategies for the prevention of cardiometabolic diseases at a population level. This study provides a first step towards precision cardiometabolic prevention and should be replicated in other contexts. Further studies evaluating the associations between these clusters and subsequent incidence of various cardiometabolic and cardiovascular diseases are warranted.

References

IDF Diabetes Atlas 9th edition 2019. (Accessed 1 July 2021); https://www.diabetesatlas.org/en/.
Diabetes Prevention Program Research Group. Reduction in the incidence of type 2 diabetes with lifestyle intervention or metformin. Obstetr. Gynecol. Surv. 58(3), 182–183 (2003).
Article Google Scholar
Shubrook, J. H., Chen, W. & Lim, A. Evidence for the prevention of type 2 diabetes mellitus. J. Am. Osteopath. Assoc. 118, 730–737 (2018).
PubMed Google Scholar
Diaz, K. M. & Shimbo, D. Physical activity and the prevention of hypertension. Curr. Hypertens. Rep. 15, 659–668 (2013).
Article Google Scholar
Sánchez, A., Silvestre, C., Campo, N., Grandes, G. & PreDE Research Group. Type-2 diabetes primary prevention program implemented in routine primary care: A process evaluation study. Trials 17, 254 (2016).
Article Google Scholar
Kivelä, J. et al. Obtaining evidence base for the development of Feel4Diabetes intervention to prevent type 2 diabetes—A narrative literature review. BMC Endocr. Disord. 20, 140 (2020).
Article Google Scholar
Padilla-Martínez, F., Collin, F., Kwasniewski, M. & Kretowski, A. Systematic review of Polygenic risk scores for type 1 and type 2 diabetes. Int. J. Mol. Sci. 21, 1703 (2020).
Article Google Scholar
Cӑtoi, A. F. et al. Metabolically healthy versus unhealthy morbidly obese: Chronic inflammation, nitro-oxidative stress, and insulin resistance. Nutrients 10, 1199 (2018).
Article Google Scholar
Samocha-Bonet, D. et al. Metabolically healthy and unhealthy obese—The 2013 Stock Conference report. Obes. Rev. 15, 697–708 (2014).
Article CAS Google Scholar
Roden, M. & Shulman, G. I. The integrative biology of type 2 diabetes. Nature 576, 51–60 (2019).
Article ADS CAS Google Scholar
Ahlqvist, E. et al. Novel subgroups of adult-onset diabetes and their association with outcomes: A data-driven cluster analysis of six variables. Lancet Diabetes Endocrinol. 6, 361–369 (2018).
Article Google Scholar
Tzeng, C.-R. et al. Cluster analysis of cardiovascular and metabolic risk factors in women of reproductive age. Fertil. Steril. 101, 1404–1410 (2014).
Article Google Scholar
Bair, E. Semi-supervised clustering methods. Wiley Interdiscip. Rev. Comput. Stat. 5, 349–361 (2013).
Article Google Scholar
Kahkoska, A. R. et al. Characterizing the weight-glycemia phenotypes of type 1 diabetes in youth and young adulthood. BMJ Open Diabetes Res. Care 8, e000886 (2020).
Article Google Scholar
Zorena, K. et al. Association between vascular endothelial growth factor and hypertension in children and adolescents type I diabetes mellitus. J. Hum. Hypertens. 24, 755–762 (2010).
Article CAS Google Scholar
Bower, J. K. et al. Glycated hemoglobin and risk of hypertension in the atherosclerosis risk in communities study. Diabetes Care 35, 1031–1037 (2012).
Article CAS Google Scholar
Takao, T., Matsuyama, Y., Suka, M., Yanagisawa, H. & Iwamoto, Y. The combined effect of visit-to-visit variability in HbA1c and systolic blood pressure on the incidence of cardiovascular events in patients with type 2 diabetes. BMJ Open Diabetes Res Care 3, e000129 (2015).
Article Google Scholar
Huang, T. et al. A network analysis of biomarkers for type 2 diabetes. Diabetes 68, 281–290 (2019).
Article CAS Google Scholar
Alkerwi, A., Pagny, S., Lair, M.-L., Delagardelle, C. & Beissel, J. Level of unawareness and management of diabetes, hypertension, and dyslipidemia among adults in Luxembourg: findings from ORISCAV-LUX study. PLoS ONE 8, e57920 (2013).
Article ADS CAS Google Scholar
Alkerwi, A. et al. Challenges and benefits of integrating diverse sampling strategies in the observation of cardiovascular risk factors (ORISCAV-LUX 2) study. BMC Med. Res. Methodol. 19, 27 (2019).
Article Google Scholar
Zhu, R., Zeng, D. & Kosorok, M. R. Reinforcement learning trees. J. Am. Stat. Assoc. 110, 1770–1784 (2015).
Article MathSciNet CAS Google Scholar
Samouda, H. et al. VAT=TAAT-SAAT: Innovative anthropometric model to predict visceral adipose tissue without resort to CT-Scan or DXA. Obesity 21, E41–E50 (2013).
Article Google Scholar
Khoshdel, A. R., Thakkinstian, A., Carney, S. L. & Attia, J. Estimation of an age-specific reference interval for pulse wave velocity: A meta-analysis. J. Hypertens. 24, 1231–1237 (2006).
Article CAS Google Scholar
van Buuren, S. & Groothuis-Oudshoorn, K. Mice: Multivariate imputation by chained equations in R. J. Stat. Softw. 45, 1–67 (2011).
Article Google Scholar
Hancock, J. M. Jaccard distance (Jaccard Index, Jaccard similarity coefficient). Dict. Bioinform. Comput. Biol. https://doi.org/10.1002/9780471650126.dob0956 (2004).
Article Google Scholar
Motamed, N. et al. Comparison of cardiovascular risk assessment tools and their guidelines in evaluation of 10-year CVD risk and preventive recommendations: A population based study. Int. J. Cardiol. 228, 52–57 (2017).
Article Google Scholar
Woodward, M. et al. Prediction of 10-year vascular risk in patients with diabetes: The AD-ON risk score. Diabetes Obes. Metab. 18, 289–294 (2016).
Article CAS Google Scholar
Mutie, P. M., Giordano, G. N. & Franks, P. W. Lifestyle precision medicine: The next generation in type 2 diabetes prevention? BMC Med. 15, 171 (2017).
Article Google Scholar
König, D., Hörmann, J., Predel, H.-G. & Berg, A. A 12-month lifestyle intervention program improves body composition and reduces the prevalence of prediabetes in obese patients. Obes. Facts 11, 393–399 (2018).
Article Google Scholar
Lv, N. et al. Behavioral lifestyle interventions for moderate and severe obesity: A systematic review. Prev. Med. 100, 180–193 (2017).
Article Google Scholar
Nguyen, N. T. & Varela, J. E. Bariatric surgery for obesity and metabolic disorders: State of the art. Nat. Rev. Gastroenterol. Hepatol. 14, 160–169 (2017).
Article Google Scholar
Sheetz, K. H., Gerhardinger, L., Dimick, J. B. & Waits, S. A. Bariatric Surgery and long-term survival in patients with obesity and end-stage kidney disease. JAMA Surg. https://doi.org/10.1001/jamasurg.2020.0829 (2020).
Article PubMed PubMed Central Google Scholar
Kerrison, G. et al. The effectiveness of lifestyle adaptation for the prevention of prediabetes in adults: A systematic review. J. Diabetes Res. 2017, 8493145 (2017).
Moutzouri, E., Tsimihodimos, V., Rizos, E. & Elisaf, M. Prediabetes: To treat or not to treat? Eur. J. Pharmacol. 672, 9–19 (2011).
Jenkins, D. J. A. et al. Effect of a dietary portfolio of cholesterol-lowering foods given at 2 levels of intensity of dietary advice on serum lipids in hyperlipidemia: A randomized controlled trial. JAMA 306, 831–839 (2011).
Johansen, M. Y. et al. Effect of an intensive lifestyle intervention on glycemic control in patients with type 2 diabetes: A randomized clinical trial. JAMA 318, 637–646 (2017).
Li, W. H. C. et al. Effectiveness of a brief self-determination theory-based smoking cessation intervention for smokers at emergency departments in Hong Kong: A randomized clinical trial. JAMA Intern. Med. https://doi.org/10.1001/jamainternmed.2019.5176 (2019).
Ikramuddin, S. et al. Lifestyle intervention and medical management with vs without Roux-en-Y gastric bypass and control of hemoglobin A1c, LDL cholesterol, and systolic blood pressure at 5 years in the diabetes surgery study. JAMA 319, 266–278 (2018).
Byrne, J. L. et al. Effectiveness of the ready to reduce risk (3R) complex intervention for the primary prevention of cardiovascular disease: A pragmatic randomised controlled trial. BMC Med. 18, 198 (2020).
Riedl, A., Gieger, C., Hauner, H., Daniel, H. & Linseisen, J. Metabotyping and its application in targeted nutrition: An overview. Br. J. Nutr. 117, 1631–1644 (2017).
Brennan, L. Use of metabotyping for optimal nutrition. Curr. Opin. Biotechnol. 44, 35–38 (2017).
Wang, D. D. & Hu, F. B. Precision nutrition for prevention and management of type 2 diabetes. Lancet Diabetes Endocrinol. 6, 416–426 (2018).
Kar, P. Partha Kar: Our approach to tackling obesity needs rethinking. BMJ 370, m3034 (2020).

Acknowledgements

The authors thank the participants in the ORISCAV-LUX 2 study and the research nurses from the Luxembourg Institute of Health for collecting the data. The authors are also grateful to the ORISCAV-LUX Study group, which is composed of Ala’a Alkerwi, Stephanie Noppe, Charles Delagardelle, Jean Beissel, Anna Chioti, Saverio Stranges, Jean-Claude Schmit, Marie-Lise Lair, Marylène D’Incau, Jessica Pastore, Gwenaëlle Le Coroller, Gloria Aguayo, Brice Appenzeller, Laurent Malisoux, Sophie Couffignal, Manon Gantenbein, Yvan Devaux, Michel Vaillant, Laetitia Huiart, Dritan Bejko, Torsten Bohn, Hanen Samouda, Guy Fagherazzi, Magali Perquin, Maria Ruiz-Castell and Isabelle Ernens.

Funding

The ORISCAV-LUX 2 study was funded by the Luxembourg Institute of Health. The funding body had no role in the design of the study, in the collection, analysis, and interpretation of data, or in writing the manuscript. GF has received consulting fees from Lilly, MSD, Roche Diabetes Care, AstraZeneca, Diabeloop and Danone Research.

Author information

Authors and Affiliations

Deep Digital Phenotyping Research Unit, Department of Population Health, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, 1445, Strassen, Luxembourg
Guy Fagherazzi, Gloria Aguayo, Jessica Pastore, Catherine Goetzinger, Aurélie Fischer, Laurent Malisoux, Hanen Samouda, Torsten Bohn, Maria Ruiz-Castell & Laetitia Huiart
Center of Epidemiology and Population Health UMR 1018, Inserm, Gustave Roussy Institute, Paris South - Paris Saclay University, Villejuif, France
Guy Fagherazzi
Quantitative Biology Unit, Luxembourg Institute of Health, 1A-B, rue Thomas Edison, 1445, Strassen, Luxembourg
Lu Zhang
University of Luxembourg, 2, avenue de l’Université, 4365, Esch-sur-Alzette, Luxembourg
Catherine Goetzinger & Laetitia Huiart

Contributions

G.F. and L.Z. wrote the main manuscript text. L.Z. prepared Fig. 2. G.F. prepared Figs. 1 and 3. All authors reviewed the manuscript.

Corresponding author

Correspondence to Guy Fagherazzi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Fagherazzi, G., Zhang, L., Aguayo, G. et al. Towards precision cardiometabolic prevention: results from a machine learning, semi-supervised clustering approach in the nationwide population-based ORISCAV-LUX 2 study. Sci Rep 11, 16056 (2021). https://doi.org/10.1038/s41598-021-95487-5

Received: 14 October 2020
Accepted: 27 July 2021
Published: 06 August 2021
DOI: https://doi.org/10.1038/s41598-021-95487-5
