Introduction

Hypertension is one of the most important preventable causes of cardiovascular disease (CVD), stroke, chronic kidney disease, and dementia, and it caused approximately 8.5 million deaths in 2015 in low- and middle-income countries1. Reports show that the prevalence of hypertension in Iran is 25%2. Hypertension is associated with well-known risk factors such as age, gender, family history, smoking, alcohol consumption, central obesity, overweight, and physical inactivity3,4. Among these, obesity has gained significant attention over the past years5.

Body mass index (BMI) is a widely used anthropometric measurement and, despite its known inaccuracies, is still commonly used to determine obesity and assess health risks such as hypertension6. Complementary measures such as waist circumference, waist-to-hip ratio (WHR), and body composition analysis improve the prognostic efficiency of BMI7. Evidence shows that body fat distribution is a more important determinant of cardiovascular morbidity and mortality than increased fat mass8,9,10, further indicating that detailed assessment of body composition is beneficial for health risk estimation.

In the past few years, a growing number of researchers have used machine learning and data mining algorithms to diagnose and treat health conditions such as heart11 and brain12 diseases. The non-invasive nature and accuracy of these methods have enabled health professionals to quickly identify at-risk individuals and apply more efficient preventive and management strategies13.

In this study, we used machine learning approaches to investigate whether BIA-derived body composition indices predict hypertension in a cohort of patients.

Methods

Study design and participants

The Fasa cohort study14 recruited at least 10,000 people and assessed predisposing factors for non-communicable diseases in rural regions of Fasa, Iran. In the present study, we used a subset of these data comprising 4663 records, of which 2156 were male, 1099 had HTN, and the age range was 35–70 years. Hypertension diagnosis was based on the blood pressure threshold defined by the ACC/AHA guidelines15. All participants had given informed consent, and the Shiraz University of Medical Sciences ethics committee approved this study.

Body composition analysis

Body composition analysis was performed using an eight-electrode BIA machine (Tanita Segmental Body Composition Analyzer BC-418 MA, Tanita Corp., Japan). The following variables were measured:

  1. Fat mass (FATM): Total Fat Mass (FATM), Left and Right Leg Fat Mass (LLFATM & RLFATM), Left and Right Arm Fat Mass (LAFATM & RAFATM), and Trunk Fat Mass (TRFATM).

  2. Fat percentage (FATP): Total Fat Percentage (TFATP), Left and Right Leg Fat Percentage (LLFATP & RLFATP), Left and Right Arm Fat Percentage (LAFATP & RAFATP), and Trunk Fat Percentage (TRFATP). Fat percentage is calculated as (fat mass)/weight × 100 (a small numeric sketch follows this list).

  3. Fat-free mass (FFM): Total Fat-Free Mass (FFM), Left and Right Leg Fat-Free Mass (LLFFM & RLFFM), Left and Right Arm Fat-Free Mass (LAFFM & RAFFM), and Trunk Fat-Free Mass (TRFFM).

  4. Basal metabolic rate (BMR).
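As a quick numeric illustration of the fat-percentage relation above (the weight and fat mass values are hypothetical, not taken from the cohort):

```python
# Minimal sketch of: fat percentage = fat mass / weight * 100.
weight_kg = 78.0    # hypothetical total body weight
fat_mass_kg = 21.5  # hypothetical total fat mass (FATM)

fat_percentage = fat_mass_kg / weight_kg * 100  # FATP
print(f"FATP = {fat_percentage:.1f}%")          # FATP = 27.6%
```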

Dataset

Our dataset included 4663 records, of which 1099 were hypertensive. Among the 2156 males and 2507 females, 430 and 669 cases were hypertensive, respectively. The input features were: age (between 35 and 70), gender ID (1: male, 2: female), BMR, FATM, FATP, FFM, LLFATP, RLFATP, LLFFM, RLFFM, LLFATM, RLFATM, LAFATP, RAFATP, LAFATM, RAFATM, LAFFM, RAFFM, TRFATP, TRFATM, and TRFFM. The target feature was the discrete binary hypertension status, i.e., yes or no.
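A minimal loading sketch, assuming a pandas-readable export of this subset with the column names above (the file name and exact labels are hypothetical; the cohort data are not public):

```python
import pandas as pd

# Hypothetical export of the cohort subset.
df = pd.read_csv("fasa_cohort_subset.csv")

features = [
    "age", "gender", "BMR", "FATM", "FATP", "FFM",
    "LLFATP", "RLFATP", "LLFFM", "RLFFM", "LLFATM", "RLFATM",
    "LAFATP", "RAFATP", "LAFATM", "RAFATM", "LAFFM", "RAFFM",
    "TRFATP", "TRFATM", "TRFFM",
]
X = df[features]
y = df["hypertension"]  # binary target: 1 = hypertensive, 0 = normotensive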

Institutional approval was granted for the use of the patient datasets in research studies for diagnostic and therapeutic purposes; approval was granted on the grounds of existing datasets. Informed consent was obtained from all of the patients in this study. All methods were carried out in accordance with relevant guidelines and regulations. Ethical approval for the use of these data was obtained from the Tehran Omid hospital.

Investigated machine learning and data mining algorithms

We utilized several widely used classification algorithms: Support Vector Classifier (SVC)16, Decision Tree (DT)17, Stochastic Gradient Descent (SGD) Classifier18, Logistic Regression (LR)19, Gaussian Naïve Bayes (GNB)20, K-Nearest Neighbor (K-NN)21, Multi-Layer Perceptron (MLP)22, Random Forest (RF)23, Gradient Boosting (GB)24, Histogram-based Gradient Boosting (HGB)25, Bagging26, Extra Tree (ET)27, AdaBoost28, Voting29, and Stacking30.
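Most of these classifiers are available in scikit-learn. As a rough sketch, they can be instantiated as follows at library defaults; the paper's non-default settings (its Table 1) are not reproduced here, and Voting and Stacking, which require base estimators, are sketched after their descriptions below:

```python
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import SGDClassifier, LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import (
    RandomForestClassifier, GradientBoostingClassifier,
    HistGradientBoostingClassifier, BaggingClassifier,
    ExtraTreesClassifier, AdaBoostClassifier,
)

# One instance per algorithm named above, at scikit-learn defaults.
classifiers = {
    "SVC": SVC(),
    "DT": DecisionTreeClassifier(),
    "SGD": SGDClassifier(),
    "LR": LogisticRegression(max_iter=1000),
    "GNB": GaussianNB(),
    "K-NN": KNeighborsClassifier(),
    "MLP": MLPClassifier(max_iter=500),
    "RF": RandomForestClassifier(),
    "GB": GradientBoostingClassifier(),
    "HGB": HistGradientBoostingClassifier(),
    "Bagging": BaggingClassifier(),
    "ET": ExtraTreesClassifier(),
    "AdaBoost": AdaBoostClassifier(),
}
```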

Below, these algorithms are briefly explained, with references for more detailed descriptions. We then introduce the metrics used to evaluate their effectiveness.

To classify the data, SVC tries to find the best hyperplane separating the different classes, where the criterion for choosing the hyperplane is maximizing its distance to the nearest sample points. The difference between SVC and the Support Vector Machine (SVM) is that the SVC hyperplane classifies the data linearly, whereas SVM compensates for this limitation by separating the dataset non-linearly31.
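In scikit-learn terms, this distinction corresponds roughly to a linear classifier versus a kernelized one; a minimal sketch:

```python
from sklearn.svm import LinearSVC, SVC

# Linear separation: a flat hyperplane, as described for SVC above.
linear_clf = LinearSVC()        # or equivalently SVC(kernel="linear")

# Non-linear separation via the kernel trick, as described for SVM.
kernel_clf = SVC(kernel="rbf")  # radial basis function kernel
```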

DT is a supervised learning algorithm used for classification and regression. This method learns a model that predicts the value of a target feature from decision rules inferred from the sample features32.

The SGD classifier is a linear classifier optimized by stochastic gradient descent33.

LR is a classification algorithm that uses a logistic function to model the dependent variable. Because this variable can take only two values, LR is used only for problems with binary target features. The sigmoid function in LR maps the predicted values to probabilities34.

GNB is a probabilistic classification algorithm that applies the Bayes theorem under the assumption that the features are conditionally independent of each other. The algorithm requires training data to estimate the parameters needed for classification. Since its implementation is simple, it is used to solve many classification problems20.

The K-NN algorithm is a non-parametric, supervised classifier that uses proximity to perform classification, on the assumption that similar points lie near each other. A class label is assigned to a sample by majority vote among its K nearest neighbors35.

MLP is a supervised learning algorithm that learns a function from a dataset and uses it to predict the class of new samples. The algorithm has a network structure consisting of several layers of nodes, with each layer connected to the next. Nodes in the first layer represent the input data; the remaining nodes map inputs to outputs by combining them linearly using a set of weights and a bias and then applying an activation function36.

RF is an ensemble learning method for classification that consists of many decision trees built from the training data. Its output is the class suggested by the majority of the trees. The ensemble helps avoid over-fitting the training set, and random forest performance is usually better than that of a single decision tree, although the improvement depends on the data37.

Another machine learning algorithm is GB, which makes predictions using an ensemble of weak prediction models such as decision trees. GB is one of the most popular methods for structured classification and regression modeling and can cover a wide range of datasets. However, it suffers from slow training, mainly when used on large datasets (number of samples ≥ 10,000). To solve this problem, the trees added to the ensemble are trained after discretizing (binning) the continuous input variables into a few hundred unique values24. This modification dramatically increases execution speed compared to the plain Gradient Boosting classifier; GB ensembles that implement this technique are referred to as HGB classifiers. HGB can also manage missing values: during training, at each split point, the tree learns whether samples with missing values should be assigned to the left or right child based on the potential gain, and if a feature has no missing values during training, samples with missing values at prediction time are mapped to the child with the most samples25.
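A minimal sketch of this missing-value handling with scikit-learn's HistGradientBoostingClassifier, on illustrative synthetic data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier

# Toy data with missing entries; HGB bins the continuous inputs and, at
# each split, learns which child should receive the NaN samples.
rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X[rng.random(X.shape) < 0.1] = np.nan  # knock out ~10% of the entries

clf = HistGradientBoostingClassifier().fit(X, y)  # no imputation needed
print(clf.predict(X[:5]))  # NaNs are also handled at prediction time
```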

A bagging classifier is an ensemble meta-classifier that fits a set of base classifiers on random subsets of the original dataset and aggregates their outputs into a final prediction. The base classifiers are trained independently in parallel, each on its own resampled training set: much of the original data may appear repeatedly in a given resampled set, while other records may be omitted38.

The ET classifier, also known as an extremely randomized trees classifier, is an ensemble learning technique that performs classification by aggregating the results of several de-correlated decision trees collected in a forest. Its performance is very similar to that of RF, but the trees are built differently: each tree is grown from the original training sample, and at each test node the tree receives a random subset of the features, from which it must select the best splitting feature according to a mathematical criterion such as the Gini index. This random selection yields multiple de-correlated decision trees27.

An Adaptive Boosting (AdaBoost) classifier is a meta-classifier algorithm. This ensemble algorithm starts by fitting a classifier on the original dataset and then fits additional copies of the classifier on the same data, each time adjusting the weights of misclassified samples so that subsequent classifiers focus more on difficult cases. The outputs of these classifiers are combined by weighted summation to produce the final classification39.

The voting classifier is a meta-classifier that trains base models whose outputs are used to derive the final result. The results of the base learners are aggregated in one of two ways: hard voting or soft voting. In the former, votes are cast as the output class declared by each base learner; in the latter, the output class is based on the probabilities predicted by the base classifiers40.
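A brief scikit-learn sketch of the two aggregation modes; the base learners chosen here are illustrative, not the paper's configuration:

```python
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

base = [("lr", LogisticRegression(max_iter=1000)),
        ("gnb", GaussianNB()),
        ("dt", DecisionTreeClassifier())]

# Hard voting: majority of predicted class labels.
hard_vote = VotingClassifier(estimators=base, voting="hard")
# Soft voting: argmax of the averaged predicted probabilities.
soft_vote = VotingClassifier(estimators=base, voting="soft")
```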

Stacking, or stacked generalization, is an ensemble meta-learning algorithm that learns how best to combine the predictions of two or more base machine learning algorithms. Its advantage is that the capabilities of a range of well-performing algorithms can be exploited to achieve performance that none of them achieves individually41.
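A corresponding scikit-learn sketch; again, the base learners and the meta-learner are assumptions for illustration only:

```python
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

# The meta-learner (final_estimator) is trained on the base learners'
# cross-validated predictions to combine them.
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier()), ("gnb", GaussianNB())],
    final_estimator=LogisticRegression(max_iter=1000),
)
```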

We apply these algorithms to our dataset, but first some preprocessing must be performed on the training data.

Data preprocessing

To improve the performance of the algorithms, feature selection was applied. Feature selection algorithms choose a subset of features for model construction and are commonly used to simplify models and make them easier to interpret; they also shorten training time and help avoid the curse of dimensionality. The feature selection algorithms tested in our research were best first42, genetic algorithm43, greedy forward selection44, greedy backward elimination44, decision tree45, random forest46, and particle swarm optimization (PSO)47. Among them, the genetic algorithm showed the best performance, and the rest of this research was organized according to its results. It identified FATP, AFFM, BMR, FFM, TRFFM, AFATP, LFATP, and older age as the top features for hypertension prediction.
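scikit-learn does not ship a genetic-algorithm selector, so as a hedged sketch we show greedy forward selection, one of the other methods tested above, via SequentialFeatureSelector; the wrapped estimator is an illustrative choice, and the feature count merely mirrors the eight top features reported:

```python
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Greedy forward selection: features are added one at a time, keeping
# whichever addition most improves cross-validated performance.
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=8,  # matches the eight top features reported
    direction="forward",     # "backward" gives greedy backward elimination
)
# selector.fit(X, y); X_selected = selector.transform(X)
# (X, y as in the dataset-loading sketch above)
```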

Evaluation metrics

In this research, we used the confusion matrix to test and compare the algorithms' effectiveness. The confusion matrix is a popular tool for evaluating the performance of binary and multi-class classification. Figure 1 shows a confusion matrix48,49,50.

Figure 1

Confusion matrix and its data.

The confusion matrix shows how many outputs are correctly classified and how many are misclassified. In this table, "TN" (true negative) is the number of negative samples correctly classified; "TP" (true positive) is the number of positive samples correctly classified; "FP" (false positive) is the number of samples misclassified as positive; and "FN" (false negative) is the number of positive samples misclassified as negative. From these values, one of the most common metrics for evaluating classification algorithms, accuracy, is calculated with Eq. (1)51,52.

$$Accuracy=\frac{TP+TN}{TP+TN+FP+FN}.$$
(1)

Precision, sensitivity (or recall), specificity, and F1-score are other popular performance metrics, calculated according to the following equations:

$$Macro\,\, Average\,\, Precision=\frac{\frac{TP}{TP+FP}+\frac{TN}{TN+FN}}{2},$$
(2)
$$Macro\,\, Average \,\,Sensitivity = \frac{\frac{TP}{TP+FN}+\frac{TN}{TN+FP}}{2},$$
(3)
$$Specificity=\frac{TN}{TN+FP},$$
(4)
$$F1-score=2*\frac{precision*sensitivity }{precision+sensitivity}.$$
(5)
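The following sketch computes Eqs. (1) through (5) from a scikit-learn confusion matrix; the labels are illustrative toy data, and Eq. (5) is evaluated with the positive-class precision and sensitivity:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])  # toy ground-truth labels
y_pred = np.array([0, 1, 1, 1, 0, 0, 1, 0])  # toy predictions

# For binary labels {0, 1}, ravel() returns counts in this order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)                    # Eq. (1)
macro_precision = (tp / (tp + fp) + tn / (tn + fn)) / 2       # Eq. (2)
macro_sensitivity = (tp / (tp + fn) + tn / (tn + fp)) / 2     # Eq. (3)
specificity = tn / (tn + fp)                                  # Eq. (4)
precision, sensitivity = tp / (tp + fp), tp / (tp + fn)
f1 = 2 * precision * sensitivity / (precision + sensitivity)  # Eq. (5)
```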

Using these metrics, the above-mentioned classification algorithms are compared. The flowchart of the proposed method is shown in Fig. 2.

Figure 2

The flowchart of the methodology used in this research.

As there is an obvious class imbalance between normal individuals (negative cases) and diseased individuals (positive cases), model training may become biased toward predicting the normal class, resulting in high specificity and low sensitivity. To address this, three oversampling and two undersampling methods were applied to the dataset: SMOTE53, Random Oversampling54, and ADASYN55 for oversampling, and Random Undersampling56 and NearMiss57 for undersampling. Only the results obtained on data oversampled with SMOTE and data undersampled with NearMiss are reported, because these performed best. With SMOTE, the number of cases increased to 7128 with equal numbers of positive and negative cases; with NearMiss, the number of cases decreased to 2198 with an equal number of samples in each class.
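A minimal sketch of both resampling steps with imbalanced-learn, using synthetic stand-in data with roughly the cohort's class ratio (the real features are not reproduced here):

```python
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import NearMiss

# Toy stand-in for the cohort: ~24% positive cases, mirroring
# 1099 hypertensive out of 4663 records.
X, y = make_classification(n_samples=4663, weights=[0.764], random_state=42)

# SMOTE synthesizes minority samples until the classes balance (~7128 total).
X_over, y_over = SMOTE(random_state=42).fit_resample(X, y)

# NearMiss keeps the majority samples closest to the minority class (~2198 total).
X_under, y_under = NearMiss().fit_resample(X, y)
```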

In addition, the MetaCost58 method was used to increase the penalty on negative cases.
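MetaCost itself is a relabeling-based cost-sensitive meta-algorithm with no scikit-learn implementation; as a hedged stand-in, class weights achieve a similar effect by raising the misclassification penalty on one class. The weights below are illustrative only, not the paper's cost matrix:

```python
from sklearn.linear_model import LogisticRegression

# Cost-sensitive training via class weights: errors on class 1 are
# penalized three times as heavily as errors on class 0 (illustrative ratio).
cost_sensitive_lr = LogisticRegression(max_iter=1000,
                                       class_weight={0: 1.0, 1: 3.0})
```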

Experimental results

In this section, we report and compare the results of applying the classification algorithms described in the Methods section. The algorithms were implemented in Python version 3.10.0 using its off-the-shelf modules and were run on the Windows 11 operating system. The default settings of the algorithms were used, except those listed in Table 1.

Table 1 Settings of the used algorithms.

Tables 2, 3, and 4 list the accuracy, precision, recall, F1-score, and AUC on the training and test data for these algorithms when oversampled data, undersampled data, and the original data (with an increased penalty for negative cases in the model) were used, respectively. In our research, the genetic algorithm showed the best feature selection performance, so the results reported in Tables 2, 3, and 4 were calculated with the features it selected.

Table 2 Performance metrics of different classification algorithms applied on oversampled data.
Table 3 Performance metrics of different classification algorithms applied on undersampled data.
Table 4 Performance metrics of different classification algorithms applied on original data while the penalty for negative cases in the model was increased.

AutoMLP achieved the best accuracy, commonly followed by Stacking and Voting. The performance of the algorithms on the training set is also reported; this makes it possible to check whether a model is over-fitting or under-fitting and helps with adjusting model parameters to improve the classification results. As these tables show, performance with oversampling is better than with undersampling or with the original data.

Discussion

In the present study of a cohort population, we used machine learning methods and found that BIA-derived body composition indices predict hypertension with acceptable accuracy. FATP, AFFM, BMR, FFM, TRFFM, AFATP, LFATP, and older age were the top features for hypertension prediction. FATP, AFATP, LFATP, TRFATP, higher age, and female gender were directly associated with HTN, whereas FFM, AFFM, LFFM, TRFFM, BMR, and male gender were inversely linked to HTN. The AutoMLP, stacking, and voting methods performed best for hypertension prediction, with accuracy rates of 90%, 84%, and 83%, respectively.

Total FATP and FFM

Various other studies confirm the direct link of body fat mass (and percentage) with blood pressure59,60,61. Park et al.62, in a prospective cohort study, showed that a high body fat percentage (more than 19.9% in men and 32.5% in women) was associated with an increased risk of incident hypertension regardless of BMI, waist circumference, and WHR. Although body fat mass and percentage are superior to BMI for morbidity risk assessment, a study63 in an Iranian population showed that BMI predicts CVD better than body fat percentage. Another study64 of American postmenopausal women with normal BMI found no relation of whole-body fat mass or percentage with CVD risk, although regional body fat showed significant associations. These discrepancies may be due to differences in body composition analysis methods and in ethnicity.

Contrary to our results, some investigations in adult and pediatric populations found that FFM is positively related to systolic, diastolic, or mean blood pressure65,66,67,68,69,70,71. Korhonen et al.66 attribute this finding to muscle mass properties: during the daytime and during contraction, skeletal muscles release myokines that may increase blood pressure. This explanation agrees with the findings of Ye et al.60 in a Chinese population, where total skeletal mass (TSM) indices, primarily arm lean body mass, were positively associated with blood pressure, pre-HTN, and HTN.

Trunk FATP and FFM

Previous studies have established the positive association of TRFATM with hypertension and CVD72, and our data further support that BIA-measured abdominal adiposity is positively associated with hypertension73. Chen et al.64 assessed CVD incidence in postmenopausal women with normal BMI over a median of 17.9 years. Using dual X-ray absorptiometry (DXA), they found that higher TRFATP and lower LFATP were associated with higher CVD risk.

In another study71, using DXA body measurements and machine learning methods, researchers showed that TRFAT correlates with both mean systolic and mean diastolic pressure, consistent with our findings. The authors did not report trunk lean body mass results but state that total lean body mass correlates positively with mean systolic blood pressure. In general, evidence on the association between TRFFM and hypertension risk is lacking.

Appendicular FATP and FFM

There are conflicting data on the association of arm and leg fat with HTN. In a study of 3130 Chinese participants by Ye et al.60, fat mass percentage and lean body mass, especially in the arm, were positively associated with increased blood pressure; however, leg lean mass showed no significant association with systolic or diastolic pressure. In another study74 of 399 participants, the authors showed that (1) arm fat was a positive predictor of blood pressure, (2) after full adjustment, loss of lean leg mass correlated directly with reductions in systolic blood pressure, and (3) loss of leg fat and lean mass produced beneficial changes in markers of CVD risk. Further conflicting results exist: a positive association of mid-upper arm circumference with increased hypertension risk75, and a significant inverse association of leg and arm total fat percentage with hypertension76.

The exact mechanism by which LFATP and LFFM modulate blood pressure is still unclear. Regional fat deposition in the legs, mainly subcutaneous, reduces fatty acid turnover and downregulates triglyceride production in the blood; it therefore acts as a "metabolic sink" that preserves other tissues from lipotoxicity, protects the endothelium against damage, and maintains the elasticity and compliance of arterioles74,77. Another possible mechanism is that, as subcutaneous fat, it may decrease activation of the renin–angiotensin–aldosterone and sympathetic systems77. As for FFM, some studies state that muscle mass has a protective role in blood pressure78,79. However, Ye et al.60 suggest that previous studies on appendicular lean mass or skeletal muscle did not control for fat mass and fat distribution in their analyses, leading to inaccurate results.

Gender and age

Sex did not predict hypertension in our study population; however, the association was negative in males and positive in females. Previous studies showed that in men, lower-body fat (thigh or gynoid) had a more protective effect against cardio-metabolic risks such as elevated blood pressure. The effects of sex hormones on subcutaneous fat mass in these regions might explain this sex difference80.

Based on our results, age had a positive association with hypertension. Likewise, a study in a Chinese population indicated that age was independently associated with hypertension in both men and women81. However, results are not always consistent: in a study of Brazilian children and adolescents, the authors observed no significant association between age and systolic blood pressure, regardless of sex82.

BMR

Our study demonstrated a strong inverse relationship between BMR and hypertension, which has not been reported elsewhere. A study in Bangladeshi adults showed a positive relation between BMR and blood pressure, suggesting that an upregulated BMR may elevate blood pressure by raising thyroid hormone levels and increasing sympathetic tone and oxidative damage83. Further investigation is required.

Strengths and limitations

The application of machine learning to a cohort of patients is the main strength of our study. Machine learning methods are more precise than traditional ones, so we believe our findings can help resolve the conflicting results regarding our research question. Nevertheless, this study has some limitations, including the lack of data on the use of anti-hypertensive drugs and on other anthropometric indices such as waist circumference. Also, BIA measurement of TRFAT does not differentiate between visceral and subcutaneous abdominal adipose tissue. However, we aimed to use an accessible method for body composition analysis, and BIA is a simple, safe, and readily available method, unlike DXA, CT scan, and MRI. We suggest that future prospective studies use machine learning methods and body composition analyses to predict hypertension in different ethnic groups. In addition, this study can be extended to more clinical samples, in which case the classification methods, especially AutoMLP, are expected to perform even better.

Conclusion

Given that body fat and its distribution are risk factors for hypertension, we used machine learning methods to study these relations. With acceptable accuracy, we confirmed that BIA-derived body composition predicts hypertension. Total and regional FATP, higher age, and female gender had a positive relation with hypertension, whereas the opposite held for total and regional FFM, BMR, and male gender.