Introduction

Hypertension is a critical risk factor for cardiovascular and cerebrovascular diseases1,2,3. The global prevalence of hypertension among adults has been reported as 31.1%, but awareness, treatment and control rates remain relatively low1,4,5. Under the diagnostic criteria of systolic blood pressure ≥ 140 mmHg and/or diastolic blood pressure ≥ 90 mmHg, the prevalence, awareness, treatment and control rates of hypertension in the Chinese population are 18.0–44.7%, 23.6–56.2%, 14.2–48.5% and 4.2–30.1%, respectively6,7. On November 13, 2022, China’s clinical practice guidelines for the management of hypertension were released, lowering the diagnostic threshold to systolic blood pressure ≥ 130 mmHg and/or diastolic blood pressure ≥ 80 mmHg8. This adjustment is expected to double the number of individuals with hypertension in China to nearly 500 million. As the population ages, the prevalence of hypertension will increase further. Consequently, it is imperative to institute effective public health interventions for hypertension, beginning with the identification and screening of individuals at high risk.

Several hypertension prediction models have emerged. For instance, the Framingham Heart Study developed a short-term hypertension prediction model encompassing conventional factors such as age, gender, baseline blood pressure, body mass index (BMI), parental history of hypertension, and smoking9. Nonetheless, its applicability might be limited to Caucasian non-diabetic patients10, which restricts its extension to non-white populations or patients with diabetes11. Asian nations, including China, Japan and South Korea, have also established regional hypertension prediction models that incorporate both conventional and local risk factors12,13,14,15,16,17,18,19,20,21. However, these models include only a subset of the known influencing factors.

With the advancing comprehension of hypertension's pathophysiology, there is growing emphasis on early detection of hypertension-induced vascular structural and functional changes. Current indicators for early detection include dilatation, stiffness, and compliance, which reflect variations in the vascular lumen and wall and offer insights into wall pressure and vascular function22,23,24. Manual measurement techniques predominantly involve ultrasound, gated chest CT, and MRI, with a primary focus on the diameter of the ascending aorta25,26,27,28. Studies have demonstrated an interrelationship between dilatation of the ascending aorta and the risk of cardiovascular diseases, notably hypertension. Hypertension is an independent risk factor for vascular dilatation, which in turn is a predictor of cardiovascular diseases29,30. Given the prevalence of chest CT scans in lung cancer screening, these images carry valuable potential information about the thoracic aorta31. Deep learning-powered artificial intelligence (AI) has emerged as an innovative approach for quantitative analysis of CT images. In non-contrast chest CT scans, AI enables automated quantitative analysis of the thoracic aorta.

The objective of this study is to develop and validate hypertension prediction models utilizing clinical risk factors and thoracic aorta characteristics, subsequently assessing the models' effectiveness.

Methods

Patients

This study was approved by the Ethics Committee of Wuhan Union Hospital ([2021]0853) in accordance with the Declaration of Helsinki. Because the research is retrospective and the data are anonymous, informed consent was waived by the Ethics Committee of Wuhan Union Hospital.

From November 2018 to November 2019, patients hospitalized in the General Medical and Geriatrics Department were included. After excluding 348 patients who did not meet the criteria (see Fig. 1 for details), 804 patients were eventually included in the study, of whom 439 had hypertension and 365 did not. Blood pressure measurement and the diagnosis of hypertension were carried out by physicians in strict accordance with the 2018 ESC/ESH Guidelines for the management of arterial hypertension32. All patients were then randomly divided into a training cohort and an internal validation cohort at a ratio of 7:3. The flowchart presents the detailed inclusion and exclusion criteria, blood pressure monitoring and diagnosis, and grouping.
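
As a minimal illustration of the random 7:3 division described above, the following Python sketch shuffles patient identifiers with a fixed seed and cuts the list at 70%. The function name, the seed and the integer identifiers are assumptions made for illustration only; the study itself was analysed in R, and the exact cohort sizes it obtained may differ from those produced here.

```python
import random

def split_cohort(patient_ids, train_fraction=0.7, seed=42):
    """Shuffle patient IDs reproducibly and cut them into two cohorts."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # seed value is an assumption, not taken from the study
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Hypothetical identifiers 0..803 standing in for the 804 included patients.
train_ids, validation_ids = split_cohort(range(804))
print(len(train_ids), len(validation_ids))
```

Fixing the seed makes the division reproducible, which matters when the same cohorts must be reused for every model being compared.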

Figure 1

Flowchart illustrating the inclusion criteria and grouping of patients.

Baseline clinical data, including age, sex, height, weight, personal history, serum biochemical markers and other concomitant diseases were acquired from the electronic medical records system.

CT image acquisition

All included patients underwent non-contrast enhanced chest CT (NCCT) examinations during hospitalization. All chest CT images were acquired from the thoracic inlet to the diaphragm with patients in a supine position. Scans were performed using one of the following three CT scanners: SIEMENS-Definition, Germany; GE-Discovery CT750HD, USA; TOSHIBA-Aquilion ONE, Japan. The scanning parameters were a tube voltage of 120 kV and a tube current regulated by the automatic exposure control system. The reconstruction slice thickness of the chest CT images was 1–2 mm, and the reconstruction interval was 1–2 mm.

Thoracic aorta segmentation and features extraction

In this study, a multi-task learning framework (provided by Shanghai United Imaging Intelligence Co. Ltd., Shanghai, China) was used to automatically measure nine key positions of the thoracic aorta recommended by AHA guidelines33 on chest CT images (Fig. 2, Supplementary Fig. S1). The framework is an automatic post-processing tool for the thoracic aorta that uses deep learning methods and has been applied and validated on challenging NCCT images.

Figure 2

The workflow of the research.

The automatic quantification of the thoracic aorta mainly consists of three tasks: aortic segmentation, aortic anatomical landmark detection and aortic measurement. Using two parallel subnetworks, the framework accomplishes aortic segmentation and thoracic aortic anatomical landmark localization simultaneously. Specifically, the segmentation subnetwork delineates the thoracic aortic boundary, and the detection subnetwork detects five key anatomical landmarks: the aortic sinus, brachiocephalic artery, left common carotid artery, left subclavian artery, and celiac trunk. Based on the segmented aortic mask and the five detected anatomical landmarks, the nine key positions recommended by the AHA guidelines can be inferred. The diameters and cross-sectional areas at the nine positions, as well as the total and segmental volumes and lengths, can then be calculated. More details and performance tests for each component can be found in reference34.

Feature selection and development, validation of signature

A three-step procedure was used for dimensionality reduction of the thoracic aorta imaging factors. First, factors with variance > 1.0 were retained. Second, analysis of variance (ANOVA) was applied to identify factors that differed significantly between the non-hypertension (non-HTN) and hypertension (HTN) groups. Finally, the factors that met both criteria were entered into least absolute shrinkage and selection operator (LASSO) regression, and those with non-zero coefficients in the training cohort were selected as the most relevant features.
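
The first two screening steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the feature values, the feature name `L1_2` and the helper names are invented for the example, the significance decision (which requires the F distribution) is left out, and the final LASSO step is omitted because it was performed with the "glmnet" package in R.

```python
from statistics import mean, variance

def variance_filter(features, threshold=1.0):
    """Step 1: keep features whose sample variance exceeds the threshold."""
    return {name: values for name, values in features.items()
            if variance(values) > threshold}

def anova_f(group_a, group_b):
    """Step 2: one-way ANOVA F statistic for a two-group comparison."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = mean(group_a), mean(group_b)
    grand = mean(group_a + group_b)
    between = n_a * (mean_a - grand) ** 2 + n_b * (mean_b - grand) ** 2
    within = (sum((x - mean_a) ** 2 for x in group_a)
              + sum((x - mean_b) ** 2 for x in group_b))
    df_between, df_within = 1, n_a + n_b - 2
    return (between / df_between) / (within / df_within)

# Toy values (hypothetical): one diameter-like and one length-like feature.
features = {
    "D3": [30.1, 31.4, 29.8, 33.9, 35.2, 34.6],
    "L1_2": [10.0, 10.2, 9.9, 10.1, 10.0, 10.2],
}
labels = [0, 0, 0, 1, 1, 1]  # 0 = non-HTN, 1 = HTN

kept = variance_filter(features)
f_values = {}
for name, values in kept.items():
    non_htn = [v for v, y in zip(values, labels) if y == 0]
    htn = [v for v, y in zip(values, labels) if y == 1]
    f_values[name] = anova_f(non_htn, htn)
print(f_values)
```

In this toy data the near-constant feature is dropped by the variance filter, and only the remaining feature is carried forward to the group comparison.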

After feature selection, the AIScore was computed for each patient as a linear combination of the selected features weighted by their respective LASSO coefficients. Both feature selection and AIScore development were performed in the training cohort, and the AIScore was then evaluated in the internal validation cohort.
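
To make the weighting concrete, the sketch below computes a score as an intercept plus the coefficient-weighted sum of feature values. The coefficients, the intercept and the patient values are hypothetical placeholders, not the fitted values reported in Fig. 3, and whether the study's score includes an intercept is an assumption here.

```python
def ai_score(feature_values, coefficients, intercept=0.0):
    """Linear combination of selected features weighted by their coefficients."""
    return intercept + sum(coefficients[name] * feature_values[name]
                           for name in coefficients)

# Hypothetical coefficients for three of the selected features.
coefficients = {"D3": 0.05, "A1": -0.001, "V7_8": 0.02}
# Hypothetical measurements for one patient (arbitrary units).
patient = {"D3": 32.0, "A1": 800.0, "V7_8": 15.0}

score = ai_score(patient, coefficients)
print(round(score, 3))
```

Features whose LASSO coefficient shrinks to zero simply do not appear in the coefficient table, so they contribute nothing to the score.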

Construction of different models

In order to meet the needs of different clinical application scenarios, we established five models with different features: AIMeasure model, BasicClinical model, TotalClinical model, AIBasicClinical model and AITotalClinical model.

The AIMeasure model was constructed as a weighted sum of the selected thoracic aorta imaging factors using logistic regression. The clinical factor models comprised the BasicClinical model and the TotalClinical model, which were built using the directly available basic clinical data (including sex, age, height, weight, BMI, smoking history, drinking history, etc.) and the total clinical data (including the above basic clinical data, serum biomarkers and concomitant diseases, etc.), respectively. Concretely, we first applied univariate analysis to compare clinical factors between the two groups. Variables with statistically significant differences were then entered into multivariate backward stepwise logistic regression to build the clinical factor models. Further, we established the AIBasicClinical model and the AITotalClinical model by combining the valuable clinical factors and the AIScore using the same multivariate backward stepwise logistic regression method.
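
Each of these logistic regression models converts a weighted sum of predictors into a predicted probability of hypertension through the logistic function. The sketch below shows that conversion; the coefficient values, the intercept and the choice of predictors are hypothetical and are not the estimates listed in Table 2.

```python
import math

def predicted_probability(predictors, coefficients, intercept):
    """Logistic model: probability = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(coefficients[name] * predictors[name]
                             for name in coefficients)
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical mixed model with one clinical factor and the AIScore.
coefficients = {"age": 0.04, "AIScore": 1.2}
intercept = -3.0
patient = {"age": 60, "AIScore": 0.5}

probability = predicted_probability(patient, coefficients, intercept)
print(round(probability, 3))
```

A nomogram is a graphical form of the same calculation: each predictor's contribution is read off as points, and the total is mapped to a probability.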

Assessment of the performance of different models

The predictive performance of the five models for identifying hypertension was evaluated using receiver operating characteristic (ROC) curves in both the training set and the validation set. The DeLong test was used to compare the area under the curve (AUC) among the models. The agreement between the predicted and actual probabilities was appraised with calibration curves. To assess the clinical applicability of the five models, decision curve analysis (DCA) was carried out by calculating the net benefit across threshold probabilities.
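
Two of these quantities have compact definitions that a short sketch can make explicit. The AUC equals the probability that a randomly chosen hypertensive patient receives a higher predicted score than a randomly chosen non-hypertensive patient, and the net benefit at a threshold probability pt is TP/n − (FP/n) × pt/(1 − pt). The Python code below implements both on invented toy data; it is an illustration only, and the DeLong comparison and calibration curves are not reproduced.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def net_benefit(labels, scores, threshold):
    """Net benefit of acting on every patient whose predicted probability >= threshold."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Toy data (hypothetical): 1 = hypertension, scores are predicted probabilities.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

auc = roc_auc(labels, scores)
nb = net_benefit(labels, scores, threshold=0.5)
print(auc, nb)
```

A decision curve is obtained by repeating the net benefit calculation over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies.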

Statistical analysis

The statistical analysis was performed using R software (version 4.0.3, https://www.r-project.org). A p value < 0.05 was considered statistically significant.

The Shapiro–Wilk test was used to assess normality, and the Levene test was used to assess homogeneity of variance. The independent-samples t-test (for normally distributed continuous variables), Mann–Whitney U-test (for non-normally distributed continuous variables) and chi-square test (for categorical variables) were used to compare clinical factors between the two groups. Normally distributed variables are presented as mean ± standard deviation, non-normally distributed variables as median (25th and 75th percentiles), and categorical variables as count (percentage). Differences in thoracic aorta imaging factors were compared using ANOVA.

The “glmnet” package was used for standard LASSO regression. The “rms” package was used to perform multivariate binary logistic regression and to develop the nomograms. The “pROC” package was used to plot ROC curves, and the DeLong test was used to compare AUC values among the models. Calibration curves were plotted using the “rms” package, and DCA was carried out using the “rmda” package.

Results

Characteristics of patients

A total of 804 patients who met the inclusion criteria were enrolled in this study (median age 52 years; 30.6% male), including 439 patients with hypertension (54.6%) (Fig. 1, Table 1). Detailed clinical features of the patients are presented in Table 1. None of the clinical features differed significantly between the training set and the validation set.

Table 1 Comparison of clinical features between the HTN group and non-HTN group, as well as between the training set and validation set.

In terms of basic clinical features, age, sex, height, weight, BMI, and history of smoking and drinking all differed significantly between the HTN and non-HTN groups. Hypertension was associated with older age, greater BMI, and a history of smoking and drinking. In addition, hypertension appeared to be more common in women.

Concerning serum biochemical markers, HTN patients tended to have higher blood glucose and blood lipids. In addition, alanine aminotransferase (ALT), blood urea nitrogen (BUN), creatinine (CREA) and serum potassium (K) differed significantly between the HTN and non-HTN groups.

With regard to comorbidities, significantly more patients in the HTN group than in the non-HTN group had comorbidities. Hyperglycemia, hyperlipidemia, hyperuricemia, peripheral atherosclerosis, coronary atherosclerosis, cerebral atherosclerosis, lacunar cerebral infarction, fatty liver and other diseases tended to occur more often in hypertensive patients.

Features extraction, selection and establishment of signature

In total, 43 features were obtained by AI on non-contrast enhanced chest CT, including the diameter and area at 9 levels of the thoracic aorta, the volume and length between adjacent levels, the volume and length of the ascending aorta, aortic arch and descending aorta, and the total volume and length (see Supplementary Table S1 and Fig. S1 for details). After excluding features with variance ≤ 1, irrelevant and redundant features among those remaining were further excluded by one-way ANOVA and LASSO regression. In the end, the six most relevant features were selected. The selected features and their corresponding coefficients are shown in Fig. 3. The AIScore was then established from the selected features and their coefficients. The AIScore showed a statistically significant difference between the HTN and non-HTN groups (Supplementary Fig. S2).

Figure 3

The weights of selected thoracic aorta features measured by AI. The numerical value represents a specific level of the thoracic aorta; D, A, V, and L denote the diameter, area, volume, and length of the thoracic aorta at a specific level or two adjacent levels.

Development of nomogram

In order to adapt to different clinical application scenarios, we established five different models based on different clinical and imaging features.

In the clinical models, backward stepwise logistic regression showed that age, height, weight, serum biomarkers (including cholesterol (CHOL), high-density lipoprotein cholesterol (HDL.C), low-density lipoprotein cholesterol (LDL.C), creatinine (CREA) and serum potassium (K)) and accompanying diseases (including hyperlipidemia (hyperlip), peripheral artery atherosclerosis (peri_AS) and coronary atherosclerosis (con_AS)) were risk predictors of hypertension. Since age, height and weight can be obtained directly, we used these three features to build the BasicClinical model (Supplementary Fig. S3A), and combined them with the serum markers and concomitant diseases to establish the TotalClinical model (Supplementary Fig. S3B). Subsequently, we used multiple logistic regression to construct the AIMeasure model from the six thoracic aorta features selected previously. Finally, both the clinical features and the AIScore were entered into multivariate backward stepwise logistic regression to construct two mixed models, the AIBasicClinical model (Supplementary Fig. S3C) and the AITotalClinical model (Supplementary Fig. S3D).

The nomograms of the five models are shown in Supplementary Fig. S3, and the selected valuable features and coefficients included in the clinical and mixed models are listed in Table 2.

Table 2 The coefficients of features included in the two clinical models and two mixed models.

Evaluation and comparison of performance of different models

The ROC curves of the five models in the training and validation sets are shown in Fig. 4, and the diagnostic performance is summarized in Table 3. All five models showed good diagnostic performance for hypertension in both the training set (AUC 0.735–0.836, sensitivity 65.5–73.3%, specificity 66.7–72.5%) and the validation set (AUC 0.767–0.818, sensitivity 63.6–68.2%, specificity 70.9–77.3%). Subsequently, we compared the AUC among the five models (Table 4). In the training set, the AUC values of the TotalClinical model and the AITotalClinical model were statistically different from those of the other models (P < 0.001), while in the validation set there was no significant difference in AUC among the five models (P > 0.05).

Figure 4

The ROC curves for the five models in the training set (A) and validation set (B).

Table 3 Diagnostic performance of the five models in training set and validation set.
Table 4 Comparison matrix of AUC for the five models in training set and validation set.

Calibration curves and the Hosmer–Lemeshow test showed that the five models were well calibrated in both the training set (P = 0.079–0.570) and the validation set (P = 0.117–0.977) (Fig. 6). Decision curve analysis showed that the five models could provide a net benefit to patients across most reasonable threshold probabilities (Fig. 5).

Figure 5

Decision curve analysis (DCA) curves for the five models in training set (A) and validation set (B).

Figure 6

Calibration curves for the five models in the training set (A–E) and validation set (F–J). From left to right, the panels depict the AIMeasure model, BasicClinical model, TotalClinical model, AIBasicClinical model and AITotalClinical model, with the training set in the top row and the validation set in the bottom row. The dashed line represents the ideal prediction line. The red line illustrates the apparent predictive performance of the nomogram for hypertension. The green line indicates the bias-corrected performance of the model.

Discussion

This study developed five distinct hypertension risk prediction models customized for various clinical scenarios. The BasicClinical model incorporates readily available factors such as age, height, and weight, making it applicable to a broad audience. Expanding on the BasicClinical model, the TotalClinical model integrates additional parameters, including blood glucose, blood lipids, electrolyte levels, comorbidities, and other clinician-assessed factors that are typically gathered during physical examinations and prior medical visits. The AIMeasure model encompasses dimensions such as the diameter, cross-sectional area, volume and length of the thoracic aorta. These metrics can be efficiently measured by AI using standard non-enhanced chest CT scans. Consequently, hypertension risk can be estimated at the same time as lung cancer screening or pulmonary nodule follow-up, amplifying the value of chest CT assessments. The AIBasicClinical model and the AITotalClinical model include the AIScore and the aforementioned clinical risk factors. Our hypertension risk prediction models exhibit good calibration and clinical utility. Physicians can anticipate hypertension risk based on established factors, allowing for effective preventive measures or treatment strategies.

Notably, in the initial stages of data processing we also applied the k-fold method, generating thousands of models and evaluating each one. The results were considerably stable, which prompted us to adopt the 7:3 random division for the final analysis presented in this paper. The robustness across sampling outcomes is satisfactory, and we attribute this to our ample sample size, particularly within the specific target population of individuals undergoing lung cancer screening.

Traditional risk factors for high blood pressure comprise age, BMI, smoking and others. Age serves as an independent predictor of hypertension due to diminished vascular elasticity, sluggish blood flow, and heightened blood viscosity35. Studies have yielded inconclusive findings regarding the association between hypertension and gender9,36. Within this study, women exhibited a slightly elevated hypertension incidence compared to men, potentially influenced by a greater female representation in the sample. A European study indicated a notably lower incidence of cardiovascular disease in women compared to men up to age 45, with no substantial variance in prevalence by age 60, potentially attributed to estrogen's protective impact on blood vessels37. Studies have shown that obesity, especially abdominal obesity, independently heightens the risk of hypertension9,38,39. As obesity rates surge, the incidence of high blood pressure escalates. Furthermore, smoking and alcohol consumption also contribute to an elevated hypertension risk9,10,40,41.

Biochemical markers obtained from routine medical check-ups and visits for other conditions provide valuable insights. This study focused on parameters such as blood glucose, lipid profiles, electrolytes and comorbidities. After rigorous variable screening, the model integrated cholesterol, high-density lipoprotein, low-density lipoprotein, blood potassium, creatinine, hyperlipidemia, and arteriosclerosis. Anomalies in glucose metabolism heighten hypertension risk by damaging blood vessels14. Concurrently, hyperlipidemia significantly amplifies hypertension risk, and the two often co-exist to accelerate arteriosclerosis42. The mechanism may involve the rise and fluctuation of blood pressure increasing stress on the vascular wall, lipid deposition thickening the intima, and stimulating the inflammatory response. This leads to injury to intima endothelial cells, increased permeability, fibrosis of media smooth muscle cells, increased arterial hardness, and decreased elasticity. The effect of electrolytes on hypertension is twofold. Lowering sodium intake and increasing potassium intake are known to be beneficial in reducing hypertension43,44, and adequate calcium intake is also advantageous for high blood pressure45. Most studies have demonstrated a protective effect of magnesium against hypertension46. The regulation of magnesium on blood pressure may include mechanisms such as acting as a calcium antagonist to regulate vascular tension and contraction, vascular endothelial function, aging and stiffness, vascular remodeling, oxidative stress, insulin resistance, inflammatory response, etc.

Hypertension is a prevalent clinical syndrome with multiple contributing factors. The occurrence of hypertension is attributed to various risk factors and the decompensation of blood pressure regulation mechanism. Simultaneously, hypertension and various risk factors can mutually influence and collectively contribute to the progression and aggravation of the disease. Hypertension frequently coexists with various conditions, including diabetes, hyperlipidemia, atherosclerosis, cardiovascular diseases, and cerebrovascular diseases. They may interact in a complex causal manner, exacerbating their respective pathological processes47,48.

In a preliminary study49, we demonstrated that the diameter of the thoracic aorta, particularly the middle descending aorta, significantly impacted masked hypertension and poorly controlled hypertension. In this study, besides the diameter at nine levels, we also measured the cross-sectional area, volume, and centerline length of the thoracic aorta. Our study revealed significant differences in all diameters, areas, volumes, and lengths of the ascending and descending aorta between the non-hypertensive and hypertensive groups. Following dimensionality reduction and variable selection, the prediction model ultimately included D3, D4, D6, D7, A1 and V7_8. The AUC of this model was 0.735 in the training set and 0.767 in the validation set. In a previous study, the diameters and volumes of the ascending, arch, and descending segments of the thoracic aorta were larger in hypertensive patients than in subjects with normal blood pressure (P < 0.001), and the differences persisted after adjusting for age50. According to Laplace's law, circumferential wall stress is directly proportional to intraluminal pressure and lumen radius and inversely proportional to wall thickness51. With wall thickness unchanged, a sustained rise in blood pressure increases the load on the vessel wall, which is thought to favor gradual expansion of the diameter. Vasodilation mediated by blood flow occurs when a sudden increase in blood flow in the lumen induces shear force on the vascular wall, resulting in damage to vascular endothelial cells and subsequent vasodilation25,52.

We developed five distinct models based on conventional risk factors, biochemical markers, comorbidities, and thoracic aorta imaging factors measured on non-enhanced chest CT. All five models exhibited good diagnostic performance (AUC 0.735–0.836), along with good calibration and clinical net benefit. The Framingham Heart Study developed a short-term hypertension prediction model for Caucasian patients without diabetes that considered age, gender, SBP, DBP, current smoking, parental hypertension, BMI, and the interaction between age and DBP9. However, the extensibility of that model has been shown to be limited10,11. In reality, many chronic diseases tend to co-occur, such as diabetes and hypertension. Therefore, not excluding individuals with diabetes from the study is more in line with the actual clinical scenario. Numerous hypertension prediction models have been developed in Asia, including China, but they often incorporate only some of the risk factors together with location-specific variables. As an example, a prediction model for Kazakh herdsmen in Xinjiang not only included age, body mass index, blood lipids, and other factors but also considered dietary factors (such as yak butter, often consumed in pastoral areas). The model achieved an AUC of 0.803 in the modeling set and 0.809 in the verification set12. Leveraging genetic and environmental factors, Li et al. built prediction models for systolic and diastolic blood pressure, yielding AUC values of 0.673 and 0.817, respectively17.

In summary, we developed and validated five hypertension prediction models using distinct predictors. Primary care workers can select among the prediction models, tailoring hypertension predictions for individual patients to the available predictors. For example, when patients undergo chest CT for lung cancer screening, the AIMeasure model can predict the risk of hypertension; conversely, when additional clinical data such as major biochemical laboratory examinations and past medical history are available, the AITotalClinical model is a valuable tool for predicting hypertension risk. This may enhance strategies for preventing and treating hypertension, helping to reduce and delay the onset of hypertension-related adverse events.

Nevertheless, certain limitations of this study warrant consideration. First, the study focused on patients undergoing routine health examinations in the general medical department of our hospital, with ages ranging from 18 to 95 years. Consequently, the predictive capacity of the models in different ethnic groups or specific populations remains uncertain. Second, the study exclusively incorporated risk factors accessible through routine examinations, excluding other predictors such as economic status, educational level, psychosocial elements, and genetic markers. Lastly, as this is a single-center cross-sectional study, the sample represents only a portion of the population in this region. Since hypertension risk factors can vary across regions, the generalizability of this study is somewhat constrained, and the stability of the models requires additional external validation.

Conclusion

In this study, five hypertension risk prediction models were established based on clinical risk factors for hypertension and thoracic aorta image features measured on non-enhanced chest CT. These include the BasicClinical model, which relies on easily obtainable basic information; the TotalClinical model, which incorporates comprehensive clinical data such as basic information, biochemical indicators, and comorbidities; the AIMeasure model, focusing on thoracic aorta image features; the AIBasicClinical model, combining basic information and thoracic aorta image features; and the AITotalClinical model, integrating comprehensive clinical data with thoracic aorta image features.

Upon evaluation, all five models demonstrated favorable predictive and calibration capabilities, offering potential clinical utility. Consequently, in diverse clinical scenarios, patients can select the appropriate hypertension risk prediction model based on existing or accessible clinical or imaging data. This aids in the preliminary identification of high-risk hypertension patients and enhances the efficiency of primary prevention efforts. However, the generalizability of these models requires further assessment and external validation in large-scale studies in the future.