Genetically determined hypoalbuminemia as a risk factor for hypertension: instrumental variable analysis

Hypoalbuminemia is associated with vascular endothelial dysfunction and the development of chronic cardiovascular diseases. However, the relationship between serum albumin concentration and blood pressure changes remains controversial. Community-based longitudinal cohort data collected from Korean Genome and Epidemiology Study were used in this study. Hypoalbuminemia was defined as a serum albumin concentration of ≤ 4.0 g/dL. A total of 4325 participants were categorized into control (n = 3157) and hypoalbuminemia (n = 1168) groups. Serum albumin had a non-linear relationship with the risk of hypertension development. A genome-wide association study revealed 71 susceptibility loci associated with hypoalbuminemia. Among susceptibility loci, genetic variations at rs2894536 in LOC107986598 and rs10972486 in ATP8B5P were related to elevated blood pressure. Serum albumin (HR = 0.654, 95% CI 0.521–0.820) and polymorphisms of rs2894536 (HR = 1.176, 95% CI 1.015–1.361) and rs10972486 (HR = 1.152, 95% CI 1.009–1.316) were significant predictors of hypertension development. Increased albumin concentration instrumented by 2 hypoalbuminemia-associated SNPs (rs2894536 and rs10972486) was associated with decreased HRs for hypertension development (HR = 0.762, 95% CI 0.659–0.882 and HR = 0.759, 95% CI 0.656–0.878). Our study demonstrated that genetically determined hypoalbuminemia is a significant predictor of incipient hypertension.

Albumin, one of the major serum proteins, has multiple important physiological functions involving stabilization of plasma colloid osmotic pressure, transportation of diverse substances, and significant antioxidant activity, and its concentration is finely regulated by various systems in the physiologic state 1 . Hypoalbuminemia is strongly related to unfavorable health outcomes in various pathologic conditions, including hospitalized patients, surgical patients, and those with heart failure, chronic liver disease, chronic kidney disease, or end-stage renal disease [2][3][4] . Furthermore, recent epidemiologic studies have shown that low serum albumin concentration is a reliable clinical biomarker of vascular endothelial dysfunction and is an important predictor of future cardiovascular diseases (CVDs) and all-cause mortality in the general population [5][6][7][8] .
Hypertension is a well-known modifiable risk factor in the development and progression of chronic CVDs 9 . The vascular endothelium is the primary site of systemic hemodynamic dysfunction in various metabolic and vascular diseases 10 . Vascular endothelial dysfunction is mainly characterized by the induction of a pro-inflammatory or pro-thrombotic state and impairment in endothelium-dependent relaxation of blood vessels 10, 11 , which may play critical roles in the pathogenesis of hypertension. However, the pathophysiologic connection between hypoalbuminemia and hypertension has not yet been clarified.
There has been inconsistent evidence on the relationship between serum albumin concentration and the risk of hypertension. Previously, Hu et al. and Høstmark et al. showed that a rise in serum albumin levels was associated with elevated systolic blood pressure (SBP) and diastolic blood pressure (DBP) in a Caucasian population 12,13 . Recently, Oda et al. demonstrated that a decreased serum albumin level was also a significant predictor of hypertension in a Japanese health population 14 . Such findings may indicate that possible confounders, such as differences across racial groups and selection bias, could alter the impact of serum albumin levels on the development of hypertension. Instrumental variables can be used to help address the effect of unobserved confounders by operating as a randomization process when evaluating the association between environmental exposures and outcomes of interest. Thus, we performed a genome-wide association study (GWAS) to identify www.nature.com/scientificreports/ deviation from expectation, which may be indicative of the inflation in test statistics caused by polygenic effect and Korean population structure. We performed a logistic regression analysis for slightly decreased serum albumin concentration, using age and sex as covariates, and subsequently calculated the minimum P-value for three genetic models (additive, recessive, and dominant, Supplementary Table S1). Our association analysis revealed that a total of 71 SNPs from 28 distinct genomic regions were significantly associated with hypoalbuminemia (P GWAS ranging from 9.9 × 10 -5 www.nature.com/scientificreports/ to 6.5 × 10 -6 , Supplementary Fig. S2). Among the loci found in this study, only one genetic locus, GCKR, was revealed in other GWAS of populations of eastern Asian ancestry 15 . Because the SNPs of interest were located in a genomic region encoding VEGFA, LOC107986598, LOC100132354, UNC13B, ATP8B5P, and RUSC2, among others, we performed imputation analysis to characterize these loci. The regional association plots using genotyped and imputed data showed that rs2894536 was confined to regions around the LOC107986598 gene and rs10972486, the ATP8B5P gene ( Supplementary Fig. S3).

Hypoalbuminemia, hypoalbuminemia-related SNPs, and the risk of hypertension development.
To avoid bias from the endogeneity in conventional linear regression, we performed two-stage least squares estimation analysis using genetic variant(s) explaining the change in serum albumin concentration as instrument variables, including rs2894536 in LOC107986598 and rs10972486 in ATP8B5P (Table 3 and Supplementary Table S2). In ordinary least square linear regression, SBP and DBP decreased with increasing serum albumin levels (ΔSBP, β = − 0.043 mmHg/month, P = 0.0017; ΔDBP, β = − 0.103 mmHg/month, P < 0.0001). Our two-stage least squares estimation analysis revealed a possible causal role of hypoalbuminemia in determining SBP change (rs2894536: F = 16.56, P DWH = 0.0009; rs10972486: F = 16.69, P DWH = 0.0008), but not DBP.
Our multiple Cox-proportional hazards model demonstrated that serum albumin concentration and hypoalbuminemia-related SNPs were deeply associated with an increased risk of hypertension development (albumin, HR = 0.654, 95% CI 0.521-0.820; rs2894536, HR = 1.176, 95% CI 1.015-1.361; rs10972486, HR = 1.152, 95% CI 1.009-1.316) after adjustment for age, sex, smoking history, SBP, DBP, body mass index, waist circumference, hemoglobin, platelet, hemoglobin A1c, alanine aminotransferase, γ-glutamyl transferase, TG, and HDL-cholesterol level (Table 4 and Supplementary Table S3). Subsequent survival analyses with multiple Cox-proportional hazards regression analysis and log-rank test were performed to compare hypertension-free survival between the groups. We found that participants with serum albumin ≤ 4.0 g/dL and polymorphisms of rs2894536 or rs10972486 had poor hypertension-free survival rate (Fig. 2). Table 3. Relationship of serum albumin (endogenous variable) with the change in (A) systolic blood pressure (ΔSBP) and diastolic blood pressure (ΔDBP) as tested both by ordinary least squares linear regression and the application of two-stage least squares regression analysis using candidate genetic polymorphisms as an instrument variable. Regression results were adjusted for age and sex. SE, standard error. a Estimated using the Durbin-Wu-Hausman test, which examines the difference between the estimates from ordinary least squares linear regression and instrumental variable analysis. www.nature.com/scientificreports/ We performed restricted cubic spline regression analysis to evaluate the linearity of the relationship between serum albumin concentration and the risk of hypertension development and found that there was a non-linear relationship between them (Fig. 3). The inverse relation of serum albumin with the risk of hypertension development was attenuated in the condition of low serum albumin level (below 4 g/dL), suggesting that other confounding factor(s) may change the effect of serum albumin level on arterial blood pressure. To draw causal inferences about the effects of candidate genetic variation on hypoalbuminemia-related hypertension while controlling for unobserved confounding effects, we performed a two-stage residual inclusion analysis and found that genetically determined hypoalbuminemia was deeply related to an increased risk of incident hypertension (rs2894536, HR = 0.762, 95% CI 0.659-0.882; rs10972486, HR = 0.759, 95% CI 0.656-0.878; Table 5).

Discussion
This study provides comprehensive information on the association between genetic variations, hypoalbuminemia, and the risk of hypertension, showing that genetically determined hypoalbuminemia played a causal role in the development of incipient hypertension. To the best of our knowledge, this is the first instrumental variable study to demonstrate that the vascular protective effect of serum albumin may attenuate the risk of hypertension development.
In this study, we found that there was a reverse relationship between serum albumin and the risk of incipient hypertension. Previous epidemiologic and experimental studies demonstrated that serum albumin with antiinflammatory and antioxidant properties plays homeostatic regulatory roles in various physiological processes, and hypoalbuminemia is strongly associated with systemic inflammatory responses, vascular endothelial injury, and development of chronic vascular diseases [16][17][18][19] . However, other clinical studies showed very inconsistent association between serum albumin level and arterial blood pressure [12][13][14] . Such findings seem to be the result of not only the limitations of study design, problems of small sample size, or differences in race and ethnicity, but also the hidden effects of confounding variable(s). To overcome the limitations of previous studies and to assess more robust associations between exposure variables and outcome variables, we performed an instrumental variable analysis based on a community-based longitudinal study with a large sample size and demonstrated that a low serum albumin concentration could be a factor for predicting the development of hypertension in the general population.
Our study results revealed that potential genetic variations could be related to changes in the function of vascular endothelial cells and the development of hypertension, showing that two candidate loci, rs2894536 in the LOC107986598 region and rs10972486 in the ATP8B5P region, were deeply associated with hypoalbuminemia. www.nature.com/scientificreports/ LOC107986598 is an uncharacterized non-coding ribonucleic acid (ncRNA) gene located near vascular endothelial growth factor A (VEGF-A). Previous basic research demonstrated that ncRNA can exert more widespread effects on gene regulation; VEGF-A is one of the critical molecules for endothelial maintenance and exerts pleiotropic actions to facilitate migration, proliferation, and survival of vascular endothelial cells; and anti-VEGF-A therapies are strongly related to the development of proteinuria and increased blood pressure [20][21][22] . Such findings suggested that the regulatory function of LOC107986598 may be important for maintaining physiological vascular endothelial function. Meanwhile, there is limited evidence supporting the relationship between ATP8B5P and hypoalbuminemia. ATP8B5P, a human gene related to ATP8B1, is a typical pseudogene predominantly expressed in the testes. Although the transcript of ATP8B5P does not contain fully functional reading frames, several recent experimental studies demonstrated that the ATP8B1 protein belongs to the family IV of P-type adenosine triphosphatase, which is one of the membrane proteins that are responsible for the generation and maintenance of phospholipid asymmetry in the lipid bilayer; ATP8B1 protein was enriched in cerebral micro-vessels; and some pseudogenes appear to harbor the potential to regulate their protein-coding cousins [23][24][25][26] . These findings suggest that the VEGFA-or ATP8B1-related pathways may play a role in the development of hypoalbuminemia and initiation of blood pressure elevation. There is a need to further investigate the specific gene regulatory pathway, which will provide target genes for future drug development endeavors. There was a difference between conventional linear regression analysis and two-stage least squares estimation analysis results: serum albumin level was strongly related with only DBP in conventional linear regression analysis, but it was related to both SBP and DBP in the two-stage least squares estimation analysis. A possible explanation for this discrepancy may be the limitations of observational epidemiology (confounding, reverse *Calculated using restricted cubic spline regression using age, sex, smoking history, systolic BP, diastolic BP, body mass index, waist circumference, hemoglobin, platelet, hemoglobin A1c, alanine aminotransferase, γ-glutamyl transferase, triglyceride, and HDL-cholesterol level as covariates. www.nature.com/scientificreports/ causality, or regression dilution bias) for making causal inferences 27 . These findings indicated that caution is needed when drawing conclusions from conventional observational epidemiologic studies, and instrumental variable methods can make more accurate causal relationships between the exposure to the candidate risk factors and clinical outcomes. There are some limitations to this study. First, statistical methods of causal inference, namely instrumental variable analyses, could not completely remove confounding and selection bias. To avoid a possible current limitation of the instrumental variable analysis, we had strict inclusion criteria and extensive exclusion criteria. Second, either the determination of serum albumin concentration or the development of hypertension might be influenced by a wide variety of factors promoting vascular endothelial dysfunction. Because this communitybased cohort study did not contain data on 24-h urine collection, ambulatory BP monitoring, vascular flowmetry, thyroid function test, abdominal ultrasonography, or echocardiography, we could not assess the exact association of genetic variation, hypoalbuminemia, vascular endothelial dysfunction, and hypertension. In addition, sex differences in albumin levels may have been attributed to hormonal basis for the differences between males and females or other factors such as oral contraceptives and menopause, but it is believed that the adjustment for sex would have reduced the error resulting from them in this study 28 . The Third, a social desirability bias could not be ruled out because medical history, use of medication, and consumption of tobacco or alcohol were all self-reported by subjects. This could have contributed to conflicting results with other studies. Finally, this GWAS could not reached a stringent genome-wide significance threshold (e.g. 10 -8 ) and we could not perform a replication study owing to limited sample sizes of the cohort data and the lack of another independent cohort with similar phenotype. Moreover, the limited sample size of the hypoalbuminemia group could lead to bias in the nonlinear relationship between serum albumin and hypertension. This lack of sample size in this study is one of the major limitations and it raises the need for large-scale follow-up research. Because GWAS generally diminishes the ability to detect variants of small effects, a replication study is required to avoid false positives and to correctly assess the effect size of the SNP. As a result, large genotyping studies and robust replication studies are required in the future to confirm the associations and to find detailed relationships among the genotypic variants.
In conclusion, this study demonstrated that there was a causal relationship between hypoalbuminemia and the risk of future hypertension in the general population. The current findings may provide an opportunity to determine the underlying mechanisms of the effects of hypoalbuminemia on the increased risk of hypertension. Further replication studies and experimental studies are needed to clarify the causal mechanisms between hypoalbuminemia and incipient hypertension.

Methods
Study design and population. The present study used Ansan-Anseong cohort data from KoGES, in which biannual repeated surveys were collected between 2001 and 2014. The comprehensive profile and methods regarding the development of the KoGES have been presented previously 29 . Ansan-Anseong cohort data were developed to find the potential effect of candidate genetic variation on various chronic illnesses, which are from a medium-sized city (Ansan) and a rural area (Anseong) near Seoul, Korea. All participants underwent serial health examinations biannually including laboratory tests, electrocardiograms, chest X-rays, and health questionnaires. The use of antihypertensive therapy was assessed using an interviewer-administered questionnaire at baseline and every visit. All participants were enrolled voluntarily and provided written informed consent. All participants' records, apart from the survey date and home region, were anonymized before analysis. This study was carried out in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board (IRB) of Konkuk University Medical Center (IRB protocol: KUCH 2019-02-007). After the exclusion of subjects with missing data and those with hypertension, diabetes mellitus, chronic kidney disease, cardiovascular disease, and malignancy in KoGES, a total of 4325 eligible participants were categorized into two groups based on their serum albumin results ( Supplementary Fig. S4).
Anthropometric and clinical measurements. Anthropometric measurements were performed by well-trained examiners. Participants wore a lightweight gown or underwear. Height (Ht) was measured to the nearest 0.1 cm using a portable stadiometer (SECA 225, SECA, Hamburg Germany). Weight was measured to the nearest 0.1 kg on a calibrated balance-beam scale (GL-60000-20, CAS Korea, Seoul, Korea). Waist circumference was measured using a flexible tape at the narrowest point between the lowest border of the rib cage and the uppermost lateral border of the iliac crest at the end of normal expiration.
Trained examiners measured blood pressure (BP) according to a standardized protocol using an appropriatesized cuff and a mercury sphygmomanometer at baseline and biannual follow-up visits. The first and fifth phases of Korotkoff sounds were used for SBP and DBP. BP measurements were repeated after 30-s intervals and were recorded to the nearest 2 mmHg. SBP and DBP were determined as the average of the right and left arm readings obtained three times between 7:00 a.m. and 9:00 a.m. after a minimum of 5 min of rest in a seated position; there were 5 min of rest between each measurement. Average rate of blood pressure change was calculated as the difference between blood pressure at baseline and last measurement just before the diagnosis of hypertension divided by the length of the period.
Laboratory tests. Venous blood samples were collected after 8 h of overnight fasting. Fasting plasma concentrations of glucose, triglyceride (TG), high-density lipoprotein (HDL)-cholesterol, and low-density lipoprotein (LDL)-cholesterol were determined using a Hitachi Automatic Analyzer 7600 (Hitachi, Tokyo, Japan). Glycated hemoglobin (HbA1c) levels were determined by high-performance liquid chromatography (Variant II; BioRad Laboratories, Hercules, CA, USA). Serum creatinine levels were measured colorimetrically (Hitachi Automatic Analyzer 7600), and estimated glomerular filtration rate (eGFR) was calculated using the Chronic www.nature.com/scientificreports/ Kidney Disease Epidemiology Collaboration equation 30 . To obtain the urine albumin/creatinine ratio (UACR), urinary albumin was measured in spot urine using the immunoturbidimetric method, and urinary creatinine was measured using the colorimetric method.
Genotyping. We analyzed the data on single nucleotide polymorphisms (SNPs) in the whole genome available to the research community through the Korean Association Resource (KARE) project from KoGES and used the Affymetrix Genome-Wide Human SNP Array 5.0 (Affymetrix Inc., Santa Clara, CA, USA) to genotype the samples from the Ansan-Anseong cohorts. The Bayesian Robust Linear Model with the Mahalanobis distance algorithm was used to determine the genotypes at SNP of Affymetrix 5.0. SNPs were excluded if any of the following criteria were met: (1) a call rate lower than 95%, (2) a minor allele frequency below 0.05, or (3) a significant deviation from the Hardy-Weinberg equilibrium below 0.001. Among the SNPs filtered by these criteria, only tagging SNPs were used for analysis in this study.
Definition. According to previous epidemiologic studies 31-33 , we defined hypoalbuminemia as a serum albumin concentration of ≤ 4.0 g/dL. According to the Joint National Committee 7/8 guidelines 34,35 , hypertension was defined as either the use of antihypertensive therapy, SBP above 140 mmHg, or DBP above 90 mmHg.
Statistical analysis. All data, including socio-demographic information, medical conditions, anthropometric and clinical measurements, and laboratory results, were presented as mean ± standard deviation or frequencies (and proportions). The normality of the distribution of parameters was analyzed using the Kolmogorov-Smirnov test. If the original data did not follow a Gaussian distribution, logarithmic transformation was applied to make the distribution more normal. Quantitative variables were compared using the Mann-Whitney U test, and categorical variables by the chi-square and Fisher's exact tests. The relationship between serum albumin and potential risk factors for chronic CVD was assessed by linear regression analysis. Hazard ratios (HRs) with 95% confidence intervals (CIs) were calculated by multiple Cox-proportional hazards models according to the development of hypertension (case vs. control). In order to compare hypertension-free survival between the groups, Kaplan-Meier analysis with the log-rank test was used. We used Supremum tests to evaluate proportional hazards assumptions. Restricted cubic spline regression analysis was used to determine the possible non-linear dependency of the relation between candidate risk factor(s) and the risk of the dependent variable 36 .
Quantile-quantile plots and Manhattan plots were performed to visualize the results of the genome-wide association study, confirming the existence of relevant SNPs. We generated a regional association analysis using a web-based tool for the identification and annotation of proxy SNPs using HapMap (locuszoom.shp.umich. edu/locuszoom).
Because of the potential limitations of conventional statistical methods, an instrumental variable analysis (also known as Mendelian randomization) was applied to obtain causal inferences on the effect of an exposure on a clinically relevant outcome from observational data controlling for threats to its internal validity, including confounding variables, measurement error, spuriousness, simultaneity, and reverse causality [37][38][39][40] . In the twostage least squares regression, the effect estimates from the second stage of the instrumental variable analysis and ordinary least square analysis were compared using the Durbin-Wu-Hausman test. Because conventional two-stage least squares regression analysis may have insufficient statistical power to assess the pathogenic relevance when there is a non-linear relationship between the exposure and outcome, further two-stage residual inclusion analysis was performed to draw inferences on the causal effects of candidate genetic variant(s) on the association between hypoalbuminemia and interesting phenotype (hypertension) while controlling for potential confounding factors [41][42][43][44] .