Genetic association and functional analysis of rs7903456 in FAM35A gene and hyperuricemia: a population based study

Recent studies have identified SNP rs7903456 of FAM35A to be associated with gout. Because of the close connections between hyperuricemia and gout, we hypothesized that the effect of rs7903456 on gout might be mediated by hyperuricemia or its related quantitative trait, uric acid level. We investigated the association between 31 SNPs of FAM35A, including rs7903456, and hyperuricemia based on 2,773 hyperuricemia patients and controls. We fitted a simple model for each of these 31 SNPs to screen the candidate SNP for further analyses. Moreover, we selected potential confounders and fitted a multivariate model to investigate the adjusted effects of the targeted SNPs. Both disease status of hyperuricemia and blood uric acid level were considered as the main phenotype. We have identified rs7903456 to be associated with hyperuricemia and uric acid level. The significant signal was identified between rs7903456 and uric acid level after adjusted by several potential confounders. Our findings showed that the T allele of rs7903456 could increase the uric acid level by ~10 mmol/L on average after adjusting several biochemical and clinical variables. Our findings indicated that the previously identified effects of rs7903456 on gout might partly be mediated by its effect on uric acid levels.


Materials and Methods
Subjects. In our study, a total of 981 hyperuricemic individuals without gout and 1,792 normouricemic controls were recruited from Xijing Hospital, the Fourth Military Medical University between 2013 and 2016. The study protocol was approved by the Ethics Committee of Xijing Hospital of the Fourth Military Medical University in accordance with the ethical guidelines of the Helsinki Declaration of 1975 (revised in 2008). Written informed consent was obtained from participants before the start of the study. Hyperuricemia was defined as serum uric acid concentrations ≥416 μmol/L (male) or ≥357 μmol/L (female). Normouricemia was defined as serum uric acid <416 μmol/L (male) or <357 μmol/L (female). The controls were enrolled based on the selection criteria of frequency-matched age (±5 years) and gender of the patients. The age of enrolled subjects was between 25 and 60 years, and the body mass index (BMI) ranged from 21 to 28 kg/m 2 . All enrolled subjects had serum blood urea nitrogen and serum creatinine within the normal range. Subjects with diabetes, hypertension, congestive heart failure, kidney or liver diseases, as well as alcohol or tobacco addiction, were excluded, and individuals with a history of serum uric acid lowering agents were also excluded. Serum biochemical tests of all subjects were conducted, and biochemical parameters including total cholesterol, triglycerides, low density lipoprotein cholesterol, high density lipoprotein cholesterol, fasting glucose, creatinine, urea nitrogen, and uric acid in the plasma were measured or recorded.
SNPs selection and Genotyping. We searched for all SNPs with minor allele frequencies (MAF) ≥0.01 within the region of the FAM35A gene in the 1000 Genomes Chinese Han Beijing population (CHB). Then, MAF ≥ 0.01 with pair-wise tagging and r 2 ≥ 0.8 were used as the cut-off criteria during tag SNP selection, which generated 31 tag SNPs covering the region of the FAM35A gene for our study. Basic information on the 31 selected SNPs is summarized in the Supplemental Table S1. As we can see from this table, most of our selected SNPs are located at the intronic region. All of our selected SNPs had P values greater than 0.05 on the Hardy-Weinberg equilibrium test. Genomic DNA was extracted from peripheral blood leukocytes according to the manufacturer's protocol (Genomic DNA kit, Axygen Scientific Inc., California, USA). Genotyping was performed for all SNPs using the Sequenom Mass ARRAY RS1000 system (Sequenom, San Diego, California, USA). The results were processed using Typer Analyzer software, and genotype data were generated from the samples. Case and control statuses were blinded during all genotyping processes for quality control. Five percent of the random samples were repeated, and the results were 100% concordant.
Statistical and Bioinformatics analyses. Clinical variables and age were compared between hyperuricemia patients and the controls using a two-sample t-test. The distribution of gender between the two groups was examined by χ 2 test. Genetic association analyses were conducted by Plink 13 . Since multiple clinical variables have shown significant differences between our hyperuricemic patients and controls, it was very important to adjust for some of these variables to remove the potential confounding effects. To achieve this goal, we implemented a three-step strategy as follows: 1) We fitted a simple model for each of these 31 selected SNPs with only age and gender adjustments, and significant SNPs will be obtained and kept for further analysis. 2) We took those clinical variables one-by-one as phenotypes and fitted linear models with the significant SNPs obtained from the previous step (including age and gender as covariates) to identify the potential confounders. Significant clinical variables will be kept for further analysis. 3) A multivariate model will be fitted with age, gender and the confounders identified from step 2 to investigate the adjusted effects of the targeted SNPs. Both disease status of hyperuricemia and blood uric acid level were considered as the main phenotype. General statistical tests were performed by R software 14 . P = 0.05 was considered as the significance threshold for single statistical tests. Bonferroni corrections were applied as necessary to address multiple comparisons. To examine the potential functional significance of the identified SNPs, we utilized GTEx to investigate the eQTL pattern of the significant SNPs 15 . Furthermore, we performed power calculations by using PGA v2.0 16 . Our sample size can detect SNP association with 88% power, at a false positive rate of 5% and a presumed odds ratio (OR) of 1.5.

Simple model fitting. Demographic and clinical characteristics of hyperuricemic patients and controls
were shown in Table 1. All of the clinical variables tested in our study subjects have shown significant differences between patients and controls. The simple models for the disease status of hyperuricemia and uric acid level were fitted. The full results are summarized in Table 2. A total of 31 SNPs were tested, and the P value threshold used here was 0.05/31 ≈ 0.0016. With this threshold, we have identified SNP rs7903456 to be significantly associated with the disease status of hyperuricemia (OR = 1.33, P = 3.54 × 10 −6 ) after adjustment by age and gender, and this significant signal can be replicated by analysis of the blood level of uric acid (β = 10.85, P = 6.82 × 10 −5 ). The Q-Q plot of the full results of the simple model was shown in Supplemental Fig. S1.

Confounding factors and multivariate model fitting.
To identify the potential confounding factors that need to be adjusted in the multivariate model, we have fitted several simple models to investigate the potential relationships between SNP rs7903456 and the clinical variables tested in our study subjects. The results are summarized in Supplemental Fig. S2. A total of 6 out of the 8 clinical variables, including BMI (body mass index), fasting glucose level, total cholesterol level, triglyceride level, urea nitrogen level and creatinine level, were identified to be significant and therefore need to be adjusted in the multivariate model. HDL (high density lipoprotein cholesterol) and LDL (low density lipoprotein cholesterol) levels were not related to SNP rs7903456 and therefore might not confound the association signal. To examine the potential multi-collinearity effects in our multivariate model, we have calculated the VIF (variance inflation factor) and R 2 for these 6 clinical variables. The results are summarized in Supplemental age, gender, and these 6 clinical variables as covariates. A significant association signal was obtained for uric acid level only when rs7903456 was coded in the recessive mode (SNP was coded as 0 and 1 when there is either 0 or 1 minor allele and 2 minor alleles). After being adjusted by several clinical factors, the P value was 0.019 (β = 10.61). No multiple comparisons need to be addressed for the multivariate model because we only tested and presented one SNP. This significant hit was only identified for the uric acid level but not for disease status. The full results of the multivariate model for rs7903456 are summarized in Table 3.

Bioinformatics analyses.
We examined the potential effects of SNP rs7903456 on gene expression in multiple tissues by the GTEx (Genotype-Tissue Expression) database (Fig. 1). Rs7903456 had widespread significant effects on expression of multiple loci including FAM35A, GLUD1, LINC00864 and NUTM2A, and these eQTL (Expression Quantitative Trait Loci) effects were identified in 23 human tissues. This finding indicated that SNP rs7903456 had significant biological function through affecting gene expression of FAM35A.

Discussion
To the best of our knowledge, this study is the first to focus on the relationship between FAM35A genetic polymorphisms and hyperuricemia-related traits in the Chinese Han population. In this study, we have identified a significant SNP, rs7903456, to be associated with hyperuricemia and the related uric acid level. Although the significance of rs7903456 on the disease status disappeared after being adjusted by several potential confounders, its significance on the hyperuricemia-related quantitative trait was successfully attained. Our findings showed that the T allele of rs7903456 could increase the uric acid level by ~10 mmol/L on average after adjusting several biochemical and clinical variables. Our results were similar to the GWAS study of gout in the Japanese population published by Nakayama et al., although different phenotypes were investigated in the two studies. In the GWAS study, the T allele (known as the A allele in the original paper due to a different reference sequence) of rs7903456 would significantly increase the risk of gout by approximately 30% 12 . Our findings indicated that the effects of  Table 3. Full results of multivariate models fitted for rs7903456. CI: confidence interval. rs7903456 on gout might partly be mediated by its effect on the uric acid level because we now know that one of the most important risk factors for gout is hyperuricemia and its related biochemical variables. FAM35A encodes a protein with 835 amino acids, and it is ubiquitously expressed in many organs and human tissues 17 . However, the biological function of FAM35A is still unclear. Nevertheless, combined with the results of our bioinformatics analyses using GTEx, one interesting point is worth noting. Our GTEx data showed that rs7903456 might have significant effects on four genes in multiple tissues. Therefore, it is valuable to move our focus from FAM35A to the other 3 genes. As shown in Fig. 1, the other three genes were GLUD1, LINC00864 and NUTM2A. Similar to FAM35A, the biological function for LINC00864 and NUTM2A are still unclear, but GLUD1 is a gene with clearer biological functions. GLUD1 encodes glutamate dehydrogenase, which plays an important role in human nitrogen metabolism. It is located at the mitochondrial matrix and catalyzes the oxidative deamination of glutamate into alpha-ketoglutarate and ammonia. Early family-based studies have linked this gene to hyperinsulinemic hypoglycemia and hyperammonemia [18][19][20] . Nevertheless, it is still too early to conclude anything with this vague bioinformatics evidence. More studies are needed in the future to 1) confirm the biological function of FAM35A and 2) investigate the potential association between GLUD1 and hyperuricemia (and its related traits).
In our study, some limitations should be kept in the mind. One challenge of this study is to control confounding factors. As shown in Table 1 and Supplemental Fig. S2, most of the clinical variables we collected were associated with both hyperuricemia and our target SNP. Although we have adjusted these factors in our multivariate model, it is still possible that some underlying confounders have been missed. Some previous similar studies have included blood pressure measures as covariates in the multivariate model 10,11 , however, we did not collect these measures, so they were not adjusted in our model. This might introduce some confounding effects on our results. In addition, we did not apply any statistical techniques in addressing the potential population stratification in this genetic association study. As a candidate gene-based study, we simply did not have enough genetic markers to estimate the principle components for our sample. However, we have applied some criteria during the sample recruitment process to control the genetic heterogeneity of our sample by restricting their immigration history, and the Q-Q plot of our simple model did not show any signs of inflation for our tested markers. Finally, Population stratification is one of the major limitations to weaken the reliability of our significant results. However, we have tried our best to rule out the potential effects of population stratification by restricting the genetic background (through confining the immigration history of study subjects) during the sample collection 21,22 .
In sum, this study has identified a significant SNP located within FAM35A to be associated with hyperuricemia and the blood uric acid level, and after adjusting for multiple potential confounders, the association signal is still attained for the uric acid level. Our findings indicated that the previously identified effects of rs7903456 on gout in Japanese population might be partly mediated by its effect on the uric acid level. More studies in the future are needed to clarify the biological functions of FAM35A and NUTM2A. In addition, it might be worth diverting the focus from FAM35A to GLUD1, which is related to nitrogen metabolism since our bioinformatics analyses showed that rs7903456 was also associated with the expression of GLUD1.