Introduction

The fibroblast growth factor (FGF) superfamily includes 22 members. Except for the endocrine factors FGF19, FGF21, and FGF23, the family members act as autocrine or paracrine factors1. Recent evidence shows that FGF21 exerts metabolic effects by acting both centrally and peripherally, mostly in the liver, adipose tissue and pancreas. In prolonged starvation, FGF21 is secreted by the liver; it promotes gluconeogenesis, increases hepatic fatty acid oxidation and ketogenesis2, enhances glucocorticoid secretion, and suppresses energy-consuming processes such as ovulation3. In contrast, in the fed state, FGF21 promotes glucose uptake by fat cells and increases heat production in brown fat4. In the pancreatic islet, FGF21 preserves beta-cell survival and restores insulin synthesis under the stress of nutrient excess5. Of note, FGF21 acts on the brain to induce sympathetic nerve activity, thereby increasing energy expenditure and promoting weight loss6. FGF21 is regulated by both peroxisome proliferator-activated receptor (PPAR) γ and PPARα and is an important mediator of their downstream effects6,7. FGF21 is currently a drug target for treating lipid disorders and fatty liver8.

FGF23 is the physiological regulator of serum phosphate and vitamin D levels. FGF23 lowers apical membrane expression of sodium-dependent phosphate co-transporters 2A and 2C in the kidney, both of which primarily mediate renal tubular phosphate reabsorption. Furthermore, FGF23 lowers serum 1,25-dihydroxyvitamin D3 (1,25(OH)2D3) levels, thereby preventing hyperphosphatemia and hypervitaminosis D (refs. 1, 9). FGF23 has structural and biological features similar to those of FGF19 and FGF21, which are known to regulate glucose and lipid metabolism10. FGF23 levels were also found to be associated with obesity and dyslipidemia in studies of elderly cohorts11. FGF23 is also an important drug target for treating phosphate disorders and bone disease12. Robinson-Cohen et al. performed a genome-wide association study (GWAS) of circulating FGF23 concentrations among people of European ancestry and found five genome-wide significant loci13.

To better understand the regulation of circulating FGF21 and FGF23 levels, we performed a GWAS to identify genetic determinants of circulating FGF21 and FGF23 levels in a Taiwanese population.

Results

Of the 5,000 subjects enrolled from Taiwan Biobank, one withdrew from the study, 559 were excluded due to their having diabetes mellitus, and 239 were excluded by quality control procedures. A total of 617,073 genotyped autosomal SNPs remained. After imputation and post-imputation quality control, 7,897,704 SNPs remained. We performed GWAS analysis in the remaining 4,201 subjects, where log-transformed FGF21 or FGF23 level was the quantitative phenotype. Study characteristics are shown in Table 1.
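The sample flow described above can be tallied in a few lines. This is only an illustrative check of the arithmetic; the variable names are hypothetical and the counts are those stated in the text.

```python
# Sample flow as reported in the text (variable names are illustrative only).
enrolled = 5000
withdrawn = 1
excluded_diabetes = 559
excluded_quality_control = 239

# Subjects remaining for GWAS analysis after all exclusions.
analysed = enrolled - withdrawn - excluded_diabetes - excluded_quality_control
print(analysed)  # 4201
```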

Table 1 Study characteristics of participants.

The results of association testing for FGF21 and FGF23 are listed in Tables 2 and 3, respectively, showing the SNP with the strongest association in each region. Manhattan plots and quantile–quantile plots of log-transformed FGF21 and FGF23 are shown in Figs. 1 and 2, respectively. Regional association plots for each SNP are shown in Figs. 3 and 4. For FGF21, no genome-wide significant SNP was found. One suggestive locus is located in the intergenic region between the PHC2 and ZSCAN20 genes on chromosome 1 (rs12565114; P = 6.00 × 10–7, Fig. 3a), another is between the ARGLU1 and FAM155A genes on chromosome 13 (rs9520257; P = 6.11 × 10–7, Fig. 3b), and the third is within the RGS6 gene on chromosome 14 (rs67327215; P = 6.66 × 10–7, Fig. 3c). For FGF23, two genome-wide significant loci were found. One locus is near the PCSK9 gene on chromosome 1 (rs17111495; P = 1.04 × 10–10, Fig. 4a), and the other is near the HLA-DQA1 gene on chromosome 6 (rs17843626; P = 1.80 × 10–8, Fig. 4b). There is also a suggestive locus within the TGFB2 gene on chromosome 1 (rs2798631; P = 4.97 × 10–7, Fig. 4c). The array-based heritability estimates for FGF21 and FGF23 were 0.1171 and 0.1104, respectively. The proportions of variance in serum FGF21 level explained by rs12565114, rs9520257 and rs67327215 were 0.5%, 0.6% and 0.5%, respectively; for serum FGF23 level, the proportions of variance explained by rs17111495, rs2798631 and rs17843626 were 1.0%, 0.6% and 0.7%, respectively.
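As a minimal sketch, the labels "significant" and "suggestive" used above can be reproduced by comparing each reported P value with the two thresholds given in the Statistical analyses section (5 × 10–8 and 1.0 × 10–6). The function name and the use of strict inequalities are assumptions made for illustration; the P values are those reported in the text.

```python
# Thresholds as stated in the Statistical analyses section.
GENOME_WIDE_THRESHOLD = 5e-8
SUGGESTIVE_THRESHOLD = 1.0e-6


def classify_p_value(p: float) -> str:
    """Label a P value; strict '<' comparisons are an assumption."""
    if p < GENOME_WIDE_THRESHOLD:
        return "genome-wide significant"
    if p < SUGGESTIVE_THRESHOLD:
        return "suggestive"
    return "not significant"


# Top SNPs and P values as reported in the text.
top_snps = {
    "rs12565114": 6.00e-7,   # FGF21, PHC2/ZSCAN20 intergenic
    "rs9520257": 6.11e-7,    # FGF21, ARGLU1/FAM155A intergenic
    "rs67327215": 6.66e-7,   # FGF21, RGS6
    "rs17111495": 1.04e-10,  # FGF23, near PCSK9
    "rs17843626": 1.80e-8,   # FGF23, near HLA-DQA1
    "rs2798631": 4.97e-7,    # FGF23, TGFB2
}

labels = {snp: classify_p_value(p) for snp, p in top_snps.items()}
for snp, label in labels.items():
    print(snp, label)
```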

Table 2 Top genetic polymorphisms associated with log-transformed FGF21 level.
Table 3 Top genetic polymorphisms associated with log-transformed FGF23 level.
Figure 1

Manhattan plots of the GWAS results for FGF21 and FGF23. SNPs are plotted on the x axis according to their chromosomal position against their association with (a) log FGF21 or (b) log FGF23 on the y axis. The red horizontal line represents the suggestive threshold of P = 1.0 × 10–6.

Figure 2

Quantile–quantile plots of (a) log FGF21 and (b) log FGF23.

Figure 3

Regional association plots of log FGF21.

Figure 4

Regional association plots of log FGF23.

Further excluding the two subjects with severe renal impairment (eGFR < 30 ml/min/1.73 m2) did not alter the top SNPs associated with FGF23 level (Supplementary Table S1).

We examined the five SNPs reported by Robinson-Cohen et al.13 to be associated with serum FGF23 levels in the European population but failed to find significant associations with FGF23 level in our cohort (Supplementary Table S2). In view of the low minor allele frequency of rs17216707, we performed gene-based association analysis using MAGMA v1.07b (ref. 14) and still found no significant association (CYP24A1, P = 0.18). The regional plots of association with serum FGF23 levels in Taiwan Biobank at the five loci identified previously by Robinson-Cohen and colleagues are shown in Supplementary Fig. S1a–e. We also performed a meta-analysis pooling our results with those reported by Robinson-Cohen et al.13 using METAL15, but the results were not significant, as shown in Supplementary Table S3 (P values were 0.00070, 0.0010 and 0.00029 for rs17111495, rs2798631 and rs17843626, respectively).

Discussion

In this GWAS of circulating FGF21 and FGF23 levels, we identified three suggestive loci for FGF21, and two genome-wide significant loci and one suggestive locus for FGF23. SNP rs67327215, associated with FGF21 level, is located within RGS6 (Regulator of G protein signaling 6). RGS6 belongs to the R7 subfamily of RGS proteins16 and is expressed in a variety of organs and tissues, including the central nervous system and liver17. RGS proteins regulate G protein-coupled receptor-initiated signaling by acting as GTPase-activating proteins for Gα subunits, which hydrolyze GTP and restore the inactive GDP-bound Gαβγ heterotrimer18. A previous study showed that Rgs16 knockout mice had increased Fgf21 expression in the liver, whereas overexpression of Rgs16 decreased Fgf21 expression. It was postulated that RGS16 inhibits Gαi/Gαq-mediated fatty acid oxidation, thereby decreasing the Fgf21 expression induced by peroxisome proliferator-activated receptor alpha (PPARα)19. RGS6 knockout mice showed lower mRNA levels of PPARα and PPARγ20. These data suggest that RGS6 may also act as a regulator of circulating FGF21 levels through fatty acid synthesis and oxidation.

The SNPs rs12565114 and rs9520257, also suggestively associated with FGF21 level, are intergenic SNPs on chromosomes 1 and 13, respectively. rs12565114 is located between the PHC2 (Polyhomeotic Homolog 2) and ZSCAN20 (Zinc Finger And SCAN Domain Containing 20) genes. PHC2 is associated with conventional angiosarcoma21. ZSCAN20 is associated with diabetic neuropathic pain22. rs9520257 is located between the ARGLU1 (Arginine And Glutamate Rich 1) and FAM155A (Family With Sequence Similarity 155 Member A) genes. ARGLU1 acts in cooperation with MED1 (Mediator Complex Subunit 1) and is required for estrogen-dependent gene transcription and breast cancer cell growth23. FAM155A is associated with diverticulitis24. None of these genes is currently known to be related to FGF21. MicroRNAs or long non-coding RNAs encoded in these intergenic regions might affect FGF21 levels.

SNP rs17111495, strongly associated with FGF23 levels, is located upstream of PCSK9 (Proprotein convertase subtilisin/kexin type 9). PCSK9 is a member of the proprotein convertase family PCSK (previously termed subtilisin-like proprotein convertases) and is synthesized as a soluble zymogen that undergoes autocatalytic intramolecular processing in the endoplasmic reticulum25. FGF23 is inactivated when it is cleaved intracellularly by PCSK at the minimum consensus sequence RHTR179, between Arg179 and Ser180 (refs. 26, 27). Autosomal dominant hypophosphatemic rickets is caused by gain-of-function mutations in FGF23 that render it resistant to PCSK cleavage at the RHTR179 site, resulting in elevated circulating FGF23 levels and consequent renal phosphate wasting, rickets, and osteomalacia26. Hyperphosphatemic familial tumoral calcinosis due to GALNT3 mutation involves deficient O-glycosylation of the threonine residue in the RHTR179 proprotein convertase-processing site, which favors intracellular degradation of intact FGF23 by PCSK and results in reduced renal phosphate excretion28. Collectively, previous studies and our findings suggest that PCSK is a critical regulator of FGF23 secretion.

SNP rs17843626, significantly associated with circulating FGF23 levels, is located between HLA-DQA1 and HLA-DQB1 (Human Leukocyte Antigen, Class II, DQ Alpha 1 and Beta 1). In general, these HLA class II loci are known to be associated with autoimmune diseases such as type 1 diabetes, multiple sclerosis and rheumatoid arthritis29. No previous report has described an association between these loci and FGF23 level or its regulators, parathyroid hormone and vitamin D. This association is therefore also a novel finding.

SNP rs2798631, suggestively associated with circulating FGF23 levels, is located within TGFB2 (Transforming growth factor beta-2), which is involved in cell proliferation, differentiation, inflammation and apoptosis. A recent study demonstrated that TGF-β2 directly enhanced FGF23 production in rat osteoblast-like cells by increasing calcium entry30. 1,25(OH)2D3, the active vitamin D metabolite that induces FGF23 production, also inhibits TGF-β downstream signaling31. Downstream TGF-β and 1,25(OH)2D3 signaling are characterized by intense crosstalk32. The expression of FGF23 induced by both modulators also seems to be subject to this crosstalk33.

For more comprehensive bioinformatic annotations of these top SNPs, we used GRASP v2.0 to search for documented associations between these SNPs and phenotypes34. One study showed association between rs12565114 and schizophrenia35. Several studies demonstrated association between rs17111495 and serum lipid levels36,37. rs2798631 was associated with height38, refractive error39 and Parkinson’s disease40. There were no results for rs9520257, rs67327215 and rs17843626.

The absolute circulating FGF23 levels measured in this study differed from those obtained by Robinson-Cohen et al. This study used the Duoset ELISA Development kit (R&D Systems, Inc.) to measure intact FGF23 levels, and the median intact FGF23 level in the studied population was 360 (110–1,740) pg/ml (expressed as median and interquartile range). In published studies using this ELISA kit, the median intact FGF23 level was 269.9 (109.2–1,014.0) pg/ml in Chinese renal transplant donors41 and 150 (46–583) pg/ml in Mexicans with normal kidney function42. In contrast, Robinson-Cohen et al. used a different ELISA kit (Kainos Laboratories, Inc., Tokyo, Japan), and their study population was of European ancestry. Their mean intact FGF23 level was around 40–50 pg/ml (ref. 13). The variation in measured FGF23 levels may be attributed to the ethnicity of the studied populations and to the different ELISA kits used.

Our study has some unique strengths and some limitations. It is the first known study identifying suggestive genetic determinants of circulating FGF21 level. We also evaluated SNPs associated with circulating FGF23 level in a population whose ancestry differs from those studied in previous GWAS, and novel candidate loci were found.

As mentioned above, our findings differed from those of Robinson-Cohen et al.13, possibly due to differences in ethnicity, differences in gene-environment and gene–gene interactions, and our relatively small sample size. Because the minor allele frequency of rs17216707 was low in our cohort, gene-based association analysis was conducted as an alternative approach to replicating the findings, but the result was not significant. A meta-analysis pooling this study and that of Robinson-Cohen et al. found no statistically significant results either.

Regarding the lack of replication of our findings in an external cohort, we searched for GWAS summary statistics of circulating FGF21 or FGF23 phenotypes in UK Biobank and Biobank Japan but found them unavailable. Further replication may be needed in the future.

In conclusion, this study is the first GWAS on circulating FGF21 level to date. Novel candidate genetic loci possibly related to circulating FGF21 and FGF23 levels were found. Further replication and functional studies are needed to support the present findings.

Materials and methods

Study population

Five thousand subjects were enrolled from Taiwan Biobank by random sampling as the study population. Subjects were aged between 30 and 70 years, had no history of cancer, and volunteered to join the Taiwan Biobank study population. Subjects refrained from eating and drinking before blood samples were drawn. Delinked databases including biological specimens, personal data and clinical information were used in this investigation. The estimated glomerular filtration rate (eGFR) was calculated using the 4-variable equation from the Modification of Diet in Renal Disease (MDRD) Study43. There were only two individuals with severe renal impairment (eGFR < 30 ml/min/1.73 m2), and their serum FGF23 levels were 0.259 and 0.709 ng/ml.
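For orientation, a minimal sketch of a 4-variable MDRD calculation is given below. The text does not state which published version of the equation was applied, so the coefficient of 175 (the re-expressed, IDMS-traceable form; the original form uses 186) is an assumption, as are the function and parameter names. The race term of the equation is omitted in this sketch.

```python
def mdrd_egfr(serum_creatinine_mg_dl: float, age_years: float, female: bool,
              coefficient: float = 175.0) -> float:
    """Estimate GFR (ml/min/1.73 m2) with the 4-variable MDRD form.

    coefficient=175.0 is an assumed default (IDMS-traceable version);
    pass 186.0 for the original version. The race term is omitted.
    """
    if serum_creatinine_mg_dl <= 0 or age_years <= 0:
        raise ValueError("creatinine and age must be positive")
    egfr = coefficient * serum_creatinine_mg_dl ** -1.154 * age_years ** -0.203
    if female:
        egfr *= 0.742
    return egfr


def has_severe_renal_impairment(egfr: float) -> bool:
    # Cut-off used in the text: eGFR < 30 ml/min/1.73 m2.
    return egfr < 30.0


# Hypothetical inputs for illustration only (not study data).
example_male = mdrd_egfr(1.0, 50, female=False)
example_female = mdrd_egfr(1.0, 50, female=True)
print(round(example_male, 1), round(example_female, 1))
```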

All subjects provided written informed consent. Individuals diagnosed with diabetes mellitus (either as stated in the questionnaire or HbA1c ≥ 6.5%) were excluded.

FGF21 and FGF23 measurements

Intact circulating FGF21 and FGF23 levels in serum of Taiwan Biobank subjects were measured using enzyme-linked immunosorbent assay (Duoset ELISA Development kit, R&D Systems, Inc.). The assays have high sensitivity and exhibit no interference or cross-reactivity with recombinant human FGF R1a/Fc Chimera, FGF R2a/Fc Chimera, FGF R3/Fc Chimera, FGF R4/Fc Chimera and Klotho. All serially diluted standards were assayed in duplicate.

Genotyping, quality control and imputation

Genome-wide genotyping of all subjects was carried out at the National Center for Genome Medicine of Academia Sinica using the Axiom-Taiwan Biobank Array Plate (TWB chip; Affymetrix Inc, Santa Clara, California)44. The TWB chip consists of 653,291 SNPs and was specifically customized for the Taiwanese population, who are mainly of Han Chinese lineage, by including SNPs with detected polymorphisms in Taiwanese-based genotyping results from the Axiom Genome-Wide CHB 1 Array plate (Affymetrix Inc). SNPs from ancestry information panels, GWAS and cancer studies, as well as pharmacogenetics arrays, were also incorporated into the TWB chip. PLINK (version 1.07), an open-source whole-genome association analysis toolset, was used for quality control45. Genotypes for SNPs with batch effects were set as missing. Individuals with a high missing genotype rate (> 5%), with an extreme heterozygosity rate (more than 5 standard deviations above or below the mean heterozygosity rate), or who were closely related as assessed by identity-by-descent (IBD) estimation (IBD ≥ 0.1875) were excluded from analyses. SNPs with a high missing genotype rate (> 5%), low frequency (minor allele frequency < 1%), or deviation from Hardy–Weinberg equilibrium (HWE) (P value < 10–5) were excluded, leaving 617,073 SNPs. Population structure was assessed using principal component analysis. Using PLINK, we computed the principal components on an LD-pruned (r2 < 0.2) set of autosomal variants obtained after removing high-LD regions. Genome-wide genotype imputation was performed with SHAPEIT (ref. 46) and IMPUTE2 (ref. 47), using the 1000 Genomes Project (1000GP) Phase 3 East Asian (EAS) population as the reference panel. Post-imputation quality control was as follows: SNPs with an IMPUTE2 imputation quality score (info score48) greater than 0.3 were retained for further analysis, indels were removed with VCFtools49, and SNPs with low frequency (minor allele frequency < 1%) were excluded.
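The variant-level filters described above can be summarized in a short sketch. This is not the PLINK or VCFtools workflow used in the study; it is a plain-Python illustration of the stated thresholds, and all class names, function names and example records are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GenotypedSnp:
    snp_id: str
    missing_rate: float  # fraction of samples with a missing genotype
    maf: float           # minor allele frequency
    hwe_p: float         # Hardy-Weinberg equilibrium test P value


@dataclass
class ImputedVariant:
    variant_id: str
    info: float          # IMPUTE2-style info score
    maf: float
    is_indel: bool


def passes_genotyped_qc(snp: GenotypedSnp) -> bool:
    # Excluded if missing rate > 5%, MAF < 1%, or HWE P value < 1e-5.
    return snp.missing_rate <= 0.05 and snp.maf >= 0.01 and snp.hwe_p >= 1e-5


def passes_imputed_qc(variant: ImputedVariant) -> bool:
    # Retained if info score > 0.3, not an indel, and MAF >= 1%.
    return variant.info > 0.3 and not variant.is_indel and variant.maf >= 0.01


# Hypothetical records for illustration only (not study data).
genotyped = [
    GenotypedSnp("snp_ok", 0.01, 0.20, 0.50),
    GenotypedSnp("snp_missing", 0.08, 0.20, 0.50),
    GenotypedSnp("snp_rare", 0.01, 0.005, 0.50),
    GenotypedSnp("snp_hwe", 0.01, 0.20, 1e-7),
]
imputed = [
    ImputedVariant("var_ok", 0.90, 0.10, False),
    ImputedVariant("var_low_info", 0.25, 0.10, False),
    ImputedVariant("var_indel", 0.90, 0.10, True),
]

kept_genotyped = [s.snp_id for s in genotyped if passes_genotyped_qc(s)]
kept_imputed = [v.variant_id for v in imputed if passes_imputed_qc(v)]
print(kept_genotyped, kept_imputed)
```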

Statistical analyses

Age, body mass index (BMI), glucose level and eGFR were expressed as mean and standard deviation. FGF21 and FGF23 levels were expressed as median and interquartile range. GWAS analyses were performed using an additive genetic model in PLINK v1.07. Skewed variables, including serum FGF21 and FGF23, were logarithmically transformed to approximate a normal distribution. We examined regression diagnostic plots to check the assumptions of linearity, normality, and homoscedasticity, as well as to identify influential observations for each outcome of interest (logFGF21 and logFGF23) and SNP, adjusting for covariates. We did not find any severe violations of these assumptions. Further, the variance inflation factor (VIF) was used to check for multicollinearity, and no multicollinearity among the independent variables was found. Linear regression was employed to analyze the association between SNPs and log-transformed FGF21 and FGF23 levels. Population stratification was not evident (λ = 1.002 for FGF21, λ = 0.999 for FGF23). Age, sex and the first ten principal components of ancestry were adjusted for in Model 1, and BMI was additionally adjusted for in Model 2. For log FGF23, BMI and eGFR were additionally adjusted for in Model 3. The threshold for genome-wide significance was set at P = 5 × 10–8 (ref. 50). In view of the relatively small sample size, and to avoid false negative results caused by an overly stringent threshold51, the P value threshold for suggestive results was set at 1.0 × 10–6 (ref. 52). Manhattan plots and quantile–quantile plots were drawn using the qqman R package53,54. Regional association plots were generated by LocusZoom55. The 1000 Genomes Project Phase 3 East Asian ancestry panel was used as the reference population and Genome Reference Consortium Human Build 37 was used for gene annotation. The array-based heritability was calculated by linkage disequilibrium (LD) score regression56. The proportion of phenotypic variance explained by a single SNP was estimated by the following formula:

$$ \frac{2\beta^{2}\,\text{MAF}\,(1-\text{MAF})}{2\beta^{2}\,\text{MAF}\,(1-\text{MAF}) + \text{SE}^{2}\cdot 2N\,\text{MAF}\,(1-\text{MAF})} $$

where β, SE, N, and MAF are the effect size estimate of each minor allele on the relative concentration of FGF21/FGF23, the standard error of the effect size, the sample size, and the minor allele frequency of the SNP, respectively57. Gene-based association analysis was performed using MAGMA v1.07b (ref. 14) with LD information retrieved from the 1000 Genomes Project Phase 3 East Asian panel. Meta-analysis pooling our results with those reported by Robinson-Cohen et al. was performed with METAL using a random-effects model15.
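The variance-explained formula above translates directly into code. The sketch below is illustrative only: the function name and the example inputs are hypothetical and are not estimates from this study.

```python
def proportion_variance_explained(beta: float, se: float, n: int, maf: float) -> float:
    """Proportion of phenotypic variance explained by a single SNP.

    beta: effect size estimate per minor allele
    se:   standard error of beta
    n:    sample size
    maf:  minor allele frequency, strictly between 0 and 1
    """
    if not 0.0 < maf < 1.0:
        raise ValueError("maf must be strictly between 0 and 1")
    genetic_term = 2.0 * beta ** 2 * maf * (1.0 - maf)
    error_term = se ** 2 * 2.0 * n * maf * (1.0 - maf)
    return genetic_term / (genetic_term + error_term)


# Hypothetical inputs for illustration only (not study estimates).
example_pve = proportion_variance_explained(beta=0.1, se=0.02, n=4201, maf=0.3)
print(example_pve)
```

Because the factor 2 × MAF × (1 − MAF) appears in every term, the expression reduces algebraically to β² / (β² + N × SE²) whenever MAF lies strictly between 0 and 1.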

Bioethics statement

All methods were carried out in accordance with relevant guidelines and regulations. The experimental protocols were approved by the Institutional Review Board of National Taiwan University Hospital.