Introduction

Calcium is a universal intracellular messenger that has an important role in controlling various cellular processes, such as muscle contraction, cardiac rhythmicity, blood clotting and secretion of diverse hormones.1, 2 It is estimated that 99% of body calcium is stored in bones and teeth, while 1% circulates in the bloodstream and is involved in intracellular signaling.3 Serum calcium levels are under strong genetic control: ~70% of the variation in the total calcium level is estimated to be attributable to genetic effects.4 The parathyroid glands have an important role in maintaining ionized calcium levels within a narrow range for many cellular functions, which is accomplished by regulating the minute-to-minute release of parathyroid hormone into the blood.5 Disturbances of this tightly regulated homeostatic system lead to disorders of calcium metabolism.6

An important regulator of parathyroid hormone release is the calcium-sensing receptor (CASR), a G-protein coupled cell-surface receptor, located mainly in the plasma membrane and in renal tubule cells.7 Loss-of-function mutations of the CASR gene on chromosome 3q are associated with familial hypocalciuric hypercalcemia and hyperparathyroidism.8 Polymorphisms in the CASR gene have been reported to be associated with hypertension and calcium kidney stones.9, 10 Some polymorphisms in the CASR gene were reported to be more frequent in Asian and African populations than in Europeans.11, 12 To date, two genome-wide association studies (GWAS) and one meta-analysis of serum calcium levels have been reported, which identified several polymorphisms in the CASR gene affecting serum calcium levels.3, 13, 14 Nevertheless, most of these studies were conducted in Caucasians. The present study evaluates the genetic polymorphisms that affect serum calcium levels in the Korean population through a large-scale GWAS with a combined sample size of 8642 individuals.

Materials and methods

Study subjects

Study subjects were selected from an ongoing population-based study known as the Korean Genome and Epidemiology Study (KoGES). The Korea Centers for Disease Control and Prevention launched the KARE (Korean Association REsource) project, through which two cohorts were recruited from Ansung and Ansan, two cities in Gyeonggi Province, Republic of Korea.15 The Ansan cohort consisted of 4558 unrelated Korean individuals who live in urban areas and was used as the stage 1 discovery set. The Ansung cohort consisted of 4093 unrelated Korean individuals who live in rural areas and was used as the stage 2 replication set. This study was approved by the Institutional Review Board of the Catholic University of Korea, College of Medicine (CUMC07U047).

Genotyping

All subjects were genotyped using the Affymetrix Genome-Wide Human Single-nucleotide Polymorphism (SNP) Array 5.0 (Affymetrix, Santa Clara, CA, USA). Quality control was performed as in earlier studies.15, 16 In brief, we excluded individuals with discordant sex information and those with a genotype failure rate higher than 3%. Individuals with a heterozygosity rate >3 s.d. away from the mean were also excluded from the analysis. The genetic homogeneity of the KARE study population has already been reported.15 We applied SNP imputation to increase the coverage of variants and capture additional association signals. The detailed imputation procedure has previously been described by Cho et al.15 The genotypes of the individuals were imputed using IMPUTE software17 with JPT/CHB data from the International HapMap Project and data from the 1000 Genomes Project as reference panels. To these imputed SNP genotypes, we applied standard quality control thresholds: SNP call rate >95%, minor allele frequency >5% and Hardy-Weinberg equilibrium P>0.001. After this quality control process, genotypes of 4558 individuals for 1 219 546 autosomal SNPs were used for the stage 1 association analysis. The expression quantitative trait loci associations of the top significant SNPs (P<10⁻⁷) were checked using the SCAN database (http://www.scandb.org).18
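
The SNP-level thresholds described above (call rate >95%, minor allele frequency >5%, Hardy-Weinberg equilibrium P>0.001) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study; the function names, the genotype coding (0, 1 or 2 copies of one allele, None for a missing call) and the use of a 1-d.f. chi-square approximation for the Hardy-Weinberg test are assumptions made for illustration only.

```python
import math

def snp_qc_metrics(genotypes):
    """Return (call_rate, maf, hwe_p) for one SNP.

    genotypes: list of allele counts (0, 1 or 2), with None for a missing call.
    The coding and the chi-square HWE approximation are illustrative assumptions.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return 0.0, 0.0, 1.0  # nothing called: fails on call rate
    call_rate = n / len(genotypes)
    n_aa, n_ab, n_bb = called.count(0), called.count(1), called.count(2)
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the allele coded as 0
    q = 1.0 - p
    maf = min(p, q)
    if maf == 0.0:
        return call_rate, maf, 1.0  # monomorphic SNP: HWE test undefined
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 d.f.
    hwe_p = math.erfc(math.sqrt(chi2 / 2.0))
    return call_rate, maf, hwe_p

def passes_qc(genotypes, min_call=0.95, min_maf=0.05, min_hwe_p=0.001):
    """Apply the three SNP-level thresholds quoted in the text."""
    call_rate, maf, hwe_p = snp_qc_metrics(genotypes)
    return call_rate > min_call and maf > min_maf and hwe_p > min_hwe_p
```

Dedicated tools typically offer an exact Hardy-Weinberg test, which is preferable for rare genotype classes; the chi-square form is used here only to keep the sketch short.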

Statistical analysis

We performed a linear regression analysis assuming an additive model to determine the association of variants with log10-transformed ionized serum calcium levels. Information on calcium levels and covariates was obtained from the KARE project.15 We followed the formula used in an earlier study to infer the amount of ionized calcium.3 Age and sex were used as covariates in the regression analysis. Statistical analyses were performed using PLINK.19 We used Haploview (version 4.2; http://www.broad.mit.edu/mpg/haploview/) to create Manhattan plots.20 SNAP software was used to annotate the proxies of the top SNP.21
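
For readers unfamiliar with the additive model, the regression described above has the form log10(Ca) = β0 + β1·G + β2·age + β3·sex + ε, where G is the number of copies (0, 1 or 2) of the tested allele. The following Python sketch fits this model for a single SNP by ordinary least squares. It is an illustration only, not the PLINK implementation used in the study; the function names, the 0/1 coding of sex and the normal approximation for the P-value are assumptions.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def additive_snp_regression(calcium, dosage, age, sex):
    """Regress log10(calcium) on allele dosage (0/1/2), age and sex (coded 0/1).

    Returns (beta, se, p) for the SNP term. The P-value uses a normal
    approximation, which is reasonable only for large samples.
    """
    y = [math.log10(v) for v in calcium]
    rows = [[1.0, float(g), float(a), float(s)]
            for g, a, s in zip(dosage, age, sex)]
    k = 4  # intercept, SNP, age, sex
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    resid = [yi - sum(c * v for c, v in zip(coef, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - k)
    inv_col = solve(xtx, [0.0, 1.0, 0.0, 0.0])  # SNP column of (X'X)^-1
    se = math.sqrt(sigma2 * inv_col[1])
    z = coef[1] / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return coef[1], se, p
```

In a genome-wide scan this fit is repeated for every SNP. A negative β for the SNP term, as reported for the lead SNP in this study, means that each additional copy of the tested allele is associated with a lower log10 calcium level.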

SNP prioritization

SNP prioritization was performed using GWASrap (http://jjwanglab.org/gwasrap).22 This software generates a re-prioritized list of genetic variants by combining the original association statistic with a variant prioritization score. A total of 13 345 SNPs with P<0.01 were used as input.

Results

Genome-wide associations with serum calcium levels and their replication

General characteristics of the stage 1 and 2 subjects are summarized in Table 1. The stage 1 set consists of 4558 individuals (2222 men and 2336 women) and the stage 2 set consists of 4093 individuals (1762 men and 2331 women). The mean ages of the stage 1 and 2 subjects were 49±7.8 and 55.7±8.7 years, respectively. The mean serum calcium levels were 9.6±0.46 and 9.6±0.48 in stages 1 and 2, respectively (Table 1). The GWAS for serum calcium levels in Korean individuals was performed with SNPs imputed using HapMap II data. The overall results of the GWAS assuming an additive model are shown as a Manhattan plot (Figure 1). A quantile–quantile plot of the GWAS is shown in Supplementary Figure S1. The genomic control inflation factor (λGC) was 1.04, indicating no evidence of type 1 error inflation.
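
The genomic control inflation factor is conventionally computed as the median of the observed 1-d.f. chi-square association statistics divided by the median expected under the null hypothesis (about 0.455), so values close to 1 indicate little inflation. A minimal Python sketch of that calculation from a list of two-sided P-values follows; the function name is hypothetical and this is not necessarily how the value reported above was obtained.

```python
from statistics import NormalDist, median

# Median of the chi-square distribution with 1 d.f.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(p_values):
    """Genomic control inflation factor from two-sided association P-values.

    Each P-value (0 < p <= 1) is converted back to a 1-d.f. chi-square
    statistic as the square of the corresponding standard normal quantile.
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

For P-values that are uniformly distributed, as expected when there is no association and no stratification, the function returns a value close to 1.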

Table 1 Characteristics of the subjects in stage 1 and 2 data
Figure 1 Manhattan plot showing GWAS results for serum calcium levels in 8642 Korean subjects. The red horizontal line (P<10⁻⁸) denotes the general threshold for genome-wide significance. The arrow head indicates the significant locus (CASR) that passed the threshold. CASR, calcium-sensing receptor; GWAS, genome-wide association study. A full color version of this figure is available at the Journal of Human Genetics journal online.

In the stage 1 discovery analysis, 963 SNPs in 22 loci, including the CASR locus on 3q21.1, showed associations at a significance level of P<10⁻⁴ (Supplementary Table S1). We examined these 963 SNPs in an independent stage 2 replication set (Supplementary Table S2). Of the 963 SNPs, 105 SNPs in 10 loci were consistently significant in both the discovery and replication stages (P<0.05). In a combined analysis of the two sets, 65 of these 105 SNPs, located in seven loci, showed associations at a significance level of P<10⁻⁵ (Table 2). Of the 65 SNPs, a cluster of 34 SNPs in the CASR gene locus showed stronger associations than those in the other loci. Among them, rs13068893 in the CASR gene showed the strongest association (P=3.85 × 10⁻⁸, calculated using the SNP imputation data based on HapMap data; Table 2 and Figure 2). rs13068893 showed significant association in both the stage 1 and stage 2 sets, with consistent effect sizes and directions (Stage 1, P=1.20 × 10⁻⁴ and β=−0.00211; Stage 2, P=7.58 × 10⁻⁵ and β=−0.00204; Supplementary Tables S1 and S2). When the SNP imputation data based on 1000 Genomes data were used, the association pattern was largely consistent with that obtained with the HapMap-based imputation data, and rs13068893 again showed the strongest signal (Supplementary Table S3).
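
The staged selection described above (P<10⁻⁴ in stage 1, P<0.05 in stage 2, P<10⁻⁵ in the combined analysis) amounts to a sequence of filters. The Python sketch below shows that logic under the assumption that per-SNP P-values are available as dictionaries keyed by SNP identifier; the function name, the data layout and the SNP identifiers in the usage example are hypothetical, and the sketch does not check effect direction.

```python
def two_stage_hits(stage1_p, stage2_p, combined_p,
                   discovery_alpha=1e-4, replication_alpha=0.05,
                   combined_alpha=1e-5):
    """Return the sorted SNP IDs that pass all three staged thresholds.

    Each argument is a dict mapping a SNP ID to its P-value in that analysis.
    SNPs missing from stage 2 or the combined analysis are treated as
    non-significant.
    """
    hits = []
    for snp, p1 in stage1_p.items():
        if p1 >= discovery_alpha:
            continue  # not carried forward from the discovery stage
        if stage2_p.get(snp, 1.0) >= replication_alpha:
            continue  # not replicated in stage 2
        if combined_p.get(snp, 1.0) < combined_alpha:
            hits.append(snp)
    return sorted(hits)

# Usage with made-up SNP identifiers and P-values
stage1 = {"snpA": 5e-5, "snpB": 2e-5, "snpC": 3e-3}
stage2 = {"snpA": 1e-3, "snpB": 0.40, "snpC": 1e-4}
combined = {"snpA": 2e-7, "snpB": 8e-4, "snpC": 1e-6}
selected = two_stage_hits(stage1, stage2, combined)
```

Here only snpA is selected: snpB fails replication and snpC does not pass the discovery threshold, even though its combined P-value is small.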

Table 2 SNP loci associated with serum calcium levels in the combined analysis of stages 1 and 2
Figure 2 Regional plot of the SNPs (top) and the linkage disequilibrium relationships among these SNPs (bottom) in the CASR locus. Data are shown for the CASR locus around rs13068893. Diamond-shaped dots represent −log10(P-values) of SNPs, and the green diamond in the linkage disequilibrium plot indicates the most significant SNP. The strength of the linkage disequilibrium relationship (r²) between the most strongly associated SNP and the other SNPs is presented with red color intensities based on JPT+CHB HapMap data. The light blue curve shows recombination rates drawn based on JPT+CHB HapMap data. Green bars represent the coding genes in this region. CASR, calcium-sensing receptor; SNP, single-nucleotide polymorphism. A full color version of this figure is available at the Journal of Human Genetics journal online.

SNP prioritization

To identify SNPs that may affect serum calcium levels even though the P-values for their associations are modest, we conducted a SNP prioritization analysis using GWASrap.22 After prioritization, most of the top-ranked SNPs remained significant and their ranking was consistent with the GWAS data. Interestingly, some SNPs in the CASR gene ranked higher after prioritization. For example, three SNPs in the CASR gene (rs1042636, rs4678176 and rs3749208) rose from the 16th, 17th and 20th positions to the 1st, 5th and 8th, respectively (Supplementary Table S4).

Replication of previously reported loci in the combined data set

We further examined in our data set the four loci (CASR, CSTA, DGKD and GCKR) that were previously reported to be significantly associated with calcium levels in Europeans and Indians.3, 13 The associations of the SNPs in all four loci were replicated in our data set (P<0.05); the SNPs in CASR and CSTA were the most significant in our study (P=3.85 × 10⁻⁸ to 9.70 × 10⁻⁸). Details of all the SNPs in the four loci are listed in Table 3.

Table 3 Associations of the previously reported serum calcium-related loci in the present GWAS

Associations of the lead SNP with calcium-related traits

To explore the possible associations between the top significant SNP in the CASR locus (rs13068893) and calcium-related traits, we examined the associations in the combined data set between rs13068893 and medical histories of myocardial infarction (n=79), osteoarthritis (n=267), hypertension (n=1387), renal disease (n=238), bone mineral density (n=8234) and osteoporosis (n=11). However, none of these traits showed significant associations with the SNP (data not shown).

Discussion

We conducted a two-stage GWAS to identify genetic variations that affect serum calcium levels, which is the largest GWAS of this trait in East Asian populations. In both the stage 1 discovery and stage 2 replication sets, SNPs in or near the CASR gene on chromosome 3 were consistently significant. The signals became stronger in the combined analysis of the stage 1 and 2 data sets. The top significant SNP identified in this study (rs13068893, from the combined analysis) was located in the CASR locus regardless of which data, HapMap or 1000 Genomes, we used for SNP imputation. The CASR locus was reported to be strongly associated with serum calcium levels in previous GWAS in European and Indian populations.3, 14 A large meta-analysis also reported the same finding in a European population.13 However, the top significant SNPs differ: rs13068893 in our study, but rs17251221 in Europeans and rs1801725 in Indians.3, 14 Indeed, the minor allele frequency of rs13068893 is known to vary widely between ethnicities: according to the 1000 Genomes Project (http://www.1000genomes.org), it is 8% in Europeans, 3% in Africans, 49% in Japanese and 44% in Han Chinese. In addition to rs13068893, the allele frequencies of other highly significant SNPs in the CASR locus were also found to vary between ethnicities: low in Europeans and Africans but high in East Asians (Supplementary Table S5). These inter-ethnic differences in the allele frequencies of rs13068893 and other significant SNPs in the CASR locus may explain why these SNPs showed higher significance levels in Koreans but were not even significant in Caucasians. Conversely, the allele frequencies of rs17251221 and rs1801725, previously reported to be significant in Indian and European populations,3, 14 are known to be very low in East Asian populations (both 1% in 1000 Genomes JPT). In addition, linkage disequilibrium profiles around the top significant SNP (rs13068893) showed a marked inter-ethnic difference between East Asians and others (Europeans and Africans) (Supplementary Figure S2). Unfortunately, as probes for rs17251221 and rs1801725 were not present on the Affymetrix Genome-Wide Human SNP Array 5.0 used for the present study, we could not directly investigate the associations of those two SNPs. Similar to this study, several previous studies also reported ethnic-specific results. For example, a meta-analysis of GWAS for serum calcium levels by O'Seaghdha et al.13 showed that some SNPs identified in a European population did not reach genome-wide significance in a Japanese population. In addition to calcium levels, East Asian-specific SNPs have been reported for other traits such as C-reactive protein level23 and bilirubin level.24

Considering the high allele frequency and significance level of rs13068893C>G in the CASR gene, this SNP may have a key role in regulating the serum calcium level. A mutation in the CASR gene was reported to cause hyperparathyroidism and hypocalciuric hypercalcemia.25, 26 Indeed, in the prioritization analysis, we observed that a non-synonymous SNP in CASR (rs1042636) ranked top. rs1042636 was also reported to be associated with diverse diseases such as aortic stenosis.27 Expression quantitative trait loci analysis may provide further insight into the mechanisms underlying the associations discovered through GWAS. We checked this possibility by searching the SCAN database (http://www.scandb.org) for expression quantitative trait loci associations of the top significant SNPs (P<10⁻⁷). However, none of the SNPs showed significant expression quantitative trait loci associations.

The four major serum calcium level-associated loci identified in previous studies of Europeans and Indians, namely CASR, CSTA, DGKD and GCKR, were consistently replicated in this study.3, 13 These results suggest that these SNPs may be universally associated with calcium levels and also support the reliability of our study in the Korean population. Although all four loci were significantly replicated in this study, SNPs clustered in the CASR locus showed higher significance levels than those in the other three loci. This suggests that CASR is likely to be more important for the serum calcium level in East Asians.

In addition to CASR, we observed potential associations in several genes such as PRDM9, GPR39, SLC16A7, BCAT1, RARB, PTPRN2 and ATG4C. Although none of these candidate genes has previously been reported to be associated with serum calcium levels, some of them would be worth larger-scale validation. For example, GPR39 is a member of the G-protein coupled receptor family. Asraf et al.28 found that GPR39 mediates Zn²⁺-dependent Ca²⁺ responses and that endogenous GPR39 is regulated by the expression and activity of CASR, which is another G-protein coupled receptor. In this study, four SNPs in GPR39 showed significant associations (P<10⁻⁶). SNPs in BCAT1 have been reported to be associated with salt sensitivity in the Korean population.29

CASR has previously been suggested to be involved in osteoporosis, coronary heart disease and cardiovascular mortality.30 Therefore, we explored possible associations between the top significant SNP in CASR and calcium-related phenotypes such as myocardial infarction (n=79), osteoarthritis (n=267), hypertension (n=1387), renal disease (n=238), osteoporosis (n=11) and bone mineral density (n=8234). None of the traits showed a significant association with the SNP (data not shown). This result is in agreement with a previous GWAS that found no association of variants in the CASR locus with calcium-related diseases in a European population.3 It might also be due to the small number of related outcomes in our GWAS or to the cross-sectional nature of the serum calcium measurement. A large meta-analysis or a longitudinal study measuring calcium levels repeatedly may help to clarify the associations between calcium-related SNPs and diseases.

In conclusion, through a large-scale GWAS of 8642 Koreans, we identified novel CASR variants and a number of interesting candidate genes that may be related to serum calcium levels in the Korean population. Inter-ethnic differences were suggested for some associated SNPs. We also replicated in our Korean data set a number of SNPs reported by previous studies in genes such as DGKD, GCKR and CASR. Given the significant role played by calcium in many diseases and in cell signaling, further studies with more East Asian subjects, or meta-analyses of such studies, may enable validation of our results and identification of novel genetic loci associated with serum calcium levels.