Genetic polymorphisms of PCSK2 are associated with glucose homeostasis and progression to type 2 diabetes in a Chinese population

Proprotein convertase subtilisin/kexin type 2 (PCSK2) is a prohormone processing enzyme involved in insulin and glucagon biosynthesis. We previously found that genetic polymorphism of PCSK2 on chromosome 20 was responsible for the linkage peak of several glucose homeostasis parameters. The aim of this study was to investigate the association between genetic variants of PCSK2 and glucose homeostasis parameters and incident diabetes. A total of 1142 Chinese participants were recruited from the Stanford Asia-Pacific Program for Hypertension and Insulin Resistance (SAPPHIRe) family study, and 759 participants were followed up for 5 years. Ten SNPs of the PCSK2 gene were genotyped. Variants of rs6044695 and rs2284912 were associated with fasting plasma glucose, and variants of rs2269023 were associated with fasting plasma glucose and 1-hour plasma glucose during OGTT. Haplotypes of rs4814605/rs1078199 were associated with fasting plasma insulin levels and HOMA-IR. Haplotypes of rs890609/rs2269023 were also associated with fasting plasma glucose, fasting insulin and HOMA-IR. In the longitudinal study, individuals carrying the TA/AA genotypes of rs6044695 or the TC/CC genotypes of rs2284912 had a lower incidence of diabetes during the 5-year follow-up. Our results indicate that PCSK2 gene polymorphisms are associated with pleiotropic effects on various traits of glucose homeostasis and incident diabetes.


Association of PCSK2 genetic variants with various traits of glucose homeostasis. The SNPs and their location in the PCSK2 gene, their position in the physical map and minor allele frequency are shown in Supplementary Table S1. All the following traits of glucose homeostasis were adjusted for age, gender and body mass index (BMI) and analyzed by family-based association test (FBAT). Genetic variants of rs6044695 and rs2284912 were negatively associated with fasting plasma glucose (FPG) concentration. Genetic variants of rs2269023 were positively associated with FPG and 1-hour plasma glucose concentration (1 h-PG) during OGTT (Table 2). All associations with a q value < 0.05 were considered statistically significant.
Association of SNP haplotypes of PCSK2 with various traits of glucose homeostasis. After analysis with Haploview 4.1, the 10 tag SNPs of PCSK2 were divided into 2 haplotype blocks (block 1: rs4814605/rs1078199, block 2: rs890609/rs2269023, Fig. 1). Haplotypes of rs4814605/rs1078199 (block 1) were associated with fasting plasma insulin concentration (FINS) and HOMA-IR. Haplotypes of rs890609/rs2269023 (block 2) were associated with FPG, FINS and HOMA-IR (Table 3). Both the haplotype-specific P value and the global P value were derived from 10,000 permutations. A null hypothesis was rejected if the permuted global P value was <0.05.
Specific SNPs of PCSK2 were associated with progression from normoglycemia to diabetes during a 5-year follow-up. We found that individuals carrying the TA/AA genotypes of rs6044695 or the TC/CC genotypes of rs2284912 had a lower incidence of diabetes (Table 4). As shown in Table 5, rs4814597, rs1609659, rs2208203, and rs2021785 were also associated with type 2 diabetes or glucose homeostatic traits according to the GWAS database 28. Since we did not have genotype data for these four SNPs in this study, we reexamined these additional established loci from GWAS in Table 5 by imputing their genotypes using the MACH imputation package 29,30 based on 1000 Genomes data. The results are presented in Supplementary Table S2 with a serial number starting with A. None of the imputed SNPs showed evidence of association with incident diabetes. The Haploview linkage disequilibrium (LD) graph of the PCSK2 gene (10 genotyped SNPs in this study and 4 imputed SNPs: rs4814597, rs1609659, rs2208203, and rs2021785) is shown in Supplementary Fig. S1.

Discussion
In our study, we found significant associations between several SNPs and haplotypes of PCSK2 and various traits of glucose homeostasis, including FPG, 1 h-PG, FINS and HOMA-IR. Furthermore, specific SNPs of PCSK2 were also associated with progression to diabetes during a 5-year follow-up. In our previous study 31, we reported the potential pleiotropy of the locus at 37 cM on chromosome 20 on each pair of traits, such as fasting insulin/HOMA-beta and HOMA-IR/HOMA-beta, which supports our present findings that PCSK2 gene polymorphisms are associated with pleiotropic effects on these metabolic variables.
PCSK2 is a type II proinsulin-processing enzyme, and it cleaves the proinsulin molecule on the COOH-terminal side of the dibasic peptide, Lys64-Arg65, which joins the C-peptide and A-chain domains 32. Defects affecting the catalytic activity of the prohormone-processing enzymes have been found to be associated with obesity and other metabolic disorders 33,34. The etiology of hyperproinsulinemia is thought to be pancreatic β cell dysfunction, which is manifested in part by inadequate cleavage of proinsulin. Previous studies have shown that increased concentrations of proinsulin are a significant predictor of the development of T2DM in several ethnic groups 35-38. Furuta et al. 39 reported that increased levels of proinsulin and split proinsulin were detected in pancreatic islet cells isolated from homozygous pcsk2 null mice.
Several studies have reported that genetic polymorphisms of PCSK2 are associated with either T2DM or various glucose homeostasis parameters (Table 5). A significant difference in the allele frequency distribution of a simple CA tandem-repeat DNA polymorphism (STRP) in intron 2 of PCSK2 has been reported in a case-control study of T2DM patients and normal controls in a Japanese population 26 (Table 5). Jonsson et al. recently reported that the C allele of PCSK2 rs2208203 in intron 2 was associated with reduced insulin secretion, measured as the corrected insulin response, as well as with the disposition index 40. The variant was also associated with lower fasting glucagon levels in non-diabetic individuals with FPG over 5.5 mmol/l 40 (Table 5). Neither the above microsatellite nor rs2208203 in intron 2 was genotyped in this study. According to imputation analysis based on 1000 Genomes data, rs2208203 was not associated with incident T2DM.
A more recent genome-wide association study (GWAS) on T2DM in African American families also showed linkage to chromosome 20p in a subset with a later age at diagnosis. The PCSK2 gene is within the 1-logarithm of odds (LOD) interval of this linkage peak. Association with T2DM was observed among 4 SNPs: rs2021785, rs1609659, rs4814597 and rs2269023 25 (Table 5). A recent report showed that an association of the risk allele of rs2021785 at PCSK2 with T2DM also existed in a Han Chinese population 27 (Table 5). Rs2021785, rs1609659, and rs4814597 were not genotyped in this study. According to imputation analysis based on 1000 Genomes data, these three imputed SNPs were not associated with incident T2DM. In this study, however, rs2269023 was associated with FPG and 1-hour PG during OGTT in a non-diabetic Han Chinese population (Table 2), consistent with the earlier report. Therefore, rs2269023 may play an important role in the regulation of glucose homeostasis in different ethnic groups. We further searched the open GWAS Central database 28 for associations between the 10 genetic variants of PCSK2 investigated in this study and related metabolic phenotypes in Caucasian populations. Significant associations were found between rs2206447 and T2DM (P = 0.008, FUSION Study), and between rs6080705 and HOMA-beta (P = 0.008588), HOMA-IR (P = 0.02582) and fasting insulin (P = 0.01508) (https://www.gwascentral.org/, searched on 6.10.2014) (Table 5). However, these associations could not be replicated in the Han Chinese population in this study. Furthermore, genetic variants of rs6044695 and rs2284912 were associated with both baseline FPG and progression to T2DM during the 5-year follow-up in this study. The association observed at baseline was therefore also supported by the longitudinal follow-up. To the best of our knowledge, this is the first study to report that genetic variants of PCSK2 are associated with incident T2DM.
This study has several strengths. First, our study used a family-based design, which controls for population stratification, together with a systematic approach to capturing common genetic variation. Second, we adopted q-values as our measure of significance in order to reduce false-positive results derived from multiple tests. The q-value is a false-discovery rate (FDR)-based measure of significance used in genome-wide studies. Most importantly, a systematic use of q-values in genome-wide tests of significance will yield a clear balance of false-positive results to true-positive results and provide a standard measure of significance that can be universally interpreted 41. Third, this study examined SNPs associated with the incidence of diabetes rather than its prevalence. A limitation of this study, however, is the small number of incident diabetes cases.
In conclusion, several genetic variants and haplotypes of PCSK2 were associated with various traits of glucose homeostasis and progression to diabetes. These findings, together with several earlier observations in different ethnic groups, support an involvement of the PCSK2 gene in the pathogenesis of T2DM.

Study population of the SAPPHIRe study cohort. The Stanford Asia-Pacific Program for Hypertension and Insulin Resistance (SAPPHIRe) was a collaborative study that was part of the Family Blood Pressure Program of the National Heart, Lung and Blood Institute of the National Institutes of Health, meant to investigate the genetic determinants of hypertension and insulin resistance in Chinese and Japanese populations. The study collected over 1,300 sib pairs that were either concordant or discordant for high blood pressure. Detailed descriptions of the study cohort were published in our previous work 42,43. In brief, subjects were aged between 35 and 60 years and of Chinese or Japanese ancestry. Hypertension was defined as systolic blood pressure >160 mm Hg, diastolic blood pressure >95 mm Hg, or use of 2 medications for high blood pressure (stage II hypertension). Subjects taking one medication for high blood pressure with a systolic blood pressure >140 mm Hg or a diastolic blood pressure >90 mm Hg were also included. Low-normal blood pressure was defined as blood pressure in the bottom 30% of the age- and sex-adjusted blood pressure distribution. Individuals with chronic illnesses such as diabetes, cancer, or diseases of the heart, liver, or kidney were excluded. In this study, 1142 Chinese participants were recruited from the SAPPHIRe study, and 759 participants received a 5-year follow-up. The institutional review board of each participating site (National Taiwan University Hospital, Taipei Veterans General Hospital, Taichung Veterans General Hospital, and Tri-Service General Hospital) approved all the experiments in this study. Informed consent was obtained from all subjects.
Phenotyping. The participants underwent anthropometric measurements at 8 A.M. after an 8-10 h overnight fast. Each subject then underwent a 75-g OGTT. Fasting blood samples were collected for the measurement of plasma glucose and insulin. Then, 75 g glucose monohydrate (in 300 ml water) was given to the subject to drink within 5 minutes. Blood samples were taken for plasma glucose and insulin 1 and 2 hours after glucose loading. The subjects were not allowed to eat or drink until the end of the test 7. Plasma glucose and insulin levels were measured as described previously 7. HOMA-IR and HOMA-beta were derived from the homeostasis model as in the previous study 7.

Selection of tagSNPs and genotyping. Ten SNPs were selected with minor allele frequencies of more than 10% at r² = 0.7, and that captured 80% of alleles of PCSK2. SNP genotyping was performed using the GenomeLab SNPstream genotyping platform (Beckman Coulter, Fullerton, CA) and its accompanying SNPstream software suite. ASPEX software was applied to examine Mendelian inconsistencies. When an error was found, the marker data were converted to missing; less than 1% of the marker data were converted to missing in this study. All the methods were carried out in accordance with the approved guidelines. All experimental protocols were approved by the committees of National Taiwan University Hospital, Taipei Veterans General Hospital, Taichung Veterans General Hospital, and Tri-Service General Hospital.
Statistical analysis. All data were summarized as mean values ± S.D. unless otherwise specified.
Pairwise linkage disequilibrium (LD) measures D′ and r² were estimated to assess LD between SNPs in the PCSK2 gene. The structure of the haplotype block was evaluated using the confidence interval method developed by Gabriel et al. and implemented in the Haploview program 45. The association of PCSK2 SNPs and haplotypes with metabolic phenotypes was analyzed using the family-based association test (FBAT) 46. The trait residuals were obtained from generalized linear models adjusted for age, gender, center, drug, environmental factors (i.e., smoking, drinking and sedentary lifestyle), and BMI, and were then imported into FBAT for association analysis. For each association, we derived a q-value 41 that was calculated using the statistical package SAS version 9.1. The q-value has been proposed as an FDR-based measure of significance for multiple testing 41. FDR is the expected proportion of Type I errors among the rejected hypotheses. The q-value is defined as an analog of the p-value that incorporates FDR-based multiple testing correction 41. Namely, the q-value of a test is the minimum FDR at which that test can be called significant (i.e., the expected proportion of false positives incurred when calling it significant). A p-value threshold of 0.05 implies that 5% of all truly null tests will result in false positives, while an FDR-adjusted p-value (or q-value) threshold of 0.05 implies that 5% of significant tests are expected to be false positives. We also used the proportional hazard model to analyze whether the presence or absence of specific SNPs was associated with the progression from normoglycemia to diabetes during a 5-year follow-up. A null hypothesis was rejected if the q-value was <0.05. We presented the hazard ratio of the allelic effect from the major allele (A) for each SNP based on Cox regression models. A Cox regression model is a regression-based method for exploring the associations between survival data and explanatory variables. It provides an estimate of the hazard ratio and its confidence interval between two groups. In the present study, the survival data are the person-years for diabetes incidence during the 5-year follow-up period, and the explanatory variables of interest are the individual SNPs. Proportional hazards regression assumes that the hazard ratio is constant over time. Therefore, we conducted Schoenfeld's residuals test 47 to check the proportional hazard assumption for each SNP. The proportional hazard assumption was not rejected for any SNP, suggesting that the assumption is reasonable for all the SNPs in the Cox regression analysis (Supplementary Table S3).
We obtained haplotype-specific and global P values by a permutation test, using 10,000 permutations when analyzing the family-based association of PCSK2 haplotypes with various traits of glucose homeostasis. To calculate permutation-based P values, the phenotype labels are randomly shuffled, all the tests are recalculated on the reshuffled data set, and the smallest P value among these tests is recorded. The procedure is repeated 10,000 times to construct an empirical frequency distribution of the smallest P values. The empirical adjusted P value (P*) is then given by P* = (r + 1)/(n + 1), where n is the number of replicate samples that have been simulated and r is the number of these replicates that produce a test statistic greater than or equal to that calculated for the actual data (equivalently, a smallest P value no larger than the observed one). A null hypothesis was rejected if the permuted P value was <0.05 48.
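The permutation procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the FBAT implementation: it treats individuals as unrelated (ignoring family structure), uses an absolute difference in trait means as a stand-in test statistic, and takes the largest statistic across tests in place of the smallest P value. The function names and the toy data are hypothetical.

```python
import random


def mean_diff(trait, carrier):
    """Absolute difference in mean trait between carriers and non-carriers."""
    a = [t for t, c in zip(trait, carrier) if c]
    b = [t for t, c in zip(trait, carrier) if not c]
    return abs(sum(a) / len(a) - sum(b) / len(b))


def max_stat(trait, carrier_sets):
    """Largest statistic across all tests (plays the role of the smallest P)."""
    return max(mean_diff(trait, c) for c in carrier_sets)


def permutation_p(trait, carrier_sets, n_perm=10000, seed=0):
    """Empirical adjusted P value P* = (r + 1) / (n + 1)."""
    rng = random.Random(seed)
    observed = max_stat(trait, carrier_sets)
    shuffled = list(trait)
    r = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # shuffle phenotype labels, keep genotypes fixed
        # Small tolerance guards against floating-point ties.
        if max_stat(shuffled, carrier_sets) >= observed - 1e-12:
            r += 1
    return (r + 1) / (n_perm + 1)


# Toy data: 12 individuals, 2 hypothetical haplotype carrier indicators.
trait = [5.0, 5.1, 5.2, 5.0, 5.1, 5.3, 6.0, 6.1, 6.2, 6.0, 6.3, 6.1]
carrier_sets = [
    [False] * 6 + [True] * 6,   # carriers have clearly higher trait values
    [True, False] * 6,          # carriers unrelated to the trait
]
p_star = permutation_p(trait, carrier_sets, n_perm=2000)
print(p_star)
```

Adding 1 to both the numerator and the denominator counts the observed data set as one of the replicates, so P* can never be exactly zero and its smallest attainable value is 1/(n + 1).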