Genetic variation of SORBS1 gene is associated with glucose homeostasis and age at onset of diabetes: A SAPPHIRe Cohort Study

The SORBS1 gene plays an important role in insulin signaling. We aimed to examine whether common single-nucleotide polymorphisms (SNPs) of SORBS1 are associated with the prevalence and incidence of diabetes, age at onset of diabetes, and related traits of glucose homeostasis. A total of 1135 siblings from 492 ethnic Chinese families were recruited at baseline, and 630 were followed up for 5.19 ± 0.96 years. Nine SNPs (rs7081076, rs2281939, rs3818540, rs2274490, rs61739184, rs726176, rs2296966, rs17849148, and rs3193970) were genotyped and examined. To account for the correlation among subjects within the same family, the generalized estimating equations approach was applied throughout the association analyses. The GG genotype of rs2281939 was associated with a higher risk of diabetes at baseline, an earlier onset of diabetes, and higher steady-state plasma glucose levels in the modified insulin suppression test. The minor allele T of rs2296966 was associated with higher prevalence and incidence of diabetes, an earlier onset of diabetes, and higher 2-h glucose during the oral glucose tolerance test. These two SNPs showed independent associations with age at onset of diabetes as well as with risk of diabetes at baseline. These findings support the view that the SORBS1 gene participates in the pathogenesis of diabetes.

Success rate of genotyping. Information on the 9 tag SNPs of the SORBS1 gene is shown in Table 2. As described in Methods, genotyping of rs2281939 was applied to 938 subjects, of whom 886 yielded usable genotypes (success rate: 94.5%). For the other eight SNPs, the genotyping success rate ranged from 92.9% to 99.5%, with a median of 98.2%. Un-typed or failed genotypes of rs2281939 and failed genotypes of the other SNPs were inferred using HAPLORE 11 and MERLIN 12. Our association analyses were based on the lab-typed and inferred genotype data of 1135 subjects.
Association with risk of DM at baseline and incidence of DM during follow-up. The association of the minor allele of each SNP with DM risk at baseline and with the incidence of DM during follow-up was examined under additive, dominant, and recessive models. As shown in Table 3, significant associations were revealed by rs2281939 and rs2296966. Specifically, subjects carrying the GG genotype of rs2281939 had a significantly higher risk of having DM at baseline (
Association with overall risk of DM and age of DM onset. Since rs2281939 and rs2296966 showed significant associations with risk of DM at baseline/incidence of DM during follow-up, we further investigated whether they were associated with overall risk of DM and age of DM onset. As shown in Table 4
Rs2281939 and rs2296966 are independently associated with age at onset of DM. To investigate whether rs2281939 and rs2296966 are independently associated with age at onset of DM, we re-examined the associations by incorporating these 2 SNPs into the model simultaneously. As shown in Table 6, compared with the associations observed in the 1-SNP models, the respective associations of these two SNPs were strengthened in the 2-SNP model (H.R. for rs2281939: 5.42, 95% C.I.: 2.73-10.78, p = 0.0000014; H.R. for rs2296966: 1.32, 95% C.I.: 1.09-1.59, p = 0.0043), which implied that rs2281939 and rs2296966 were independently associated with age at onset of DM. Similarly, as shown in Supplementary Table S1, these two SNPs were also independently associated with risk of DM at baseline. The estimated D′ of linkage disequilibrium (LD) between rs2281939 and rs2296966 was 0.58.
Re-examining associations of rs2281939 using lab-typed genotypes. Because only 886 subjects had lab-typed genotypes of rs2281939, we re-examined the significant findings above for rs2281939 using only these subjects. Results of the re-examination were consistent with the original findings (Supplementary Table S2).

Discussion
The main finding of this study is that two common SNPs within the SORBS1 gene, rs2281939 and rs2296966, were independently associated with age at onset of DM in a Han Chinese population. The incidence of T2DM in young adults has been increasing worldwide in recent decades [13][14][15]. Fasting glucose, 2-h glucose during OGTT, HDL cholesterol and BMI have been reported to be risk factors for early onset of T2DM in children, and in American Indian adolescents 16. A growing number of studies have examined the influence of genetic variants on the age at diagnosis of T2DM [17][18][19][20][21][22][23][24][25]. To the best of our knowledge, this is the first study on the association of genetic variants of SORBS1 with age at onset of DM. In addition, we found some associations of SORBS1 SNPs with prevalence and incidence of DM as well as with some quantitative traits of glucose homeostasis. It has been reported that SORBS1 is an important adaptor protein in the signaling pathway of insulin-stimulated glucose uptake in the mouse 9. Deletion of the bone marrow-specific Cap gene has also been reported to protect against high fat diet-induced insulin resistance 26. It has also been reported that treatment with an insulin sensitizer, thiazolidinedione, may regulate expression of the CAP gene in insulin-sensitive tissues 27.
In this study, although subjects carrying the GG genotype of rs2281939 (T228A polymorphism) of SORBS1 had lower BMI, they were more insulin-resistant, with higher SSPG (Table 5), which may explain the higher risk of DM and the earlier age at onset of DM in subjects with the GG genotype of rs2281939 (Tables 3 & 4). Although the finding related to BMI was similar to that of a previous study 10, the association between rs2281939 and DM risk observed in this study was not consistent with previous studies 10. Lin et al. reported that subjects with AG or GG genotypes of rs2281939 had a lower risk of obesity and T2DM and lower BMI 10. The discrepancy between the current and previous studies could be explained, at least partially, by the study populations and the genetic models used. First, owing to the study design of SAPPHIRe, about 70% of the study population had hypertension at baseline, which is quite different from the study populations used in Lin et al. 10 and in the DIAGRAM Consortium 28. Second, additive, dominant, and recessive genetic models were all considered in the current study, and significant associations of rs2281939 with baseline DM risk and age at onset of DM were observed uniformly under a recessive model. However, the association under a recessive model was not examined by the previous studies. Specifically, to investigate the association with DM risk, Lin et al. adopted a dominant genetic model 10, while an additive genetic model was used in the GWAS of DIAGRAM 28. Nevertheless, it should be noted that, because of the small number of subjects carrying the GG genotype of rs2281939 in the current study (n = 10, Table 4), the observed association of rs2281939 under a recessive model needs to be confirmed by further studies with larger samples.
In this study, we found that the minor allele of rs2296966 was associated with a higher risk of DM (Tables 3 and 4), a higher incidence of DM during follow-up (Table 3), a younger age of DM onset (Table 4), and a higher level of 2-h glucose during OGTT (Table 5). These associations were observed mainly under an additive genetic model. From the Type 2 Diabetes Knowledge Portal (http://www.type2diabetesgenetics.org/, searched on 3.26.2018), we found that nominally significant associations between rs2296966 and risk of DM were observed in three previous studies from different populations: BioMe AMP T2D GWAS (OR = 0.838, p = 0.036), GWAS SIGMA (OR = 1.13, p = 0.0232), and GoT2D WGS (OR = 1.25, p = 0.0483). Among these, two studies also showed that the minor allele T was associated with a higher risk of T2DM. Therefore, this study not only replicated an association between rs2296966 and DM risk but also suggested that this association may exist in multiple ethnic populations. 29 . In addition to lifestyle and environmental factors, this phenomenon could partially result from genetic factors. In this study, we found that the minor allele of rs2296966 was associated with a younger age of DM onset. According to data from the 1000 Genomes Project phase 3 (https://www.1000genomes.org/) 30, the minor allele frequencies (MAF) of rs2296966 in the Asian and European populations are 0.392 and 0.067, respectively. Since the MAF of rs2296966 in Asians is much higher than that in Europeans, the earlier onset of DM in Asians 31 might be explained partially by the association of rs2296966 with younger age of DM onset found in this study, together with the large difference in MAF between Asian and European populations.
Although the genetic variants of rs2281939 and rs2296966 were both associated with a younger age of DM onset, the 2-SNP model analysis showed that the associations revealed by these two SNPs were independent (Table 6). A similar result was obtained from the 2-SNP model analysis for risk of DM at baseline (Supplementary Table S1). In addition, only rs2281939 was associated with SSPG and BMI, and only rs2296966 was associated with incidence of DM during follow-up and 2-h glucose during OGTT. These observations indicated that the two SNPs affect DM-associated phenotypes in different ways. We note that the estimated D′ of LD between rs2281939 and rs2296966 was 0.58, which indicates a low extent of LD between these two SNPs. The genetic polymorphism of rs2281939 confers a substitution of alanine for threonine in codon 228, which may change the function of SORBS1 and influence the insulin signaling pathway. Rs2296966 is located in the 3′UTR of SORBS1. According to the miRNASNP database 32 (http://bioinfo.life.hust.edu.cn/miRNASNP2/, searched on 4.10.2018), rs2296966 is related to the potential binding sites of two microRNAs. Specifically, the T allele of rs2296966 is predicted to cause gain of the target site of hsa-miR-526b-5p and loss of the target site of hsa-miR-635. Therefore, the substitution of the T allele for the C allele may change the binding site for specific microRNAs, which may in turn alter protein expression or mRNA stability 33 of the SORBS1 gene. This may explain why the two SNPs affect the phenotypes in slightly different ways.
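For readers less familiar with the D′ statistic quoted above, the following minimal sketch shows how the normalized LD coefficient is conventionally computed from haplotype and allele frequencies. The function name and the example frequencies are hypothetical; the haplotype frequencies used in this study are not given in the text, so the sketch does not attempt to reproduce the reported value of 0.58.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Return |D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1
          and allele B at locus 2 (hypothetical input)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    # Raw disequilibrium coefficient: observed minus expected under independence
    d = p_ab - p_a * p_b
    # The normalizing maximum depends on the sign of D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return abs(d) / d_max


# Hypothetical frequencies, for illustration only
print(d_prime(p_ab=0.14, p_a=0.2, p_b=0.4))
```

A value of 0 corresponds to no LD and a value of 1 to complete LD, so intermediate values are read on that scale.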
There are several strengths to this study. First, to the best of our knowledge, this is the first SORBS1 genetic study to investigate a comprehensive set of DM-related traits. We examined the association between SORBS1 SNPs and age at onset of diabetes, in addition to prevalence of diabetes. We also analyzed the association of SORBS1 SNPs with multiple metabolic traits and with incidence of DM. Second, we did not assume a specific mode of genetic inheritance in the association analysis. Instead, we analyzed the association of SNPs with risk of DM at baseline, incidence of DM during follow-up, overall risk of DM, and age of DM onset under additive, dominant, and recessive models, respectively. Third, instead of using the stringent p-value threshold required by Bonferroni correction, we implemented a flexible false discovery rate (FDR)-based approach to deal with the problem of multiple testing. For each association test, in addition to the p-value, we also reported a q-value, which is a measure of significance in terms of FDR 34. Then, by setting a threshold on q-values for declaring significant findings in multiple comparisons, we can control the expected proportion of declared associations that are truly null. In this study, we used a threshold of 0.2, which was also adopted by recent studies 35,36. Using such a less stringent threshold should enable us to find moderate associations while keeping the FDR at a reasonable level.
However, there is a limitation in this study. According to an analysis of the 2000-2009 nationwide health insurance database in Taiwan, the incidence of diabetes in the 20-79-year age group in 2008 was 1.16% 37. In this study, the incidence of diabetes during follow-up for 5.19 ± 0.96 years was 21.38% (4.12%/yr), which is much higher than the incidence in the Taiwanese national database. The probands recruited in this study had hypertension, and the prevalence of hypertension at baseline was high (68.99%). Several studies have supported the view that impaired insulin signaling is involved in the pathogenesis of essential hypertension 38,39. Hence, the higher incidence of diabetes in this study than in the general population may be partially due to the higher prevalence of hypertension at baseline. Therefore, the associations obtained from this study may not be extrapolated to non-hypertensive populations.
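The annualized figure quoted above follows from dividing the cumulative incidence by the mean follow-up time. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the calculation is a crude average rather than a person-years estimate.

```python
def annualized_incidence(cumulative_pct: float, mean_years: float) -> float:
    """Crude average yearly incidence (in %) over the follow-up period."""
    return cumulative_pct / mean_years


# Values reported in the text: 21.38% over a mean follow-up of 5.19 years
study_rate = annualized_incidence(21.38, 5.19)
print(round(study_rate, 2))  # 4.12 (%/yr), matching the reported figure

# Ratio to the 1.16% incidence reported for the national database
print(round(study_rate / 1.16, 1))
```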

In conclusion, we reported the associations of 2 common SNPs of SORBS1 with prevalence and incidence of DM, age at onset of DM, and some quantitative traits of glucose homeostasis. These findings, together with earlier observations in different ethnic groups, support an involvement of the SORBS1 gene in the pathogenesis and clinical phenotypes of DM.

Subjects of the SAPPHIRe study cohort. The Stanford Asia-Pacific Program for Hypertension and Insulin Resistance (SAPPHIRe) was a collaborative family study designed to investigate the genetic determinants of hypertension and insulin resistance in ethnic Chinese and Japanese. The SAPPHIRe study recruited both concordant siblings (all siblings with hypertension) and discordant siblings (at least one hypertensive sibling). Index cases were determined as those with age at onset of 35-60 years or those 60 years of age with corroborating record of their hypertension status before age 60 years 40. In the present study, only siblings from ethnic Chinese nuclear families were selected for genotyping: 1135 siblings from 492 ethnic Chinese nuclear families were genotyped, of whom 630 received a follow-up examination approximately 5 years later. The study was approved by the Institutional Review Boards at all participating sites, including National Taiwan University Hospital, Taipei Veterans General Hospital, Tri-Service General Hospital, and Taichung Veterans General Hospital. The methods were carried out in accordance with the relevant guidelines and regulations. Written informed consent was obtained from all subjects prior to their enrollment in the study.
Phenotyping. Body height (BH) and body weight (BW) were measured at 8 AM after an overnight fast of 8-10 h, and BMI was calculated as BW (kg)/BH² (m²). Each subject received a 75-g oral glucose tolerance test (OGTT) after the anthropometric measurements. Fasting blood samples were obtained for measuring plasma glucose and insulin concentrations. The subjects then drank 75 g of glucose in 300 ml of water within 5 min. Blood samples were taken for plasma glucose and insulin 1 and 2 h after glucose loading 40. Plasma glucose and insulin levels were measured as described previously 40. The insulin resistance index (HOMA-IR), beta cell function index (HOMA-beta), and sensitivity index (HOMA-S) derived from the homeostasis model were calculated as in the previous study 40. DM was defined as fasting plasma glucose ≥7.0 mmol/l (126 mg/dl), or 2-h glucose in OGTT ≥11.1 mmol/l (200 mg/dl), or use of at least 1 antidiabetic agent, or self-reported DM history at baseline or during follow-up. All subjects diagnosed as having DM during follow-up were classified as having T2DM by endocrine specialists based on their BMI, family history of DM, age, and the fact that no one had diabetic ketoacidosis. One subject who developed type 1 DM during follow-up was excluded. Hypertension was defined as systolic blood pressure >140 mmHg, diastolic blood pressure >90 mmHg, or use of at least one medication to control high blood pressure.
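The phenotype definitions above can be written as simple rules. The sketch below is illustrative only: the class, field, and function names are hypothetical, and the HOMA-IR expression is the widely used homeostasis model formula (fasting insulin in μU/ml × fasting glucose in mmol/l / 22.5), which we assume, but cannot confirm from the text, matches the calculation in the cited previous study.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    """One examination of one subject (hypothetical record layout)."""
    fasting_glucose_mmol: float   # fasting plasma glucose, mmol/l
    two_hour_glucose_mmol: float  # 2-h OGTT glucose, mmol/l
    on_antidiabetic_agent: bool = False
    self_reported_dm: bool = False


def has_dm(v: Visit) -> bool:
    """DM definition stated in the Phenotyping paragraph."""
    return (
        v.fasting_glucose_mmol >= 7.0
        or v.two_hour_glucose_mmol >= 11.1
        or v.on_antidiabetic_agent
        or v.self_reported_dm
    )


def has_hypertension(sbp: float, dbp: float, on_bp_medication: bool = False) -> bool:
    """Hypertension definition stated in the text (strict inequalities)."""
    return sbp > 140 or dbp > 90 or on_bp_medication


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: BW (kg) / BH^2 (m^2)."""
    return weight_kg / height_m ** 2


def homa_ir(fasting_insulin_uU_ml: float, fasting_glucose_mmol: float) -> float:
    """Standard HOMA-IR formula (assumed, not confirmed by the text)."""
    return fasting_insulin_uU_ml * fasting_glucose_mmol / 22.5


print(has_dm(Visit(fasting_glucose_mmol=5.4, two_hour_glucose_mmol=11.3)))
print(bmi(81.0, 1.8))
```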
We quantified insulin sensitivity in a subset of 344 participants in the SAPPHIRe cohort at baseline using a modified insulin suppression test 41. In brief, after overnight fasting, a venous catheter was placed in each of the subject's arms. One arm was used for the infusion of octreotide (0.5 μg/min preceded by a 25 μg bolus), insulin (25 mU/m²/min), and glucose (40 mg/m²/min). Plasma glucose levels were measured between 150 and 180 min after the infusion. The mean of the glucose levels was termed steady-state plasma glucose (SSPG); this provided a direct measure of the ability of insulin to mediate the disposal of infused glucose 41.
Selection of tag SNPs and genotyping. Initially, eight tag SNPs were selected from the HapMap Chinese in Beijing (CHB) data bank (http://www.hapmap.org) using the Tagger program in Haploview 4.1 (http://www.broad.mit.edu/mpg/haploview/) 42, with a minor allele frequency threshold of 5% and an r² threshold of 0.8 (Table 2). Total genomic DNA was purified from peripheral blood leukocytes using a Puregene DNA extraction kit (Minneapolis, MN, USA), according to the manufacturer's protocol. Genotyping was performed using Applied Biosystems SNPlex assays. Later, rs2281939, which has been shown to be associated with insulin resistance, obesity and T2DM 10, was also selected for this study. Among the 1135 study subjects, genotyping of rs2281939 was applied only to a subset of 938 subjects. Un-typed or failed genotypes were inferred using the software HAPLORE 11 and MERLIN 12. All SNPs were in Hardy-Weinberg equilibrium in the controls (all p > 0.01), as determined by the Haploview program 42.
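To illustrate what a Hardy-Weinberg equilibrium check involves, the sketch below implements the textbook one-degree-of-freedom chi-square test on genotype counts. This is an assumption for illustration: the test implemented in Haploview may differ (for example, an exact test), and the function name and the example counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Asymptotic chi-square test (1 df) for Hardy-Weinberg equilibrium.

    Returns (chi-square statistic, p-value) for observed genotype counts.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("HWE test is undefined for a monomorphic SNP")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only
chi2, p_value = hwe_chi_square(30, 50, 20)
print(chi2, p_value, p_value > 0.01)
```

Under the criterion stated in the text, a SNP would be retained when the resulting p-value exceeds 0.01.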
All methods were carried out in accordance with the approved guidelines, and all experimental protocols were approved by committees at National Taiwan University Hospital, Taipei Veterans General Hospital, Taichung Veterans General Hospital, and Tri-Service General Hospital.
Statistical analysis. Continuous variables were presented as mean values ± standard deviation (SD), and binary variables were presented as count (percentage) of a specific category. The associations of individual SORBS1 SNPs with the risk of DM at baseline and the incidence of DM during follow-up under different genetic models were assessed by implementing multiple logistic regression analyses and were adjusted for age, gender, BMI, and hypertension status. Furthermore, we also investigated the associations with overall risk of DM, which were adjusted for birth year, gender, BMI, and hypertension status. The multiple linear regression analyses were used to examine the associations of SNPs with quantitative traits including BMI, fasting glucose/insulin, 2-h glucose/insulin during OGTT, HOMA-IR, HOMA-B, HOMA-S, and SSPG. For BMI, the covariates used for adjustment included age, gender, and hypertension status. For other quantitative traits, the same covariates as well as BMI were used for adjustment. Since the subjects analyzed in this study were siblings from nuclear families, a generalized estimating equations (GEE) approach 43 was applied throughout the logistic and linear regression analyses to deal with the correlated data within the same families. The GEE approach is an extension of the quasi-likelihood method which takes within-cluster correlations into account by using a set of equations and sums these equations over all clusters to obtain the population-averaged estimates of the parameters 44 . The GEE analyses were implemented using IBM SPSS version 19.0.
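The additive, dominant, and recessive genetic models referred to above differ only in how a genotype is converted into a numeric predictor before regression. A minimal sketch of the conventional coding follows; the function name is hypothetical, and genotypes are assumed to be two-letter strings such as "AG".

```python
def encode_genotype(genotype: str, minor_allele: str, model: str) -> int:
    """Code a biallelic genotype as a regression predictor.

    additive:  number of minor alleles (0, 1, or 2)
    dominant:  1 if at least one minor allele is carried, else 0
    recessive: 1 only if two minor alleles are carried, else 0
    """
    if len(genotype) != 2:
        raise ValueError("genotype must contain exactly two alleles")
    count = genotype.count(minor_allele)
    if model == "additive":
        return count
    if model == "dominant":
        return int(count >= 1)
    if model == "recessive":
        return int(count == 2)
    raise ValueError(f"unknown genetic model: {model}")


# rs2281939 with minor allele G: only GG is coded 1 under the recessive model
for g in ("AA", "AG", "GG"):
    print(g, [encode_genotype(g, "G", m) for m in ("additive", "dominant", "recessive")])
```

The coded variable would then enter the logistic, linear, or Cox model together with the covariates listed in the text.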
We used a Cox proportional hazards model 45 to test for associations between specific SORBS1 SNPs and age at onset of DM, adjusted for birth year, gender, BMI, and hypertension status. Cox regression analysis was performed using the R package "survival" [version 2.38-3, downloaded from The Comprehensive R Archive Network (CRAN)], and the robust sandwich variance estimator 46 was used to deal with the within-family correlation of age at onset of DM. Furthermore, the Schoenfeld residuals test was performed to examine whether our data satisfied the proportional hazards assumption essential for Cox regression analysis 47.
To deal with the problem of multiple testing, a measure of significance in terms of FDR, called the q-value, was calculated by the QVALUE software 34 for each association test. In this study, an association with a q-value less than 0.2 was considered significant under multiple comparisons. With this q-value threshold, the expected proportion of false positive findings among the declared associations is 20%.
Data availability. The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
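As an illustration of the q-value thresholding described in the statistical analysis, the sketch below computes Benjamini-Hochberg adjusted p-values, which coincide with q-values when the proportion of true null hypotheses is fixed at 1. This is a simplifying assumption: the QVALUE software estimates that proportion from the data, so its q-values can be smaller. The function name and the p-values are hypothetical.

```python
def bh_q_values(p_values: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values (q-values with pi0 fixed at 1)."""
    m = len(p_values)
    # Indices of the tests ordered from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical p-values from four association tests
p_vals = [0.01, 0.04, 0.03, 0.5]
q_vals = bh_q_values(p_vals)
significant = [q < 0.2 for q in q_vals]  # threshold used in the text
print(q_vals, significant)
```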