Introduction

Type 2 diabetes (T2D) has become one of the fastest growing public health problems worldwide. T2D affects at least 6% of the world's population, and its prevalence is expected to double by 2025, with the greatest increases in developing countries.1, 2 In China, approximately 92.4 million adults were estimated to have T2D during 2007–2008, with an increased prevalence in the young and middle-aged population.3 T2D is a complex polygenic disorder characterized by insulin resistance and deficient pancreatic beta-cell function.4 Although environmental factors have clearly contributed to the rise in prevalence of T2D, genetic factors also have an important impact on T2D susceptibility.5

Over the past several years, genome-wide association studies (GWAS) have identified dozens of new susceptibility loci for T2D,6 including a cluster of loci on chromosome 6.7, 8, 9, 10, 11, 12, 13, 14 In 2007, four GWAS in Caucasian populations consistently identified CDKAL1 (6p22.3) as a novel T2D susceptibility gene. Intronic variants (rs10946398, rs7756992, rs7754840 and rs9465871) in CDKAL1 (cyclin-dependent kinase 5 (CDK5) regulatory subunit-associated protein 1-like 1) were significantly associated with the risk of T2D.7, 8, 9, 10 Subsequently, four further single nucleotide polymorphisms (SNPs) (rs4712524, rs4712523, rs6931514 and rs10440833) in CDKAL1 were also reported to increase the risk of T2D in GWAS of Japanese and Caucasian populations.11, 12, 13, 14 In addition, Zeggini et al.7, 11 detected two further T2D susceptibility signals at 6p21.1 (marker SNPs: rs9472138 and rs9369425) near vascular endothelial growth factor A (VEGFA) in a Caucasian T2D GWAS and a meta-analysis of T2D GWAS. However, the two SNPs failed to replicate in the final validation stage, and further studies are required to establish their association with increased T2D risk.

CDKAL1 resides at chromosome 6p22.3 and encodes a 579-residue, 65 kD protein whose molecular function is unclear. However, CDKAL1 shares considerable protein domain and amino-acid homology with CDK5 regulatory subunit-associated protein 1 (CDK5RAP1), a known inhibitor of CDK5 activation.15 CDK5 has been implicated in the regulation of pancreatic beta-cell function through the formation of p35/CDK5 complexes that downregulate insulin expression.16 VEGFA is located downstream of CDKAL1 on the same chromosome (6p21.1). As a major member of the VEGF family, VEGFA is a mitogen that acts specifically on endothelial cells and has various effects, including mediating increased vascular permeability and inducing angiogenesis, cell growth, migration and inhibition of apoptosis.17 T2D is commonly accompanied by increased expression of VEGF in several tissues (for example, retina and glomeruli) as a response to hyperglycemia and tissue ischemia, ultimately leading to complications such as retinopathy and diabetic nephropathy.18 Together, these findings support the biological plausibility that genetic variants of CDKAL1 and VEGFA may affect the risk of T2D.

To date, most GWAS have been conducted in populations of non-Chinese ancestry, and the contributions of these variants to T2D risk in Han Chinese are poorly characterized. Some of the most significant SNPs on chromosome 6 have been successfully replicated in case–control studies of Chinese populations, but no study has included all the SNPs on chromosome 6 that GWAS have identified as associated with T2D risk. For instance, the association of SNP rs9472138 near VEGFA with the risk of T2D has not been confirmed in Chinese populations since its discovery by GWAS, and the minor allele frequency of this variant is lower in Chinese (0.116), Japanese (0.081) and Africans (0.137) than in Caucasians (0.239), suggesting that the association with T2D may be heterogeneous across populations. Here, we performed a case–control study including 2925 T2D cases and 3281 controls in a Han Chinese population to further examine the associations between these SNPs on chromosome 6p21.1 and 6p22.3 and T2D risk.

Materials and methods

Study subjects

We performed a case–control study of T2D in a Chinese population, with samples drawn from two sources. The first set (1200 T2D cases and 1200 normoglycemic controls) was recruited from a cross-sectional survey of more than 21 000 participants aged over 30 years in a community-based non-communicable disease screening program conducted in Wuxi City, Jiangsu Province, southeastern China, from April to July 2007. The second set (1725 cases and 2081 controls) was selected from an ongoing, population-based cohort study of about 40 000 subjects in Changzhou and Nantong cities, Jiangsu Province, conducted between 2004 and 2008. All cases and controls were unrelated ethnic Han Chinese. Cases were defined by the 2003 American Diabetes Association (ADA) criteria (fasting blood glucose, FBG ≥7.0 mmol l−1) or by a previous diagnosis of T2D. Controls were randomly selected from subjects with normal fasting glucose, defined as FBG <5.6 mmol l−1, and were frequency-matched to cases on age at interview, sex and residential area (urban or rural).19, 20 Subjects who self-reported diabetes, hypertension, coronary heart disease or cancer at baseline or during follow-up were excluded from the controls. After written informed consent was obtained from every subject, trained interviewers completed a questionnaire to collect information on demographic characteristics, risk factors, physical activity and disease history. At the same time, physical examinations including height, weight and blood pressure were conducted, and the average of two blood pressure measurements was used. BMI was calculated as weight (in kg) divided by height squared (in m2), and overweight or obesity was defined as BMI ≥24 kg m−2.21 Approximately 5 ml of blood was collected from each participant early in the morning after an overnight fast. FBG, triglycerides, cholesterol and high-density lipoprotein cholesterol were measured enzymatically with a Hitachi 7180 Biochemistry Auto-analyzer (Hitachi High-Tech Science Systems Corporation, Hitachinaka, Japan) following the manufacturer's instructions.
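The case, control and overweight definitions above can be written out as simple rules. The following is a minimal sketch only, assuming that the FBG cutoff of 7.0 mmol/l and the BMI cutoff of 24 kg/m2 are inclusive (at or above); the function names and the "excluded" label for intermediate glucose values are illustrative assumptions, not part of the study protocol.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kg divided by height in metres squared."""
    return weight_kg / height_m ** 2


def is_overweight_or_obese(weight_kg: float, height_m: float,
                           cutoff: float = 24.0) -> bool:
    """Overweight/obesity flag, assuming an inclusive cutoff of 24 kg/m2."""
    return bmi(weight_kg, height_m) >= cutoff


def glycemic_status(fbg_mmol_l: float, previously_diagnosed: bool = False) -> str:
    """Classify a subject by fasting blood glucose (mmol/l).

    'case'             : prior T2D diagnosis or FBG at or above 7.0 (assumed inclusive)
    'eligible_control' : FBG below 5.6
    'excluded'         : anything in between (hypothetical label)
    """
    if previously_diagnosed or fbg_mmol_l >= 7.0:
        return "case"
    if fbg_mmol_l < 5.6:
        return "eligible_control"
    return "excluded"


if __name__ == "__main__":
    print(round(bmi(70.0, 1.75), 2))
    print(glycemic_status(7.4), glycemic_status(5.1), glycemic_status(6.2))
```

The other control exclusions described above (self-reported hypertension, coronary heart disease or cancer) would be applied separately and are not modelled here.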

In total, 2925 cases and 3281 controls who completed the interviews and physical examinations and donated blood samples with biochemical measurements were included in the final analysis. This study was approved by the Institutional Review Board of Nanjing Medical University (Nanjing, China).

SNP selection and genotyping

Based on recent GWAS of T2D, we identified eight SNPs in CDKAL1 (6p22.3) showing associations with the risk of T2D: rs4712523, rs10946398, rs7754840, rs7756992, rs9465871, rs10440833, rs4712524 and rs6931514. The minor allele frequencies of all eight SNPs were higher than 0.05 in the Chinese Han population (CHB). Based on the HapMap database (phase II, 8 November, on NCBI B36 assembly, dbSNP b126), we selected two representative SNPs (rs4712523 and rs7756992) with r2<0.8 among the first six SNPs in CHB (Figure 1). The last two SNPs (rs4712524 and rs6931514), which had no linkage disequilibrium (LD) information in HapMap, were also selected for genotyping. In addition, SNPs rs9472138 and rs9369425, which lie close to VEGFA (6p21.1) and were reported to be significantly associated with T2D risk, were in complete LD (r2=1.0); we selected rs9472138 for validation because its reported association was stronger than that of rs9369425. In total, five SNPs (rs4712523, rs7756992, rs4712524, rs6931514 and rs9472138) were included in the present study.

Figure 1

Overview of the GWAS-reported loci on 6p22.3 and 6p21.1 associated with T2D risk. The upper panel shows a view of the genomic region in part of 6p22.3 (20 400 000–21 000 000) and 6p21.1 (43 840 000–43 950 000) from the UCSC browser Build 36 assembly (hg18), marked with the nine SNPs reported to be associated with the risk of T2D in GWAS. The lower panel shows the LD plot (20 751 201–20 836 492) of intron 5 of the CDKAL1 gene in subjects of Chinese descent (blocks were defined by confidence intervals). The lower left thumbnail shows the r2 values between the six reported SNPs in CDKAL1 (rs6931514 and rs4712524, which have no LD information in the HapMap database, were included directly in our study).

Genomic DNA was extracted from peripheral blood leukocytes by proteinase K digestion followed by a phenol/chloroform-based method. Two genotyping platforms were used. For the first set of 1200 T2D cases and 1200 controls, we used the TaqMan OpenArray Genotyping System (Life Technologies, Carlsbad, CA, USA), a medium-throughput genotyping platform. Normalized DNA samples were loaded and amplified on 48-sample arrays following the manufacturer's protocol. Equal numbers of cases and controls were assayed, and two no-template controls on each chip were used for quality control. The overall call rates ranged from 98.5 to 99.3% for this assay. For the second set of 1725 cases and 2081 controls, all five SNPs were genotyped using the iPLEX Sequenom MassARRAY platform (Sequenom, Inc., San Diego, CA, USA). Genotyping was performed blind to the subjects’ case or control status, and two no-template controls on each 384-sample array chip were used for quality control. The genotype call rates for the Sequenom assay were all higher than 99.3%. Ninety-six randomly selected duplicate samples were genotyped on both platforms, and the concordance was 100%.

Statistical analyses

Student's t-test (for continuous variables) or the χ2-test (for categorical variables) was used to analyze differences in selected demographic variables, risk factors and frequencies of variant genotypes between the cases and controls. Hardy–Weinberg equilibrium was tested among the controls by a goodness-of-fit χ2-test comparing the observed genotype frequencies with the expected ones. LD between SNPs was evaluated using Haploview version 4.1 (Broad Institute, Cambridge, MA, USA). Odds ratios (ORs) and 95% confidence intervals (CIs) were computed by multivariate logistic regression with adjustment for age, sex and BMI to estimate the associations between each SNP and the risk of T2D. Conditional regression analyses on each SNP were conducted by logistic regression with adjustment for age, sex, BMI and any of the other four SNPs in an additive genetic model.11, 22 The haplotypes of genotyped SNPs were inferred by PHASE 2.1. Between-group heterogeneity in the stratified analysis was assessed by the χ2-based Q test. We used PS 3.0.2 software (Dupont WD and Plummer WD, Nashville, TN, USA) to perform the power estimation. All statistical analyses were performed with PLINK 1.07 (Purcell S, Cambridge, MA, USA) and STATA 11.1 software (StataCorp, College Station, TX, USA). P<0.05 was the criterion of statistical significance, and all statistical tests were two sided.
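For readers unfamiliar with the Hardy–Weinberg goodness-of-fit test mentioned above, the following is a minimal sketch of the standard 1-degree-of-freedom χ2 calculation from genotype counts in controls. It illustrates the textbook formula and is not the software routine used in the study; the genotype counts in the usage lines are hypothetical.

```python
import math


def hwe_chi2(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Goodness-of-fit chi-square test for Hardy-Weinberg equilibrium.

    Takes counts of the three genotypes (AA, AB, BB) and returns
    (chi-square statistic, P value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic SNP: HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical control genotype counts, for illustration only.
    chi2, p_value = hwe_chi2(30, 40, 30)
    print(f"chi2={chi2:.3f}, P={p_value:.4f}")
```

A P value above 0.05 indicates that the observed genotype frequencies do not depart significantly from Hardy–Weinberg expectations.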

Results

The distributions of selected characteristics of the 2925 T2D cases and 3281 controls are shown in Table 1. There were no significant differences in sex, smoking or drinking status between the cases and the controls (P=0.958, 0.556 and 0.336, respectively). However, the mean age of the cases (58.21±10.11) was higher than that of the controls (56.57±9.88). As expected, compared with the control subjects, the T2D cases had significantly higher BMI, FBG, triglycerides, cholesterol, systolic blood pressure (SBP) and diastolic blood pressure (DBP), and lower high-density lipoprotein cholesterol (all P-values <0.0001).

Table 1 Distributions of selected variables between type 2 diabetes cases and controls

The genotype distributions of the five selected variants in the T2D cases and controls are shown in Table 2. The observed genotype frequencies of the five SNPs were all consistent with Hardy–Weinberg equilibrium among the controls (P=0.53, 0.58, 0.83, 0.54 and 0.06 for rs6931514, rs7756992, rs4712523, rs4712524 and rs9472138, respectively). To account for multiple comparisons, we applied a Bonferroni correction to the results of the additive genetic model, and P=0.01 (0.05/5) was taken as the significance threshold. Multivariate logistic regression analyses showed that all five SNPs were significantly associated with increased T2D risk, with ORs ranging from 1.19 to 1.29 (rs6931514: adjusted OR=1.29, 95% CI=1.19–1.39, P=5.6 × 10−10; rs7756992: adjusted OR=1.23, 95% CI=1.15–1.32, P=1.2 × 10−8; rs4712523: adjusted OR=1.25, 95% CI=1.15–1.35, P=3.8 × 10−8; rs4712524: adjusted OR=1.24, 95% CI=1.15–1.35, P=6.8 × 10−8; rs9472138: adjusted OR=1.19, 95% CI=1.05–1.34, P=0.006).
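The Bonferroni threshold and the relation between a reported OR, its 95% CI and its P value can be checked with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale; it is a reader's sanity check based on rounded published values, not the authors' analysis, and the function names are illustrative.

```python
import math


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def wald_p_from_or_ci(odds_ratio: float, ci_low: float, ci_high: float,
                      z_crit: float = 1.96) -> tuple[float, float, float]:
    """Approximate (log-OR, standard error, two-sided P) from an OR and its 95% CI.

    Assumes the CI was built as exp(beta +/- z_crit * SE).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return beta, se, p_value


if __name__ == "__main__":
    threshold = bonferroni_threshold(0.05, 5)
    # Reported values for rs9472138: OR=1.19, 95% CI=1.05-1.34.
    beta, se, p_value = wald_p_from_or_ci(1.19, 1.05, 1.34)
    print(f"threshold={threshold:.3f}, beta={beta:.3f}, SE={se:.3f}, P={p_value:.4f}")
    print("passes Bonferroni:", p_value < threshold)
```

Because the published OR and CI are rounded to two decimals, the back-calculated P value only approximates the reported P=0.006.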

Table 2 Genotype distributions of the five SNPs at chromosome 6 between T2D cases and controls and their associations with T2D risk

We then used conditional analysis to test the independence of these SNPs. As shown in Table 3, rs9472138 remained significantly associated with T2D risk after controlling for any of the other four SNPs, whereas rs6931514 and rs7756992 were no longer significant after adjustment for each other, because the two SNPs were in high LD (D′/r2=0.99/0.97). Moreover, the effects of rs4712523 and rs4712524 on T2D risk were abolished after conditioning on any of the other three SNPs except rs9472138. Therefore, only two SNPs were considered independent signals of T2D susceptibility, one on 6p22.3 (rs6931514) and one on 6p21.1 (rs9472138).
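The pairwise LD measures quoted above (D′ and r2) follow from standard definitions. The sketch below computes them from one haplotype frequency and two allele frequencies; it shows the textbook formulas only, not the estimation procedure used by Haploview, and the input frequencies in the usage lines are hypothetical.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D', r2) for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


if __name__ == "__main__":
    # Hypothetical frequencies: the two alleles almost always travel together.
    d_prime, r2 = ld_stats(p_ab=0.39, p_a=0.40, p_b=0.40)
    print(f"D'={d_prime:.2f}, r2={r2:.2f}")
```

When r2 is close to 1, two SNPs carry nearly the same information, which is why neither remains significant after adjustment for the other.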

Table 3 Conditional regression analysis and LD values between the 5 SNPs

Based on the results of the conditional analysis, we included the two SNPs with independent effects, rs9472138 and rs6931514, in a further combined analysis. The combined effect was calculated by summing the number of risk alleles for each subject who had genotype data for both SNPs. As shown in Table 4, compared with subjects carrying no risk alleles, we found a significantly increased risk of T2D for the combined genotypes containing two risk alleles (adjusted OR=1.59, 95% CI=1.35–1.88, P=5.6 × 10−8) and three to four risk alleles (adjusted OR=1.86, 95% CI=1.44–2.38, P=1.4 × 10−6), in a dose-response manner (P for trend=7.4 × 10−12).
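The risk-allele summation described above can be sketched as follows. The genotype encoding (two-letter strings), the risk allele letters and the grouping labels are hypothetical choices made for illustration; the text does not state the risk allele of each SNP in this form.

```python
from typing import Optional


def risk_allele_count(genotypes: dict[str, Optional[str]],
                      risk_alleles: dict[str, str]) -> Optional[int]:
    """Sum the risk alleles carried across the listed SNPs.

    Returns None when any SNP is missing a genotype, mirroring the restriction
    to subjects with genotype data for every SNP in the score.
    """
    total = 0
    for snp, risk in risk_alleles.items():
        genotype = genotypes.get(snp)
        if genotype is None:
            return None
        total += genotype.count(risk)
    return total


def risk_group(count: int) -> str:
    """Collapse counts into groups 0, 1, 2 and 3-4 (assumed grouping)."""
    return "3-4" if count >= 3 else str(count)


if __name__ == "__main__":
    # Hypothetical risk alleles and subject genotypes.
    risk = {"rs6931514": "G", "rs9472138": "T"}
    subject = {"rs6931514": "AG", "rs9472138": "TT"}
    n = risk_allele_count(subject, risk)
    print(n, risk_group(n))
```

Each group would then be compared against the zero-risk-allele group in the adjusted logistic regression model.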

Table 4 Cumulative effects of rs6931514 and rs9472138 on T2D risk

As four of the five SNPs were located in one LD block within the CDKAL1 gene, the pattern of risk association was further evaluated by examining the common haplotypes derived from the observed genotypes (Supplementary Table 1). Compared with the most common haplotype AAAA, the GGGG haplotype, consisting of the risk alleles of all four SNPs, was associated with a 33% elevated risk of T2D (adjusted OR=1.33, 95% CI=1.22–1.45, P=1.5 × 10−10). Haplotype GGAA was also significantly associated with a 1.17-fold increased T2D risk (adjusted OR=1.17, 95% CI=1.03–1.33, P=0.015).

We then conducted a stratified analysis of the two independent SNPs, rs6931514 and rs9472138, and of their combined genotypes by age, sex, BMI, smoking and drinking status. As shown in Supplementary Table 2, the heterogeneity test showed no significant heterogeneity between any two strata.
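The χ2-based Q test used for between-strata heterogeneity can be illustrated with the usual inverse-variance form of Cochran's Q. The sketch below handles the two-strata case (1 degree of freedom) and takes stratum-specific log-ORs and standard errors as inputs; the numbers in the usage lines are hypothetical, and the exact implementation used in the study is not described in the text.

```python
import math


def cochran_q(log_ors: list[float], ses: list[float]) -> tuple[float, float]:
    """Return (Q statistic, inverse-variance pooled log-OR) across strata."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, log_ors))
    return q, pooled


def q_p_value_two_strata(q: float) -> float:
    """P value for Q with 1 degree of freedom (exactly two strata)."""
    return math.erfc(math.sqrt(q / 2.0))


if __name__ == "__main__":
    # Hypothetical stratum-specific estimates (for example, men vs women).
    q, pooled = cochran_q([0.20, 0.40], [0.10, 0.10])
    print(f"Q={q:.2f}, pooled log-OR={pooled:.2f}, P={q_p_value_two_strata(q):.3f}")
```

A large P value indicates that the stratum-specific ORs are compatible with a common effect.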

Discussion

In this case–control study of T2D, we investigated the associations between four SNPs in CDKAL1 (rs6931514, rs7756992, rs4712523 and rs4712524) and one SNP near VEGFA (rs9472138) on chromosome 6 and the risk of T2D in a Chinese population. We found that all five SNPs were significantly associated with increased T2D risk in a community-based sample of 2925 T2D cases and 3281 controls.

Among the four replicated SNPs in the CDKAL1 gene, rs7756992 has shown reproducible association with the risk of T2D in both European and Chinese populations.8, 23, 24 Hu et al.23 reported a significant association between rs7756992 and risk of T2D in 1849 cases and 1785 controls from Shanghai, which is consistent with our study. Furthermore, some replication studies found that rs7756992 was significantly associated with T2D-related traits (for example, FBG, A1C, HOMA-B and corrected insulin response), indicating that this variant may confer risk of T2D through impaired beta-cell function or reduced insulin secretion.8, 24 However, this hypothesis needs to be explored by biological assays in future studies.

In contrast to the well-replicated SNP above, the other three SNPs at 6p22.3 have rarely been replicated since their discovery by GWAS and have not been reported in Chinese populations. rs4712523 and rs4712524 were first identified as associated with T2D risk in Japanese GWAS.12, 13 Consistent with the Japanese results, the GG genotypes of rs4712523 and rs4712524 were significantly associated with increased risk of T2D in the current study. Based on the HapMap CHB database, rs4712523 was in very high LD with rs10946398, rs7754840 and rs10440833 (r2=1.0, 1.0 and 0.82, respectively), all of which showed associations with T2D risk both in European GWAS and in Chinese replication studies.19, 25, 26, 27, 28 In our data, rs6931514 was in high LD with rs7756992 (D′/r2=0.99/0.97), although it was not reported in the HapMap CHB database.

Overall, the four associated SNPs all mapped to the same intron (intron 5: 90 kb) of the CDKAL1 gene on 6p22.3. These variants reside in a large LD block of 200 kb that includes exons 1–5 of CDKAL1 and the minimal promoter region, and that is likely to contain causative variants (Figure 1). The function of the CDKAL1 gene product is poorly understood, but it shares a large protein domain with CDK5RAP1, which inhibits CDK5 activity and is expressed in neuronal tissues. CDK5 has been shown to have an important role in beta-cell dysfunction and in the decrease of insulin gene expression under glucotoxic conditions.29 Thus, CDKAL1 could inhibit CDK5 in pancreatic beta cells in a manner similar to CDK5RAP1 in neuronal tissue. Beta cell-specific CDKAL1 knockout mouse models suggested that functional loss of CDKAL1 affects the accuracy of protein translation, causing the synthesis of abnormal insulin, which triggers endoplasmic reticulum (ER) stress in beta cells.30, 31

The significant association between SNP rs9472138 (about 73 kb from the transcription start site of VEGFA) and T2D was replicated here for the first time in a Chinese population. Zeggini et al. first reported in Caucasians that rs9472138 was associated with T2D in the GWAS scan and the first-stage validation, but found only a borderline association (P=0.095) in the second-stage validation. The OR in our study (adjusted OR=1.19, 95% CI=1.05–1.34, P=0.006 in the additive model) was slightly higher than that identified in the Caucasian GWAS (adjusted OR=1.06, 95% CI=1.06–1.09), which may reflect ethnic differences between these studies, given the distinct risk allele frequencies (0.110 in our controls vs 0.282 in the European population).11 Staiger et al.32 reported that rs9472138 tended to be associated with more than one measure of insulin secretion and insulin sensitivity in 1578 metabolically characterized non-diabetic German individuals. Furthermore, Sharma et al. reported that rs9472138 was significantly associated with the expression of VEGFA in transformed lymphocytes from 82 Caucasians, and that human adipose and muscle expression of VEGFA was also correlated with BMI and insulin sensitivity.33 By similarly screening cis-eQTLs in transformed lymphocytes of African-American subjects, Langberg et al.34 found that rs9472138 was associated with the expression of nearby transcripts (MEA1, SLC35B2, NFKBIE), but the association between the variant and T2D was not successfully validated (P=0.128). A recent replication study with 3388 T2D cases and 2856 controls in a Japanese population failed to reproduce the association between rs9472138 and T2D (OR=1.06, P=0.41).35 This finding is inconsistent with our results in a Chinese population and might also reflect differences among populations. On the other hand, because the lower allele frequency of rs9472138 results in lower statistical power (about 64%), we cannot rule out the possibility of a false-positive association, which needs validation in larger studies in the future.

Some limitations of this study need to be addressed. First, because our samples came from two sources, population stratification may have produced spurious findings. However, we used the χ2-based Q statistic to test for heterogeneity between the two sample sets in a stratified analysis, and the P-values of the heterogeneity tests were 0.50, 0.64, 0.06, 0.11 and 0.15 for rs6931514, rs7756992, rs4712523, rs4712524 and rs9472138, respectively, indicating that the associations were unlikely to have resulted from population stratification (data not shown). Second, while this article was in preparation, a T2D GWAS in East Asian populations consisting of 25 079 cases and 29 611 controls was published, reporting six new T2D loci reaching genome-wide significance, including ZFAND3 (marker SNP: rs9470794) at 6p22.3 and KCNK16 (marker SNP: rs1535500) at 6p21.1, which further supports an important role for these two susceptibility regions in T2D.36

In conclusion, we found that five SNPs on chromosome 6 were significantly associated with the risk of T2D in a Chinese population, notably SNP rs9472138 at 6p21.1, which was identified for the first time as significantly increasing T2D risk in Chinese individuals. Further functional studies are warranted to explore the mechanisms by which these SNPs influence T2D risk.