Introduction

Glutamine:fructose-6-phosphate amidotransferase 1 (GFPT1) is a rate-limiting enzyme of the hexosamine biosynthetic pathway (HBP) that has been shown to play an important role in insulin action (Balkan et al. 1994; Baron et al. 1995; Buse 2006; Crook et al. 1996; Marshall et al. 1991; Patti et al. 1999). Although it is a relatively minor branch of glycolysis, HBP is an alternative pathway of glucose metabolism that acts as a glucose sensor and mediates the decrease in insulin sensitivity. In addition, the flux of glucose through HBP mediates the up-regulation of transforming growth factor beta 1 (TGF-β1) function in renal mesangial cells (Kolm-Litty et al. 1998), leading to impairment in the diabetic kidney (Border 1994). HBP is thus closely related to insulin action and to the pathogenesis of diabetic nephropathy as a late complication. Because the amount of GFPT1 protein appears to control the flux of glucose through HBP (Weigert et al. 2003), it is important to examine the regulation of the GFPT1 gene. Indeed, transgenic mice over-expressing GFPT1 in skeletal muscle and adipocytes developed peripheral insulin resistance as determined by insulin clamp studies (Hebert et al. 1996). Therefore, the GFPT1 gene was proposed as an attractive candidate gene for susceptibility to type 2 diabetes, and we examined its polymorphisms in relation to gene regulation.

In previous reports, polymorphisms in introns 1 and 9 and in the 3′-untranslated region (UTR) of GFPT1 were associated with diabetes in Caucasians and with diabetic nephropathy in African-Americans (Elbein et al. 2004). In addition, the −913G > A polymorphism in the 5′-flanking region (the A allele is minor in Japanese) was associated with obesity in non-diabetic Caucasians (Weigert et al. 2005). However, a contradictory report indicated that −913G > A made no difference to promoter activity, whereas −1412C > G in the 5′-flanking region did affect promoter activity (Burt et al. 2005). Further, it was shown that polymorphisms in GFPT1 play no major role in susceptibility to diabetic nephropathy in Caucasians (Ng et al. 2004). To date, it remains unclear which polymorphism, if any, contributes to the regulation of the GFPT1 gene and to the disease status of type 2 diabetes. In addition, there has been no study on the association of GFPT1 polymorphisms with type 2 diabetes in Japanese.

In this study, we searched for polymorphisms across the 5′-flanking and coding regions of the GFPT1 gene and systematically examined their effects on promoter activity with in silico analysis and a luciferase assay. To assess the role of the polymorphism related to promoter activity, we performed an association study with type 2 diabetes in 2,763 Japanese and 330 Caucasian samples.

Materials and methods

Subjects

The clinical characteristics of all Japanese and Caucasian subjects are summarized in Table 1. A total of 2,763 Japanese subjects, consisting of 1,461 controls and 1,302 type 2 diabetic patients, were enrolled in this study. Controls were recruited from adult healthy volunteers, whose HbA1c levels were lower than 5.8% and who had no other diseases on close clinical examination. Patients were mainly recruited at Tokushima University Hospital, Kyoto Prefecture University Hospital and its affiliated hospitals. Type 2 diabetes was diagnosed with the criteria of the World Health Organization (1985).

Table 1 Clinical characteristics of 2,763 Japanese and 330 Caucasian samples in the association study

Japanese DNA samples were obtained from peripheral blood leukocytes or Epstein-Barr virus-immortalized lymphoblast cell lines with the standard protocol. Written informed consent was obtained from all subjects. The study protocol was approved by the Institutional Review Board of the University of Tokushima.

Caucasian DNA samples from 190 control subjects and 140 type 2 diabetes patients were purchased from BioClinical Partners Inc. (Franklin, MA). These materials are now available from ZeptoMetrix Inc. (Buffalo, NY). Informed consent was obtained from all subjects through the company.

Polymorphism screening of the GFPT1 gene

Using 24 randomly selected control samples, we searched for polymorphisms in the 5′-flanking region as a putative promoter sequence (1.7 kb), the 5′- and 3′-untranslated regions (UTRs), all 19 exons plus an alternative exon in intron 8, and exon/intron boundaries (7.3 kb) (DeHaven et al. 2001; Niimi et al. 2001). The sequences of the amplification primers used for PCR-direct sequencing are available upon request. Amplified PCR products (3 μl) were treated with 2 μl of ExoSAP-IT (Amersham Biosciences, Piscataway, NJ) at 37°C for 15 min and 80°C for 15 min with the standard protocol. The sequencing mixture contained 1 μl of BigDye terminator cycle sequencing kit version 1.1 (Applied Biosystems; ABI, Foster City, CA), 1 μl of amplification primer or nested primer, 2 μl of BigDye Terminator Sequencing Buffer and 1 μl of distilled water. We performed the cycle sequencing reaction with 25 cycles at 96°C for 10 s, 50°C for 5 s and 60°C for 4 min. The mixture was then filtered using Sephadex G-50 Fine (Amersham Biosciences) with Multi-Screen 96-well plates (Millipore, Molsheim, France). After mixing with Hi-Di Formamide (ABI), the product was sequenced using an ABI 3730 or 3100 automated sequencer (ABI).

In silico prediction of promoter region in the GFPT1 gene

As a target region of analysis, we selected a 65-kb sequence including 3 kb of the 5′-flanking sequence as a putative promoter sequence and 62 kb of the GFPT1 gene. We predicted the promoter region by the promoter score calculated with PROSCAN version 1.7 (Advanced Biosciences Computing Center, University of Minnesota, http://www.bimas.dcrt.nih.gov/molbio/proscan/) (Chan et al. 2001; Gabellini et al. 2003; Meyer et al. 2002; Steinke et al. 2003). The promoter score was calculated based on the sequence elements that potentially bind transcription factors. When the cutoff index was set at 53.0, promoter regions were predicted with a probability of 70% (Prestridge 1995).

The genome and mRNA sequences of the GFPT1 gene (Gene ID 2673, NT_022184, and NM_002056) were obtained from NCBI Human Build 35. The target region of analysis was divided into 1-kb fragments. As a result, 65 fragments of 1 kb were assessed with PROSCAN. To narrow down the predicted region, we took the 1-kb fragments immediately upstream and downstream of the translation initiation codon ATG (with the position of A designated as +1) and calculated the promoter score in progressively shorter fragments, deleting 100 bp at a time.
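The fragmenting scheme described above can be sketched as follows. This is only an illustration of the windowing logic, not the procedure actually run with PROSCAN: the function names `tile_fragments` and `deletion_series` are hypothetical, and coordinates are simple 0-based offsets into the target region rather than the ATG-relative positions used in the text.

```python
def tile_fragments(start, end, size=1000):
    """Split the half-open interval [start, end) into consecutive windows of `size` bp."""
    return [(s, min(s + size, end)) for s in range(start, end, size)]


def deletion_series(start, end, step=100, from_left=True):
    """Progressively shorten [start, end) by `step` bp from one side.

    from_left=True  trims the 5' side (e.g. +1..+1000, +100..+1000, ...).
    from_left=False trims the 3' side (e.g. -1000..-1, -1000..-100, ...).
    """
    if from_left:
        return [(s, end) for s in range(start, end, step)]
    return [(start, e) for e in range(end, start, -step)]


# Assumed layout: 3 kb of 5'-flanking sequence followed by 62 kb of gene,
# addressed as offsets 0..65,000 within the target region.
windows = tile_fragments(0, 65_000)
print(len(windows))                      # number of 1-kb fragments to score
print(deletion_series(0, 1000)[:3])      # first three 5'-trimmed fragments
```

Each window would then be scored separately, and a drop in score between two neighbouring fragments of a deletion series points to the 100-bp step that carries the predicted promoter elements.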

Luciferase reporter assay

Based on in silico prediction of the promoter region, we examined promoter activity with the luciferase reporter assay of five reporter constructs. To make these constructs, we amplified the target sequence by PCR (KOD plus, TOYOBO, Osaka, Japan) from human lymphoblast DNA using a forward primer with a SacI site and a reverse primer with a HindIII site. The amplified products were cloned using the Zero Blunt TOPO PCR Cloning kit (Invitrogen, Carlsbad, CA), and their identities were confirmed by sequencing. The inserts were sub-cloned into the pGL3-basic vector (Promega, Madison, WI) using the restriction enzyme sites. NIH3T3 cells were seeded at 5 × 10⁴ cells in a 35-mm culture dish in Dulbecco’s modified Eagle’s medium (Invitrogen) containing 10% fetal bovine serum and a 1/100 volume of penicillin–streptomycin (10,000 units/ml; Invitrogen). After incubation at 37°C for 24 h under 5% CO2, 2 μg of the reporter plasmid in pGL3-basic vector and 120 ng of control pRL-TK vector (Promega) were co-transfected into NIH3T3 cells using 6 μl of FuGENE 6 (Roche, Basel, Switzerland). After culturing for 48 h, NIH3T3 cells were washed with Dulbecco’s PBS (Invitrogen) and collected in 1× lysis buffer. Luciferase activity in each construct was assayed using the Dual-Luciferase Reporter Assay System (Promega) with a luminometer (Lumat LB9507, BERTHOLD, Bad Wildbad, Germany) according to the standard protocol. To normalize for transfection efficiency, the luminescence value of each pGL3-basic construct was standardized against the value of the pRL-TK vector. The results were statistically compared with Student’s t test.
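The normalization and comparison steps can be sketched in a few lines. The sketch assumes that standardization means dividing each reporter (firefly) reading by the co-transfected control (Renilla) reading from the same well, and that the t test is the classical two-sample test with pooled variance; the luminescence readings are made-up numbers, not measurements from this study.

```python
from math import sqrt
from statistics import mean, variance


def normalize(firefly, renilla):
    """Divide each reporter reading by the control reading from the same well."""
    return [f / r for f, r in zip(firefly, renilla)]


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance. Returns (t, df)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical raw readings for two allelic constructs (three wells each).
allele_t = normalize([1200.0, 1100.0, 1300.0], [1000.0, 1000.0, 1000.0])
allele_c = normalize([1900.0, 2100.0, 2000.0], [1000.0, 1000.0, 1000.0])
t_stat, df = student_t(allele_c, allele_t)
print(t_stat, df)
```

The P value is then read from a t distribution with `df` degrees of freedom; a statistics package would normally supply that step.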

Genotyping, a case-control association study, and haplotype analyses

Genotyping was performed by the TaqMan method (ABI). In brief, 1.25 μl of TaqMan Universal Master mix and 1.25 μl of TaqMan Probe were mixed with 5 ng of DNA according to the protocol recommended by the manufacturer. The list of TaqMan Probes is shown in Supplementary Table 1. The PCR condition was 40 cycles of reaction at 94°C for 15 s and 60°C for 1 min after pre-incubation at 95°C for 10 min with the ABI GeneAmp PCR System 9700. Each 384-well plate, containing four non-template controls, was measured for VIC and FAM fluorescence with an ABI 7900HT analyzer. Genotyping results were automatically obtained, and the results were independently judged by two researchers. In our laboratory, genotyping by the TaqMan method showed 100% concordance with sequencing as previously reported (Hamada et al. 2005; Kato et al. 2006).

Case-control and haplotype-based association tests were performed in 2,763 Japanese subjects including 1,461 controls and 1,302 cases. P values were calculated with four types of chi-square tests, namely the allele, genotype, dominant and recessive models (SNPAlyze software version 3.2 pro, DYNACOM, Yokohama, Japan). The crude odds ratio (OR) for the allele model and its 95% confidence interval (CI) were also calculated with the same software. Hardy–Weinberg equilibrium was assessed with genotype frequencies obtained by simple gene counting, and evaluated by the chi-square test comparing observed and expected values. As the cases were older than the controls in this study, a logistic regression model was used to adjust for age and sex. This analysis was carried out using the SPSS system, release 12.0 J for Windows (SPSS Japan Inc., Tokyo, Japan). Power was calculated with the Genetic Power Calculator (http://www.statgen.iop.kcl.ac.uk/gpc/cc2.html) (Purcell et al. 2003).
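As an illustration of the four chi-square models, the crude allele OR and the Hardy–Weinberg check, a minimal sketch is given below. It is not the SNPAlyze implementation: it assumes uncorrected Pearson chi-square tests, a Woolf (log-scale) confidence interval, dominant and recessive models defined with respect to the minor allele, and non-zero cell counts. All function names and the genotype counts are hypothetical.

```python
from math import erfc, exp, log, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    x2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return x2, erfc(sqrt(x2 / 2))  # upper tail of chi-square with 1 df


def chi2_genotype(case, ctrl):
    """Pearson chi-square (2 df) on the 2x3 table of genotype counts."""
    n_case, n_ctrl = sum(case), sum(ctrl)
    n = n_case + n_ctrl
    x2 = 0.0
    for o_case, o_ctrl in zip(case, ctrl):
        col = o_case + o_ctrl
        for obs, row in ((o_case, n_case), (o_ctrl, n_ctrl)):
            expected = row * col / n
            x2 += (obs - expected) ** 2 / expected
    return x2, exp(-x2 / 2)  # upper tail of chi-square with 2 df


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude OR for [[a, b], [c, d]] with a Woolf 95% confidence interval."""
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)


def association_tests(case, ctrl):
    """case / ctrl: (major homozygote, heterozygote, minor homozygote) counts."""
    c0, c1, c2 = case
    k0, k1, k2 = ctrl
    case_minor, case_major = 2 * c2 + c1, 2 * c0 + c1
    ctrl_minor, ctrl_major = 2 * k2 + k1, 2 * k0 + k1
    return {
        "allele": chi2_2x2(case_minor, case_major, ctrl_minor, ctrl_major),
        "genotype": chi2_genotype(case, ctrl),
        "dominant": chi2_2x2(c1 + c2, c0, k1 + k2, k0),
        "recessive": chi2_2x2(c2, c0 + c1, k2, k0 + k1),
        "allele_or": odds_ratio_ci(case_minor, case_major, ctrl_minor, ctrl_major),
    }


def hwe_chi2(n_major_hom, n_het, n_minor_hom):
    """Chi-square (1 df) comparing observed genotypes with Hardy-Weinberg expectations."""
    n = n_major_hom + n_het + n_minor_hom
    p = (2 * n_major_hom + n_het) / (2 * n)  # major allele frequency by gene counting
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_major_hom, n_het, n_minor_hom)
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return x2, erfc(sqrt(x2 / 2))


# Hypothetical genotype counts, not data from this study.
results = association_tests(case=(30, 50, 20), ctrl=(40, 45, 15))
for model, value in results.items():
    print(model, value)
```

The adjustment for age and sex is a separate logistic regression step and is not covered by this sketch.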

To evaluate linkage disequilibrium (LD), the simple pairwise LD value was calculated in 1,461 controls and 1,302 cases using |D′| and r². Haplotypes with all verified polymorphisms were then examined using SNPAlyze ver. 3.2. A region with |D′| >0.9 or r² >0.9 was considered an LD block.
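The two LD coefficients can be computed from two-locus haplotype and allele frequencies as sketched below. The sketch assumes the haplotype frequency has already been estimated (for example by an EM procedure, which is not shown), and the function names `pairwise_ld` and `in_ld_block` and the frequencies are hypothetical.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (|D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


def in_ld_block(d_prime, r2, threshold=0.9):
    """Block criterion stated in the text: |D'| > 0.9 or r^2 > 0.9."""
    return d_prime > threshold or r2 > threshold


# Hypothetical frequencies: a rare allele found only on one common background.
print(pairwise_ld(p_ab=0.10, p_a=0.10, p_b=0.50))
```

With these made-up frequencies |D′| is at its maximum while r² is small, which is the usual pattern when one of the two polymorphisms has a low minor allele frequency.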

Results

Identification of polymorphisms in the GFPT1 gene

The GFPT1 gene is composed of 19 exons and one alternative exon with a total length of 62 kb on chromosome 2p14. This locus showed a possible linkage to body-mass index (BMI) in Framingham Heart Study families (Moslehi et al. 2003). However, there is no evidence of linkage in Japanese (Mori et al. 2002; Iwasaki et al. 2003; Nawata et al. 2004). The GFPT1 gene is encoded on the opposite strand of the reference genome sequence. The genomic structure of GFPT1 and the positions of polymorphisms are summarized in Fig. 1a. With the PCR-direct sequencing method, we detected eight polymorphisms across the GFPT1 gene. Of these eight, six were identified from the database and two were novel polymorphisms, including +70C > T in exon 8 (L255L) and T-repeat in intron 18 (Elbein et al. 2004; Ng et al. 2004). In Caucasians, −1412C > G in the 5′-flanking region was observed with a minor allele frequency (MAF) of 0.145 (Burt et al. 2005), but it was not detected in the 24 Japanese samples.

Fig. 1

The GFPT1 gene and in silico prediction of promoter region. a Genomic structure of GFPT1 with 19 exons and an alternative exon in intron 8. Coding exons are marked by black boxes and non-coding regions by white boxes. The eight detected polymorphisms are indicated by arrows, and each polymorphism is named by its position and allele type (major allele > minor allele). b Promoter scores across the GFPT1 gene. The total 65-kb region, including 3 kb of the 5′-flanking region and 62 kb of the GFPT1 gene, was divided into 1-kb fragments, and promoter scores calculated with PROSCAN version 1.7 are shown. Position relative to the ATG start codon is shown on the horizontal axis. The promoter score is depicted on the vertical axis. Promoter regions were predicted with a probability of 70% when the score was over 53.0. a and b are drawn to the same scale. c Promoter scores in the sequence between −1,000 and +1,000 bp. Promoter scores corresponding to the shorter fragments are shown in bilateral graphs. Scores were calculated over a total 2-kb region, including 1 kb upstream (left) and 1 kb downstream (right) of the start codon. Black lines on both sides show the start and end positions of fragments. The arrow indicates the position of the start codon. d Partial sequence of the predicted promoter region. The intron 1 +36T > C polymorphism is located in the Sp1-binding GC box sequence. Its position is boxed with a dotted line, and the GC box sequence is shown in bold. Large and small characters represent the coding and the non-coding sequences, respectively

In silico prediction of the promoter region

With PROSCAN software, the promoter score was calculated over the total 65-kb length (Fig. 1b). Promoter regions were clearly predicted in two fragments, 1 kb upstream and 1 kb downstream of the translation initiation ATG (76.29 and 101.15, respectively). Marginal scores (55.66–53.24) were obtained in another five 1-kb fragments located in intron 3, intron 6, intron 8 and intron 13–14. Thus, we considered the two regions of −1,000 to −1 bp and +1 to +1,000 bp as putative promoter regions of the GFPT1 gene.

To narrow down the putative promoter region, we calculated the promoter score of shorter fragments (Fig. 1c). In the upstream region of the start codon, a high promoter score (70.88) was observed in the fragment of −1,000 to −300 bp, but it disappeared in the next fragment of −1,000 to −400 bp, suggesting the presence of promoter activity between −400 and −1 bp. In the downstream region, the highest score (101.15) was observed in the fragment of +1 to +1,000 bp, and it decreased to 68.47 in the following fragment of +100 to +1,000 bp. The promoter score was completely lost in the fragment of +200 to +1,000 bp, suggesting the presence of promoter activity between +1 and +200 bp. Taken together, these results suggested that the region between −400 and +200 bp could contain promoter activity of the GFPT1 gene.

To evaluate the effect of polymorphisms in the predicted region, the promoter score was further calculated for each allele. One verified polymorphism, +36T > C in intron 1, was located in the predicted promoter region (−400 to +200 bp). The +36T > C polymorphism was located in the Sp1-binding GC box sequence, in which the C allele had a slightly higher promoter score than the T allele (Fig. 1d). It was therefore likely that +36T > C in intron 1 was one of the functional polymorphisms affecting the promoter activity of GFPT1.

Effect of two polymorphisms on actual promoter activity

To evaluate the putative functional polymorphisms, we assessed actual promoter activity with the luciferase reporter assay (Fig. 2). We prepared five reporter constructs: two from −1,008 to +2 bp containing −913G > A in the 5′-flanking region and two from −657 to +112 bp containing +36T > C in intron 1, and one from −657 to +2 bp without a polymorphism as a standard.

Fig. 2

GFPT1 promoter activity by luciferase reporter assay. a Structure of reporter plasmids. Both sides of the black line show the start and end positions in the predicted promoter region. Black arrows indicate the position of a translation start codon (ATG). White triangles indicate the positions of polymorphisms, and their allele types are shown under the line. Five constructs are shown as follows: two from −1,008 to +2 bp with −913G (major allele), −913A (minor allele), two from −675 to +112 bp with +36T (major allele), and +36C (minor allele), and one from −675 to +2 bp without polymorphism as a standard. LUC denotes luciferase. b Comparison of luciferase reporter activities among polymorphisms. White bars represent luciferase activities in two constructs with a major allele. Black bars represent those in two constructs with a minor allele. Gray bar represents luciferase activity without polymorphism as a standard. Data were obtained from eight independent samples and are represented as the mean ± SE. Data are standardized as a mean value in the construct of −675 to +2 bp from 16 independent samples. Statistical significance was calculated by Student’s t test. NS denotes not significant

In NIH3T3 cells, there were no significant differences in luciferase activities between the two constructs with −913G > A (0.81±0.05 vs. 0.79±0.04, n=8). Luciferase activity in the C allele of the +36T > C polymorphism was significantly higher than that in the T allele (P=0.00006, n=8), in agreement with the in silico prediction. We further detected 53 and 49% increases in luciferase activity in the C allele in HEK 293 and HeLa cells, respectively (independent experiments were repeated three times). Increased luciferase activity was observed between constructs −657 to +2 bp and −657 to +112 bp, suggesting that a positive regulatory element may be present between +3 and +112 bp in intron 1. These results indicated that +36T > C in intron 1, but not −913G > A in the 5′-flanking region, could affect promoter activity as a functional polymorphism.

Case-control association analysis

To evaluate the effect of +36T > C on the disease status of type 2 diabetes, case-control association was analyzed in 2,763 Japanese subjects (Table 2). According to the power estimation, a sample size of 1,300 controls and 1,300 cases would reach 72% power under the following conditions: a disease prevalence of 6%, genotypic relative risks (penetrance for the disease susceptibility allele) of 1.2 and 1.3, a marker allele frequency of 0.45 and an alpha level of 0.05. Seven polymorphisms were analyzed; the T-repeat in intron 18 was excluded because we failed to detect it with the TaqMan method. Two polymorphisms, +30T > C in intron 5 and +70C > T in exon 8, had relatively low MAF, and the other five polymorphisms had high allele frequencies. None of the polymorphisms, including +36T > C in intron 1, showed significant results in 1,461 controls and 1,302 cases. The +30T > C polymorphism in intron 5 did not satisfy Hardy–Weinberg equilibrium and was therefore excluded from the LD and haplotype-based association analyses.

Table 2 Association analysis of seven polymorphisms in 2,763 Japanese subjects (1,461 controls and 1,302 cases)

In Caucasians, +36T > C showed an association with type 2 diabetes by the allele, genotype, dominant and recessive models of the chi-square test (crude OR for C allele = 1.569, 95% CI = 1.136–2.167) (Table 3). By logistic regression analysis, the association with +36T > C remained statistically significant after adjustment for age and sex (adjusted OR = 1.532, 95% CI = 1.078–2.177, P=0.017). However, this result was only nominal because we examined a limited number of Caucasian samples. In addition, power calculation with 150 controls and 150 cases showed a relatively low value (0.139).

Table 3 Association analysis of intron 1 +36T > C (rs6720415) in 2,763 Japanese and 330 Caucasians

LD pattern and haplotype-based association analysis

As shown in Fig. 3, strong LD values (|D′|>0.98) were observed among six polymorphisms of the GFPT1 gene in both controls and cases. With the definition of r², the LD block was separated by +70C > T in exon 8 due to its low MAF.

Fig. 3

Linkage disequilibrium (LD) coefficients (|D′| and r²) across the GFPT1 gene. The values of 1,461 controls are shown in the upper right, and the values of 1,302 cases are shown in the lower left. Cells with |D′|>0.9 or r²>0.9 are shaded in gray. The +30T > C in intron 5 did not satisfy the Hardy–Weinberg equilibrium and was excluded from this analysis

With the definition of |D′|, three haplotypes were observed: GTTCAG (haplotype 1), ACGCGT (haplotype 2) and ACGTGT (haplotype 3) (Table 4). The total frequency of the three haplotypes was over 0.99 in both controls and cases, suggesting that the haplotype block was highly conserved in the GFPT1 gene. Haplotypes 2 and 3 were distinguished only by +70C > T in exon 8, which has a low MAF. There were no significant associations with the disease status of cases and controls. In addition, there were no significant results in other haplotypes with the definition of r² (data not shown).

Table 4 Haplotype-based association analysis in 2,763 Japanese subjects (1,461 controls and 1,302 cases)

Discussion

It is well known that two types of functional polymorphisms are closely related to gene function: those changing the amino acid sequence and those affecting gene regulation (Altshuler et al. 2000; Hudson 2003; Rebbeck et al. 2004). To evaluate these factors, we first searched for polymorphisms in the 5′-flanking region and the coding region of the GFPT1 gene. By re-sequencing Japanese samples, we found eight polymorphisms, including one synonymous polymorphism (+70C > T in exon 8), but no mis-sense or non-sense polymorphisms. Thus, we focused on polymorphisms related to gene regulation.

With in silico prediction of the promoter region, 1 kb upstream (5′-flanking) and 1 kb downstream (intron 1) of the start codon were proposed to affect promoter activity. Polymorphism screening showed that this region contained two verified polymorphisms, namely −913G > A in the 5′-flanking region and +36T > C in intron 1. The reporter construct with −913G > A in the 5′-flanking region failed to change luciferase activity significantly, consistent with the previous report (Burt et al. 2005). In addition, the promoter region of the human GFPT1 gene was shown to lie between −283 and +80 bp of the transcription start site (Burt et al. 2005). These results suggested that a functional polymorphism might be absent within 1 kb upstream of the start codon in GFPT1.

Although no functional polymorphism has been reported in intron 1 of GFPT1, the promoter region was strongly predicted downstream of a start codon, namely the 5′ side of intron 1. Furthermore, +36T > C in intron 1 was shown to affect promoter activity with its position in a GC box sequence. Indeed, reporter constructs with +36T > C in intron 1 had a positive effect on luciferase activity. Taken together, +36T > C in intron 1 was implicated as a new functional polymorphism affecting the regulation of GFPT1. However, our results may be limited to evaluating only some functional polymorphisms in GFPT1, because the functional polymorphisms related with gene regulation are occasionally located outside the gene region. Further study is therefore needed.

The contribution of this new functional polymorphism to type 2 diabetes was assessed with a case-control association study, but there was no significant association including other polymorphisms across the GFPT1 gene. There are several possible explanations for the lack of association with type 2 diabetes.

First, it is likely that the contribution of polymorphisms in GFPT1 is restricted to specific sub-groups of type 2 diabetes. For example, increased gene expression of GFPT1 was observed in diabetic nephropathy patients (Elbein et al. 2004). This finding suggests further analysis in the sub-group with diabetic nephropathy. Unfortunately, it was not feasible to evaluate the effect of polymorphisms on diabetic nephropathy clearly, because the clinical data related to nephropathy were insufficient in this study.

Next, obesity may contribute to the association with type 2 diabetes as a co-factor, because one polymorphism (−913G > A in the 5′-flanking region) is associated with the risk for obesity in Caucasians (Weigert et al. 2005). This hypothesis could support a further association study with a sub-group of high-BMI patients. However, it is quite difficult to collect these patients because they are very rare in the Japanese population. It was revealed that the distribution of BMI in type 2 diabetic patients is strikingly different between Caucasians and Japanese (Sone et al. 2003). In Japanese, the BMI of diabetic patients was almost equal to that of the non-diabetic population. Indeed, the 1,302 type 2 diabetic patients were almost as lean as the control subjects in this study (mean BMI <25), whereas the +36T > C polymorphism showed nominal association with type 2 diabetes in Caucasian samples (mean BMI >30) despite the small number of samples tested. Although the cause of this difference is still unknown, it may reflect differences in insulin secretion and sensitivity between the two populations. To test this assumption, further analysis is necessary.

Finally, the average age in controls was unmatched with that in cases, which could have biased our results. If we could use hyper-normal controls, such as older people known to be free of the disease of interest, it would be expected to improve the power of detecting the susceptibility variant between controls and cases. As a better way, our findings could be confirmed using other populations, such as controls in a large cohort study (Hattersley et al. 2005).

In conclusion, we detected +36T > C in intron 1 with polymorphism screening, and its position was located in a predicted promoter region with in silico simulation. We further confirmed the effect of +36T > C in intron 1 on the promoter activity by luciferase assay, implicating its functional role in the GFPT1 gene. Although one possible association was observed in Caucasians, no polymorphisms including +36T > C in intron 1 were associated with type 2 diabetes in 2,763 Japanese samples. To further understand its genetic risk factor for type 2 diabetes, it is recommended to analyze the association in sub-groups according to phenotypes of type 2 diabetes or in other populations.