According to the sixth edition of the International Diabetes Federation’s Diabetes Atlas, an estimated 382 million people worldwide had diabetes in 2013, and this figure is expected to increase to 592 million by 2035.1 In 2014, 4.9 million deaths were attributable to diabetes. Diabetes is also the leading cause of blindness, chronic kidney disease and amputation. In recent years, substantial effort has been made to understand the pathophysiology of type 2 diabetes (T2DM) and to develop novel preventive and therapeutic measures for it.

One of these approaches relies on genetic research. This strategy is based on the finding that T2DM has a genetic predisposition. Subjects with T2DM-affected siblings have been reported to be at a two- to threefold increased risk of developing T2DM compared with the general population.2 Having one parent with diabetes increases the risk of T2DM by 30–40%, and having both parents with diabetes increases the risk by 70%.3 Furthermore, specific monogenic forms of diabetes point to a genetic etiology. Understanding human genetic variation and its implications in disease is also one of the core components of the precision medicine initiative.4 However, the explosive increase in the prevalence of T2DM during the past several decades cannot be explained by genetic factors alone, as it is unlikely that our genomes have changed during this relatively short time period.

Since the first genome-wide association study (GWAS) on T2DM in 2007, there has been considerable progress in genetic research on T2DM.5 At least 75 independent genetic loci have been associated with T2DM (Table 1). Technical advances in genotyping methods and massive parallel sequencing, together with the development of global references for human genetic variation such as the 1000 Genomes Project, have laid the foundation for these achievements.6, 7 In addition, open collaboration between researchers in this field has been essential: many of the major advances would not have been possible without these collaborative efforts.

Table 1 Common genetic variant association loci for T2DM

Major criticisms of genetic research on T2DM center on the fact that the common variants, which have relatively small effect sizes (odds ratios between 1.10 and 1.40), explain only 10–15% of T2DM heritability.8 In addition, most of the variants are located in intergenic or intronic regions, which makes their functional consequences difficult to explain. In the past several years, efforts have been made to reveal the functional roles of these genetic variants and of newly implicated genes in T2DM. At the same time, ongoing studies are investigating additional layers of genomic regulation, such as epigenetics. Environmental factors may exert their effects through epigenetic changes and interact with DNA sequence variations. In this review, we will briefly summarize the recent progress in genetic and epigenetic research on T2DM and discuss how environmental factors, genetics and epigenetics can interact in the pathogenesis of T2DM.

Genetic research on T2DM

Initial GWAS on T2DM

The first report of GWAS on T2DM appeared in early 2007 and identified the novel T2DM-associated loci HHEX/IDE and SLC30A8.5 Soon after, three international consortia reported their own GWAS and simultaneously replicated these results.9, 10, 11 Through collaboration, they confirmed the previous T2DM loci of TCF7L2, PPARG, KCNJ11 and FTO, and identified novel loci for an association in the intergenic region of CDKN2A/2B, as well as the intronic region of CDKAL1 and IGF2BP2. These studies showed that most of the common genetic variants had a modest effect size, with an odds ratio between 1.10 and 1.40, and that multiple variants with a modest effect size would contribute to the genetic etiology of T2DM.
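To illustrate how several variants of modest effect can combine, the following minimal Python sketch sums log odds ratios weighted by the number of risk alleles carried. The variant labels, odds ratios and genotype are hypothetical values chosen within the 1.10–1.40 range quoted above, and the log-additive (multiplicative) model is an assumption made for illustration; it is not a method taken from the cited studies.

```python
import math

# Hypothetical per-allele odds ratios (illustrative, within the 1.10-1.40 range)
ODDS_RATIOS = {"variant_A": 1.10, "variant_B": 1.20, "variant_C": 1.40}

def log_additive_score(genotype):
    """Sum over variants of risk-allele count (0, 1 or 2) times log(odds ratio)."""
    return sum(count * math.log(ODDS_RATIOS[variant])
               for variant, count in genotype.items())

def combined_odds_ratio(genotype):
    """Combined odds ratio relative to a person carrying no risk alleles."""
    return math.exp(log_additive_score(genotype))

# Hypothetical individual: two risk alleles at A, one at B, none at C
person = {"variant_A": 2, "variant_B": 1, "variant_C": 0}
print(round(combined_odds_ratio(person), 3))
```

Under this assumed model the hypothetical individual has a combined odds ratio of 1.10 × 1.10 × 1.20, or about 1.45, which shows why many small effects must be aggregated before they become appreciable.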

Meta-analysis of GWAS

After the initial success of GWAS, it was recognized that a large sample size is required to identify novel genetic loci. Accordingly, in 2008 the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) study meta-analyzed previous GWAS results encompassing more than 10 000 subjects for an initial genome-wide scan and performed independent replication in more than 53 000 subjects.12 This study found six novel loci associated with T2DM in/near JAZF1, CDC123-CAMK1D, TSPAN8-LGR5, THADA, ADAMTS9 and NOTCH2. The sample size was increased further in the DIAGRAM+ study in 2012, to over 101 000 subjects. That study identified 12 novel loci associated with T2DM.13 These studies were performed in individuals of European descent.
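The gain from pooling studies can be sketched with an inverse-variance weighted fixed-effect meta-analysis, a standard way of combining per-study effect estimates. The Python example below is a minimal sketch under that assumption; the log odds ratios and standard errors are hypothetical, and the cited consortia may have used different models and software.

```python
import math

def fixed_effect_meta(betas, standard_errors):
    """Inverse-variance weighted pooled effect and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled_beta, pooled_se

# Hypothetical log odds ratios and standard errors from three studies
betas = [math.log(1.15), math.log(1.10), math.log(1.20)]
ses = [0.05, 0.04, 0.08]

beta, se = fixed_effect_meta(betas, ses)
print(f"pooled OR = {math.exp(beta):.3f}, z = {beta / se:.2f}")
```

Because the pooled standard error is smaller than that of any single study, a variant with a modest odds ratio can reach genome-wide significance in the combined data even when no individual study detects it.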

GWAS on non-Europeans

The first GWAS in East Asians were reported in 2008.14, 15 They found that a variant in KCNQ1 was significantly associated with T2DM in Japanese, Koreans and Chinese. This variant also showed nominal significance in Europeans. Interestingly, meta-analysis of GWAS in Europeans identified a second independent variant in KCNQ1, which was not in linkage disequilibrium with the original variant identified in East Asians.13 The meta-analysis of GWAS in East Asians was led by the Asian Genetic Epidemiology Network (AGEN) in 2011.16, 17 They found eight novel loci associated with T2DM in more than 54 000 cases and controls. These include variants in/near GLIS3, PEPD, FITM2-R3HDML-HNF4A, KCNK16, MAEA, GCC1-PAX4, PSMD6 and ZFAND. GWAS were also performed in South Asians and African Americans.18, 19 In Mexicans and other Latin Americans, a novel locus spanning SLC16A11 and SLC16A13 was significantly associated with T2DM.20 This variant was relatively common in Native Americans (minor allele frequency (MAF) of ~50%) and East Asians (MAF of ~10%), but was rare in Europeans and Africans. A functional study suggested that the SLC16A11 gene is involved in triglyceride metabolism. Recently, a meta-analysis of GWAS in African Americans identified two novel loci at HLA-B and INS-IGF2.21

Large-scale genotyping using the Metabochip

To further increase the sample size, a cost-effective custom genotyping array was developed. The ‘Metabochip’ included nearly 200 000 variants to investigate the genetic association for metabolic, cardiovascular and anthropometric traits.22 For T2DM, the chip contained ~5000 variants to validate previously suggested loci and ~17 000 additional variants for fine mapping of previously confirmed loci. The Metabochip was used for the genotyping of nearly 150 000 Europeans and successfully identified eight novel loci for T2DM.23 They confirmed 63 T2DM loci, and the sibling relative risk attributable to these loci was estimated to be 1.104. These variants explained 5.7% of the variance in T2DM risk on a liability scale. When a polygenic linear mixed model was applied, the overall genome-wide common variants explained 49–64% of the liability variance in Europeans. In addition, pathway analyses and protein–protein interaction analyses revealed that CREB-binding protein-related transcription, adipokine signaling and cell cycle signaling processes are important in the pathogenesis of T2DM. Another study using Metabochip in an isolated founder population in Greenland identified a common (MAF 17%) nonsense p.Arg684Ter variant in TBC1D4 that was associated with an elevated 2-h glucose and insulin concentration after an oral glucose challenge and was also a risk factor for T2DM.24 A functional study using muscle biopsies of participants suggested that this variant is associated with impaired insulin-dependent glucose disposal.

Trans-ancestry GWAS meta-analysis

To further identify novel T2DM-associated loci and to fine map the previously reported loci, a trans-ancestry GWAS meta-analysis was performed.25 This effort included the meta-analysis of multiple ethnic-specific GWAS results obtained from Europeans, East Asians, South Asians and Mexican Americans. The sample size increased to more than 187 000, and the study successfully identified seven novel T2DM loci. Among the 52 variants that were available in all 4 ancestry groups, 34 showed directional consistency across the 4 groups. However, 3 variants showed a significant difference in the effect size among the 4 ancestry groups (Figure 1). The risk allele frequency of rs7903146 (TCF7L2) was lower in East Asians (5%) compared with Europeans (30%), and the association of this variant with T2DM did not reach genome-wide significance in East Asians. A variant in PEPD, rs3786897, was only significant in East Asians, and a variant in KLF14, rs13233731, was only significant in Europeans. In this study, the authors used the difference in linkage disequilibrium among ethnic groups to refine the association signal in previously confirmed loci. Using this trans-ethnic approach, the 99% credible sets of variants and their genomic intervals were substantially narrowed for loci including JAZF1 and SLC30A8.
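The idea of a 99% credible set can be sketched in Python as follows. This is a minimal illustration that assumes exactly one causal variant per locus and an equal prior for every variant, so that posterior probabilities are simply normalized Bayes factors; the variant names and Bayes factors are hypothetical, and the procedure used in the cited study may differ in detail.

```python
import math

def credible_set(log_bayes_factors, coverage=0.99):
    """Smallest set of variants whose posterior probabilities sum to >= coverage.

    Assumes one causal variant per locus and equal priors, so each posterior
    is the variant's Bayes factor divided by the sum of all Bayes factors.
    """
    max_log = max(log_bayes_factors.values())
    # Subtract the maximum before exponentiating for numerical stability
    scaled = {v: math.exp(lbf - max_log) for v, lbf in log_bayes_factors.items()}
    total = sum(scaled.values())
    posteriors = {v: s / total for v, s in scaled.items()}

    selected, cumulative = [], 0.0
    ranked = sorted(posteriors.items(), key=lambda item: item[1], reverse=True)
    for variant, probability in ranked:
        selected.append(variant)
        cumulative += probability
        if cumulative >= coverage:
            break
    return selected, posteriors

# Hypothetical natural-log Bayes factors for three variants at one locus
example = {"snp1": math.log(98), "snp2": math.log(1.5), "snp3": math.log(0.5)}
variants, posteriors = credible_set(example)
print(variants)
```

When linkage disequilibrium differs between ancestry groups, fewer variants retain strong support in the combined data, so the posterior mass concentrates on fewer variants and the credible set shrinks.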

Figure 1

Three variants that showed significant heterogeneity in effect size among the four ancestry groups (modified from Mahajan et al.25).

Sequencing studies

As the common variants only had a modest effect on T2DM, it has been hypothesized that rare- or low-frequency variants with a large functional effect might explain a significant proportion of the T2DM heritability. Technical advances in massive parallel sequencing, variant calling and association testing made it possible to investigate the role of these rare- or low-frequency variants in a large number of samples. As MTNR1B is a relatively small gene with only two exons, it was one of the first genes to be sequenced at a large scale. Experimentally validated loss-of-function variants of MTNR1B were significantly associated with T2DM.26 In a similar manner, rare variants of PPARG, which resulted in decreased adipocyte differentiation, were collectively associated with T2DM.27 Regarding SLC30A8, which encodes an important protein (ZnT8) involved in insulin secretion, rare protein-truncating variants were associated with a decreased risk of T2DM.28 It is currently suggested that SLC30A8 could be a novel therapeutic target for T2DM. In one study, whole-genome sequencing was performed in 2630 Icelanders and was then used as a reference in the imputation of 278 000 additional subjects.29 They found an intronic low-frequency (1.5%) variant in CCND2, which was associated with decreased risk of T2DM. In addition, nonsynonymous variants in PAM (p.Asp563Gly MAF 5% and p.Ser539Trp MAF 0.7%) and a rare (0.2%) frameshift variant in PDX1 (Gly218Alafs*12) were associated with increased risk of T2DM. In the Latino population, whole-exome sequencing was performed in 3756 participants. A rare nonsynonymous variant in HNF1A p.E508K was associated with a fivefold increased risk of T2DM (MAF in T2DM 2.1% vs controls 0.35%).30 Currently, there are ongoing large-scale international collaborative studies to investigate the overall effect of rare, functional variants in the development of T2DM.

Glycemic trait analysis

In parallel to efforts aimed at identifying variants of T2DM, several studies investigated the genetic variants associated with quantitative glycemic traits such as fasting glucose, fasting insulin, 2-h glucose after oral glucose challenge and HbA1c. One of the first GWAS of glycemic traits found that an intronic variant in G6PC2 was significantly associated with the fasting glucose concentration.31 The gene is specifically expressed in pancreatic islets and has been suggested to have a role in glucose-dependent insulin secretion. However, this variant was not associated with the risk of T2DM. Through the collaborative efforts of the Meta-Analysis of Glucose and Insulin-related traits Consortium (MAGIC), a variant in MTNR1B was found to be associated with fasting glucose, and in this case the variant was also associated with an increased risk of T2DM.32, 33, 34 The MAGIC study also identified an intronic variant of GIPR that was associated with the 2-h glucose concentration.35 In 2010, a large-scale meta-analysis of GWAS in MAGIC identified nine novel loci for fasting glucose (in/near ADCY5, MADD, ADRA2A, CRY2, FADS1, GLIS3, SLC2A2, PROX1 and C2CD4B) and one for fasting insulin (near IGF1).36 It was found that some of the fasting glucose-elevating variants were associated with increased risk of T2DM. However, this overlap was incomplete, suggesting that some variants affect the fasting glucose concentration only at the physiologic level. Using Metabochip, the MAGIC study increased their sample size to more than 133 000 and confirmed 53 glycemic trait-associated loci.37 The AGEN consortium identified three variants (in/near PDK1-RAPGEF4, KANK1 and IGF1R) that were associated with fasting glucose in East Asians.38 Finally, there were efforts to identify low-frequency or rare variants associated with glycemic traits. An exome array genotyping study in Europeans identified a low-frequency (MAF 1.5%) nonsynonymous variant in GLP1R (p.Ala316Thr), which was associated with fasting glucose, and a rare (MAF 0.1%) nonsynonymous variant in URB2 (p.Glu594Val), which was associated with fasting insulin.39 In addition, multiple nonsynonymous variants in G6PC2, which resulted in loss of function, were associated with decreased fasting glucose concentrations. A gene-based variant aggregation test also confirmed that rare nonsynonymous variants in G6PC2 were associated with the fasting glucose concentrations.40

Epigenetic research on T2DM

Introduction to epigenetics

Epigenetics is defined as ‘the study of changes in gene function mitotically and/or meiotically heritable and that do not entail a change in DNA sequence’.41 These changes are influenced by environmental factors and can modulate gene expression independent of DNA sequence. An important characteristic of epigenetic change is that it is reversible and modifiable, making it a possible therapeutic target. Epigenetic regulation is mainly controlled by DNA methylation, histone modification and microRNA. Although epigenetics adds another layer of complexity in understanding the genomic pathogenesis of T2DM, it could provide valuable information on how environmental factors contribute to T2DM at the genomic level. In addition, it might explain another significant proportion of the variance in T2DM susceptibility. The field of epigenetic research on T2DM is relatively new because of difficulties in acquiring a large number of homogeneous target tissues and technical limitations in analyzing epigenetic changes at a genome-wide scale. In this review, we will focus on DNA methylation, as it has been the most widely investigated epigenetic mechanism to date.

Evidence that epigenetic changes contribute to T2DM

There are several lines of evidence suggesting that epigenetic changes have an important role in the pathogenesis of T2DM. First, intrauterine malnutrition has been linked to T2DM. Epidemiologically, offspring of mothers who experienced famine during pregnancy in the Dutch Hunger Winter or World War II had significantly decreased birth weight and were at high risk of future T2DM.42, 43 Proposed mechanisms to explain this ‘thrifty phenotype’ include altered epigenetic modifications in pancreatic β-cells44 and decreased mitochondrial DNA content.45 These changes could be mediated by low levels of methyl donors and depletion of nucleotide pools. Second, intrauterine exposure to hyperglycemia can lead to diabetes and obesity in offspring. In a study on Pima Indian families, offspring conceived after the mother was diagnosed with T2DM were more likely to have a higher body mass index and a greater risk of T2DM compared with siblings born before their mother developed T2DM.46 Third, environmental factors can affect β-cell function by mediating epigenetic changes. In a recent study, exposure of human pancreatic islets to palmitate induced global and specific DNA methylation alterations that resulted in coordinate changes in mRNA expression and decreased insulin secretion.47

Technical advances in epigenetic research

Recent technical advances have accelerated the field of epigenetic research on T2DM. Several DNA methylation arrays are commercially available to investigate epigenome-wide associations. Illumina’s Infinium HumanMethylation450 BeadChip interrogates more than 485 000 methylation sites and covers 96% of CpG islands and additional island shores and their flanking regions.48 Another method to investigate genome-wide methylation changes is to perform bisulfite conversion and apply next generation sequencing. An unprecedented resource of information is provided by the Encyclopedia of DNA Elements (ENCODE) project and the Roadmap Epigenomics project.49, 50 In the ENCODE project, functional DNA elements were delineated using sequencing studies in a number of cell lines. In the Roadmap Epigenomics project, human tissues and cells were investigated, and reference epigenomes for 127 tissues and cell types were generated and are publicly available.
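For readers unfamiliar with array-based methylation data, the following Python sketch shows how a methylation level is commonly summarized at a single CpG site from the methylated and unmethylated probe intensities. The intensities are hypothetical, and the offset of 100 is a commonly used default that is assumed here rather than taken from the cited studies.

```python
import math

def beta_value(methylated, unmethylated, offset=100.0):
    """Fraction of methylated signal; the offset stabilizes low-intensity probes."""
    return methylated / (methylated + unmethylated + offset)

def m_value(beta):
    """Base-2 logit transform of a beta value strictly between 0 and 1."""
    return math.log2(beta / (1.0 - beta))

# Hypothetical probe intensities at one CpG site in a case and a control sample
case_beta = beta_value(methylated=4200.0, unmethylated=5700.0)
control_beta = beta_value(methylated=5400.0, unmethylated=4500.0)
print(f"case beta = {case_beta:.2f}, control beta = {control_beta:.2f}")
print(f"difference in M-value = {m_value(case_beta) - m_value(control_beta):.2f}")
```

Beta values are easy to interpret as an approximate proportion of methylated DNA, whereas the logit-transformed M-values are often preferred for statistical testing because their variance is more stable across the methylation range.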

Recent progress in epigenetic research on T2DM

One of the first epigenome-wide association studies to identify T2DM-related DNA methylation variations in peripheral blood was reported in 2011.51 DNA from a total of 1169 T2DM cases and controls was assembled in four groups of DNA pools and was investigated with a microarray-based methylation assay. The authors found that known GWAS loci were enriched with differentially methylated sites. In addition, hypomethylation of a CpG site in the FTO gene was significantly associated with the risk of T2DM. Recently, three studies independently reported that differential methylation at a CpG site in TXNIP, cg19693031, was significantly associated with T2DM.52, 53, 54 TXNIP has been shown to be involved in human skeletal muscle glucose uptake and glucotoxicity-induced β-cell apoptosis.55, 56 In the nested case–control study reported by Chambers et al., peripheral blood DNA was investigated with the HumanMethylation450 BeadChip in 2664 Indian Asians and further replication was performed in 1141 Europeans. In addition to TXNIP, the authors found that CpG sites in ABCG1, PHOSPHO1, SOCS3 and SREBF1 were significantly associated with future development of T2DM.

Because epigenetic regulation is tissue specific, it is crucial to investigate a homogeneous tissue or cell type from a T2DM target organ. In this regard, pancreatic islets are one of the most thoroughly investigated cell clusters. In 2010, Stitzel et al. performed a genome-wide analysis of epigenetic markers, namely DNase I hypersensitivity sites, histone modifications and CCCTC factor binding, in primary human pancreatic islets.57 They found over 34 000 distal regulatory elements, among which 47% were specific to islets. In addition, known T2DM genetic loci were enriched in these regulatory elements. Similarly, it has been reported that known T2DM loci were enriched in pancreatic islet enhancer clusters.58 One of the first studies to compare genome-wide methylation in human pancreatic islets between 15 T2DM cases and 34 non-diabetic controls was reported in 2014.59 Using the HumanMethylation450 BeadChip, a total of 1649 CpG sites and 853 genes, including the known T2DM loci TCF7L2, FTO and KCNQ1, were identified as differentially methylated in T2DM. In addition, 102 genes showed both differential methylation and gene expression in T2DM islets. Some of these genes affected pancreatic β-cell and α-cell functions. Regarding human skeletal muscle, it has been shown that T2DM subjects had cytosine hypermethylation of PGC-1α and decreased mitochondrial content in vastus lateralis muscle.60 In addition, acute exercise induced promoter hypomethylation and a coordinate increase in the expression of PGC-1α, PDK4 and PPAR-δ in skeletal muscle.61 In human adipose tissue, 6 months of exercise training resulted in differential DNA methylation and mRNA expression in 197 genes, including RALBP1, HDAC4 and NCOR2.62 A total of 39 genes in previously confirmed obesity or T2DM loci had CpG sites that were differentially methylated after exercise training. Furthermore, knockdown of the candidate genes Hdac4 and Ncor2 in adipocytes resulted in increased lipogenesis in vitro. These findings imply that there are tissue-specific epigenetic changes in T2DM that are at least in part co-regulated by DNA sequence variations.

Interaction between environment, genetics and epigenetics in T2DM

The pathogenesis of T2DM is thought to involve a complex interaction between genetic and environmental factors. Although there could be various mechanisms for a gene–environment interaction, we will specifically focus on epigenetic modulations. Environmental factors can exert their effect by modulating epigenetic changes. Epigenetic change can also be determined by genetic factors. There is substantial evidence for these interactions. In an interesting report by Kim et al., the authors explicitly showed how environmental factors such as obesity could change the promoter methylation state of adiponectin, a well-known adipokine that regulates insulin sensitivity.63 In the obese condition, DNA methyltransferase 1 is activated and hypermethylates a specific region of the adiponectin gene, leading to decreased expression. It should be noted that confirmed genetic loci (rs17300539 and rs266729) for the plasma adiponectin concentration reside at the CpG site of the adiponectin promoter and can either introduce or remove the CpG site according to the genotype (termed CpG-single nucleotide polymorphism (SNP)). Regarding resistin, another well-known adipokine, promoter DNA variants were associated with decreased methylation of cg02346997, which is also located at the resistin promoter.64 The methylation state of cg02346997 was associated with a significantly decreased plasma resistin concentration. These findings highlight the complex interaction between the environment, genetics and epigenetics. The model for these interactions is depicted in Figure 2. Epigenetics can modulate tissue-specific gene expression through DNA methylation, histone modification and microRNA. These epigenetic regulations can be influenced by environmental factors such as age, obesity, physical activity and diet. Furthermore, DNA sequence variations such as CpG-SNP, structural variations and gene–gene interactions can also modulate epigenetic regulation.
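The notion of a CpG-SNP, in which one allele creates a CpG dinucleotide and the other abolishes it, can be made concrete with a short Python sketch. The flanking bases and alleles below are hypothetical and do not represent the actual sequence context of rs17300539 or rs266729; the function names are likewise illustrative.

```python
def forms_cpg(prev_base, allele, next_base):
    """True if the allele completes a CG dinucleotide with either neighbor."""
    prev_base, allele, next_base = prev_base.upper(), allele.upper(), next_base.upper()
    return (allele == "C" and next_base == "G") or (allele == "G" and prev_base == "C")

def classify_cpg_snp(prev_base, ref, alt, next_base):
    """Describe how the alternative allele changes CpG status at the SNP."""
    ref_cpg = forms_cpg(prev_base, ref, next_base)
    alt_cpg = forms_cpg(prev_base, alt, next_base)
    if ref_cpg and not alt_cpg:
        return "alt allele removes CpG"
    if alt_cpg and not ref_cpg:
        return "alt allele introduces CpG"
    return "no change in CpG status"

# Hypothetical context 5'-C[G/A]T-3': the reference G pairs with the preceding C
print(classify_cpg_snp("C", "G", "A", "T"))
# Hypothetical context 5'-A[T/C]G-3': the alternative C pairs with the following G
print(classify_cpg_snp("A", "T", "C", "G"))
```

Checking one strand is sufficient because the CG dinucleotide is its own reverse complement. A variant that removes the CpG eliminates a potential methylation target, which is one way in which genotype can directly constrain the local methylation state.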

Figure 2

The complex interaction between various environmental factors, genetics and epigenetics.

Summary and future directions

There has been great success in identifying genetic variants associated with T2DM using large-scale GWAS and their meta-analyses. So far, at least 75 T2DM-associated common genetic variants have been identified. However, the detailed mechanisms by which these variants exert their effects on the pathogenesis of T2DM remain largely unknown. Although international collaborative studies on whole-exome and whole-genome sequencing are expected to be reported soon, it remains unknown to what extent the heritability of T2DM could be explained by low-frequency or rare functional genetic variants. Once we have detailed information on both rare and common genetic variations and their effects on T2DM, it will become necessary to shift our focus to another layer of genomics that could encompass environmental factors, such as epigenetics. As the technology to investigate genome-wide epigenetic changes is expected to advance further, it will be important to collect high-quality tissues or cells that have an important role in the pathogenesis of T2DM. Future research should focus on the interaction between various environmental factors, genetics and epigenetics. Through these efforts, we hope that detailed information on this multilayered genomic regulation will provide a foundation for precision medicine and improve outcomes in the prevention and treatment of T2DM.