Genetic variants including markers from the exome chip and metabolite traits of type 2 diabetes

Jäger, Susanne; Wahl, Simone; Kröger, Janine; Sharma, Sapna; Hoffmann, Per; Floegel, Anna; Pischon, Tobias; Prehn, Cornelia; Adamski, Jerzy; Müller-Nurasyid, Martina; Waldenberger, Melanie; Strauch, Konstantin; Peters, Annette; Gieger, Christian; Suhre, Karsten; Grallert, Harald; Boeing, Heiner; Schulze, Matthias B.; Meidtner, Karina

doi:10.1038/s41598-017-06158-3

Download PDF

Article
Open access
Published: 20 July 2017

Genetic variants including markers from the exome chip and metabolite traits of type 2 diabetes

Susanne Jäger^1,2,
Simone Wahl^2,3,4,
Janine Kröger^1,2,
Sapna Sharma^2,3,4,
Per Hoffmann ORCID: orcid.org/0000-0002-6573-983X^5,6,7,
Anna Floegel⁸,
Tobias Pischon^9,10,11,
Cornelia Prehn ORCID: orcid.org/0000-0002-1274-4715¹²,
Jerzy Adamski ORCID: orcid.org/0000-0001-9259-0199^2,12,13,
Martina Müller-Nurasyid^14,15,16,
Melanie Waldenberger^3,4,
Konstantin Strauch^14,17,
Annette Peters^2,3,16,
Christian Gieger^3,4,
Karsten Suhre ORCID: orcid.org/0000-0001-9638-3912¹⁸,
Harald Grallert^2,3,4,
Heiner Boeing⁸,
Matthias B. Schulze^1,2 &
…
Karina Meidtner ORCID: orcid.org/0000-0001-5810-4062^1,2

Scientific Reports volume 7, Article number: 6037 (2017) Cite this article

2711 Accesses
11 Citations
4 Altmetric
Metrics details

Subjects

Abstract

Diabetes-associated metabolites may aid the identification of new risk variants for type 2 diabetes. Using targeted metabolomics within a subsample of the German EPIC-Potsdam study (n = 2500), we tested previously published SNPs for their association with diabetes-associated metabolites and conducted an additional exploratory analysis using data from the exome chip including replication within 2,692 individuals from the German KORA F4 study. We identified a total of 16 loci associated with diabetes-related metabolite traits, including one novel association between rs499974 (MOGAT2) and a diacyl-phosphatidylcholine ratio (PC aa C40:5/PC aa C38:5). Gene-based tests on all exome chip variants revealed associations between GFRAL and PC aa C42:1/PC aa C42:0, BIN1 and SM (OH) C22:2/SM C18:0 and TFRC and SM (OH) C22:2/SM C16:1). Selecting variants for gene-based tests based on functional annotation identified one additional association between OR51Q1 and hexoses. Among single genetic variants consistently associated with diabetes-related metabolites, two (rs174550 (FADS1), rs3204953 (REV3L)) were significantly associated with type 2 diabetes in large-scale meta-analysis for type 2 diabetes. In conclusion, we identified a novel metabolite locus in single variant analyses and four genes within gene-based tests and confirmed two previously known mGWAS loci which might be relevant for the risk of type 2 diabetes.

Whole-Genome Sequencing Analysis of Human Metabolome in Multi-Ethnic Populations

Article Open access 30 May 2023

Heritability estimates for 361 blood metabolites across 40 genome-wide association studies

Article Open access 07 January 2020

Whole Genome Association Study of the Plasma Metabolome Identifies Metabolites Linked to Cardiometabolic Disease in Black Individuals

Article Open access 22 August 2022

Introduction

Up to now numerous common risk variants (MAF ≥ 5%) for type 2 diabetes were identified by genome-wide association studies (GWAS)^{1, 2}. However, they only explain a small proportion of the population variance^{1, 2}. To address this missing heritability, current studies concentrate on rare variants with potentially strong effects. One study identified four low frequency and rare variants which were associated with risk of type 2 diabetes³. Furthermore, the CHARGE consortium used exome chip variants from 23 different studies and found a novel association of a low-frequency nonsynonymous variant in the GLP1R with type 2 diabetes by an initial exploratory analysis of fasting glucose and insulin and a subsequent analysis of significant results for association with type 2 diabetes⁴. Similarly, metabolites associated with the risk of type 2 diabetes may aid the identification of new type 2 diabetes risk variants⁵. GWAS of metabolomics datasets (mGWAS) identified several so-called genetically influenced metabotypes (GIMs)^6,7,8,9,10 that show large genetic effect sizes explaining 10–20% of the observed variance in the metabolite traits¹¹. In the current study population (EPIC-Potsdam), 14 single metabolites have been identified to be independently associated with risk of type 2 diabetes¹². Two metabolite factors (principal components) largely explained the variance in metabolite traits and were associated with risk of type 2 diabetes in opposing directions¹². Within this study, we aim to identify genetic variants which are associated with those previously identified diabetes-related metabolite traits¹² and test for their association with type 2 diabetes risk. To do so, we firstly, investigated selected known mGWAS loci^{7, 8} for their association with diabetes-related metabolite traits (Fig. 1). Secondly, we conducted an exploratory analysis using HumanExome v1.1 Bead Array data to additionally explore the association of low-frequency (1% < MAF < 5%) and rare genetic variants (MAF < 1%) which were not captured by classical GWAS arrays (Fig. 1). Finally, we analyzed the associations between single genetic variants identified for being associated with diabetes-associated metabolite traits and risk of type 2 diabetes.

Results

All primary analyses have been carried out within the EPIC-Potsdam study with up to 2932 participants within the diabetes analyses (Fig. 1). Results of exploratory analyses were replicated in the KORA F4 study with up to 2692 participants (Fig. 1). Baseline characteristics of both study populations can be found in Supplementary Table S1.

Eleven known mGWAS loci were associated with diabetes-related metabolite traits

We analyzed the association of 19 known mGWAS loci with 106 diabetes-associated metabolite traits (Fig. 1). These metabolite traits included single metabolites (n = 14), metabolite ratios defined on the basis of the observed metabolite network (n = 90) and two metabolite factors derived from PCA of diabetes-associated metabolites¹².

Eleven of the 19 candidate SNPs from previous studies were associated with diabetes-related metabolite traits (P < 4.31 × 10⁻⁵) in EPIC-Potsdam (Supplementary Table S2). The SNP rs174547 (FADS1) was significantly associated with 36 single metabolite traits which primarily originate from the groups of acyl-alkyl-phosphatidylcholines and diacyl-phosphatidylcholines. Further statistically significant associations were found for rs9393903 (ELOVL2), rs7156144 (PLEKHH1), rs12641551 (ACSL1), rs272893 (SLC22A4/ OCTN1), rs603424 (SCD) and phosphatidylcholine ratios. Rs715 (CPS1), rs541503 (PHGDH) and rs1718306 (PAH) were associated with amino acids. Rs11158519 (SYNE2), rs364585 (SPTLC3) and rs603424 (SCD) were related to ratios of sphingomyelins. Previous reports for genotype-by-sex-interactions of rs715 (CPS1) indicating larger effects in women than in men were confirmed in EPIC-Potsdam (Supplementary Table S3)

Exploratory analysis using exome chip data identified one novel locus for diabetes-associated metabolite traits

We carried out an exploratory analysis of all common (MAF ≥ 5%), low-frequency (1% ≤ MAF < 5%) and rare variants (MAF < 1%) with an allele frequency above 0.5% (n ≈ 42.000) for association with 106 diabetes-related metabolite traits (Fig. 1). The analysis revealed twelve loci that were associated with one or multiple metabolite traits (Supplementary Table S4). As the exome chip includes previously published GWAS hits, some of our results from the exploratory analysis overlap with findings from our candidate gene studies by representing the same locus (FADS1, CPS1, SGPP1/SYNE2) or the same SNP (rs12641551(ACSL1), rs364585 (SPTLC3)). For overlapping results between candidate and exploratory analyses we calculated models including all associated SNPs (from candidate and exploratory analysis) at each locus for the top associated metabolite trait to identify independent signals (Supplementary Table S5). The exome chip variant rs174550 (FADS1) showed stronger associations with the PC aa C36:3/PC aa C36:4 ratio than the previously known mGWAS SNP rs174547 on the FADS1 locus (Supplementary Table S5). An intergenic exome chip variant on chromosome 2 (rs4672596) near the previously identified CPS1 locus was associated with levels of glycine and the ratio of glycine and serine (Supplementary Table S4). However, the previously identified mGWAS SNP rs715 showed stronger associations in analyses on glycine levels (Supplementary Table S5). One low-frequency variant from the exome chip (rs12881815 located within SYNE2 with MAF = 4.8%) showed significant associations with two sphingomyelin ratios (SM C16:1/PC aa C28:1 and SM (OH) C22:2/SM (OH) C14:1) (Supplementary Table S4). However, in analyses including all three variants on SYNE2 (Supplementary Table S5) with the SM C16:1/PC aa C28:1 ratio, the exome chip variant rs7157785 (SGPP1) showed stronger effects than rs12881815 (SYNE2) and stronger effects than the previously identified mGWAS SNP (rs11158519 (SYNE2)).

We aimed to replicate suggestive significant findings from EPIC-Potsdam (P < 1.64 × 10⁻⁷) in the KORA F4 study. Meta-analysis of results from both studies revealed six significant loci (P < 4.55 × 10⁻³) associated with one or multiple metabolite traits (Table 1). Among them was one novel locus, rs499974 (MOGAT2), that was associated with the ratio of PC aa C40:5/PC aa C38:5 (P = 6.88 × 10⁻¹⁵). The most deleterious variants identified within this study were located in APOE (CADD = 30.0) and REV3L (CADD = 32.0).

Table 1 Exome chip variants associated with metabolite traits at suggestive significance in EPIC-Potsdam and replication in KORA F4.

Full size table

As previous studies observed that SNP-metabolite associations might be different between women and men, we analyzed SNP-by sex interactions (Fig. 1). Although two variants indicated significant interaction with sex in EPIC-Potsdam, these findings could not be confirmed in KORA F4 (Supplementary Table S6).

The relation of the identified SNPs with the metabolic network structures is depicted in Fig. 2.

Biologic pathway annotations

Not all variants could be linked to specific pathways within the KEGG database (Supplementary Table S7). However, with regard to the annotated gene functions plausible metabolic pathways were found for most variants. For example, FADS1 and FADS2 were linked to “Biosynthesis of unsaturated fatty acids” and FADS2 was associated with “alpha-Linolenic acid metabolism” and “PPAR signalling pathway”. MOGAT2 was connected to “Glycerolipid metabolism” and “Fat digestion and absorption”.

Gene-based analyses identified four genes associated with metabolite traits

Single SNP analyses of low-frequency and rare variants are usually considered underpowered in studies of moderate sample size like ours¹³. Therefore, we conducted two gene-based analyses to enhance the power of our study (Fig. 1). We identified ten genes associated with metabolite traits in SKAT or burden tests (Supplementary Table S8). However, only three genes (GFRAL, BIN1, TFRC) showed nominal significant associations (P < 0.05) with metabolite traits within KORA F4 (Table 2). The association for GFRAL included three low-frequency variants (rs147652095, rs115053739, rs146300118; all with MAF = 1.8%) while BIN1 and TFRC included only common variants (Supplementary Table S9). In a second step, we considered functional annotation and restricted the gene-based analysis to those exome chip variants which are predicted to be damaging (see Methods section). This analysis revealed six different genes in EPIC-Potsdam (Supplementary Table S10) of which, only one (OR51Q1) including two low-frequency variants (rs151161477 with MAF = 1.21% and rs58283839 with MAF = 1.57%) (Supplementary Table S11) was replicated in within KORA F4 (Table 2). By excluding the top variants from the gene-based analyses (see Methods section) we found that multiple low-frequency variants within GFRAL are involved within the association, while in OR51Q1, the association is driven by the top low-frequency variant (rs58283839) (Supplementary Table S12). However, none of the single SNPs within the genes was significantly associated with metabolite traits individually (Supplementary Tables S9 and S11).

Table 2 Gene-based association with metabolite traits using SKAT and burden test in EPIC-Potsdam and replicated in KORA F4.

Full size table

Identified genetic variants were weakly associated with risk of type 2 diabetes and other traits

Among the genes identified to be associated with diabetes-related metabolite traits by gene-based analyses, none was associated with type 2 diabetes in gene-based analyses (Supplementary Table S13).

Based on the single SNP analyses, 16 independent loci could be identified from the candidate SNP analyses and exploratory analyses (Fig. 1).

None of the 16 SNPs associated with diabetes-related metabolic traits (predicted functions are summarized in Supplementary Table S14) was additionally associated with type 2 diabetes within the EPIC-Potsdam study (Table 3). Nevertheless, nine of them (rs541503 (PHGDH), rs715 (CPS1), rs272893 (SLC22A4/OCTH1), rs9393903 (ELOVL2), rs3204953 (REV3L), rs603424 (SCD), rs174550 (FADS1), rs499974 (MOGAT2), rs7157785 (SGPP1)) showed nominally significant association with diabetes in published data from bigger GWAS consortia for type 2 diabetes (up to 110.452 individuals) or exome array analyses (up to 75.670 individuals) of which two (rs174550 (FADS1) and rs3204953 (REV3L)) were significant after correction for multiple testing by FDR (Table 3). For only four out of the nine variants (rs541503 (PHGDH), rs715 (CPS1), rs603424 (SCD), rs499974 (MOGAT2)) the expected effect was directional consistent with the one which was actually observed within the DIAGRAM consortium or the GoT2D consortium^{2, 14}, resulting in a directional consistency of 44.4% (P = 1) in the binomial test. In addition to the association with type 2 diabetes, one variant (rs174550 (FADS1)) was significantly associated with fasting glucose based on a large-scale meta-analysis of the MAGIC (n = 58.074)¹⁵ and two (rs715 (CPS1), rs1136001 (NTAN1)) were significantly associated with BMI within large-scale GWAS meta-analysis of the GIANT consortium (n = 253.288)¹⁶ (Table 3).

Table 3 HRs (95% CI) for type 2 diabetes of genetic variants in EPIC-Potsdam and look-up in other consortia for type 2 diabetes, fasting glucose and BMI.

Full size table

Discussion

Within this study, we investigated the contribution of genetic variants on diabetes-related metabolic traits to get a deeper insight into the underlying biological processes and to identify novel risk variants for this polygenic disease. We tested findings from previous GWAS and found associations with diabetes-associated metabolites for variants located in FADS1, ELOVL2, PLEKHH1, SPTLC3, ACSL1, SCD, SLC22A4, OCTN1, PHGDH ⁷; CPS1 and PAH ⁸. Furthermore, our exploratory analysis identified additional genetic variants from the exome chip, which were located in REV3L, PNLIPRP2, NTAN1, APOE, MOGAT2, and SGPP1 and of which one locus (MOGAT2) was novel for metabolite traits.

Using diabetes-associated metabolites as intermediate phenotypes provides the possibility to investigate biomarkers which are more proximal to genes and biological pathways than complex disease endpoints, ensuring higher statistical power to detect genetic associations⁶.

To deal with multiple testing issues, we selected a reduced panel of metabolite traits using their already published network structure¹⁷. This approach further reduces the number of outcomes to be tested compared to studies which used all possible ratios - independent of their correlation structure. Although varying heritability estimates have been found for the analyzed diabetes-associated metabolites traits ranging from low (h² = 3%) for phenylalanine¹⁸ to moderately to high (33% ≤ h² ≤ 59%)¹⁸ or even higher e.g. glycine (h² = 70%)¹⁹, individual metabolic profiles are long-term conserved²⁰, supporting their suitability for studying pathways underlying disease development.

Despite the smaller sample size in our study compared to one meta-analysis which analyzed Biocrates metabolites in 7,478 individuals⁹, we identified one novel locus rs499974 (MOGAT2) which was related to a ratio between two diacyl-phosphatidylcholines (PC aa C40:5 and PC aa C38:5). This variant was previously mainly shown to be associated with lipid traits (HDL-c, total cholesterol, triglycerides)²¹. Probably due to the limited power of our analysis, none of the single SNP findings could be related to incident type 2 diabetes in EPIC-Potsdam. Nevertheless, two SNPs (FADS1, REV3L) which were associated with metabolite traits were significant in bigger consortia for type 2 diabetes, such as DIAGRAM² or GoT2D¹⁴. Furthermore, according to data from the MAGIC consortium¹⁵, FADS1 was related to fasting glucose. In addition, we show that the rs174550 (FADS1) is stronger related to metabolite traits than the previously known mGWAS hit (rs174547; LD with rs174550: r² = 1; D’ = 1) and remained significantly associated with a ratio of phosphatidylcholines in analyses containing both SNPs. On a functional level, we cannot consider one or another as being more relevant for diabetes risk as the CADD scores are low for both variants (rs174550: CADD = 3.57; rs174547: CADD = 6.23). A recent study identified a functional multiallelic variant (rs174557; LD with rs174550: r² = 0.778; D′ = 1) located in an AluYe5 element in intron 1 of FADS1 which affects two transcription factor binding sites with opposing effects on FADS1 expression²². This newly described variant may represent the causal variant at the FADS1 locus. The FADS1-2 locus is linked to fatty acid metabolism and biosynthesis of unsaturated fatty acids. Fatty acids itself as well as the fatty acid desaturases which catalyse the desaturation of fatty acids are known to be associated with type 2 diabetes risk²³. Variants within FADS1 and FADS2 encoding the fatty acid Δ5 desaturase and Δ6 desaturase, respectively, are located in a gene family (FADS1-2-3) and belong to the most frequently identified mGWAS hits in previous studies showing associations with levels of phosphatidylcholines and sphingolipids^{6, 7, 24}. In EPIC-Potsdam, the minor allele of the variant rs174546 within FADS1 (LD with present FADS1 variant rs174550: r² = 1; D = 1) was associated with lower Δ5 desaturase and Δ6 desaturase activity²³. While a genetically-determined higher Δ5 desaturase activity tended to be associated with lower type 2 diabetes risk after adjustment for the other estimated desaturase activity, a genetically-determined high Δ6 desaturase activity was significantly associated with higher risk²³. As this gene cluster is characterized by high LD, it is difficult to assign effects of single genetic variants located at this locus to one or another desaturase activity. Therefore, observed associations for FADS1 in the present study might be attenuated by confounding by LD of FADS2 and vice versa.

The second finding was a missense variant REV3L (rs3204953, encoding p. Val3064Ile) belonging to the 0.1% most deleterious variants within the human genome (CADD = 32.0)²⁵ which was associated with a ratio between tyrosine and methionine and showed significant association with type 2 diabetes in databases after correction for multiple testing. While this variant is common among Europeans (MAF = 17%), it is rather rare in other populations: Africans (MAF = 0.005), East Asians (MAF = 0.001)²⁶.

According to the PhenoScanner database²⁷, rs3204953 was associated with alanine to tyrosine ratio (p = 2.98 × 10⁻¹¹)²⁸, schizophrenia (0.07 for the effect allele (T); p = 2.14 × 10⁻⁵)²⁹ and age at menopause (beta = 0.15 for the effect allele (T); p = 1.10 × 10⁻⁷)³⁰ in previous GWAS. The latter trait is itself related to type 2 diabetes³¹. On the one hand, factors such as sex-hormones are discussed in this context³², on the other hand, early menopause might represent a marker for premature aging³¹ as it has been related to DNA damage repair³⁰. In this context, REV3L encoding the REV3L (REV3 Like, DNA Directed Polymerase Zeta Catalytic Subunit), a specialized DNA polymerase³³, was linked to the Fanconi anemia pathway in the KEGG database representing an essential pathway for the DNA repair of interstrand cross-links³⁴. More specifically, this enzyme is involved in the translesion DNA synthesis, one essential step within the Fanconi anemia pathway, which represents one of the cellular mechanisms for DNA damage tolerance or post-replication repair³⁴. For example, Singh et al. could show that the human REV3L maintains the integrity of the mitochondrial genome by affecting the mtDNA metabolism and that inactivation of Rev3 leads to mitochondrial dysfunction³⁵. There is quite some evidence that DNA damage induced by oxidative stress is relevant in type 2 diabetes and might additionally represent a mechanistic link with cancer³⁶; however, whether genetic markers in REV3L might play a role in this context needs to be studied.

We applied gene-based tests which have been shown to be better powered than single variant analyses¹³ and a decreased number of tests is carried out by grouping variants based on genes. In these analyses, we identified associations between metabolite outcomes and four genes (GFRAL, BIN1, TFRC and OR51Q1). However, none of the identified genes was linked to type 2 diabetes; neither in EPIC-Potsdam nor in previous studies. Although we found associations with the genes and metabolite traits, it is striking that none of the single variants located in the genes showed any association with the metabolites of interest which further impedes interpretation. Another issue is that compared to our single SNP findings, replication of the gene-based results in KORA F4 was less successful and some of the replicated genes included only common variants. Although, we checked the allele frequencies between both cohorts; still, some minor differences might have affected the gene-based results.

Even though our study identified some interesting candidates which were linked to relevant pathways of type 2 diabetes, all identified SNPs in the single SNP analysis were common variants and no low-frequency or rare variants were identified, which is plausible due to the limited power of our analysis.

In general, SNPs can be used to assess the causality of observed associations between biomarkers and complex traits such as type 2 diabetes by Mendelian Randomisation studies. However, Mendelian randomisation studies rely on a number of assumptions of which one is that “the genotype is associated with the outcome through the studied exposure only”³⁷ or simply said that there are no pleiotropic effects of the SNP. However, many of the variants in our study, e.g. FADS1, SPTLC3 and APOE were associated with different metabolite traits in addition to other biomarkers of lipid metabolism. The observational design of our study does not allow to discriminate if those different effects are all in a causal chain or related via independent physiological mechanisms. Therefore, we did not investigate causal SNP effects of the analysed genetic variants and type 2 diabetes by Mendelian Randomization. However, if we compare our observations to the assumptions of Mendelian randomization, we have to acknowledge that our observations would be more in line with a non-causal relationship between specific metabolites such as certain ratios of diacyl-phosphatidylcholines (PC aa C36:3/PC aa C36:4 or PC aa C38:3/PC aa C38:4) and tyrosine to methionine ratio and type 2 diabetes, since expected and observed effect direction were inconsistent for SNPs in FADS1 and REV3L. Similarly, a recent study⁵ found inconsistent effect directions for two of four nominal significant variants in DIAGRAM data¹ using intermediate metabolite traits to determine an expected direction of the SNP-diabetes association. However, with regard to FADS1, we have only tested the most significant metabolite ratios for their association with type 2 diabetes. Hence, other significant ratios might show diabetes associations with directional consistency.

In summary, we identified a novel metabolite locus (MOGAT2) in single variant analyses and four genes (GFRAL, BIN1, TFRC, OR51Q1) within gene-based tests and could show that two previously known mGWAS loci (FADS1 and REV3L) might be relevant for the risk of type 2 diabetes. Our findings do not clearly support the idea that specific diabetes-associated metabolite ratios (PC aa C36:3/PC aa C36:4, PC aa C38:3/PC aa C38:4 and tyrosine/methionine) may be causal traits for type 2 diabetes, but rather show that shared genetic influences on diabetes-related metabolite traits and type 2 diabetes itself exist.

Materials and Methods

Study population

EPIC-Potsdam study

The European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam study consists of 27,548 participants recruited between 1994 and 1998 from the general population in Potsdam and surroundings³⁸. The baseline examination involved a personal interview including questions on prevalent diseases, a self-administered questionnaire on socio-economic and lifestyle characteristics, interviewer-conducted anthropometric measurements and a blood sample collection³⁹. We used a prospective case-cohort nested within the EPIC-Potsdam study, described in detail previously⁴⁰. Briefly, a sub-cohort of 2,500 individuals was randomly selected from 26,444 participants who provided blood samples at baseline. Additionally, 849 incident type 2 diabetes cases were identified in the full cohort of whom 820 cases provided blood samples (Supplementary Fig. S1).

Participants with missing or implausible data on metabolite measurements (described in detail within the metabolite measurements section), prevalent diabetes, or participants with uncompleted follow-up were excluded, leaving 2283 individuals for analyses in the sub-cohort. Similar exclusion criteria were applied for type 2 diabetes cases, leaving 784 incident cases for analysis. Because the sub-cohort is representative of the full cohort at baseline, the sub-cohort included 73 individuals who developed incident type 2 diabetes during follow-up. Exclusions due to missing genotype data or outlying metabolite values were done separately in each analysis.

Cooperative Health Research in the Region of Augsburg study (KORA F4)

KORA (Cooperative Health Research in the Augsburg Region) is a research platform of independent population-based health surveys and subsequent follow-up examinations of community-dwelling adults living in the region of Augsburg in Southern Germany. Study design, sampling method and data collection have been described in detail elsewhere⁴¹. The KORA S4 survey (1999 to 2001) comprised 4,261 participants, 25 to 74 years old⁴². Of these, 3,080 subjects participated in the follow-up examination KORA F4 (2006 to 2008)⁴³. The present study is based on a subsample of 2,818 participants of KORA F4 with Biocrates metabolomics data available.

Ethics statement

All participants provided written informed consent. The EPIC-Potsdam study was approved by the ethics committee of the State of Brandenburg, Germany and the KORA study was approved by the ethics committee of the Bavarian medical association, Germany. All procedures were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

DNA-extraction, genotyping and quality control

EPIC-Potsdam

The DNA has been extracted from buffy coats using the chemagic DNA Buffy Coat Kit special on a Chemagic Magnetic Separation Module I (PerkinElmer Chemagen technologies, Baesweiler, Germany) according to the manufacturer’s manual.

The selection of candidate SNPs (Supplementary Table S15) was based on previous studies available from the literature which were conducted within the KORA F4 study^{7, 8}. Illig et al.⁷ reported 15 GWAS hits for metabolite traits which showed associations with the 14 diabetes-associated metabolite traits from EPIC-Potsdam. However, one finding from Illig et al.⁷ (rs7094971; SLC16A9) which was exclusively associated with phenylalanine levels and showed only a small explained variance in the linear model of 0.10 was not selected for our analysis. As Shin et al.⁸ found a stronger candidate for glycine levels, the hit reported by Illig et al.⁷ (rs2216405; CPS1) was replaced by the SNP rs715 (CPS1). Additional five candidates for glycine (n = 4) and phenylalanine (n = 1) levels were selected from Shin et al.⁸ based on a predefined significance threshold of <10⁻⁶. If more than one SNP was associated on a specific locus, the one showing the lowest p-value while considering LD structures between the variants was selected. By doing so we ended up with 19 candidate SNPs for the present analysis. If neither the respective variant nor a proxy SNP was present on the exome chip, the variants were genotyped within EPIC-Potsdam by KBioscience¸ Teddington, UK (http://www.kbioscience.co.uk) using KASP SNP genotyping system.

The Illumina HumanExome v1.1 Bead Array⁴⁴ was used for genotyping of type 2 diabetes cases and the sub-cohort. Genotyping was performed in the Life and Brain Center in Bonn, Germany. Genotype calling and quality control (QC) were carried out using Illuminas GenomeStudio v2011.1 software suite. To improve genotype calling, all single nucleotide variants were re-clustered based on a cluster file from genotyping of 23000 women in the Women’s Genome Health Study⁴⁵ genotyped with the Illumina HumanExome v1.1 Bead Array. This cluster file was generated according to the CHARGE Best Practices and Joint Calling Protocol, which was also used to derive the final dataset⁴⁶. To improve the genotype calling for rare variants zCall with a threshold of 8 was applied⁴⁷. Individuals with low call rate, discordant sex information (F value between 0.2 and 0.8), related or duplicated individuals (IBD > 0.185) and individuals with divergent ancestry were excluded from further analysis.

KORA F4

DNA was extracted from 9 ml EDTA-blood as described elsewhere⁴⁸. Genotyping was done using the Illumina HumanExome v1.0 Bead Array, calling was carried out according to the CHARGE Best Practices and Joint Calling Protocol⁴⁶.

Metabolite measurements in EPIC-Potsdam and KORA F4

Metabolite quantification of both studies, EPIC-Potsdam and KORA F4, was performed in the Genome Analysis Center at the Helmholtz Zentrum München. All samples have been measured using the AbsoluteIDQ ^TM p150 Kit (Biocrates Life Sciences AG, Innsbruck, Austria) in combination with flow injection analysis-tandem mass spectrometry (FIA-MS/MS). Serum samples of 10 µL serum were used to quantify 163 simultaneously, including 41 acylcarnitines (Cx:y), 14 amino acids, hexose (sum of six-carbon monosaccharides without distinction of isomers), 92 glycerophospholipids (lyso-, diacyl-, and acyl-alkylphosphatidylcholines), and 15 sphingomyelins. The method for sample preparation and measurement as well as the metabolite denomination was previously described in full detail⁴⁹. All samples of EPIC-Potsdam have been processed in one batch⁵⁰. The KORA F4 samples have been measured in three batches of approximately 1000 samples at three different time points with a recalibration of the equipment in between^{7, 51}; therefore, all analyses calculated in KORA were further adjusted for batch effects. Metabolites showed an overall good reliability over a 4-month period, with a median intraclass correlation coefficient of 0.57⁵⁰. To ensure valid measurements, metabolites below the limit of detection (n = 30) and those with very high analytical variance (n = 6) were excluded, resulting in 127 metabolites⁵⁰. For the present analysis 34 diabetes-associated metabolites and 2 metabolic factors were selected¹². Previous studies could show that the usage of metabolite ratios compared to single metabolites provides higher statistical power to detect significant associations with genetic variants⁵². Therefore, we selected all metabolite ratios out of the 34 diabetes-associated metabolites which were connected via one edge¹⁷ based on a metabolite network within EPIC-Potsdam build by Gaussian graphical modeling (Fig. 1)⁵³. We included all single metabolites and 2 metabolites factors which were independently associated with type 2 diabetes (Fig. 1)¹². Therefore, our analysis (Fig. 1) included 106 outcomes referred to as diabetes-related metabolite traits consisting of 90 metabolite ratios, 14 single metabolites and 2 metabolite factors = (list of metabolite traits in Supplementary Table S16).

Assessment of type 2 diabetes in EPIC-Potsdam

Systematic information sources for incident cases were self-reports of a type 2 diabetes diagnosis, type 2 diabetes-relevant medication, and dietary treatment due to type 2 diabetes during follow-up. Furthermore, we obtained additional information from death certificates or from random sources, such as the tumor centers, physicians, or clinics that provided assessments from other diagnoses. Although self-reports of type 2 diabetes were generally reliable, by including other sources of information, we even improved the completeness of case ascertainment. Once a participant was identified as a potential case, disease status was further verified by sending a standard inquiry form to the treating physician. Only physician- verified cases with a diagnosis of type 2 diabetes (International Classification of Diseases, 10^th revision code: E11) and a diagnosis date after the baseline examination were considered confirmed incident cases of type 2 diabetes.

Statistical analysis

All data analyses were performed by using the software packages Statistical Analysis System (SAS) Enterprise Guide 6.1 with SAS version 9.4 (SAS Institute Inc., Cary, NC, USA), PLINK v1.07⁵⁴ and R (version 3.1.0 (2014-04-10)).

Candidate analysis of known mGWAS hits

We selected 19 candidate SNPs that showed associations with diabetes associated metabolites in previously published mGWAS of metabolic traits (Fig. 1). Associations of the selected candidate SNPs (list in Supplementary Table S15)^{7, 8} with metabolite traits were investigated within the sub-cohort via linear regression models (additive genetic model; per copy of the minor allele) adjusted for age and sex using SAS version 9.4. Metabolite traits (except metabolite factors) were natural log-transformed to normalize the right-skewed distributions. After ln-transformation metabolite outliers of >4SD from the mean were excluded and metabolite traits were standardized (mean = 0; SD = 1). Correction for multiple testing was done by the Bonferroni method [significance threshold: 0.05/(19 × 61 independent metabolite traits) = 4.31 × 10⁻⁵]. The effective number of independent metabolite traits of 61 out of 106 was determined by using equation 5 of Li and Ji 2005⁵⁵ which is included in the matSpD R script available onlin^{56, 57}.

Exploratory analysis

Exploratory single variant association analysis was performed in PLINK v1.07⁵⁴ and included all exome chip variants with MAF ≥ 0.005 (n~42,000 markers) as exposure and 106 metabolite traits defined above as outcome (Fig. 1). We considered a genomic control (GC) corrected p-value as exome chip-wide significant at P < 1.95 × 10⁻⁸ [ = 0.05/(42,000 polymorphic variants × 61 independent metabolite traits)]⁵⁵. Suggestive significance threshold was defined as P < 1.64 × 10⁻⁷ [ = (1.00 × 10⁻⁵/61)]. Transformation of metabolite traits and outlier removal was identical to the replication analysis of previously identified mGWAS hits. We assumed an additive genetic model (per copy of the allele on the forward strand), adjusted for age at recruitment and sex. Identified SNPs with at least suggestive significance (P < 1.64 × 10⁻⁷) were replicated within the KORA study by applying the same linear models which were further adjusted for batch effects. Results for single variant analysis from EPIC-Potsdam and KORA F4 were pooled by meta-analysis using a random effects model with the R package meta ⁵⁸ and significance was defined by correcting for eleven test (P < 4.55 × 10⁻³). Analyses for the top associated metabolite traits were performed to check for independent signals at each locus [Y = β1 SNP ₁ + β2 SNP ₂ (+β3 SNP ₃) + β4 age + β5 sex + n]. Independently associated SNPs were selected for the diabetes analysis.

Additionally, we performed an exploratory interaction analysis for sex by modeling linear regression models adjusted for age and sex and including an interaction term (Y = β1 SNP + β2 age + β3 sex + β4 (sex × SNP) + n). Interaction terms with P < 1.64 × 10⁻⁷ were considered as suggestively significant and attempted to replicate in KORA F4. Gene-based analyses were performed with the R package SKAT ⁵⁹. We applied the sequence kernel association test (SKAT-C) and burden-C test with the default options using “SKAT_CommonRare” function⁶⁰ to test for associations between common and rare genetic variants in a gene and 106 metabolite traits (Fig. 1). Models were adjusted for age and sex. Correction for multiple testing was done with the Bonferroni method [suggestive significance threshold: 0.05/(number of genes with >1 variants (ranging from 7243 to 7332)) = 6.8 × 10⁻⁶ to 6.9 × 10⁻⁶]. Additional analyses considering only variants annotated as potentially damaging (based on the annotation list from the CHARGE consortium: column “sc_damaging”) were conducted. Genes showing suggestive significance in EPIC-Potsdam were replicated in KORA F4 considering p < 0.05 as replicated. Within genes including more than two variants (GFRAL, TFRC and OR51Q1), we removed the top variant(s) showing the lowest p-values in pooled single variant analyses from the gene-based analyses. By repeating the SKAT and burden-C test we determined the impact of the top variant(s) on the gene-based statistical significance. Variants were mapped to Ensembl annotation version 84 (GRCh37)²⁶ and files provided by the CHARGE consortium⁴⁶ were used for gene annotation.

Association with type 2 diabetes

We evaluated all replicated previously known and all novel exploratory identified genetic variants (n = 16) with regard to their association with type 2 diabetes risk within EPIC-Potsdam (Fig. 1). Associations between genetic variants and diabetes risk were evaluated in SAS version 9.4 using Cox regression modified for the case-cohort design according to the Prentice method⁶¹. Age was used as the underlying time scale in all Cox models, with entry time defined as the participant’s age at recruitment and exit time defined as the age at the end of follow-up based on the date of diagnosis, death, or return of the last follow-up questionnaire. The analysis was stratified for age at the baseline examination in one-year intervals. Cox models were further adjusted for sex and waist circumference.

Associations between the investigated SNPs and risk of type 2 diabetes as well as BMI and fasting glucose were looked up in the AMP-T2D Program¹⁴. This database provides information on genetic variants and their diabetes-association in consortia for type 2 diabetes using exome sequencing, exome arrays for low-frequency variants and arrays for common variants¹⁴. Additionally, GWAS meta-analysis results for 24 other traits are available¹⁴.

Correction for multiple testing was applied by using false discovery rate (FDR) according to Benjamini and Hochberg⁶². We defined the expected direction for the SNP-diabetes association based on the sign of the product between the SNP-metabolite trait association (e.g. rs3204953 (REV3L) → tyrosine/methionine) and the metabolite trait-type 2 diabetes association (e.g. tyrosine/methionine → type 2 diabetes risk). For the latter we selected either the metabolite trait showing the lowest p-value with the SNP or the one of the SNP-associated metabolite traits indicating significant association with type 2 diabetes risk adjusted for age and sex in EPIC-Potsdam. We tested for directional consistency between expected and observed direction by applying a binomial test in R.

Assessment of functionality of genetic variants

To estimate the relative pathogenicity of identified genetic variants, we used Combined Annotation Dependent Depletion (CADD) v1.3²⁵. This method integrates various annotations into a single measure (C score) for each genetic variant. The scaled C score ranges from 1 to 99 and ranks each variant in the genome relative to all possible 8.6 billion substitutions in the human reference genome based on the formula −10 × log₁₀ (rank/total number of substitutions)²⁵.

Bioinformatics annotation of genes at metabolite-associated loci

All identified SNPs were systematically linked to genes using a flanking region of at least 500 kb. A “GWAS to pathway” workflow method⁶³ was then used to link these genes to potential pathways within the KEGG database⁶⁴. We used the Taverna Workbench Bioinformatics 2.5⁶⁵ software to run the workflow.

References

Morris, A. P. et al. Large-scale association analysis provides insights into the genetic architecture and pathophysiology of type 2 diabetes. Nat Genet 44, 981–990, doi:10.1038/ng.2383 (2012).
Article CAS PubMed PubMed Central Google Scholar
Mahajan, A. et al. Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility. Nat Genet 46, 234–244, doi:10.1038/ng.2897 (2014).
Article CAS PubMed Google Scholar
Steinthorsdottir, V. et al. Identification of low-frequency and rare sequence variants associated with elevated or reduced risk of type 2 diabetes. Nat Genet 46, 294–298, doi:10.1038/ng.2882 (2014).
Article CAS PubMed Google Scholar
Wessel, J. et al. Low-frequency and rare exome chip variants associate with fasting glucose and type 2 diabetes susceptibility. Nat Commun 6, 5897, doi:10.1038/ncomms6897 (2015).
Article CAS PubMed PubMed Central Google Scholar
Fall, T. et al. Non-targeted metabolomics combined with genetic analyses identifies bile acid synthesis and phospholipid metabolism as being associated with incident type 2 diabetes. Diabetologia, doi:10.1007/s00125-016-4041-1 (2016).
Gieger, C. et al. Genetics meets metabolomics: a genome-wide association study of metabolite profiles in human serum. PLoS Genet 4, e1000282, doi:10.1371/journal.pgen.1000282 (2008).
Article PubMed PubMed Central Google Scholar
Illig, T. et al. A genome-wide perspective of genetic variation in human metabolism. Nature Genetics 42, 137–141, doi:10.1038/ng.507 (2010).
Article CAS PubMed Google Scholar
Shin, S.-Y. et al. An atlas of genetic influences on human blood metabolites. Nature Genetics 46, 543–550, doi:10.1038/ng.2982 (2014).
Article CAS PubMed PubMed Central Google Scholar
Draisma, H. H. M. et al. Genome-wide association study identifies novel genetic variants contributing to variation in blood metabolite levels. Nature Communications 6, 7208, doi:10.1038/ncomms8208 (2015).
Article CAS PubMed PubMed Central Google Scholar
Kettunen, J. et al. Genome-wide study for circulating metabolites identifies 62 loci and reveals novel systemic effects of LPA. Nat Commun 7, 11122, doi:10.1038/ncomms11122 (2016).
Article ADS CAS PubMed PubMed Central Google Scholar
Suhre, K. Metabolic profiling in diabetes. Journal of Endocrinology 221, R75–R85, doi:10.1530/joe-14-0024 (2014).
Article CAS PubMed Google Scholar
Floegel, A. et al. Identification of serum metabolites associated with risk of type 2 diabetes using a targeted metabolomic approach. Diabetes 62, 639–648, doi:10.2337/db12-0495 (2013).
Article CAS PubMed PubMed Central Google Scholar
Ladouceur, M., Dastani, Z., Aulchenko, Y. S., Greenwood, C. M. & Richards, J. B. The empirical power of rare variant association methods: results from sanger sequencing in 1,998 individuals. PLoS Genet 8, e1002496, doi:10.1371/journal.pgen.1002496 (2012).
Article CAS PubMed PubMed Central Google Scholar
AMP-T2D data base; T2D-GENES Consortium, GoT2D Consortium, DIAGRAM Consortium., http://www.type2diabetesgenetics.org/home/portalHome (21.07.2016).
Dupuis, J. et al. New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nat Genet 42, 105–116, doi:10.1038/ng.520 (2010).
Article CAS PubMed PubMed Central Google Scholar
Locke, A. E. et al. Genetic studies of body mass index yield new insights for obesity biology. Nature 518, 197–206, doi:10.1038/nature14177 (2015).
Article CAS PubMed PubMed Central Google Scholar
Krumsiek, J. et al. Network-based metabolite ratios for an improved functional characterization of genome-wide association study results. preprint, doi:10.1101/048512 (2016).
Shin, S. Y. et al. Interrogating causal pathways linking genetic variants, small molecule metabolites, and circulating lipids. Genome Med 6, 25, doi:10.1186/gm542 (2014).
Article PubMed PubMed Central Google Scholar
Yet, I. et al. Genetic Influences on Metabolite Levels: A Comparison across Metabolomic Platforms. PLoS One 11, e0153672, doi:10.1371/journal.pone.0153672 (2016).
Article PubMed PubMed Central Google Scholar
Yousri, N. A. et al. Long term conservation of human metabolic phenotypes and link to heritability. Metabolomics 10, doi:10.1007/s11306-014-0629-y (2014).
Willer, C. J. et al. Discovery and refinement of loci associated with lipid levels. Nat Genet 45, 1274–1283, doi:10.1038/ng.2797 (2013).
Article CAS PubMed PubMed Central Google Scholar
Pan, G. et al. PATZ1 down-regulates FADS1 by binding to rs174557 and is opposed by SP1/SREBP1c. Nucleic Acids Res, doi:10.1093/nar/gkw1186 (2016).
Kröger, J. et al. Erythrocyte membrane phospholipid fatty acids, desaturase activity, and dietary fatty acids in relation to risk of type 2 diabetes in the European Prospective Investigation into Cancer and Nutrition (EPIC)-Potsdam Study. Am J Clin Nutr 93, 127–142, doi:10.3945/ajcn.110.005447 (2011).
Article PubMed Google Scholar
Demirkan, A. et al. Genome-Wide Association Study Identifies Novel Loci Associated with Circulating Phospho- and Sphingolipid Concentrations. PLoS Genetics 8, e1002490, doi:10.1371/journal.pgen.1002490 (2012).
Article CAS PubMed PubMed Central Google Scholar
Kircher, M. et al. A general framework for estimating the relative pathogenicity of human genetic variants. Nature Genetics 46, 310-315, doi:10.1038/ng.2892 (2014).
Cunningham, F. et al. Ensembl. Nucleic Acids Res 43, D662–669, doi:10.1093/nar/gku1010 (2015).
Article CAS PubMed Google Scholar
Staley, J. R. et al. PhenoScanner: a database of human genotype-phenotype associations. Bioinformatics 32, 3207–3209, doi:10.1093/bioinformatics/btw373 (2016).
Article CAS PubMed PubMed Central Google Scholar
Kettunen, J. et al. Genome-wide association study identifies multiple loci influencing human serum metabolite levels. Nat Genet 44, 269–276, doi:10.1038/ng.1073 (2012).
Article CAS PubMed PubMed Central Google Scholar
Schizophrenia Working Group of the Psychiatric Genomics Consortium. Biological insights from 108 schizophrenia-associated genetic loci. Nature 511, 421–427, doi:10.1038/nature13595 (2014).
Article ADS PubMed Central Google Scholar
Day, F. R. et al. Large-scale genomic analyses link reproductive aging to hypothalamic signaling, breast cancer susceptibility and BRCA1-mediated DNA repair. Nat Genet 47, 1294–1303, doi:10.1038/ng.3412 (2015).
Article CAS PubMed PubMed Central Google Scholar
Brand, J. S. et al. Age at menopause, reproductive life span, and type 2 diabetes risk: results from the EPIC-InterAct study. Diabetes Care 36, 1012–1019, doi:10.2337/dc12-1020 (2013).
Article PubMed PubMed Central Google Scholar
Ding, E. L., Song, Y., Malik, V. S. & Liu, S. Sex differences of endogenous sex hormones and risk of type 2 diabetes: a systematic review and meta-analysis. JAMA 295, 1288–1299, doi:10.1001/jama.295.11.1288 (2006).
Article CAS PubMed Google Scholar
Gibbs, P. E., McGregor, W. G., Maher, V. M., Nisson, P. & Lawrence, C. W. A human homolog of the Saccharomyces cerevisiae REV3 gene, which encodes the catalytic subunit of DNA polymerase zeta. Proc Natl Acad Sci USA 95, 6876–6880 (1998).
Article ADS CAS PubMed PubMed Central Google Scholar
Kim, H. & D’Andrea, A. D. Regulation of DNA cross-link repair by the Fanconi anemia/BRCA pathway. Genes Dev 26, 1393–1408, doi:10.1101/gad.195248.112 (2012).
Article CAS PubMed PubMed Central Google Scholar
Singh, B. et al. Human REV3 DNA Polymerase Zeta Localizes to Mitochondria and Protects the Mitochondrial Genome. PLoS One 10, e0140409, doi:10.1371/journal.pone.0140409 (2015).
Article PubMed PubMed Central Google Scholar
Lee, S. C. & Chan, J. C. Evidence for DNA damage as a biological link between diabetes and cancer. Chin Med J (Engl) 128, 1543–1548, doi:10.4103/0366-6999.157693 (2015).
Article Google Scholar
Boef, A. G., Dekkers, O. M. & le Cessie, S. Mendelian randomization studies: a review of the approaches used and the quality of reporting. Int J Epidemiol 44, 496–511, doi:10.1093/ije/dyv071 (2015).
Article PubMed Google Scholar
Kroke, A. et al. Measures of Quality Control in the German Component of the EPIC Study. Ann Nutr Metab 43, 216–224 (1999).
Article CAS PubMed Google Scholar
Boeing, H., Wahrendorf, J. & Becker, N. EPIC-Germany–A source for studies into diet and risk of chronic diseases. European Investigation into Cancer and Nutrition. Ann Nutr Metab 43, 195–204, 2786 (1999).
Stefan, N. et al. Plasma Fetuin-A Levels and the Risk of Type 2 Diabetes. Diabetes 57, 2762–2767, doi:10.2337/db08-0538 (2008).
Article CAS PubMed PubMed Central Google Scholar
Holle, R., Happich, M., Löwel, H. & Wichmann, H. E. KORA–a research platform for population based health research. Gesundheitswesen 67(Suppl 1), S19–25, doi:10.1055/s-2005-858235 (2005).
Article PubMed Google Scholar
Meisinger, C. et al. Prevalence of undiagnosed diabetes and impaired glucose regulation in 35-59-year-old individuals in Southern Germany: the KORA F4 Study. Diabet Med 27, 360–362, doi:10.1111/j.1464-5491.2009.02905.x (2010).
Article CAS PubMed Google Scholar
Rathmann, W. et al. Incidence of Type 2 diabetes in the elderly German population and the effect of clinical and lifestyle risk factors: KORA S4/F4 cohort study. Diabet Med 26, 1212–1219, doi:10.1111/j.1464-5491.2009.02863.x (2009).
Article CAS PubMed Google Scholar
Exome Chip Design, http://genome.sph.umich.edu/wiki/Exome_Chip_Design (14.04.2015).
Ridker, P. M. et al. Rationale, design, and methodology of the Women’s Genome Health Study: a genome-wide association study of more than 25,000 initially healthy american women. Clin Chem 54, 249–255, doi:10.1373/clinchem.2007.099366 (2008).
Article CAS PubMed Google Scholar
Grove, M. L. et al. Best practices and joint calling of the HumanExome BeadChip: the CHARGE Consortium. PLoS One 8, e68095, doi:10.1371/journal.pone.0068095 (2013).
Article ADS CAS PubMed PubMed Central Google Scholar
Goldstein, J. I. et al. zCall: a rare variant caller for array-based genotyping: genetics and population analysis. Bioinformatics 28, 2543–2545, doi:10.1093/bioinformatics/bts479 (2012).
Article CAS PubMed PubMed Central Google Scholar
Grallert, H. et al. APOA5 variants and metabolic syndrome in Caucasians. J Lipid Res 48, 2614–2621, doi:10.1194/jlr.M700011-JLR200 (2007).
Article CAS PubMed Google Scholar
Römisch-Margl, W. et al. Procedure for tissue sample preparation and metabolite extraction for high-throughput targeted metabolomics. Metabolomics 8, 133–142 (2012).
Article Google Scholar
Floegel, A. et al. Reliability of serum metabolite concentrations over a 4-month period using a targeted metabolomic approach. PLoS One 6, e21103, doi:10.1371/journal.pone.0021103 (2011).
Article ADS CAS PubMed PubMed Central Google Scholar
Jourdan, C. et al. Associations between thyroid hormones and serum metabolite profiles in an euthyroid population. Metabolomics 10, 152–164, doi:10.1007/s11306-013-0563-4 (2014).
Article CAS PubMed Google Scholar
Petersen, A.-K. et al. On the hypothesis-free testing of metabolite ratios in genome-wide and metabolome-wide association studies. BMC Bioinformatics 13, 120, doi:10.1186/1471-2105-13-120 (2012).
Article PubMed PubMed Central Google Scholar
Floegel, A. et al. Linking diet, physical activity, cardiorespiratory fitness and obesity to serum metabolite networks: findings from a population-based study. International Journal of Obesity 38, 1388–1396, doi:10.1038/ijo.2014.39 (2014).
CAS PubMed PubMed Central Google Scholar
Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81, 559–575, doi:10.1086/519795 (2007).
Article CAS PubMed PubMed Central Google Scholar
Li, J. & Ji, L. Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity 95, 221–227, doi:10.1038/sj.hdy.6800717 (2005).
Article CAS PubMed Google Scholar
Nyholt, D. R. http://neurogenetics.qimrberghofer.edu.au/matSpD/ (27.01.2017).
Nyholt, D. R. A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. Am J Hum Genet 74, 765–769, doi:10.1086/383251 (2004).
Article CAS PubMed PubMed Central Google Scholar
Schwarzer, G. General Package for Meta-Analysis version 4.3-2, https://cran.r-project.org/web/packages/meta/.
Lee, S. S., Miropolsky, L. & Wu, M. SNP-set (Sequence) Kernel Association Test version 1.0.1, http://cran.r-project.org/web/packages/SKAT/ (23.09.2015).
Ionita-Laza, I., Lee, S., Makarov, V., Buxbaum Joseph, D. & Lin, X. Sequence Kernel Association Tests for the Combined Effect of Rare and Common Variants. The American Journal of Human Genetics 92, 841–853, doi:10.1016/j.ajhg.2013.04.015 (2013).
Article CAS PubMed Google Scholar
Prentice, R. L. A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73, 1–11 (1986).
Article MathSciNet MATH Google Scholar
Benjamini, Y. & Hochberg, Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society 57, 289–300 (1995).
MathSciNet MATH Google Scholar
Hettne, K. M. et al. Structuring research methods and data with the research object model: genomics workflows as a case study. J Biomed Semantics 5, 41, doi:10.1186/2041-1480-5-41 (2014).
Article PubMed PubMed Central Google Scholar
Kanehisa, M., Goto, S., Sato, Y., Furumichi, M. & Tanabe, M. KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res 40, D109–114, doi:10.1093/nar/gkr988 (2012).
Article CAS PubMed Google Scholar
Wolstencroft, K. et al. The Taverna workflow suite: designing and executing workflows of Web Services on the desktop, web or in the cloud. Nucleic Acids Research 41, W557–W561, doi:10.1093/nar/gkt328 (2013).
Article PubMed PubMed Central Google Scholar
Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13, 2498–2504, doi:10.1101/gr.1239303 (2003).
Article CAS PubMed PubMed Central Google Scholar

Download references

Acknowledgements

We thank the Human Study Centre (HSC) of the German Institute of Human Nutrition Potsdam-Rehbrücke, namely the trustee and the data hub for the processing, and the participants for the provision of the data, the biobank for the processing of the biological samples and the head of the HSC, Manuela Bergmann, for the contribution to the study design and leading the underlying processes of data generation. Furthermore, we are grateful to the field staff in Augsburg and Munich involved in the conduct of the KORA studies and we also thank Julia Scarpa, Arsin Sabunchi and Dr. Werner Römisch-Margl for metabolomics measurements performed at the Helmholtz Zentrum München, Genome Analysis Center, Metabolomics Platform. Special thanks belong to Kristina M. Hettne (Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands) for her help in applying her “GWAS to pathway” workflow method to our data and Stefan Herms (Division of Medical Genetics, Department of Biomedicine, University of Basel, Basel, Switzerland; Department of Genomics, Life and Brain Center, Bonn, Germany; Institute of Human Genetics, University of Bonn, Bonn, Germany) for his assistance in quality control of the EPIC-Potsdam exome chip data.

Author information

Authors and Affiliations

Department of Molecular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, Nuthetal, Germany
Susanne Jäger, Janine Kröger, Matthias B. Schulze & Karina Meidtner
German Center for Diabetes Research (DZD), Neuherberg, Germany
Susanne Jäger, Simone Wahl, Janine Kröger, Sapna Sharma, Jerzy Adamski, Annette Peters, Harald Grallert, Matthias B. Schulze & Karina Meidtner
Institute of Epidemiology II, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Simone Wahl, Sapna Sharma, Melanie Waldenberger, Annette Peters, Christian Gieger & Harald Grallert
Research Unit of Molecular Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Simone Wahl, Sapna Sharma, Melanie Waldenberger, Christian Gieger & Harald Grallert
Division of Medical Genetics, Department of Biomedicine, University of Basel, Basel, Switzerland
Per Hoffmann
Department of Genomics, Life and Brain Center, Bonn, Germany
Per Hoffmann
Institute of Human Genetics, University of Bonn, Bonn, Germany
Per Hoffmann
Department of Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, Nuthetal, Germany
Anna Floegel & Heiner Boeing
Molecular Epidemiology Group, Max Delbrueck Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin-Buch, Germany
Tobias Pischon
Charité – Universitätsmedizin Berlin, Berlin, Germany
Tobias Pischon
DZHK (German Centre for Cardiovascular Research), partner site Berlin, Berlin, Germany
Tobias Pischon
Genome Analysis Center, Institute of Experimental Genetics, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Cornelia Prehn & Jerzy Adamski
Chair of Experimental Genetics, Center of Life and Food Sciences Weihenstephan, Technische Universität München, Freising-Weihenstephan, Germany
Jerzy Adamski
Institute of Genetic Epidemiology, Helmholtz Zentrum München, German Research Center for Environmental Health, Neuherberg, Germany
Martina Müller-Nurasyid & Konstantin Strauch
Department of Medicine I, University Hospital Grosshadern, Ludwig-Maximilians-Universität, Munich, Germany
Martina Müller-Nurasyid
DZHK (German Centre for Cardiovascular Research), partner site Munich Heart Alliance, Munich, Germany
Martina Müller-Nurasyid & Annette Peters
Institute of Medical Informatics, Biometry and Epidemiology, Chair of Genetic Epidemiology, Ludwig-Maximilians-Universität, Munich, Germany
Konstantin Strauch
Department of Physiology and Biophysics, Weill Cornell Medical College in Qatar, Qatar Foundation-Education City, Doha, Qatar
Karsten Suhre

Authors

Susanne Jäger
View author publications
You can also search for this author in PubMed Google Scholar
Simone Wahl
View author publications
You can also search for this author in PubMed Google Scholar
Janine Kröger
View author publications
You can also search for this author in PubMed Google Scholar
Sapna Sharma
View author publications
You can also search for this author in PubMed Google Scholar
Per Hoffmann
View author publications
You can also search for this author in PubMed Google Scholar
Anna Floegel
View author publications
You can also search for this author in PubMed Google Scholar
Tobias Pischon
View author publications
You can also search for this author in PubMed Google Scholar
Cornelia Prehn
View author publications
You can also search for this author in PubMed Google Scholar
Jerzy Adamski
View author publications
You can also search for this author in PubMed Google Scholar
Martina Müller-Nurasyid
View author publications
You can also search for this author in PubMed Google Scholar
Melanie Waldenberger
View author publications
You can also search for this author in PubMed Google Scholar
Konstantin Strauch
View author publications
You can also search for this author in PubMed Google Scholar
Annette Peters
View author publications
You can also search for this author in PubMed Google Scholar
Christian Gieger
View author publications
You can also search for this author in PubMed Google Scholar
Karsten Suhre
View author publications
You can also search for this author in PubMed Google Scholar
Harald Grallert
View author publications
You can also search for this author in PubMed Google Scholar
Heiner Boeing
View author publications
You can also search for this author in PubMed Google Scholar
Matthias B. Schulze
View author publications
You can also search for this author in PubMed Google Scholar
Karina Meidtner
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

S.J. and K.M. conceptualized the project and designed the analysis plan. S.J. analyzed the data in EPIC-Potsdam and carried out meta-analysis supervised by K.M. and M.B.S. S.W. and S.S.H. performed replication analyses in KORA. P.H. (in EPIC-Potsdam), C.G. and H.G. (in KORA) acquired exome chip data. K.M. (in EPIC-Potsdam) and M.M. (in KORA) performed quality control of exome chip data. C.P., J.A. and K. Su acquired Biocrates metabolomics data within KORA and EPIC-Potsdam. H.B. and A.P. acquired data. S.J., S.W., J.K., A.F., T.P., M.W., K. St, M.B.S., K.M. interpreted the data. S.J. wrote the first draft of the manuscript. S.W., J.K., S. Sh, P.H., A.F., T.P., C.P., J.A., M.M., M.W., K. St, A.P., C.G., K. Su, H.G., H.B., M.B.S., K.M. reviewed the manuscript. Both S.J. and K.M. had access to all data for this study and take responsibility for the manuscript contents. All authors approved the final version to be published.

Corresponding author

Correspondence to Matthias B. Schulze.

Ethics declarations

Competing Interests

The authors declare that they have no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary PDF File

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Jäger, S., Wahl, S., Kröger, J. et al. Genetic variants including markers from the exome chip and metabolite traits of type 2 diabetes. Sci Rep 7, 6037 (2017). https://doi.org/10.1038/s41598-017-06158-3

Download citation

Received: 21 June 2016
Accepted: 08 June 2017
Published: 20 July 2017
DOI: https://doi.org/10.1038/s41598-017-06158-3

This article is cited by

A network analysis framework of genetic and nongenetic risks for type 2 diabetes
- Yuan Zhang
- Shu Li
- Yaogang Wang
Reviews in Endocrine and Metabolic Disorders (2021)
Identification of novel biomarkers affecting the metastasis of colorectal cancer through bioinformatics analysis and validation through qRT-PCR
- Wenping Lian
- Huifang Jin
- Mengle Peng
Cancer Cell International (2020)
Relationships of Non-coding RNA with diabetes and depression
- Tian An
- Jing Zhang
- Guang-Jian Jiang
Scientific Reports (2019)
Identification of novel alleles associated with insulin resistance in childhood obesity using pooled-DNA genome-wide association study approach
- P Kotnik
- E Knapič
- S Horvat
International Journal of Obesity (2018)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Introduction

Results

Eleven known mGWAS loci were associated with diabetes-related metabolite traits

Exploratory analysis using exome chip data identified one novel locus for diabetes-associated metabolite traits

Biologic pathway annotations

Gene-based analyses identified four genes associated with metabolite traits

Identified genetic variants were weakly associated with risk of type 2 diabetes and other traits

Discussion

Materials and Methods

Study population

EPIC-Potsdam study

Cooperative Health Research in the Region of Augsburg study (KORA F4)

Ethics statement

DNA-extraction, genotyping and quality control

EPIC-Potsdam

KORA F4

Metabolite measurements in EPIC-Potsdam and KORA F4

Assessment of type 2 diabetes in EPIC-Potsdam

Statistical analysis

Candidate analysis of known mGWAS hits

Exploratory analysis

Association with type 2 diabetes

Assessment of functionality of genetic variants

Bioinformatics annotation of genes at metabolite-associated loci

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing Interests

Additional information

Electronic supplementary material

Rights and permissions

About this article

Cite this article

Share this article

This article is cited by

Comments

Search

Quick links