Novel DNA methylation sites of glucose and insulin homeostasis: an integrative cross-omics analysis

Despite existing reports on differential DNA methylation in type 2 diabetes (T2D) and obesity, our understanding of the functional relevance of the phenomenon remains limited. Because obesity is the main risk factor for T2D and a driver of methylation from previous study, we aimed to explore the effect of DNA methylation in the early phases of T2D pathology while accounting for body mass index (BMI). We performed a blood-based epigenome-wide association study (EWAS) of fasting glucose and insulin among 4,808 non-diabetic European individuals and replicated the findings in an independent sample consisting of 11,750 non-diabetic subjects. We integrated blood-based in silico cross-omics databases comprising genomics, epigenomics and transcriptomics collected by BIOS project of the Biobanking and BioMolecular resources Research Infrastructure of the Netherlands (BBMRI-NL), the Meta-Analyses of Glucose and Insulin-related traits Consortium (MAGIC), the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) consortium, and the tissue-specific Genotype-Tissue Expression (GTEx) project. We identified and replicated nine novel differentially methylated sites in whole blood (P-value < 1.27 × 10−7): sites in LETM1, RBM20, IRS2, MAN2A2 genes and 1q25.3 region were associated with fasting insulin; sites in FCRL6, SLAMF1, APOBEC3H genes and 15q26.1 region were associated with fasting glucose. The association between SLAMF1, APOBEC3H and 15q26.1 methylation sites and glucose emerged only when accounted for BMI. Follow-up in silico cross-omics analyses indicate that the cis-acting meQTLs near SLAMF1 and SLAMF1 expression are involved in glucose level regulation. Moreover, our data suggest that differential methylation in FCRL6 may affect glucose level and the risk of T2D by regulating FCLR6 expression in the liver. In conclusion, the present study provided nine new DNA methylation sites associated with glycemia homeostasis and also provided new insights of glycemia related loci into the genetics, epigenetics and transcriptomics pathways based on the integration of cross-omics data in silico.


158
The 15 novel CpGs were tested using the same models in meta-analysis of 11 independent cohorts including 159 11,750 non-diabetic subjects from the Cohorts for Heart and Aging Research in Genomic Epidemiology 160 (CHARGE) consortium (Supplementary Table 1

186
Using BIOS database (blood-based) 26 , we found that 2,991 single-nucleotide polymorphisms (SNPs) in 29 187 unique genetic loci were associated with methylation in either cis or trans across 18 unique CpG sites among 188 the tested 20 target methylation sites. For two CpG sites located in SLC7A11 and LETM1, we did not find any 189 significant meQTLs. Results are shown in Figure 2 and given in detail in Supplementary Table 4. Seven of the 190 29 meQTLs, 5 cis-acting and 2 trans-acting were found significantly associated with T2D, fasting glucose or 191 HbA1c (shown in Figure 2 and given in detail Supplementary

204
To understand if the methylation is also eQTM, we investigated the association between gene expression and 205 the 20 key glycemic methylation sites from the European blood-based BIOS database 26 (intergrated in Figure  2.

213
We next explored if the genetic variants associated with the differential expression above were the same as 214 the meQTLs using the eQTL data from the European blood-based BIOS database 26 (integrated in Figure 2 and 215 detailed in Supplementary Table 8). We found three genetic determinants associated with both differential 216 DNA methylation, including two novel methylation sites (in FCRL6 and SLAMF1) and one known (in SREBF1),

224
We then explored the tissue-specific differential expression associated with T2D and related traits by data-225 mining from MetaXcan database from the GTEx project 28,29 . This analyses targeted on the eQTM-related 226 expression of seven genes as shown in 2.2.1 in six glucose-metabolism-related tissues including blood, adipose 227 subcutaneous, adipose visceral omentum, liver, pancreas, and muscle skeletal (Supplementary Table 9). The 228 effect direction consistency was checked between methylation sites, gene expression and T2D or related traits. ) (shown in Figure 1c). Higher liver gene expression of FCRL6, a novel locus, was associated with 236 increased risk of T2D (Z = 2.14, P-value = 0.032) based on MetaXcan 28,29,31 results generated by integrating 237 functional data in liver 32,33 and the GWAS of T2D 9 . The novel cg00936728 in FCRL6 was negatively associated

253
SLAMF1 with inflammatory bowel disease 36 and FCRL6 with C-reactive protein (CRP) 37,38 ), cardiovascular 254 phenotypes (RBM20 with electrocardiographic traits 39 ), cancer (IRS2 with prostate cancer 40 ) and schizophrenia 255 (MAN2A2) 41 . Thus, observations provided insight into the pathways that might link T2D to inflammation, 256 cardiovascular disease, cancer and schizophrenia, all disorders associated epidemiologically or clinically with 257 T2D. This phenomenon may point at genetic pleiotropy of the genes, i.e. a gene codes the same products in 258 various cells or have cascade-like signaling function that affects various targets.

259
In this paper, we used the assumption that genetic variants drive DNA methylation which subsequently 260 regulates gene expression and then glycemic traits 42 . Two pathways (on SREBF1 and FCRL6) related to 261 genetics-epigenetics-transcriptomics-phenotype were observed in the present study (Figure 1b and Figure1c).

262
We validated the differential methylation of SREBF1 in insulin metabolism 43 and extended the findings building 263 a pathway based on the cascading cross-omics analysis in the assumption of genetics-epigenetics-264 transcriptomics-phenotype. We also discovered a new pathway of FCRL6 in glucose metabolism, which still The present study provides new genomic targets for further work on the pathology of T2D through large-scale 271 EWAS and replication. However, the main findings are based on data from blood which was the only accessible 272 tissue and may not be representative of more glucose-relevant tissues, although concordance of differential 273 methylation between blood and adipose is high for certain pathways 44 .

316
Epigenome-wide association analysis and replication 317 All statistical analyses were performed using R statistical software. Insulin was natural log transformed. In the 318 discovery analysis, we first performed epigenome-wide association studies (EWAS) in each cohort separately.

319
Linear regression analysis was used to test the association between glucose and insulin with each methylation 320 site in the Rotterdam Study samples. Linear mixed models were used in NTR and TwinsUK accounting the family structure. We fitted two models for each cohort: 1) the baseline model adjusting for age, sex, technical 322 covariates (chip array number and position on the array), white blood cell counts (lymphocytes, monocytes, 323 and granulocytes) and smoking status, and 2) a second model additionally adjusting for BMI. We removed 324 probes that have evidence of multiple mapping or contain a genetic variant in the methylation site 50 . All 325 cohort-specific EWAS results for each model were then meta-analysed using inverse variance-weighted fixed 326 effects meta-analysis as implemented in the "metafor" R package 51 . In total, we meta-analysed 403,011 CpGs 327 that passed quality control in all four discovery cohorts. The detail of the quality control for each cohort could

335
For the associations discovered in the meta-EWAS that have not been reported previously, we attempted 336 replication in independent samples using the same traits and models as in the discovery analyses.   45 . We managed to exclude the genetic loci which were genome-wide significantly associated 373 with glycemic traits, but none of the genetic loci meet this exclusion criteria. The instrumental variables that 374 explain more than 1% of variance in exposure (DNA methylation) were taken forward for MR test. The To explore whether the differential methylation sites were associated with differential expression in blood, we 378 explored the European blood-based BIOS database for eQTMs 26 . The significantly associated gene expression 379 probes were searched in the eQTL data in BIOS database 26 . We also investigated if the genetic variants 380 associated with these gene expression probes in blood were also related to the DNA methylation sites with 381 glycemia. Finally, we tested whether the expression of the genes that harbor the eQTMs was associated with 382 T2D and related traits in glucose metabolism-related tissues (adipose subcutaneous, adipose visceral 383 omentum, liver, whole blood, pancreas, and muscle skeletal) using  The  The gene expressions associated with the glycemia related methylation sites are shown based on the European blood-based BIOS database (n = 3,841) 26 .  The common genetic determinants of glycemia related methylation sites and gene expression in the European blood-based BIOS database (n = 3,841) 26   The significant associations between the gene expression level in the glucose metabolism-related tissue and the T2D or related traits are shown based on the tissue-specific Genotype-Tissue Expression (GTEx) project 28,29 . It was explored in six glucose related tissues, i.e. adipose subcutaneous, adipose visceral omentum, liver, whole blood, pancreas, and muscle skeletal, and five T2D or related traits, i.e. T2D 9 , fasting glucose 5,6 , fasting insulin 6 , HbA1c 57 , and HOMA-IR 4 . Z: effect estimate per standard error.

Figure 1 Overview of the cross-omics analysis and examples
Cascading associations cross multiple-omics-based on different data sources were integrated in the network figures. The assumption is genetic variants drive DNA methylation which subsequently regulates gene expression and then glycemic traits. FG: fasting glucose. FI: fasting insulin. T2D: type 2 diabetes