An epigenome-wide association study of metabolic syndrome and its components

The role of metabolic syndrome (MetS) as a preceding metabolic state for type 2 diabetes and cardiovascular disease is widely recognised. To accumulate knowledge of the pathological mechanisms behind the condition at the methylation level, we conducted an epigenome-wide association study (EWAS) of MetS and its components, testing 1187 individuals of European ancestry for approximately 470 000 methylation sites throughout the genome. Methylation site cg19693031 in gene TXNIP —previously associated with type 2 diabetes, glucose and lipid metabolism, associated with fasting glucose level (P = 1.80 × 10−8). Cg06500161 in gene ABCG1 associated both with serum triglycerides (P = 5.36 × 10−9) and waist circumference (P = 5.21 × 10−9). The previously identified type 2 diabetes–associated locus cg08309687 in chromosome 21 associated with waist circumference for the first time (P = 2.24 × 10−7). Furthermore, a novel HDL association with cg17901584 in chromosome 1 was identified (P = 7.81 × 10−8). Our study supports previous genetic studies of MetS, finding that lipid metabolism plays a key role in pathology of the syndrome. We provide evidence regarding a close interplay with glucose metabolism. Finally, we suggest that in attempts to identify methylation loci linking separate MetS components, cg19693031 appears to represent a strong candidate.


Scientific Reports
| (2020) 10:20567 | https://doi.org/10.1038/s41598-020-77506-z www.nature.com/scientificreports/ be jointly associated to all of these component phenotypes could be found 6 . In previous GWA-studies the role of lipid metabolism as a major contributor to the MetS phenotypic outcome has emphasised 6 . Epigenetics might serve as one of the factors linking the individual components of the MetS phenotypic outcome. Thus far, only a few studies have successfully identified epigenetic regions and individual markers associated with separate MetS components and only four epigenetic markers have been reported to be associated with MetS as a whole [8][9][10] .
We therefore conducted an epigenome-wide association study (EWAS) of MetS and its components using whole blood methylation levels in order to test the association (i) between methylation (CpG) and MetS as a binomial trait, as well as (ii) between methylation (CpG) and individual MetS components to shed light on the metabolic interplay underlying this complex phenotype.

Materials and methods
Study population. Our  The national FINRISK study. FINRISK surveys consist of cross-sectional, population-based studies conducted every five years between 1972 and 2012 to monitor the risk of chronic diseases in Finland. Each survey included a questionnaire and a clinical examination during which a blood sample was drawn. Data are linked to the national electronic healthcare registers for cardiovascular disease and other health outcomes 11 .
Our study consisted of eligible individuals from a specific subset examined for metabolic traits during the 2007 FINRISK survey. The DILGOM study (The Dietary, Lifestyle, and Genetic Determinants of Obesity and Metabolic Syndrome) was collected as an extension of the FINRISK 2007 survey. This subset consisted of a total sample size of 5025 individuals aged between 25 and 74, of whom we studied 517 individuals from the Helsinki and Vantaa region using various omics 12 . Detailed characteristics of the study sample are provided in Table 1. Detailed descriptions of the methodologies used to measure the waist circumference (WC), triglycerides, HDL, systolic and diastolic blood pressure, and plasma glucose levels for the study sample are described in Supplementary material as well as in previous studies 13,14 . In this study, the presence of MetS is defined according to the 2005 International Diabetes Federation (IDF) definition ( (NFBC1966) is a population-based prospective birth cohort consisting of all mothers in the two northern-most provinces of Finland with children whose expected date of birth fell during 1966 15 . In total, 12 058 live-born children were recruited into the cohort. In 1997, a health and lifestyle questionnaire was sent to all living cohort members with a known address (n = 11 322). Those cohort members living in Northern Finland and in the Helsinki area (n = 8463) were also invited for a clinical examination, of whom 6033 attended. Information on smoking and alcohol use was obtained from the questionnaire. Descriptions of the methodologies used to measure the waist circumference (WC), triglycerides, HDL, systolic and diastolic blood pressure, and plasma glucose levels for the study sample are described in detail in Supplementary material. Informed consent and written permission were obtained from the study participants at both 31 and 46 years of age.

DNA extraction and methylation array analysis.
In the DILGOM cohort, whole blood samples were obtained from 517 individuals (n = 239 men and n = 278 women). All subjects were older than 18 years, with a mean age of 51.9 years, and provided their written informed consent for the use of their DNA sample for research. DNA was extracted from 10-ml whole-blood peripheral white blood cells using the NucleoSpin® Tissue kit (Macherey-Nagel GmbH, Düren, Germany) with the salting-out method using 10 M ammonium acetate. DNA was precipitated in isopropanol, washed in 70% ethanol, and resuspended in 1X TE buffer. The purity and concentrations of the DNA samples were measured by spectrophotometer (NanoDrop® ND1000; Thermo Fisher Scientific Inc., Waltham, MA, US). From each sample, 600 ng of genomic DNA was bisulfite modified using the EZ DNA Methylation kit (Zymo Research Orange, California, US) according to the manufacturer's recommendations for the Illumina Infinium Assay. After purification, 4 µl of each bisulfite-converted DNA sample was used for hybridisation on the Infinium Human Methylation 450 BeadChip (HM450K) following the Illumina Infinium HD Methylation protocol with the original IDAT files extracted from the HiScan scanner.
In NFBC1966, DNA methylation was extracted after a 31-year follow-up period for a random sample of 807 participants for whom complete follow-up data were available (postal questionnaire and clinical examination at 31 as well as 46 years of age). DNA methylation was further measured using the HM450K assay.

Methylation normalisation.
In the DILGOM cohort, the methylation data processing and quality control analyses were performed using the Bioconductor package minfi 16 . Pre-normalised raw data were used to convert the intensities from the red and the green channels into methylated (M) and unmethylated (U) signals. Βeta values for each CpG probe were calculated according to Illumina's recommendations using [β = M/(M + U + 100)]. The difference in the distribution of β values for type I and type II probes was corrected using subset-quantile within array normalisation (SWAN) 17 . Detection P values were obtained for every CpG probe in every sample. Failed positions were defined as signal levels lower than background from both methylated and unmethylated channels. Probes with a detectable methylation level in < 5% of samples (detection P < 0.01) were excluded. In addition, the CpG sites on the X and Y chromosomes were discarded, which resulted in 468 809 CpG probes for further analysis.
In NFBC1966, quality control and normalisation for DNA methylation data were performed based on the CPACOR pipeline 18 with minor adaptations. The pipeline uses 30 PCs as covariates to control for technical confounding. Data were first retrieved using the minfi R package 16 , and the Illumina Background Correction to the intensity values was applied. A detection P value threshold was set to P < 1 × 10 −16 , and samples with a call rate < 98% were excluded. Quantile normalisation was performed separately for six probe-type categories using R package limma 19 . These normalised intensity values were used to calculate the β value at each CpG site. A principal component analysis was performed for HM450K control probes, and the first 30 principal components were used as additional explanatory variables in subsequent regression models in both, DILGOM and NFBC1966 cohorts. In both DILGOM and NFBC1966, white blood cell subpopulation estimates were acquired using the software provided by Houseman et al. 20 , and these subpopulation estimates were also added to regression models as explanatory variables. Further exclusions of individuals from DILGOM and NFBC1966 data were performed before association analyses, based on outlier-status and gender mismatches (Supplementary material).
Power calculation. Power calculation for DILGOM cohort was performed with pwrEWAS 21 . For binary analyses within the discovery sample of approximately 500 individuals, differences up to 20% in CpG-specific methylation with at 94% power and differences up to 2% at 28% power were able to be detected.

Epigenome-wide association study (EWAS).
In the discovery analysis of the DILGOM cohort, the association between the M value for all 468 809 CpG probes (as outcome variable) and MetS, as well as its six individual components, were tested using regression analysis fitting generalised linear models (glm) using the R software program, version 3.3.1 22 . Analyses were adjusted for age, sex, smoking status (defined either as current or never/ex-smokers), alcohol consumption (grams/week), cell subtype proportion, study-specific technical covariates (described in detail in Supplementary material), and the first five genetic principal components of the data to control for potential population substructure.
Replication analysis. Association analysis of a total of 33 CpG probes was replicated in an independent population sample (NFBC1966, n = 670, Supplementary Table 1, Supplementary Fig. 1). The 33 CpG probes were selected for replication by generating Q-Q probability plots of the discovery EWAS results of MetS and its individual components and further verifying selected probes not to be false positive discoveries using permutation analysis. The EWAS results of those 33 CpG probes clearly deviated from the expected line of the Q-Q probability plots. In permutation analysis of each probe we first 1) randomly re-assigned the probe signal values between the study participants while keeping their phenotype data untouched, and then 2) tested for an association between the probe and the phenotypes as described above for the original association analysis. We repeated this procedure 1000 times and calculated how many occurrences of a P value smaller than the P value from the original association analysis was observed. For all 33 selected probes, the count was less than 1%. The replication analysis for the 33 CpG probes was performed using the same analytical protocol used in the discovery analysis for the DILGOM cohort.

Meta-analysis. Summary statistics from both cohorts were meta-analysed using an inverse-variance
weighted fixed-effect model, using the GWAMA software program 23 .

Conditional analysis.
To investigate the independence of the replicated methylation signals from the possible underlying genetic effect (DNA variation expressed as an SNP effect), a conditional linear regression analysis was performed for selected, successfully replicated probes (N = 5). First, SNPs located ± 5 Mb from the genomic position of each replicated methylation probe were tested for an association with the phenotype in question ( Supplementary Fig. 2). The strongest associating SNP (based on their association P value) was then used as a covariate in the regression analysis where association between the methylation probe and phenotype was tested.
Gene expression data processing and analysis. Sample collection and data processing is described in detail in Inouye et al. 24 . In brief, RNA was hybridized to Illumina HT-12 v3 BeadChip arrays. The background corrected probes were subjected to quantile normalization for each array at the strip-level. Technical replicates were combined by bead count weighted average and replicates with Pearson correlation coefficient < 0.94 or Spearman's rank correlation coefficient < 0.60 were removed. Expression values for each probe were log2 transformed.
To investigate the relation between the replicated methylation signals and gene expression, a linear regression analysis was performed for successfully replicated methylation probes located to a gene and equivalent gene transcripts (available for three replicated methylation loci). The same covariates were used to adjust the analyses as those used in discovery EWA analyses. Possible trans effects between selected replicated methylation loci and nearest gene transcripts of other replicated methylation probes were also studied with linear regression analysis.

Results
Epigenome-wide association between DNA methylation and MetS and its components. We found several differentially methylated CpG methylation sites for MetS and its component traits (WC, fasting glucose, HDL, triglycerides, systolic blood pressure, diastolic blood pressure) across the 468 809 CpG probes analysed in the FINRISK sample on 517 Finnish individuals. In total, 33 probes with methylation M value based P values ranging from P = 7.33 × 10 −6 to P = 5.08 × 10 −8 , passed permutation based test of statistical significance (P perm < 0.05) and clearly deviated from the expected line of the Q-Q probability plots of the discovery analysis results ( Supplementary Fig. 1). From these, 12 probes associated directly with the MetS status, and all 33 associated with one or more component traits. The 33 probes were further selected for a replication analysis (Fig. 1 Replication, meta-analysis and integration with genetics. Altogether, five CpG markers were successfully replicated in an independent Finnish sample of 670 individuals at a nominal significance level of P < 0.05 (Table 3), and subsequently tested in a meta-analysis combining the discovery and replication cohorts. All replicated associations were for the component phenotypes and none for the MetS case/control status. The methylation site cg19693031 in the TXNIP gene region (chr1:145 441 552) associated inversely with fasting glucose levels in the meta-analysis (β eff = -0.076, P = 1.80 × 10 −8 ). The methylation site cg06500161 in the ATPbinding cassette, sub-family G (WHITE), member 1 (ABCG1) gene region (chr21:43 656 587) associated both with serum triglycerides (β eff = 0.047, P = 5.36 × 10 −9 ) and WC (β eff = 0.003, P = 5.21 × 10 −9 ). Two other WC-associated methylation sites were also observed: cg08309687, located within the long intergenic non-protein coding RNA 649 approximately 8 Mb downstream from the ABCG1 gene region (β eff = -0.004, P = 2.24 × 10 −7 ), and cg11024682 in the sterol regulatory element binding transcription factor 1 (SREBF1) gene region (chr17:17 730 094) (β eff = 0.003, P = 5.96 × 10 −9 ). Furthermore, methylation site cg17901584 located in the genomic region of HP08874 mRNA (chr1:55 353 706) associated with HDL (β eff = 0.133, P = 7.81 × 10 −8 ). among the association results of the component phenotypes building up the metabolic syndrome as an entity was observed for the 33 CpG probes taken forward for replication (Fig. 1). Several methylation probes associated with both WC and MetS (Fig. 1, Suppl. Table 1). Numerous methylation probes associated with triglycerides and WC, often accompanied by a concurrent association with MetS. Unsurprisingly, several probes were also simultaneously associated both with triglycerides and HDL as well as both with glucose and MetS (Fig. 1). Two of the five replicated methylation probes simultaneously associated with MetS and with several of its components in the discovery cohort. In addition to the association with triglycerides and WC, methylation site cg06500161 in the ABCG1 gene associated with glucose, HDL, and MetS (β eff = 0.049, P = 7.39 × 10 −4 ; β eff = -0.096, P = 2.70 × 10 −4 ; β eff = 0.077, P = 7.44 × 10 −5 , respectively). Furthermore, the glucose-associated methylation site cg19693031 in the TXNIP gene region suggestively associated with WC, triglycerides, systolic blood pressure as well as MetS as a condition (β eff = -0.004, P = 2.25 × 10 −3 ; β eff = -0.107, P = 1.09 × 10 −5 ; β eff = -0.003, P = 3.24 × 10 −3 ; β eff = -0.124, P = 6.98 × 10 −5 , respectively), but notably not with HDL (Fig. 1, Suppl. Table 1). Stars indicate the P values from individual tests as follows: *P < 0.01, **P < 0.0001, ***P < 1 × 10-6. Beta values from the waist circumference, diastolic blood pressure, and systolic blood pressure analyses are multiplied by 10 in order to provide more comparable and clinically relevant values for beta in the heat map. The detailed numeric data for the heat map is presented in the Supplementary www.nature.com/scientificreports/ In addition, the remaining replicated methylation probes associated with a few other MetS component phenotypes as follows. The WC-associated site cg08309687, located downstream from the ABCG1 gene region, also associated with triglycerides (β eff = -0.058, P = 4.17 × 10 −3 ). Cg11024682 in the SREBF1 gene region appeared to associate with triglycerides and MetS (β eff = 0.034, P = 5.21 × 10 −3 ; β eff = 0.067, P = 2.11 × 10 −5 , respectively). Despite the general association pattern with the lipid components of MetS, neither cg08309687 nor cg11024682 associated with HDL. Rather, HDL-associated site cg17901584 appeared to also associate with triglycerides and WC (β eff = -0.056, P = 4.44 × 10 −3 ; β eff = -0.003, P = 1.28 × 10 −3 , respectively) (Fig. 1, Suppl. Table 1). Figure 1 shows some general behaviour patterns of MetS component phenotypes (Fig. 1). Notably, the association for the blood pressure components (systolic-and diastolic blood pressure) of MetS behaved rather independently compared to other components. Methylation probe cg19693031 in the TXNIP region emerged as the only replicated probe binding blood pressure to other MetS components. Interestingly, methylation site cg06500161 in the gene ABCG1 appeared to also associate with diastolic blood pressure in addition to the association with triglycerides and WC in the replication cohort (Fig. 2, Supplementary Table 2), despite this association remaining unidentified in the discovery analysis (Fig. 1, Suppl. Table 1).

Examination of the underlying genetic effects.
To test the possible underlying genetic effects (DNA variation expressed as an SNP effect) behind the CpG-phenotype associations, we used as a covariate in the conditional analyses the strongest associating SNP ± 5 Mb from the genomic position of each replicated methylation probe for analysis of cg19693031 and glucose, cg17901584 and HDL, cg06500161 and triglycerides, and cg11024682, cg06500161 and cg08309687 and WC (Tables 3 and 4). Using this approach, we found suggestive evidence for a genetic effect in our analysis for cg19693031 in gene TXNIP and glucose as well as for cg11024682 in the gene SREBF1 and WC. After adjusting for the genetic signal rs186387341, we found a stronger association (β eff = -0.136, P = 1.69 × 10 −8 ) between glucose and the cg19693031 methylation site. While adjusting for the genetic signal rs183499598, a stronger association (β eff = 0.003, P = 5.94 × 10 −8 ) between WC and the cg11024682 methylation site was observed ( Table 4).

Associations of replicated DNA methylation loci and gene expression.
We examined the relation of the replicated methylation loci annotated to a gene with gene expression of the gene in question. Gene expression data was available for transcripts of genes TXNIP, SREB and ABCG1 ( Table 5). The methylation site cg11024682 annotated to SREB gene body associated inversely with SREB gene transcript (β eff = −0.119, P = 0.045) indicating that the increased methylation was associated with decreased gene expression of the corresponding gene (Table 5). We could also see inverse borderline significant association between methylation site cg11024682 and ABCG1 gene transcript (β eff = −0.185, P = 0.051) marking a possible trans effect between the methylation locus and the gene annotated to a different chromosome (Supplementary Table 4). The methylation site cg06500161 annotated to ABCG1 gene body showed a strong inverse association with ABCG1 gene transcript (β eff = −0.221, P = 0.0043). The methylation site cg19693031 annotated to 3′UTR of gene TXNIP showed no association with TXNIP gene transcript (β eff = −0.004, P = 0.95). Interestingly, we were able to see a suggestive direct trans effect between the methylation locus and the gene transcripts of SREB (β eff = 0.062, P = 0.038) as well as two separate transcripts of ABCG1 (β eff = 0.111, P = 0.0011; β eff = 0.123, P = 0.0094) (Supplementary Table 4).

Discussion
We performed EWAS of MetS and its components to shed light on the biological background of this complex phenotype. In an interpretation of the results the evident overlapping interrelation of separate component traits of MetS should be taken into account. In should also be noted that medication is in general a known epigenome altering factor. Nevertheless we believe that our analyses capture rather association signals induced by studied phenotypes than by individual drug ingredients used by study individuals (Supplementary material, Supplementary Table 3). Our data suggest that the interplay between lipid and glucose metabolism may represent a key element in the pathology of MetS, implying that epigenetic markers linking blood pressure to the other components of MetS may be scarce.

The role of replicated methylation loci in metabolism. We observed numerous associations between
CpG methylation sites and separate component phenotypes of MetS. The essential role of dyslipidemia in MetS was previously suggested 6 . Our data support the view that the biological background of MetS strongly points Stars indicate significant P values from the individual tests as follows: *P < 0.01, **P < 0.0001, ***P < 1 × 10 −6 . Beta values from the waist circumference, diastolic blood pressure, and systolic blood pressure analyses are multiplied by 10 in order to provide more comparable and clinically relevant values for beta in the heat map. The detailed numeric data for the heat map is presented in the Supplementary Table 2. § indicates the successful replication of the association between the methylation probe and the individual phenotype.
Scientific Reports | (2020) 10:20567 | https://doi.org/10.1038/s41598-020-77506-z www.nature.com/scientificreports/ towards alterations in lipid metabolism. Methylation site cg06500161 in gene ABCG1 associated with triglycerides and WC in our primary EWAS and suggested an association with HDL, glucose, and MetS as a condition. We also saw a clear inverse correlation between the methylation locus and the gene expression of ABCG1 indicating the possible changes in gene function related to the methylation of cg06500161. The association between cg06500161 and triglyceride and HDL metabolism was previously reported, and the locus appears to act as an epigenetic link between myocardial infarction and the blood lipid levels 25 . In addition, the methylation site was previously found to act as an integral part of insulin and glucose metabolism, as well as associating with homeostatic model assessment for insulin resistance (HOMA-IR), a commonly used surrogate to define the state of insulin resistance 26,27 . Cg06500161 is also directly associated with type 2 diabetes, and appears to serve as an epigenetic marker when evaluating an individual's risk for it 28 . Recent study directly linked the site to MetS 10 . Our data support the idea that the ABCG1 gene plays an important role in both lipid and glucose metabolism, suggesting that it could potentially link the two as an underlying factor in the pathology of MetS. In a recent study, support for the hypothesized role of body mass index as a factor partly explaining the association between type 2 diabetes and DNA methylation was given 29 . Our data suggests that these findings can be seen in a broader context of MetS -related dyslipidaemia as a preceding state of type 2 diabetes.  www.nature.com/scientificreports/ Cg11024682 in the SREBF1 gene associated with WC in our primary analysis suggesting an association with triglycerides and MetS. The association between cg11024682 and triglycerides has been previously identified, whereby the results were validated through a tissue-specific analysis using adipose and skin tissue 25 . In addition, an interaction between the gene products of ABCG1 and SREBF1 has also been reported 25 . Furthermore, evidence exists for the role of SREBF1 in glucose metabolism and type 2 diabetes 26,28,30 . While we found no association between glucose metabolism and SREBF1 in our analyses, our findings link the function of SREBF1 to MetS, thus supporting the concept of an altered lipid metabolism related to the syndrome. The possible effect of cg11024682 methylation locus on expression of SREBF1 was supported by our findings. Our suggestive finding about the possible trans effect between the methylation locus at SREBF1 and the expression of ABCG1 further support the idea about the interaction between the two genes.
The third replicated methylation site associated with WC in our data, cg08309687 located in the intergenic area downstream from the ABCG1 gene region, suggested a correlation with triglyceride levels. Interestingly, in a family study attempting to identify novel epigenetic determinants of type 2 diabetes 28 , this methylation site represented one of the most significant CpG sites directly associated with type 2 diabetes, along with fasting glucose and insulin resistance. To our knowledge, our study is the first to also link this methylation site to lipid metabolism.
Additionally, we report the novel association between cg17901584 in chromosome 1 and HDL as well as the association between this methylation site and triglycerides and WC. This is, to our knowledge, the first time that cg17901584, located approximately 1 kb downstream from 24-dehydrocholesterol reductase (DHCR24) has been linked to lipid metabolism at the methylation level.
The methylation site cg19693031 in the gene TXNIP was previously found to associate with lipid metabolism 31 , and its role in glucose metabolism and type 2 diabetes has been confirmed in many recent studies 28,30,32,33 . Our data also indicates that cg19693031 is indeed an important methylation site, possibly linking lipid and glucose metabolism to each other. In our association analyses between cg19693031 methylation locus and gene expression we could see that rather than affecting the expression of its "own" gene TXNIP the locus shows association with the expression of lipid associated genes SREBF1 and ABCG1. Notably, in our analyses, cg19693031 was the only successfully replicated methylation site binding all of the different components together with MetS. In our search for methylation loci linking the individual MetS components together as a pathological cardiometabolic condition, cg19693031 in the TXNIP gene appears to represent a strong candidate.
To date, only a few studies have investigated the epigenome-wide association for MetS as a condition 8,10 . In a recent study, methylation sites cg00574958 and cg17058475, both located on chromosome 11, associated with MetS 8 . In our study, we reproduced this result at a significance level of P < 0.05 (β = −0.08, P = 4.56 × 10 −3 for cg00574958; β = -0.09, P = 5.21 × 10 −3 for cg17058475 in the discovery analysis). In another study 10 , methylation site cg06638433 on chromosome 17 associated with MetS alongside the cg06500161. We were unable to reproduce the result for cg06638433 at a significance level of P < 0.05 (β = 0.02, P = 2.85 × 10 −1 ).

Blood pressure as a component of MetS.
Our data identified the rather independent behaviour of blood pressure components on MetS compared to the metabolic components of the syndrome, as well as the non-aligned behaviour between diastolic and systolic blood pressure (Fig. 1). Only one of the four probes showing suggestive association with blood pressure on top of the other components of MetS, cg19693031, was replicated. Thus, any successfully replicated connective epigenetic marker can serve as a solid target in future studies of the MetS biology. We should also note that the mean age of the replication cohort of our study is rather low, possibly explaining the small replication rate among the blood pressure-related probes in our analyses (Table 1). At the genetic level, the association between the TXNIP allele variation, type 2 diabetes, and hypertension was previously demonstrated, and the role of TXNIP as a key modulator linking the diabetogenic and vascular pathways behind metabolic conditions has been discussed 34,35 . To our knowledge, however, our study is the first to suggest that the methylation component should also be considered in such discussions.
In our study, the non-replicated methylation probe cg07699454 appeared to associate with MetS and its lipid components, as well as with diastolic blood pressure. This probe is located in the gene KCNK17, which has previously been associated with a risk of stroke 36 . It also resides in close proximity to the gene KCNK16, a known susceptibility locus for type 2 diabetes 37 . These findings render cg07699454 an interesting locus for future studies on the role of vascular components of type 2 diabetes and MetS. The apparent association between cg06500161 and diastolic blood pressure in NFBC1966 represents an interesting finding. The well-characterised ABCG1 gene has only very recently been associated with vascular phenotypes 38 . Our data also support the previous idea that cg06500161 could represent a target locus in further studies regarding the molecular background of MetS.

Strengths and limitations.
Our study has several strengths. Finnish population cohorts are, in general, homogeneous, both genetically and culturally and well characterised, making biological patterns detectable even amongst relatively small sample sizes. In addition, a detailed cohort characterisation also helped to construct analytical models in a harmonious way, which is crucial when attempting to control the challenging nature of methylation data. We observe a slightly increased pattern of general hypomethylation in association analyses performed in the discovery cohort with older individuals, compared to the younger replication cohort. The finding, even though should be examined with a caution, fits well with the general observation about promoting effect of aging leading to a global genome-wide hypomethylation in various types of tissues. We should, however, address some limitations. As discussed in a wide range of EWASes, whole-blood peripheral white blood cells are not ideal sources of methylation data, since methylation as a DNA regulation mechanism is highly tissuespecific. Furthermore, due to methylation's great sensitivity to variation in age, our replication cohort may not be the best possible cohort for the purpose and may in part explain the relatively modest success in replication Scientific Reports | (2020) 10:20567 | https://doi.org/10.1038/s41598-020-77506-z www.nature.com/scientificreports/ of our findings. However, because of the notable age-difference between discovery and replication cohorts, the statistical power of detecting true positive findings diminishes. Thus we feel that the risk of detecting and reporting false positive findings in our study remains small. We also recognise the role of elevated waist circumference as a prerequisite of MetS condition in 2005 IDF definition for MetS, and it cannot be ruled out that associations that we see between different methylation sites and MetS might for one manifest the association between methylation site and waist circumference.

Conclusions
Taken together, our study links the previously type 2 diabetes-associated methylation locus cg08309687 to lipid metabolism. We also found a novel association between the methylation locus cg17901584 and HDL. Overall, our study supports the idea derived from genetic studies whereby aberrations in lipid metabolism, combined with the interplay with glucose metabolism, play central roles as functional elements of MetS. In our study we consider EWAS as a tool of a hypothesis free approach to examine if epigenetic variation explains any variation in the phenotype to be studied. The epigenetics of MetS appear polymorphic as expected. Our data suggest that blood pressure might act rather independently compared to the metabolic components of MetS, at least at the epigenetic level. In our attempt to identify a comprehensive methylation locus behind the MetS condition, cg19693031 in gene TXNIP emerges as a strong candidate.

Data availability
The data that support the findings of this study are available from THL Biobank and University of Oulu but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of THL Biobank and University of Oulu.