According to the World Health Organization1, worldwide obesity has nearly doubled since 1980, resulting in an increase in cardiovascular diseases and type 2 diabetes. One of the possible causes of this negative development is the increased consumption of energy-dense foods that are high in fat. Dietary guidelines recommend not only eating a moderate amount of fat, but also consuming the right type of fat.2 Fatty acids include saturated fatty acids (SFA), monounsaturated fatty acids (MUFA) and polyunsaturated fatty acids (PUFA), and their structural differences explain why they have different biological effects.3 Consuming PUFA or MUFA instead of SFA is known to improve the blood lipid profile.4 Moreover, consumption of SFA in place of MUFA may worsen glucose-insulin homeostasis.5 Finally, replacing SFA with PUFA has been reported to lower coronary heart disease risk.6

Some of the effects of the qualitative and quantitative aspects of fat intake have been attributed to a modification of the transcription of key genes involved in pathways related to lipid and glucose metabolism, and/or inflammation.7 The regulation of gene expression can be achieved by mechanisms other than changes in the nucleotide sequence, namely epigenetic processes. Such processes are responsible for the establishment, maintenance, and reversal of metastable transcriptional states.8 One major example of such processes is DNA methylation, the methylation of cytosine, usually at CpG dinucleotides. Regions rich in CpGs are called ‘CpG islands’ and are mostly unmethylated when located in the promoter of active genes. Conversely, methylated promoters are associated with reduced gene expression.9

Five studies have investigated the link between DNA methylation and fat intake in humans, but the methylation assays in those studies were limited to only a few key genes. One study found significantly higher methylation in the peroxisome proliferator-activated receptor coactivator-1 gene (PPARGC1A) in high-fat overfed men.10 Another study found that the clock circadian regulator gene (CLOCK) methylation was negatively associated with MUFA intake, but positively associated with PUFA intake.11 A third study showed that higher n-6 PUFA intake was associated with lower methylation in the promoter of tumor necrosis factor-α (TNFα).12 A fourth study found no significant correlation between a diet rich in fat and sucrose, and methylation of hydroxyacyl-coenzyme A dehydrogenase (HADH) and glucokinase (GCK) genes.13 The fifth paper reported a lack of correlation between four diets enriched in different types of fat and the methylation levels of leptin (LEP), leptin receptor (LEPR), and pro-opiomelanocortin (POMC) genes.14

Here we explore the genome-wide DNA methylation profiles of Greek preadolescents with respect to parameters related to dietary fat quantity, and dietary fat quality. To our knowledge, this is the first time that parameters related to both quantitative and qualitative aspects of fat intake with respect to DNA methylation are investigated at a genome-wide scale. Moreover, no such studies have been performed in children.

Materials and Methods

Genome-wide changes of DNA methylation pattern associated with parameters related to fat intake were assessed. Two variables related to dietary fat quantity (proportion of energy intake derived from fat, cholesterol intake) and three related to dietary fat quality (MUFA/SFA, PUFA/SFA and (MUFA+PUFA)/SFA) were analyzed. A linear model explaining the methylation level of each CpG site/island, corrected for gender, weight category, Tanner stage (an estimation of physical development), and white blood cell count, was used. The ratios between the unsaturated and saturated fatty acid intakes were preferred to their individual values, as unsaturated and saturated fatty acids have been reported to have antagonistic effects. A higher fatty acid ratio would indicate a ‘healthier’ fatty acid intake profile, while a lower ratio would indicate an ‘unhealthier’ fatty acid intake profile.


All participants and their guardians gave informed written consent and the study was approved by the Greek Ministry of National Education (7055/C7-Athens, 19-01-2007) and the Ethical Committee of Harokopio University (16/ Athens, 19-12-2006).


The ‘Healthy Growth Study’ was a cross-sectional epidemiological study initiated in May 2007. Approval to conduct the study was granted by the Greek Ministry of National Education (7055/C7-Athens, 19-01-2007) and the Ethics Committee of Harokopio University of Athens (16/Athens, 19-12-2006). The study population comprised school children attending the fifth and sixth grades of primary schools located in the regions of Attica, Etoloakarnania, Thessaloniki and Heraklion. The sampling procedure is fully described elsewhere.15 For the purpose of the current analysis, a subsample of 24 obese and 23 normal-weight preadolescent girls, as well as 11 obese and 11 normal-weight preadolescent boys (Table 1) was selected. This subsample was initially used to investigate the effect of polymorphism in the FTO gene on genome-wide DNA methylation patterns.16

Table 1 Demographic data stratified for weight category and gender

Dietary assessment

Dietary intake data were obtained for two consecutive weekdays and one weekend day, via morning interviews with the children at the school site using the 24-h recall technique. More specifically, all study participants were asked to describe the type and amount of foods and beverages consumed during the previous day, provided that it was a usual day according to the participant’s perception. To improve the accuracy of food descriptions, standard household measures (cups, tablespoons, etc.) and food models were used to define amounts. At the end of each interview, the interviewers, who were dietitians rigorously trained to minimize interviewer effects, reviewed the collected food intake data with the respondent to clarify entries, servings and possible forgotten foods. Food intake data were analyzed using the Nutritionist V diet analysis software (version 2.1, 1999, First Databank, San Bruno, CA, USA), which was extensively amended to include traditional Greek recipes, as described in Food Composition Tables of Greek Cooked Foods and Dishes. Furthermore, the database was updated with nutritional information of processed foods provided by independent research institutes, food companies and fast-food chains.

DNA methylation profiling

The genome-wide Illumina Infinium HumanMethylation27 BeadChip (Illumina, San Diego, CA, USA), which allows interrogation of 27 578 CpG dinucleotides covering 14 495 genes, was applied to determine the methylation profile of genomic DNA isolated and purified from peripheral whole blood. This chip has been shown to give a reliable and reproducible estimation of the methylation profile on a genomic scale.17 First, bisulfite conversion of genomic DNA was performed using the EZ DNA Methylation-Gold Kit (Zymo Research, Irvine, CA, USA) according to the manufacturer’s protocol. Briefly, 500 ng of DNA was sodium bisulfite-treated, denatured at 98 °C for 10 min, and bisulfite converted at 64 °C for 2.5 h. After conversion, samples were desulfonated and purified using column preparation. Approximately 200 ng of bisulfite-converted DNA was processed according to the Illumina Infinium Methylation Assay protocol. This assay is based on the conversion of unmethylated cytosine (C) nucleotides into uracil/thymine (T) nucleotides by the bisulfite treatment. The DNA was whole-genome amplified, enzymatically fragmented, precipitated, resuspended, and hybridized overnight at 48 °C to locus-specific oligonucleotide primers on the BeadChip. After hybridization, the C or T nucleotides were detected by single-base primer extension. The fluorescence signals corresponding to the C or T nucleotides were measured from the BeadChips using the Illumina iScan scanner. Phenotypes, raw data and background-corrected normalized DNA methylation data are available through the GEO database with accession numbers GSE27860 for the girls and GSE57484 for the boys.

Data processing

All downstream data processing and statistical analyses were performed with the statistical software R together with the lumi,18 limma19 and IMA20 packages of the Bioconductor project.

Data preprocessing

The fluorescence data were preprocessed using the GenomeStudio 2009.2 (Illumina) software. We used the log2 ratio of the intensities of methylated probe versus unmethylated probe, also called M-value, which is more statistically valid for the differential analysis of methylation levels.21
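As an illustration of the M-value defined above, the following Python sketch computes it from a pair of probe intensities and, equivalently, from a β-value. The function names and the small `offset` added to both intensities (to avoid taking the logarithm of zero) are assumptions made for this example and are not taken from GenomeStudio or the lumi package.

```python
import math


def m_value(methylated, unmethylated, offset=1.0):
    """log2 ratio of methylated versus unmethylated probe intensity.

    The offset is an illustrative safeguard against zero intensities.
    """
    return math.log2((methylated + offset) / (unmethylated + offset))


def beta_to_m(beta):
    """Convert a beta-value (strictly between 0 and 1) to an M-value."""
    return math.log2(beta / (1.0 - beta))
```

An M-value of 0 corresponds to a half-methylated site (β = 0.5), and positive values indicate that the methylated signal dominates.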

Quality control

The data were imported and submitted to quality control using a modified version of the IMA.methy450PP function of the IMA package. The following CpG sites and samples were removed: the sites with missing β-values, the sites with detection P-value>0.05, the sites having <75% of samples with detection P-value<10⁻⁵, the samples with missing β-values, the samples with detection P-value>10⁻⁵ and the samples having <75% of sites with detection P-value<10⁻⁵. A total of 26 168 probes were included in the analysis, after discarding 328 probes that did not pass quality control together with 1082 probes from the sex chromosomes.
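The call-rate criterion applied to sites can be sketched as follows. This is a minimal Python illustration and not the modified IMA.methy450PP function used in the study; the function name `sites_passing_call_rate` and the list-of-lists layout of the detection P-values (one row per site, one column per sample) are hypothetical.

```python
def sites_passing_call_rate(detection_p, threshold=1e-5, min_fraction=0.75):
    """Return indices of sites kept by the call-rate rule.

    A site is kept when at least `min_fraction` of its samples have a
    detection P-value below `threshold`. Missing values (None) count as
    failing samples.
    """
    kept = []
    for site_index, row in enumerate(detection_p):
        passing = sum(1 for p in row if p is not None and p < threshold)
        if passing / len(row) >= min_fraction:
            kept.append(site_index)
    return kept
```

The same function can be applied to the transposed matrix to apply the analogous rule to samples.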


Quantile normalization was performed on the M-values of all the 26 168 CpG sites using the lumiMethyN function of the lumi package.
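For readers unfamiliar with quantile normalization, the generic procedure can be sketched in a few lines of Python. This is an illustration of the general technique only; it does not reproduce the internals of lumiMethyN, and the function name and the input layout (one list of M-values per sample) are assumptions.

```python
def quantile_normalize(samples):
    """Quantile-normalize samples that all contain the same number of values.

    Each value is replaced by the mean, across samples, of the values
    holding the same rank. Ties are broken by position.
    """
    n_values = len(samples[0])
    sorted_samples = [sorted(sample) for sample in samples]
    # Reference distribution: mean of the r-th smallest value across samples.
    reference = [
        sum(sample[rank] for sample in sorted_samples) / len(samples)
        for rank in range(n_values)
    ]
    normalized = []
    for sample in samples:
        order = sorted(range(n_values), key=lambda i: sample[i])
        result = [0.0] * n_values
        for rank, index in enumerate(order):
            result[index] = reference[rank]
        normalized.append(result)
    return normalized
```

After this step every sample shares the same empirical distribution of M-values, while the ranking of sites within each sample is preserved.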


For better interpretation of the genome-wide methylation patterns, we chose to use the expanded annotation table for the Illumina Infinium HumanMethylation450 BeadChip array generated by Price et al.22 The 27k array contains a total of 27 578 loci, 1600 of which are not mapped to the 450k array. For those unmapped loci, we kept their original annotation from the 27k array. The expanded annotation file was used to determine the average methylation value of CpG sites belonging to the same island or island shore (all sites with the same name in the ‘HIL_CpG_Island_Name’ column of the annotation file were averaged). We obtained the average methylation value of 5980 islands/island shores, which reduced the number of interrogated locations to 19 437 sites/islands. The CpG island classification developed by Price et al22 provides good enrichment discrimination of CpG islands. This classification is a combination of Weber et al’s classification,23 where CpG islands are defined according to the GC content, the Obs/Exp CpG ratio and the island length, and Illumina’s classification, where CpG islands are defined according to their physical position (islands, island shores, and shelves). The location within a CpG island or shore is suggested to be relevant,24 and Price et al’s definition of CpG islands made it possible to distinguish different methylation distributions between probes, which remained undetectable with the Illumina CpG island classification.22 In addition, their classification demonstrated a more extreme DNA methylation profile and a larger proportion of differentially methylated regions between different tissues.
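The averaging step can be illustrated with a short Python sketch for a single sample. The function name and the dictionary-based inputs are hypothetical; in particular, the mapping from site identifier to island name stands in for the ‘HIL_CpG_Island_Name’ column, and sites without an island name are assumed to be kept as individual sites.

```python
from collections import defaultdict
from statistics import mean


def average_by_island(m_values, island_names):
    """Average M-values of sites that share the same island name.

    m_values: dict mapping site identifier -> M-value (one sample).
    island_names: dict mapping site identifier -> island name; sites that
    are absent or mapped to an empty name are kept under their own identifier.
    """
    groups = defaultdict(list)
    for site, value in m_values.items():
        key = island_names.get(site) or site
        groups[key].append(value)
    return {key: mean(values) for key, values in groups.items()}
```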

The expanded annotation file was also used to determine which gene each interrogated CpG site/island may be associated with (‘Closest_TSS_gene_name’ column of the annotation file), the distance of each interrogated CpG site/island to the closest TSS (transcription start site) (‘Distance_closest_TSS’ column of the annotation file) and the CpG density surrounding each interrogated CpG site/island (‘HIL_CpG_class’ column of the annotation file). Each site can be located in a high-density CpG island (HC), an intermediate-density CpG island (IC), a region of intermediate-density CpG island that borders HCs, or a non-island. Indeed, the local CpG density has been shown to influence the role of methylated cytosines, with methylation having more transcriptional effect in high-density CpG islands and less at non-islands.25

The Illumina-provided MAPINFO GenomeStudio column was used to determine the genomic location of each interrogated CpG site. For CpG islands, the name of the island was used to determine its genomic location (eg, the island ‘chr19_IC:17905037-17906698’ would be a CpG island of intermediate density located on chromosome 19, between positions 17 905 037 and 17 906 698).
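The naming convention in this example can be parsed mechanically. The Python sketch below assumes that every island name follows the pattern of the example given above (chromosome, underscore, density class, colon, start-end); the function name and the returned dictionary keys are hypothetical.

```python
import re

# Pattern inferred from the single example 'chr19_IC:17905037-17906698'.
ISLAND_NAME = re.compile(r"^(chr[0-9A-Za-z]+)_([A-Za-z]+):(\d+)-(\d+)$")


def parse_island_name(name):
    """Split an island name into chromosome, density class, start and end."""
    match = ISLAND_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized island name: {name!r}")
    chromosome, cpg_class, start, end = match.groups()
    return {
        "chromosome": chromosome,
        "cpg_class": cpg_class,
        "start": int(start),
        "end": int(end),
    }
```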


Linear model

We developed the following linear model for each CpG site/island k, using limma’s robust regression method, with a maximum number of iterations equal to 10 000:

$$M_k = b_{k0} + b_{kG}\,G + b_{kT}\,T + b_{kB}\,B + b_{kW}\,W + b_{kV}\,V + \varepsilon_k$$

where Mk is the M-value of CpG site/island k, bk0 is the intercept, G is the dichotomized gender (female=1 and male=0), T is the Tanner stage, B is the white blood cell count, W is the dichotomized weight category (normal-weight=0 and obese=1), ɛk is the unexplained variability, and V is one of the following variables: proportion of energy intake derived from total fat intake, cholesterol intake (g/day), MUFA/SFA, PUFA/SFA, or (MUFA+PUFA)/SFA.
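To make the structure of the model concrete, the Python sketch below fits the same set of terms (an intercept, G, T, B, W and one dietary variable V) for a single CpG site. It is only an illustration under stated assumptions: it uses ordinary least squares solved through the normal equations, not limma's robust regression or its empirical Bayes moderation, and the function names `ols` and `fit_site` as well as the explicit intercept are choices made for this example.

```python
def ols(design, response):
    """Ordinary least-squares coefficients for one response vector.

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination
    with partial pivoting. Assumes the design matrix has full column rank.
    """
    n, p = len(design), len(design[0])
    # Augmented matrix [X'X | X'y], one row per coefficient.
    aug = [
        [sum(design[i][r] * design[i][c] for i in range(n)) for c in range(p)]
        + [sum(design[i][r] * response[i] for i in range(n))]
        for r in range(p)
    ]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][p] for r in range(p)]


def fit_site(m_values, gender, tanner, wbc, weight, variable):
    """Fit M ~ intercept + G + T + B + W + V for one CpG site or island."""
    design = [
        [1.0, g, t, b, w, v]
        for g, t, b, w, v in zip(gender, tanner, wbc, weight, variable)
    ]
    names = ["intercept", "G", "T", "B", "W", "V"]
    return dict(zip(names, ols(design, m_values)))
```

In this sketch the entry `"V"` of the returned dictionary plays the role of the coefficient of interest for the dietary variable; repeating the fit for every site/island and every choice of V mirrors the per-site design described above.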

The coefficients bkx summarize the correlation between the methylation level and the variables of interest. Moderated t-statistics for each contrast and CpG site/island were created using an empirical Bayes model, to rank genes in order of evidence for differential methylation.19 To control the proportion of false positives, P-values were adjusted for multiple comparisons as proposed by Benjamini and Hochberg (BH).26 An adjusted P-value>0.05 was considered nonsignificant.
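The Benjamini and Hochberg adjustment mentioned above can be sketched in plain Python as follows. The function name is hypothetical, and the code is a generic implementation of the step-up procedure, not the routine used inside limma.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    # Visit P-values from largest to smallest so that a running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, index in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

A site/island would then be retained when its adjusted P-value is at most 0.05, in line with the threshold stated above.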

Three children from the cohort had MUFA/SFA, PUFA/SFA, and (MUFA+PUFA)/SFA values outside the range mean±3 × SD. Thus, they were excluded from the linear models developed for MUFA/SFA, PUFA/SFA, and (MUFA+PUFA)/SFA.
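A simple way to flag such values is sketched below in Python. The function name is hypothetical, and the use of the sample standard deviation and of a two-sided range are assumptions, as neither is specified above.

```python
from statistics import mean, stdev


def within_three_sd(values, n_sd=3.0):
    """Return one boolean per value: True if it lies within mean +/- n_sd * SD."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (assumption)
    return [abs(value - center) <= n_sd * spread for value in values]
```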

Functional enrichment analysis

The unique Entrez Gene ID associated with each significant gene-based site/island was identified. Three gene lists were generated for MUFA/SFA, PUFA/SFA, and (MUFA+PUFA)/SFA, respectively.

We used the web-based ConsensusPathDB-human (CPDB)27, 28 to determine the significant pathways each gene list may be involved in. On the basis of the reference gene set (all Entrez Gene IDs from the 27k BeadChip annotation file were used as a background), the expected number of genes in each pathway of the CPDB database is compared with the actual number of genes found for this pathway. For each pathway, a P-value and a q-value are calculated according to the hypergeometric test. The pathways with a raw P-value<0.05 together with a q-value<0.05 were selected. As CPDB includes information from 30 databases, the pathways often overlap with each other to some extent. Thus, to show the relationships between the different pathways, we constructed a heatmap of the proportion of shared input genes between the significant pathways. For instance, if P1 is a given pathway containing genes A, B, and C from the input gene list, and P2 is a given pathway containing genes B, C, D, and E from the input gene list, the proportion of shared genes between P1 and P2 is:
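Taking the union of the two sets as the denominator, which is an assumption since the denominator is not stated explicitly, P1 and P2 share 2 genes (B and C) out of 5 (A to E), a proportion of 0.4. The Python sketch below computes this proportion (with the smaller set as an alternative denominator) together with a one-sided hypergeometric enrichment P-value of the kind described above. Function and parameter names are hypothetical, and the code is not the CPDB implementation.

```python
from math import comb


def shared_proportion(pathway_1, pathway_2, denominator="union"):
    """Proportion of input genes shared by two pathways.

    denominator="union" divides by the number of distinct genes in either
    pathway; any other value divides by the size of the smaller pathway.
    """
    genes_1, genes_2 = set(pathway_1), set(pathway_2)
    shared = len(genes_1 & genes_2)
    if denominator == "union":
        size = len(genes_1 | genes_2)
    else:
        size = min(len(genes_1), len(genes_2))
    return shared / size if size else 0.0


def hypergeom_pvalue(background, pathway_size, list_size, overlap):
    """P(X >= overlap) for list_size genes drawn from the background.

    background: number of genes in the reference set.
    pathway_size: number of background genes annotated to the pathway.
    list_size: number of genes in the input list.
    overlap: number of input genes found in the pathway.
    """
    upper = min(pathway_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(background - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(background, list_size)
```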

We also used the web-based g:Profiler29, 30 as an alternative method for pathway analysis, to confirm the significant results obtained with CPDB. The g:GOSt tool was used for enrichment analysis, with the same background gene list, and the g:GOSt native method g:SCS for multiple testing correction. The pathways with an adjusted P-value<0.05 were selected. It is important to note that g:Profiler only includes pathways from two databases: KEGG and Reactome.

Results

Four CpG sites and one CpG island were found to be significantly associated with the proportion of overall fat intake (Figure 1a), while no significance was found for cholesterol intake. The methylation levels of 2 islands, 11 island shores, and 16 sites were significantly correlated with PUFA/SFA; 9 islands, 26 island shores, and 158 sites for MUFA/SFA; 10 islands, 40 island shores, and 130 sites for (MUFA+PUFA)/SFA (Figure 1b and Supplementary Tables 1–3).

Figure 1

Volcano plots for proportion of total energy intake derived from fat (a) and (MUFA+PUFA)/SFA (b). The regression coefficient refers to the coefficient of the linear model and each point represents a CpG site or a CpG island. The red horizontal line is the significance threshold (P-value=0.05) and all points above this line are significant. (a) proportion of total energy intake derived from fat (positive coefficients refer to an increased methylation in children for whom fat represents a higher proportion of total energy intake). (b) (MUFA+PUFA)/SFA (positive coefficients refer to an increased methylation in children having a higher (MUFA+PUFA)/SFA).

What genes are associated with the significant CpG sites/islands?

To determine which gene may be regulated by each CpG site and island, we identified the gene whose TSS is closest to each CpG site and island. Each significant site, island or island shore can show either a positive fold change if its methylation is higher in children having an elevated dietary variable (eg, a higher cholesterol intake), or a negative fold change if its methylation is lower in children having an elevated dietary variable.

Regarding the proportion of fat intake, one CpG site was associated with taste receptor, type 2, member 13 (TAS2R13), which may have a role in the perception of bitterness, while another site was associated with thioredoxin interacting protein (TXNIP), a regulator of cellular oxidative stress downregulated by SFA uptake31 (Table 2).

Table 2 Information on the significant CpG sites/island found for proportion of energy intake derived from fat and the top 10 most significant CpG sites/islands found for MUFA/SFA, PUFA/SFA, and (MUFA+PUFA)/SFA

The 10 most significant sites/islands/island shores found for MUFA/SFA included one CpG site associated with aldehyde dehydrogenase 3 family, member A2 (ALDH3A2) (P=0.00097), whose expression is reduced in insulin-resistant murine models.32 They also included a CpG site associated with sema domain, immunoglobulin domain (Ig), short basic domain, secreted, (semaphorin) 3G (SEMA3G) (P=0.0039), whose expression increases during adipogenesis.33 Among the top 10 found for PUFA/SFA, there was one CpG site associated with nuclear receptor coactivator 1 (NCOA1) (P=0.0072) and another associated with PC-esterase domain containing 1A (PCED1A) (P=0.0091), as well as an island shore associated with phosphodiesterase 3A, cGMP-inhibited (PDE3A) (P=0.0066; Table 2).

There were only 4 sites and 1 island shore found significant for all fatty acid ratios, but 86 sites/islands/island shores in common between MUFA/SFA and (MUFA+PUFA)/SFA, and 7 in common between PUFA/SFA and (MUFA+PUFA)/SFA (Figure 2). Notably, the four sites found significant for all fatty acid ratios included some of the previously mentioned sites (Table 2), for example, the ones associated with NCOA1 (P=0.0031) (Figure 3a) and PCED1A (P=0.0031) (Figure 3b). The shared set also included an island shore associated with CCNA2 (Figure 3c), a gene recently shown to be associated with serum phosphatidylcholine concentration in mice.34

Figure 2

Venn diagram of the significant CpG sites and islands found for MUFA/SFA, PUFA/SFA, and (MUFA+PUFA)/SFA.

Figure 3

Correlation between methylation of three sites associated with NCOA1 (a), PCED1A (b), CCNA2 (c), and (MUFA+PUFA)/SFA. Coeff (coefficient) of the linear model associated with (MUFA+PUFA)/SFA; full triangles, obese girls (n=23); full circles, obese boys (n=11); empty triangles, normal-weight girls (n=22); empty circles, normal-weight boys (n=10).

In which pathways are the significant genes involved?

Instead of going through all the genes associated with the significant sites found for MUFA/SFA, PUFA/SFA, and (MUFA+PUFA)/SFA, we preferred to perform a gene enrichment analysis. Using CPDB,27 we identified the significant pathways for each of the fatty acid ratios. We considered a pathway significant if the significant CpG sites/islands/island shores were associated with a high proportion of genes involved in this particular pathway.

Neither CPDB nor g:Profiler identified significant pathways for MUFA/SFA, but CPDB found 34 significant pathways for PUFA/SFA (Supplementary Table 4), including 1 group of pathways related to adipogenesis and mechanism of gene regulation by peroxisome proliferators via PPARα (Group 1, Figure 4a), and another group of pathways related to leptin and IL6 (Group 2, Figure 4a). Five significant pathways were identified for (MUFA+PUFA)/SFA using CPDB (Supplementary Table 4), including one group of pathways linked to NF-κB (Group 1, Figure 4b). g:Profiler identified only one significant pathway for (MUFA+PUFA)/SFA, also linked to NF-κB (IKKβ phosphorylates IkB causing NF-κB to dissociate, P-value=0.041).

Figure 4

Heatmap representation of the proportion of shared genes between the significant pathways found for PUFA/SFA (a) and (MUFA+PUFA)/SFA (b). Each significant pathway retrieved from CPDB (P-value<0.05 and q-value<0.05) is represented on the graph, along with the database it comes from. A stronger color indicates a higher proportion of shared genes between two pathways. (a) Red rectangle 1: group of pathways related to adipogenesis and mechanism of gene regulation by peroxisome proliferators via PPARα; red rectangle 2: group of pathways related to leptin and IL6. (b) Red rectangle 1: group of pathways related to NF-κB. The full colour version of this figure is available at European Journal of Human Genetics online.

Discussion

In the present study of Greek preadolescents, we found a large number of CpG sites and regions significantly associated with variables related to the quality of fat intake and few sites significantly associated with variables related to the quantity of fat intake.

Our findings suggest that fat quality is likely to influence DNA methylation on a large genomic scale. NCOA1, one of the most significant genes found for all fatty acid ratios, is involved in the mechanism of gene regulation by peroxisome proliferators via PPARα, a master gene whose regulation is altered in obesity.35 NCOA1 is a transcriptional coactivator whose ablation confers susceptibility to diet-induced obesity.36 Interestingly, various fatty acids, but especially PUFAs, act as ligands for PPARα. Moreover, along with PDE3A, the fifth most significant gene found for PUFA/SFA, NCOA1 is part of the leptin pathway. Leptin is an adipokine that has a key role in regulating energy intake by inhibiting the sensation of hunger.37 Fish oil has been reported to increase plasma leptin concentrations,38 and leptin induces the expression of NCOA1 in human cells.39 In addition, PDE3A expression is enhanced in cows fed with a diet enriched in fish oil or in SFA.40 Interestingly, an island shore located near the TSS of PDE3A was less methylated in children with a higher PUFA/SFA. All this information is consistent with the negative fold change observed for NCOA1 in our cohort.

There was substantial overlap between the significant sites/islands/island shores found for the different fatty acid ratios, but little overlap across all three ratios. This may indicate that MUFA and PUFA affect DNA methylation in different ways. Interestingly, the site associated with NCOA1 was more significant for (MUFA+PUFA)/SFA than for PUFA/SFA or MUFA/SFA, suggesting that PUFA and MUFA affect the methylation of this gene in an identical way. A similar observation can be made for PCED1A and CCNA2, which were more significant for (MUFA+PUFA)/SFA than for PUFA/SFA or MUFA/SFA. However, this may also be due to differences in power to detect significant correlations, as the distributions of the fatty acid ratios were quite different (Supplementary Figure 1).

It should be noted that two of the four individual CpG sites found to be significantly associated with the proportion of energy intake derived from fat might be relevant to obesity. It has been hypothesized that individuals with increased bitter taste sensitivity avoid antioxidant-rich vegetables because of their perceived bitterness, consuming instead sweet, fatty foods.41 The site associated with TAS2R13 was more methylated in children for whom fat represents a higher proportion of the total energy intake. In addition, children with a higher proportion of energy intake derived from fat had a higher methylation at a site located in an island shore near TXNIP, which is consistent with the observed downregulation of TXNIP by SFA uptake.31 None of these genes were previously reported to be differentially methylated depending on fat intake, probably because the methylation assays of previous studies were limited in scope, addressing only a few key genes.

The present work was not devoid of limitations. First of all, our sample size is limited (n=69) and therefore replication is needed to confirm our findings and to allow generalization to larger populations. Second, the fatty acid ratios investigated herein are among the most interesting to compare with respect to health, as their roles are heavily debated and researched. However, other fatty acids not examined in this study may reflect other aspects of the quality of fat intake. For example, unsaturated fatty acids include trans unsaturated fatty acids, which have been demonstrated to have adverse effects on health.42 In addition, we did not separate n-3 and n-6 PUFA in our study, but these two classes of fatty acids do not have the same effects; while both n-3 and n-6 PUFA have beneficial effects, an excess of n-6 PUFA can cause health disorders.43 DNA methylation was assessed in whole peripheral blood, which is the case for most epigenetic studies focused on nutrition, as peripheral changes may occur in relation to overall energy balance.44 However, the methylation pattern observed in blood may not always reflect the pattern in other tissues.45 Another weakness of this approach is that DNA methylation can vary by blood cell type, and thus the methylation changes associated with the variables investigated in this study may represent an alteration in blood cell composition, rather than a change in methylation. However, no correlation was found between any of the investigated variables and the relative proportions of granulocytes, lymphocytes, or mid cells (P-value>0.05 on Spearman’s correlation test). Finally, an increasing number of human studies suggest that parental BMI impacts DNA methylation in the offspring, especially at imprinted genes.46, 47, 48 However, evidence in humans is still scarce and limited to two tissues available at birth: umbilical cord and/or placenta; thus, we did not take parental BMI into account in our analysis.

In conclusion, this study is the first to demonstrate the roles of fat quantity and quality in DNA methylation patterns at a genome-wide scale. Our results suggest that specific changes in DNA methylation may have an important role in the mechanisms involved in the physiological responses to different types of dietary fat. Future studies could reveal other potential impacts of dietary fat quality on DNA methylation in controlled, randomized designs, and perhaps investigate further the downstream effects of this process.