Introduction

Elevated serum concentration of low-density lipoprotein cholesterol (LDL-C) is an important causal risk factor for cardiovascular disease (CVD)1,2,3 and has now largely replaced total cholesterol (TC) as the primary treatment target for dyslipidemia1. Reduced serum concentration of high-density lipoprotein cholesterol (HDL-C) is independently associated with CVD4,5; in genetic studies, however, HDL-C has not been causally associated with CVD3. Serum lipid concentrations (hereafter referred to as lipids) are complex human traits that are influenced by both genetic and lifestyle factors6. With regard to health inequalities, research has shown that indicators of socioeconomic position (SEP) such as educational attainment, household income and employment status are also associated with lipids7,8,9,10,11. However, the strength of the association has differed between SEP indicators and between men and women, and its direction appears to differ between developed and developing countries. In developed countries, a more unfavorable serum lipid profile (higher LDL-C and TC levels and lower HDL-C levels) is observed in lower SEP groups8,10,11,12,13.

The genome-wide association study (GWAS) meta-analysis from the Global Lipids Genetics Consortium (GLGC) in 2013 identified 157 loci associated with lipids, accounting for up to 12% of the variance in each lipid trait14. Subsequent large-scale GWAS demonstrated enhanced gene discovery from expanded sample sizes, independently reporting additional novel loci in different ancestry groups, accounting for up to 20% of the variance in lipids15,16,17. In contrast, family-based association studies have indicated that 30–70% of the variance in lipids is genetically based18,19.

It is assumed that gene-environment (G×E) interactions, where the effect of some gene variants depends on specific environmental exposures20, may account for part of the unexplained variance of lipids21. G×E interactions for lipids have been observed for lifestyle risk factors such as obesity, smoking, diet, physical activity, diabetes and sleep duration22,23,24,25,26,27,28,29. While some studies showed stronger effects of lipid-increasing alleles in groups reporting a more favorable lifestyle, such as lean or physically active individuals27,28, Bentley et al. also reported SNPs with stronger effects in smokers, an example of an unfavorable lifestyle24. So far, interactions of lipid-related effect alleles with SEP have not been investigated in adults, but G×SEP interactions on overall health and other traits such as body mass index (BMI) have been indicated by twin and population-based studies30,31,32. Li et al. reported an interaction between one SNP associated with a metabolically obese, normal weight phenotype (which includes high TC levels) and a composite score including parental education and household income in metabolically obese, normal weight children33.

SEP can be considered as a context-defining variable that describes certain risk constellations such as unequally distributed environmental, psychosocial and behavioral health risk factors and may be better suited for describing health-related environments as a whole rather than single risk factors. Thus, investigating interactions between lipid-associated loci and SEP indicators may be crucial for identifying subgroups for which genetic effects show stronger signals than for the average population34 and who may benefit from genotype-based targeted intervention35.

The aim of this study was to investigate whether the SEP indicators education and income interact with genetic sum scores of lipid-increasing effect alleles (GES) for HDL-C (GESHDL-C), LDL-C (GESLDL-C) and TC (GESTC) in a population-based cohort study. In order to explore whether any detected SEP interactions can be explained by SEP-associated life style risk factors, information on smoking, BMI, physical activity, alcohol consumption and diabetes mellitus was included in the interaction analysis. Main results of GES based on the GLGC GWAS meta-analysis14 were compared to the results of GES also including loci of more recently published GWAS to check for differences in lipid prediction and GES×SEP interaction.

Methods

Study population

All analyses are based on baseline data of the Heinz Nixdorf Recall Study (Risk factors evaluation of coronary calcium and lifestyle cohort), a population-based prospective cohort study. The rationale of the study, study design, sampling methods, response rate, and data collection have been described in detail previously36,37. In brief, 4814 participants aged 45–74 years were randomly selected between 2000 and 2003 from mandatory registries of residence of the cities of Bochum, Essen and Mülheim/Ruhr within the densely populated Ruhr metropolitan area in Germany. All participants gave written informed consent. The study was approved by the institutional ethics committee of the University Duisburg-Essen and was conducted according to the guidelines and recommendations for ensuring Good Epidemiological Practice38. An extended quality management procedure and certification according to DIN ISO 9001:2000 were established. All study participants were of European ancestry and thus genetically very homogeneous.

Data collection

At study baseline, standard enzymatic methods (homogeneous direct determination with OPERA measuring system) were used to measure HDL-C, LDL-C and TC within 12 h after blood serum collection at the central laboratory of the University Hospital of Essen, Germany. Participants were asked to fast for at least 4 h before examination, resulting in 60% of subjects with fasting status > 8 h, 2% 6–8 h, 5% 4–6 h, 26% 2–4 h and 7% with < 2 h of fasting. The fasting duration was on average 9.7 h (SD: ± 4.9 h), with no difference between men and women39. Information on educational attainment, household income, smoking status, physical activity and diagnosis of diabetes mellitus or use of anti-diabetic medication was collected at study baseline in standardized computer-assisted face-to-face interviews. Information on alcohol consumption was collected with a self-administered questionnaire. Education was defined by combining school and vocational training as total years of formal education according to the International Standard Classification of Education (ISCED 97)40. Years of education were categorized into three groups: ≤ 10 years (equivalent to a minimum compulsory school attendance with no additional vocational degree), 11–13 years (equivalent to upper secondary educational degrees or a combination of lower secondary education and vocational training), and ≥ 14 years of education (equivalent to a vocational training including additional qualification or a university degree). Income was measured as the monthly household equivalent income, calculated by dividing the participants’ household net income by a weighting factor for each household member41. Income was categorized into three groups using sex-specific tertiles. To account for their different mechanisms in causing health inequalities, both SEP indicators were analyzed separately42,43.
The body mass index (BMI, in kg/m2) was calculated based on standardized measurements of body weight (in underwear) and height. Physical activity was defined as exercising one or more times per week versus no weekly engagement in physical exercise. Smoking status was dichotomized for analyses as current smoker (smoking cigarettes during the past 12 months) versus former and never smoker. Alcohol intake was estimated as grams of pure alcohol per week using information on the number of alcoholic drinks usually consumed in a week by type of drink (i.e., beer, wine, sparkling wine, spirits). Participants were classified as diabetics if they reported a diagnosis of diabetes mellitus, if a fasting blood glucose level ≥ 126 mg/dl or a postprandial blood glucose level ≥ 200 mg/dl was found, or if the use of anti-diabetic medication was documented. All variables were checked in an ongoing data quality control during the baseline examination period.
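The variable definitions above can be sketched as simple classification rules. The following is a minimal illustration, not the study's data management code: the function names and category labels are hypothetical, and years of education are assumed to be recorded as whole numbers.

```python
def education_group(years):
    """Map total years of formal education to the three categories used in the text.

    Assumes whole years; the labels are illustrative only.
    """
    if years <= 10:
        return "<=10"
    if years <= 13:
        return "11-13"
    return ">=14"


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def has_diabetes(self_reported, fasting_glucose=None, postprandial_glucose=None,
                 on_medication=False):
    """Any single criterion is sufficient. Glucose in mg/dl; None means not measured."""
    if self_reported or on_medication:
        return True
    if fasting_glucose is not None and fasting_glucose >= 126:
        return True
    if postprandial_glucose is not None and postprandial_glucose >= 200:
        return True
    return False
```

The diabetes rule is written as a logical OR of the listed criteria, which is how the definition in the text reads.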

Genetic data

Lymphocyte DNA was isolated from EDTA anti-coagulated venous blood using the Chemagic Magnetic Separation Module I (Chemagen, Baesweiler, Germany). Genotyping was performed by matrix-assisted laser desorption ionization-time of flight mass spectrometry-based iPLEX Gold assay at the Department of Genomics, Life and Brain Center, Bonn, Germany, using two different Illumina microarrays (Metabochip, Global Screening Array (GSA); Illumina, San Diego, USA) according to the manufacturer’s protocols. Genotype imputation was carried out using IMPUTE v.2.3.044. Quality control was applied prior to imputation and performed on subject level including sex, ethnicity and relatedness checks, excluding subjects with missing genotype data > 10%. Further, single nucleotide polymorphisms (SNPs) with a missing genotype frequency > 10% were excluded.
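The two missingness filters described above can be sketched as follows. This is an illustration under stated assumptions: the data layout (a dict of allele-count lists with `None` for missing calls), the function names, and the order of the two steps (subjects first, then SNPs) are all hypothetical, and the sex, ethnicity and relatedness checks are not shown.

```python
def missing_rate(values):
    """Fraction of entries that are None (missing genotype calls)."""
    return sum(v is None for v in values) / len(values)


def quality_control(genotypes, max_missing=0.10):
    """Apply subject-level and SNP-level missingness filters.

    genotypes: dict subject_id -> list of allele counts (0/1/2 or None),
               with SNPs in the same order for every subject.
    Returns (kept subject ids, kept SNP column indices).
    """
    # Step 1: drop subjects whose share of missing genotypes exceeds the threshold.
    subjects = [s for s, calls in genotypes.items()
                if missing_rate(calls) <= max_missing]
    if not subjects:
        return [], []
    # Step 2: among the remaining subjects, drop SNPs with too many missing calls.
    n_snps = len(genotypes[subjects[0]])
    snps = [j for j in range(n_snps)
            if missing_rate([genotypes[s][j] for s in subjects]) <= max_missing]
    return subjects, snps
```

The default threshold of 0.10 mirrors the 10% cut-off stated in the text.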

Using the GLGC GWAS meta-analyses14 and all subsequently published large-scale GWAS15,16,17, 152 HDL-C-, 108 LDL-C- and 131 TC-associated SNPs at a genome-wide significance level of p < 5 × 10⁻⁸ in study populations of European ancestry were selected for analysis (Supplementary Table S1). Of these, 102, 85, and 103 genotyped SNPs or a proxy were found for HDL-C, LDL-C and TC, respectively, on the Metabochip. Additionally, 3/2/3 HDL-C/LDL-C/TC SNPs were found in imputed data of the Metabochip and 18/8/11 HDL-C/LDL-C/TC SNPs or a proxy were found on the GSA (Supplementary Tables S1–S4). A proxy was defined as a SNP in linkage disequilibrium (LD) of ≥ 0.8 with the selected SNP. For all SNPs included in the analysis, no deviation from Hardy–Weinberg equilibrium (defined as p ≤ 1 × 10⁻⁶) was found. LD-based SNP pruning was performed with PLINK to exclude selected SNPs correlated with an LD ≥ 0.8 before calculating the GESLipid. However, all SNPs included represented independent loci and none of the SNPs had to be pruned out. Two different scores were calculated for each lipid trait: a GESLipid based on the GLGC GWAS meta-analyses14 and an extended GESLipid-EXT additionally based on the subsequent large-scale GWAS15,16,17. In detail, the genetic sum scores of effect alleles for HDL-C (GESHDL-C), LDL-C (GESLDL-C) and TC (GESTC) were calculated by aggregating the total number of lipid-increasing effect alleles (0, 1 or 2) for each individual from the Heinz Nixdorf Recall Study population across the selected SNPs based on the GLGC GWAS meta-analyses14. For comparison, extended GESLipid-EXT were calculated for each trait by additionally adding the number of lipid-increasing effect alleles of selected SNPs based on all subsequently published large-scale GWAS (i.e., GESHDL-C-EXT, GESLDL-C-EXT, GESTC-EXT)15,16,17. Imputation of missing genotype information was based on the study sample’s effect allele frequencies according to the PLINK scoring routine45.
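The score construction described above, an unweighted count of lipid-increasing effect alleles with missing genotypes filled in from the sample's effect allele frequencies, can be sketched as follows. This is a minimal illustration, not the PLINK routine itself: the function names and the list-of-lists data layout are hypothetical, and every SNP is assumed to have at least one observed genotype.

```python
def effect_allele_frequencies(genotypes):
    """Per-SNP effect allele frequency estimated from the non-missing calls.

    genotypes: list of per-person lists of effect allele counts (0, 1, 2 or None).
    """
    n_snps = len(genotypes[0])
    freqs = []
    for j in range(n_snps):
        observed = [person[j] for person in genotypes if person[j] is not None]
        # Each person carries two alleles, hence the factor 2 in the denominator.
        freqs.append(sum(observed) / (2 * len(observed)))
    return freqs


def genetic_sum_scores(genotypes):
    """Unweighted sum of effect alleles per person.

    A missing genotype is replaced by its expected allele count, 2 * frequency.
    """
    freqs = effect_allele_frequencies(genotypes)
    return [sum(2 * freqs[j] if g is None else g for j, g in enumerate(person))
            for person in genotypes]
```

For example, a person with counts `[1, None, 2]` in a sample where the second SNP has an effect allele frequency of 0.5 receives a score of 1 + 1.0 + 2 = 4.0.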

Statistical analyses

Out of the study population (n = 4814), all participants without genetic information (n = 296) and those with missing values for all three lipids (HDL-C, LDL-C and TC) (n = 22) were excluded from the analysis, leading to an analysis population of 4516 study participants (50.0% women) (Supplementary Figure S1). Participants with missing information on education (n = 13) and income (n = 283) were excluded from the respective analyses. Compared to the analysis population, participants with missing genetic information as well as missing values in lipids and SEP indicators did not differ substantially regarding the main variables included in the analysis.

First, sex- and age-adjusted linear regression models were fitted to calculate effect size estimates and their corresponding 95% confidence intervals (95% CIs) for the association of education, income and the respective GESLipid (and GESLipid-EXT) with each lipid trait. The explained variance of GESLipid (and GESLipid-EXT) on lipids was calculated with a non-adjusted linear regression model. Second, the GESLipid (and GESLipid-EXT) and SEP main effects as well as GES×SEP interaction terms were included into sex- and age-adjusted linear regression models to investigate GES×SEP interactions. Third, the genetic effect of GESLipid (and GESLipid-EXT) on each lipid trait was calculated stratified by education groups, income tertiles and diabetes status. Fourth, all possible combinations of GESLipid (and GESLipid-EXT) tertiles and SEP groups were entered into regression models as dummy variables to calculate single reference joint effects of the GES and the SEP indicators, using the group with the highest SEP and the lowest GESLDL-C/TC tertile and accordingly the highest SEP and the highest GESHDL-C tertile as reference. Additionally, absolute measures of lipids in each of the different combinations of SEP groups and GESLipid tertiles were calculated. Fifth, to analyze whether GES×SEP interactions may be affected by underlying interactions between GESLipid and SEP-related life style risk factors, smoking (S), BMI, physical activity (PA), alcohol consumption (A) or diabetes mellitus (D) main effects and the respective GES×S/BMI/PA/A/D interaction terms in addition to an SEP×S/BMI/PA/A/D interaction term were included in the interaction model separately for each life style risk factor. Single SNP main effect and single SNP interaction analysis between SNPs and education groups were performed for all SNPs used in the GESLipid. 
Education was entered as a dummy variable, and only the results of the lowest education group compared to the highest education group as reference are presented. Additionally, participants with lipid-lowering medication (n = 557) were considered in sensitivity analyses by (1) adjusting the main results of GESLipid, education and GES×Education for lipid-lowering medication and by (2) excluding individuals with lipid-lowering medication. The LD-based pruning, the calculation of GES and the single SNP analyses were performed using the PLINK v1.07 software package45 and RStudio v3.6.046. For all other analyses, SAS software v9.447 was used. All GESLipid- and extended GESLipid-EXT-related beta coefficients and 95% CIs (except for the single reference joint effect analysis) were standardized by multiplying the coefficients by the standard deviation of the respective GESLipid or the respective extended GESLipid-EXT to facilitate comparability of each GES.
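The interaction model and the standardization step can be illustrated with a small self-contained sketch. It is not the SAS analysis used in the study. The data are synthetic and noise-free, the coefficient values are arbitrary, and the names (`design_row`, `ols`, `solve`) as well as the 0/1 coding of sex are assumptions made for illustration. Confidence intervals are omitted.

```python
import random
import statistics


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(ges, edu, age, sex):
    """Intercept, GES, education dummies (reference: high), GES x education, age, sex."""
    mid = 1.0 if edu == "mid" else 0.0
    low = 1.0 if edu == "low" else 0.0
    return [1.0, ges, mid, low, ges * mid, ges * low, float(age), float(sex)]


# Synthetic, noise-free example data; all numbers are arbitrary.
rng = random.Random(0)
true_beta = [100.0, 1.4, 2.0, 30.0, -0.1, -0.8, 0.2, 3.0]
rows, y, ges_values = [], [], []
for _ in range(300):
    ges = rng.gauss(55, 4.5)
    row = design_row(ges, rng.choice(["high", "mid", "low"]),
                     rng.randint(45, 74), rng.choice([0, 1]))
    rows.append(row)
    y.append(sum(b * x for b, x in zip(true_beta, row)))
    ges_values.append(ges)

beta = ols(rows, y)
# Standardization: express GES-related coefficients per standard deviation of the score.
sd_ges = statistics.stdev(ges_values)
beta_ges_per_sd = beta[1] * sd_ges
beta_interaction_low_per_sd = beta[5] * sd_ges
```

Multiplying a per-allele coefficient by the standard deviation of the score turns it into an effect per standard deviation, which is what makes scores with different numbers of SNPs comparable.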

Results

In the analysis population, the mean age (± standard deviation) was 59.6 ± 7.8 years and the mean serum lipid concentrations (± standard deviation) were 58.16 ± 17.29 mg/dl for HDL-C, 145.41 ± 36.21 mg/dl for LDL-C and 229.26 ± 39.18 mg/dl for TC (Table 1). Overall, 11.4% had ≤ 10 years of education and 33.0% had ≥ 14 years of education. The median income was 1448.7 Euro/month. The mean number of effect alleles was 78.3 ± 5.1 for the GESHDL-C (GESHDL-C-EXT: 136.4 ± 6.6), 55.1 ± 4.5 for the GESLDL-C (GESLDL-C-EXT: 93.6 ± 5.9) and 75.7 ± 5.2 for the GESTC (GESTC-EXT: 126.2 ± 6.6). A total of 615 participants (13.6%) had diabetes mellitus. The correlation matrix of lipid phenotypes, GESLipid, education and income is shown in Supplementary Table S5.

Table 1 Characteristics of analysis population.

SEP inequalities in HDL-C and LDL-C were found in the study population, with worse HDL-C and LDL-C profiles observed in lower income and education groups (Table 2). SEP inequalities in TC were not seen. Participants in the lowest education group (≤ 10 years) had a 4.14 (95%-CI: −5.82, −2.47) mg/dl lower HDL-C, a 4.23 (95%-CI: 0.37, 8.09) mg/dl higher LDL-C and a 3.34 (95%-CI: −0.81, 7.49) mg/dl higher TC level compared to the participants in the highest education group (≥ 14 years), with similar patterns for income (Table 2). On average, a 2.91 (95%-CI: 2.45, 3.37) mg/dl higher HDL-C, a 6.17 (95%-CI: 5.13, 7.20) mg/dl higher LDL-C and a 7.33 (95%-CI: 6.24, 8.48) mg/dl higher TC were seen per standard deviation of the respective GESLipid (Table 2). The explained variance (R2) of GESHDL-C/GESLDL-C/GESTC on their respective lipid was 2.9/2.9/3.6%.
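For a non-adjusted linear regression with a single predictor, as used for the explained variance, R2 equals the squared Pearson correlation between the score and the lipid. A minimal sketch of that calculation follows; the function name and the toy numbers in the usage note are illustrative and are not study data.

```python
def r_squared(x, y):
    """R^2 of the simple regression y ~ x, i.e. the squared Pearson correlation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)
```

With the toy input `x = [1, 2, 3, 4]` and `y = [1, 3, 2, 4]`, the function returns 0.64, meaning that 64% of the variance in `y` is explained by `x`.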

Table 2 Sex- and age- adjusted effects per GESLipid standard deviation and corresponding 95% confidence intervals (95% CI) on high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C) and total cholesterol (TC) in linear regression models including main effects of education groups (≤ 10 years/11–13 years/≥ 14 years), income tertiles and genetic effect allele sum scores (GESLipid) based on14.

In the linear regression analysis including interaction terms, the effect size estimates of the interaction terms gave a stronger indication of a GESLipid by low education interaction for LDL-C (ßGES×Education: −3.87; 95%-CI: −7.47, −0.32) than for TC (ßGES×Education: −3.64; 95%-CI: −7.44, 0.16) and HDL-C (ßGES×Education: −0.56; 95%-CI: −2.09, 1.02), using the highest education group as reference (Table 3). The negative interaction coefficient showed that genetic effects of GESLipid were less strong in the lowest education group. The effect size estimates for the GESLipid by income interaction were directionally consistent, except for TC, but substantially smaller in magnitude.
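To make the interpretation of the interaction coefficient explicit, the model can be written out as below. The notation is introduced here for illustration and is an assumption, not the authors' own: the subscripts M and L denote the middle and lowest education group, with the highest group as reference.

```latex
\begin{aligned}
\text{Lipid}_i ={}& \beta_0 + \beta_{G}\,\text{GES}_i
  + \beta_{M}\,\text{Edu}^{\text{mid}}_i + \beta_{L}\,\text{Edu}^{\text{low}}_i \\
 &+ \beta_{G\times M}\,\text{GES}_i\,\text{Edu}^{\text{mid}}_i
  + \beta_{G\times L}\,\text{GES}_i\,\text{Edu}^{\text{low}}_i
  + \beta_{a}\,\text{age}_i + \beta_{s}\,\text{sex}_i + \varepsilon_i
\end{aligned}
```

Under this parameterization the genetic effect per standard deviation of the score is \(\beta_{G}\) in the highest education group and \(\beta_{G} + \beta_{G\times L}\) in the lowest. An estimate of \(\beta_{G\times L} = -3.87\) for LDL-C therefore corresponds to a genetic effect that is 3.87 mg/dl per standard deviation smaller in the lowest than in the highest education group.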

Table 3 Sex- and age- adjusted effects per GESLipid standard deviation and corresponding 95% confidence intervals (95% CI) on high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C) and total cholesterol (TC) in linear regression models including main effects of a lipid-associated genetic effect allele sum score (GESLipid based on14), indicators of socioeconomic position (SEP; education groups and income tertiles) and interaction terms of GESLipid and indicators of SEP.

In the stratified analysis, the two higher education groups showed stronger genetic effect size estimates per GESLipid standard deviation for LDL-C and TC than the lowest education group, supporting the results of the interaction analysis (Fig. 1). The results for HDL-C followed the same pattern, but the difference in effect size between the highest and the lowest education group was considerably smaller, and the 95% confidence interval of the effect in the lowest education group completely overlapped with the 95% confidence intervals of both higher education groups. The results of the stratified analysis for income did not follow a clear pattern (Fig. 1). The partial R2 (explained proportion of variance) of the GESLDL-C on LDL-C and of the GESTC on TC was higher in the two higher education groups (LDL-C: high education group R2 = 0.033, middle education group R2 = 0.035; TC: high education group R2 = 0.044, middle education group R2 = 0.038) than in the lowest education group (LDL-C: R2 = 0.005; TC: R2 = 0.013). For income and for HDL-C this trend could not be observed.

Figure 1

Sex-, age-adjusted effects per GESLipid standard deviation and corresponding 95% confidence interval (95% CI) of the genetic effect on (a) high-density lipoprotein cholesterol (HDL-C), (b) low-density lipoprotein cholesterol (LDL-C) and (c) total cholesterol (TC), stratified by education groups (years) and income tertiles in linear regression models using the GESLipid based on14.

The analysis of single reference joint effects describes the relationship between SEP and GESLipid on lipids in detail by comparing the effects of different combinations of SEP groups and GESLipid tertiles. Each beta estimate represents the difference in lipids between the specific group and the reference group. The reference group was selected as the combination of GESLDL-C tertile and SEP group representing the lowest CVD risk (equally applied for GESTC and GESHDL-C). For HDL-C, beta estimates showed a downward trend between and within education groups with decreasing years of education and decreasing number of effect alleles. Compared to the reference group with the highest education and highest GESHDL-C, participants with the lowest education and the lowest GESHDL-C showed a 9.85 mg/dl lower HDL-C level. Slightly smaller joint effects following the same pattern were observed for income and GESHDL-C on HDL-C (Fig. 2). The joint effects for LDL-C and TC showed an upward trend with decreasing years of education and increasing number of effect alleles. Participants with the highest CVD risk (highest GESLipid and lowest education) had a 13.36 mg/dl higher LDL-C and a 15.40 mg/dl higher TC level than those with the lowest CVD risk (highest education and lowest GESLipid). Slightly stronger joint effects were observed for income and GESLipid on LDL-C and TC (Figs. 3, 4). The absolute measures of lipids in each of the different combinations of SEP groups and GESLipid tertiles showed the same pattern as in the single reference joint effect analysis (Supplementary Figures S3–S5). Participants in the highest GESLipid tertile and the lowest education group had on average 8.8/14.0/21.1 mg/dl higher HDL-C/LDL-C/TC levels than participants in the lowest GESLipid tertile and the highest education group. Similar measures were observed for income and GESLipid tertiles.
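The logic of a single reference joint effect analysis can be sketched as follows. This is a simplified illustration: it compares unadjusted cell means with one reference cell, which corresponds to the dummy-variable regression without the sex and age adjustment used in the study. The function names, the rank-based tertile rule (ties broken by input order) and the toy data in the usage note are assumptions.

```python
def tertiles(values):
    """Rank-based tertile index per value (0 = lowest third, 2 = highest third)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank * 3 // len(values)
    return result


def joint_effects(ges, sep_group, lipid, reference):
    """Mean lipid difference of each (GES tertile, SEP group) cell versus one reference cell."""
    cells = {}
    for t, s, value in zip(tertiles(ges), sep_group, lipid):
        cells.setdefault((t, s), []).append(value)
    means = {cell: sum(v) / len(v) for cell, v in cells.items()}
    return {cell: m - means[reference] for cell, m in means.items()}
```

With toy data such as `ges = [50, 51, 60, 61, 70, 71]`, alternating `"high"` and `"low"` education, and LDL-C values `[100, 110, 120, 130, 140, 150]`, calling `joint_effects(ges, sep, ldl, reference=(0, "high"))` gives a difference of 50 for the cell `(2, "low")`, the combination of highest score tertile and lowest education.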

Figure 2

Sex- and age-adjusted effects and corresponding 95% confidence intervals (95% CI) on high-density lipoprotein cholesterol (HDL-C) in linear regression models for single reference joint effects of tertiles of a HDL-C-associated genetic effect allele sum score (GESHDL-C based on14) and socioeconomic position indicators, calculated separately for education groups and income tertiles, with the group of having a high GESHDL-C and the highest socioeconomic position as reference.

Figure 3

Sex- and age-adjusted effects and corresponding 95% confidence intervals (95% CI) on low-density lipoprotein cholesterol (LDL-C) in linear regression models for single reference joint effects of tertiles of a LDL-C-associated genetic effect allele sum score (GESLDL-C based on14) and socioeconomic position indicators, calculated separately for education groups and income tertiles, with the group of having a low GESLDL-C and the highest socioeconomic position as reference.

Figure 4

Sex- and age-adjusted effects and corresponding 95% confidence intervals (95% CI) on total cholesterol (TC) in linear regression models for single reference joint effects of tertiles of a TC-associated genetic effect allele sum score (GESTC based on14) and socioeconomic position indicators, calculated separately for education groups and income tertiles, with the group of having a low GESTC and the highest socioeconomic position as reference.

After including interaction terms of smoking, BMI, physical activity, alcohol consumption and diabetes separately into the GESLDL-C×Education interaction model, we observed a GESLDL-C×Diabetes interaction effect (ßGES×Diabetes: −4.46; 95%-CI: −7.38, −1.53) indicating less strong genetic effects on LDL-C in diabetics compared to non-diabetics (Table 4), which was also observed in the stratified analysis (Supplementary Figure S6). Including the GESLDL-C×Diabetes interaction effect also partly explained the GESLDL-C×Education interaction effect of the lowest compared to the highest education group, as the effect estimate was attenuated (ßGES×Education: −3.42; 95%-CI: −6.98, 0.18) (Table 4). The GESLDL-C×Education interaction was not affected by other life style risk factors, as the respective GESLDL-C×low education interaction effect size estimates did not change in magnitude after including smoking, BMI, physical activity and alcohol consumption in the regression models (Table 4). In addition, results did not indicate GESLDL-C by life style risk factor interactions.

Table 4 Sex- and age- adjusted effects per GESLDL-C standard deviation and corresponding 95% confidence intervals (95% CI) on LDL-C in linear regression models including main effects and interaction terms of a LDL-C-associated genetic effect allele score (GESLDL-C based on14), education groups and SEP-related life style risk factors (i.e., current smoking [S], BMI, physical activity [PA], alcohol consumption [A; per 100 g/week]) and diabetes mellitus [D].

Results of the extended GESLipid-EXT, which included additional SNPs selected from recent large-scale GWAS, showed overall smaller effect size estimates per GESLipid-EXT standard deviation for all three lipid traits compared to the GESLipid (Supplementary Table S6). The explained variance (R2) of the extended GESHDL-C-EXT was slightly higher (3.2%) compared to the GESHDL-C, and for the extended GESLDL-C-EXT (2.6%) and GESTC-EXT (3.4%) slightly lower compared to the GESLDL-C and GESTC. In the extended GESLipid-EXT by SEP indicator interaction analysis, using the highest SEP groups as reference, effect size estimates of interaction terms were overall slightly smaller in magnitude for all three lipid levels compared to the GESLipid (Supplementary Table S7). Effects of the extended GESLipid-EXT stratified by education groups showed similar patterns for HDL-C, LDL-C, and TC compared to the GESLipid. However, differences in the genetic effects between education groups were smaller in magnitude, while for income no difference in the genetic effects was observed (Supplementary Figure S2). The downward trend between and within education groups with decreasing years of education and decreasing number of effect alleles in the analysis of single reference joint effects for HDL-C and the upward trend for LDL-C and TC for both SEP indicators also showed the same pattern using the extended GESLipid-EXT compared to the GESLipid (Supplementary Tables S8–S9).

Results of the single SNP main effects of all SNPs used in the GESLipid are presented in Supplementary Tables S2–S4. Of these, 61 of 71 HDL-C-, 52 of 58 LDL-C- and 65 of 74 TC-associated SNPs were directionally consistent. Single SNP interaction analysis for education and lipid-associated SNPs showed that some SNPs contributed more strongly than the remaining SNPs to the observed GESLDL-C by low education interaction on LDL-C (Supplementary Table S10). Similar differences in single SNP interaction effect size estimates were present for HDL-C- and TC-associated SNPs (Supplementary Tables S11–S12). Two of the LDL-C-associated loci with the strongest indication for interaction with education (i.e., PCSK9, MAFB) were also among the strongest TC-associated SNPs. The two HDL-C-associated loci with the strongest indication for interaction with education were OR4C46 and LPL.

In the sensitivity analysis, main results for all lipid traits did not differ in direction and only slightly in magnitude after adjustment for lipid-lowering medication as well as after the exclusion of participants on lipid-lowering medication (Supplementary Tables S13–S14). The GESLDL-C by low education interaction effect size estimates did not change substantially compared to the main analysis population.

Discussion

The aim of the study was to investigate whether the SEP indicators education and income interact with genetic sum scores of lipid-increasing effect alleles in a population-based cohort study. To the best of our knowledge, this was the first study investigating G×E interaction on lipids in adults with SEP as the environmental factor. The results gave some indication of an interaction between the GESLipid and the SEP indicator education, which was strongest for LDL-C. This was supported by the stratified analysis, in which the strongest genetic effects on LDL-C were observed in the high education group, as well as by the single reference joint effect analysis. After including information on smoking, BMI, physical activity, alcohol consumption and diabetes mellitus in the analysis, there was an indication that a GESLDL-C by diabetes mellitus interaction partly explained the observed GESLDL-C by low education interaction. Using the extended GESLipid-EXT in comparison to the GESLipid, effect size measures were smaller but directionally consistent. Li et al. reported an interaction between the SNP rs2206734 (CDKAL1), a favorable childhood environment and birthweight on the metabolically obese, normal weight phenotype (defined as the presence of hypertension, hypertriglyceridemia, low serum HDL-C or impaired fasting plasma concentrations of glucose) in Chinese children. Their findings suggest that a favorable childhood environment, represented by a composite score consisting of parental education, annual household income, high physical activity and fruit consumption, can further amplify a protective effect of the CDKAL1 locus in children with a pediatric metabolic syndrome and high birthweight33. However, that study investigated a composite environmental score including parental SEP and a composite phenotype in Chinese children and can therefore only indirectly be compared with the results of the present study.

Recent studies have investigated statistical interactions between genetic risk scores and SEP-related lifestyle factors or health behaviors such as physical activity, dietary patterns and BMI and their effects on lipids27,28. Cole et al. (2014) demonstrated in a population of European ancestry that the effect of a genetic risk score consisting of HDL-C-increasing alleles was stronger for lean than for obese (BMI ≥ 35 kg/m2) study participants. These interactions were largely driven by the SNPs rs3764261 (CETP), rs4846914 (GALNT2), rs7241918 (LIPG) and rs6065906 (PLTP)28. As high education is strongly associated with low BMI48, the results of the present study may at least partly reflect this interaction. However, in the present study the SNPs representing the loci CETP, GALNT2, LIPG and PLTP did not show any indication of a SNP by education interaction on HDL-C.

Justesen et al. (2015) reported an interaction between a genetic risk score of HDL-C-decreasing effect alleles and physical activity in a Danish population (n = 5961), suggesting that the genetic risk score exerted a smaller effect in physically active compared to inactive individuals. However, this interaction was not statistically significant in a replication cohort of smaller sample size27. As higher education is usually associated with a higher level of physical activity49, the results of the present study may represent the same interaction signal, i.e. a stronger genetic effect of HDL-C-increasing effect alleles on HDL-C in groups with higher education.

Recent SNP×E interaction analyses have identified several lipid-associated loci interacting with lifestyle factors such as smoking and diet50,51,52. Junyent et al. (2009) reported an interaction of rs6720173 (ABCG5) and rs11887534, rs6709904, rs4148217 (ABCG8) with smoking50, Lu et al. (2010) showed an interaction of rs174546 (FADS1) with intake of n-3 and n-6 polyunsaturated fatty acids51, and Kim et al. (2013) demonstrated an interaction of rs2072183 (NPC1L1) with cholesterol intake in men only52. Although SEP is strongly associated with smoking and dietary factors, none of these loci showed any indication of an interaction with education in the present study. The GESLDL-C×Diabetes mellitus interaction on LDL-C observed in the present study partly explained the GESLDL-C×Education interaction. Deng et al. also found a Gene×Diabetes mellitus interaction. In their study the SNP rs16996148 (CILP2) decreased the risk of hyperlipidemia, whereas rs16996148 GT/TT and diabetes mellitus as well as rs16996148 TT and diabetes mellitus increased the risk of hyperlipidemia29.

In the present analysis, smaller effects on lipids were observed when using the extended GESLipid-EXT compared to the GESLipid. This may be caused by the overall smaller effect size of newly discovered loci, as the larger analysis populations in recent GWAS make it possible to detect risk alleles with very small effects. It may also be because the recently published GWAS meta-analyses were based on single large cohorts, potentially producing less generalizable study results16,17. The overall smaller main effects of the additional SNPs included in the extended GESLipid-EXT have led to smaller interaction effect size estimates, i.e. a less strong impact of SEP on the expression of the average genetic effect of all SNPs included.

It was assumed in the present analysis that the GESLipid represents cumulative causal factors for lipids, even though it is most likely that the SNPs used to construct the GESLipid are proxy markers in high LD with the causal genetic variants53. The effect of SEP, especially education, on lipids and CVD risk in general was also assumed to be causal, as supported by numerous studies54,55,56, including Mendelian randomization studies exploring the association of instrumental variables with CVD and CVD risk factors by using genetic risk scores related to educational attainment57,58. However, SEP has no direct causal effect on CVD risk; its effect is mediated by a complex interplay of social inequalities in risk factors, e.g., access to preventive interventions, lifestyle factors, physiological stress and psychosocial risks, as well as in protective factors54,57,59. Results of the present study suggest that SEP may also have an effect on CVD risk by affecting the expression of LDL-C-related genetic risks. One possible mechanism that has been hypothesized in this regard is epigenetic modification. In contrast to an individual’s genome, the epigenome is subject to environmentally induced changes during the life course and is crucial for the regulation of gene expression. Several studies have indeed reported SEP-related differences in epigenetic markers60,61. Interestingly, the lifestyle factors BMI, physical activity, smoking and alcohol consumption did not account for the observed GESLDL-C by education interaction, while diabetes mellitus accounted for it only partly. Consequently, it has to be assumed that risk factors other than those included in the present analysis may mediate the observed GESLDL-C by education interaction.
One explanation for the stronger genetic effects on LDL-C in higher education groups may be that non-genetic health risks are less prevalent in these groups, leading to LDL-C profiles that are more strongly affected by genetic than by non-genetic risk factors. This hypothesis is supported by the proportion of the variance in LDL-C explained by the GESLDL-C (R2), which was higher in the two higher education groups than in the lower education group.
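The reasoning behind the R2 argument can be illustrated with a minimal simulation sketch. It is not the analysis of the present study: all numbers, group labels and function names (`r_squared`, `simulate_group`) are hypothetical, and the per-unit genetic effect is deliberately held equal in both groups so that only the amount of non-genetic variability differs.

```python
import random


def r_squared(x, y):
    """R^2 of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy ** 2) / (sxx * syy)


def simulate_group(n, genetic_effect, nongenetic_sd, rng):
    """Simulate a standardized score and an LDL-C-like outcome (arbitrary units).

    Assumption: outcome = baseline + genetic_effect * score + non-genetic noise.
    """
    score = [rng.gauss(0.0, 1.0) for _ in range(n)]
    outcome = [120.0 + genetic_effect * s + rng.gauss(0.0, nongenetic_sd)
               for s in score]
    return score, outcome


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical groups: identical genetic effect per score unit, but the
# "low education" group carries more non-genetic variability in the outcome.
low_score, low_outcome = simulate_group(5000, 10.0, 40.0, rng)
high_score, high_outcome = simulate_group(5000, 10.0, 20.0, rng)

r2_low = r_squared(low_score, low_outcome)
r2_high = r_squared(high_score, high_outcome)

print(f"R^2, more non-genetic variability:  {r2_low:.3f}")
print(f"R^2, less non-genetic variability: {r2_high:.3f}")
```

In this sketch the score explains a larger share of the outcome variance in the group with less non-genetic variability, although the genetic effect per score unit is the same. A higher R2 is therefore compatible with the hypothesis above, but it does not by itself distinguish between a larger genetic effect and smaller non-genetic variance.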

The effect of SEP indicators on health is outcome specific and each indicator operates via different pathways linking social factors to health outcomes43,62. Even though educational attainment and income were moderately correlated (r = 0.45) in the present study, the differences in the strength of the genetic effect on LDL-C seen across education groups were not seen across income tertiles. The net effect of education is reflected, among other things, in the ability to turn health-related information into behavior and to understand therapeutic measures43. This could support the hypothesis that in highly educated individuals, who are able to create environments with fewer health risks, the genetic influence on LDL-C might be stronger. Furthermore, education as a marker of childhood social environment could, due to the duration of exposure until adulthood, be more likely to support epigenetic changes. Material resources do not seem to modify genetic risk on LDL-C.

Strengths of the present study were its population-based study sample and the use of two different individual SEP indicators in the analysis. Even though education and income are correlated SEP indicators, each of them represents certain aspects of SEP related to different health behaviors and risks. Moreover, two different GESLipid for each lipid trait were compared, which made it possible to check for differences in the genetic effects and G×E interactions between scores derived from different GWAS study populations. The sample size and the limited statistical power for single SNP analysis have to be mentioned as limitations of the present study. However, the indication for interaction was based on the cumulative genetic risk of the study participants. Another limitation was the cross-sectional design of the study, which does not allow for strong conclusions on the causality of effects. However, educational attainment is usually acquired in adolescence or early adulthood, and lipids were assessed at an older age in the present study. Due to this exposure-outcome temporality, reverse causation is very unlikely. Even if the effect of education on lipids was not causal, a modification of the GESLDL-C effect on LDL-C by education would still be of interest, because knowledge of the heterogeneous genetic effects in different education groups could be useful for CVD risk prediction and genotype-based targeted interventions35. Furthermore, this knowledge supports lifestyle-based CVD interventions in lower education groups, given the lower genetic effect on LDL-C in these groups. Finally, it cannot be excluded that the indication for a GESLDL-C × Education interaction was observed by chance due to the number of independent tests performed. However, the number of independent tests performed is justified by the three lipid end points and the two SEP indicators, and we have calculated 95% confidence intervals to report the precision of the obtained effect size estimates.
Furthermore, the results of the interaction analyses are supported by the results of the stratified and single-reference joint effect analyses, which showed a consistent pattern across phenotypes and indicated differences in genetic effect sizes between the education groups.
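The concern about chance findings can be made concrete with a short arithmetic sketch. It rests on assumptions that are not taken from the present study: a nominal significance level of 0.05, fully independent tests, and a count of six tests (three lipid end points times two SEP indicators). The function name `familywise_error` is hypothetical, and the Bonferroni threshold is shown for orientation only, not as a correction that was applied here.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false positive among n independent tests,
    each performed at significance level alpha, when all null hypotheses hold."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Assumed test count: three lipid end points x two SEP indicators.
n_lipid_traits = 3
n_sep_indicators = 2
n_tests = n_lipid_traits * n_sep_indicators

alpha = 0.05  # assumed nominal level, matching the use of 95% confidence intervals

fwer = familywise_error(alpha, n_tests)
bonferroni_alpha = alpha / n_tests  # per-test level that keeps the family-wise error <= alpha

print(f"Number of tests: {n_tests}")
print(f"Family-wise error rate without correction: {fwer:.3f}")
print(f"Bonferroni-adjusted per-test level: {bonferroni_alpha:.4f}")
```

Because the lipid traits and the two SEP indicators are correlated, the tests are not truly independent, so the figure obtained under independence should be read as an upper-end approximation rather than an exact error rate.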

The results of the present study gave some indication of an interaction between genetic variants associated with LDL-C and education in a population-based cohort study. Stronger genetic effects were observed in groups of higher education, which seemed to be partly mediated by diabetes mellitus but not by other lifestyle risk factors such as BMI, smoking, physical activity and alcohol consumption. This provides supporting evidence that SEP has an impact on the expression of genetic susceptibility related to LDL-C. Further research is needed to replicate our findings in independent study samples, to investigate possible biological mechanisms behind the interaction and to assess the potential of the observed gene by SEP interactions for improving CVD prediction. Additionally, our study included only individuals of European origin, and therefore the results may not be generalizable to populations of other ethnicities.