A genetic sum score of effect alleles associated with serum lipid concentrations interacts with educational attainment

High-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C), and total cholesterol (TC) levels are influenced by both genes and the environment. The aim was to investigate whether education and income as indicators of socioeconomic position (SEP) interact with lipid-increasing genetic effect allele scores (GES) in a population-based cohort. Using baseline data of 4516 study participants, age- and sex-adjusted linear regression models were fitted to investigate associations between GES and lipids stratified by SEP as well as including GES×SEP interaction terms. In the highest education group, compared to the lowest, stronger effects per GES standard deviation were observed for HDL-C (2.96 mg/dl [95%-CI: 2.19, 3.83] vs. 2.45 mg/dl [95%-CI: 1.12, 3.72]), LDL-C (6.57 mg/dl [95%-CI: 4.73, 8.37] vs. 2.66 mg/dl [95%-CI: −0.50, 5.76]) and TC (8.06 mg/dl [95%-CI: 6.14, 9.98] vs. 4.37 mg/dl [95%-CI: 0.94, 7.80]). Using the highest education group as reference, interaction terms showed indication of a GES by low education interaction for LDL-C (β GES×Education : −3.87; 95%-CI: −7.47, −0.32), which was slightly attenuated after controlling for the GES LDL-C ×Diabetes interaction (β GES×Education : −3.42; 95%-CI: −6.98, 0.18). The present study showed stronger genetic effects on LDL-C in higher SEP groups and gave indication for a GES LDL-C ×Education interaction, demonstrating the relevance of SEP for the expression of genetic health risks.


Scientific Reports
| (2021) 11:16541 | https://doi.org/10.1038/s41598-021-95970-z www.nature.com/scientificreports/
[…] diabetes and sleep duration [22][23][24][25][26][27][28][29] . While some studies showed stronger effects of lipid-increasing alleles in groups reporting a more favorable life style such as lean or physically active individuals 27,28 , Bentley et al. also reported SNPs with stronger effects in smokers as an example of an unfavorable life style 24 . So far, interactions of lipid-related effect alleles with SEP have not been investigated in adults, but G×SEP interactions on overall health and other traits such as body mass index (BMI) have been indicated by twin and population-based studies [30][31][32] . Li et al. reported an interaction between one SNP related to a metabolically obese, normal weight phenotype including high TC levels and a composite score including parental education and household income in metabolically obese, normal weight children 33 . SEP can be considered a context-defining variable that describes certain risk constellations such as unequally distributed environmental, psychosocial and behavioral health risk factors, and may be better suited for describing health-related environments as a whole than single risk factors. Thus, investigating interactions between lipid-associated loci and SEP indicators may be crucial for identifying subgroups for which genetic effects show stronger signals than for the average population 34 and who may benefit from genotype-based targeted intervention 35 .
The aim of this study was to investigate whether the SEP indicators education and income interact with genetic sum scores of lipid-increasing effect alleles (GES) for HDL-C (GES HDL-C ), LDL-C (GES LDL-C ) and TC (GES TC ) in a population-based cohort study. In order to explore whether any detected SEP interactions can be explained by SEP-associated life style risk factors, information on smoking, BMI, physical activity, alcohol consumption and diabetes mellitus was included in the interaction analysis. Main results of GES based on the GLGC GWAS meta-analysis 14 were compared to the results of GES also including loci of more recently published GWAS to check for differences in lipid prediction and GES×SEP interaction.

Methods
Study population. All analyses are based on baseline data of the Heinz Nixdorf Recall Study (Risk factors evaluation of coronary calcium and lifestyle cohort), a population-based prospective cohort study. The rationale of the study, study design, sampling methods, response rate, and data collection have been published in detail previously 36,37 . In brief, 4814 participants aged 45-74 years were randomly selected between 2000 and 2003 from mandatory registries of residence of the cities Bochum, Essen and Mülheim/Ruhr within the densely populated Ruhr metropolitan area in Germany. All participants gave written informed consent. The study was approved by the institutional ethics committee of the University Duisburg-Essen and was conducted according to the guidelines and recommendations for ensuring Good Epidemiological Practice 38 . An extended quality management procedure and certification according to DIN ISO 9001:2000 was established. All study participants were of European ancestry and thus genetically very homogeneous.
Data collection. At study baseline, standard enzymatic methods (homogeneous direct determination with OPERA measuring system) were used to measure HDL-C, LDL-C and TC within 12 h after blood serum collection at the central laboratory of the University Hospital of Essen, Germany. Participants were asked to fast for at least 4 h before examination, resulting in 60% of subjects with fasting status > 8 h, 2% 6-8 h, 5% 4-6 h, 26% 2-4 h and 7% with < 2 h of fasting. The fasting duration was on average 9.7 h (SD: ± 4.9 h), with no difference between men and women 39 . Information on educational attainment, household income, smoking status, physical activity and diagnosis of diabetes mellitus or use of anti-diabetic medication was collected at study baseline in standardized computer-assisted face-to-face interviews. Information on alcohol consumption was collected with a self-administered questionnaire.
Education was defined by combining school and vocational training as total years of formal education according to the International Standard Classification of Education (ISCED 97) 40 . Years of education were categorized into three groups with ≤ 10 years (equivalent to a minimum compulsory school attendance with no additional vocational degree), 11-13 years (equivalent to upper secondary educational degrees or a combination of lower secondary education and vocational training), and ≥ 14 years of education (equivalent to a vocational training including additional qualification or a university degree). Income was measured as the monthly household equivalent income calculated by dividing the participants' household net income by a weighting factor for each household member 41 . Income was categorized into three groups using sex-specific tertiles. To account for their different mechanisms in causing health inequalities, both SEP indicators were analyzed separately 42,43 . The body mass index (BMI, kg/m²) was calculated based on standardized measurements of body weight (in underwear) and height. Physical activity was defined as exercising one or more times per week versus no weekly engagement in physical exercise. Smoking status was dichotomized for analyses as current smoker (smoking cigarettes during the past 12 months) versus former and never smoker. Alcohol intake was estimated as grams of pure alcohol per week using information on the number of alcoholic drinks usually consumed in a week by type of drink (i.e., beer, wine, sparkling wine, spirits). Participants were classified as diabetics if they reported a diagnosis of diabetes mellitus, if a fasting blood glucose level ≥ 126 mg/dl or a postprandial blood glucose level ≥ 200 mg/dl was found, or if the use of anti-diabetic medication was documented. All variables were checked in an ongoing data quality control during the baseline examination period.
Genetic data.
Lymphocyte DNA was isolated from EDTA anti-coagulated venous blood using the Chemagic Magnetic Separation Module I (Chemagen, Baesweiler, Germany). Genotyping was performed by matrix-assisted laser desorption ionization-time of flight mass spectrometry-based iPLEX Gold assay at the Department of Genomics, Life and Brain Center, Bonn, Germany, using two different Illumina microarrays (Metabochip, Global Screening Array (GSA); Illumina, San Diego, USA) according to the manufacturer's protocols. Genotype imputation was carried out using IMPUTE v.2.3.0 44 . Quality control was performed on subject level including sex-, ethnicity- and relatedness-checks, excluding subjects with missing genotype data > 10%. Further, single nucleotide polymorphisms (SNPs) with a missing genotype frequency > 10% were excluded. Using the GLGC GWAS meta-analyses 14 and all subsequently published large-scale GWAS [15][16][17] , 152 HDL-C-, 108 LDL-C- and 131 TC-associated SNPs at a genome-wide significance level of p < 5 × 10⁻⁸ in study populations of European ancestry were selected for analysis (Supplementary Table S1). Of these, 102, 85, and 103 genotyped SNPs or a proxy were found for HDL-C, LDL-C and TC, respectively, on the Metabochip. Additionally, 3/2/3 HDL-C/LDL-C/TC SNPs were found in imputed data of the Metabochip and 18/8/11 HDL-C/LDL-C/TC SNPs or a proxy were found on the GSA (Supplementary Tables S1-S4). A proxy was defined as a SNP in linkage disequilibrium (LD) ≥ 0.8. For all SNPs included in the analysis, no deviation from Hardy-Weinberg equilibrium was found (p ≤ 1 × 10⁻⁶). LD-based SNP pruning was performed with PLINK to exclude selected SNPs correlated with an LD ≥ 0.8 before calculating the GES Lipid . However, all SNPs included represented independent loci and none of the SNPs had to be pruned out. Two different GES Lipid were calculated for each lipid trait: GES Lipid based on 14 and an extended GES Lipid-EXT based on [14][15][16][17] .
In detail, the genetic sum scores of effect alleles for HDL-C (GES HDL-C ), LDL-C (GES LDL-C ) and TC (GES TC ) were calculated by aggregating the total number of lipid-increasing effect alleles (0, 1 or 2) for each individual from the Heinz Nixdorf Recall Study population across the selected SNPs based on the GLGC GWAS meta-analyses 14 . For comparison, extended GES Lipid-EXT were calculated for each trait by additionally adding the number of lipid-increasing effect alleles of selected SNPs based on all subsequently published large-scale GWAS (i.e., GES HDL-C-EXT , GES LDL-C-EXT , GES TC-EXT ) [15][16][17] . Imputation of missing genotype information was based on the study sample's effect allele frequencies according to the PLINK scoring routine 45 .
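The unweighted sum score described above can be illustrated with a minimal sketch. It assumes that a missing genotype is replaced by its expected effect allele count (twice the sample effect allele frequency), which is how we read the allele-frequency-based imputation mentioned in the text. The function names and the toy genotype matrix are hypothetical and are not taken from the study.

```python
def effect_allele_frequencies(genotypes):
    """Per-SNP effect allele frequency, using non-missing genotypes only.

    genotypes: one list per participant, holding the number of
    lipid-increasing effect alleles per SNP (0, 1, 2) or None if missing.
    Assumes every SNP has at least one observed genotype.
    """
    n_snps = len(genotypes[0])
    freqs = []
    for j in range(n_snps):
        observed = [row[j] for row in genotypes if row[j] is not None]
        freqs.append(sum(observed) / (2 * len(observed)))
    return freqs


def genetic_sum_scores(genotypes):
    """Unweighted sum of effect alleles per participant.

    Missing genotypes contribute the expected allele count 2 * frequency
    (assumed imputation rule, see lead-in).
    """
    freqs = effect_allele_frequencies(genotypes)
    scores = []
    for row in genotypes:
        score = sum(2 * freqs[j] if g is None else g for j, g in enumerate(row))
        scores.append(score)
    return scores


# Hypothetical toy data: 3 participants x 3 SNPs, one missing genotype.
toy_genotypes = [
    [2, 1, 0],
    [1, None, 2],
    [0, 1, 1],
]
toy_scores = genetic_sum_scores(toy_genotypes)
```

In this toy example every SNP has an effect allele frequency of 0.5, so the missing genotype of the second participant contributes one expected allele to that participant's score.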

Statistical analyses.
Out of the study population (n = 4814) all participants without genetic information (n = 296) and missing values for all three lipids (HDL-C, LDL-C and TC) (n = 22) were excluded from the analysis, leading to an analysis population of 4516 study participants (50.0% women) (Supplementary Figure S1). Participants with missing information on education (n = 13) and income (n = 283), were excluded from respective analysis. Compared to the analysis population, participants with missing genetic information as well as missing values in lipids and SEP indicators did not differ substantially regarding the main variables included in the analysis.
First, sex- and age-adjusted linear regression models were fitted to calculate effect size estimates and their corresponding 95% confidence intervals (95% CIs) for the association of education, income and the respective GES Lipid (and GES Lipid-EXT ) with each lipid trait. The explained variance of GES Lipid (and GES Lipid-EXT ) on lipids was calculated with a non-adjusted linear regression model. Second, the GES Lipid (and GES Lipid-EXT ) and SEP main effects as well as GES×SEP interaction terms were included into sex- and age-adjusted linear regression models to investigate GES×SEP interactions. Third, the genetic effect of GES Lipid (and GES Lipid-EXT ) on each lipid trait was calculated stratified by education groups, income tertiles and diabetes status. Fourth, all possible combinations of GES Lipid (and GES Lipid-EXT ) tertiles and SEP groups were entered into regression models as dummy variables to calculate single reference joint effects of the GES and the SEP indicators, using the group with the highest SEP and the lowest GES LDL-C/TC tertile and accordingly the highest SEP and the highest GES HDL-C tertile as reference. Additionally, absolute measures of lipids in each of the different combinations of SEP groups and GES Lipid tertiles were calculated. Fifth, to analyze whether GES×SEP interactions may be affected by underlying interactions between GES Lipid and SEP-related life style risk factors, smoking (S), BMI, physical activity (PA), alcohol consumption (A) or diabetes mellitus (D) main effects and the respective GES×S/BMI/PA/A/D interaction terms in addition to an SEP×S/BMI/PA/A/D interaction term were included in the interaction model separately for each life style risk factor. Single SNP main effect and single SNP interaction analyses between SNPs and education groups were performed for all SNPs used in the GES Lipid .
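The second analysis step (a sex- and age-adjusted model with GES×Education interaction terms) can be sketched as follows. This is an illustration only: the study used SAS, whereas the sketch fits ordinary least squares via the normal equations in plain Python. The data are synthetic and noise-free, the coefficient values in `TRUE_BETA` are hypothetical, and age is centred purely for numerical convenience.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(x_rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(x_rows[0])
    xtx = [[sum(r[i] * r[j] for r in x_rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x_rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(age_c, sex, ges, low_edu, mid_edu):
    """Intercept, age, sex, GES, education dummies (reference: highest
    education group) and the two GES x education interaction terms."""
    return [1.0, age_c, sex, ges, low_edu, mid_edu, ges * low_edu, ges * mid_edu]


# Hypothetical coefficients, in the order produced by design_row.
TRUE_BETA = [140.0, 0.3, 5.0, 6.5, 4.0, 1.0, -3.9, -0.5]

rows, lipid = [], []
for i in range(60):                      # 60 synthetic participants
    age_c = (7 * i) % 30 - 15            # centred age in years
    sex = i % 2
    ges = ((2 * i) % 11 - 5) / 2.0       # GES in standard deviation units
    low_edu = 1.0 if i % 3 == 0 else 0.0
    mid_edu = 1.0 if i % 3 == 1 else 0.0
    row = design_row(age_c, sex, ges, low_edu, mid_edu)
    rows.append(row)
    lipid.append(sum(b * v for b, v in zip(TRUE_BETA, row)))  # no noise added

beta_hat = ols(rows, lipid)
ges_effect_high = beta_hat[3]                 # GES effect in the reference group
ges_effect_low = beta_hat[3] + beta_hat[6]    # GES effect in the lowest group
```

Because the synthetic outcome contains no noise, the fit recovers `TRUE_BETA`, and the GES effect in the lowest education group is the GES main effect plus the GES×low-education coefficient. With real data the same design matrix applies, but standard errors and confidence intervals would be needed as well.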
Education was entered as a dummy variable and only the results of the lowest education group compared to the highest education group as reference were presented. Additionally, participants with lipid-lowering medication (n = 557) were considered in sensitivity analyses by (1) adjusting the main results of GES Lipid , education and GES×Education for lipid-lowering medication and by (2) excluding individuals with lipid-lowering medication. The LD-based pruning, the calculation of GES and the single SNP analyses were performed using the PLINK v1.07 software package 45 and RStudio v3.6.0 46 . For all other analyses SAS software v9.4 47 was used. All GES Lipid - and extended GES Lipid-EXT -related beta coefficients and 95% CIs (except for the single reference joint effect analysis) were standardized by multiplying the coefficients by the standard deviation of the respective GES Lipid or the respective extended GES Lipid-EXT to facilitate comparability of each GES.
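The standardization described above amounts to a simple rescaling, sketched below. The per-allele estimate, its confidence limits and the toy scores are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import stdev


def standardize_per_sd(beta_per_allele, ci_per_allele, ges_values):
    """Rescale a per-allele coefficient and its 95% CI to 'per GES SD'.

    Multiplying the estimate and both confidence limits by the same
    positive constant (the SD of the score) keeps the interval valid.
    """
    sd = stdev(ges_values)  # sample SD; assumed estimator
    lower, upper = ci_per_allele
    return beta_per_allele * sd, (lower * sd, upper * sd)


# Hypothetical example: three scores with SD 4.0, per-allele effect 1.5 mg/dl.
toy_ges = [51, 55, 59]
beta_sd, ci_sd = standardize_per_sd(1.5, (1.0, 2.0), toy_ges)
```

Here a per-allele effect of 1.5 mg/dl with limits 1.0 and 2.0 becomes 6.0 mg/dl per SD with limits 4.0 and 8.0, which is what makes scores built from different numbers of SNPs comparable.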

Results
In the analysis population, the mean age (± standard deviation) was 59.6 ± 7.8 years and the mean serum lipid concentrations (± standard deviation) were 58.16 ± 17.29 mg/dl for HDL-C, 145.41 ± 36.21 mg/dl for LDL-C and 229.26 ± 39.18 mg/dl for TC (Table 1). Overall, 11.4% had 10 or fewer years of education and 33.0% had 14 or more years of education. The median income was 1448.7 Euro/month. The mean number of effect alleles was 78.3 ± 5.1 for the GES HDL-C (GES HDL-C-EXT : 136.4 ± 6.6), 55.1 ± 4.5 for the GES LDL-C (GES LDL-C-EXT : 93.6 ± 5.9) and 75.7 ± 5.2 for the GES TC (GES TC-EXT : 126.2 ± 6.6). A total of 615 (13.6%) participants had diabetes mellitus. The correlation matrix of lipid phenotypes, GES Lipid , education and income is shown in Supplementary Table S5. SEP inequalities in HDL-C and LDL-C were found in the study population, with worse HDL-C and LDL-C profiles observed in lower income and education groups (Table 2). SEP inequalities in TC were not seen. Participants in the lowest education group (≤ 10 years) had a 4.14 (95%-CI: −5.82, −2.47) mg/dl lower HDL-C, a 4.23 (95%-CI: 0.37, 8.09) mg/dl higher LDL-C and a 3.34 (95%-CI: −0.81, 7.49) mg/dl higher TC level compared to the participants in the highest education group (≥ 14 years), with similar patterns for income (Table 2). The explained variance (R²) of GES HDL-C /GES LDL-C /GES TC on their respective lipid was 2.9/2.9/3.6%.
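The explained variance reported here comes from a non-adjusted linear regression of each lipid on its score. A minimal sketch of that quantity, using hypothetical toy values rather than study data, is:

```python
def r_squared(x, y):
    """R^2 of the simple (unadjusted) linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_res / syy


# Hypothetical toy values for a score and a lipid measurement.
toy_score = [1.0, 2.0, 3.0, 4.0]
toy_lipid = [2.0, 4.0, 5.0, 9.0]
toy_r2 = r_squared(toy_score, toy_lipid)
```

For a single predictor this equals the squared Pearson correlation between score and lipid, so an R² of about 3% corresponds to a correlation of roughly 0.17.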
In the linear regression analysis including interaction terms, effect size estimates of interaction terms showed stronger indication of a GES Lipid by low education interaction for LDL-C (β GES×Education : −3.87; 95%-CI: −7.47, −0.32) compared to TC (β GES×Education : −3.64; 95%-CI: −7.44, 0.16) and HDL-C (β GES×Education : −0.56; 95%-CI: −2.09, 1.02) using the highest education group as reference (Table 3). The negative interaction coefficient showed that in the lower education group genetic effects of GES Lipid were less strong. The effect size estimates for the GES Lipid by income interaction were directionally consistent, except for TC, but substantially smaller in magnitude.
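How the negative interaction coefficient translates into group-specific genetic effects can be written out as follows. The symbols are our own notation, with the highest education group as reference. The numerical line combines the stratified LDL-C estimate for the highest education group (6.57 mg/dl) with the interaction estimate (−3.87) only as an approximate consistency check: the stratified models estimate age and sex coefficients separately in each group, so the sum need not match the stratified estimate for the lowest group (2.66 mg/dl) exactly.

```latex
\begin{align}
E[\text{LDL-C}] &= \beta_0 + \beta_{\text{age}}\,\text{age} + \beta_{\text{sex}}\,\text{sex}
  + \beta_{\text{GES}}\,\text{GES}
  + \sum_{k \in \{\text{low},\,\text{mid}\}} \beta_{k}\,\text{Edu}_k
  + \sum_{k \in \{\text{low},\,\text{mid}\}} \beta_{\text{GES}\times k}\,\text{GES}\cdot\text{Edu}_k \\
\left.\frac{\partial E[\text{LDL-C}]}{\partial\,\text{GES}}\right|_{\text{high}} &= \beta_{\text{GES}} \\
\left.\frac{\partial E[\text{LDL-C}]}{\partial\,\text{GES}}\right|_{\text{low}} &= \beta_{\text{GES}} + \beta_{\text{GES}\times\text{low}} \\
6.57 + (-3.87) &= 2.70 \approx 2.66
\end{align}
```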
In the stratified analysis, the two higher education groups compared to the lowest showed stronger genetic effect size estimates per GES Lipid standard deviation for LDL-C and TC, supporting the results of the interaction analysis (Fig. 1). The results for HDL-C followed the same pattern, but the difference in effect size between the highest and the lowest education group was considerably smaller, and the 95% confidence interval of the effect in the lowest education group completely overlapped with the 95% confidence intervals of both higher education groups. The results of the stratified analysis for income did not follow a clear pattern (Fig. 1). The partial R² (explained proportion of variance) of the GES LDL-C on LDL-C and GES TC on TC was higher in the two higher education groups (LDL-C: high education group R² = 0.033, middle education group R² = 0.035; TC: high education group R² = 0.044, middle education group R² = 0.038) compared to the lowest education group (LDL-C: R² = 0.005; TC: R² = 0.013). For income and HDL-C this trend could not be observed.
Table 1. Characteristics of analysis population. n miss = number of participants with missing values, HDL-C = high-density lipoprotein cholesterol, LDL-C = low-density lipoprotein cholesterol, TC = total cholesterol, GES = genetic effect allele sum score, diabetes mellitus is defined as self-reported diabetes mellitus, or fasting blood glucose level ≥ 126 mg/dl or postprandial blood glucose level ≥ 200 mg/dl or if the use of anti-diabetic medication was documented. * Mean ± standard deviation (SD); # Proportion (%); † Median (first quartile-third quartile); § GES Lipid based on 14 ; + GES Lipid-EXT based on [14][15][16][17] .

The analysis of single reference joint effects for lipids describes the relationship between SEP and GES Lipid on lipids in detail by comparing effects of different combinations of SEP groups and GES Lipid tertiles. Each beta estimate represents the increase in lipids of the specific group compared to the reference group. The reference group was selected as the combination of GES LDL-C and SEP group representing the lowest CVD risk (equally applied for GES TC and GES HDL-C ). For HDL-C, beta estimates showed a downward trend between and within education groups with decreasing years of education and decreasing number of effect alleles. Compared to the reference group with the highest education and highest GES HDL-C , participants with the lowest education and the lowest GES HDL-C showed a 9.85 mg/dl lower HDL-C level. Slightly smaller joint effects were observed for income and GES HDL-C on HDL-C, still following the same pattern (Fig. 2). The joint effects for LDL-C and TC showed an upward trend with decreasing years of education and increasing number of effect alleles. Participants with the highest CVD risk (highest GES Lipid and lowest education) had a 13.36 mg/dl higher LDL-C and a 15.40 mg/dl higher TC level than those with the lowest CVD risk (highest education and lowest GES Lipid ).
Slightly stronger joint effects were observed for income and GES Lipid on LDL-C and TC (Figs. 3, 4). The absolute measures of lipids in each of the different combinations of SEP groups and GES Lipid tertiles show the same pattern as in the single reference joint effect analysis (Supplementary Figures S3-S5). Participants in the highest GES Lipid tertile and the lowest education had on average 8.8/14.0/21.1 mg/dl higher HDL-C/LDL-C/TC level than participants in the […]

Table 2. Sex- and age-adjusted effects per GES Lipid standard deviation and corresponding 95% confidence intervals (95% CI) on high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C) and total cholesterol (TC) in linear regression models including main effects of education groups (≤ 10 years/11-13 years/≥ 14 years), income tertiles and genetic effect allele sum scores (GES Lipid ) based on 14 .

After including interaction terms of smoking, BMI, physical activity, alcohol consumption and diabetes separately into the GES LDL-C ×Education interaction model, we observed a GES LDL-C ×Diabetes interaction effect (β GES×Diabetes : −4.46; 95%-CI: −7.38, −1.53) indicating less strong genetic effects on LDL-C in diabetics compared to non-diabetics (Table 4), which was also observed in the stratified analysis (Supplementary Figure S6). Including the GES LDL-C ×Diabetes interaction effect also partly explained the GES LDL-C ×Education interaction effect of the lowest compared to the highest education group, as the effect estimate was attenuated (β GES×Education : −3.42; 95%-CI: −6.98, 0.18) (Table 4). The GES LDL-C ×Education interaction was not affected by other life style risk factors, as the respective GES LDL-C ×low education interaction effect size estimates did not change in magnitude after including smoking, BMI, physical activity and alcohol consumption in the regression models (Table 4). In addition, results did not indicate GES LDL-C by life style risk factor interactions.
Results of the extended GES Lipid-EXT , which included additional SNPs selected from recent large-scale GWAS, showed overall smaller effect size estimates per GES Lipid-EXT standard deviation for all three lipid traits compared to the GES Lipid (Supplementary Table S6). The explained variance (R²) of the extended GES HDL-C-EXT was slightly higher (3.2%) compared to the GES HDL-C , whereas for the extended GES LDL-C-EXT (2.6%) and GES TC-EXT (3.4%) it was slightly lower compared to the GES LDL-C and GES TC . In the extended GES Lipid-EXT by SEP indicator interaction analysis, using the highest SEP groups as reference, effect size estimates of interaction terms were overall slightly smaller in magnitude for all three lipid levels compared to the GES Lipid (Supplementary Table S7). Effects of the extended GES Lipid-EXT stratified by education groups showed similar patterns for HDL-C, LDL-C, and TC compared to the GES Lipid . However, differences in the genetic effects between education groups were smaller in magnitude, while for income no differences in the genetic effects were observed (Supplementary Figure S2). The downward trend between and within education groups with decreasing years of education and decreasing number of effect alleles in the analysis of single reference joint effects for HDL-C and the upward trend for LDL-C and TC for both SEP indicators also showed the same pattern using the extended GES Lipid-EXT compared to the GES Lipid (Supplementary Tables S8-S9).
Results of the single SNP main effect of all SNPs used in the GES Lipid are presented in Supplementary Tables S2-S4. 61 out of 71 HDL-C-, 52 out of 58 LDL-C- and 65 out of 74 TC-associated SNPs were directionally consistent. Single SNP interaction analysis for education and lipid-associated SNPs showed that some SNPs […] (Supplementary Table S10). Similar differences of single SNP interaction effect size estimates were present for HDL-C- and TC-associated SNPs (Supplementary Tables S11-S12). Two of the LDL-C-associated loci with the strongest indication for interaction with education (i.e., PCSK9, MAFB) were also among the strongest TC-associated SNPs. The two HDL-C-associated loci with the strongest indication for interaction with education were OR4C46 and LPL.

Table 3. Sex- and age-adjusted effects per GES Lipid standard deviation and corresponding 95% confidence intervals (95% CI) on high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C) and total cholesterol (TC) in linear regression models including main effects of a lipid-associated genetic effect allele sum score (GES Lipid based on 14 ), indicators of socioeconomic position (SEP; education groups and income tertiles) and interaction terms of GES Lipid and indicators of SEP. # Education, $ Income.

Lipid ~ age + sex + GES Lipid + Education + GES Lipid *Education
In the sensitivity analysis, main results for all lipid traits did not differ in direction and only slightly in magnitude after adjustment for lipid-lowering medication as well as after the exclusion of participants on lipid-lowering medication (Supplementary Tables S13-S14). The GES LDL-C by low education interaction effect size estimates did not change substantially compared to the main analysis population.

Discussion
The aim of the study was to investigate whether the SEP indicators education and income interact with genetic sum scores of lipid-increasing effect alleles in a population-based cohort study. To the best of our knowledge, this was the first study investigating G×E interaction with SEP as environmental factor on lipids in adults. Results gave some indication for an interaction between the GES Lipid and the SEP indicator education, which was strongest for LDL-C. This was supported by stratified analysis, in which the strongest genetic effects on LDL-C were observed in the high education group, as well as by single reference joint effect analysis. After including information on smoking, BMI, physical activity, alcohol consumption and diabetes mellitus into the analysis, there was an indication that a GES LDL-C by diabetes mellitus interaction partly explained the observed GES LDL-C by low education interaction. Using the extended GES Lipid-EXT in comparison to the GES Lipid , effect size measures were smaller but directionally consistent. Li et al. reported an interaction between the SNP rs2206734 (CDKAL1), a favorable childhood environment and birthweight on a metabolically obese, normal weight phenotype (defined as the presence of hypertension, hypertriglyceridemia, low serum HDL-C or impaired fasting plasma concentrations of glucose) in Chinese children. Their findings suggest that a favorable childhood environment represented by a composite score consisting of parental education, annual household income, high physical activity and fruit consumption can further amplify a protective effect of the CDKAL1 locus in children with a pediatric metabolic syndrome and high birthweight 33 . However, this study investigated a composite environmental score with parental SEP on a composite children's phenotype in a Chinese population and can therefore only indirectly be compared with the present study results.
[…] (2014) have demonstrated in a population of European ancestry that the effect of a genetic risk score consisting of HDL-C-increasing alleles was stronger for lean than for obese (BMI ≥ 35 kg/m²) study participants. These interactions were largely driven by the SNPs rs3764261 (CETP), rs4846914 (GALNT2), rs7241918 (LIPG) and rs6065906 (PLTP) 28 . As high education is strongly associated with low BMI 48 , these results may at least partly reflect the results of the present study. However, in the present study the SNPs representing the loci CETP, GALNT2, LIPG and PLTP did not show indication for SNP by education interaction on HDL-C. Justesen et al. (2015) reported an interaction between a genetic risk score of HDL-C-decreasing effect alleles and physical activity in a Danish population (n = 5961), suggesting that the genetic risk score exerted a smaller effect in physically active compared to inactive individuals. However, this interaction was not statistically significant in a replication cohort of smaller sample size 27 . As higher education is usually associated with a higher level of physical activity 49 , the results of the present study may represent the same interaction signal, i.e. a stronger genetic effect of HDL-C-increasing effect alleles on HDL-C in groups with higher education.
Recent SNP×E interaction analyses have identified several lipid-associated loci interacting with lifestyle factors such as smoking and diet [50][51][52] […] 29 .

Figure 3. Sex- and age-adjusted effects and corresponding 95% confidence intervals (95% CI) on low-density lipoprotein cholesterol (LDL-C) in linear regression models for single reference joint effects of tertiles of a LDL-C-associated genetic effect allele sum score (GES LDL-C based on 14 ) and socioeconomic position indicators, calculated separately for education groups and income tertiles, with the group of having a low GES LDL-C and the highest socioeconomic position as reference.
In the present analysis, smaller effects on lipids were observed when using the extended GES Lipid-EXT compared to the GES Lipid . This may be caused by the overall smaller effect size of newly discovered loci as a result of larger analysis populations in recent GWAS, making it possible to detect risk alleles with very small effects. It may also be due to the recently published GWAS meta-analyses being based on single large cohorts, potentially producing less generalizable study results 16,17 . The overall smaller main effects of the additional SNPs included in the extended GES Lipid-EXT have led to smaller interaction effect size estimates, i.e. a less strong impact of SEP on the expression of the average genetic effect of all SNPs included.
Figure 4. Sex- and age-adjusted effects and corresponding 95% confidence intervals (95% CI) on total cholesterol (TC) in linear regression models for single reference joint effects of tertiles of a TC-associated genetic effect allele sum score (GES TC based on 14 ) and socioeconomic position indicators, calculated separately for education groups and income tertiles, with the group of having a low GES TC and the highest socioeconomic position as reference.

It was assumed in the present analysis that the GES Lipid represent cumulative causal factors for lipids, even though it is most likely that the SNPs used to construct the GES Lipid are proxy markers in high LD with the causal genetic variants 53 . The effect of SEP, especially education, on lipids and CVD risk in general was also assumed to be causal, as supported by numerous studies [54][55][56] including Mendelian randomization studies exploring the association of instrumental variables with CVD and CVD risk factors by using genetic risk scores related to educational attainment 57,58 . However, SEP has no direct causal effect on CVD risk; rather, its effect is mediated by a complex interplay of social inequalities in risk factors, e.g., access to preventive interventions, lifestyle factors, physiological stress, psychosocial risks, as well as in protective factors 54,57,59 . Results of the present study suggest that SEP may also have an effect on CVD risk by affecting the expression of LDL-C-related genetic risks. One possible mechanism that has been hypothesized in this regard is epigenetic modification. In contrast to an individual's […] 60,61 . Interestingly, the lifestyle factors BMI, physical activity, smoking and alcohol consumption did not account for the observed GES LDL-C by education interaction, while diabetes mellitus accounted for it only partly.
Table 4. Sex- and age-adjusted effects per GES LDL-C standard deviation and corresponding 95% confidence intervals (95% CI) on LDL-C in linear regression models including main effects and interaction terms of a LDL-C-associated genetic effect allele score (GES LDL-C based on 14 […]
LDL-C ~ age + sex + Edu + GES LDL-C + S + GES LDL-C *Edu + S*Edu + GES LDL-C *S
LDL-C ~ age + sex + Edu + GES LDL-C + BMI + GES LDL-C *Edu + BMI*Edu + GES LDL-C *BMI
LDL-C ~ age + sex + Edu + GES LDL-C + PA + GES LDL-C *Edu + PA*Edu + GES LDL-C *PA
LDL-C ~ age + sex + Edu + GES LDL-C + A + GES LDL-C *Edu + A*Edu + GES LDL-C *A
LDL-C ~ age + sex + Edu + GES LDL-C + D + GES LDL-C *Edu + D*Edu + GES LDL-C *D

Consequently, it has to be assumed that other risk factors besides those included in the present analysis may have a mediating effect on the observed GES LDL-C by education interaction. One explanation for the stronger genetic effects on LDL-C in higher education groups may be that non-genetic health risks are less prevalent in high education groups, leading to LDL-C profiles that are more strongly affected by genetic than by non-genetic risk factors. This hypothesis is supported by the explained proportion of the variance (R²) of the GES LDL-C on LDL-C, which was higher in the two higher education groups compared to the lowest education group. The effect of SEP indicators on health is outcome-specific and each indicator operates via different pathways linking social factors to health outcomes 43,62 . Even though educational attainment and income are moderately correlated (r = 0.45) in the present study, the different strength of the genetic effect on LDL-C seen across education groups could not be seen across income tertiles. The net effect of education is reflected, among other things, in the ability to turn health-related information into behavior and to understand therapeutic measures 43 .
This could support the hypothesis that the genetic influence on LDL-C might be stronger in highly educated individuals because of their ability to create environments with fewer health risks. Furthermore, education as a marker of the childhood social environment could be more likely to support epigenetic changes, owing to the duration of exposure until adulthood. Material resources do not seem to modify the genetic risk for LDL-C.

Strengths of the present study were its population-based study sample and the use of two different individual SEP indicators in the analysis. Even though education and income are correlated SEP indicators, each of them represents certain aspects of SEP related to different health behaviors and risks. Moreover, two different GES Lipid for each lipid trait were compared, which made it possible to check for differences in the genetic effects and G×E interactions between scores derived from different GWAS study populations. The sample size and the limited statistical power for single-SNP analyses must be mentioned as a limitation of the present study. However, the indication of interaction was based on the cumulative genetic risk of the study participants. Another limitation was the cross-sectional design of the study, which does not allow strong conclusions on the causality of effects. However, educational attainment is usually acquired in adolescence or early adulthood, and lipids were assessed at an older age in the present study. Given this exposure-outcome temporality, reverse causation is very unlikely. Even if the effect of education on lipids were not causal, a modification of the GES LDL-C effect on LDL-C by education would still be of interest, because knowledge of heterogeneous genetic effects in different education groups could be useful for CVD risk prediction and genotype-based targeted interventions 35 . Furthermore, this knowledge supports lifestyle-based CVD interventions in lower education groups, given the lower genetic effect on LDL-C in these groups.
Finally, it cannot be excluded that the indication of a GES LDL-C × Education interaction arose by chance, given the number of independent tests performed. However, the number of tests is justified by the three lipid end points and the two SEP indicators, and we calculated 95% confidence intervals to report the precision of the obtained effect size estimates. Furthermore, the results of the interaction analyses are supported by the results of the stratified and single reference joint effect analyses, which showed a consistent pattern across phenotypes and indicated differences in genetic effect sizes between the education groups.
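The relation between the interaction analysis and the stratified analysis discussed above can be illustrated with a minimal sketch on synthetic data. This is not the authors' code or data: the function names (`ols`, `simulate`, `fit_interaction`, `fit_stratified`), the binary low/high education coding, and all effect sizes are assumptions made up for illustration. The sketch fits an age- and sex-adjusted linear model with a GES × education term and compares the interaction coefficient with the difference between the education-stratified GES slopes.

```python
import random


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def simulate(n=20000, seed=1):
    """Synthetic cohort. All numbers are invented for illustration only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(45, 75)
        sex = rng.randint(0, 1)
        low_edu = rng.randint(0, 1)        # 1 = low education (assumed coding)
        ges = rng.gauss(0, 1)              # standardized genetic score
        slope = 6.5 - 4.0 * low_edu        # weaker GES effect if low education
        ldl = (120 + 0.3 * age + 3 * sex + 2 * low_edu
               + slope * ges + rng.gauss(0, 30))
        rows.append((age, sex, low_edu, ges, ldl))
    return rows


def fit_interaction(rows):
    """LDL-C ~ age + sex + Edu + GES + GES*Edu (high education as reference)."""
    X = [[1.0, age, sex, low, ges, ges * low] for age, sex, low, ges, _ in rows]
    y = [r[4] for r in rows]
    names = ["intercept", "age", "sex", "low_edu", "ges", "ges_x_low_edu"]
    return dict(zip(names, ols(X, y)))


def fit_stratified(rows, low_edu):
    """GES slope from LDL-C ~ age + sex + GES within one education group."""
    sub = [r for r in rows if r[2] == low_edu]
    X = [[1.0, age, sex, ges] for age, sex, _, ges, _ in sub]
    y = [r[4] for r in sub]
    return ols(X, y)[3]


if __name__ == "__main__":
    data = simulate()
    coefs = fit_interaction(data)
    print("GES effect, high education (reference):", round(coefs["ges"], 2))
    print("GES x low education interaction:", round(coefs["ges_x_low_edu"], 2))
    print("Stratified GES slope, high education:", round(fit_stratified(data, 0), 2))
    print("Stratified GES slope, low education:", round(fit_stratified(data, 1), 2))
```

In this setup the interaction coefficient approximates the difference between the two stratified slopes; the two agree only approximately because the stratified models also let the age and sex coefficients differ between groups.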
The results of the present study gave some indication of an interaction between genetic variants associated with LDL-C and education in a population-based cohort study. Stronger genetic effects were observed in groups of higher education, which seemed to be partly mediated by diabetes mellitus but not by other lifestyle risk factors such as BMI, smoking, physical activity and alcohol consumption. This provides supporting evidence that SEP has an impact on the expression of genetic susceptibility related to LDL-C. Further research is needed to replicate our findings in independent study samples, to investigate possible biological mechanisms behind the interaction, and to assess the potential of the observed gene by SEP interactions for improving CVD prediction. Additionally, our study included only individuals of European origin, and the results may therefore not be applicable to populations of other ethnicities.

Data availability
For data security reasons (i.e., the data contain potentially participant-identifying information), the Heinz Nixdorf Recall Study does not allow sharing data as a public use file. However, other authors are allowed to access data upon request, which is the same way the authors of the present paper obtained the data. Data requests can be addressed to: recall@uk-essen.de.