Introduction

Type 2 diabetes (T2D) and cardiovascular disease (CVD) are major worldwide contributors to disease burden and premature mortality1,2. Targeted primary cardiometabolic risk prevention requires pathway-specific biomarkers to detect the early metabolic alterations that predispose to developing these common diseases. Pathway-specific biomarkers can help identify at-risk individuals and discover the molecular processes that expose them to higher cardiometabolic risk. Such biomarkers may also help understand the influence of lifestyle on disease risk, enabling precise disease prevention.

Altered blood lipid composition is a common metabolic determinant of T2D and CVD3. Among lipids, ceramides are crucial second messengers in systemic signaling cascades, triggering cardiometabolic diseases4. In rodents, ceramide metabolites regulate inflammatory signaling, insulin resistance, and cellular stress responses. Genetic modifications of ceramide-metabolizing enzymes either protected animals from or predisposed them to severe metabolic impairments5,6. Epidemiological studies have shown associations of ceramides and dihydroceramides with CVD and T2D risk7,8,9,10,11, suggesting that ceramide-dependent pathogenic mechanisms are also active in human populations.

Concurrently, plasma ceramide concentrations are susceptible to lifestyle modification, including diet. Double-blinded randomized controlled trials (RCTs) have demonstrated that modification of the diet’s fatty acid (FA) composition (higher palmitate- vs. linoleic acid-content) alone increased liver fat content and plasma ceramide levels12,13. Besides, a post hoc analysis of the PREDIMED trial suggested that CVD prevention with a Mediterranean diet intervention particularly alleviated the higher risk of major cardiovascular events in participants with elevated ceramide levels before the intervention14.

Accordingly, a beneficial composition of the habitual diet was related to lower cardiometabolic disease incidence15,16,17,18. For example, we and others have shown that red meat and coffee consumption were associated with altered cardiometabolic risk19,20,21,22,23 and altered lipid metabolism24,25,26,27,28. However, the actual metabolic pathways that connect these foods to cardiometabolic risk are still poorly understood. Due to their potential role as disease determinants and the demonstrated sensitivity to dietary exposures, ceramides are plausibly among metabolic mediators of the effect of diet on cardiometabolic risk.

Ceramide metabolism is complex, regulated by over 40 enzymes; these enzymes are subject to multiple regulatory processes and selectively synthesize or degrade groups of ceramides with similar acyl chains29. However, it is unclear how molecular pathways in ceramide metabolism are reflected in circulating ceramide profiles. In such situations, data-driven networks can provide information on the biological dependencies that drive the correlation structure of lipidomics profiles30,31. We have shown that partial correlation networks of metabolomics data reconstruct molecular pathways30,31. By adjusting for metabolomics network neighbors, our new NetCoupler-algorithm controls for confounding by biologically closely related metabolites. The resulting robust associations indicate putative direct effects of molecular markers on disease risk that are not attributable to correlations with other metabolites32.

Advanced high throughput lipidomics screens generate unprecedented insights into ceramide metabolism33. Here we applied the NetCoupler-algorithm to ceramide-profiling data from a large human population study to infer the direct effects of specific ceramides and dihydroceramides on the risk of developing T2D and CVD. We then conducted genome-wide association studies (GWAS) on these disease-associated ceramides to learn about inherited biological determinants and select genetic instruments for subsequent Mendelian randomization studies. We also performed hypothesis-generating mediation analyses, estimating the extent to which diet-related (dh)ceramide levels could explain the adverse effects of red meat consumption and the beneficial effects of coffee consumption on T2D risk.

Results

Data distribution and the network model

We used the following short notation for ceramides throughout the manuscript: CerXX:Y for ceramides and dhCerXX:Y for dihydroceramides with XX carbon atoms and Y double bonds in the acyl chain (Supplementary Table 1). In a pilot study in 35 EPIC-Potsdam participants with two blood samples taken ~6 weeks apart, we assessed the within-person agreement of (dh)ceramide measurements. The intraclass correlation coefficients (ICC) from the pilot indicated fair to excellent reliability for most ceramide measurements and about half of the dihydroceramide measurements. However, a few ceramide measurements and about half of the dihydroceramide measurements showed poor reliability (Supplementary Fig. 1).
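As an illustration of the reliability assessment, the following minimal sketch computes an intraclass correlation coefficient from repeated measurements per participant. The text does not state which ICC variant was used; the one-way random-effects form, ICC(1,1), is an assumption here, and the example values are hypothetical.

```python
from statistics import mean


def icc_oneway(measurements):
    """One-way random-effects ICC(1,1).

    `measurements` is a list of per-participant tuples, each holding the
    repeated measurements of one (dh)ceramide (e.g. two blood samples).
    The ICC variant is an assumption; the source does not specify it.
    """
    n = len(measurements)          # number of participants
    k = len(measurements[0])       # repeated measurements per participant
    grand_mean = mean(v for row in measurements for v in row)
    person_means = [mean(row) for row in measurements]
    # Between-person and within-person mean squares from a one-way ANOVA
    ms_between = k * sum((m - grand_mean) ** 2 for m in person_means) / (n - 1)
    ms_within = sum(
        (v - m) ** 2 for row, m in zip(measurements, person_means) for v in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical concentrations (nM) from two samples per participant
example = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2)]
example_icc = icc_oneway(example)
```

Values close to 1 indicate that most of the variance lies between participants rather than between repeated samples of the same participant.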

The observational analyses were based on the measurement of 12 ceramides and 13 dihydroceramides from a large lipidomics dataset in two case-cohort samples nested within the prospective EPIC-Potsdam study (775 participants with incident T2D among 1886 at-risk participants, and 551 participants with incident CVD among 1671 at-risk participants). In the random subcohort (n = 1137; baseline-prevalent T2D cases excluded), representative of the full EPIC-Potsdam cohort at cardiometabolic risk, the median plasma concentrations ranged between 0.2 nM (Cer18:1) and 42 nM (Cer24:0) for ceramides, and 0.62 nM (dhCer14:0) and 11 nM (dhCer24:1) for dihydroceramides. Median total concentrations (sum of all single compounds within the lipid class) were 91 nM (IQR 76–108 nM) for ceramides and 46 nM (IQR 41–52 nM) for dihydroceramides (Fig. 1A). Log-transformation and z-standardization of the concentrations resulted in similarly scaled, approximately normal distributions (Fig. 1B). Correlation analyses showed moderate to strong correlations between most (dh)ceramides. Partial correlations (conditioning on all other (dh)ceramides) were on average weaker and more specific (Supplementary Fig. 2). Participants with higher total plasma ceramide concentrations and those with higher total plasma dihydroceramide concentrations tended to be older, to have a higher waist circumference, to have unhealthy lifestyle habits, and to be on medication (Supplementary Tables 2 and 3). Likewise, participants with incident cardiometabolic diseases had, as expected, higher levels of these known risk factors compared to participants who remained cardiometabolic disease-free during follow-up (Supplementary Tables 4 and 5). We adjusted for all these potential confounders in the prospective analyses.
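The transformation described above can be sketched as follows. The choice of the natural logarithm and the sample standard deviation are assumptions, since the text does not specify them, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev


def log_z_standardize(concentrations):
    """Log-transform concentrations, then scale to mean 0 and SD 1.

    Assumes strictly positive concentrations, the natural logarithm,
    and the sample standard deviation (n - 1 denominator).
    """
    logged = [math.log(c) for c in concentrations]
    mu = mean(logged)
    sd = stdev(logged)
    return [(value - mu) / sd for value in logged]


# Hypothetical right-skewed concentrations (nM) become evenly spaced z-scores
z_scores = log_z_standardize([1.0, 10.0, 100.0])
```

After this step, a one-unit difference corresponds to one standard deviation of the log-concentration for every (dh)ceramide, which makes effect estimates comparable across compounds with very different absolute concentrations.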

Fig. 1: Distribution of ceramide and dihydroceramide measurements.

A Distribution of the absolute (dh)ceramide plasma concentrations; note that the x-axis is log scaled. B Comparison of Z-scores derived from the non-transformed and log-transformed (dh)ceramide plasma concentrations. Cer ceramide, dhCer dihydroceramide.

Dihydroceramide- and ceramide-associated cardiometabolic risk from standard Cox models

First, we estimated the T2D and CVD risk associated with each single (dh)ceramide without considering the possible influence of other (dh)ceramides. In minimally adjusted models (age and sex only), 9 out of 12 ceramides and 11 out of 13 dihydroceramides were statistically significantly associated with higher T2D risk (FDR < 0.05) (Supplementary Table 6). Further adjustment for lifestyle, anthropometry, medications, blood pressure, and general lipid markers, including total ceramide and dihydroceramide concentration, rendered most of these associations non-significant. However, two ceramides (Cer18:0, Cer22:0) and two dihydroceramides (dhCer20:0, dhCer22:0) remained significantly associated with higher T2D risk and Cer24:0 with lower T2D risk after multiple testing correction (FDR < 0.05). We also observed significant associations of all 12 ceramides and 12 out of 13 dihydroceramides with higher CVD risk in minimally adjusted models. However, in the extensively confounder-adjusted models, only dhCer22:2 was significantly associated with higher CVD risk (FDR < 0.05). Most of the significant CVD associations were rendered non-significant by adjustment for total ceramide and dihydroceramide plasma concentrations (Supplementary Table 6).
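The multiple testing correction referred to above can be illustrated with a false discovery rate adjustment. The text reports FDR thresholds but does not name the procedure; the Benjamini-Hochberg method used in this sketch is an assumption, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the source only reports "FDR < 0.05"; the specific
    procedure is not stated, so this is one common choice.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value to enforce monotonicity
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p-values for four (dh)ceramide-disease associations
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [q < 0.05 for q in fdr]
```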

Direct links between the (dh)ceramide network and cardiometabolic risk

Ceramide metabolites, depending on their acyl chain, are produced by different enzymes and exhibit distinct signaling functions34. Therefore, we were interested in the direct effects of specific (dh)ceramides on cardiometabolic risk, controlling for potentially confounding associations with other, disease-related (dh)ceramides. Our NetCoupler-algorithm exploits that adjustment for all network variables is not necessary to block potential confounding and indirect influences in a conditional independence network. Adjustment for a subset of direct network neighbors [i.e., the (dh)ceramides that are directly connected with an edge] is sufficient32,35,36. We first learned a graphical representation of the conditional independence structure, the (dh)ceramide network, from lipidomics data in the random EPIC-Potsdam subcohort (Fig. 2). In this data-driven network, most edges reflected known product-substrate-relations in lipid metabolism, such as fatty acid (FA) elongation steps, FA desaturation steps, or desaturation of dihydroceramides to ceramides. Consistent with our previous reports30,31, the network-encoded conditional independence structure corresponds well with known biological relations.
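To illustrate why conditioning separates direct from indirect dependencies, the sketch below computes full-order partial correlations from the inverse covariance matrix on simulated data. It does not reproduce the network-learning step of the NetCoupler-algorithm; the simulated chain and all variable names are hypothetical.

```python
import numpy as np


def partial_correlations(data):
    """Partial correlation of each variable pair given all other variables.

    `data` has participants in rows and variables in columns. The result is
    obtained by rescaling the precision (inverse covariance) matrix.
    """
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    scale = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Simulated chain x -> y -> z: x and z are correlated only through y
rng = np.random.default_rng(0)
x = rng.normal(size=5000)
y = x + rng.normal(size=5000)
z = y + rng.normal(size=5000)
data = np.column_stack([x, y, z])

marginal = np.corrcoef(data, rowvar=False)
partial = partial_correlations(data)
```

In this toy example, the marginal correlation between x and z is substantial, whereas their partial correlation given y is close to zero, so a conditional independence network would draw edges x-y and y-z but not x-z.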

Fig. 2: Data-driven conditional independence network of (dh)ceramides.

Bars within nodes show network-adjusted cardiometabolic disease risk. Left: T2D risk; Right: CVD risk; Orange: increased risk; Blue: decreased risk; Numbers: percent risk change with 1 standard deviation higher (dh)ceramide concentration. Frame colors—Green: only T2D-associated; Purple: only CVD-associated; Brown: T2D- and CVD-associated. CER ceramide, dhCER dihydroceramide.

We used the network to estimate the direct effects of specific (dh)ceramides on cardiometabolic risk, applying Cox proportional hazards regression. To this end, we constructed sets of Cox models for each (dh)ceramide with time-to-disease incidence as the endpoint. All models were extensively adjusted for potential confounders, and the models within each set adjusted for all possible combinations of direct network neighbors of the exposure-(dh)ceramide. We classified (dh)ceramides as having direct effects if they were consistently, statistically significantly (P < 0.05) associated with disease risk across all the network-based adjustment sets.
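The selection rule described above can be sketched as follows. The Cox model fitting itself is omitted; the sketch only enumerates the neighbor-based adjustment sets and applies the classification criterion to estimates supplied by the user. Including the empty adjustment set and requiring a consistent direction of the log hazard ratios are assumptions of this illustration, and the function names and numbers are hypothetical.

```python
from itertools import combinations


def neighbor_adjustment_sets(neighbors):
    """All subsets of the direct network neighbors of one exposure.

    Assumption: the empty set (no neighbor adjustment) is included.
    """
    return [
        set(subset)
        for size in range(len(neighbors) + 1)
        for subset in combinations(neighbors, size)
    ]


def has_direct_effect(estimates, alpha=0.05):
    """Classify an exposure from (log_hazard_ratio, p_value) pairs.

    One pair is expected per adjustment set. The exposure is classified as
    having a direct effect if every model is significant at `alpha` and all
    estimates share the same sign (the sign check is an assumption).
    """
    all_significant = all(p < alpha for _, p in estimates)
    same_direction = (
        all(beta > 0 for beta, _ in estimates)
        or all(beta < 0 for beta, _ in estimates)
    )
    return all_significant and same_direction


# Hypothetical exposure with two direct neighbors -> four Cox models
adjustment_sets = neighbor_adjustment_sets(["Cer16:0", "Cer20:0"])
hypothetical_estimates = [(0.25, 0.001), (0.22, 0.004), (0.19, 0.01), (0.18, 0.03)]
direct = has_direct_effect(hypothetical_estimates)
```

Because the number of models grows as 2 to the power of the number of neighbors, restricting adjustment to direct neighbors rather than all network variables keeps the model sets tractable.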

According to these criteria, three ceramides (Cer18:0, Cer20:0, Cer22:0) and three dihydroceramides (dhCer20:0, dhCer22:2, dhCer26:1) were associated with T2D risk. When simultaneously included in a joint Cox model, including adjustments for the predefined confounder set and total ceramide and dihydroceramide concentration, Cer18:0, Cer22:0, dhCer20:0, and dhCer22:2 were statistically significantly (P < 0.05) associated with higher and Cer20:0 and dhCer26:1 with lower T2D risk (Table 1 and Supplementary Table 7). The three saturated FA (SFA)-containing ceramides were closely related in the network (Fig. 2).

Table 1 Direct links between circulating (dh)ceramides and cardiometabolic risk.

The NetCoupler-algorithm also detected associations of Cer16:0 and dhCer22:2 with CVD risk (Supplementary Table 8). In the confounder-adjusted joint model, both (dh)ceramides were statistically significantly (P < 0.05) associated with higher CVD risk (Table 1). In the network, Cer16:0 was linked to the SFA-containing T2D-associated ceramides, while dhCer22:2 was associated with higher risk of both cardiometabolic endpoints (Fig. 2).

In sensitivity analyses, neither additional adjustment of the final model for HDL-cholesterol (Supplementary Table 9) nor exclusion of participants on lipid-lowering medication at baseline (Supplementary Table 10) substantially changed the effect estimates for T2D risk or CVD risk. Similarly, exclusion of participants with disease incidence within the first 2 years of follow-up generated directionally consistent estimates for T2D risk and CVD risk for all selected (dh)ceramides, though the associations of dhCer20:0 with higher and dhCer26:1 with lower T2D risk were substantially attenuated (Supplementary Table 11).

Genome-wide association studies on disease-associated (dh)ceramides

We conducted a GWAS with the seven disease-related (dh)ceramide plasma concentrations as the phenotypes in all participants in the representative EPIC-Potsdam subcohort with genetic and lipidomics data (n = 1094). Then, we looked up SNP-(dh)ceramide associations at a genome-wide suggestive significance level (p-value < 10^-5) in independent study populations. To this end, we used partly unpublished results from a previous GWAS on ceramides Cer18:0, Cer20:0, and Cer22:0 in the EUROSPAN consortium37,38 (Supplementary Data 1), and results from a GWAS on Cer22:0 in the Framingham Heart Study Offspring Cohort published by Cresci et al.39. GWAS in these external cohorts supported the association of SNPs in the SPTLC3 gene region with Cer22:0 plasma concentrations (Table 2). Other suggestive GWAS signals (p-value < 10^-5) in EPIC-Potsdam were either not significant (FDR > 0.05, correcting for the number of SNPs available for replication) or not available in the external replication cohorts and are provided in the supplement (Supplementary Data 28).

Table 2 Genetic variants associated with cardiometabolic disease-related (dh)ceramides.

Enrichment of ceramide-associated SNPs in cardiometabolic disease-related pathways

Based on all p-values from our GWAS in 1094 EPIC-Potsdam participants, we conducted gene set enrichment analyses with the GSA-SNP2 software40. As the reference, we considered a curated list of T2D-related pathways for the T2D-related (dh)ceramides40, and we generated a curated list of CVD-related pathways for the CVD-related (dh)ceramides. We selected enriched gene sets at a Q-value of 0.25, a standard cutoff in gene set enrichment analyses. For T2D, we observed enriched genetic associations with T2D-associated, long-chain and very long-chain SFA-containing (dh)ceramides (Cer18:0, Cer22:0, dhCer20:0, and dhCer26:1) in gene sets related to glucose homeostasis, insulin signaling, and inflammation. For the very-long-chain FA-containing dhCer22:2, associated with T2D and CVD risk, enrichment analyses suggested overrepresentation of genetic associations in gene sets that reflect mitochondrial dysfunction as well as signaling cascades involved in hemostasis (Supplementary Fig. 3). No enriched signals in CVD-related gene sets were detected for the CVD-associated Cer16:0. External data for replication of the gene set enrichment analyses were not available.

Mendelian randomization to evaluate the causal role of ceramides

The association of several SNPs in the SPTLC3 gene region with the plasma concentrations of the T2D-associated Cer22:0 was the single suggestive GWAS signal in EPIC-Potsdam consistent with the limited available data for external replication. The detected SNPs in the SPTLC3 gene region in EPIC-Potsdam were largely synonymous (r^2 = 0.96–1, D' = 1.0). The association of variation in rs680379 with Cer22:0 plasma concentrations had the lowest p-value among SNPs that were available for external replication in EUROSPAN37,38 and the Framingham Heart Study Offspring Cohort39, and the SNP was also available in a large GWAS on T2D (DIAGRAM)41. Therefore, we used rs680379 as the genetic instrument for a univariable, two-sample Mendelian randomization (MR) study. The results suggested higher T2D risk in participants with higher genetically predicted Cer22:0 plasma concentrations. Using the same genetic instrument, we replicated the MR with the SNP-phenotype associations from the two published GWAS on plasma ceramides that we used for lookup37,39 and found that the MR estimates were also significant (Table 3). We did not conduct MR studies for other (dh)ceramide-endpoint associations because data for external replication were lacking.
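For a single genetic instrument, a two-sample MR estimate is commonly obtained as a Wald ratio. The text does not name the estimator, so its use here is an assumption; the standard error uses a first-order approximation that ignores uncertainty in the SNP-exposure estimate, and all input values are hypothetical rather than taken from the study.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Wald ratio MR estimate for one instrument.

    beta_exposure: SNP effect on the exposure (e.g. SD of Cer22:0 per allele)
    beta_outcome:  SNP effect on the outcome (log odds of T2D per allele)
    se_outcome:    standard error of beta_outcome
    Returns the estimate and its first-order standard error.
    """
    estimate = beta_outcome / beta_exposure
    standard_error = abs(se_outcome / beta_exposure)
    return estimate, standard_error


def odds_ratio_with_ci(estimate, standard_error, z=1.96):
    """Convert a log-odds estimate into an odds ratio with a 95% CI."""
    return (
        math.exp(estimate),
        math.exp(estimate - z * standard_error),
        math.exp(estimate + z * standard_error),
    )


# Hypothetical summary statistics, for illustration only
mr_estimate, mr_se = wald_ratio(beta_exposure=0.5, beta_outcome=0.1, se_outcome=0.02)
mr_or, mr_lower, mr_upper = odds_ratio_with_ci(mr_estimate, mr_se)
```

The two inputs come from different samples in a two-sample design: the SNP-exposure association from the ceramide GWAS and the SNP-outcome association from the T2D GWAS.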

Table 3 Univariable, two-sample Mendelian randomization studies using genetic proxies to estimate effects of Cer22:0 on the risk of T2D.

Ceramides as mediators of putative diet-effects on type 2 diabetes

Habitual red meat and coffee consumption have been consistently reported to be associated with T2D risk15, but the potential underlying molecular mechanisms are unclear. A possible explanation for the relationship with T2D risk is an effect of these foods on lipid metabolism, possibly involving ceramides. To test whether associations in EPIC-Potsdam were consistent with this hypothesis, we first assessed if red meat and coffee consumption were associated with T2D-related (dh)ceramides in a directionally consistent and statistically significant manner. In mutually adjusted models and accounting for an extensive set of potential lifestyle confounders, red meat intake was associated with higher concentrations of dhCer20:0 and Cer18:0 and lower concentrations of Cer20:0 and dhCer26:1 (Fig. 3A). The red meat-related T2D risk in EPIC-Potsdam (HR per 2 SD higher intake 1.31, 95%CI 1.01–1.71) was largely attenuated by adjustment for the red meat-associated ceramides (proportion explainable 62%, 95%CI 9% to 100%) (Fig. 3B). Coffee consumption was associated with lower concentrations of the high-risk dihydroceramide dhCer22:2 (Fig. 3C). Adjusting the inverse coffee-T2D association (HR per 2 cups 0.87, 95%CI 0.78–0.98) for dhCer22:2 attenuated the inverse association of coffee with T2D risk by 43% (95%CI 10% to 99%) (Fig. 3D). Thus, our mediation analysis results are consistent with the hypothesis that divergent effects on ceramide metabolism partly mediate the opposite putative effects of red meat and coffee consumption on T2D risk.

Fig. 3: Mediation analysis.

A Adjusted effect estimates (beta coefficients) of red meat on T2D-related (dh)ceramides (direction of associations consistent with mediation hypothesis; p-values < 0.05, one-sided t-test). B Attenuation of the putative effect of red meat on T2D risk after adjustment for red meat- and T2D-related (dh)ceramides. C Adjusted effect estimate (beta coefficient) of coffee on T2D-related dhCer22:2 (direction of the association consistent with the mediation hypothesis; p-value < 0.05, one-sided t-test). D Attenuation of the putative effect of coffee on T2D risk after adjustment for coffee- and T2D-related dhCer22:2. All models were extensively adjusted for potential confounders (age, sex, fasting status, total energy intake, leisure-time physical activity, medication, smoking, alcohol consumption, and education). Blue indicates inverse association (i.e., lower ceramide concentration or T2D risk), orange: positive association (i.e., higher ceramide concentration or T2D risk). Total effect is the confounder-adjusted hazard ratio (95% CI) per exposure unit: red meat, 2 SD (~1 portion per day); coffee, two cups (300 mL) per day. PE Proportion explainable, i.e., relative attenuation of the total effect through mediator-adjustment. Cer ceramide, dhCer dihydroceramide.
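The proportion explainable defined in the caption can be illustrated with a short calculation. The sketch assumes that the relative attenuation is computed on the log hazard ratio scale, which the text does not state explicitly; the hazard ratios used below are hypothetical and are not results from the study.

```python
import math


def proportion_explainable(hr_total, hr_adjusted):
    """Relative attenuation of the total effect after mediator adjustment.

    hr_total:    confounder-adjusted hazard ratio of the exposure
    hr_adjusted: hazard ratio after additional adjustment for the mediators
    Assumption: attenuation is measured on the log hazard ratio scale.
    """
    log_total = math.log(hr_total)
    log_adjusted = math.log(hr_adjusted)
    return (log_total - log_adjusted) / log_total


# Hypothetical example: an HR of 1.30 attenuated to 1.10 after adjustment
pe_example = proportion_explainable(hr_total=1.30, hr_adjusted=1.10)
```

The same formula applies to inverse associations such as a hazard ratio below 1, because both logarithms are then negative and their ratio remains positive.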

Discussion

In this prospective study in a baseline-healthy, free-living population, a metabolic network based on deep ceramide and dihydroceramide-profiling data revealed several associations of specific (dh)ceramides with cardiometabolic disease risk robust against adjustment for other (dh)ceramides. When simultaneously included in a confounder- and total ceramide and dihydroceramide-adjusted Cox model, high plasma concentrations of Cer18:0, Cer22:0, dhCer20:0, and dhCer22:2 were associated with a higher T2D risk, while Cer20:0 and dhCer26:1 were associated with lower T2D risk. The high T2D risk associated with Cer18:0 and Cer22:0 suggests that these compounds may be directly involved in molecular mechanisms that implicate ceramide metabolism in T2D etiology. Mendelian randomization estimates were consistent with an effect of Cer22:0 on T2D risk, and gene set enrichment analyses suggestively linked Cer18:0 to insulin signaling and both ceramides to cytokine-induced inflammation. Mediation analyses suggested differential influences of high red meat and high coffee consumption on ceramide metabolism, potentially explaining the putative opposite effects of the two foods on T2D risk. Moreover, when simultaneously included into the same confounder- and total ceramide and dihydroceramide-adjusted model, Cer16:0 and dhCer22:2 were both associated with higher CVD risk. Enrichment analyses suggested enrichment of dhCer22:2-associated SNPs in gene sets related to the regulation of hemostasis and platelet aggregation.

Prospective human studies showed an association of ceramides with T2D risk and diabetes-related traits. In the Strong Heart Study, Cer16:0, Cer18:0, Cer20:0, and Cer22:0 were associated with insulin resistance8. The FINRISK-cohort reported that the Cer18:0-to-Cer16:0-ratio was associated with higher T2D risk42, suggesting the relation of Cer18:0 to shorter chain precursors as a predictor for T2D incidence. Another study associated (dh)ceramides with T2D incidence in mice and humans, particularly those with 18 and 22 carbon atoms in the acyl chain43. Despite heterogeneity due to different included (dh)ceramides and diverse modeling approaches, these observations are generally consistent with our results. Based on our network adjustments, we linked saturated LCFA-containing (dh)ceramides to T2D risk in a chain length-dependent manner and additionally detected risk markers among VLCFA-containing (dh)ceramides. In a mutually adjusted model, high levels of Cer18:0, Cer22:0, dhCer20:0, and dhCer22:2 were associated with higher T2D risk, while Cer20:0 and dhCer26:1 were associated with lower risk.

In cells, ceramide signaling orchestrates the metabolic response to elevated levels of non-esterified FAs4. To this end, ceramides induce triglyceride synthesis (for example, by translocation of CD36 to the plasma membrane and by induction of SREBP genes44,45,46), downregulate nutrient supply (among others by insulin desensitization and downregulation of lipolysis47,48,49,50,51,52), and stimulate FA-utilization (e.g., by decreasing mitochondrial efficiency, which diminishes feedback inhibition of beta-oxidation53,54). Under a prolonged metabolic challenge, ceramides also link cellular stress to immune responses, apoptosis55,56, and fibrosis57,58. Thereby, intracellular concentrations of LCFA-containing ceramides serve as nutrient sensors. Accordingly, genetic knockout of Cer18:0-producing ceramide-synthase (CerS)-1 protected mice from the detrimental effects of a high-fat diet on systemic glucose homeostasis6. Our study consistently linked Cer18:0 to strongly elevated T2D risk, while its direct network-neighbor Cer20:0 was moderately inversely related to T2D risk when simultaneously included in the same Cox model. Our genetic analyses consistently suggested functions of LCFA-containing ceramides in metabolic regulation, particularly linking them to insulin-signaling pathways. Our results specifically suggest that the interference of Cer18:0 with insulin sensitivity, which was demonstrated in animal models, is linked to T2D development in a free-living human population.

Studies in rodents and humans demonstrated that ceramide signaling partly mediates the adverse effects of an unfavorable dietary FA composition on metabolic health12,13,45. We related high habitual red meat consumption to an adverse saturated LCFA-signature in ceramides, specifically higher levels of Cer18:0. Moreover, we observed a marked effect attenuation by controlling the red meat-related T2D risk for LCFA-containing ceramides, suggesting that the higher T2D incidence among people with high red meat consumption is partly explainable by red meat-induced alteration of the saturated LCFA-composition of ceramides.

The quantitatively most abundant ceramide synthase in the human liver is CerS2, which synthesizes Cer22:0. Genetic ablation of CerS2 in mice suppresses the hepatic adaptation to nutritional challenges. CerS2-knockout mice were protected against liver fat accumulation and elevated blood sugar in overfeeding regimens but developed severe hepatic pathologies59,60. However, hepatocytes of CerS2-knockout mice were protected against lipid-induced TNF-α/NF-κB-dependent inflammation and apoptosis61. In our study, Cer22:0 was among the quantitatively most abundant ceramides in plasma, and it was the strongest T2D risk marker. Our GWAS results further linked Cer22:0 to NF-κB activation, and Mendelian randomization suggested that it might play a biological role in T2D development. Our results suggest that the human plasma concentration of Cer22:0 may serve as a biomarker for metabolically induced cellular stress and inflammatory signaling that predispose to T2D.

We also observed associations of dhCer22:2 with higher and dhCer26:1 with lower T2D risk. Gene set enrichment analysis suggested enrichment of dhCer22:2-associated SNPs in mitochondrial function-related pathways and dhCer26:1-associated SNPs in insulin signaling- and inflammation-related pathways. Other studies also linked dihydroceramides with 22 carbon atoms acyl chains to insulin sensitivity and hepatic inflammation but did not assess dhCer22:2 concentrations43,62. Our network-adjusted analyses suggested dhCer22:2 and dhCer26:1 as new independent T2D risk markers and warrant external validation.

Among dietary factors, coffee was associated with lower cardiometabolic risk19,20,21, and the effect of coffee on hepatic lipid metabolism is a potential explanation. Animal studies demonstrated that coffee and its components affect critical regulators of lipid metabolism, including SREBP1, CD36, and PPARα and PPARγ25,26,27,28, affecting lipid uptake, excretion, and FA-metabolism in the liver. As discussed above, ceramides connect nutrient sensing to the regulation of cellular stress responses; and our gene set enrichment analysis suggested that dhCer22:2 may reflect metabolic stress signals and mitochondrial dysfunction. We observed lower concentrations of dhCer22:2 associated with coffee consumption and adjusting for this biomarker substantially attenuated the inverse association of coffee consumption and T2D risk. These observations are consistent with the hypothesis that modification of ceramide metabolism could partially explain the beneficial effects of coffee on cardiometabolic health.

Several studies showed that plasma ceramide concentrations predict CVD risk7,9,10,14,63,64. Besides distinct source populations, the different coverage of (dh)ceramides and different modeling strategies complicate the comparison of these studies. We detected Cer16:0 and dhCer22:2 as independent CVD risk markers, using comprehensive lipidomics profiles and a modeling strategy targeting risk associations robust against adjustment for the total ceramide and dihydroceramide concentrations and other (dh)ceramides.

Previous reports of Cer16:0 and Cer18:0 associations with higher CVD risk14,63,64 are consistent with our confounder-adjusted single-(dh)ceramide models that did not adjust for network neighbors and total dihydroceramide and ceramide concentrations. However, in our study, only the association of Cer16:0 with CVD risk was robust against adjustment for total ceramide and dihydroceramide concentrations and network neighbors.

We found a robust association of dhCer22:2 with CVD risk in EPIC-Potsdam and did not identify previous reports of the CVD risk association. The gene set enrichment suggested possible involvement in immune response, platelet aggregation, and cell–cell interaction involved in hemostasis. Experimental studies demonstrated that VLCFA-containing ceramides link inflammatory signals to vascular pathologies65,66. Genetic and pharmacological inhibition of type 2-neutral sphingomyelinase in mice reduced the circulating VLCFA-containing ceramide concentrations and prevented lipid-induced atherosclerosis67. Consistently, the platelet-activating factor activates ceramide production in erythrocytes, leading to their adhesion68. Moreover, VLCFA-containing ceramides are functionally involved in necroptosis69, providing a biological link to vascular health and cardiac cell death. Our genetics and observational findings suggest dhCer22:2 may be a biomarker at the interface of lipid metabolism, inflammatory signaling, and cardiovascular health.

Substantial evidence from animal models supports a causal role of specific (dh)ceramides in cardiometabolic disease development5,6,44,70,71. Human intervention studies demonstrated an impact of diet composition on ceramide metabolism12,13. Against this background, our results suggest that (dh)ceramide profiling in intervention studies may help to understand the molecular underpinnings of the effect of dietary composition on cardiometabolic health. Plasma ceramide profiling may provide pathway-specific cardiometabolic risk markers, with LCFA-ceramides potentially reflecting metabolic impairment5,6 and VLCFA-containing ceramides potentially reflecting immune responses and cell–cell interactions65,66,67,68,69. However, our study also suggests that the specificity of (dh)ceramides as molecular pathway markers depends on simultaneous assessment and modeling of a comprehensive (dh)ceramide profile.

Our study had limitations. Although delivering a very comprehensive lipidomics screen, the manufacturer (Metabolon®) did not disclose indicators of the technical variance for single lipids. We partly compensated for this lack of transparency by assessing the intra-individual variance of the single lipid measurements over several weeks in a pilot study, assessing the temporal stability of the (dh)ceramide measurements. Some (dh)ceramides showed substantial within-person variance over several weeks. Under the assumption that the introduced variance is unrelated to the disease risk, poor reliability is expected to bias single measurement-based risk estimates towards the null. Accordingly, most disease-associated (dh)ceramides had fair to excellent ICCs in our reliability study.

In addition, observed associations might be attributable to unmeasured confounding. In combination with experimental data, our selection of specific chain-length (dh)ceramides with direct effects on disease risk can be useful to elucidate molecular mechanisms. However, for other applications, including risk prediction, the total effect of a biomarker is more critical, and it might not be advantageous to adjust for other correlated (dh)ceramides or total levels of ceramides and dihydroceramides.

The p-value-based variable selection and inferences in our observational and genetic analyses depended on the sample size, complicating comparison to studies with different statistical power. Particularly, the GWAS on ceramide risk markers had limited statistical power. External GWAS data to validate SNP-lipid associations in independent cohorts was not available for most ceramides and all dihydroceramides. The pathway enrichment analysis generated plausible biological insights from genetic associations with less stringent significance cutoffs, but datasets for replicating our findings were not available.

The limited statistical power of the GWAS may also partly account for only detecting one reliable instrument for the MR study of Cer22:0 on T2D risk, using a single SNP from the SPTLC3 gene region, which impeded checks for horizontal pleiotropy. Genetic variants in this gene region were also implicated in other lipid and metabolic traits. However, the SNPs were linked to a gene that encodes a subunit of a key enzyme in sphingolipid biosynthesis, and the MR results were replicated with SNP-phenotype associations from independent cohorts. Still, pleiotropic effects of the genetic instrument on other ceramides must be assumed, and the attribution of the effect to Cer22:0 depends on the validity of our network-adjusted observational analysis. Therefore, the MR estimates alone do not provide conclusive evidence on causality but complement the observational estimates due to distinct sources of bias. Our findings encourage larger GWAS on ceramides that may also generate more genetic instruments for MR studies. Mediation analyses in observational data do not prove causality but generate testable hypotheses, which warrant validation in controlled trials.

To conclude, our study indicates that the cardiometabolic risk associated with (dh)ceramide plasma concentrations depends on the contained acyl chain, especially if models are conditioned on other disease-related (dh)ceramides and total ceramide and dihydroceramide concentrations. These observations are consistent with the hypothesis that specific (dh)ceramides are involved in distinct molecular mechanisms of cardiometabolic disease etiology, which coincides with evidence from animal models. Our genetic analyses also suggested the implication of the disease-related (dh)ceramides in cardiometabolic disease-related molecular pathways. Furthermore, we showed that adjustment for a few T2D-related (dh)ceramides markedly attenuated the adverse effect of red meat and the protective effect of coffee consumption on T2D risk, consistent with the hypothesis that their effect on ceramide metabolism partially mediates the effect of these foods on T2D risk. Altogether, these results indicate that circulating (dh)ceramide profiles integrate information on the exposure to genetic and environmental cardiometabolic risk factors and may be applied as pathway-specific biomarkers for cardiometabolic health.

Methods

All EPIC-Potsdam participants gave informed consent for biomedical research use of their data, and the study was approved by the Ethics Committee of the State of Brandenburg, Germany72. The study participants did not receive monetary compensation. All work was performed in accordance with the Declaration of Helsinki.

Study population

EPIC-Potsdam

The prospective EPIC-Potsdam cohort study includes 27,548 participants (16,644 women and 10,904 men) recruited within an age range of 35–65 years from the general population between 1994 and 199872. Participants were then actively contacted by sending out questionnaires and, if necessary, by telephone every 2–3 years, with response rates between 90% and 96% per follow-up round73.

Nested case-cohorts were constructed for efficient studies into molecular phenotypes and disease risk. The case-cohort design relies on a randomly drawn subsample (the subcohort) and oversampling of all incident disease cases in the full cohort during the study period to boost the statistical power. Statistically accounting for the oversampling of cases, this design provides unbiased risk estimates for the full cohort74. The subcohort (n = 1137; baseline-prevalent T2D cases excluded) was drawn from all participants who provided blood at baseline (n = 26,437). Additionally, for each endpoint, all incident cases in the full cohort until a specified censoring date were included (CVD: 551 incident cases, 28 in the subcohort; T2D: 775 cases, 26 in the random subcohort).

For T2D, the censoring date was the 31st of August 2005 (820 incident cases). After excluding participants with missing follow-up information, prevalent diabetes at recruitment, insufficient blood specimens, or non-verifiable information on diabetes incidence, the analytical sample comprised 1886 participants (1000 women and 886 men), including 775 participants with incident T2D, of whom 26 were part of the subcohort. The median follow-up time for T2D was 6.5 years (interquartile range 6.0–8.7 years).

For CVD, the censoring date was the 30th of November 2006, with 583 incident primary cardiovascular events occurring during the study. After equivalent exclusions (using prevalent and non-verifiable CVD instead of diabetes as exclusion criterion), the CVD sample comprised 1671 participants (892 women and 779 men), including 551 participants with incident CVD (283 myocardial infarction only, 257 stroke only, 11 both), of whom 28 were part of the subcohort. The median follow-up time for CVD was 8.4 years (interquartile range 7.6–9.2 years).

Baseline assessment

The baseline examination included anthropometric and blood pressure measurements, a personal interview and a questionnaire on prevalent diseases and sociodemographic and lifestyle characteristics (including physical activity, education, and medication), and a validated semi-quantitative food frequency questionnaire (FFQ). Among other foods, the habitual intake of unprocessed and processed red meats and coffee was assessed75,76. We defined total red meat as the sum of unprocessed red meat and processed meat. The correlations between repeated quantitative assessments of red meat, processed meat, and coffee consumption from FFQs administered 6 months apart were 0.73, 0.77, and 0.70, respectively, indicating good to excellent reproducibility77. Anthropometric measurements and physical examinations were conducted by trained medical personnel. BMI was calculated as body weight in kilograms divided by squared height in meters. Waist circumference was measured midway between the lower rib margin and the superior anterior iliac spine to the nearest 0.5 cm78,79. Blood pressure was measured in a standardized procedure with oscillometric devices (BOSO-Oscillomat, Bosch & Sohn, Jungingen, Germany), and the mean of the second and third readings was used80.

At baseline, blood samples were drawn under standardized conditions regarding room temperature according to the study protocol and stored in liquid nitrogen (−196 °C) or deep freezers (−80 °C). Per participant, 30 ml of blood were collected, of which 20 ml were filled into Monovettes containing citrate. Samples were separated into serum, plasma, buffy coat, and erythrocytes and aliquoted into 0.5 ml straws as previously described in detail81.

Laboratory measurements

For all laboratory measurements, samples were randomly distributed across batches independent of case status, and all laboratory and data-processing steps were performed blind to the case status.

Lipid profiling

The (dh)ceramide-profiling data were generated by Metabolon (Morrisville, US) using the Metabolon® Complex Lipid Panel. The platform generates the molecular species concentration and complete fatty acid composition of each covered lipid class, including 13 dihydroceramides and 12 ceramides. From plasma samples, lipids were extracted in methanol:dichloromethane, concentrated under nitrogen, and reconstituted in ammonium acetate dichloromethane:methanol (50:50). The extracts were directly infused into the ionization source of a Sciex SelexION®-5500 QTRAP mass spectrometer. After ionization, the lipids passed through SelexION differential mobility spectrometry (DMS), in which voltages are applied that selectively allow the passage of only a specific lipid class at any given time. After the DMS filtering, lipids entered the Multiple Reaction Monitoring (MRM), where the lipid mass and its characteristic fragment were measured. The Metabolon® Complex Lipid Panel included >50 isotopically labeled internal standards, which were introduced into the biological sample early in the process and permitted accurate quantitation of lipids across and within classes. According to Metabolon®, the coefficients of variation (CVs) of lipid class concentrations are all below 10%, and the median CV of species at a 1 µM concentration in serum or plasma is ~5%. In a preceding analysis, we estimated intraclass correlation coefficients (ICCs) of repeated blood samples taken several weeks apart. The ICC relates intraindividual to between-person variation, indicating biological stability of the measurements, and we used Rosner’s classification of ICCs (ICC < 0.40 poor reproducibility; ICC from 0.40–0.75 fair to good reproducibility; ICC > 0.75 excellent reproducibility)82.
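The reproducibility categories quoted above can be written down as a small helper. The following Python sketch is illustrative only: the function names are hypothetical, and the variance-ratio form of the ICC (between-person variance over total variance) is an assumed, commonly used definition rather than a formula stated in the text.

```python
def icc_from_variances(between_person_var, within_person_var):
    """ICC as the share of total variance due to between-person differences.

    Assumed one-way variance-components definition; the source does not
    spell out the estimator it used.
    """
    total = between_person_var + within_person_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_person_var / total


def classify_icc(icc):
    """Map an ICC to the categories cited in the text (Rosner)."""
    if icc < 0.40:
        return "poor"
    if icc <= 0.75:
        return "fair to good"
    return "excellent"
```

For instance, a lipid with three times more between-person than within-person variance would have an ICC of 0.75 and fall at the upper edge of the "fair to good" category.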

Genetics

We only considered the random subcohort participants for the genetic analyses, excluding prevalent T2D and CVD cases (n = 1094). The DNA was extracted from buffy coats using the chemagic DNA Buffy Coat Kit special on a Chemagic Magnetic Separation Module I (PerkinElmer Chemagen technologies, Baesweiler, Germany) according to the manufacturer’s instructions. Eligible samples were genotyped with three different genotyping arrays as part of different larger genotyping projects: Human660W-Quad_v1_A (n = 328), HumanCoreExome-12v1-0_B (n = 587) and Illumina InfiniumOmniExpressExome-8v1-3_A DNA Analysis BeadChip (n = 179). Genotyping and quality control of the Human660W-Quad_v1_A and HumanCoreExome-12v1-0_B chips were described elsewhere83. Genotyping using the Illumina InfiniumOmniExpressExome-8v1-3_A DNA Analysis BeadChip was performed in the Life and Brain Center in Bonn, Germany. This array contains about 960,000 genetic variants, allowing genotyping of 77% of all common genetic variants within the human genome. Additionally, a 250 K high-value exome content, discovered through exome sequencing studies, is covered by the chip. The DNA was processed according to the manufacturer’s instructions using an automated, LIMS-controlled workflow, and the arrays were finally scanned using an Illumina iScan bead array reader. Genotype calling and quality control of the samples were carried out jointly in all 1094 samples using Illumina’s GenomeStudio v2011.1 software suite. Protocols suggested by the CHARGE consortium84, Anderson et al.85, and Guo et al.86 were used to derive the final dataset. zCall with a threshold of seven was applied87 to improve the genotype calling for rare variants. Samples with low call rate, discordant sex information (F-value between 0.2 and 0.8), related or duplicated individuals (IBD > 0.185), individuals with divergent ancestry, or unclear sample allocation were excluded from further analysis (n after exclusions = 1094).

Phasing and imputation were conducted using the Michigan Imputation Service88. The Haplotype Reference Consortium (release 1.1) was used as a reference panel89. Before imputation, pre-phasing was applied using Eagle290,91. Imputation was carried out in four separate datasets (one per genotyping chip, with two for the HumanCoreExome-12v1-0_B chip) using minimac388. Pre- and post-imputation tools (HRC-1000G-check-bim.v4.2.9, icv.1.0.5) for checking data quality were applied92. The four imputed files were merged using bcftools93, keeping for each SNP the minimal R2 score across the four files. Afterwards, the SNPs were filtered by R2, keeping those with values >0.6. Data were available for the 22 autosomes but not for the sex chromosomes.
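The merging rule described above (minimal R2 per SNP across the imputed datasets, then a filter at R2 > 0.6) can be illustrated with a short Python sketch. It is not the bcftools workflow that was actually run; the function name, the dictionary-based input, and the choice to keep only SNPs present in every dataset are assumptions made for illustration.

```python
def merge_min_r2(per_dataset_r2, threshold=0.6):
    """Combine per-dataset imputation R2 values and apply the quality filter.

    per_dataset_r2: list of dicts mapping SNP id -> imputation R2.
    Returns a dict of SNP id -> minimal R2 for SNPs passing the filter.
    Assumption: only SNPs imputed in every dataset are retained.
    """
    if not per_dataset_r2:
        return {}
    shared = set(per_dataset_r2[0]).intersection(*per_dataset_r2[1:])
    merged = {snp: min(d[snp] for d in per_dataset_r2) for snp in shared}
    # Keep SNPs whose worst-case imputation quality exceeds the threshold.
    return {snp: r2 for snp, r2 in merged.items() if r2 > threshold}
```

Taking the minimum is a conservative choice: a SNP is retained only if it was imputed well in every dataset that contributes to the merged file.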

Targeted biomarkers

The automatic ADVIA 1650 analyzer (Siemens Healthcare, Erlangen, Germany) was used to assess plasma levels of total cholesterol and triglycerides, and we applied a sex-specific correction for dilution with citrate (correction factor 1.16 for women and 1.17 for men)94.
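As a minimal sketch of the dilution correction, the measured concentration can be scaled by the sex-specific factor. The code assumes that the correction is a simple multiplication (dilution lowers the measured value); the text gives the factors but not the formula, and the names used here are hypothetical.

```python
# Sex-specific correction factors as reported in the text.
CITRATE_CORRECTION = {"female": 1.16, "male": 1.17}


def correct_for_citrate(measured_concentration, sex):
    """Scale a citrate-plasma measurement by the sex-specific factor.

    Assumption: the correction is multiplicative.
    """
    return measured_concentration * CITRATE_CORRECTION[sex]
```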

Case ascertainment

T2D

Systematic information sources for the incidence of T2D were self-report of diagnosis, T2D-relevant medication, or dietary treatment due to T2D diagnosis during follow-up. Additionally, death certificates and information from tumor centers, physicians, or clinics that provided assessments for other diagnoses were screened for an indication of incident T2D. For participants classified as potential cases based on that information, a standard inquiry form was sent to the treating physician. Only physician-verified cases diagnosed with T2D [International Statistical Classification of Diseases and Related Health Problems (ICD)-10 code: E11] and a diagnosis date after the baseline examination were considered confirmed incident cases of T2D.

CVD

Incident CVD was defined as the incidence of non-fatal and fatal myocardial infarction (MI) and stroke (ICD-10 codes: I21 for acute MI, I63.0 to I63.9 for ischemic stroke, I61.0 to I61.9 for intracerebral and I60.0 to I60.9 for subarachnoid hemorrhage, and I64.0 to I64.9 for unspecified stroke). The incidence of CVD was assessed by participant self-report or based on information from death certificates. Self-reported incidence was then validated by contacting the treating physicians, including assessing the ICD-10 code, date of occurrence, and further information on symptoms and diagnostic criteria used in the WHO MONICA study. For myocardial infarction, diagnostic criteria included clinical symptoms, electrocardiograms, heart enzymes, and known coronary heart disease. The stroke diagnosis was based on anamnesis, clinical symptoms, CT/MRI, angiogram, lumbar puncture, echocardiogram, Doppler, and ECG, plus imaging techniques if available. Participants with silent cardiovascular events that had not been documented within 28 days after occurrence were excluded as non-verifiable cases from all analyses.

Statistics

Data preparation

A moderate fraction of covariable information was missing (waist circumference, nmissing = 2; BMI, nmissing = 12; blood lipids, nmissing = 82; blood pressure, nmissing = 148). Single imputation was used to impute these missing values, applying the “predictive mean matching method” from the SAS procedure PROC MI. The “predictive mean matching method” draws information from other covariables to predict missing values and, compared with linear regression, generally generates more plausible imputed variable distributions95. The following variables contributed to the prediction of missing values: incident case (T2D/CVD) during follow-up (yes, no), sex, age, height, smoking, leisure-time physical activity (sports, biking, gardening), drug treatment (antihypertensive, lipid-lowering, aspirin), prevalent disease status (T2D, CVD), total energy intake, intakes of whole-grain bread, grain flakes, grains, and muesli, fresh fruit, raw vegetables, cooked vegetables, nuts, coffee, high-energy soft drinks, fish, red meat and processed meat, total alcohol consumption, and educational attainment.

Smoking was modeled in four categories (never smoker, ex-smoker, current smoker < 20 units/day, current smoker ≥ 20 units/day). Alcohol intake was modeled in six sex-specific intake level categories. The alcohol consumption-categories in men were: abstainers, 0–6 g/d, >6–12 g/d, >12–24 g/d, >24–60 g/d, >60–96 g/d, >96 g/d. The alcohol consumption-categories in women were: abstainers, 0–6 g/d, >6–12 g/d, >12–24 g/d, >24–60 g/d, >60 g/d. Coffee intake was modeled as cups (150 mL) per day and meat intake as grams per day. Educational attainment was modeled in three categories (in or no vocational training/vocational training, technical college degree, university degree). Leisure-time physical activity was modeled as average weekly hours. Fasting status was modeled as a binary variable (≥8 h, yes/no).

The few participants with missing (dh)ceramide values (3 for dhCer14:0, 5 each for dhCer18:1 and dhCer20:1; 13 participants in total) were excluded from all analyses that included these variables. The (dh)ceramide concentrations tended to be right-skewed. Therefore, we log-transformed (dh)ceramide concentrations, which resulted in approximately normal distributions, and z-scaled the log-transformed values. Accordingly, all regression estimates were reported per 1 SD.
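The transformation can be sketched in a few lines of Python. The natural logarithm and the sample standard deviation are assumptions, since the text specifies neither the log base nor the SD estimator; the z-scores do not depend on the log base, only on the SD estimator.

```python
import math
import statistics


def log_z_scale(concentrations):
    """Log-transform positive concentrations, then standardize to mean 0, SD 1."""
    logs = [math.log(c) for c in concentrations]
    mean = statistics.fmean(logs)
    sd = statistics.stdev(logs)  # sample SD (n - 1 denominator), assumed
    return [(x - mean) / sd for x in logs]
```

After this step, a regression coefficient for a (dh)ceramide refers to a difference of one standard deviation on the log scale, which is what "per 1 SD" means here.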

Prentice-weighted Cox models for (dh)ceramide-cardiometabolic risk analyses in the case-cohort

Associations between (dh)ceramides and disease risk were evaluated in Cox proportional hazards regression models with age as the underlying time scale. Study exit was determined by a diagnosis of diabetes or CVD, dropout, or censoring time, whichever came first. The case-cohort design was accounted for by Prentice weighting96.
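The text states only that Prentice weighting was applied. One common way to implement it with standard Cox software is to lay out the data in counting-process form, so that cases outside the subcohort enter the risk set only immediately before their own event. The Python sketch below shows that data layout under this assumption; the function name, the argument names, and the small offset `eps` are hypothetical, and the sketch does not fit a model.

```python
def prentice_rows(pid, age_entry, age_exit, is_case, in_subcohort, eps=1e-4):
    """Return (pid, start_age, stop_age, event) rows for one participant.

    Age is the time scale, as in the text. eps is an arbitrary small offset
    (in years) chosen for this sketch.
    """
    if in_subcohort:
        # Subcohort members are at risk over their whole follow-up.
        return [(pid, age_entry, age_exit, 1 if is_case else 0)]
    if is_case:
        # Cases outside the subcohort contribute only at their event time.
        return [(pid, age_exit - eps, age_exit, 1)]
    # Non-cases outside the subcohort were not sampled.
    return []
```

The resulting rows could then be passed to any Cox routine that accepts delayed entry (start, stop, event); robust variance estimation would be needed in addition and is not shown.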

The NetCoupler-algorithm

We aimed to estimate the direct effects of (dh)ceramides on cardiometabolic disease risk that could not be attributed to the influence of related ceramide metabolites. Therefore, we developed a graphical model-based method, the NetCoupler-algorithm32. In a first step, we estimated a network model of conditional dependencies, where edges represent covariance between two (dh)ceramides that could not be explained by adjustment for any subset of other (dh)ceramides. To this end, we applied an order-independent implementation of the causal structure learning PC-algorithm36,97. The resulting network graphically encoded the family of causal models that could have generated the observed conditional independence structure, i.e., the skeleton of the data-generating DAG. This conditional independence network was then used to detect links between individual metabolites and disease incidence that could not be explained by confounding influences through other (dh)ceramides. By definition, at least one subset of direct neighbors is sufficient to block confounding from the whole network35. However, sufficient adjustment sets could not be unambiguously read from the graph because the edges were not directed. Therefore, the NetCoupler-algorithm iterates for each metabolite through adjustment for all possible combinations of direct network neighbors. A metabolite is then only classified as a direct effector if the association with disease incidence is robust across all these sub-models (Supplementary Fig. 4). The analyses were conducted with a developmental implementation (available upon request). We provide detailed documentation and ready-to-use software implementation of the NetCoupler-algorithm in the R statistical programming language on GitHub (https://github.com/NetCoupler).

To learn the network, an edge between any possible pair of ceramides was detected based on dependency at an alpha level of 0.05, conditioning on any subset of other ceramides. To evaluate the direct link of each ceramide with disease incidence, iterative Cox models were used. Thereby, each ceramide was associated with time-to-disease-incidence, adjusting for all possible combinations of direct neighbors in the ceramide network. All models were additionally adjusted for total ceramide and total dihydroceramide concentrations, age in years (strata variable), sex, height, waist circumference, leisure-time physical activity, fasting status, antihypertensive medication, lipid-lowering medication, aspirin, total energy intake, smoking, alcohol consumption, education, plasma concentrations of triglycerides, total cholesterol, and systolic and diastolic blood pressure; baseline-prevalent T2D cases were excluded from the diabetes risk model and adjusted for in the CVD risk model. Ceramides that were directionally consistent and statistically significantly (p < 0.05) associated with the disease endpoint across all neighbor-adjusted models were classified as direct effectors. Because they can only be confounders but not mediators, each newly identified direct effector ceramide was included in the fixed adjustment set, and the procedure was repeated until no further direct effects were detected. Finally, for each endpoint, all the selected directly disease-associated (dh)ceramides were simultaneously included in the same Cox model, adjusted for the full set of above-defined covariables, yielding the mutually adjusted disease hazard ratios.
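The core classification rule, iterating over every combination of direct neighbors and requiring a consistent, significant association in all sub-models, can be sketched as follows. This Python sketch is not the NetCoupler implementation: `fit_model` is a hypothetical callback standing in for a fully covariable-adjusted Cox model, and the network itself is assumed to have been estimated beforehand.

```python
from itertools import combinations


def neighbor_subsets(neighbors):
    """Yield every subset of the direct neighbors, including the empty set."""
    ordered = sorted(neighbors)
    for size in range(len(ordered) + 1):
        for subset in combinations(ordered, size):
            yield subset


def is_direct_effector(metabolite, neighbors, fit_model, alpha=0.05):
    """Apply the robustness rule to one metabolite.

    fit_model(metabolite, adjustment_set) -> (log_hazard_ratio, p_value)
    is a hypothetical user-supplied function.
    """
    results = [fit_model(metabolite, adj) for adj in neighbor_subsets(neighbors)]
    all_significant = all(p < alpha for _, p in results)
    same_direction = (all(b > 0 for b, _ in results)
                      or all(b < 0 for b, _ in results))
    return all_significant and same_direction
```

A metabolite with k direct neighbors requires 2^k sub-models, so the number of models grows quickly with network density. In the full procedure described above, each newly identified direct effector would be added to the fixed adjustment set and the loop repeated until no further effectors are found.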

Genome-wide association study

The software QCtool v1.4 was used to filter the SNPs by SNP missing rate (removed if ≥ 0.05), minor allele frequency (MAF) (removed if outside the interval [0.05–0.5]), and Hardy–Weinberg equilibrium (removed if −log10(p-value) ≥ 3). Then, we used SNPtest v2.5.2 for exploratory single-variant association analysis, with the genetic variants (n ~ 5,339,213 markers) as exposures and the log-transformed and z-standardized (dh)ceramides as outcomes. We considered p-values below 10⁻⁵ suggestively significant. We assumed a frequentist additive genetic model (method expected: genotype dosage), adjusted for age at recruitment and sex. Variants were mapped to Ensembl annotation version 84 (GRCh37)98, and we used the Ensembl Variant Effect Predictor for annotation99.
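The three SNP filters can be expressed as a single predicate. The Python sketch below mirrors the thresholds given in the text; it is not QCtool's interface, and the function and argument names are hypothetical.

```python
import math


def passes_snp_qc(missing_rate, maf, hwe_p_value):
    """Return True if a SNP survives the three filters described in the text."""
    if missing_rate >= 0.05:
        return False  # too many missing genotype calls
    if not 0.05 <= maf <= 0.5:
        return False  # minor allele frequency outside [0.05, 0.5]
    if hwe_p_value <= 0 or -math.log10(hwe_p_value) >= 3:
        return False  # deviation from Hardy-Weinberg equilibrium
    return True
```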

For the GWAS on Cer18:0, Cer20:0, and Cer22:0, we performed lookup studies with partly unpublished results from EUROSPAN (European special populations research network: quantifying and harnessing genetic variation for gene discovery, n = 4034), a consortium involving five European populations focusing on the genomics of >300 phenotypes including lipidomics, that were measured at the Institute for Clinical Chemistry and Laboratory Medicine, Regensburg University Medical Center (Germany), using electrospray ionization tandem mass spectrometry (ESI-MS/MS) in positive ion mode. Genetic association tests between lipid and allele dosage were performed using a mixed model approach implemented with the ‘mmscore’ option in the GenABEL software. Results from the five populations were combined using inverse variance weighted fixed-effects model meta-analyses using the METAL software. The other (dh)ceramides associated with cardiometabolic risk in EPIC-Potsdam were not available in EUROSPAN. We also compared our suggestively significant GWAS results on Cer22:0 in EPIC-Potsdam with published SNP-Cer22:0 associations from the Framingham Heart Study Offspring Cohort (n = 2217). To this end, we extracted beta estimates and p-values from Table 2 in the publication by Cresci et al.39. The other (dh)ceramides associated with cardiometabolic risk in EPIC-Potsdam were not available in the Framingham Heart Study Offspring Cohort.

We used summary-level data for the association of ceramide-associated SNPs with T2D obtained from the DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) consortium, including 32 studies with a total of 898,130 individuals (74,124 with T2D and 824,006 without) of European ancestry41. In that resource, the Haplotype Reference Consortium reference panel was used for all component studies except deCODE GWAS, which was imputed using a population-specific reference panel (30,440 Icelandic haplotypes)41. We used the T2D data without BMI adjustment. The EPIC-Potsdam GWAS data were included in the EPIC-Interact Consortium, which contributed GWAS data to the DIAGRAM meta-analysis used here. The EUROSPAN and Framingham Heart Study Offspring Cohort data did not contribute to the DIAGRAM data of the utilized publication41.

Pathway enrichment analysis

We used GSA-SNP2 software for gene set enrichment analysis based on GWAS p-values40. This tool employs the Z-statistic of the random set model. We used a 20 kilobase window upstream and downstream of the gene for the SNP-to-gene annotation and removed adjacent genes highly correlated in the European population. We used pathway annotation from the MSigDB C2.CP (curated canonical pathways) version 5.2 database100, which consists of 1329 curated gene sets, each representing a biological process, compiled by domain experts101,102. From this knowledge source, we selected pathways that were linked to T2D (previously published set of gold standard pathways for T2D40) and CVD [defined by us as pathways that were statistically significantly enriched in the CARDIoGRAM GWAS data (42,335 CVD cases and 78,240 controls)]103. Pathways with a q-value < 0.25 were considered significantly enriched.

Mendelian randomization

We conducted a univariable two-sample MR study with Cer22:0 as phenotype and T2D as the outcome41. We only conducted an MR on the putative Cer22:0 effect on T2D risk because it was the only ceramide for which genome-wide suggestively significant SNPs were detected in EPIC-Potsdam and replicated in an independent study. We selected the SNP with the strongest Cer22:0 association in EPIC-Potsdam that was available in the replication datasets as the instrumental variable for a univariable MR and harmonized the data for the direction of the effects between phenotype and endpoint associations. We used the R packages ‘TwoSampleMR’ (v0.5.5) from the MR-Base platform104 and ‘MendelianRandomization’ (v0.5.0) to generate SNP-specific Wald ratios (SNP-endpoint estimate divided by SNP-phenotype estimate) for the phenotype-endpoint associations.
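The Wald ratio defined above, together with the allele harmonization step, can be sketched in Python. This is an illustration, not the code of the R packages named in the text: the function names are hypothetical, and the standard error uses a common first-order approximation that ignores uncertainty in the SNP-phenotype estimate.

```python
def harmonize_outcome_beta(beta_outcome, effect_allele_exposure,
                           effect_allele_outcome, other_allele_outcome):
    """Align the SNP-endpoint estimate to the exposure's effect allele."""
    if effect_allele_outcome == effect_allele_exposure:
        return beta_outcome
    if other_allele_outcome == effect_allele_exposure:
        return -beta_outcome  # alleles are swapped: flip the sign
    raise ValueError("alleles cannot be matched between the two datasets")


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Return (estimate, standard error) of the single-SNP Wald ratio."""
    if beta_exposure == 0:
        raise ValueError("SNP-phenotype estimate must be non-zero")
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)  # first-order approximation (assumed)
    return estimate, se
```

When the endpoint estimate is a log odds ratio, the estimate is the log odds ratio of T2D per 1 SD higher genetically predicted Cer22:0, and exponentiating it gives the odds ratio.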

Mediation analyses

We used the potential influence of red meat consumption and coffee consumption on T2D risk to explore the role of (dh)ceramides as potential mediators of lifestyle effects on cardiometabolic risk. These exposures were chosen because they contributed to T2D-prediction beyond other established risk factors in the EPIC-Potsdam study105,106,107, and the hypothesis that these exposures act through modification of lipid metabolism is biologically plausible. In a first step, we selected potential ceramide-mediators by regressing the food of interest on all T2D-related ceramides, adjusting for potential confounders [age, sex, T2D-related dietary exposures other than the exposure of interest (from the set of red and processed meat, coffee, and whole grain), fasting status, total energy intake, leisure-time physical activity, medication (antihypertensive and lipid-lowering drugs), smoking (four categories: never, former, current < 20 units per day, current ≥ 20 units per day), alcohol consumption, and education]. T2D-related ceramides were selected as potential mediators if they were statistically significantly and directionally consistently (one-sided p-value < 0.05) associated with the exposure.

Then, we estimated the proportion explainable (PE) as percentage attenuation of the association between exposure (food group or anthropometric trait) and outcome (T2D risk) in Cox models with adjustment for the selected ceramides compared to the same model without ceramide adjustment, using the delta method24,108,109. Bias-corrected 95% bootstrap confidence intervals for the PE were constructed with the bcajack-function from the bcaboot package (CRAN.R-project.org/package=bcaboot) with 1000 replications and a two-thirds sampling fraction.
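The point estimate of the PE can be illustrated with a short Python sketch. It assumes that attenuation is measured on the log-hazard scale, which is a common convention but is not stated explicitly in the text; the function name is hypothetical, and the confidence interval construction is not reproduced.

```python
import math


def proportion_explainable(hr_unadjusted, hr_adjusted):
    """Percentage attenuation of the log hazard ratio after mediator adjustment."""
    beta_unadjusted = math.log(hr_unadjusted)
    beta_adjusted = math.log(hr_adjusted)
    if beta_unadjusted == 0:
        raise ValueError("unadjusted association is null; PE is undefined")
    return 100.0 * (beta_unadjusted - beta_adjusted) / beta_unadjusted
```

Because the ratio is taken on the log scale, the same formula applies to adverse (hazard ratio above 1) and protective (hazard ratio below 1) exposures. A PE above 100% would indicate that the association changed direction after adjustment.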

Software

Statistical Analysis System (SAS) Enterprise Guide 7.1 with SAS version 9.4 (SAS Institute Inc., Cary, NC, USA) was used to manage and prepare datasets and transform the lipid values. The pcalg-package in R (version 3.5.2 (20 December 2018)) and a developmental version of the NetCoupler-package (available from M.B.S. and C.W. upon request) were used to generate the metabolite networks and link them to disease incidences. QCtool v1.4 and SNPtest v2.5.2110 were used for the GWAS on lipids. MR studies were conducted using the ‘TwoSampleMR’ (v0.5.5) package from the MR-Base platform104 and the ‘MendelianRandomization’ (v0.5.0)111 package in R (version 3.6.3 (29 February 2020)). Mediation analyses were conducted in R (version 3.5.2 (20 December 2018)).

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.