Type 1 diabetes affects over 500,000 children globally, making it one of the most common metabolic illnesses in children1. Autoimmune destruction of the insulin-producing beta cells in the pancreas results in hyperglycemia and lifelong insulin dependency. Genetic risk factors are well described and likely interact with non-genetic risk factors to influence disease progression, though exact pathogenesis remains unclear2. The appearance of autoantibodies can be detected as early as 3 months of age and defines the beginning of islet autoimmunity (IA), the preclinical stage of the disease3. Efforts to better characterize metabolic dysregulation around the time of seroconversion and prior to the detection of autoantibodies may allow earlier identification of at-risk children and better understanding of the processes involved.

Metabolites reflect the interaction of numerous biological factors, including many that may influence the development of autoimmune diabetes, such as genetics, microbiome, and dietary intake. Metabolomics differences between IA cases and controls have mostly been found at the time of seroconversion, but findings are inconsistent across studies conducted in country-specific populations4,5,6,7. Previous country-specific studies used different laboratories to measure varying types (primary/polar5, lipids4,7,8,9, both6,10) and numbers of metabolomics features (from 106 (ref. 7) to 540 (ref. 6)), and were conducted at varying ages and stages of the disease course (cord blood4,7,8, at seroconversion to IA6, longitudinally5,9,10). Given these methodological differences, it is unclear whether differences in study findings are due to technical artefacts, or whether they represent truly different associations by geography or other meaningful characteristics. The identification of generalizable metabolic profiles related to the development of early stages of the disease across several populations is important and may inform dietary interventions to prevent type 1 diabetes, which have so far proven unsuccessful11.

Traditional investigation of diet in the development of type 1 diabetes has examined effects of individual foods or food groups and nutrients such as cow’s milk11,12,13, fatty acids14,15,16,17,18, or vitamin D19,20,21,22. However, these approaches do not account for the complexity of the diet—the effects of single nutrients and foods are often too small to identify, or too highly correlated to be separated from each other23. Examining combinations of foods and metabolites may better elucidate the role of diet in IA, as it can account for synergistic or antagonistic effects of foods or nutrients contained in the diet, and differences in how they are processed in the body.

We aimed to identify metabolite-related dietary patterns associated with IA in the multinational The Environmental Determinants of Diabetes in the Young (TEDDY) study. We conducted a metabolome- and lipidome-wide association study to better characterize plasma metabolites and lipids distinguishing cases and controls both at the time of the first autoantibody detection, and prior to its development. We created dietary patterns summarizing candidate metabolites identified pre-IA, and tested the longitudinal association of those metabolite-related dietary patterns with the development of IA.


TEDDY study design

TEDDY is an international consortium that enrolled 8,676 newborn infants with a high- or moderate-risk class II HLA genotype between 2004 and 2010 (ref. 24). Participants are closely followed for the development of IA or type 1 diabetes, with study visits every three months from birth to age 48 months, and every three or six months thereafter depending on autoantibody status until the age of 15 years. Participating study centers include: Georgia/Florida, Colorado, and Washington in the U.S., and Finland, Sweden, and Germany in Europe. IA cases are defined by confirmed autoantibody positivity to insulin (IAA), GAD (GADA), or IA-2 (IA-2A) on two consecutive study samples, the first of which defines the case’s event age.

A nested case-control biomarker study was designed using risk set sampling to select three controls for each IA case (n = 418) that had developed in TEDDY as of May 2012. Eligible controls were autoantibody-negative at the case’s event age, and further matched on clinical center, sex, and family history of type 1 diabetes as previously described25. Secondary outcomes included cases positive for IAA only or GADA only at IA event time. IA-2A was excluded as an outcome since very few cases developed IA-2A as their first and only persistent confirmed autoantibody at IA event time. Multiple autoantibody positivity (mAb+) was defined as any subject positive for more than one autoantibody at IA event time, or who developed more than one autoantibody during follow-up.
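The control selection described above can be sketched in a few lines. This is only an illustration of risk set sampling under stated assumptions, not the TEDDY sampling code: the `Subject` fields, the `draw_controls` helper, and the convention that `ia_event_age_days` is `None` for children with no IA event are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Subject:
    # All field names are hypothetical and chosen for illustration only.
    subject_id: str
    center: str
    sex: str
    family_history: bool
    ia_event_age_days: Optional[float]  # None if no IA event was observed
    last_followup_age_days: float


def draw_controls(case: Subject, cohort: List[Subject], n_controls: int = 3,
                  rng: Optional[random.Random] = None) -> List[Subject]:
    """Sample up to n_controls from the case's risk set.

    The risk set holds subjects who share the matching factors, are still
    under follow-up at the case's event age, and are autoantibody-negative
    at that age. A child who seroconverts later remains eligible.
    """
    rng = rng or random.Random()
    t = case.ia_event_age_days
    eligible = [
        s for s in cohort
        if s.subject_id != case.subject_id
        and s.center == case.center
        and s.sex == case.sex
        and s.family_history == case.family_history
        and s.last_followup_age_days >= t
        and (s.ia_event_age_days is None or s.ia_event_age_days > t)
    ]
    return rng.sample(eligible, min(n_controls, len(eligible)))
```

Because eligibility is evaluated at the case's event age, a matched set can contain fewer than three controls when the risk set is small, which is consistent with the sets of one to three controls reported in Table 1.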

The study methods have been carried out in accordance with the approved guidelines by local Institutional Review or Ethics Boards, including: Colorado Multiple Institutional Review Board (#04-0361); Medical College of Georgia Human Assurance Committee (2004–2010)/Georgia Health Sciences University Human Assurance Committee (2011–2012)/Georgia Regents University Institutional Review Board (2013–2017)/Augusta University Institutional Review Board (2017-present) (#HAC 0405380); University of Florida Health Center Institutional Review Board (#IRB201600277); Washington State Institutional Review Board (2004–2012)/Western Institutional Review Board (2013–present) (#20130211); Ethics Committee of the Hospital District of Southwest Finland (#Dnro168/2004); Bayerischen Landesärztekammer (Bavarian Medical Association) Ethics Committee (#04089); and Regional Ethics Board in Lund, Section 2 (2004–2012)/Lund University Committee for Continuing Ethical Review (2013-present) (#217/2004). The study is monitored by an External Evaluation Committee formed by the National Institutes of Health. Written informed consents were obtained from a parent and/or legal guardian for all participating children. Data described in the manuscript and code book will be made available upon request from the NIDDK Central Repository at

Metabolomics data pre-processing

Metabolomics abundance measures (metabolites and lipids) were obtained for all cases and controls for each available study visit from birth until the case event time. Primary metabolites and complex lipids were quantified from citrate plasma using GC-TOF MS and CSH-QTOF MS data acquisition, respectively, at the NIH West Coast Metabolomics Center at the University of California, Davis. GC-TOF MS data were acquired as previously described26, with data processing and compound identification using the BinBase algorithm27. GC-TOF data were sum normalized, followed by LOESS (locally weighted scatterplot smoothing) normalization. For complex lipids, samples were extracted by methyl-tert-butyl ether/methanol/water28, followed by chromatogram peak detection and alignment using Mass Profiler Professional (Agilent, Santa Clara, CA). Peaks detected in a minimum of 30% of samples were identified, and quantification was back-filled using the Fiehn laboratory’s LipidBlast spectral library, as previously described29. LOESS normalization followed by batch ratio normalization (in which QC samples were used to adjust each sample batch median to the global study median) was performed across all samples to estimate and remove analytical variance.

Prior to transformation, data quality checks included evaluation at the metabolite- and sample-level (Supplemental Fig. 1). Metabolites that were not detected in more than 10% of samples (6 metabolites), or with a coefficient of variation greater than or equal to 100% (286 metabolites) were excluded from further analyses. Samples with missing or zero values in greater than 10% of metabolomics features (n = 5) or with values more extreme than 4 standard deviations above or below the mean in greater than 30% of metabolomics features (n = 6) were removed from analyses. A total of 853 metabolites and lipids and 11,556 samples passed the quality checks. All metabolites were transformed using Box-Cox transformation analysis, and scaled30.
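The metabolite-level checks and the transformation step can be illustrated with a short sketch. It rests on several assumptions that the text does not specify: missing values are encoded as `None`, zeros are treated as non-detected, the Box-Cox parameter `lam` is supplied by the caller rather than estimated, and "scaled" is read as centering to mean 0 and unit standard deviation. The function names are hypothetical.

```python
import math
import statistics
from typing import List, Optional, Sequence


def passes_metabolite_qc(values: Sequence[Optional[float]],
                         max_missing: float = 0.10,
                         max_cv: float = 1.0) -> bool:
    """Keep a metabolite if <=10% of samples are non-detected and CV < 100%."""
    missing = sum(v is None or v == 0 for v in values) / len(values)
    if missing > max_missing:
        return False
    detected = [v for v in values if v is not None and v != 0]
    if len(detected) < 2:
        return False  # a CV cannot be computed from fewer than two values
    cv = statistics.stdev(detected) / statistics.mean(detected)
    return cv < max_cv


def box_cox(x: float, lam: float) -> float:
    """Box-Cox transform of one positive value for a given lambda."""
    if x <= 0:
        raise ValueError("Box-Cox requires positive values")
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def transform_and_scale(values: Sequence[float], lam: float) -> List[float]:
    """Box-Cox transform, then center and scale to unit standard deviation."""
    transformed = [box_cox(v, lam) for v in values]
    mean = statistics.mean(transformed)
    sd = statistics.stdev(transformed)
    return [(t - mean) / sd for t in transformed]
```

In practice the Box-Cox parameter would be estimated per metabolite, for example by maximum likelihood; that step is left out here to keep the sketch dependency-free.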

Dietary intake and food groupings

Dietary assessment was carried out by 24-hour recall at the first clinic visit at 3–4.5 months of age, then by 3-day food record every 3 months until 12 months of age, and every 6 months thereafter. TEDDY research staff provided detailed instruction and examples to families regarding completion of food records, as previously described31. From quantities of foods and dishes consumed, the amounts of energy and single foods contained therein were computed using in-house food record processing programs and food composition databases unique to each country32. The foods and dishes (e.g., wheat bread, apple-oat meal) consumed were quantified into main food groups (e.g., cereals, fruits and berries) and subgroups (e.g., wheat, rice, oats, citrus fruits, apple, berries) in grams per day (g/day) of intake. After quantification, the three food record days were averaged to calculate the mean energy and food intake for each study subject at each study visit. Results of detailed harmonization studies of these country-specific food composition databases documented that the energy values and food subgroups used in this study were comparable across the TEDDY countries32.

For any food record where a subject was indicated as breast fed, we estimated the amount of breastmilk consumption using an algorithm developed by the Institute of Medicine33. First, we calculated the estimated energy requirement based on age and weight. The difference in the estimated energy requirement and the mean energy reported on the food record from food and formula was attributed to breastmilk. We calculated the amount (grams) of breastmilk consumed to achieve that energy intake using a conversion factor of energy density per 100 g, as follows: 65.3 kcal/100 g in Finland, 69 kcal/100 g in Germany, 68 kcal/100 g in Sweden, and 70 kcal/100 g in the U.S.
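The arithmetic of this breastmilk estimate can be written out as a small sketch. The energy densities are those listed above; the function name, the country keys, and the choice to floor a negative energy gap at zero are assumptions made for illustration. The estimated energy requirement is taken as an input because its age- and weight-based formula is not reproduced in this paper.

```python
# Energy density of breastmilk in kcal per 100 g, as listed in the text.
KCAL_PER_100G = {"Finland": 65.3, "Germany": 69.0, "Sweden": 68.0, "US": 70.0}


def estimate_breastmilk_grams(eer_kcal: float, reported_kcal: float,
                              country: str) -> float:
    """Attribute the gap between required and reported energy to breastmilk.

    eer_kcal      -- estimated energy requirement for the child's age and weight
    reported_kcal -- mean energy from food and formula on the food record
    """
    gap_kcal = max(eer_kcal - reported_kcal, 0.0)  # assumption: no negative intake
    return gap_kcal / KCAL_PER_100G[country] * 100.0
```

For example, a U.S. infant with an estimated requirement of 700 kcal/day and 350 kcal/day reported from food and formula would be assigned 350 / 70 × 100 = 500 g/day of breastmilk.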

Statistical analyses

Statistical analyses are described by aim below. First, we identified plasma metabolites and lipids associated with IA at three different time points, using a metabolome-wide association approach in TEDDY’s nested case-control study. Second, we created dietary patterns summarizing candidate metabolites identified in infancy. Finally, we tested the longitudinal association of infancy metabolite-related dietary patterns with development of mAb+ in the full TEDDY cohort. Figure 1 summarizes the population and data flow for all aims and analyses.

Figure 1

Data flow diagram summarizing selection and size of analysis population for all aims (numbered). From a nested case-control study in TEDDY, we tested the association of 853 metabolites with outcomes at the time of seroconversion to IA (sets = 352), the last sample prior to IA (sets = 366), and at infancy (sets = 253). We created dietary patterns explaining candidate mAb+ metabolites identified in infancy, when children (n = 529) were 9-months of age and autoantibody negative. All subjects with food records at 9-months in the full TEDDY cohort (n = 6,537) were scored on the dietary pattern, and the association with development of mAb+ tested. IA = islet autoimmunity, mAb+ = multiple autoantibody positive, sets = number of risk sets or matched strata, n = number of subjects.

Metabolites associated with IA

Conditional logistic regression was used to calculate odds ratios (ORs) for the association of each transformed metabolite with the development of IA, adjusting for high-risk HLA genotype (DR3-DQA1*05:01-DQB1*02:01/DR4-DQA1*03:01-DQB1*03:02 versus all other), and age at blood draw. Some TEDDY subjects follow a long-distance protocol, in which blood is drawn and shipped to clinical centers before being processed for biomarker identification. Since plasma primary GC-TOF MS metabolic profiles are less stable with centrifugation delay34, we required an additional match for the long distance protocol between case and control samples.
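To make the estimation concrete, the following is a minimal sketch of a conditional logistic regression for a single metabolite in matched sets with one case each, fitted by Newton-Raphson. It is not the SAS analysis used in this study: it omits the adjustment for HLA genotype and age at blood draw, and the `fit_conditional_logit` name and the `(case value, control values)` data layout are assumptions.

```python
import math
from typing import List, Sequence, Tuple

# One matched set: (metabolite value of the case, values of its controls).
MatchedSet = Tuple[float, Sequence[float]]


def fit_conditional_logit(sets: List[MatchedSet], max_iter: int = 50,
                          tol: float = 1e-10) -> float:
    """Return the log odds ratio per unit of the (transformed) metabolite.

    Each set contributes exp(beta * x_case) / sum_j exp(beta * x_j) to the
    conditional likelihood, where j runs over the case and its controls.
    """
    beta = 0.0
    for _ in range(max_iter):
        score = 0.0  # first derivative of the log-likelihood
        info = 0.0   # negative second derivative
        for case_x, control_xs in sets:
            xs = [case_x, *control_xs]
            shift = max(beta * x for x in xs)  # guards against overflow in exp
            w = [math.exp(beta * x - shift) for x in xs]
            total = sum(w)
            mean_x = sum(wi * x for wi, x in zip(w, xs)) / total
            mean_x2 = sum(wi * x * x for wi, x in zip(w, xs)) / total
            score += case_x - mean_x
            info += mean_x2 - mean_x ** 2
        if info == 0:
            break  # no within-set variation, so beta is not identifiable
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta
```

The odds ratio is `math.exp(beta)`. Matching factors never enter the model explicitly because they are constant within each set and cancel in the conditional likelihood.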

Metabolomics analyses were run in three cross-sections. First, we examined metabolite differences between IA cases and controls at the first detection of autoantibody positivity (seroconversion), defined as the first of the two consecutive autoantibody positive visits for cases. Then, to identify metabolites and lipids that may differentiate IA cases and controls prior to the detection of autoantibodies (pre-seroconversion) we selected the most recent IA-free visit for cases. Finally, since autoimmunity can begin very early in life and both metabolomics and dietary factors are strongly related to age, we identified metabolites distinguishing IA cases and controls prior to the appearance of autoantibodies in infancy. The “infancy” cross-section was defined as an IA-free visit at 9 months of age for cases. For controls, the visit corresponding to the case visit was selected for all cross-sections. Children positive for autoantibodies at 9-months (n = 48 IA cases) and their matched controls were excluded from the infancy cross-section (Fig. 1).

We tested the association of each metabolite with the secondary outcomes described above: IAA, GADA, and mAb+. We considered p-value < 0.05 significant since traditional approaches for multiple comparison correction may be too strict for the unusually highly correlated metabolomics data or inappropriate given the exploratory nature of our study aims35. SAS version 9.4 was used for these analyses.

We focused on pathway enrichment, given that metabolites may capture perturbations in many upstream biological systems, thereby complicating interpretation of individual associations. ChemRICH forms non-overlapping groups of metabolites based on chemical similarity and ontology mapping36. It calculates a single p-value for each group, and identifies the most significant metabolite in each group as the “key compound”. Inputs for the ChemRICH analyses included the nominal p-value and odds ratio from the individual conditional logistic regression models, and chemical structure information from well-characterized known metabolites and lipids (p = 315, see Supplemental Table 1).

Metabolite-related dietary patterns preceding the appearance of autoantibodies

Reduced rank regression (RRR) was used to identify dietary patterns reflecting metabolites associated with IA. RRR creates linear combinations of foods (dietary patterns) that explain the maximum covariation in a second set of intermediate response variables (metabolites)37, thereby capturing disease-related variation in the diet rather than general eating behaviors identified from other dietary pattern methods38. We focused dietary pattern analysis on the infancy cross-section, since it is prior to the beginning of the autoimmune process and all children were the same age. Conducting dietary pattern analyses with foods in young children of different ages could be problematic, since not all foods are able to be eaten at all ages.

Foods from the food record were combined into 43 subgroups based on nutrient content and culinary usage (Supplemental Table 2)32. Intake of various foods varies greatly at 9 months of age, leading to some foods having a large proportion of subjects with no reported intake. Therefore, we filtered out food subgroups with a high proportion of non-eaters as a way to deal with zero-inflated distributions, as is common in studies using RRR39. Food subgroups that were shown to be comparable across TEDDY countries (i.e., harmonized) and were eaten by at least 40% of subjects in infancy were included in the creation of dietary patterns. The resulting dietary patterns therefore reflect foods that are most commonly eaten at 9 months among TEDDY study participants. Food subgroups were standardized to the age-specific mean and standard deviation of all TEDDY food records for dietary pattern analyses. The key compounds in each significantly enriched metabolite group (identified by ChemRICH) were used as RRR response variables.
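The food subgroup selection and standardization can be sketched as follows. The function names and the dictionary layout are hypothetical, and the reference mean and standard deviation are assumed to have been computed beforehand from all TEDDY food records at the same age.

```python
from typing import Dict, List, Sequence, Set


def select_food_subgroups(intakes: Dict[str, Sequence[float]],
                          harmonized: Set[str],
                          min_consumers: float = 0.40) -> List[str]:
    """Keep harmonized subgroups eaten (intake > 0 g/day) by at least 40% of subjects."""
    selected = []
    for food, grams in intakes.items():
        share_eating = sum(g > 0 for g in grams) / len(grams)
        if food in harmonized and share_eating >= min_consumers:
            selected.append(food)
    return selected


def standardize(grams: Sequence[float], ref_mean: float,
                ref_sd: float) -> List[float]:
    """Z-score intakes against an age-specific reference mean and SD."""
    return [(g - ref_mean) / ref_sd for g in grams]
```

Standardizing against the full set of food records, rather than against the nested case-control sample alone, is what allows loadings derived in the case-control sample to be applied to the whole cohort on the same scale.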

The number of dietary patterns needed to best explain the variation in metabolites was selected using the van der Voet T2 statistic40. The loadings (or relative weights) of food groups on each dietary pattern and the partial correlations with metabolite response variables were used to interpret each dietary pattern.

Metabolite-related dietary patterns and risk of IA

Infancy metabolite-related dietary patterns were first tested in the nested case-control study using conditional logistic regression as described above. Then we applied them to the full TEDDY cohort at 9 months of age, using the food group loadings to generate one score per pattern for each subject with complete food records (n = 6,537). The dietary pattern score is a linear combination of food intakes using the reported intake of each food weighted by the factor loadings37. The score indicates how similar the reported dietary intake of a subject is to the dietary pattern—with higher scores indicating a diet similar to the pattern, and lower scores indicating a diet dissimilar to the pattern.
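The scoring step is a weighted sum, as the sketch below shows. The `pattern_score` name, the food keys, and the loading values in the usage note are illustrative assumptions, not estimates from this study.

```python
from typing import Dict


def pattern_score(standardized_intake: Dict[str, float],
                  loadings: Dict[str, float]) -> float:
    """Linear combination of standardized food intakes weighted by pattern loadings."""
    return sum(loadings[food] * standardized_intake[food] for food in loadings)
```

With hypothetical loadings of 0.5 for breast milk and −0.5 for infant formula, a child whose standardized intakes are 1.0 and −1.0 respectively scores 0.5 × 1.0 + (−0.5) × (−1.0) = 1.0, indicating a diet that resembles the pattern.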

Cox proportional-hazards models were used to test the association of metabolite-related dietary pattern scores at 9 months with risk of mAb+, adjusting for clinical center, high-risk HLA genotype, family history of type 1 diabetes, total energy intake, and sex. Adjustment factors used in multivariable models were selected based on standard adjustments used in the TEDDY study31,41. Time-to-event analyses were performed to evaluate whether metabolite-related dietary pattern scores were associated with mAb+ by the age of 6 years. Cases included those selected for the nested case-control study plus any additional cases that developed by January 2018, when the prospective data collection was cut for analyses. Given that risk factors for IA may differ by age, we restricted follow-up to 6 years in the TEDDY cohort to ensure the cohort analysis represented a similarly aged case population as the nested case-control study. For consistency with the nested case-control study, the time-to-event was defined as the time from birth to the appearance of the first persistent confirmed autoantibody among IA cases who developed a second persistent confirmed autoantibody at any point. Subjects without complete covariate information, those developing mAb+ at 9 months (n = 53), or those who had no follow-up after 9 months (n = 243) were excluded from survival analysis (Fig. 1).


Metabolic dysregulation apparent at seroconversion and infancy

From the nested case-control study, there were 352 matched sets with metabolomics measures for 1 case and at least 1 control at seroconversion (mean (SD) case-age = 722 (446) days), 366 sets at pre-seroconversion (mean (SD) = 625 (412) days), and 253 sets at the 9-month infancy visit (mean (SD) = 283 (14) days) (Table 1). For secondary outcomes, 49% of IA cases were positive for IAA, 32% for GADA, and 60% for mAb+. The distribution of secondary outcomes was consistent across the seroconversion, pre-seroconversion, and 9-month infancy cross-sections. The majority of IA cases were from Sweden (32%) and Finland (28%). Of the IA cases, 55% were male, 22% had a first-degree relative with type 1 diabetes, and 12% had their seroconversion blood draw following TEDDY’s long distance protocol (Supplemental Table 3). The distribution of matched sets by matching factors (clinical center, sex, first-degree relative status, and long distance protocol) was similar in the secondary outcomes compared to primary IA. Approximately 9% of the pre-seroconversion case samples (n = 34) occurred during the 9-month infancy visit.

Table 1 Description of matched sets (1 case and 1, 2 or 3 controls) for metabolomics analyses by outcome and cross-section.

Conditional logistic regression results from the metabolome-wide association study indicated metabolic dysregulation in cases compared to matched controls at seroconversion and during infancy (Supplemental Fig. 2). More metabolites and lipids were different by case status when restricting to the mAb+ outcome (p = 130, 15%) compared to the IA outcome (p = 64, 7.5%). There were few metabolites associated with the secondary outcomes IAA first and GADA first in any cross-section. Therefore, we focused the metabolomics set enrichment analyses on the mAb+ outcome, which represented 60% of IA cases and had the largest signal of metabolomics differences between cases and controls. ChemRICH identified seven groups of chemically similar metabolites that were significantly different among mAb+ cases and controls at seroconversion, one group at pre-seroconversion, and six groups in infancy (Fig. 2, Supplemental Table 4).

Figure 2

Chemically similar metabolite sets identified as significantly associated with mAb+ by ChemRICH. Each row is an individual metabolite, grouped by ChemRICH set and sorted by log(OR) within each set. Log(OR) > 0 (red) indicates a positive association between metabolite and mAb+, whereas log(OR) < 0 (blue) indicates an inverse association between metabolite and mAb+. Phosphatidylcholines were significantly lower in cases compared to controls in infancy (9-month), just prior to seroconversion (PSV), and at seroconversion to primary IA (SV). Other phospholipids were significantly lower in cases only in infancy, while other metabolite groups, such as unsaturated triglycerides and amino acids, distinguished cases and controls at seroconversion. Metabolite groups identified as significant (group p-value < 0.05) in any cross-section are shown, along with the corresponding adjusted p-value for the group (group false discovery rate).

Of the metabolite groups identified as different between mAb+ cases and controls, only unsaturated phosphatidylcholines (PC) were consistently dysregulated in all three analyses (p-value for group in seroconversion = 2.1 × 10^−5, pre-seroconversion = 0.013, infancy = 2.2 × 10^−20), with the majority of the individual metabolites being lower in mAb+ cases compared with controls (OR < 1) (Fig. 2). Similarly, phosphatidylethanolamines (PE) were lower in mAb+ cases (OR < 1) at both seroconversion (p-value = 0.0047) and in infancy (p-value = 2.9 × 10^−6).

Other than PCs and PEs, distinct metabolite groups distinguished mAb+ cases from their controls at the time of seroconversion to primary IA compared to during infancy prior to the appearance of any autoantibodies. At seroconversion, mAb+ cases had lower levels of unsaturated triglycerides (p-value = 3.1 × 10^−15), amino acids (p-value = 0.0074), diglycerides (p-value = 0.014), and aromatic amino acids (p-value = 0.03), and higher levels of saturated fatty acids (p-value = 0.0074). In infancy, other phospholipids were significantly protective for mAb+ (majority of OR < 1), including sphingomyelins (SM, p-value = 1.1 × 10^−8) and phospholipid ethers (EtherPL, p-value = 0.0032), along with the glucosylceramides (GlcCer, p-value = 4.4 × 10^−5). Three dicarboxylic acids were significantly higher in mAb+ cases compared to controls in infancy (OR > 1, p-group = 0.0019).

Infant metabolite-related dietary patterns and risk of mAb+

From each of the six groups identified by ChemRICH in infancy, we used the key metabolite (most significant one) as a response variable in dietary pattern analyses, including: PC (34:3), SM (d41:2) A, PE (34:2), GlcCer (d41:1), adipic acid, and PC (p-32:0) or PC (o-32:1) (EtherPL).

Reduced rank regression identified three dietary patterns that explained 8% of the variation in metabolites and 29.3% of the food variation. Food group factor loadings and metabolite variable weights used to interpret each dietary pattern are shown in Fig. 3. More extreme factor loadings or variable weights indicate the food or metabolite was influential in the dietary pattern. Infants scoring high on Dietary Pattern 1 ate more non-gluten containing cereals, onions, vegetable oils, and fat-free milk (positive factor loadings), and less breast milk (negative factor loadings). This diet corresponded to higher levels of PE (34:2), as indicated by the higher variable weight. Infants scoring high on Dietary Pattern 2 ate diets higher in saturated fats, fat-free milk, poultry, and infant formula, and lower in potatoes and vegetable oils. This diet corresponded to higher levels of SM (d41:2) A, GlcCer (d41:1), and PC (p-32:0) or PC (o-32:1). Finally, 9-month infants scoring high on Dietary Pattern 3 ate diets higher in breast milk, red meat, potatoes, and cereals, and lower in processed fruits, legumes, and infant formula. High scores on Dietary Pattern 3 corresponded to higher levels of PC (34:3) and PC (p-32:0) or PC (o-32:1), and lower levels of SM (d41:2) A. The correlation between metabolites and dietary patterns followed similar patterns (Supplemental Table 5).

Figure 3

Food group loadings and metabolite weights for metabolite-related dietary patterns. In total, the three dietary patterns explained 8% of metabolite variation and 29.3% of food variation. For food, the radial axis indicates the loading on each dietary pattern (Range: −0.6 to 0.4), and is used to interpret which combinations of foods are influential in the dietary pattern. Similarly, the metabolite radial axis indicates the weight of each metabolite on each dietary pattern (Range: −0.6 to 0.9), indicating which combinations of metabolites are explained by each dietary pattern. For example, subjects scoring high on dietary pattern 1 had diets higher in non-gluten containing cereals, onions, vegetable oils, and fat-free milk, and lower in breast milk. This diet corresponded to higher levels of PE (34:2).

Dietary patterns generated from metabolites and food intake in the nested case-control study were applied to the full cohort to generate one metabolite-related dietary pattern score for each dietary pattern on all 9-month diet records. Subjects developing mAb+ by age 6 years were more likely to have a first-degree relative (FDR) with type 1 diabetes and to have high-risk HLA-DR3/4 genotypes (Supplemental Table 6). Dietary Pattern 3 was significantly protectively associated with mAb+ in the nested case-control study (OR = 0.67, 95%CI = 0.48–0.94, Table 2). However, there was no association seen in time-to-event analyses applied to the whole cohort and adjusted for clinical center, sex, HLA-DR3/4, and FDR (HR = 0.98, 95%CI = 0.83–1.16, Table 2). No other 9-month metabolite-related dietary patterns were associated with development of mAb+ in the TEDDY cohort. Results did not change in a sensitivity analysis in which we adjusted for an additional 17 covariates that TEDDY has identified as associated with development of IA (race-ethnicity, maternal education, maternal age, introduction of probiotics before 28 days, introduction of probiotics at or after 28 days, weight for age z-score at 12 months, and number of minor alleles for rs2476601, rs2816316, rs11711054, rs10517086, rs4948088, rs1004446, rs7111341, rs2292239, rs3184504, rs3825932, rs12708716) (data not shown).

Table 2 Dietary patterns at 9-months of age associated with risk of mAb+ in TEDDY.


We identified dysregulated metabolism at the onset of and preceding stage 1 diabetes (mAb+) in a multi-national, prospective type 1 diabetes study. PC and PE metabolite groups were consistently decreased in mAb+ cases compared to controls both prior to and at the time of seroconversion. Unsaturated triglycerides and amino acid groups were lower among mAb+ cases at seroconversion only, whereas SM, GlcCer, and EtherPL lipids were lower among mAb+ cases in infancy only. While an infancy dietary pattern explaining choline- and sphingosine- containing lipids was associated with mAb+ in the nested case-control study, this association was not observed in the full TEDDY cohort.

Dicarboxylic acids were the only metabolite group we found associated with increased risk of mAb+ in infancy. Adipic acid was the key compound of the dicarboxylic acids group. Other dicarboxylic acids associated with increased mAb+ risk included the tricarboxylic acid (TCA) cycle intermediates succinic acid and malic acid. While metabolomics studies in Norway and Germany did not identify TCA cycle metabolites5,6, a previous study in Finnish children found that both succinic acid and glutamic acid were increased in type 1 diabetes cases 0–9 months prior to autoantibody appearance10. We were not able to examine glutamic acid, as its measurement was inhibited by the use of citrate tubes for plasma collection and storage in TEDDY. Through their regulation of demethylase activity, succinic acid and other TCA cycle intermediates may be important regulators of DNA and histone methylation42, which may have some links to type 1 diabetes pathogenesis43.

The remainder of metabolite groups distinguishing mAb+ cases from controls in infancy belonged to lipid classes, some of which have been inconsistently associated with type 1 diabetes endpoints. We identified phospholipid dysregulation of PC and PE metabolites prior to mAb+ in infancy and at the time of seroconversion. While Oresic et al.10 similarly found lower phosphatidylcholines in children who later developed type 1 diabetes, Pflueger et al. found higher levels of triglycerides and PUFA-containing phosphatidylcholines in autoantibody-positive children6. Unsurprisingly, the direction of association is different before and after the appearance of autoantibodies, as has been reported for other risk factors for type 1 diabetes, such as erythrocyte membrane fatty acid levels15,17 or diabetes susceptibility genes44.

Sphingolipid metabolism plays a role in diabetic pathologies, including regulating beta-cell apoptosis, proinsulin and insulin folding in the endoplasmic reticulum, and cytokine secretion45. The evidence supporting this connection has recently been extended from animal models into human islet cells46. We identified two sphingolipid groups as significantly lower in infancy for mAb+ cases versus controls: the SM group, which was previously identified in type 1 diabetes metabolomics studies9,10, and the GlcCer group. As a whole, sphingolipids have been characterized as both pro- and anti-inflammatory. Endogenous sphingolipids are metabolically involved in T-cell regulation, autoimmunity, and inflammation47, yet consumption of dietary sphingolipids has been linked to anti-inflammatory responses48. The protective effects of dietary sphingolipids may operate via changes in gut microbiota or by activating other cofactors such as peroxisome proliferator-activated receptor γ expression49, both of which have been implicated in type 1 diabetes50. While outside the scope of this study, targeted characterization of the relationship between sphingolipid dietary intake and metabolite levels in future research might help to disentangle reported contrasting effects, which likely depend on other factors such as specific sphingolipid structure and existing metabolic state.

We identified choline-containing lipid groups (PC, SM, EtherPL) as protective for development of mAb+ at 9 months of age, consistent with previous studies conducted at birth and 3 months of age7,9,10. Choline is important for rapid growth and development in infancy, as a constituent of phospholipid cellular membranes. Additionally, it may play a role in insulin resistance or energy metabolism, perhaps through its role as a methyl donor for epigenetic changes51.

A metabolite-related dietary pattern reflecting choline-containing foods and metabolites was protective for mAb+ in the nested case-control study; however, no dietary pattern at 9-months was associated with development of mAb+ by age six years in the full TEDDY cohort. There are several factors that could contribute to the lack of dietary pattern association found. First, untargeted metabolomics and 3-day food records may not be measured precisely enough to successfully identify disease-related dietary patterns at such a young age where variability in both is large. Second, using the most significant metabolite in each group may not be the best choice of response variable, which is the variable that determines the ability of reduced rank regression to capture disease-related variation in the diet52. Metabolites may reflect other environmental factors, such as medication or microbiome, and therefore be poorly correlated with dietary intake. Metabolomics measures were further limited because they were quantified from non-fasting samples, which has been shown to differentially impact serum metabolic profiles related to dietary factors53.

We identified metabolic dysregulation prior to the detection of autoantibodies that distinguished children whose lifetime risk for symptomatic (Stage 3) type 1 diabetes approaches 100%54. Metabolomics differences were more apparent when comparing the high-risk mAb+ group to controls than when comparing all cases of IA to controls. Few differences were identified by the type of first-appearing autoantibody. This metabolomics discovery effort was more comprehensive, and generalizes to a broader population, than previous studies. However, because our exploratory approach used nominal p-value cutoffs, these findings require replication in another study. While the novel application of dietary patterns summarizing candidate metabolites did not successfully extend outside of the nested case-control study, the approach may show promise for future work with targeted measurement of disease-related metabolites. Application of methods that account for complex dietary intake is particularly important in type 1 diabetes research, since components of dietary intake are among the leading hypothesized environmental factors that act on a high-risk genetic background to cause type 1 diabetes.

In the TEDDY study, higher levels of dicarboxylic acids and lower levels of PCs, SMs, PEs, EtherPLs, and GlcCers at 9 months of age were associated with increased risk of mAb+. Characterization of this profile may help shape the future of early diagnosis or prevention efforts.