Introduction

Gestational diabetes mellitus (GDM) is a hyperglycemic condition first recognized during pregnancy1. GDM affects 4.5% to 20.3% of pregnancies among Asian women, depending on the country and the GDM diagnostic criteria used2. GDM increases a woman’s lifetime risk of developing abnormal glucose metabolism (AGM), including prediabetes and type 2 diabetes (T2D)3,4. These women are also at higher risk of developing cardiovascular disease and renal disease5,6. Therefore, identifying more convenient approaches, such as novel biomarkers, to quantify the risk of AGM could be beneficial for risk stratification and early postpartum intervention among women with GDM.

Metabolites are the end products of specific cellular regulatory processes, and their levels could reflect the ultimate response of biological systems to genetic or environmental changes7. For the transition from GDM to postpartum AGM, capturing the metabolic signals underlying postpartum AGM might indicate and even predict this process. Emerging studies have demonstrated that specific metabolomic biomarkers, including lipids [i.e., Cholesteryl ester (20:4), Lphosphatidylethanolamine (36:2), Phosphatidylserine (38:4), Lphosphatidylserine (C40:5)] and amino acids (i.e., branched-chain amino acids, Hexose)8,9,10,11, improved the prediction of the transition risk from GDM to T2D. However, understanding in this field of research remains inadequate. First, previous studies have focused only on metabolic biomarkers in the transition from GDM to T2D, largely ignoring the transition from GDM to prediabetes. Second, these prior studies have primarily used targeted approaches to assess metabolic profiles and tested specific categories of metabolites (e.g., lipids) based on potential mechanisms and interests8,9,10. Such methods limit consideration of the full spectrum of human metabolic profiles and, by introducing selection bias toward specific categories of metabolites, may misinterpret the mechanisms linking GDM to the development of AGM. Third, these existing studies mainly focused on non-Asian populations in the US and Australia and lack data on Asian populations, which are at high risk of both GDM and T2D12. Lastly, all studies to date reported metabolic biomarkers identified only in women with GDM. However, there is also a trend toward developing postpartum AGM among women without a history of GDM13.

In this pilot study of 100 women nested in a Singapore birth cohort, we identified metabolic signatures associated with postpartum AGM and GDM via an untargeted, discovery-based metabolomic approach. Subsequently, we investigated the indicative value for postpartum AGM added by such metabolic signatures, beyond traditional risk factors including a family history of T2D and body mass index (BMI).

Results

Of the 100 participants, 24 out of 50 women with GDM (46%) and 17 out of 50 women without GDM (34%) developed AGM after 5 years’ follow-up (p = 0.22). Women with GDM at baseline had similar pre-pregnancy BMI (22.8 vs. 21.6, p = 0.54), lower gestational weight gain at 26–28 weeks of gestation (8.2 vs. 8.8, p = 0.07), and lower BMI at follow-up (24.1 vs. 25.9, p = 0.54) compared with women without GDM at baseline (Table 1).

Table 1 Sociodemographic and clinical characteristics at baseline and 5-year postpartum follow-up of the study population.

Figure 1 illustrates the metabolic profiles of women with and without AGM at the 5-year follow-up in OPLS-DA score plots. A total of 31 serum metabolites were found at different levels between women with and without 5-year postpartum AGM in a crude model (OPLS-DA VIP score > 1, Mann–Whitney U test or t-test p-values 0.0006 to 0.049). After adjusting for age at follow-up, race/ethnicity, BMI at follow-up, college education, family history of T2D and nulliparity, 23 metabolites remained statistically significant in association with 5-year postpartum AGM (FDR < 0.1), of which 5 metabolites were able to further differentiate women with a history of GDM (Table 2). Regression coefficients of all 23 metabolites are presented in Supplementary Table 1.

Figure 1

OPLS-DA score plots of metabolic profiles of women with gestational diabetes mellitus (GDM) or normal glucose metabolism at baseline (24–28 weeks of gestation), and of women with abnormal glucose metabolism (AGM) or normal glucose metabolism at the 5-year follow-up. (A) The purple dots and the yellow dots represent women with and without GDM at baseline, respectively; (B) The pink dots and the green dots represent women with and without AGM at the 5-year follow-up, respectively.

Table 2 Metabolites that were associated with 5-year postpartum abnormal glucose metabolism after multiple adjustment.

After further controlling for collinearity with ridge regression, all 5 metabolites remained significant in relation to 5-year postpartum AGM (p: 0.001 to 0.018) (Supplementary Table 2). These five metabolites, namely p-cresol sulfate, linoleic acid, glycocholic acid, and two lysophosphatidylcholines [LysoPC(16:1) and LysoPC(20:3)], were included in the AGM indication models. Higher serum levels of LysoPC(16:1) and LysoPC(20:3) were associated with increased risk of 5-year postpartum AGM, while higher levels of p-cresol sulfate, linoleic acid, and glycocholic acid were associated with reduced risk of 5-year postpartum AGM. LysoPC(16:1) and LysoPC(20:3) were analyzed separately to avoid biological collinearity.

Figure 2 compares four models: Model 1 (traditional risk factors), Model 2 (traditional risk factors and diagnostic biomarkers, namely fasting and 2-h glycemic levels at study entry), Model 3 [Model 2 plus the metabolites p-cresol sulfate, linoleic acid, glycocholic acid and lysoPC(16:1)], and Model 4 [Model 2 plus the metabolites p-cresol sulfate, linoleic acid, glycocholic acid and lysoPC(20:3)]. The variables included in each model, together with the AUC and R2 of each model, are presented in Table 3. The AUC (R2) values were 0.74 (0.25) for Model 1, 0.77 (0.32) for Model 2, 0.94 (0.70) for Model 3 and 0.92 (0.67) for Model 4. The AUCs of Models 3 and 4 were each significantly higher than those of Model 1 and Model 2, and Model 3 yielded the highest indicative value among all stepwise models (Supplementary Table 3).

Figure 2

Receiver operating characteristic (ROC) curves of the indicative models for AGM at the 5-year follow-up. The gray line represents the ROC curve of Model 1: AGM at year 5 ~ Age at year 5 + Ethnicity + BMI at year 5 + Family History of T2D + Number of GDM Episodes, R2 = 0.25, AUC = 0.74; The green line represents the ROC curve of Model 2: AGM at year 5 ~ Age at year 5 + Ethnicity + BMI at year 5 + Family History of T2D + Number of GDM Episodes + Fasting and 2-h glycemic levels at study entry, R2 = 0.32, AUC = 0.77; The orange line represents the ROC curve of Model 3: AGM at year 5 ~ p-cresol sulfate + linoleic acid + Glycocholic acid + LysoPC(16:1) + Age at year 5 + Ethnicity + BMI at year 5 + Family History of T2D + Number of GDM Episodes + Fasting and 2-h glycemic levels at study entry, R2 = 0.70, AUC = 0.94; The blue line represents the ROC curve of Model 4: AGM at year 5 ~ p-cresol sulfate + linoleic acid + Glycocholic acid + LysoPC(20:3) + Age at year 5 + Ethnicity + BMI at year 5 + Family History of T2D + Number of GDM Episodes + Fasting and 2-h glycemic levels at study entry, R2 = 0.67, AUC = 0.92.

Table 3 Contribution of variables in each regression model for abnormal glucose metabolism.

Additional KEGG pathway analyses with the 31 metabolites that passed OPLS-DA and univariate analysis (Table 4) showed that among all women, AGM-related metabolites were associated with 3 biological pathways with p-value < 0.1. They included alpha-linolenic acid metabolism (alpha-linolenic acid) (p = 0.03), glycerophospholipid metabolism [LysoPC(16:1) and LysoPC(20:3)] (p = 0.09) and biosynthesis of unsaturated fatty acids (alpha-linolenic acid) (p = 0.09). The metabolic map that shows the location of alpha-linolenic acid metabolism is presented in Fig. 3.

Table 4 Kyoto encyclopedia of genes and genomes (KEGG) pathways of AGM-associated metabolites among all subjects.

Figure 3

The metabolic network of identified alpha-linolenic acid metabolism.

Discussion

Our study identified 5 metabolic signatures [p-cresol sulfate, linoleic acid, glycocholic acid, LysoPC(16:1), and LysoPC(20:3)] that were associated with postpartum AGM specifically among women with a history of GDM. When added to a model of traditional risk factors, including BMI and family history of T2D, and glucose levels at index pregnancy, these metabolites significantly increased the AUC of the regression model by ~ 20%. Furthermore, our pathway analysis showed that such identified metabolites were involved in either lipid or insulin metabolism.

Emerging evidence has suggested a plausible role of metabolites underlying the transition from GDM to manifest T2D. Several metabolites have been suggested to be predictive of T2D development among women with a history of GDM. These metabolites include branched-chain amino acids, acylcarnitines, fatty acids (i.e., linoleic acid, phospholipids including lysoPCs), and sphingomyelins (i.e., SM (OH) C14:1)8,9,10,11. However, most existing studies identified these metabolites using targeted approaches focusing on lipids and amino acids8,9. Therefore, they may neglect metabolites in other pathways that could significantly contribute to the transition from GDM to T2D.

In our study, serum metabolites were examined using an untargeted, discovery-based approach (LC–MS) that covers more classes of metabolites than lipids and amino acids alone, thus providing more comprehensive metabolic profiling. Models with the identified metabolites yielded higher indicative values for postpartum AGM than models using traditional risk factors and/or glycemic levels collected at index pregnancy. Such findings might suggest that these metabolites have considerable potential as indicators of the transition from GDM to AGM. Two of the five identified AGM-associated metabolites among women with a history of GDM were lysoPCs [lysoPC(16:1) and lysoPC(20:3)], both increased in women with AGM. LysoPCs are essential elements of glycerophospholipid metabolism and also reservoirs and transporters for fatty acids and choline14. Insulin resistance, prediabetes, and T2D are accompanied by hypertriglyceridemia15 and abnormal glycerophospholipid metabolism16. Previous studies have reported an upregulation of lysoPCs, including lysoPC(16:1), in women with GDM16,17. Therefore, the increase of lysoPCs found in our study might lead to a surplus of fatty acids and choline, which ultimately results in a higher risk of impaired glucose metabolism18.

We also observed signatures of linoleic acid (LA), alpha-linolenic acid (ALA), and glycocholic acid in women with postpartum AGM. LA is a polyunsaturated fatty acid (PUFA) and a type of omega-6 fatty acid associated with reduced risk of T2D and improved glucose tolerance in women after GDM19,20. Incorporating linoleic acid into phospholipids could alter membrane fluidity and further enhance insulin receptor activity21. ALA is an essential omega-3 fatty acid and a precursor of eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA). It has been reported to be an active agent that reduces circulating free fatty acids (FFA) and increases insulin sensitivity, and it may reduce the risk of T2D22. Moreover, ALA has also shown the potential to lower HbA1c levels and fasting blood glucose concentrations in diabetic patients23. In addition, glycocholic acid is an acyl glycine and a bile acid-glycine conjugate involved in the emulsification of fats and part of the primary bile acid biosynthesis pathway (KEGG ID: hsa00120). Bile acids are physiological detergents that facilitate excretion, absorption, and transport of fats and sterols in the intestine and liver. The inter-organ signaling and interplay between bile acid receptors and the gut microbiota have been suggested to underlie the pathophysiology of T2D24,25.

Interestingly, we also observed a decreased level of one metabolite underlying AGM, p-cresol sulfate, that has not previously been reported to be associated with GDM or AGM. The level of circulating 4-cresol in the host organism may reflect architectural alterations of the gut microbiota in obese and diabetic patients26,27. However, the pathophysiological mechanisms of p-cresol sulfate's role in AGM development remain unclear and require further investigation.

This study's strengths lie in the use of AGM as an early-stage outcome, measurements evaluated via standardized protocols, and reliable quality control of metabolite examination. However, our study is not without limitations. First, the relatively small sample size might have restricted the study's power to detect more potential metabolite signatures underlying AGM. In addition to the small sample size, we were unable to match ethnicity completely. However, considering that all mothers were of Southeast Asian origin and no significant difference was found across ethnicities between GDM and non-GDM controls, the genetic heterogeneity in our findings might not be substantive. Second, residual bias might exist from factors such as glycated hemoglobin (HbA1c) at index pregnancy. Third, we did not collect any dietary intake data after delivery. Even though some evidence showed no difference between the GDM and non-GDM groups after delivery in terms of diet or energy intake9,11, further studies are warranted to include such variables. Fourth, since levels of the metabolites were examined at follow-up rather than at baseline, reverse causality cannot be ruled out in our preliminary findings. Future studies with larger sample sizes in a multiracial and prospective study setting with external validation, as well as multiple time points of metabolite testing, are warranted to verify these preliminary findings.

Conclusion

Our study identified five metabolites including p-cresol sulfate, linoleic acid, glycocholic acid, lysoPC(16:1) and lysoPC(20:3) that were associated with postpartum AGM, specifically among women with prior GDM, beyond traditional risk factors. These metabolic signatures might shed light on the pathophysiology underlying the transition from GDM to AGM, and even provide insights into potential screening approaches using metabolites in clinical practice.

Methods

Study participants and design

This is a cross-sectional and observational pilot study nested in a longitudinal birth cohort study in Singapore (Growing Up in Singapore Towards Healthy Outcomes, GUSTO). This cohort recruited 1136 mothers with singleton pregnancies during their first trimester from June 2009 to September 2010. We have reported the study design and recruitment criteria in previous publications28. We performed an oral glucose tolerance test (OGTT) at 26–28 weeks of gestation for all recruited mothers. As a pilot in the metabolomics study, we enrolled a total of 100 participants in the current study, including 50 GDM and 50 non-GDM women matched for age (± 2 years), ethnicity and pre-pregnancy BMI (± 2 kg/m2 and within the same WHO category). All participants attended both baseline (26–28 weeks’ gestation) and follow-up (5-year postpartum) visits. Supplementary Fig. 1 presents the flowchart of the current study design.

GDM diagnosis at the index pregnancy

At baseline, we diagnosed 50 women with GDM using a 2-h 75 g oral glucose tolerance test (OGTT) during 24–28 weeks gestation according to World Health Organization 1999 criteria29: fasting glucose ≥ 7.0 mmol/l and/or 2-h plasma glucose ≥ 7.8 mmol/l. None of these 50 women with GDM required drug treatment.

Diagnosis of abnormal glucose metabolism (AGM) at 5-year postpartum

At the 5-year postpartum visit, we assessed glucose tolerance of all 100 participants using HbA1c and a 2-h 75 g OGTT. We defined prediabetes as follows: (a) fasting plasma glucose 6.1‒6.9 mmol/l and 2-h plasma glucose < 11.0 mmol/l, or (b) fasting plasma glucose < 7.0 mmol/l and 2-h plasma glucose 7.9‒11.0 mmol/l. We defined T2D as: (a) fasting plasma glucose ≥ 7.0 mmol/l, or (b) 2-h plasma glucose ≥ 11.0 mmol/l, or (c) HbA1c ≥ 6.5%, or (d) self-reported physician-diagnosed T2D during the 5-year follow-up. We subsequently categorized participants as having AGM if they had either prediabetes or T2D at the 5-year postpartum visit.
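The classification rules above can be sketched in code as follows. This is an illustrative sketch only, not the study's actual analysis code: the function names and arguments are hypothetical, glucose values are assumed to be recorded in mmol/l to one decimal place, and the T2D criteria are assumed to take precedence wherever the stated prediabetes and T2D ranges meet (for example, a 2-h glucose of exactly 11.0 mmol/l).

```python
from typing import Optional


def classify_glucose_status(
    fasting_glucose: float,           # fasting plasma glucose, mmol/l
    two_hour_glucose: float,          # 2-h plasma glucose after 75 g OGTT, mmol/l
    hba1c: Optional[float] = None,    # HbA1c in %, None if unavailable
    self_reported_t2d: bool = False,  # physician-diagnosed T2D during follow-up
) -> str:
    """Return 'T2D', 'prediabetes', or 'normal' using the thresholds in the text.

    Assumption of this sketch: T2D criteria are checked first, so any
    boundary value shared by both definitions is classified as T2D.
    """
    # T2D criteria (a), (b) and (d)
    if fasting_glucose >= 7.0 or two_hour_glucose >= 11.0 or self_reported_t2d:
        return "T2D"
    # T2D criterion (c)
    if hba1c is not None and hba1c >= 6.5:
        return "T2D"
    # From here on, fasting glucose < 7.0 and 2-h glucose < 11.0.
    # Prediabetes (a): raised fasting glucose
    if fasting_glucose >= 6.1:
        return "prediabetes"
    # Prediabetes (b): raised 2-h glucose
    if two_hour_glucose >= 7.9:
        return "prediabetes"
    return "normal"


def has_agm(*args, **kwargs) -> bool:
    """AGM is defined as either prediabetes or T2D."""
    return classify_glucose_status(*args, **kwargs) != "normal"
```

For instance, `classify_glucose_status(5.5, 8.5)` falls under prediabetes criterion (b), whereas `classify_glucose_status(5.0, 6.0, hba1c=6.7)` is classified as T2D on HbA1c alone.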

Liquid chromatograph–mass spectrometer (LC–MS) based metabolic profiling at 5-year postpartum follow-up

We extracted the metabolites from 200 μl serum samples collected at the 5-year postpartum visit. Briefly, we added 800 μl of an ice-cold mixture of methanol/acetone/acetonitrile (1:1:1, v/v/v) to each serum sample and incubated the mixture at −20 °C for 30 min for precipitation. We then centrifuged the mixture at 16,000×g for 15 min (4 °C) to collect the supernatant containing metabolites and dried the supernatant in a vacuum concentrator (miVac, GeneVac, Warminster, UK) before LC–MS analysis. The detailed laboratory procedures are described in the Appendix, Supplementary Table 4 and Supplementary Figs. 2 and 3. We tested all samples in one batch.

We imported raw data from the LC–MS analysis into MarkerView (SCIEX, Foster, California, US) for peak extraction, producing peak lists that contained the m/z value, retention time and integrated ion intensity for each m/z feature. We employed a modified 80% rule for missing value handling, i.e., a metabolite feature was kept if it had a non-zero value in at least 80% of the samples in any group30. We applied interquartile range (IQR) filtering to the peak lists and performed data filtering using MetaboAnalyst (Version 4.0)31. We further removed features whose relative standard deviation (RSD) was more than 20% in QC samples.
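As an illustration of the two filtering rules just described, the following minimal sketch applies the modified 80% rule and the QC-based RSD cut-off to a single metabolite feature. It is not the MarkerView or MetaboAnalyst implementation: the function names and data layout are hypothetical, and the use of the sample standard deviation for the RSD is an assumption.

```python
from statistics import mean, stdev
from typing import Dict, List


def passes_modified_80_rule(
    intensities_by_group: Dict[str, List[float]], threshold: float = 0.8
) -> bool:
    """Keep a feature if at least `threshold` of the samples in ANY group
    have a non-zero intensity (the modified 80% rule)."""
    for values in intensities_by_group.values():
        if not values:
            continue  # an empty group cannot satisfy the rule
        non_zero_fraction = sum(1 for v in values if v != 0) / len(values)
        if non_zero_fraction >= threshold:
            return True
    return False


def rsd_percent(values: List[float]) -> float:
    """Relative standard deviation in percent (sample SD / mean * 100).
    Assumption: sample (n - 1) standard deviation; needs >= 2 values."""
    m = mean(values)
    if m == 0:
        return float("inf")  # undefined RSD is treated as failing
    return 100.0 * stdev(values) / m


def passes_qc_rsd(qc_intensities: List[float], max_rsd: float = 20.0) -> bool:
    """Keep a feature only if its RSD across QC injections is <= max_rsd."""
    return rsd_percent(qc_intensities) <= max_rsd
```

A feature would be retained only if both checks pass, e.g. `passes_modified_80_rule(groups) and passes_qc_rsd(qc)`, where `groups` and `qc` are hypothetical inputs holding the feature's intensities per study group and per QC injection.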

We detected a total of 21,226 metabolite features using LC–MS (1734 features from HILIC negative mode, 2636 features from HILIC positive mode, 6181 features from RP negative mode and 10,655 features from RP positive mode) after applying the modified 80% rule. We finally included a total of 3067 metabolite features for statistical analysis after MetaboAnalyst processing.

Covariates

We measured standing height using the SECA model 213 (Seca, Hamburg, Germany) and standing weight using the SECA model 803 scale (Seca, Hamburg, Germany), according to standardized protocols32 at baseline and the 5-year follow-up visit. We calculated BMI as weight in kilograms divided by the square of height in meters. We calculated 26–28 weeks’ gestational weight gain (GWG) as the difference between weight at 26–28 weeks’ gestation and pre-pregnancy weight. Trained staff administered questionnaires in English, Chinese, Malay, or Tamil at the baseline visit of the index pregnancy. We collected information on the highest education level (college vs. below college), family history of diabetes (yes vs. no), past pregnancy history (parity, past GDM), and pre-pregnancy weight.

Statistical analyses

Identifying candidate metabolites associated with 5-year postpartum AGM

First, we compared the characteristics of GDM and non-GDM women with generalized linear mixed models or generalized estimating equations, as appropriate, to account for matching factors. We constructed orthogonal projections to latent structures discriminant analysis (OPLS-DA) models to separate and discriminate women with AGM from those with normal glucose metabolism. OPLS-DA is designed to differentiate between groups in highly complex datasets (e.g. LC–MS based metabolic data), despite within-group variability33. Second, we used the variable importance for the projection (VIP) plot to summarize the importance of the metabolite features to the OPLS-DA model (VIP score > 1). Then we used univariate analysis (Mann–Whitney U test or t-test as appropriate, p < 0.05) to determine whether a metabolite showed a different level between women with AGM and those with normal glucose metabolism. Third, we included these candidate metabolites in multivariable analysis.

Differentiating candidate metabolites associated with 5-year postpartum AGM between women with or without GDM at the index pregnancy

We stratified the participants based on their GDM status diagnosed at index pregnancy, and then used multivariable logistic regression models to identify metabolites specifically associated with 5-year postpartum AGM among women with a history of GDM. We controlled the false discovery rate (FDR) with the Benjamini–Hochberg procedure to correct for multiple testing and deemed significance at FDR less than 0.1. Next, we applied a ridge regression model to account for collinearity. We carried forward to further analyses only the metabolites with p-value < 0.05 in the ridge regression.
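For readers unfamiliar with the Benjamini–Hochberg procedure mentioned above, a minimal sketch of the adjustment is given below. It is a generic implementation of the published procedure, not the code used in the study; the function name and the example p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values (FDR q-values),
    in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four metabolites (not data from this study)
example_p = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(example_p)
# Metabolites passing the FDR < 0.1 threshold used in the text
significant = [i for i, q in enumerate(q_values) if q < 0.1]
```

With these illustrative inputs, the first three metabolites would pass the FDR < 0.1 threshold and the fourth would not.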

Exploring the AUC in the regression model for 5-year postpartum AGM with serum metabolites among women with a history of GDM

We narrowed down the significant metabolites and used multivariable logistic regression models to assess their individual and combined indicative values for 5-year postpartum AGM. Using receiver operating characteristic (ROC) curves with tenfold cross-validation, we tested the following models: Model 1: known risk factors of postpartum AGM including age at follow-up, ethnicity, BMI at follow-up, family history of T2D, and the cumulative number of GDM episodes among all live pregnancies at follow-up; Model 2: Model 1 and additional baseline glucose parameters (fasting and 2-h glycemic levels at study entry); Models N: Model 2 and additional significant metabolites identified in our study, using a stepwise approach [e.g., each metabolite, each pair (if applicable), each triplet (if applicable), each quartet (if applicable), all metabolites (if applicable)]. We ranked the models by R2 and area under the curve (AUC) values and took the model with the highest values as the best fitting. We further verified the metabolites in the final model by their MS/MS spectra using pure chemical standards if commercial standards were available. The complete procedures of data processing and statistical analysis to discover metabolite features and identify candidate metabolites are illustrated in Supplementary Fig. 4.
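The stepwise enumeration of metabolite combinations and the AUC used to rank models can be sketched as follows. This is a simplified illustration under stated assumptions: it does not fit the logistic regression models or perform the tenfold cross-validation described above, the function names are hypothetical, and the AUC is computed with the standard pairwise (Mann–Whitney) formulation from already-predicted scores.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def metabolite_subsets(metabolites: Sequence[str]) -> List[Tuple[str, ...]]:
    """Enumerate every non-empty combination of metabolites: singles,
    pairs, triplets, and so on up to the full set."""
    subsets: List[Tuple[str, ...]] = []
    for size in range(1, len(metabolites) + 1):
        subsets.extend(combinations(metabolites, size))
    return subsets


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve: the probability that a randomly chosen
    case (label 1) scores higher than a randomly chosen non-case
    (label 0), counting ties as one half."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Candidate metabolite sets that could be added to Model 2, using the
# four metabolites of Model 3 as an example
candidates = metabolite_subsets(
    ["p-cresol sulfate", "linoleic acid", "glycocholic acid", "LysoPC(16:1)"]
)
```

In a fuller pipeline, each entry of `candidates` would be added to the Model 2 covariates, the model refitted within each cross-validation fold, and `roc_auc` applied to the held-out predictions.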

Pathway analysis

We performed pathway analysis based on the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway database using MetaboAnalyst (Version 4.0). This step aimed to investigate the published biological functions of the significant metabolites identified in the OPLS-DA and univariate analysis. We also plotted a metabolic map (Fig. 3) to show the identified metabolic network.

We expressed data as median (interquartile range, IQR) or mean (standard deviation, SD) as appropriate. We conducted all statistical analyses using SIMCA 13.2 (Umetrics, Umea, Sweden), MetaboAnalyst (Version 4.0), and R Software (Version 3.5.0), and deemed significance at a two-sided p-value less than 0.05.

Ethics approval

We conducted the study according to the tenets of the Declaration of Helsinki and obtained approval from the SingHealth Centralized Institutional Review Board and the National Health Group’s Domain Specific Review Board of Singapore. We obtained written informed consent from all participants before any testing.