Introduction

Hypertensive disorders of pregnancy (HDP) and preterm birth (PTB) are significant causes of obstetric morbidity1. It remains unclear whether the subtypes of HDP, pre-eclampsia (PE) and gestational hypertension (GH), are distinct conditions or on a spectrum, though they share common risk factors2,3. The subtypes of PTB, spontaneous or medically indicated4, also share risk factors5. Knowledge about the causes of both HDP and PTB is limited6; HDP has been related to placentation, inflammation and progressive endothelial damage1, and PTB to activation of the hypothalamic–pituitary–adrenal axis and exaggerated inflammatory response7,8. Although several biomarkers have been linked to HDP and PTB, none are clinically useful in early pregnancy9,10.

Untargeted metabolomics captures signals for low molecular weight compounds from exogenous exposures (e.g., environmental chemicals, medication, and food), and endogenous metabolites produced by the host system that map to biochemical pathways. Untargeted metabolomic approaches have been used to investigate PE11,12,13,14,15,16,17,18,19 (few studies have examined GH specifically14); however, the key metabolites identified across studies have not been consistent3,6 and the pooled sensitivity of all single biomarkers is low10. Untargeted metabolomics has also been used to investigate PTB, and, again, the key metabolites have not been consistent across studies20. Differences in study design, population, biospecimens, and included/overlooked confounders likely contribute to this variability20.

This study therefore performed an untargeted metabolomic analysis of first-trimester serum to further reveal biomarkers for both HDP and PTB. We extend previous analyses by incorporating two types of spectroscopic analysis, providing the evidence for metabolite identification and annotation, and using modelling approaches to determine the predictive value that metabolites add over known risk factors.

Methods

Study population

Serum specimens, collected between 2011 and 2016, were obtained from the Global Alliance to Prevent Prematurity and Stillbirth (GAPPS) repository (demographic and lifestyle information on the registry, Table S1). Two overall case groups were identified: HDP and PTB. In total, 51 cases of HDP (18 cases of PE and 33 cases of GH) and 53 cases of PTB (42 spontaneous) were frequency-matched for gravidity to 109 controls. Pregnant women ≥ 14 years of age can be enrolled in the GAPPS repository; exclusion criteria are receipt of narcotics in the previous 12 h, active labor, and multiple gestation. Participants were enrolled during pregnancy, usually at a prenatal care appointment, from the University of Washington Medical Center, Seattle; Swedish Medical Center, Seattle; and Yakima Valley Memorial Hospital, Yakima, WA. All participants were followed throughout pregnancy, delivery and up to 10 weeks postpartum.

Clinical definitions

Participant medical records were abstracted by GAPPS study personnel, and recorded HDP (PE and GH) and PTB (births < 37 weeks gestation) were used to define case groups. Two overall case groups were examined, then their subgroups: all HDP, then PE and GH separately; all PTB, then limited to spontaneous PTB (sPTB; dataset did not allow for distinguishing other subtypes of PTB). HDP and PTB cases were selected separately and did not overlap, but according to the medical records abstraction, 7 PE cases gave birth preterm and 1 case of PTB had GH. A sensitivity analysis including these in the other respective case groups was conducted and did not change the results, so the original case groups are used in this report.

Covariate collection

Questionnaires were used to collect information on demographics, health history, diet, and home and work environment. Three questionnaires were collected on participants who enrolled before May 20, 2014; afterwards, 5 questionnaires were collected. When necessary, covariate definitions were harmonized across questionnaires.

Metabolomics analysis

Details of the sample preparation, data acquisition, data preprocessing, metabolite identification and annotation, and statistical analysis are provided in the Supplementary material.

Serum samples were collected from participants during the first trimester of pregnancy (gestational age range: 6+1–13+6 weeks), and prepared according to published methods21,22. Untargeted metabolomics data were acquired using a Vanquish UHPLC system coupled with a Q Exactive HF-X Hybrid Quadrupole-Orbitrap Mass Spectrometer (UPLC-HR-MS; Thermo Fisher Scientific) and using a Bruker Avance III 700 MHz NMR spectrometer. The UPLC-HR-MS data were processed using Progenesis QI (Waters Corporation), and the NMR data were processed using Chenomx NMR Suite 8.4. UPLC-HR-MS signals were identified or annotated through matching with an in-house physical standards library or public databases.

Statistical analysis

The caret R package (version 6.0-84) and RStudio 3.6.1 were used for the UPLC-HR-MS model selection procedure based on cross validation, and SAS software 9.4 was used for the remaining data analysis. All demographic, behavioral, medical, and lifestyle factors available or harmonizable across questionnaires and potentially associated with exposure and the outcomes were examined as possible confounders. Covariates that were distributed differently in cases and controls with p value < 0.2, with the exception of previous history of complications (because causes of previous events might also cause events in the current pregnancy23), were included in the initial stepwise models. Because controls were frequency-matched to cases on gravidity, “gravidity” was included in each of the stepwise models regardless of significance level.
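The covariate screening rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say which test was used to compare cases and controls, so a Pearson chi-square test on a 2×2 table is assumed here, and the function names, covariate names, and counts are all hypothetical.

```python
import math


def chi2_2x2_pvalue(a, b, c, d):
    """Pearson chi-square p value (1 df, no continuity correction) for a 2x2 table.

    a, b = cases with / without the characteristic
    c, d = controls with / without the characteristic
    NOTE: the choice of test is an assumption, not taken from the paper.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # degenerate table: no evidence of a difference
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of a chi-square variable with 1 df
    return math.erfc(math.sqrt(stat / 2.0))


def screen_covariates(tables, threshold=0.2, forced=("gravidity",)):
    """Return covariates that enter the initial stepwise model.

    Forced covariates (the matching variable) are kept regardless of p value;
    the others enter only if their p value is below the threshold.
    """
    selected = list(forced)
    for name, table in tables.items():
        if name not in selected and chi2_2x2_pvalue(*table) < threshold:
            selected.append(name)
    return selected


# Hypothetical counts: (cases with, cases without, controls with, controls without)
example_tables = {
    "ever_smoker": (15, 38, 19, 90),
    "college_degree": (26, 27, 55, 54),
}
print(screen_covariates(example_tables))
```

With these made-up counts, smoking passes the p < 0.2 screen and the education variable does not, while gravidity is retained by design.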

Each case group was modeled separately using univariate and multivariable logistic regression models. All signals meeting the selection criterion, regardless of being identified/annotated, were considered in the analysis. The first set of multivariable regression models utilized all 3,122 signals, and due to the high dimensionality of the data we utilized a multi-step approach based on a fivefold cross validation (supplementary materials)24. In addition, the 12 exogenous metabolites that were identified or annotated based on the untargeted LC–MS in-house physical standards library and differed by group were modeled using stepwise multivariable regression with p < 0.05 for retention; the covariates identified above were included regardless of the p value. The broad spectrum NMR data (195 bins) was modeled using stepwise multivariable regression. Stepwise regression was used for the exogenous signals and the NMR data due to the lower dimensionality of the data.
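As a rough illustration of the fivefold cross-validation splitting, the sketch below assigns samples to five folds while keeping the case-to-control ratio similar in each fold. Stratification by case status and the function names are assumptions; the multi-step selection procedure itself is described in the supplementary materials and is not reproduced here.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, balancing cases and controls."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal shuffled indices round-robin so each fold gets a similar share
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def train_test_splits(labels, k=5, seed=0):
    """Yield (train_indices, test_indices) with each fold held out once."""
    folds = stratified_folds(labels, k, seed)
    for held_out in range(k):
        test = sorted(folds[held_out])
        train = sorted(
            i for f, fold in enumerate(folds) if f != held_out for i in fold
        )
        yield train, test


# Group sizes from the study population: 51 HDP cases (1) and 109 controls (0)
labels = [1] * 51 + [0] * 109
for train, test in train_test_splits(labels):
    n_cases = sum(labels[i] for i in test)
    print(len(train), len(test), n_cases)
```

A candidate model would be fitted on each training set and scored on the held-out fold, so that every sample contributes to evaluation exactly once.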

The area under the receiver operating characteristic curve (AUC) was used to evaluate the performance of the prediction models. The strength and precision of the associations with individual metabolites were compared based on the odds ratios and widths of the confidence interval respectively.
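The two evaluation quantities named above can be written out directly. The sketch below computes the AUC as the proportion of case-control pairs in which the case receives the higher predicted score, and converts a logistic regression coefficient and its standard error into an odds ratio with a Wald 95% confidence interval. The function names and example values are hypothetical, and measuring interval width on the log scale is an assumption rather than something stated in the text.

```python
import math


def auc(scores, labels):
    """AUC: probability that a random case outranks a random control (ties count 1/2)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald 95% CI from a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def log_ci_width(lower, upper):
    """Width of the CI on the log-odds scale (smaller means more precise)."""
    return math.log(upper) - math.log(lower)


# Hypothetical predicted probabilities and case status (1 = case, 0 = control)
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
status = [1, 0, 1, 0, 0]
print(round(auc(scores, status), 3))

# Hypothetical coefficient: log-odds of -0.693 (OR of about 0.5) with SE 0.25
or_, lo, hi = odds_ratio_ci(math.log(0.5), 0.25)
print(round(or_, 2), round(lo, 2), round(hi, 2), round(log_ci_width(lo, hi), 2))
```

On the log scale the interval width is simply 2 × 1.96 × SE, which is why a narrower interval corresponds to a more precisely estimated association.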

Pathway enrichment analysis

GeneGo MetaCore (Clarivate Analytics, PA) was used to assess the enrichment of perturbed metabolic pathways. Metabolites were included in this analysis if they had an ontology level (OL) of OL1 (RT, Mass, and MS/MS) or OL2a (RT and Mass), or were determined by NMR (Tables S2–S11). MetaCore uses the hypergeometric test, which quantifies the over-representation of the selected metabolites in a pathway, together with the false discovery rate (FDR). A p value < 0.01 was considered indicative of significant pathway enrichment.
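For readers unfamiliar with this type of enrichment statistic, the following is a minimal sketch of a hypergeometric over-representation test followed by a Benjamini-Hochberg FDR adjustment. It is not MetaCore's implementation, whose internal details are not described here; the function names, pathway labels, and counts are hypothetical.

```python
from math import comb


def hypergeom_pvalue(N, K, n, k):
    """P(X >= k): chance of seeing at least k pathway members among n selected
    metabolites, when K of the N background metabolites belong to the pathway."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 1000 background metabolites, 50 perturbed metabolites,
# and (pathway size, perturbed members in pathway) for three made-up pathways
pathways = {"pathway_A": (20, 6), "pathway_B": (40, 3), "pathway_C": (15, 1)}
names = list(pathways)
raw = [hypergeom_pvalue(1000, K, 50, k) for K, k in pathways.values()]
for name, p, q in zip(names, raw, benjamini_hochberg(raw)):
    print(name, f"p={p:.4g}", f"FDR={q:.4g}")
```

A pathway would then be flagged as enriched when its p value falls below the chosen cut-off (p < 0.01 in this study).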

This secondary analysis of de-identified data and samples was ruled not human subjects research by the Tulane Institutional Review Board. All participants provided informed consent to recruitment into the GAPPS repository25.

Results

The largest proportion of included samples were from Yakima Valley Memorial Hospital, with 14.1% of participants from the University of Washington Medical Center and 14.1% from Swedish Medical Center; there was no statistically significant association between pregnancy complications and center where participants were enrolled. The mean participant age was between 29 and 31 for all case groups and for the controls (Table 1). A large majority had been pregnant before (78% of controls, 72–79% depending on case group). The majority of the participants were white (72% of controls, 69% of HDP cases, 53% of PTB cases). Early-pregnancy BMI of those with hypertensive disorders (mean for HDP cases, 34.7, SD 9.6) was higher than controls (mean 29.3, SD 7.8). Cases of overall PTB (29% ever smokers) and GH (39% ever smokers) were more likely to have smoked than controls (17%). Cases were more likely to have used street drugs prior to pregnancy (overall HDP 17%, overall PTB 15%) than controls (6%). Other variables that differed from controls for at least one case group are listed in Table 1. Besides gravidity, for the stepwise modeling, BMI and illegal drug use were included in models of HDP, GH, and PE; obesity and illegal drug use were selected for the PTB model; and no covariates remained in the sPTB model.

Table 1 Demographic, medical, and lifestyle characteristics of cases and controls.

When examined one at a time, 337 signals were associated with HDP (p < 0.1) with 173 metabolites being identified or annotated (Table S2). When GH and PE were examined individually, 344 signals (with 173 being identified or annotated) were associated with GH (p < 0.1, Table S3), while 446 (with 189 being identified or annotated) were associated with PE (p < 0.1, Table S4).

Models including signals/metabolites determined by UPLC-HRMS (Table 2) showed significant improvements in the AUC over models constructed using only covariates. Among the signals/metabolites retained in the HDP models, the most precise associations were with an unknown signal with a neutral mass of 746.6045 Da and a retention time of 0.59 min (0.59_746.6045n, reduced odds), and a signal annotated as pilocarpine (PDc, increased odds). The strongest effect sizes were for unidentified signals at 8.66_762.1452 m/z and 12.74_412.2842 m/z, both of which were associated with reduced odds. A signal annotated as 2,6-Di-tert-butyl-4-hydroxymethylphenol (BHT-OH) through matching with a public database by exact mass and MS/MS spectra (PDa) was strongly associated with GH. For the PE model, 4 signals were included; among them, an unidentified signal at 6.30_477.7721 m/z had the most precise association, while cerasinone (PDb) had the strongest effect size and bolasterone the most definite annotation (PDa) (Table 2). The signals/metabolites included in the overall HDP model were not the same as those included in the models of each type of HDP, but signals/metabolites included in the final HDP model were associated with either GH or PE, and usually both, when examined individually (Tables S2–S4).

Table 2 Metabolites/signals that predicted HDP, GH, PE, PTB, and sPTB in multiple logistic regression models (LC–MS).

Over 246 signals were individually associated with PTB (p < 0.1), with 189 metabolites identified or annotated (Table S5); 298 signals were individually associated with sPTB (p < 0.1), with 135 metabolites identified or annotated (Table S6). In multiple logistic regression analysis, 5 signals were included in the PTB model, while 6 signals were included in the sPTB model (Table 2), all with similar precision (variance) and effect size (odds ratio). One signal, with an RT of 15.66 min and an exact neutral mass of 770.4609 Da, was retained in both models. All of these metabolites were annotated with an evidence basis of PDc or below.

In the NMR analysis (Table 3; unadjusted results in Tables S6–S10), bins containing signals that could be derived from asparagine/albumin were associated with HDP (OR 0.17, 95% CI 0.04–0.74), and bins containing signals that could be derived from asparagine/N,N-dimethylglycine/trimethylamine were associated with PE (OR 0.16, 95% CI 0.05–0.52). Threonine and urea were associated with reduced risk of PTB and sPTB, respectively, but did not add significantly to the predictive value of the model.

Table 3 NMR metabolites associated with HDP and PTB in cross-validated multiple logistic regression model (NMR).

An additional aim of our study was to evaluate the correlation between environmental exposures and pregnancy complications. Over 20 metabolites derived from exogenous compounds were identified or annotated (OL1, OL2a, and OL2b), and over a dozen metabolites derived from exogenous exposures differentiated cases from controls (univariable logistic regression analysis, p < 0.1). These included metabolites of bisphenols, parabens, phthalates, polyphenols, and medications (Tables S2–S6). Monohexyl phthalate was associated with HDP and GH (Table 4), while salicylamide was associated with PE. (R,S)-N-Acetyl-S-(2-hydroxy-3-buten-1-yl)-l-cysteine was associated with reduced odds of sPTB.

Table 4 Association between exposure of exogenous chemicals and pregnancy complications by stepwise modeling.

For pathway analysis, metabolites that were perturbed between cases and controls with an evidence basis of OL1 (RT, MS, MS/MS) or OL2a (RT, MS) included individual steroid hormones, acetylcarnitines, nucleosides, hydroxyl short-chain fatty acids, and exogenous metabolites. Thirty pathways were found to be associated with HDP, with 24 associated with GH and 37 associated with PE, while 15 pathways were associated with PTB and 9 with sPTB (Fig. 1 and Table 5). Five perturbed pathways were associated with all the investigated complications: aminoacyl-tRNA biosynthesis, l-threonine, renal secretion of organic electrolytes, and urea cycle. HDP, GH and PE also overlapped substantially in pathways related to cortisol biosynthesis, cholesterol and sphingolipid transport, lipoprotein metabolism, and metabolic syndrome/type 2 diabetes. Pathways associated with PTB and/or sPTB were related to activation of cortisol production in depression, renal secretion of drugs, the transcriptional role of the vitamin D receptor in regulating genes involved in osteoporosis, immune responses, and tyrosine metabolism.

Figure 1

Overlap of pathways by complications under study. Venn diagram of metabolic pathways perturbed between cases (i.e., GH, HDP, PE, PTB, and sPTB) and controls. Pathway enrichment was conducted with GeneGo MetaCore using Enrichment by Pathway Map, and the cut-off for pathway enrichment was p < 0.01. Each section of the diagram is labeled with capital letters (A, B, C, D, E) and shows the number of pathways that were specific to a single phenotype (regions with a single capital letter) or shared between phenotypes (regions with a combination of letters). The list of pathways corresponding to each section is shown in Table 5.

Table 5 Enriched metabolic pathways perturbed between different case groups and control group (corresponding to Fig. 1).

Discussion

In this untargeted metabolomic analysis of first trimester serum samples, we identified and annotated several endogenous and exogenous metabolites associated with complications of pregnancy, and showed that metabolites significantly improved the predictive value of models over known risk factors. Fewer features differentiated cases from controls, and fewer of those features were identified or annotated, for PTB than for HDP; this may indicate that PTB is a more heterogeneous condition. The investigation used a discovery-based (i.e., untargeted) approach, which could lead to biomarker(s) useful in clinical practice. Unlike analyses that focused mainly on a few signals with identification/annotation13,16,26, we created models using all signals for a more comprehensive analysis. Some signals used in the modelling approach could be identified through retention time, mass, and fragmentation, while others were annotated through public databases or remained unknown. The identifications and annotations in our study are reported with evidence-based ontology levels, which is important for data comparison and harmonization in future collaborations.

Exogenous metabolites: Monohexyl phthalate was correlated with HDP and GH; in one previous study, phthalate metabolites were weakly associated with decreased blood pressure in the second trimester27. The correlation between salicylamide and PE may be due to the use of aspirin-like medication (such as Labetalol, 2-hydroxy-5-[1-hydroxy-2-[(1-methyl-3-phenylpropyl)amino]ethyl]benzamide monohydrochloride) in hypertensive women28. (In our study, salicylamide levels were higher for the 5 women, 4 cases and 1 control, who had chronic hypertension.) (R,S)-N-Acetyl-S-(2-hydroxy-3-buten-1-yl)-l-cysteine (MHB2) is a metabolite generated in vivo after exposure to 1,3-butadiene via smoking or air pollution29; the link we found between MHB2 and sPTB is consistent with previous studies finding associations with these toxicants30,31.

Individual metabolites, HDP: Our study identified multiple signals with strong predictive value for HDP. We attempted to match signals to our in-house library of standards, run under identical conditions to the study samples, as well as to public databases. These signals could not be identified using evidence of retention time and/or MS/MS spectral pattern. Therefore, we provided the tentative annotation and chromatographic/spectral information for those important signals, which might be helpful for identification/annotation using other data mining technologies in the future22,32. We found a large number of metabolic profiles that were significantly perturbed (p < 0.1) between cases and controls (Tables S2–S11 in supplementary materials). Although none of these identified/annotated metabolites was predictive enough to be used as a clinical biomarker, most of our findings in metabolic profiles (Tables S2–S11) are highly consistent with the New Zealand SCOPE cohort33, as well as other discovery-phase studies34,35. One of the signals with predictive value for PE matched to an androgen steroid hormone, and the PE-associated perturbation of steroid hormones was also reported in the SCOPE study33. Increased androgens are correlated with vascular dysfunction in HDP, interrupting oxygen and nutrient transport from the maternal blood supply36. In the GH model, 2,6-Di-tert-butyl-4-hydroxymethylphenol (BHT-OH, PDa) was predictive. This compound is a metabolite of 2,6-Di-tert-butyl-4-methylphenol (BHT), a synthetic phenolic antioxidant used widely in foods, polymers, and cosmetics to slow oxidation. Some BHT metabolites have been found to induce cellular DNA damage and the chemical was placed on the European Union watch list in 201537. Only elevated acylcarnitine and decreased taurine levels have repeatedly been found to relate to PE in previous metabolomic studies6. Neither was included in our final model, but butenylcarnitine and 3-hydroxyhexanoyl carnitine were associated with higher odds of HDP in univariate models (Table S2); no association was found with taurine.

Individual metabolites, PTB: Most of the signals retained in the final models for PTB and sPTB were identified with public database matching. Of the metabolites we found that were associated with PTB in this analysis, only threonine had been previously associated with PTB, with a negative association20. Our previous review of metabolomics and PTB found little consistency across studies, with only myoinositol, creatinine, histidine, and 5-oxoproline associated across multiple studies20. Among these, in our analysis, only histidine was weakly associated with PTB, and it was not retained in final models.

Common pathways: Pathways involved in protein synthesis (aminoacyl-tRNA biosynthesis), threonine metabolism, urea cycle, and renal secretion of organic electrolytes were perturbed in both HDP and PTB. Protein synthesis and amino acid metabolism play important roles in maternal and fetal health. Pregnant women who have inherited metabolic disorders in protein and amino acid metabolism are more likely to develop pregnancy complications, indicating burdens in urea nitrogen clearance38. A previous study of late-onset pre-eclampsia also found associations with aminoacyl-tRNA synthesis (though they were not statistically robust)39. Perturbation of the renal secretion of organic electrolytes pathway may indicate changes in the kidney proximal tubule related to xenobiotic metabolism40.

Pathways and individual complications: Multiple pathways were perturbed in the early part of pregnancies that later developed HDP. Several lipid-related pathways were associated with HDP, consistent with the disruptions of lipid metabolism that have been demonstrated in HDP41,42. Leucine, valine, and isoleucine metabolism, related to both HDP and PTB in these data, was previously associated with late-onset pre-eclampsia39. 4-hydroxyglutamate, identified as a strong predictor of PE in a previous study16, was not associated with PE in our analysis. However, it is involved in the arginine-proline metabolism pathway, one of the pathways identified for HDP, and is a substrate that produces 4-hydroxy-2-oxoglutarate, an intermediate on several pathways identified in this analysis. Pathways related to oxidative stress, nitric oxide signaling, and inflammatory signaling were associated only with PE, suggesting that oxidative stress and inflammation leading to severe damage in endothelial function might contribute to the more severe pathology of PE. Fewer pathways were associated with PTB and the associations were weaker, but some were intriguing. For instance, the pathways related to activation of cortisol pathways in major depressive disorder were perturbed, and cortisol and depression have both been previously related to PTB43.

Strengths of the study include the first-trimester sampling and strong QC for both the sample collection and the spectroscopic analyses. Limitations include the small sample size, lack of detailed information on subtypes of PE and PTB, lack of a replication sample, the single-timepoint sample, and the limited number of African-American participants.

This study contributes to the growing literature on metabolites associated with pregnancy complications and suggests that perturbations of several common pathways are associated with both HDP and PTB. The metabolomic field needs to report the evidence basis for identifications and annotations in order to increase the usability of reported findings.