Delirium is a condition characterized by an acute change and fluctuation in attention, thinking, and consciousness. Postoperative delirium complicates the courses of 15–53% of older adults undergoing major surgical procedures1,2, and is associated with higher rates of postoperative complications3, longer hospital stays4, higher rates of discharge to extended care facilities5, and increased mortality4,5,6. The economic impact of delirium associated health care expenditures is over $164 billion per year in the U.S. alone7.

Currently, delirium is solely a clinical diagnosis, and there is no laboratory test or biomarker to facilitate its diagnosis. Additionally, delirium etiology and pathogenesis are poorly understood. Previous research has suggested several mechanisms for delirium pathogenesis, which include: (1) increased expression of inflammatory markers found in the blood and central nervous system8,9,10,11,12; (2) alterations in plasma levels of precursor amino acids leading to dysfunction in neurotransmitter systems13; (3) oxidative stress leading to cerebral dysfunction14,15,16; (4) acute or chronic stress response resulting in aberrantly high levels of glucocorticoids17,18; and (5) dysregulation of the circadian rhythm15,19,20,21.

Metabolomics is a high-throughput quantitative approach to study the metabolome—the collection of small metabolites found in a system. These smaller metabolic molecules are often the end products of the biochemical processes in a cell and are particularly sensitive to endogenous and exogenous stimuli22. Differences in their concentration levels provide an efficient way to monitor and detect alterations in specific cellular pathways. Metabolomics can reveal transient biochemical changes that are closely aligned with the disease state of a system22.

Metabolomics has been extensively used to characterize cellular changes in disease phenotypes and facilitate identification of disease-specific markers23,24,25. Targeted metabolomics approaches have been employed previously in identifying novel biomarkers and distinguishing between diseased and healthy cohorts26,27. Only a few prior studies have explored metabolic changes associated with delirium and illuminated potential underlying biological mechanisms28,29,30,31.

To obtain more insights into the pathophysiological mechanisms of delirium as well as to identify potential risk and disease markers, we performed targeted metabolomic profiling both at preoperative and postoperative time points. We identified multiple metabolites linked to delirium before and after surgery and demonstrated enrichment of several metabolic pathways. Our associative, predictive, and systems analysis may help to identify pathophysiologic pathways to prioritize for development of diagnostic and therapeutic regimens for postoperative delirium.


In Fig. 1, we show the overall workflow of the experimental and computational steps applied in this paper.

Figure 1

Experimental design workflow. Key steps in our experimental design included: sample preparation, mass spectrometric qualitative and quantitative analysis, data preprocessing, univariate and multivariate statistical analysis, systems analysis applied to the univariate findings, and predictive modeling using multivariate results.

Sample characteristics

Table 1 shows the sample demographics and clinical variables of the metabolomics patient samples. On average, patients in the matched sample (n = 104; 52 matched pairs) were 77 years old, 53% female, and 49% had a vascular comorbidity. The characteristics of the delirium cases (DEL) and no-delirium controls (CNT) were similar, indicating a successful matching procedure. In terms of non-match variables, on average both cases and controls were overweight and showed little evidence of malnutrition. Both cases and controls reported poor sleep in the hospital, which is typical32, with cases experiencing slightly more sleep disruption. The prevalence of diabetes medication use was low and similar in cases and controls. We did not control for the variables that were not used in the matching process because of the risk of over-controlling for variables that are potentially along the causal pathway to delirium33,34.

Table 1 Sample characteristics in the matched cohorts.

Data pre-processing

We identified 250 metabolites as “present,” i.e., detected at a level suitable for downstream analysis (out of 315 metabolites measured). Missing data accounted for 3.15% of total data acquired from present metabolites (Supplementary Table S1), well below the 15–26% of missing data typically seen in metabolomics studies35,36,37,38,39. After data imputation, signal drift correction, and normalization, the percent relative standard deviation (RSD) was less than 5% for 87% and less than 10% for 98% of all metabolites in the pooled QC samples (Supplementary Tables S2, S3, S4).
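As a minimal sketch of how the QC summary above could be computed, the following example calculates the percent RSD of each metabolite across pooled QC injections and the fraction of metabolites falling below a threshold. The function names and the intensity values are hypothetical and serve only as an illustration; they are not study data.

```python
from statistics import mean, stdev

def percent_rsd(values):
    """Percent relative standard deviation: 100 * sample SD / |mean|."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * stdev(values) / abs(m)

def fraction_below(qc_intensities, threshold):
    """Fraction of metabolites whose pooled-QC %RSD is below `threshold`."""
    rsds = [percent_rsd(v) for v in qc_intensities.values()]
    return sum(r < threshold for r in rsds) / len(rsds)

# Hypothetical pooled-QC intensities (made-up numbers, not study data).
qc = {
    "metabolite_A": [100.0, 102.0, 98.0, 101.0],  # tight replicate spread
    "metabolite_B": [50.0, 60.0, 40.0, 55.0],     # wide replicate spread
}

for name, values in qc.items():
    print(name, round(percent_rsd(values), 2))
print("fraction with RSD < 5%:", fraction_below(qc, 5.0))
```

In this toy input only one of the two metabolites has an RSD below 5%, so the reported fraction is one half.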

Metabolites altered in delirium group at PREOP

Thirty-seven metabolites were found to be altered in DEL at PREOP (nominal p value < 0.05 in at least two tests) when compared with CNT (Fig. 2a; Table 2a; Supplementary Table S5). Four of these metabolites had a binomial test BH-corrected p value < 0.05: trehalose-6-phosphate, phenyllactic acid, creatine, and N-acetyl-l-alanine. We performed 10,000 tenfold splits where at each iteration 1/10th of the samples were left out in each group, and the three statistical tests were run on the remaining samples. Twenty-eight of the thirty-seven metabolites had a nominal p value < 0.05 in at least two tests more than 50% of the time, thus showing high robustness (Supplementary Table S6).
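The split-based robustness check can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: it uses only a two-sided exact sign test as a stand-in for the three tests, drops whole matched pairs rather than samples from each group, and uses hypothetical function names and made-up paired differences.

```python
import random
from math import comb

def sign_test_p(differences):
    """Two-sided exact sign test on paired differences (zeros are dropped)."""
    pos = sum(d > 0 for d in differences)
    neg = sum(d < 0 for d in differences)
    n = pos + neg
    if n == 0:
        return 1.0
    tail = sum(comb(n, i) for i in range(min(pos, neg) + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def split_robustness(differences, n_iter=1000, leave_out=0.1, alpha=0.05, seed=0):
    """Fraction of leave-out splits in which the test stays below `alpha`."""
    rng = random.Random(seed)
    n_keep = len(differences) - round(leave_out * len(differences))
    hits = 0
    for _ in range(n_iter):
        kept = rng.sample(differences, n_keep)  # drop ~1/10 of the pairs
        if sign_test_p(kept) < alpha:
            hits += 1
    return hits / n_iter

# Hypothetical DEL - CNT differences for one metabolite in 52 pairs.
diffs = [1.0] * 45 + [-1.0] * 7
print("fraction of splits significant:", split_robustness(diffs))
```

A metabolite would then be treated as robust when this fraction exceeds 0.5, mirroring the majority rule described in the text.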

Figure 2

Statistical approaches to discovering significant metabolites at PREOP and POD2. For (a, b), we applied both parametric [t-test (T)] and nonparametric [Wilcoxon Rank (W) and binomial (B)] statistical tests to account for the degree, direction, and rank of difference between delirium (DEL) and control (CNT) groups at both the preoperative (PREOP) and postoperative day 2 (POD2) time points. To correct for multiple hypothesis testing, we used the Benjamini–Hochberg (BH) procedure. A metabolite was considered to have differentially quantified concentrations if it had a BH-corrected p value < 0.05 in at least two statistical tests. (a) At PREOP, none of the metabolites met our strict criteria for differential concentration. Four metabolites had a BH-corrected p value < 0.05 only in the binomial test (see text). Systems biology was performed using the 37 metabolites that passed two or more tests with a nominal p value < 0.05. (b) At POD2, there were 53 metabolites that met our criteria for differential concentration. These metabolites were used as input for systems analysis. Score plots for the OPLS-DA analysis using the (c) PREOP and (d) POD2 data. Ellipses represent clustering based on the Mahalanobis distance for outlier detection (orange: delirium, blue: control, and black: all samples). Metabolites with the most extreme loadings (positive and negative) for (e) PREOP and (f) POD2 are noted. These metabolites had the greatest impact on the model.

Table 2 Metabolites with significant differential concentrations between delirium and control groups.

Applying one predictive and one orthogonal component, the OPLS-DA model produced moderate separation between PREOP DEL and CNT, and minimal within-group variation (Fig. 2c). Despite the separation achieved in Fig. 2c and the high total variation explained by the model (R2Y(cum) = 0.719), the estimated predictive performance of the model was modest (Q2 = 0.203). Variable importance in projection (VIP) scores, which reflect both the loading weights and variability of response, bolstered previous univariate findings (Table 2a; Fig. 2e). The OPLS-DA model was initially validated using permutation testing and further tested for robustness by CV-ANOVA. The p values for permutation testing were pR2Y < 0.159 and pQ2 < 0.001, and the CV-ANOVA p value was < 2.4E−4, thus showing a statistically significant validation of the separation achieved between PREOP DEL and CNT when using this model. There were 11 metabolites with a VIP score > 2.0 (VIP > 1.0 is considered significant in distinguishing between classes), which included three of the four metabolites that showed a binomial test BH-corrected p value < 0.05. A support vector machine (SVM) prediction model40 using these 11 metabolites as predictors yielded an area under the curve (AUC) of 83.80% for the associated receiver operating characteristic (ROC) curve (Supplementary Fig. S2).
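To make the AUC figure concrete, the sketch below computes the area under the ROC curve from classifier scores using its rank interpretation: the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The SVM itself is not reproduced here; the function name and the scores are hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the probability that a case outranks a control (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical decision scores from any classifier (not study data).
del_scores = [0.9, 0.8, 0.4]
cnt_scores = [0.7, 0.3, 0.2]
print("AUC:", roc_auc(del_scores, cnt_scores))
```

Here eight of the nine case-control pairs are ranked correctly, so the AUC is 8/9. The pairwise loop is quadratic in sample size, which is acceptable for a cohort of 52 matched pairs.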

Metabolites altered in delirium group at POD2

Fifty-three metabolites were found to have had significantly different concentrations between DEL and CNT at POD2 with a BH p value < 0.05 in at least two tests (Fig. 2b; Table 2b; Supplementary Table S7). Six of these metabolites were also altered at PREOP with three of them reversing the FC directionality. PREOP and POD2 associated metabolites represent candidate risk and disease markers, respectively. Therefore, the metabolites and/or their FC sign may be different because they represent different underlying biological mechanisms. Based on the 10,000 tenfold cross validation split datasets, all of the metabolites had p < 0.05 in at least two tests in all of the cases, and 28 of the metabolites had BH-corrected p < 0.05 in at least two tests in more than 50% of the cases (Supplementary Table S6).

Using one predictive and two orthogonal components, the OPLS-DA model produced clear separation between POD2 DEL and CNT with R2Y(cum) = 0.848 and predictive performance with Q2 = 0.344 (Fig. 2d,f). All VIP scores were at or above 1.0, supporting univariate findings (Table 2b; Fig. 2). The model was validated using permutation testing and evaluated for robustness with CV-ANOVA. The p values for permutation testing were pR2Y < 0.004 and pQ2 < 0.001, and the CV-ANOVA p value was < 9.0E−7. These results support a statistically significant separation between POD2 DEL and CNT when using this model. Based on both the statistical analysis and predictive modeling, metabolites at POD2 show significantly better discriminative power than at PREOP.

Pathway enrichment analysis

At PREOP, the valine, leucine, and isoleucine biosynthesis pathway was the most significantly enriched (FDR < 0.04) (Table 3a; Supplementary Table S8; Supplementary Fig. S3). This canonical pathway is found in many species but the metabolic reactions specific to humans are noted by dashed boxes in Fig. 3a. These four reactions require the enzyme serine dehydratase or branched chain amino acid aminotransferase41,42,43. These chemical processes include eight metabolites, and of these, three of them were found to have significant differential concentrations (nominal p value < 0.05) at PREOP: 2-oxobutanoate, isoleucine, and valine. All three metabolites were downregulated in DEL (Fig. 3a).

Table 3 Biochemical pathways enriched with significant metabolites at preoperative (PREOP) and postoperative day 2 (POD2).
Figure 3

Pathway analysis of metabolites with significant differentially quantified concentrations using MetaboAnalyst81,82,83. Red indicates upregulation in DEL and green denotes downregulation. Grey represents metabolites not included in our targeted metabolomics protocol. Superscripts signify direction of change for the metabolites that did not exhibit significant differential concentrations. The +, −, and × superscripts indicate upregulated (in DEL), downregulated (in DEL), and signal too low or not present, respectively. (a) At PREOP, the valine, leucine, and isoleucine biosynthesis pathway was the only one enriched with metabolites showing significantly different concentrations (FDR < 0.04). Dashed boxes represent the reactions that are specific to humans. (b) At POD2, the citrate cycle pathway was the most significantly enriched.

At POD2, three pathways were found to be significantly enriched with our significant metabolites (N = 53) (Table 3b, Supplementary Table S8, Supplementary Fig. S3). The citrate cycle was the most significant pathway (FDR < 0.0019). This pathway included six significant metabolites which were all upregulated in DEL at POD2 (Fig. 3b). Furthermore, in Fig. 3b we show one additional metabolite, itaconate/itaconic acid, which is increased in delirium cases, but is not part of the classical citrate cycle. Itaconic acid is directly linked to the metabolite aconitate in this pathway.


In this paper, we applied targeted metabolomics to study delirium both pre- and postoperatively using a nested matched case-control design that maximizes efficiency and control of confounding factors. We identified metabolites associated with delirium both at PREOP and POD2 suggesting potential risk and disease markers. Our 11-metabolite predictive signature at PREOP, which achieved an AUC close to 84%, can be combined with other biomarker panels in a multi-omics approach. Systems biology analysis suggests that the ‘valine, leucine, and isoleucine biosynthesis’ and ‘citrate cycle’ pathways may be dysregulated in delirium. These altered pathways coincide with alternative energy pathways previously implicated in delirium pathogenesis in several studies15,31,44. Our findings suggest that alterations in primary and secondary energy production, amino acid synthesis metabolic pathways, and increased oxidative stress may be involved in the etiology and pathophysiology of delirium.

Metabolomics provides new insights into metabolic phenotypes and dysregulations of metabolic pathways underlying human disease and adds an important avenue for biomarker and therapeutic target discovery. However, experimental and computational challenges such as sample size, batch effect, peak drift, and data normalization hamper wide-spread use45. We established a robust workflow by using internal standards and pooled QC samples that provided reliable data for downstream analysis (RSD < 10% for > 98% of the metabolites in pooled QC samples).

Our study is one of only a few metabolomics studies of delirium. Pan et al. characterized the preoperative cerebrospinal fluid (CSF) metabolome of older patients (≥ 65 years) who developed delirium following elective hip or knee arthroplasty. They found that spermidine, putrescine, and glutamine were significantly upregulated in delirium patients compared to controls30. In a blood-based metabolomics study of delirium, Guo et al. took an untargeted metabolomics approach in characterizing preoperative serum metabolites of older surgical patients (between 65 and 80 years) undergoing hemiarthroplasty for hip fractures. They identified four metabolites associated with a higher risk of postoperative delirium (S-methylcysteine, linolenic acid, eicosapentaenoic acid, and linoleic acid) with alterations in energy metabolism and oxidative stress pathways29. In a follow-up study, Guo et al. profiled both PREOP and POD1 samples and reported imbalances between branched-chain and aromatic amino acids with perturbations in tricarboxylic acid cycle and oxidative stress pathways31. Although the specific metabolites and direction of dysregulation were not always consistent across the three studies (Supplementary Table S9), there was agreement in the perturbed pathways.

Six metabolites were significant at both time points; three displayed opposite fold-change directions (Table 2). This observation could be because the significant metabolites at the two time points reflect different clinical/biological mechanisms (risk vs. disease factor). One of these, 2-oxobutanoate, is part of the 11-metabolite predictor used for the SVM prediction model at PREOP. Enrichment analysis using the 11-metabolite predictor revealed that five metabolites (N-acetyl-l-alanine, nicotinamide, creatine, 2-oxobutanoate, creatinine) are involved in the metabolism of amino acids and two metabolites (uracil, uridine) are linked to pyrimidine ribonucleoside degradation. Uracil was the only metabolite linked to delirium by Guo et al.31. Disruption in amino acid metabolism has been linked to Alzheimer’s disease and dementia46,47. Another one of these metabolites, N-acetyl-glutamate (NAG), is a crucial enzymatic cofactor for the first step in the urea cycle and is essential for liver ureagenesis; reduced levels are associated with Alzheimer’s disease48 (Supplementary Table S10).

Our systems biology approach indicates possible downregulation of the valine, leucine, and isoleucine biosynthesis pathway at PREOP in patients who went on to develop delirium following surgery. Valine, leucine, and isoleucine are essential branched-chain amino acids (BCAA). Essential amino acids are not naturally synthesized in the body and are obtained through diet. BCAAs contribute to energy production and are key precursors in glutamate synthesis49. They play critical roles in the biochemistry of the brain and are important building blocks in proteinogenesis. The availability of these free-form amino acids is a rate-limiting step in proteinogenesis. BCAAs are a subset of the large neutral amino acid (LNAA) class, which also includes the aromatic amino acids (AAA) tyrosine, tryptophan, and phenylalanine. LNAAs are key precursors for neurotransmitters that have been implicated in delirium, including glutamate, serotonin, dopamine, and norepinephrine13,14. BCAAs and AAAs compete for access to the central nervous system (CNS) through the large neutral amino acid transporter 1 (LAT1)50,51,52. In the absence of BCAAs, there is greater opportunity for the AAAs to reach the CNS and generate their associated neurotransmitters: serotonin, dopamine, and norepinephrine.

We identified distinct metabolites and metabolic pathophysiological pathways linked to delirium development at POD2. Most significantly, potential upregulation in the citrate cycle pathway in delirium cases was predicted. This is a key pathway for energy metabolism. Six metabolites within the citrate cycle (oxaloacetate, fumarate, citrate, aconitate, isocitrate, α-ketoglutarate) were all increased in delirium cases, as was itaconic acid, which is generated by decarboxylation of aconitate. Interestingly, both plasma and CSF metabolomics comparisons of patients with Alzheimer’s disease (AD) to matched cognitively healthy controls identified a similar increase of several citrate cycle intermediates such as citrate, aconitate, and α-ketoglutarate in AD53,54. Importantly, there is a well-characterized reduction in activity of the rate-limiting citrate cycle enzyme α-ketoglutarate dehydrogenase, which converts α-ketoglutarate to succinyl-CoA55,56, that may be a key event in AD pathogenesis as well as various other neurodegenerative diseases due to oxidative stress54. Reduced α-ketoglutarate dehydrogenase activity may result in accumulation of α-ketoglutarate. A disruption of energy homeostasis and increased oxidative stress because of increased citrate cycle activity may be a crucial event during delirium manifestation.

Glucose is the main fuel for the brain, but under glucose-depleted conditions like strenuous exercise, fasting, or starvation, the ketone bodies β-hydroxybutyrate (βOHB) and acetoacetate (AcAc) are the main alternative fuel sources. These small lipid-derived molecules are converted to acetyl-CoA and then oxidized in the citrate cycle to produce adenosine triphosphate and carbon dioxide57. Kealy and colleagues showed that energy metabolism dysregulation was sufficient to trigger acute cognitive dysfunction in a delirium mouse model44. Under normal conditions in humans, these ketones are present in plasma in an approximate 2:1 βOHB:AcAc ratio58. At PREOP, the delirium cohort displayed a decrease in βOHB (FC = − 1.69, nominal p value = 0.01), and then an upregulation of AcAc (FC = 1.54, BH p value = 0.00075) at POD2, indicating an alteration in energy metabolism. At POD2 our participants were in a fasting state, which supports an increase in acetoacetate, but this upregulation should be uniform across both phenotypes. This disparity points to potential energy pathway alteration in the delirium group that is not observed in the control group. These findings support previous hypotheses that energy metabolism dysregulation is a possible driver in delirium pathophysiology29,31,44.

While most of the metabolites identified in this study have not been previously linked to delirium risk or development, a few of these metabolites have been previously linked to inflammation, cognitive function, or neurological diseases. Most prominent is kynurenic acid, which our data show to be significantly increased in delirium cases at POD2. Kynurenic acid, a degradation product of l-tryptophan, is a neuromodulator interacting with NMDA, nicotinic, and GPR35 receptors. It plays a role in various neurophysiological functions as well as neurological diseases and inflammatory processes. Overexpression of kynurenic acid in several neurological diseases is associated with confusion58, while reduction of kynurenic acid in mice improves cognitive function59,60. Both kynurenine and tryptophan showed differential concentrations at POD2 in delirium samples at the nominal, but not BH-corrected, p value (< 0.05 in all three statistical tests), with kynurenine, like kynurenic acid, being increased and tryptophan being decreased in delirium samples. Thus, kynurenic acid may be one key metabolite associated with delirium development.

Despite our strengths in experimental design, established cohort, developed workflow, and data analysis methods, several limitations of this study are of note. First, the statistical power at PREOP was insufficient to yield metabolites that passed two or more BH-corrected statistical tests, implying the need for more samples. Second, the SAGES study was conducted within a single geographic region, limiting the generalizability of our results across other populations. Third, since only plasma samples were analyzed, it is difficult to assess whether similar metabolic changes occur in the brain and have functional and causal consequences with regard to delirium risk or development. Therefore, the identification of disruptions in several metabolic pathways and the link to delirium pathogenesis and pathophysiology are putative and need to be further validated. Fourth, since delirium had already occurred in most of our participants at the time of the POD2 blood draw, we are unable to completely disentangle metabolic changes that are on the causal pathway to delirium from those that result from delirium. Finally, targeted metabolomics is restricted to detection of the predefined metabolites established by the specific protocol, which may result in the omission of metabolites involved in delirium.

In conclusion, we have identified metabolites associated with delirium at both pre- and post-operative time points that may enhance our understanding of the etiology and pathophysiology of delirium. Our findings highlight potential dysregulation in energy production pathways at both PREOP and POD2 paving the way for future studies, which should expand upon these findings through exploration of key enzymes involved in these pathways. The metabolomic findings at PREOP have the potential to be combined with other biomarkers to develop a predictive signature, which can be used to target personalized, preventative interventions to reduce delirium incidence. Additionally, our analysis of the metabolome at POD2 could improve our understanding of the disease mechanism, which may lead to better strategies to ameliorate the delirium process. Overall, the established workflow and data analysis provide a two-faceted approach to delirium metabolomics providing a comprehensive assessment of the biochemical activity associated with this syndrome.


Study participants

The Successful Aging after Elective Surgery (SAGES) study is a prospective observational study of older adults undergoing major elective non-cardiac surgery and was designed to understand novel risk factors and long-term outcomes of delirium59,60. SAGES enrolled patients ≥ 70 years old scheduled for major noncardiac surgery, including total knee or hip replacement, cervical or lumbar laminectomy, abdominal aortic aneurysm repair, lower extremity vascular bypass, or colectomy. Patients received either general or spinal anesthesia. Major inclusion and exclusion criteria have been published59,60. Patients underwent a detailed screening process to exclude dementia based on medical record review, capacity assessment, patient or family report of dementia diagnosis, and cognitive testing using the Modified Mini-Mental State Examination61. In addition, patients underwent a neurocognitive battery at baseline, which was used to compute the general cognitive performance (GCP) summary measure62. Comorbidities were identified from medical record review by trained physician abstractors and scored based on the Charlson Comorbidity Index.

Informed consent for study participation was obtained from all subjects according to procedures approved by the institutional review boards of Beth Israel Deaconess Medical Center (BIDMC) and Brigham and Women’s Hospital, the two surgical sites, and Hebrew SeniorLife, the study coordinating center, all located in Boston, Massachusetts. All experiments, methods, and analyses conducted in the current manuscript were performed in accordance with relevant guidelines and regulations of the institutional review board (IRB) of all participating institutions.

Data collection

Postoperative delirium was determined daily throughout hospitalization, supplemented with a validated chart review to identify cases that may have been missed during daily interviews. Delirium was assessed using the Confusion Assessment Method (CAM) diagnostic algorithm63,64. The presence of delirium by chart review was adjudicated by at least two delirium experts, and discordance was resolved through consensus65. Patients were considered delirious if delirium was present on either the CAM or the chart review method on any postoperative day; otherwise, patients were considered non-delirious. If delirium was absent during the entire hospitalization, subsyndromal delirium was defined as (i) an acute change in mental status or fluctuation, (ii) at least one CAM core feature (inattention, disorganized thinking, altered level of consciousness), and (iii) at least one other CAM supporting feature (disorientation, perceptual disturbance, delusion, psychomotor agitation, psychomotor retardation, or inappropriate behavior)66.

Creation of nested matched case-control sample

From the entire SAGES cohort, a matched delirium/no-delirium sample was previously identified to examine inflammatory markers and plasma proteomics67,68. Delirium cases (DEL) were defined as patients with delirium on POD2. No-delirium controls (CNT) were defined as patients with neither delirium nor subsyndromal delirium during their hospital stay. DEL/CNT pairs were matched on six variables (age and baseline GCP within five years, and an exact match for sex, presence of vascular comorbidity, surgery type, and Apolipoprotein E [APOE] ε4 status) (Table 1). Vascular comorbidity was present if the participant had at least one Charlson diagnosis related to vascular disease: myocardial infarction, congestive heart failure, peripheral vascular disease, cerebrovascular disease, hemiplegia, and diabetes or diabetes with end organ damage. For APOE genotype, DNA was extracted from cellular material in the blood and genotyped at the Partners Center for Personalized Medicine. Patients with at least one ɛ4 allele were defined as APOE ɛ4 carriers. To limit heterogeneity of surgical procedure, only patients who underwent an orthopedic procedure were included in this manuscript (66 matched pairs). Fourteen pairs with low plasma volume in the biorepository were excluded, yielding n = 52 matched pairs (104 participants) for inclusion in this study.

Blood collection and processing for plasma

Heparinized plasma samples were collected and stored as part of the SAGES study59 as previously described68. PREOP and POD2 timepoint plasma was utilized in the present study. For all of the samples, blood was collected in a non-fasting state at PREOP and in a fasting state at POD2. A pooled control sample was created from twenty PREOP samples (ten males and ten females) from patients de-enrolled from the SAGES study. Metabolites were extracted from equal volumes of heparinized plasma (100 µl) by ice-cold methanol precipitation added to a final concentration of 80% (vol/vol).

Targeted metabolomics

Equal volumes per sample of methanol extracted metabolites were run on a 5500 QTRAP (SCIEX, Framingham, MA) mass spectrometer using a previously published targeted methodology developed by the BIDMC Mass Spectrometry Core69. This protocol utilizes a targeted selected reaction monitoring (SRM), positive/negative ion-switching, mass spectrometry-based metabolomics platform suitable for bodily fluids, cells, and fresh and fixed tissue69.

Description of data

To accommodate the maximum sample loading capacity of the 5500 QTRAP (60 samples), six unique sample preps and mass spectrometry runs were performed for a total of 52 matched DEL/CNT pairs at PREOP and POD2, resulting in analysis of a total of 208 samples. Each experimental run, except for the final one, which was abridged due to sample limitations, included ten matched pairs, three internal spike-in standards, thirteen pooled quality control (QC) plasma samples, three conditioning samples, and four blanks (Fig. 1, Supplementary Fig. S1). The sample order for each run is explained in detail in the supplementary material.

Preprocessing of metabolomics data

Overall, 315 metabolites were measured using previously published targeted methodology (Supplementary Table S1)69. A metabolite was considered “present” if it was measured in at least 50% of the samples within a phenotypic group (PREOP CNT, PREOP DEL, POD2 CNT, POD2 DEL).
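The presence rule can be illustrated with a short sketch. The text does not state whether the 50% threshold must hold in any one phenotypic group or in every group, so the example below assumes "at least one group" and exposes the alternative as a flag. The function names and the toy measurements (with `None` marking a missing value) are hypothetical.

```python
def detected_fraction(values):
    """Fraction of samples in which the metabolite was measured."""
    return sum(v is not None for v in values) / len(values)

def is_present(groups, min_fraction=0.5, require_all_groups=False):
    """Apply the presence threshold within each phenotypic group.

    Assumption: by default the metabolite counts as present if ANY group
    reaches `min_fraction`; set `require_all_groups=True` for the stricter
    reading in which EVERY group must reach it.
    """
    fractions = [detected_fraction(v) for v in groups.values()]
    check = all if require_all_groups else any
    return check(f >= min_fraction for f in fractions)

# Toy data for one metabolite; None marks a missing measurement.
metabolite = {
    "PREOP_CNT": [1.2, None, None, None],
    "PREOP_DEL": [2.0, 3.1, None, None],
    "POD2_CNT": [None, None, None, None],
    "POD2_DEL": [1.9, None, None, None],
}
print(is_present(metabolite))                           # any-group reading
print(is_present(metabolite, require_all_groups=True))  # all-groups reading
```

In this toy input only PREOP_DEL reaches the 50% threshold, so the two readings give different answers.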

Signal drift was corrected with statTarget using a quality control-based machine learning algorithm: random forest signal correction (QC-RFSC)70,71. This allowed for the integration of the six different experimental runs using the pooled QC samples. Metabolites deemed absent were omitted, and signal imputation was performed on the remaining metabolites using the k-nearest neighbor (knn) method72.

The NormalizeMets R package was used to normalize the metabolomics data matrix to previously selected internal standards using the NOMIS (normalization using optimal selection of multiple internal standards) technique71,73,74.

Statistical analysis

Differential metabolite concentration was assessed using both parametric and nonparametric statistical tests to account for the degree, direction, and rank of difference between DEL and CNT groups at both PREOP and POD2 time points. For all statistical tests (paired t-test, binomial test, and Wilcoxon signed-rank test), the Benjamini–Hochberg (BH) procedure was applied to correct for multiple hypotheses testing75. These results were further tested for robustness using tenfold split analysis at a 5% significance level, with 80% power, and run 10,000 times (Supplementary Table S6). If a metabolite came up as significant in the majority of the splits (≥ 50%), this conferred a level of confidence in the observed differential concentration. Fold-change (FC) of metabolite concentration was calculated by applying the one-step Tukey’s biweight algorithm on FC values (DEL/CNT) for each paired sample76. This provides a robust estimation of the FC for each individual metabolite that is unaffected by outliers. The FCs of the metabolites that are downregulated in the delirium group are indicated using the negative sign, e.g., a FC of − 2 implies two-fold downregulation in the delirium group.
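The sketch below illustrates three steps from this paragraph: BH adjustment of p values, the "at least two of three tests" criterion, and a one-step Tukey biweight estimate of the paired fold change with the signed reporting convention. The biweight uses the commonly cited tuning constant c = 5 and a small epsilon of 0.0001; these defaults, the function names, and the example numbers are assumptions for illustration rather than the exact settings of the study.

```python
from statistics import median

def benjamini_hochberg(p_values):
    """Return BH-adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def meets_criterion(adjusted_by_test, alpha=0.05, min_tests=2):
    """True if at least `min_tests` adjusted p values fall below `alpha`."""
    return sum(p < alpha for p in adjusted_by_test) >= min_tests

def tukey_biweight(values, c=5.0, eps=1e-4):
    """One-step Tukey biweight location estimate (outliers get weight 0)."""
    center = median(values)
    mad = median([abs(v - center) for v in values])
    weights = []
    for v in values:
        u = (v - center) / (c * mad + eps)
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def signed_fold_change(ratio):
    """Report DEL/CNT ratios below 1 as negative fold changes."""
    return ratio if ratio >= 1 else -1.0 / ratio

# Hypothetical per-pair DEL/CNT ratios with one extreme outlier.
ratios = [1.0, 1.1, 0.9, 1.0, 10.0]
print("robust FC:", signed_fold_change(tukey_biweight(ratios)))
print("adjusted p:", benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

The outlying ratio of 10 receives zero weight, so the robust estimate stays near 1 while the arithmetic mean of the same ratios would be 2.8.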

Multivariate statistical analysis was performed using the ropls package in the statistical program R71,77. Potential outliers were identified using principal component analysis (PCA) and Mahalanobis distance78 and were omitted prior to the application of orthogonal partial least squares-discriminant analysis (OPLS-DA), a supervised multivariate separation method77,79. The model was validated through sevenfold cross-validation, cross validation-analysis of variance (CV-ANOVA)80, and response permutation testing (n = 1000) using the fit metrics R2X, R2Y, and Q2.
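The logic of response permutation testing can be sketched generically: class labels are shuffled, the fit statistic is recomputed, and the p value is the share of permuted statistics at least as large as the observed one. The example below is not the ropls implementation. It substitutes an absolute difference in group means for the model metrics (R2Y, Q2), adds one to numerator and denominator so the p value is never exactly zero, and uses hypothetical function names and made-up data.

```python
import random
from statistics import mean

def mean_difference(values, labels):
    """Stand-in statistic: absolute difference between the two group means."""
    group1 = [v for v, lab in zip(values, labels) if lab == 1]
    group0 = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(mean(group1) - mean(group0))

def permutation_p_value(values, labels, statistic=mean_difference,
                        n_perm=1000, seed=0):
    """Share of label permutations whose statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = statistic(values, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between labels and values
        if statistic(values, shuffled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

# Made-up, clearly separated scores for 10 controls (0) and 10 cases (1).
scores = [float(v) for v in range(10)] + [float(v) for v in range(100, 110)]
classes = [0] * 10 + [1] * 10
print("permutation p value:", permutation_p_value(scores, classes, n_perm=200))
```

With well-separated groups almost no permutation matches the observed statistic, so the p value approaches its floor of 1 / (n_perm + 1).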

Systems biology analysis

Systems biology analyses of delirium-related metabolites were performed using MetaboAnalyst 4.0 (Montreal, Quebec, Canada)81,82,83, which takes a bifurcated analytic approach that includes metabolite over-representation and pathway topology41,84. A metabolite was included in biological systems analysis if it showed significant differential concentration (nominal p value < 0.05 (PREOP), BH p value < 0.05 (POD2)) in two or more statistical tests. At PREOP, nominal p value was used in lieu of BH p value because systems analysis requires more metabolites than those that met the BH significance cut-off < 0.05.

Hypergeometric testing was used to assess if delirium-related metabolites were overrepresented in a pathway more than expected by chance81,82. A pathway was considered enriched if the false discovery rate (FDR) adjusted p value was < 0.05. In topology analysis, perturbed metabolites located in critical or central positions are deemed to have a greater impact on a pathway.
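The over-representation test can be written out directly from the hypergeometric distribution. The sketch below computes the probability of observing at least k delirium-related metabolites in a pathway of size K when n such metabolites are drawn from a reference set of N. The function name is hypothetical, and the example inputs are illustrative only: the reference set actually used by MetaboAnalyst may differ from the 250 present metabolites assumed here, so no p value from the study is reproduced.

```python
from math import comb

def hypergeom_enrichment_p(N, K, n, k):
    """Upper-tail probability P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: metabolites in the reference set
    K: reference metabolites that belong to the pathway
    n: delirium-related metabolites submitted
    k: submitted metabolites that fall in the pathway
    """
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# Illustrative inputs: 3 of 37 submitted metabolites fall in an 8-member
# pathway, with an ASSUMED reference set of 250 metabolites.
p_value = hypergeom_enrichment_p(N=250, K=8, n=37, k=3)
print("enrichment p value:", p_value)
```

Pathway-level p values obtained this way would then be adjusted for multiple testing before applying the FDR < 0.05 threshold.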