Sepsis is a life-threatening organ dysfunction caused by a dysregulated host response to infection1. Septic shock is a subset of sepsis with underlying circulatory and cellular/metabolic abnormalities associated with higher mortality rates1, 2.

Early supportive therapy with fluid resuscitation and vasopressors to restore hemodynamics and reduce tissue hypoperfusion is decisive for the patient’s outcome and has figured in treatment guidelines for decades3. However, mortality rates for septic shock may reach 60% even in the era of early recognition and treatment4, with today’s poor prognosis mainly related to multi-organ dysfunction (MOF).

The limited improvement in septic shock survival can be explained by the inability to prospectively identify the patients who are most likely to benefit from a specific therapy, and the lack of predictive monitoring markers of drug delivery and response. Researchers are becoming increasingly aware that the response to therapy is crucial and precision medicine is already an important research topic for acute illnesses and septic shock5. Precision medicine extends personalized medicine beyond the genome to include broader systems, with a multilevel approach to tailoring therapeutics to individual patients.

Recently interest in the metabolomics approach has grown as the metabolome is the result of gene and protein function and activity and may therefore provide a more sensitive readout of drug response phenotypes because most drugs affect components of metabolism6. For instance, metabolomics analysis of various classes of blood metabolites has been used to search for predictive signatures of intensive care unit (ICU) mortality in adults7,8,9,10,11. Less attention has been paid to the investigation of putative metabolic determinants that would able to classify patient’s responsiveness to initial therapy during the first 48 h in ICU.

We investigated the plasma metabolomics profile of septic shock patients during the acute phase of resuscitation. Blood samples were collected at study enrolment (time T1) and after about 48 hours (time T2). The patients received initial therapy according to the standards3 immediately after shock diagnosis (time T0). The time interval between T0 and T1 was on average 10 hours.

We merged untargeted and targeted mass spectrometry-based metabolomics strategies to cover the plasma metabolites repertoire as far as possible. We first adopted an unbiased strategy (untargeted metabolomics), profiling as many plasma metabolites as possible without any a-priori hypothesis. We applied rapid but yet accurate mass metabolic profiling by direct flow injection-time-of-flight mass spectrometry (TOF-MS)12 as untargeted screening to explore the main perturbed metabolic features.

Targeted metabolomics measures and quantifies a specific list of metabolites according to a standard in order to achieve absolute quantification of defined metabolite classes13. Since metabolic signatures showing changes in circulating kynurenine, fatty acids, lysophosphatidylcholines species and/or carnitine esters have already been reported in different septic shock14,15,16,17,18, they might conceivably be involved in the first phase of shock as well, and could help in understanding the different trajectories in septic shock patients. We therefore also applied a targeted approach focused on measuring these specific metabolic classes to provide the magnitude of their changes in our clinical setting and possibly validate the information from our untargeted analysis.

The primary objective was to verify whether different responses to therapy, measured as changes in organ dysfunction using the Sequential Organ Failure Assessment (SOFA) score, are associated with different metabolite patterns.


Clinical characteristics of the study population

The current work is an ancillary study from the multicenter prospective observational trial, Shockomics (see Identifier NCT02141607). We analyzed the septic shock patients enrolled at the Université de Gèneve (Geneva, Switzerland).

All patients had a SOFA score higher or equal than 9 at T1 (Table 1). Patients who still had a score higher than 8 at T2 and had no decrease of at least 4 points were classified as not responsive to therapy (NR); the other cases were classified as responsive to therapy (R). The NR group consisted of seven patients who had a SOFA score >8 at T2 and Δ SOFA <5 (Δ = T1 − T2 values of SOFA).

Table 1 Characteristics of the two groups of patients at study enrollment (R: responsive; NR: not responsive to therapy).

The characteristics of the 21 patients at enrolment (T1) are reported in Table 1. Mortality and length of stay, comorbidities, sources of infection, administered therapies and partial SOFA scores are illustrated in Table 2. No significant differences were found between the two groups at admission. However, though not significantly different, in the NR group one patient died within one week; there was also a higher percentage of deaths within 28 days (43% in NR vs. 14% in R) and a longer hospital stay (33 days in NR vs. 15 days in R).

Table 2 Mortality and length of stay, comorbities, sources of infection and partial SOFA scores in the two groups (R: responsive; NR: not responsive to therapy).

Metabolic fingerprinting by untargeted metabolomics

A rapid untargeted analysis by flow injection-TOF-MS was done to screen for metabolic features significantly characterizing the responsiveness (R) group and non-responsive (NR) group to therapy in septic shock patients. Statistical analyses on the species identified from the untargeted approach showed that at T1 the two groups were similar and most of the differences were seen at T2. None of the species identified had a significant difference between R and NR at T1, except 4 species: stearic acid was lower in NR, while pyruvic acid, lactic acid and histidine were higher (Fig. 1). The changes in peak intensities from T1 to T2 were verified in the two groups separately and then compared (Table 3). There was a general increase in circulating essential aminoacids such as arginine, tyrosine, threonine and lysine at T2 in R and NR, but only lysine and threonine rose significantly in both groups. Similarly, the abundance of acetylcarnitine was significantly lower in R and NR (Table 3). Only NR showed a significant reduction in circulating fatty acids, mainly saturated and monosaturated (myristic acid, palmitoleic acid, palmitic acid, oleic acid and stearic acid). The endogenous kynuramine, derived from tryptophan, increased with time in both groups, although significantly only in R. The trend (Δ = T2 − T1) showed three species significantly differed between the two groups: creatinine decreased in R and increased in NR, while myristic acid and oleic acid significantly decreased in NR only (Fig. 2).

Figure 1
figure 1

Untargeted metabolomics. Metabolites whose peak intensity is significantly different between responsive (R) and non-responsive (NR) patients at T2 (Wilcoxon rank-sum test p < 0.05, FDR < 0.15). Distributions are shown as box-plots where the central line is the median concentration, the edges of the box are the 25th and 75th percentiles and outliers are defined as 1.5 times the interquartile range and highlighted by +.

Table 3 Metabolite peak intensities from T1 to T2 in the two groups (untargeted approach).
Figure 2
figure 2

Untargeted metabolomics. Metabolites whose change in peak intensity from T1 to T2 in the two groups is statistically significant. Box-plots in the top right corner show differences in metabolite peak intensity between T1 and T2 expressed as delta (Δ = T2 − T1). We did the Wilcoxon rank-sum test for the delta of the two groups and Wilcoxon signed rank between T1 and T2 in each group separately. Significant differences are marked with *(p-value < 0.05).

Metabolic profiling by targeted metabolomics

We applied a mass spectrometry-based quantitative metabolomics profiling to unambiguously identify and quantify glycerophospholipids, aminoacids, acylcarnitines and biogenic amines in the plasma of the study subjects.

Similarly to the untargeted metabolomics approach, univariate statistical analyses showed that most of the differences in metabolite levels arose at T2.

Only four metabolites had significant differences in concentrations in the two groups at T1 (Supplemental Table S1 and Figure S1), and 23 metabolites at T2 (Table 4). At T2 the plasma levels of six species of lysophosphatidylcholines (lysoPCs), seven diacyl-phosphatidylcholines (PCaa), two acyl-alkyl phosphatidylcholines (PCae), two long-chain sphingomyelins (SM) and glutamic acid were lower in NR than in R, whereas there was greater abundance in NR for aminoacids such as alanine, methionine, phenylalanine and histidine (Table 4 and Fig. 3).

Table 4 Species identified by the targeted approach which are significantly different in the R and NR groups at T2 (Wilcoxon rank-sum test p < 0.05 and FDR < 0.15).
Figure 3
figure 3

Targeted metabolomics. Metabolites whose concentration (μM) is significantly different between responsive (R) and non-responsive (NR) patients at T2 (Wilcoxon rank-sum test p-value < 0.05, FDR < 0.15). The figure shows only 6 out of 23 metabolites (see Table 4) significantly differing in the two groups at T2 as example. Distributions are shown as box-plots where the central line is the median concentration, the edges of the box are the 25th and 75th percentiles and the outliers are defined as 1.5 times the interquartile range and highlighted by +.

No metabolites changed significantly in the NR group from T1 to T2, whereas 54 significantly changed in R. In the R group eight species of lysoPCs, seventeen PCaa, twenty-two PCae, eleven SM, seven amino acids (AAs) and two biogenic amines significantly increased from T1 to T2, while histidine, creatinine and taurine decreased significantly. For 38 metabolites the trends (Δ = T2 − T1) were different between R and NR (Table 5).

Table 5 Metabolite concentrations (µM) from T1 to T2 in the two groups.

Moreover, kynurenine, a product of the trypthophan catabolism increased in the NR group significantly more than R group as shown in Fig. 4.

Figure 4
figure 4

Targeted metabolomics. Metabolites whose concentration (μM) from T1 to T2 in the two groups is statistically different. The figure shows only 4 metabolites as an example of those differing overtime (see Table 5). Box-plots in the top right corner show the differences in metabolite concentrations between T1 and T2, expressed as delta (Δ = T2 − T1). We did the Wilcoxon rank-sum test for the delta of the two groups and Wilcoxon signed rank between T1 and T2 in each group separately. Significant differences are marked with *(p-value < 0.05).

Regression analysis for targeted metabolomics data

As combinations of features can give more information than features considered singly, we used classification models with the aim of identifying a set of metabolites that are mostly associated with the target class, i.e. patients not responsive to therapy (NR). The coefficients of the models obtained are reported in Supplemental Table S2. The interpretation of the coefficients in the logistic regression it is not trivial. If we express the odd-ratio as exponential of linear combination of the independent variables, we can say that if the coefficient β i is positive then the increase of the feature x i will be associated with the increase of the odd ratio, i.e. the probability to belong to class 1 (NR) is higher than to class 0 (R), given all other x j variables being equal. On the contrary, if the coefficient β i is negative then the increase of the feature x i will be associated with the decrease of the odd ratio, i.e. the probability to belong to class 1 is lower than to class 0.

Three metabolites were selected in all models: PC ae C40:2, PC ae C38:0 and alanine. Figure 5A shows the coefficient values of the model built according to the criterion of minimal deviance on the first 20 ranked features. All the obtained models correctly classify the observations in the testing set so the performances are better than the ones from the metabolites considered individually for patient classification (the average AUC of the metabolites is below 0.8, as shown in Supplemental Figure S2).

Figure 5
figure 5

Coefficients values of the logistic regression models for targeted metabolomics (panel A) and for integration of targeted and untargeted metabolomics (panel B).

Regression models for targeted and untargeted metabolomics data

We built the integrated models by using 10 features from untargeted metabolomics data and the first 20 ranked metabolites from targeted analysis, as explained in the Methods section.

We can notice that the set of features selected includes again lysoPCs, PCs and alanine. Moreover, six further species not measured by targeted analysis entered in the models: stearic acid, palmitoleic acid, palmitic acid, oleic acid, myristic acid and citric acid. The coefficients of the models are reported in Supplemental Table S3. Figure 5B shows the coefficients of the model built according to the criterion of minimal deviance on the first 20 ranked features. We can notice that alanine, PC ae C38:0, PC aa C38:1, myristic acid and palmitoleic acid are selected by all models. It is worth to underline that PC ae C38:0 and alanine have coefficients with the same sign as the coefficients of the models built on targeted metabolomics data only.

Discriminant Analysis

The coefficient values of the LDA models and the VIP scores of the PLS-DA models built on the first 10 and 20 ranked features after mRMR are reported in Tables 6 and 7 for targeted metabolomics and for targeted plus untargeted metabolomics data respectively. We cannot use the entire subset of 30 features due to the lower number of observations (i.e. 21 patients only). In fact, the computation of the boundary region requires the covariance matrix would be invertible and this is not the case.

Table 6 VIP scores of PLS-DA and coefficients of LDA for the targeted metabolomics models.
Table 7 VIP scores of PLS-DA and coefficients of LDA for the integrated targeted and untargeted metabolomics models.

In the targeted metabolomics model, it is worth to underline that PC ae C38:0, which already played an important role in the regression models, occupies the first position in the VIP ranking when considering 20 features. Similarly, when considering PC ae C40:2, we can notice that it is in the first position when considering 10 features. This latter metabolite also has the highest score in the integrated model. Three-dimensional PLS-DA score plots on 20 features for the two models are shown in Fig. 6 (targeted metabolomics only in panel A; integrated model of targeted and untargeted metabolomics in panel B). In both cases, only one subject (i.e. NR in the targeted model and R in the integrated one) was misclassified by the models.

Figure 6
figure 6

Three-dimensional PLS-DA score plots on 20 features for targeted metabolomics model (panel A) and for targeted and untargeted metabolomics model (panel B).

Plasma level of sPLA2-IIA

As shown in Fig. 7A, sPLA2-IIA plasma levels were significantly higher in NR than in R patients at T2. To note that responders showed a markedly significant reduction of sPLA2-IIA from T1 to T2. We also compared the sPLA2-IIA variation overtime, expressed as Δ = T2 − T1 (panel B) and sPLA2 –IIA decreased significantly in R only. Some NR patients reduced the level of sPLA2-IIA, however, the population trend was not significant and clear as in R.

Figure 7
figure 7

Plasma sPLA2-IIA levels (μg/L) in responsive (R) and non-responsive (NR) patients at T1 and T2 (panel A) and comparison of time trend variation, expressed as delta (Δ = T2 − T1), between the two groups (panel B). Distributions are shown as box-plots where the central line is the median concentration, the edges of the box are the 25th and 75th percentiles and the outliers are defined as 1.5 times the interquartile range and highlighted by +. Significant differences between groups are marked with *(Wilcoxon rank-sum test p-value < 0.05), whereas significant differences from T1 to T2 are marked with **(Wilcoxon sign-rank test p-value < 0.05).


Improvement of organ function as assessed by a drop in SOFA score in the first days of sepsis and septic shock is associated with better outcomes19, 20, but the mechanisms behind organ improvement remain to be fully elucidated. We made a comprehensive metabolomics study of septic shock patients stratified as responders and not responders to standard therapy on the basis of the changes in SOFA score within the first 48 hours after the enrollment.

To the best of our knowledge, this is the first study to show that plasma metabolome changes are associated with initial responsiveness to therapy in septic shock patients.

Combining untargeted and targeted metabolomics methods by collecting data for fast untargeted MS data acquisition and high-resolution MRM transitions for targeting multiple metabolites, we obtained a wider picture of patients’ metabolic states and their metabolic trajectory during the first 48 hours after ICU admission. Univariate analysis and the classification models confirmed that NR group presented an overall lipidome alteration, as previously reported14, 15, 18, 21,22,23.

Here we showed that in NR patients specific lysophosphatidylcholines species (lysoPC C16:0, C16:1, C18:0, C18:1, C18:2, C20:3) did not significantly change from T1 and T2, whereas in R patients they significantly increased and were markedly higher than in NR at T2 (see Fig. 4 and Table 5). In addition, their respective free fatty acids such as palmitic (C16:0), palmitoleic (C16:1), stearic (C18:0) and oleic (C18:1) acids were significantly lower at T2 in NR only (Table 3).

LysoPCs have a very complex role in metabolism. They are primarily generated by phospholipase A2 (PLA2) enzyme activity, and like this enzyme, they have a direct role in toxic inflammatory responses. Low plasma lysoPC levels have been noted in sepsis patients and systemic treatment with lysoPCs has proved to be therapeutic in rodent models of sepsis and ischemia24.

These observations suggest that elevation of plasma levels of these lipids can actually help relieve serious inflammatory conditions. Cunningham et al.24 reported that specific lysoPCs act as uncompetitive product inhibitors of plasma secreted PLA2 enzymes (sPLA2s), especially under conditions of elevated enzyme activity, thus providing a feedback mechanism for the anti-inflammatory effects of these compounds25. Indeed, in the present work we showed that the level of sPLA2-IIA significantly decreased together with an increase of lysoPC species (i.e. lysoPC C16:00 and C18:0) in R patients only. This result is in line with other experimental evidence that reduction of sPLA2-IIA may slow down the inflammatory cascade and increase the probability of responsiveness to therapy26, 27 (see also Figure S5).

The reduction in circulating lysoPC in NR patients may simply reflect their enhanced conversion to lysophosphatidic acid (LPA), which induces a multitude of cellular responses through its action on immunologically relevant cells28. Conceivably the low lysoPC may also promote an excessive immune response, with detrimental effect in NR patients21, 29. Low circulating levels of lysoPC 16 and 18 species have also been reported in inflammatory liver disease30,31,32. NR patients have also showed a marked decrease in PC species, which originate in the liver. The imbalance of lysoPC/PCs cycling suggests that hepatic homeostasis and function is compromised even before any clinical manifestation, and bilirubin alone cannot give a clear picture of the liver’s condition.

In addition, NR patients had lower levels of PC species containing long-chain polyunsaturated fatty acids (LCPUFAs), such as PC aa C38:6, PC aa C36:6, PC aa C40:5, with further elongation/desaturation products. This profile agrees with our previous finding of the different composition of PC species as potential metabolic determinants of mortality in septic shock14. Here again we can speculate that, since LCPUFAs reduce T-cell activation and dampen inflammation33, a decrease in PC-containing LCPUFAs might hamper their protective effects, including a concerted action of either withdrawing pro-inflammatory eicosanoids or incrementing anti-inflammatory eicosanoids. Eicosanoids and pro-resolving lipid profiles have been recently correlated with survival and clinical outcome in sepsis34.

The multivariate models showed that less change in plasmalogen concentrations (plasmenylcholines PC ae C44:6, PC ae 40:2, PC ae 40:5, plasmanylcholine PC ae 38:0), in lysoPC species (e.g. saturated lysoPC C16:0, C18:0), and in fatty acids in combination with larger increment of alanine were associated with non-responsiveness (Fig. 5 and Table 7). Plasmalogens serve as endogenous antioxidants, mediators of membrane structure and dynamics, storage for polyunsaturated fatty acids and lipid mediators35. Raising plasmalogen levels protects human endothelial cells during hypoxia36. Reduced plasmalogens abundance in NR might reflect an increased oxidative imbalance probably due to an excessive systemic inflammatory response with a resulting high level of oxidative stress. A low plasmalogen level has been reported as a surrogate marker of oxidative stress in elderly septic patients37. Furthermore, an exaggerated systemic inflammatory response in NR would be in accordance with the high levels of kynurenine, supporting the role of an accelerated tryptophan catabolism along the kynurenine pathway in sepsis outcome14, 38.

A novelty of this study is the emerging role of alanine. Alanine is a gluconeogenic amino acid and plays a key role in the glucose-alanine cycle, a series of reactions in which amino groups and carbons from muscle are transported to the liver. When muscles degrade amino acids for energy needs, the resulting nitrogen is transaminated to pyruvate to form alanine. This is done by the enzyme alanine transaminase, which converts L-glutamate and pyruvate into α-ketoglutarate and L-alanine. The resulting L-alanine is shuttled to the liver where the nitrogen enters the urea cycle and the pyruvate is used to make glucose. Enhanced elaboration of glucose by the liver (hepatic gluconeogenesis) is a prominent feature of the solid organ response to injury and provides fuel to the cellular elements of the inflammatory response. Indeed, the high respiratory exchange ratio for carbohydrate oxidation is the proof that fat oxidation requires more oxygen than carbohydrate oxidation to produce ATP, making glucose more efficient as a fuel source. In this regard, it is well known in sepsis that circulating cytokines and catecholamines cause a shift in metabolism towards the stress state: activation of glycogenolysis and hepatic gluconeogenesis, activation of hepatic lipolysis, increase in muscle protein catabolism and high production of lactate level39.

The higher plasma alanine in NR may be a sign of lower hepatic capacity for conversion of alanine to glucose. The higher levels of pyruvic acid and lactic acid in NR patients at T2 (Fig. 1) seem to further support this interpretation.

We recovered also the data of liver functionality markers from the medical records of our patients (Supplemental Table S4 and Supplemental Figures S6 and S7). The values at the two time points of aspartate transaminase (ASAT or AspAT or AAT), alanine transaminase (ALAT or ALT), alkaline phosphatase (AP or ALP), gamma-gamma transaminases (gGT), albumin and total bilirubin were not significantly different between the two groups. Only one patient in R group and one patient in NR group were diagnosed as moderate/severe liver dysfunction at the enrollment. The inflammatory response affects organs functionality, liver included, and the high variability in the values distribution can be easily explained by the shock condition and by antibiotics therapy.

Our hypothesis is that alanine plays a role in the energy metabolism and this pathway could be affected by liver dysfunctionality, but this is a further factor to be investigated in ad-hoc clinical trial.

The results presented here highlight biological pathways that could have a clinical impact on septic shock progression and management. Animal experiments are now in progress to understand the time course of metabolome change in this condition better.

We are aware of the risk of overfitting the classifier model to our limited set of subjects in this investigation, despite attempts to minimize these effects with the statistical methods used. Furthermore, these analyses could not take into account all the possible confounding factors such as different renal and hepatic functions, type of nutrition (parental or enteral), and latent insulin resistance.


In conclusion, the data presented here reinforce the emerging evidence that lipidome alterations are important in the individual patient’s response to infection. Changes in the levels of metabolites over time can distinguish a positive response to therapy. Under conditions of severe inflammatory stress and subsequent elevation of PLA2 enzymes activity, elevation of circulating levels of lysoPCs may promote the consequent inhibition of PLA2 enzymes, thus favoring cytoprotection. The emerging role of alanine suggests a different approach for monitoring hepatic function, which will be more specific than bilirubin. Further studies should investigate whether such metabolic dysregulation could be exploited for a more effective targeted therapy.

Material and Methods

Study design, patients and clinical data

This is an ancillary study from the multicenter prospective observational trial named Shockomics (see Identifier NCT02141607). Details of the protocol are described in the work of Aletti et al.40. The study was approved by the Geneva regional research ethics committee (Commission cantonale d'éthique de la recherché, President Prof. Bernhard Hirschel, study number 14–041).

Between October 2014 and December 2015, patients admitted with septic shock to the 38-bed mixed ICU of Geneva University Hospital were screened for inclusion criteria according to researchers’ availability. We included adults (>18 years old) with an admission SOFA score ≥6 and arterial lactate ≥2 mmol/l. Patients with a high risk of death within the first 24 hours after admission, systemic immunosuppression, hematological diseases, metastatic cancer, pre-existing dialysis, decompensated cirrhosis or who had received more than 4 units of red blood cells or any fresh frozen plasma before ICU admission were excluded40. Informed consent was obtained from patients or proxies.

Patients were managed by the clinical care team according to international guidelines41.

Blood samples collected within 16 hours of ICU admission (T1, acute-phase) and 48 hours after enrollment (T2, post-resuscitation phase) were processed for metabolomics analysis40.

We used the SOFA score to classify patients in two groups according to their responsiveness to early resuscitation as responders (R) or non-responders (NR). Patients with a SOFA score at T2 higher than 8 and no drop of at least 4 points in their SOFA scores from T1 to T2 (ΔSOFAT2−T1) were classified NR.

Survival at 28 days after ICU admission for patients surviving hospital discharge was assessed by consultation of the Geneva Canton death registry and by telephone call to the patient/proxies.

Untargeted metabolomics by Flow Injection Analysis-Time-of-Flight Mass Spectrometry (FIA-TOF-MS)

Plasma samples

Metabolites were extracted by adding four volumes of cold methanol to the plasma sample (10 μL); samples were vortexed and incubated at −20 °C for 1 h. They were then centrifuged 10 min at 14,000 × g, and the supernatant was collected, dried in a SpeedVac and resuspended in 50 μL of 0.1% formic acid42. A portion (15 μL) of the extract was analyzed by mass spectrometry.

Flow Injection-TOF MS/MS

Analysis was done on an Agilent 1290 infinity Series coupled to an Agilent 6550 iFunnel Q-TOF mass spectrometer (Agilent, Santa Clara, CA) equipped with an electrospray source operated in negative and positive mode. The flow rate was 150 μL/min of mobile phase consisting of isopropanol/water (60:40, v/v) buffered with 5 mM ammonium at pH 9 for negative mode and methanol/water (60:40, v/v) with 0.1% formic acid at pH 3 for positive mode. Reference masses for internal calibration were used in continuous infusion during the analysis (m/z 121.050873, 922.009798 for positive and m/z 11.9856, 1033.9881 for negative ionization). Mass spectra were recorded from m/z 50 to 1100. Source temperature was set at 320 °C with 15 L/min drying gas and nebulizer pressure of 35 psig. Fragmentor, skimmer, and octopole voltages were set to 175, 65, and 750 V, respectively. MS/MS fragmentation patterns of statistically significant features were collected and used to confirmed metabolite identity.

MS Data Processing

All steps of data processing and analysis were done with Matlab R2016a (The Mathworks, Natick) using in-house developed script following the workflow proposed by Fuhrer12. Centroid m/z lists were exported to csv format. Briefly, in this procedure, we applied a cut-off to filter peaks of less than 500 ion counts for negative and 1000 ion counts for positive ionization to avoid detection of background noise. Centroid m/z lists from different samples were merged to a single matrix by binning the accurate centroid masses within the tolerance given by the instrument resolution (about 10 ppm). The output m x n matrix contains the m peak intensities of each mass for the n analyzed samples. Because mass axis calibration is applied online during acquisition, no m/z correction was applied during processing to correct for potential drifts.

Metabolite identification

A total of 14001 and 2190 metabolite masses were measured as peak intensities in positive and negative ion mode respectively. Given the large number of masses measured, we ran preliminary statistical analyses to select only the most significant ones for subsequent metabolite identification. Details of the statistical workflow are reported in Supplemental methods. Significant metabolic species were then identified by database searches (METLIN,; HMBD, in positive and negative ionization (Table S4).

Targeted plasma metabolomics analysis

A targeted quantitative approach using a combined direct flow injection and liquid chromatography (LC) tandem mass spectrometry (MS/MS) assay (AbsoluteIDQ 180 kit, Biocrates, Innsbruck, Austria) was applied for the metabolomics analysis to EDTA-plasma samples stored at −80 °C, as previously reported by our group7. The method of AbsoluteIDQ p180 kit has been proved in conformance with the FDA Guideline ‘Guidance for Industry—Bioanalytical Method Validation’, which implies proof of reproducibility within a given error range. The method combines derivatization and extraction of analytes with the selective mass-spectrometric detection using multiple reaction monitoring (MRM) pairs. Isotope-labeled internal standards are integrated into the platform for metabolite absolute quantification. This strategy allows simultaneous quantification of 186 metabolites (40 amino acids and biogenic amines, 40 acylcarnitines, 90 glycerophospholipids, 15 sphingomyelins, 1 monosaccharide). For data preprocessing, analytical specification and statistical analyses refer to Supplemental Methods.

A metabolite was excluded from further analysis if its concentration did not meet all the following criteria: (1) fewer than 20% of missing values (non-detectable peak) for each quantified metabolite in each experimental group (2) 50% of all sample concentrations for the metabolite had to be above the limit of detection (LOD)7. In total, 130 of the 186 metabolites were used for statistical analysis. All the measurable metabolites are listed in Supplemental Table S5.

Determination of sPLA2-IIA levels

sPLA2-IIA levels in plasma were detected by using the sPLA2-IIA (human type IIA) ELISA kit (Cayman Chemical, USA) according to manufacturer’s instructions. Concentrations in plasma were tested in duplicate and determined against standard curves at wavelength of 450 nm (Tecan Infinite M200 Plate Reader, Tecan Trading AG, Switzerland).

Statistical analyses

For those species identified by untargeted metabolomics and also quantified by targeted approach, we verified that their peak intensities and concentrations were correlated using the Spearman correlation analysis (see Supplemental Figure S3). For the identified metabolites, we examined the ability to separate the two groups of each metabolite individually by computing the area under the ROC curve, applying leave-one-out cross-validation (CV) (Supplemental Figure S4). Details of the statistical analysis for the comparisons of peak intensities are given in the Supplemental Methods.

We compared the metabolite concentrations measured by the targeted approach at T1 and T2 of the R and NR groups by the Wilcoxon rank-sum test. Changes in metabolite concentrations from T1 to T2 were compared separately for the R and NR groups by the Wilcoxon signed-rank test. Finally, for each metabolite, we compared the time-trend changes in metabolite concentrations (i.e. ∆ = T2 − T1) between the two groups by the Wilcoxon rank sum test. To deal with the large number of statistical comparisons, we also computed the false discovery rate (FDR), assessed after the bootstrapping procedure. The sample size was increased from 14 to 20 subjects for the R group and from 7 to 10 for NR by bootstrapping with replacement, for a total of 30 observations. The bootstrapping procedure was only used for the FDR assessment. Results were considered significant when p < 0.05 and FDR < 0.15.

For the peak intensities, we also examined the ability to separate the two groups of each metabolite individually by computing the area under the ROC curve, applying leave-one-out cross-validation (CV) (Supplemental Figure S4).

Finally, we compared sPLA2-IIA plasma levels between R and NR by means of Wilcoxon rank-sum test both at T1 and T2 and for the ∆ = T2 − T1. We also evaluated changes in sPLA2-IIA concentrations from T1 to T2 within the same group applying the Wilcoxon signed-rank test.

Multivariate analyses

Targeted metabolomics analysis

The aim of our model was to classify NR patients. We built the model on metabolite concentration changes from T1 to T2 (Δ = T2 − T1). The concentrations are highly correlated and, when the number of observations (21 patients) is much lower than the number of features (130 metabolites) as in our case, one must filter and reduce the number of features entering the model.

We adopted the method proposed by Peng et al.43, which is the minimal-redundancy-maximal-relevance (mRMR). This algorithm sorts the features according to their relevance to the outcome (maximum relevance criterion) and their redundancy (minimum redundancy criterion) in relation to the other variables. The ranking is based on the mutual information between the outcome and each feature and on the mutual information between each couple of features. We discretized the features distribution according to the interquartile range as suggested by Peng et al.43 in order to apply the mRMR algorithm.

Multivariate analysis was done using the Elastic Net technique44, a shrinkage regression method effective in case of several highly correlated variables. It performs continuous variable selection, causing some of the regression coefficients to be exactly zero, thus reducing the variance of the regression estimates by eliminating redundant features. The subset of variables corresponding to non-zero coefficients can be considered as the variables mainly associated with the outcome. The elastic net technique can be used with any linear regression model. Since we have a binary outcome (R/NR), we applied the logistic regression.

We considered the first 10, 20 and 30 metabolites ranked by mRMR to build three different classification models. The dataset was divided into a training and test set as two third and one third of the observations, respectively. Data were normalized (Z score normalization) before performing the elastic net. We adopted two strategies to further select a smaller subset of features. We performed 50 times an elastic net logistic model using a logit function to fit the training set data. We considered a binary classification (R = 0, NR = 1) and the output of the model is a value between 0 and 1, which represents a sort of probability. We then selected the coefficients of the model with the minimal deviance. We also applied another strategy, we used the shrinkage parameter λ, corresponding to the model with the minimal deviance, to fit another elastic net model and to obtain the coefficients of the logistic regression. In both cases, the models were then evaluated on the testing set and the performance was assessed by the number of correct imputations.

Linear Discriminant Analysis (LDA) and Partial Least Squares Discriminant Analysis (PLS-DA) were also implemented. More precisely, LDA was performed on the first 10 ranked metabolites and the coefficients for the linear boundary between the first and second classes were retrieved. PLS-DA was performed both on the first 10 and 20 ranked metabolites, considering 3 PLS components. Since the groups are unbalanced, the data matrix was weighted centered in order to avoid having a decision boundary shifted towards the most numerous group. The variable importance in projection (VIP) scores, which represent the weights of each feature in PLS-DA model, and the coefficients of LDA were compared to those of logistic regression.

The performance of the classification models was evaluated by the number of correct imputations.

Integration of data from targeted and untargeted analysis

We built an integrated model by using the concentration of the metabolites and the peak intensities of the species identified by untargeted analysis. We must precise that for those metabolites quantified also in the targeted approach (i.e. acetylcarnitine, tyrosine and histidine), we used the concentration values instead of peak intensities as they are more reliable. Therefore, the metabolites identified from untargeted approach and used for these analyses were: acetylcarnitine, pyruvic acid, lactic acid, stearic acid, kynuramine, citric acid, myristic acid, palmitoleic acid, palmitic acid, oleic acid, tyrosine, histidine. We built the model on the changes from T1 to T2 (Δ = T2 − T1). Untargeted metabolomics data (10 features in total) were then combined with the first 20 ranked metabolites from targeted analysis. We considered all the 30 features and we performed again the mRMR algorithm to find the first 10, 20 and 30 ranked features. The classification models were built using the regularized logistic regression on normalized data as described in the previous paragraph. LDA and PLS-DA were also performed as stated above.