Childhood obesity incidence has risen globally at an alarming rate over the last decades. Although the increase is apparently plateauing in some Western countries, childhood obesity remains a serious challenge and a public health priority, given that obesity during childhood is associated with an increased risk of morbidity and mortality due to non-communicable diseases in adulthood1.

The long-term metabolic consequences of childhood obesity are mainly associated with an excessive accumulation of body fat, which in turn leads to an increased risk of developing non-communicable diseases, such as type 2 diabetes mellitus and cardiovascular diseases2. However, the mechanistic pathways by which adiposity may induce metabolic perturbations are not fully understood3. The identification, in the early stages of childhood obesity, of metabolic profiles potentially predicting obesity-related co-morbidities later in life should be considered not only a research topic, but also a clinical priority4. Besides the well-known biochemical and anthropometric risk factors present in overweight/obese children and associated with metabolic/cardiovascular disturbances in adulthood, new players are emerging as potential markers of these conditions.

A wide array of volatile organic compounds (VOCs) is emitted from the human body via breath, saliva, blood, milk, skin secretions, urine, and faeces, as products of metabolic processes5. Several studies have revealed that the metabolomic analysis of VOCs from biological fluids can give useful information for the clinical diagnosis and the therapeutic monitoring of a variety of pathologies, including gastrointestinal disorders and cancer6,7. In particular, Alkhouri et al. have provided evidence of significant differences in the pattern of exhaled VOCs in obese children compared with lean controls, demonstrating that various breath VOCs could potentially be useful to gain insight into pathophysiological processes and pathways leading to the development of childhood obesity and its related complications8.

Among the various biological fluids, urine shows specific features that make it an option of choice for volatile metabolomic profiling. Urine samples can be easily and non-invasively collected in large quantities and stored for long periods. They also offer higher concentrations of VOCs compared to other body fluids. A large body of evidence has revealed that urinary VOC profiles contain rich information about individual physiological conditions, so that some urinary VOCs can be considered potential biomarkers in diagnosing or monitoring several pathological conditions, including diabetes, autism syndrome and different types of cancer6,9. Very recently, Elliot et al. have examined, by proton (1H) nuclear magnetic resonance (NMR) spectroscopy and ion exchange chromatography, urinary metabolites from urine samples collected over two 24-hour time periods, to characterize the metabolic patterns of adiposity in a large epidemiological study in the United States and UK. This study showed unforeseen dependencies and interconnectivities between specific urinary metabolites and biochemical pathways that are possibly involved in the pathogenesis of obesity10.

The potential of urinary VOCs profiling as an early diagnostic method has not been fully explored, both because of the complexity of urine (which contains numerous volatile compounds with different structures and a range of polarity, concentration and volatility) and because of the analytical difficulties in identifying and quantifying volatile metabolites. Consequently, several analytical techniques have been developed for separation and concentration of VOCs from this biological fluid. Among them, solid-phase microextraction (SPME), a pre-concentration technology that integrates sampling, extraction, concentration, and sample introduction into a single solvent-free step11, can be successfully used to handle this complexity, specifically when coupled to capillary gas chromatography–mass spectrometry (GC-MS)12. Nowadays, urine analysis by SPME GC-MS is well established as an easy, fast and reliable diagnostic tool allowing the identification of possible urinary disease-associated VOCs.

The aim of the present study was to evaluate, using SPME GC-MS, urinary metabolic signatures in a sample of normal-weight (NW) and overweight/obese (OW/Ob) children belonging to the Italian cohort of the I.Family study.


Twenty-one OW/Ob (ten females and eleven males, age 12.4 ± 1.2 years, BMI 26.7 ± 4.2 kg/m2) and twenty-eight NW (sixteen females and twelve males, age 12.9 ± 1.5 years, BMI 19.5 ± 1.8 kg/m2) children were included in the study. The total energy intake and the energy intake (% kcal) from fat, carbohydrates and proteins were comparable in the two groups (Table 1). Similarly, no differences were observed between the two groups with regard to blood glucose, insulin, HbA1C, and HOMA index.

Table 1 Characteristics of the study population.

Characterization of the volatile urinary metabolome under acidic and alkaline conditions

Determination of the urinary volatile profiles of overweight/obese and normal-weight children by SPME GC-MS

Typical SPME GC-MS TIC chromatograms of urine samples from a NW and an OW/Ob child, reported, respectively, in Fig. 1a and b, show that very similar VOCs profiles were obtained from the urine of the two groups of subjects, when analysed under acid conditions.

Figure 1

Representative SPME GC-MS chromatograms of urine VOCs from a NW (a) and OW/Ob (b) child obtained under acidic pH.

On the other hand, different volatile profiles were clearly distinguished in the urine obtained from the two groups of subjects analysed under basic pH, as displayed in Fig. 2a and b, which show, respectively, representative SPME GC-MS TIC chromatograms of urine samples from a NW and an OW/Ob child analysed under alkaline conditions.

Figure 2

Representative SPME GC-MS TIC chromatograms of urine VOCs from a NW (a) and OW/Ob (b) child obtained under alkaline pH.

Each metabolite was identified by comparing its fragmentation pattern (in terms of presence and intensity of the signals) with those in the NIST 2005 and Wiley 2007 libraries and by checking its retention time against an in-house retention-time library based on reference standards. Identification was further supported by matching retention indices (RI, as Kovats indices)13, calculated relative to the retention times of a C8-C20 n-alkane series, with those of authentic compounds or with literature data for similar chromatographic columns.

The identified metabolites included a variety of chemical structures: aldehydes, ketones, nitrogen compounds, terpenes, acids, alcohols, benzene derivatives, furan and sulphur-containing compounds and esters.

One hundred and ten metabolites were detected under acid conditions and eighty-three under alkaline conditions, in samples from both NW and OW/Ob children.

For all identified urinary VOCs, the m/z value of the most abundant fragment ion within each fragmentation pattern, the matching percentage to the NIST and/or Wiley library, the experimental and literature-reported Kovats indices, the identification methods and the frequency of occurrence in OW/Ob and NW children are listed in Tables 2 and 3 for acidic and alkaline conditions, respectively.

Table 2 VOCs identified in the urine of OW/Ob and NW children under acid pH. Main fragment ion m/z, match percentage to the NIST 05 and/or Wiley 07 libraries, experimental (RIcal) and literature reported (RI) Kovats index, identification methods (ID) and percentage of occurrence are reported.
Table 3 VOCs identified in the urine of OW/Ob and NW children under alkaline pH. Main fragment ion m/z, match percentage to the NIST 05 and/or Wiley 07 libraries, experimental (RIcal) and literature reported (RI) Kovats index, identification methods (ID) and percentage of occurrence are reported.

Data analysis of the SPME GC-MS data sets

A preliminary exploratory data analysis was performed using PCA, which excluded the presence of outliers in both data sets on the basis of the DModX test and Hotelling’s T2 test (at the 95% level).

Statistical data analysis based on multivariate and univariate approaches, performed on the VOCs profiles obtained under acid pH, did not show differences between the two groups. The PLS-DA model built on the whole data set did not pass the permutation test on the class response, and the distribution of the AUC ROC in prediction, obtained during stability selection, showed a median of 0.63 and a 5th percentile of 0.47. In addition, the minimum p-value of the t-test for the measured variables was 0.12 and the behaviour of the related ROC curves was unsatisfactory.

On the other hand, a robust PLS-DA model was obtained from the SPME GC-MS data acquired under alkaline conditions. The score scatter plot of the discriminant model is reported in Fig. 3. Under stability selection, the distribution of the AUC ROC in prediction showed a median of 0.91 and a 5th percentile of 0.64. The metabolites selected by stability selection were combined with those selected by t-test with False Discovery Rate and ROC analysis, yielding a set of 14 putative markers that seem to be crucial in distinguishing OW/Ob from NW children (Table 4). Among these, the levels of 2-pentanone, 3-hexanone, 5-methyl-3-hexanone, 4-methyl-2-heptanone, 3-octanone, 2,4,4-trimethyl-1-pentanol, 1-hexanol, 2-hexanol, 1-heptanol, dimethyl sulfone, 2,4,6-trimethyl-pyridine and formamide N,N-dibutyl were higher in the urine of OW/Ob children than in NW children. In contrast, 1H-pyrrole-2-methyl and 1-methyl-2-piperidone had a lower concentration in OW/Ob children compared to NW children.

Figure 3

Score scatter plot of the PLS-DA model built considering the data set obtained under alkaline conditions. The model showed 2 components, R2 = 0.74 (p-value < 0.001) and AUC ROC, calculated by 7-fold cross-validation, equal to 0.96 (p-value < 0.001). NW children are indicated with white circles whereas OW/Ob subjects with dark grey circles. The PLS-DA model was post-transformed according to Stocchero & Paris (2016)34.

Table 4 Selected VOCs identified in SPME GC-MS analysis under alkaline conditions.


In the present paper, SPME GC-MS was used to evaluate, for the first time, the volatile urinary metabolic signatures associated with early obesity in a sample of children belonging to the Italian cohort of the I.Family study.

Urine sampling is a simple and safe alternative to more invasive investigations in children, and, as far as we are aware, this is the first study in which, in order to profile a wider range of urinary volatiles with different physicochemical properties, VOCs from urine samples of OW/Ob and NW children have been analysed under both acidic and alkaline conditions.

One hundred and ten VOCs were detected under acid conditions and eighty-three under alkaline conditions, in samples from both NW and OW/Ob children. Statistical data analysis based on multivariate and univariate approaches, performed on the volatile profiles obtained under acid pH, did not allow the two groups to be distinguished. On the other hand, a robust PLS-DA model was obtained from the large and heterogeneous SPME GC-MS data set acquired under alkaline conditions, which also allowed the identification of fourteen putative VOC biomarkers that seem to be crucial in differentiating OW/Ob from NW children.

A large number of VOCs in urine seem to arise from bacterial action in the gut5, while the presence of volatile metabolites in the gastrointestinal tract is believed to result from the complex interaction of colonocytes, human gut microflora and invading pathogens14. Alkhouri et al. have reported that obese children have a unique pattern of VOCs compared with lean children, showing that obesity, like other pathological disturbances, can induce the synthesis of new VOCs or a change in the concentration of VOCs that are normally produced under the metabolic conditions of an individual8.

Specifically, alterations found in the pattern of VOCs can be reflective of changes and variations within the gastrointestinal environment, as demonstrated by a large body of evidence for the role of gut microbial dysbiosis in the pathophysiology of obesity and other gastrointestinal disorders15.

The relationship between the intestinal microbiota and the immune system of the host could be a mediating factor in the development of obesity16. Indeed, there is evidence that the gut microbiota can directly influence body weight in several ways17. The relative abundance of bacterial species and the microbial diversity vary with the physiological state of the host. In particular, obesity is associated with both a reduced-diversity microbial community and an altered representation of bacterial genes15.

Microbes produce about 300 volatile organic compounds in the human gut, whose systemic effects are unknown18. Del Chierico et al. (2017) have demonstrated that the relative abundance of some species of bacteria (Firmicutes and Bacteroidetes) in obese children was similar to that found in children with nonalcoholic fatty liver (NAFL) and nonalcoholic steatohepatitis (NASH). These microorganisms were consistently present at higher levels in patients than in controls, while the level of another type of bacteria (Oscillospira) was decreased. Consequently, VOCs have great potential as specific biomarkers of gastrointestinal and even metabolic diseases.

Data from the literature suggest a possible biological role for some, but not all, of the fourteen VOCs whose levels significantly differed in the two groups under study.

The up-regulation of some of the ketonic and alcoholic compounds, which reached statistical significance in the OW/Ob group vs the NW group, has already been observed in obese compared to normal-weight children and explained by gut microbial dysbiosis in the obese subjects17. In particular, 5-methyl-3-hexanone and 4-methyl-2-heptanone, belonging to the methyl-ketone group, were found to be significantly higher in OW/Ob than in NW children. These compounds can be produced by many species of bacteria and fungi from the respective alkanoic acid19. The other ketonic compounds, such as 2-pentanone, 3-hexanone and 3-octanone, can also be synthesized by bacteria5.

With regard to alcohols, the abundance of 1-hexanol, 2-hexanol, 1-heptanol and 2,4,4-trimethyl-1-pentanol was increased in OW/Ob children compared to NW children. These findings are consistent with the results reported by Zhu et al. (2013), who showed a significant increase of Bacteroides in the obese and NASH groups compared to the healthy group. The increase of these gut bacteria, which are capable of producing alcohol, may explain why some alcoholic VOCs are more abundant in OW/Ob children compared to NW children20.

A more in-depth assessment of the literature retrieved several papers indicating that some bacterial species, also present in the human gut microbiome, may produce some of the compounds we found. Specifically, 5-methyl-3-hexanone, 2-pentanone, 3-octanone and 1-hexanol can be released by various actinomycetes in gut flora21,22,23.

The VOCs profile of OW/Ob children also shows a higher urinary level of dimethyl sulfone compared to that of NW children. Sulphur-containing compounds are formed by incomplete metabolism of sulphur-containing amino acids in the transamination pathway. The levels of these compounds are known to be elevated in patients with altered liver function24. Fatty liver disorders are very common in obese children and adolescents, reaching a prevalence of 40–50%25. Interestingly, previous findings have demonstrated that sulphur-containing compounds are also associated with childhood obesity8.

Our study confirms the results of recent papers indicating that certain urinary volatile compounds appear to contribute to the metabolic signature of adiposity8,10,17.

However, the specific VOCs associated with obesity in our study are not consistent with those identified in other studies8,10. Several factors may account for the different VOCs profiles observed in different settings, including environmental and dietary factors, methodological differences in sampling (urine/breath) and analytical detection techniques, and, finally, characteristics of the populations under study. Of note, Elliot et al. (2015) reported on adult subjects, while in the Alkhouri et al. study (2015) a large proportion of affected children had severe obesity.

Moreover, although we did not observe differences in caloric or macronutrient intake between OW/Ob and NW children, we cannot exclude that differences in specific dietary components may affect the VOCs profile. It has been suggested that both intra- and inter-individual variability in VOCs profile can be related to dietary habits26. Consequently, the adoption of a standardized diet prior to the test may help to reduce variability in VOCs in future experiments. Finally, the present cross-sectional analysis, with urinary VOCs determined at a single time point, by its nature precludes the identification of causality.

The absence of a blind validation set to test our findings is a limitation of the study. In spite of the robust and conservative procedure used for internal validation, a new set of subjects would be required to replicate the results of this pilot study and to confirm the selected putative markers.

Post-hoc power analysis suggested that the number of recruited subjects in each group was insufficient for some of the selected VOCs. We will take our results into account in designing new experiments with sufficient statistical power to improve knowledge of the role of VOCs in the development and progression of obesity in children.

Finally, environmental chemical exposures possibly interfering with the urinary VOCs profile were not assessed in the present study.

In conclusion, our results suggest that urinary VOCs, detected by SPME GC-MS, have potential as metabolic biomarkers of childhood obesity. In particular, the hypothesis that altered urinary VOCs profiles may reflect gut dysbiosis or early impairment of liver function deserves further investigation, particularly considering that urine sampling represents a simple and safe alternative to more invasive procedures in children. While we recognize the limitations and the relative reliability of our analyses, these novel findings may be considered hypothesis-generating, to be confirmed by larger prospective investigations.


Experimental design and cohort

The I.Family project, which aimed to assess the determinants of eating behaviour in children and adolescents of eight European countries and related health outcomes, was built on the IDEFICS cohort, established in 2006 and followed up in 2012–2013. A full description of the project has been recently published27.

Briefly, the Italian cohort of the I.Family project comprised 1521 children and teens (773 NW, 748 OW/Ob) who underwent a general examination module27. Among them, 249 participants (121 NW, 128 OW/Ob), identified on the basis of their body weight trajectories over the 6-year follow-up, underwent additional examinations, including the collection of a fasting urine sample. Among the 249 participants asked to provide an additional 50 mL fasting urine sample, a subsample of 28 NW and 21 OW/Ob participants accepted and were included in the present analysis.

In particular, weight was measured to the nearest 0.1 kg, with children wearing light clothes and without shoes, using an electronic scale (Tanita BC420SMA, Tanita Europe GmbH, Sindelfingen, Germany). Height was measured using a telescopic height-measuring instrument (Seca 225 stadiometer, Birmingham, UK) to the nearest 0.1 cm. BMI was calculated as weight (in kg) divided by height squared (in m2). A detailed description of the anthropometric measurements, including intra- and inter-observer reliability, has been previously published28. Weight categories were defined according to age- and sex-specific BMI categories29.

Each individual on the day of the physical examination provided a sample of morning urine (after overnight fasting) in a 50 mL sterile PVC container. Samples were immediately frozen and stored at −80 °C until analysis. The complete defrosting of the samples was performed at room temperature shortly before analysis. Dietary intake of the previous 24 h was assessed using an online 24-h dietary recall assessment program based on the validated offline version30.

Children were asked to participate, on a voluntary basis, in fasting blood withdrawal. A detailed description of sample collection and analytical procedures has been published by Peplies et al. (2010)31.

Specifically, serum insulin was measured with an enzyme-linked immunosorbent assay kit (MODULAR E170, Roche Diagnostics). Insulin resistance was estimated by the Homeostatic Model Assessment (HOMA-IR), using the following formula32: HOMA-IR = [serum insulin (mU/L) × blood glucose (mmol/L)]/22.5.
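As a worked illustration of the HOMA-IR formula, with its conventional divisor of 22.5, the calculation can be sketched as follows. The function name and the input values are hypothetical and are not taken from the study data.

```python
def homa_ir(insulin_mU_per_L: float, glucose_mmol_per_L: float) -> float:
    """HOMA-IR = serum insulin (mU/L) x blood glucose (mmol/L) / 22.5."""
    if insulin_mU_per_L < 0 or glucose_mmol_per_L < 0:
        raise ValueError("concentrations must be non-negative")
    return insulin_mU_per_L * glucose_mmol_per_L / 22.5


# Hypothetical fasting values, for illustration only (not study data).
example = homa_ir(insulin_mU_per_L=10.0, glucose_mmol_per_L=4.5)
print(example)  # 2.0
```

Note that the formula expects glucose in mmol/L; values reported in mg/dL would need to be converted before use.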

The study protocol was approved by the local Ethics Committee of the local Health Authority (ASL Avellino) and informed written parental consent was obtained for each participant. All experiments were performed in accordance with relevant guidelines and regulations.

Chemicals and reagents

2-β-pinene (97% purity), 2-octanone (98% purity), 4-hexen-ol (96% purity), ethyl-nonanoate (98% purity), trans-2-decenal (92% purity), and aniline (98% purity) were used as internal standards, and were all produced by Sigma-Aldrich. Stock solutions of these six standards, at a concentration of 1000 ppm, were prepared by dissolving the standards in a mixture of Milli-Q water and ethanol (95/5 (v/v)), and were stored in a refrigerator at 4 °C.

Ethanol was purchased from Romil. Ultra-pure water from a Milli-Q system (Millipore, Bedford, MA, USA) with a resistivity of 18 MΩ cm was used throughout.

Sodium chloride (NaCl), potassium carbonate (K2CO3) and potassium hydroxide (KOH) were from Sigma-Aldrich, and hydrochloric acid (HCl) was from Carlo Erba. Helium at a purity of 99.999% (Rivoira, Milan) was used as the GC carrier gas. The SPME fibres and the glass vials were purchased from Supelco (Bellefonte, PA, USA). The capillary GC-MS column HP-Innowax (30 m × 0.25 mm × 0.5 μm) was obtained from Agilent J&W (Agilent Technologies Inc., Santa Clara, CA).

The SPME fibres were conditioned as suggested by the manufacturer prior to their first use. Before the first analysis of each day, the fibres were conditioned for 5 min at the operating temperature of the GC injector port and the blank level was checked. Triplicate analyses were performed.

Sample preparation and SPME procedure

Volatiles profiling was performed using the headspace SPME GC-MS method described by Cozzolino et al. (2014), with a DVB/CAR/PDMS (50/30 μm) fibre, an extraction temperature of 40 °C and an extraction time of 30 min.

The pH of urine samples can be an important aspect in affecting the extraction of VOCs. Although both ionized and un-ionized forms of acidic and basic VOCs exist in urine, only the un-ionized forms are volatile and can be found in the headspace. Consequently, in order to provide a profile that represents the true concentrations of VOC components in urine, here urine samples were analysed both under acid and alkaline pH, following two different sample preparation procedures, as shown below9.

  1) Acid conditions (pH 1–2): in a 20 mL screw-on cap HS vial (Supelco, Bellefonte, PA, USA), 4 mL urine were added to 1 mL water, approximately 3 g NaCl and 100 μL 6 mol L−1 HCl;

  2) Alkaline conditions (pH 12–14): 4 mL urine, 1 mL water, approximately 3 g K2CO3 and one pellet KOH were mixed in the HS vial.

To each sample, 12.5 μL of a stock solution of the six internal standards (2-β-pinene, 2-octanone, 4-hexen-ol, ethyl-nonanoate, trans-2-decenal, and aniline) at a concentration of 25 ppb were added.

After stirring, vials were sealed with a Teflon (PTFE) septum and an aluminium cap (Chromacol; Fisher, Loughborough, UK) to allow the release of volatile compounds into the vial headspace for analysis.

The sample vial was placed in the instrument dry block-heater and held at 40 °C for 30 min to equilibrate the system. The extraction and injection processes were automatically performed using an MPS 2 autosampler (Gerstel, Mülheim, Germany). Finally, the fibre was automatically inserted through the vial’s septum for 10 min, to allow adsorption of the volatiles onto the SPME fibre surface.

Gas chromatography–mass spectrometry analysis

The SPME fibre was introduced into the injector port of the gas chromatograph (model 7890 A; Agilent Technologies, Santa Clara, CA) coupled with a mass spectrometer 5975 C (Agilent), wherein the metabolites were thermally desorbed and directly transferred to a capillary column HP-Innowax (30 m × 0.25 mm × 0.5 μm; Agilent) for analysis.

The oven temperature program was initially set at 35 °C for 5 min, ramped to 120 °C at 5 °C min−1, increased to 250 °C at 10 °C min−1, and held for 10 min. The temperature of the ion source and the quadrupole were held at 230 °C and 150 °C, respectively; helium was used as carrier gas with a flow of 1.5 mL min−1; injector temperature was kept at 240 °C and the pulsed splitless mode was used for the analysis.
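The total duration of the oven program follows directly from the stated holds and ramps. The short sketch below works through that arithmetic; the data layout and function name are illustrative choices and not part of the instrument method.

```python
# Each step: (target temperature in °C, ramp rate in °C/min, hold time in min).
# The first step is the initial isothermal hold, so it has no ramp rate.
PROGRAM = [
    (35.0, None, 5.0),    # 35 °C held for 5 min
    (120.0, 5.0, 0.0),    # ramp to 120 °C at 5 °C/min, no hold
    (250.0, 10.0, 10.0),  # ramp to 250 °C at 10 °C/min, held for 10 min
]


def total_run_time(program):
    """Sum ramp durations and hold times over all steps (minutes)."""
    total = 0.0
    current = program[0][0]
    for target, rate, hold in program:
        if rate is not None:
            total += abs(target - current) / rate
        total += hold
        current = target
    return total


print(total_run_time(PROGRAM))  # 45.0
```

The two ramps take 17 min and 13 min, which together with the 5 min and 10 min holds gives a 45 min chromatographic run.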

The fibre was maintained in the injector for 25 min. Mass spectra were acquired at an ionization energy of 70 eV and volatile components were detected by a mass selective detector. The detector operated in a mass range between m/z 30 and 300 with a scan rate of 2.7 scans/s. Each sample was analysed in triplicate in a randomized sequence that also included blanks, i.e. analyses of the fibre coating not submitted to any extraction procedure.

Metabolites were identified by searching their mass spectra in the available database libraries (NIST, version 2005; Wiley, version 2007) and by comparing their retention times with an in-house retention time library based on commercial standards. Furthermore, identification of volatile compounds was supported by matching their retention indices (RI) (as Kovats indices; Kovats, 1958), determined relative to the retention times of a C8-C20 n-alkane series, with those of authentic compounds or with literature data.
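The retention-index calculation relative to an n-alkane series can be sketched as follows. The text does not state which formula was applied, so this example assumes the linear (temperature-programmed) form, in which the index is interpolated between the two n-alkanes that bracket the analyte; the function name and all retention times are hypothetical.

```python
from bisect import bisect_right


def linear_retention_index(rt, alkane_rts):
    """Interpolate a retention index between bracketing n-alkanes.

    rt         -- analyte retention time (min)
    alkane_rts -- dict mapping carbon number to n-alkane retention time (min)
    Assumes the linear (temperature-programmed) formula:
        RI = 100 * [n + (N - n) * (t - t_n) / (t_N - t_n)]
    """
    carbons = sorted(alkane_rts)
    times = [alkane_rts[c] for c in carbons]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time lies outside the n-alkane range")
    # Index of the n-alkane eluting just before (or together with) the analyte.
    i = min(bisect_right(times, rt) - 1, len(carbons) - 2)
    n, n_next = carbons[i], carbons[i + 1]
    t_n, t_next = times[i], times[i + 1]
    return 100.0 * (n + (n_next - n) * (rt - t_n) / (t_next - t_n))


# Hypothetical retention times (min) for part of an n-alkane ladder.
ladder = {8: 4.0, 9: 6.0, 10: 9.0}
print(linear_retention_index(7.5, ladder))  # 950.0
```

An analyte eluting halfway between C9 and C10 therefore receives an index of 950, which can then be compared with values reported for similar stationary phases.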

Metabolite concentration was determined by calculating the ratio of the peak area of the metabolite and the peak area of the related internal standard. After the calculation of the median of the triplicates, the obtained data sets were log-transformed and autoscaled.
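A minimal sketch of this pre-processing chain is given below: internal-standard ratio, median of the replicates, log transformation and autoscaling. The base of the logarithm and the use of the sample standard deviation are assumptions, since neither is specified in the text, and all peak areas are hypothetical.

```python
import math
import statistics


def relative_response(analyte_areas, istd_areas):
    """Median over replicates of analyte peak area / internal-standard peak area."""
    ratios = [a / s for a, s in zip(analyte_areas, istd_areas)]
    return statistics.median(ratios)


def log_autoscale(values):
    """Log10-transform one variable across samples, then centre it on zero
    and scale it to unit standard deviation (sample SD, n - 1).
    Values must be strictly positive for the logarithm to be defined."""
    logged = [math.log10(v) for v in values]
    mean = statistics.mean(logged)
    sd = statistics.stdev(logged)
    return [(x - mean) / sd for x in logged]


# Hypothetical peak areas for one metabolite: three replicates per sample,
# each paired with the peak area of its internal standard.
samples = [
    ([100.0, 120.0, 110.0], [1000.0, 1000.0, 1000.0]),
    ([1000.0, 1100.0, 900.0], [1000.0, 1000.0, 1000.0]),
    ([10000.0, 9000.0, 12000.0], [1000.0, 1000.0, 1000.0]),
]
responses = [relative_response(areas, istd) for areas, istd in samples]
scaled = log_autoscale(responses)
```

After autoscaling, each variable has zero mean and unit variance across samples, so that abundant and trace metabolites contribute on a comparable scale to the multivariate models.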

Statistical analysis

The collected data were investigated by multivariate and univariate statistical data analysis.

Specifically, exploratory data analysis was performed by Principal Component Analysis (PCA), whereas Projection to Latent Structures Discriminant Analysis based on Variable Influence on Projection selection (PLS-DA VIP-based) was applied to identify the differences between OW/Ob and NW children. Stability selection based on Monte-Carlo sampling was used to highlight the subset of relevant variables characterizing the two groups and to estimate the predictive power of the models33. During stability selection, three hundred random subsamples of the collected samples were extracted by Monte-Carlo sampling (with a prior probability of 0.70), and PLS-DA VIP-based was then applied to each subsample, obtaining a set of 300 discriminant models. The predictive performance of each model was estimated by Receiver Operating Characteristic (ROC) curve analysis of its predictions for the samples excluded during sub-sampling. Within this set of PLS-DA VIP-based models, the most frequently selected variables were identified as relevant variables. The VIP threshold used for variable selection was determined by maximizing the Q2 parameter (i.e. R2 calculated by cross-validation) during 7-fold cross-validation. Models were submitted to a permutation test on the class response to avoid over-fitting, according to good practice for model building.
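The resampling logic of stability selection can be illustrated with the simplified sketch below. It is not the procedure used in the study: a class-mean difference rule stands in for PLS-DA VIP-based selection, the subsampling is stratified by class, and the 0.70 probability is interpreted as the fraction of samples retained in each subsample. All of these are assumptions, and the data set is synthetic.

```python
import random
import statistics


def auc_roc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pairs = [(p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores]
    return sum(pairs) / len(pairs)


def fit_stand_in(X, y, threshold):
    """Stand-in selector (NOT PLS-DA VIP): keep variables whose class means
    differ by more than `threshold`, remembering the direction of the difference."""
    model = {}
    for j in range(len(X[0])):
        diff = (statistics.mean(r[j] for r, c in zip(X, y) if c == 1)
                - statistics.mean(r[j] for r, c in zip(X, y) if c == 0))
        if abs(diff) > threshold:
            model[j] = 1.0 if diff > 0 else -1.0
    return model


def score(model, row):
    """Signed sum of the selected variables for one sample."""
    return sum(sign * row[j] for j, sign in model.items())


def stability_selection(X, y, n_models=300, fraction=0.70, threshold=1.0, seed=0):
    """Return per-variable selection frequencies and held-out AUC values."""
    rng = random.Random(seed)
    by_class = {c: [i for i, label in enumerate(y) if label == c] for c in (0, 1)}
    counts = [0] * len(X[0])
    aucs = []
    for _ in range(n_models):
        train = []
        for members in by_class.values():  # stratified subsample (assumption)
            train += rng.sample(members, round(fraction * len(members)))
        held_out = [i for i in range(len(X)) if i not in train]
        model = fit_stand_in([X[i] for i in train], [y[i] for i in train], threshold)
        for j in model:
            counts[j] += 1
        pos = [score(model, X[i]) for i in held_out if y[i] == 1]
        neg = [score(model, X[i]) for i in held_out if y[i] == 0]
        aucs.append(auc_roc(pos, neg))
    return [c / n_models for c in counts], aucs


# Synthetic data: variable 0 separates the classes, variables 1 and 2 do not.
X = ([[0.1 * i, 0.2 * (i % 3), 0.5] for i in range(10)]
     + [[3.0 + 0.1 * i, 0.2 * (i % 3), 0.5] for i in range(10)])
y = [0] * 10 + [1] * 10
frequencies, aucs = stability_selection(X, y)
median_auc = statistics.median(aucs)
p5_auc = statistics.quantiles(aucs, n=20)[0]  # 5th percentile
```

In this toy data set only variable 0 is selected in every subsample, and the median and 5th percentile of the held-out AUC distribution summarise predictive performance in the same way as the values reported for the acid and alkaline data sets.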

ROC analysis and t-test with False Discovery Rate were applied to investigate the properties of single variables. We considered VOCs with a t-test p-value less than 0.05, a q-value less than 0.1 and an AUC ROC greater than 0.50 (α = 0.05) as significant variables. The results of the multivariate data analysis were merged with those obtained by univariate data analysis to obtain a comprehensive data analysis, in which both the correlation structures and the individual properties of the measured variables were taken into account.
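The False Discovery Rate step can be sketched as follows, assuming the Benjamini-Hochberg procedure (the text does not name the specific method used to obtain q-values). The p-values are hypothetical, and the AUC ROC criterion is not covered by this example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def passes_univariate_filter(p, q, p_max=0.05, q_max=0.1):
    """Apply the p-value and q-value thresholds stated in the text."""
    return p < p_max and q < q_max


# Hypothetical t-test p-values for four variables.
p_values = [0.01, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)
selected = [i for i, (p, q) in enumerate(zip(p_values, q_values))
            if passes_univariate_filter(p, q)]
print(selected)  # [0, 1, 2]
```

With these example values the first three variables pass both thresholds, while the fourth fails on its p-value.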

PCA, PLS-DA VIP-based with stability selection, ROC analysis and t-test with False Discovery Rate were implemented with the R 3.1.2 platform (R Foundation for Statistical Computing).

Data availability

The datasets generated during and/or analysed during the current study are not publicly available according to the conditions laid down in the Consortium Agreement of the I.Family project (EC FP7 Grant Agreement No. 266044) but are available from the corresponding author on reasonable request.