Introduction

Analysis of volatile organic compounds (VOCs), which are organic compounds that have high vapor pressure, is a new frontier in non-invasive, fast and potentially inexpensive monitoring of human body chemistry as well as monitoring related health conditions, such as cancer, kidney diseases and neurodegenerative diseases1,2. These VOCs are produced due to the alteration of specific biochemical pathways in the body, most of which are linked with oxidative stress, cytochrome P450, liver enzymes, carbohydrate metabolism and lipid metabolism2. VOCs, after they are produced, are emitted to various body fluids, including, but not confined to: (i) headspace of the disease-related cells; (ii) urine headspace; (iii) blood headspace; (iv) skin headspace; or (v) exhaled breath3,4,5,6. Of these body fluids, exhaled breath and skin headspace are the most useful VOC sources for monitoring body chemistry or state of health for enhanced personalized treatment. This is because exhaled breath and skin headspace can be obtained non-invasively, they are suitable for high compliance and they provide a matrix of relatively low complexity2,7,8. Nevertheless, use of samples from a single body fluid9,10 has been limited by moderate accuracies in real-world conditions and/or confounding factors1,11.

In the current study, we have developed a methodology that allows simultaneous collection of VOCs from both breath and skin samples (see Figure 1a for an illustration and the Methods section for more details). We show that the developed approach allows exploration of a wider VOC spectrum than that obtained from a single body fluid sample, leading to higher detection capabilities in real-world conditions. We reach these conclusions using both a laboratory-based analytical tool (gas chromatography/mass spectrometry, GC/MS) and a point-of-care tool (nanomaterial-based chemical sensor array).

Figure 1

Sampling and GC/MS-based VOC Detection.

(a) Schematic illustration of the sampling process from both breath and skin using sorbent tubes, with analysis by sensor array and GC/MS (illustration done by author, except head figurine (68898031), which is reprinted with license from shutterstock.com). (b) Correlation matrix with a color map that shows the various VOC abundances in breath and skin. Analysis of the shared GC/MS compounds (19 VOCs) using: (c) a correlation matrix with a color map that shows the correlations among the various VOCs. Rows were reordered based on correlation values so that high or low correlations will be adjacent. B) breath; S) skin; 1) Acetone; 2) Heptane; 3) Dodecane; 4) Octanal; 5) Nonanal; 6) Decanal; 7) 1,3-Butadiene, 2-methyl; 8) Butane, 2,2-dimethyl; 9) Benzene; 10) Benzene, 1,3-dimethyl; 11) Pentane, 3-methyl; 12) Hexane; 13) Cyclopentane, methyl; 14) Acetic acid; 15) Carbon disulfide; 16) Acetophenone; 17) alpha-Pinene; 18) Benzaldehyde; 19) Toluene. (d) PCA analysis according to the three sample groups: breath, skin and room.

Results

The developed approach was tested on breath and skin samples that were simultaneously collected from 30 volunteers, whose details are listed in Supplementary Table S1. The GC/MS analysis of these samples showed pronounced differences between the VOC profiles in the exhaled breath and those in the skin sample. Thirty-seven VOCs, common to all participants, were identified in the individual breath samples, regardless of their differences in age, gender, ethnicity, smoking history or state of health. The common breath VOCs were tentatively identified by spectral library match (see Supplementary Table S2). The individual skin samples showed 217 VOCs common to all participants (see Supplementary Table S3).

The general relationship between these VOCs was studied by means of a correlation matrix. Figure 1b presents a heat map of the VOC abundance within the various samples (1/−1: maximum positive/negative correlation; zero: no correlation). This analysis showed that the two VOC sources (skin and breath) are only slightly correlated with each other, while the response within each source is highly correlated. Following this analysis, the similarity and correlation between the VOCs from both sources (i.e. the number of VOCs found in both sources) was examined. Remarkably, the overlap between skin and breath VOCs was rather small: only 19 VOCs were found in both breath and skin samples. In fact, less than 10% of the skin VOCs were found in exhaled breath, i.e., over 90% were unique to the skin sample. Excluding isoprene, the concentration of the common VOCs was considerably larger in the skin samples than in the breath samples. Figure 1c displays a correlation matrix between the abundances of the 19 common VOCs found in the breath and skin samples. For most of the common VOCs, a rather general correlation was found within breath samples and within skin samples, but not between breath and skin samples. Three VOCs (decanal, nonanal and octanal) presented rather high correlations, both within breath (nonanal:octanal → 0.71; decanal:octanal → 0.66; nonanal:decanal → 0.86) and within skin (nonanal:octanal → 0.94; decanal:octanal → 0.67; nonanal:decanal → 0.70) samples. Interestingly, nonanal and decanal were identified previously as two of the most dominant skin odorants in humans12. Principal component analysis (PCA) of skin and breath yielded well-separated clusters, irrespective of the demographic differences (see Figure 1d). In addition, these signatures did not overlap with the PCA cluster of the room air samples, indicating that neither skin VOCs nor breath VOCs were visibly affected by the ambient atmosphere.

Breath and skin samples from the same 30 volunteers were also studied using a nanomaterial-based sensor array that could provide a future point-of-care diagnostic tool (see Methods). The electrical resistance of the constituent sensors in the array changed rapidly and reversibly on exposure to the samples. Typical responses for two representative sensors are shown in Figures 2a and 2b. Different levels of response variability were found among volunteers in the different sensors. Figures 2c and 2d show representative variability between the sensing measurements of the breath and skin samples for each volunteer. In most cases, the sensing signal of the skin sample was higher than the signal of the corresponding breath sample, with some exceptions. For each sensor, the variation in the breath and skin samples was tested by matched pair (t-test) analysis (by subject) between all breath samples and all skin samples (see Supplementary Table S4). In the GC/MS analysis, we analyzed the VOCs separately. In the case of the (cross-reactive) sensor array, the response is obtained from all VOCs in the mixture. In other words, each sensor has one response to all VOCs of each single sample. The sensors showing high variability between the skin and breath samples typically display higher person-to-person variation than the sensors that had low skin-to-breath variability.

Figure 2

Representative sensing responses of (a) GNP sensor (no. 31) and (b) RN-CNT sensor (no. 34) to breath and skin VOC samples; the first cycle represents the response upon exposure to clean nitrogen and the second cycle represents the response to the VOC sample. (c) Variance within each pair of samples (breath versus skin) of a specific individual tested (x-axis) based on the area under the signal in sensor 31 and (d) sensor 34. (e) Correlation matrix: the color map represents the sensor response correlations on a scale from 1 to −1, which are the maximum positive correlation (yellow) and the maximum negative correlation (blue), respectively. Zero values (light green) represent low correlation. (f) Hierarchical clustering with a color map based on the combination of breath and skin data. Rows represent the various samples, which were clustered into nine main clusters. The columns represent the different sensor/feature data in breath and skin. (g) A volatolomic barcode based on the normalized response. To simplify presentation, we present the breath and skin data in tandem. Each line represents a specific sample while the different colored rectangular bars represent different sensor responses; the size of the bar represents the relative extent of the sensor response for the specific sample from all responses. (MAZ02 and MAZ08 were missing a skin or breath sample and so are not presented in the figure.) (h) A machine-readable barcode of the information obtained from multiple body samples. The barcode was calculated using a free online QR code generator.

The relation between the sensors' response was further studied by means of a correlation matrix. Figure 2e shows a heat map representation of the correlations (1/−1: maximum positive/negative correlation; zero: no correlation). The data was clustered (rows were reordered based on correlation values so that high or low correlation would be adjacent) for an informative presentation. Strong intra-group correlation between the sensing responses to the same type of body fluid was observed (among the breath samples and among the skin samples), but very few inter-group correlations were observed. The intra-group correlations of sensor response within the various samples were mostly positive (yellow), with a few cases showing a negative correlation (blue). The lack of correlation between breath and skin samples is indicated by shades of green in Figure 2e. These results are consistent with the GC/MS analysis. Hence, the two body fluids studied yield complementary VOC profiles with little overlap.

Considering these results, it is reasonable to claim that combining the complementary information from the two different samples yields a wider window into the underlying disease pathology, compared to that obtained from a single clinical sample. To check this claim, hierarchical clustering combined with a heat map was produced (Figure 2f). This analysis groups the rows of a data matrix into clusters whose values are close to each other. The data from all sensors for both breath samples and skin samples were clustered into nine sub-clusters, representing the highest similarity among the tested samples. In other words, each cluster represents a similar VOC profile, suggesting higher resemblance among samples (subjects) of a specific cluster. This methodology could promote the option for personalized medicine. While we could not pinpoint specific common reasons for the cluster formations, probably because of the small population examined in this study, we did observe one exception. The age parameter in the top cluster (red) showed an average of 34.2 years and median of 35 years while the other clusters showed an average of 27.4 years and median of 27 years. This observation suggests that the VOC profile is indicative of age-related metabolic processes, even though further validation with larger populations is still needed. The matched pairs analysis on the breath vs. skin response showed significant differences (in most cases, P value < 0.0001), except for four subjects (004; 010; 025; 027) that differed markedly from the rest; these were also clustered separately in the hierarchical clustering. Excluding six sensors, the two-way clustering showed a clear difference between the skin and breath (x-axis). A similar clustering approach with the GC/MS results did not reveal similar results. This can be explained as follows: each sensor in the nanoarray measures all (or part of the) VOCs in the examined mixture. In contrast, the GC/MS analyzes each VOC separately. Finally, by normalizing the data, we created a barcode representation to serve as a fast, high-throughput profiling method. Figure 2g summarizes the breath and skin data considered in the barcode representation. Thus, a machine-readable odor profile of an individual can be built on the basis of the total responses from two (or more) different samples (see Figure 2h).
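
The normalization behind the barcode is not spelled out in the text. As a minimal sketch, assuming that each bar width is the magnitude of one sensor response divided by the sum of the magnitudes of all responses for that sample, the conversion could look like the following. The function name `barcode_fractions` and the example values are hypothetical and are not taken from the study data.

```python
def barcode_fractions(responses):
    """Convert one sample's sensor responses into bar widths that sum to 1.

    Assumption (not stated in the paper): the width of each bar is the
    absolute response of one sensor divided by the summed absolute
    responses of all sensors for the same sample.
    """
    magnitudes = [abs(r) for r in responses]
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("all responses are zero; nothing to normalize")
    return [m / total for m in magnitudes]


# Hypothetical responses of three sensors to one sample.
sample = [2.0, -1.0, 1.0]
widths = barcode_fractions(sample)
print(widths)  # [0.5, 0.25, 0.25]
```

Stacking the rows produced for the breath and skin features of each volunteer would give one barcode line per sample, in the spirit of Figure 2g.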

Discussion

To eliminate potential experimental artifacts and/or lack of standardization, we used the same materials for sample collection (sorption material) and the same sample analysis for both skin and breath body fluids (see Methods)1,2,13. Therefore, the differences obtained within or between the various VOC profiles cannot be attributed to a lack of standardization. Two methodologies have been used to investigate the combined volatolomic approach. The first methodology relies on GC/MS analysis to identify the chemical nature and the composition of the volatolomic profile of the various body fluids examined in this research. However, GC/MS is complex and may not lend itself to point-of-care technology. In contrast, the current study includes a nanomaterial-based sensor array that relies on collective VOC patterns, rather than identification of specific VOCs (as is the case in GC/MS)14. In these sensors, the measured signals of the VOC mixture from the skin and/or breath samples are additive, so that the overall signal of one sensor stems from VOCs at concentrations of a few ppm. Hence, the combined sensor responses from breath and skin together are negligibly affected (or not affected) by small VOC and/or concentration variations that are linked with specific confounding factors, such as smoking15. Still, a larger population would be needed to better validate the confounding effects with this combined methodology.

The small number of VOCs (in breath) obtained in our study, compared to other studies in the literature16,17,18,19, could be attributed to the sample collection method as well as to the sorption material used. The fact that only 19 common VOCs were found (out of more than 250 VOCs) indicates that a major portion of the VOCs in a specific body fluid might stem from different metabolic pathways or confounding conditions/VOCs. Despite the small number of statistically validated VOCs, the obtained results suggest that combined volatolomic data give a wider VOC profile, compared to each of the body fluids alone.

There are several limitations of our study. We adopted a case-control design, which may overestimate sensitivity and specificity, so our results must be interpreted cautiously. However, the design is appropriate at this preliminary stage of investigation and our results now pave the way for a study using patients or people with diagnosed disease. While we used different statistical tools (e.g., All Pairs, Tukey HSD test; matched pair (t-test) analysis) to address the question at hand and possible limitations such as the familywise error rate, we are aware of concerns that arise because of the limited sample number and multiple testing. A final statistical validation remains to be carried out in future studies including a large study population. Furthermore, concrete human/clinical studies using multiple samples are necessary to confirm this assumption. With additional R&D, this volatile dataset methodology could enhance personalized well-being on a community level, as well as public health and management approaches.

In summary, our experiments demonstrate that a novel volatolomic approach could lead to the development of a powerful diagnostic tool in the near future. Spectrometry techniques are powerful tools for detecting quantitative VOC concentration profiles. Nevertheless, for volatolomic diagnostics to enter clinical practice, approaches must be adopted that would allow implementing point-of-care devices, such as chemosensory-based VOC detection. This approach is not only sensitive and specific, as outlined in our preliminary data, but also lends itself very easily to a point-of-care format and affordability.

Methods

Volunteer enrollment

A heterogeneous group of 30 volunteers (aged 29–44 years; both genders, with varied ethnicity, smoking habits and state of health) were enrolled for the study at the Technion – Israel Institute of Technology (Haifa, Israel), after giving their written informed consent and filling out a short questionnaire regarding general background information (e.g., age, gender, smoking; see Supplementary Table S1) in accordance with approved guidelines. Exclusion criteria for this study were the consumption of food or beverages, or smoking, less than one hour prior to the test. Volunteers' enrollment was approved by the ethics committee (approval request number, 02/13) of the Technion – Israel Institute of Technology (Haifa, Israel).

Collection of breath samples

Exhaled alveolar breath was collected in a controlled, reproducible manner in the same room from all volunteers. Inhaled air was cleaned of ambient contaminants by repeatedly inhaling to total lung capacity for 3 min through a mouthpiece connected to a scrubber cartridge set on the inlet port (purchased from Eco Medics, Duerten, Switzerland), removing 99.99% of the exogenous compounds from the room air during inspiration and reducing the concentration of potential exogenous VOCs. The unfiltered ambient air in the collection room was sampled and the identified contaminants were disregarded in our subsequent analysis. Immediately after the lung washout, subjects exhaled through the exhalation port of the mouthpiece. Exhaled breath is composed of alveolar air and respiratory dead space air. The dead space air was automatically filled into a separate bag and the alveolar breath from the end of the exhalation was filled into a 750 ml Mylar bag (Quintron Instrument Co., Inc., Milwaukee, WI, USA). The breath collection described is a single-step process that does not require the volunteer to perform any action except inhaling and exhaling through the mouthpiece. Two bags were collected from each volunteer for the analysis by gas chromatography/mass spectrometry (GC/MS) and by the nanomaterial-based sensors. Immediately after the breath collection, the VOCs in the breath samples were trapped and pre-concentrated in two-bed Tenax® TA/Carboxen® 1018 sorption tubes (glass TD tube, preconditioned, O.D. × L 1/4 in. × 3 1/2 in., sealed with Swagelok® End-Fittings purchased from Sigma-Aldrich, Israel) by pumping the content of each collection bag through a sorbent tube (flow rate: 100 ml/min). The room air samples were collected by pumping ambient air in the collection room through a sorbent tube for 7 min at a rate of 100 ml/min. The sorbent tubes were stored under refrigeration at 4°C until they were analyzed; maximal storage time was one week. One sorbent tube with VOCs from a breath sample was destroyed due to breakage during storage.

Collection of skin headspace samples

In parallel to the described breath sample collection, a skin headspace sample, from the palm of the hand, was collected from each volunteer in a controlled way and in the same room as the breath sample collection. Immediately before the test, each volunteer was asked to wash the hands thoroughly under a stream of fresh water for at least one minute. Then the volunteer was asked to allow the hands to dry without contact with paper or towel. The VOCs in the skin samples were trapped and pre-concentrated in two-bed Tenax® TA/Carboxen® 1018 sorption tubes (glass TD tube, preconditioned, O.D. × L 1/4 in. × 3 1/2 in.; purchased from Sigma-Aldrich, Israel) using a TDS3™ thermal desorption tube sampling system (purchased from Sigma-Aldrich, Israel) by direct pumping from the hand of each volunteer through a sorbent tube (flow rate: 100 ml/min). The TDS3 was held in the closed hand of each volunteer with the opening held right above the skin. Tubes were sealed with Swagelok® End-Fittings and stored under refrigeration at 4°C, up to one week, until they were analyzed. One sorbent tube with VOCs from a skin sample was destroyed due to breakage during storage.

Sample characterization

The breath and skin samples were characterized using two totally independent, complementary characterization methods. The first method is VOC analysis with gas chromatography/mass spectrometry (GC/MS). Here, chemical analysis of the breath and skin samples was carried out to identify the VOCs that show statistically different concentrations within the compared populations. A GC/MS-QP2010 instrument (Shimadzu Corporation) was used that was equipped with an SLB-5ms capillary column (with 5% phenyl methyl siloxane; 30 m length; 0.25 mm internal diameter; 0.5 μm thickness; from Sigma-Aldrich) and was combined with a thermal desorption (TD) system (TD20; Shimadzu Corporation). Desorption was done by heating the TD tubes for 5 min at 270°C and trapping the released VOCs on a Tenax cold trap at −10°C. The samples were injected into the GC system in splitless mode at 30 cm/sec constant linear speed and under 0.70 ml/min column flow. The following oven temperature profile was set: (a) 10 min at 35°C; (b) 4°C/min ramp until 150°C; (c) 10°C/min ramp until 300°C; and (d) 5 min at 300°C. The GC/MS chromatograms were analyzed using the GCMS post-run analysis program (GC/MS solutions version 2.53SU1, Shimadzu Corporation) and the compounds were tentatively identified through spectral library match (Compounds Library of the National Institute of Standards and Technology, Gaithersburg, MD 20899-1070 USA, 2008). The second method is VOC analysis with 25 nanomaterial-based sensors, combined with a statistical multivariate data analysis algorithm (see below). The sensors were based on layers of gold nanoparticles (GNPs) with eleven different organic ligands and quasi-2D random networks of single-walled carbon nanotubes (RN-SWCNTs) capped with four different organic overlayers20,21.
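
The oven temperature profile quoted above fixes the length of each GC run. The short sketch below simply adds up the hold and ramp segments (a) to (d); the helper name `oven_program_minutes` and the segment encoding are illustrative choices, not part of the instrument software.

```python
def oven_program_minutes(start_temp_c, segments):
    """Total duration (minutes) of a hold/ramp oven temperature program.

    segments is a list of tuples:
      ("hold", minutes)                      isothermal hold
      ("ramp", rate_c_per_min, target_c)     linear ramp to target_c
    """
    temp = start_temp_c
    total = 0.0
    for seg in segments:
        if seg[0] == "hold":
            total += seg[1]
        elif seg[0] == "ramp":
            _, rate, target = seg
            total += (target - temp) / rate  # time needed to reach target
            temp = target
        else:
            raise ValueError(f"unknown segment type: {seg[0]}")
    return total


# Segments (a) to (d) as listed in the text, starting at 35 °C.
program = [
    ("hold", 10),        # (a) 10 min at 35 °C
    ("ramp", 4, 150),    # (b) 4 °C/min up to 150 °C  -> 28.75 min
    ("ramp", 10, 300),   # (c) 10 °C/min up to 300 °C -> 15 min
    ("hold", 5),         # (d) 5 min at 300 °C
]
total_minutes = oven_program_minutes(35, program)
print(total_minutes)  # 58.75
```

Under these settings one chromatographic run takes just under an hour, not counting thermal desorption and cool-down.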

The second Tenax® TA/Carboxen™ 1018 glass TD tube per volunteer was introduced to the sensor array chamber that was specially assembled to a TD system, enabling direct exposure of the TD tube to the sensor array. Desorption was done by heating the TD tubes for 10 min at 250°C and trapping the released VOCs on a Tenax cold trap at 0°C. The samples were injected into the GC system in direct (splitless) mode at constant 3 ml/min total flow. The breath sample from the TD was then delivered by a six-way Valco™ valve, equipped with a 10 ml stainless steel loop (VICI®, Valco Instruments Company Inc., USA), into a stainless steel test chamber containing the cross-reactive nanomaterial-based sensor array. The test chamber was evacuated between exposures to discharge any VOCs adsorbed on the sensors' organic layer. A Keithley 2701 DMM data acquisition/data-logging system was used to measure the resistance of all the sensors simultaneously as a function of time. The sensors' baseline responses were recorded for 5 min in vacuum, 5 min under clean nitrogen (99.999%), 5 min in vacuum, 5 min under sample exposure, followed by another 5 min in vacuum. To detect possible malfunctions of the sensors and to counteract slight drifts of their baseline conditions due to ageing and/or poisoning effects, the sensors were calibrated each day the experiment was run by exposing them to a known concentration (10 ppm) of isobutylene calibration gas (Calgaz, Cambridge, Maryland, USA) and recording their resistance changes in the same way as for the samples.

Each sensor responded to all (or to a certain subset) of the VOCs found in the exhaled breath samples or skin samples. The exposure of the various sensors to the VOC samples resulted in rapid and fully reversible changes of the electrical resistance. Two sensing features were read out from the time-dependent resistance response of each sensor. One sensing feature is related to the normalized resistance change at the beginning of the exposure (with respect to the value of the sensor's resistance in vacuum prior to the exposure), while the other sensing feature is related to the area beneath the time-dependent resistance response during the exposure period.
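
To make the two sensing features concrete, the sketch below computes them from a single resistance trace. It rests on several assumptions that the text does not specify: the vacuum baseline is the mean of all readings before the exposure starts, the "beginning of the exposure" is taken as the first few readings, and the area is computed by the trapezoidal rule on the baseline-normalized response. The name `sensing_features` and the parameter `n_start` are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def sensing_features(times, resistances, exposure_start, exposure_end, n_start=3):
    """Return (normalized initial change, area under normalized response).

    times and resistances are equal-length sequences with times ascending.
    Assumptions: baseline = mean resistance before exposure_start;
    initial change = mean normalized response over the first n_start
    exposure readings; area = trapezoidal integral over the exposure.
    """
    baseline_pts = [r for t, r in zip(times, resistances) if t < exposure_start]
    exposure = [(t, r) for t, r in zip(times, resistances)
                if exposure_start <= t <= exposure_end]
    if not baseline_pts or len(exposure) < 2:
        raise ValueError("need baseline readings and at least two exposure readings")

    r_b = mean(baseline_pts)
    # Response relative to the vacuum baseline: (R - R_b) / R_b
    norm = [(t, (r - r_b) / r_b) for t, r in exposure]

    initial_change = mean([v for _, v in norm[:n_start]])
    auc = sum((t1 - t0) * (v0 + v1) / 2.0
              for (t0, v0), (t1, v1) in zip(norm, norm[1:]))
    return initial_change, auc


# Synthetic trace: 100 ohm in vacuum, then a step to 110 ohm during exposure.
t = [0, 1, 2, 3, 4, 5, 6]
r = [100.0, 100.0, 100.0, 110.0, 110.0, 110.0, 110.0]
initial_change, auc = sensing_features(t, r, exposure_start=3, exposure_end=6)
```

For this synthetic step the normalized change is about 0.1 and the area is about 0.3 (0.1 held over three time units), so each sensor contributes two numbers per sample.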

Description of the nanomaterial-based sensor array

The nanomaterial-based sensor array that was used to analyze the VOC samples contained cross-reactive, chemically diverse chemiresistors that were based on two types of nanomaterials: (i) organically stabilized spherical gold nanoparticles (GNPs, core diameter: 3–4 nm) and (ii) random networks of single-walled carbon nanotubes (RN-SWCNTs). The chemical diversity of the sensors was achieved through 25 different sensors with either similar or different basic organic functionalities (19 for the GNP sensors and 6 for the SWCNT sensors). The organic ligands of the GNPs provided broadly cross-selective absorption sites for the breath VOCs15,20. The GNPs were synthesized as described in Refs. 20, 22 and generally dispersed either in chloroform or toluene. Chemiresistive layers were formed by drop-casting the solution onto semi-circular microelectronic transducers, until a resistance of several MΩ was reached. The device was dried for 2 h at ambient temperature and then baked overnight at 50°C in a vacuum oven. The microelectronic transducers consisted of ten pairs of circular interdigitated (ID) gold electrodes on silicon with 300 nm thermal oxide (Silicon Quest International, Nevada, US). The outer diameter of the circular electrode area was 3 mm, and the gap between two adjacent electrodes and the width of each electrode were both 20 nm.

The RN-SWCNT sensors were based on electrically continuous random networks of SWCNTs that were formed by drop-casting a solution of SWCNTs (from ARRY International LTD, Germany; ~30% metallic, ~70% semiconducting, average diameter = 1.5 nm, length = 7 mm) in dimethylformamide (DMF, from Sigma-Aldrich Ltd., >98% purity) onto the pre-prepared electrical transducers. After the deposition, the devices were slowly dried overnight under ambient conditions to enhance the self-assembly of the SWCNTs and to evaporate the solvent. The procedure was repeated until a resistance of 100 kΩ to 10 MΩ was obtained. The microelectronic transducers for the SWCNT sensors consisted of ten pairs of 4.5 mm wide, interdigitated Ti/Pd electrodes on silicon with two microns of thermal oxide (Silicon Quest International, Nevada, US). The gap between two adjacent electrodes was 100 nm. The SWCNT sensors were organically functionalized with cap-layers that were composed of four polycyclic aromatic hydrocarbon (PAH, aka HBC) derivatives. PAH-1, PAH-2, PAH-3 and PAH-5 contain hydrophobic mesogens that are terminated with alkyl chains and functionalized with ether, alcohol and carboxylic acid21. The GNP and SWCNT/PAH or SWCNT/HBC sensors used in this study responded rapidly and reversibly when exposed to typical VOCs in the sample21,23. Additionally, we have confirmed that they have a very low response to water20,21,24,25.

Statistical analysis (GC/MS)

The VOC variability of the breath and skin samples was checked within subjects and among subjects. Matched pair (t-test) analysis (by subject) was performed between all breath samples and all skin samples. Where applicable, the All Pairs, Tukey HSD test was performed for multiple comparisons. In addition, a correlation matrix was calculated using Pearson product-moment correlations. Differences in the one-way and/or matched pair tests were considered significant at a cut-off p-value of 0.05 between the sub-groups checked, as determined using JMP, version 9.0.0 (SAS Institute Inc., Cary, NC, USA, 1989–2005).
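
The analysis in the study was run in JMP. Purely as an illustration of the two statistics named above, the following standard-library sketch computes a matched-pair t statistic (per subject, skin minus breath) and a Pearson product-moment correlation. The function names and the four-subject example values are hypothetical; converting the t statistic into a p-value requires the Student t distribution, which is left to a statistics package.

```python
import math


def paired_t(x, y):
    """Matched-pair t statistic and degrees of freedom for paired samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    d = [a - b for a, b in zip(x, y)]        # per-subject differences
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)  # sample variance
    if var_d == 0:
        raise ValueError("differences have zero variance")
    return mean_d / math.sqrt(var_d / n), n - 1


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical abundances of one VOC in four subjects.
skin = [5.0, 6.0, 7.0, 9.0]
breath = [1.0, 2.0, 4.0, 5.0]

t_stat, dof = paired_t(skin, breath)
r = pearson_r(skin, breath)
print(round(t_stat, 2), dof)  # 15.0 3
```

Applying `pearson_r` to every pair of VOC (or sensor) columns fills a correlation matrix like the ones shown as heat maps in Figures 1b, 1c and 2e.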

Statistical analysis (sensors array)

The sensor array originally consisted of 40 sensors. After running the samples and testing the electrical signals received, i.e., the resistance of the various sensors, 15 sensors that showed no response, irregular signal responses or noisy readings were removed. All 25 remaining sensors in the array responded to all (or to a certain subset) of the VOCs found in the exhaled breath samples and skin samples. Specific patterns and predictive models for the volatile-print of each volunteer were studied and derived from the sensor array output, using a correlation matrix based on Pearson product-moment correlations, color maps and principal component analysis (PCA). Matched pair (t-test) analysis (by subject) was performed for each sensor separately between all breath samples and all skin samples. Where applicable, the All Pairs, Tukey HSD test was performed for multiple comparisons. The variability within subjects and among subjects was checked using a correlation matrix and hierarchical clustering in JMP, version 9.0.0 (SAS Institute Inc., Cary, NC, USA, 1989–2005). Hierarchical clustering is a method that begins with each point in its own cluster. At each step, the two clusters that are closest together are combined into a single cluster. This method continues until there is only one cluster containing all the points. Moreover, the two sensing features were related to the normalized resistance change at the beginning of the exposure (with respect to the value of the sensor's resistance in vacuum prior to the exposure) and to the area under the curve (AUC) of the time-dependent resistance response during the exposure period, as described in the Nanomaterial-Based Sensor Array section.
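
The agglomerative procedure described above (start with one cluster per point, repeatedly merge the two closest clusters) can be sketched in a few lines. The distance measure and linkage rule used in the study are not stated, so this sketch assumes Euclidean distance with average linkage; the function names and the toy points are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(points, c1, c2):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(points, n_clusters=1):
    """Merge the two closest clusters until n_clusters remain.

    Returns (clusters, merges): clusters is a sorted list of sorted index
    lists; merges records (cluster_a, cluster_b, distance) for each step.
    """
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    clusters = [[i] for i in range(len(points))]  # each point starts alone
    merges = []
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(points, clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters), merges


# Toy feature vectors: two tight pairs and one isolated sample.
samples = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
three_clusters, _ = agglomerate(samples, n_clusters=3)
one_cluster, history = agglomerate(samples)
print(three_clusters)  # [[0, 1], [2, 3], [4]]
```

Stopping the merging early (here at three clusters) corresponds to cutting the dendrogram at a chosen level, which is how a fixed number of sub-clusters such as the nine in Figure 2f can be read off.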