Volatile metabolomic signature of human breast cancer cell lines

Breast cancer (BC) remains the most prevalent oncologic pathology in women, with major psychological, economic and social impacts on society. Currently available diagnostic tools have limited sensitivity and specificity. Metabolome analysis has emerged as a powerful tool for obtaining information about the biological processes that occur in organisms, and is a useful platform for discovering new biomarkers or diagnosing disease using different biofluids. Volatile organic compounds (VOCs) from the headspace of cultured BC cells and normal human mammary epithelial cells were collected by headspace solid-phase microextraction (HS-SPME) and analyzed by gas chromatography combined with mass spectrometry (GC–MS), thus defining a volatile metabolomic signature. 2-Pentanone, 2-heptanone, 3-methyl-3-buten-1-ol, ethyl acetate, ethyl propanoate and 2-methyl butanoate were detected only in cultured BC cell lines. Multivariate statistical methods were used to verify the volatomic differences between BC cell lines and normal cells in order to find a set of specific VOCs that could be associated with BC, providing comprehensive insight into VOCs as potential cancer biomarkers. Establishing the volatile fingerprint of BC cell lines is a powerful approach to finding endogenous VOCs that could be used to improve BC diagnostic tools and to explore the associated metabolomic pathways.

Scientific Reports | 7:43969 | DOI: 10.1038/srep43969
body, such as lipid peroxidation, energy metabolism through glycolysis and amino acid catabolism are common to all living cells 9. It is believed that some metabolic pathways might be up- or down-regulated in cancer cells and, therefore, metabolome analysis may reveal differences between biological samples based on metabolic profiles or fingerprints. Indeed, cancer cells have an altered metabolism compared with normal cells that may lead to the production of specific compounds 10. In recent years, several studies have reported the analysis of cancerous cell lines to find potential cancer biomarkers [11][12][13][14]. The most recent techniques include the use of nanomaterial-based sensors 11, electrochemical sensors 15, or thermal desorption coupled with gas chromatography mass spectrometry 16. However, most of these techniques are expensive and time-consuming. In this work, solid-phase microextraction in headspace mode (HS-SPME) was selected as the extraction technique. It was developed by the Pawliszyn group 17 in the early 1990s and uses a fiber coated with different polymers to extract a wide range of chemical compounds. Compared with other extraction techniques, it is rapid, easy to use and sensitive, and does not require a concentration step before analysis.
In this study, a comparative analysis of the volatile metabolomic signature of BC cell lines (T-47D, MDA-MB-231, MCF-7) and normal human mammary epithelial cells (HMEC) was carried out in order to identify BC-specific VOCs and a set of biomarkers that could potentially be correlated with VOCs released in vivo by BC cells. These findings should improve knowledge about the origin of VOCs and provide comprehensive information on their potential as BC biomarkers. This strategy can help reveal novel BC biomarkers that might expand the current understanding of this multi-factorial disease. The GC-qMS analyses allow specific identification of VOCs, while multivariate statistical analysis can differentiate and discriminate oncologic from normal cells, providing proof-of-principle for the detection of different volatile metabolomic patterns in target cells.

Results and Discussion
VOCs associated with normal breast cells (HMEC) and BC cell lines (T-47D, MDA-MB-231 and MCF-7 cells) were investigated. The cell lines for the present study were chosen based on their different molecular characteristics (Supplementary Table S1): namely, the expression of the estrogen receptor (ER), the progesterone receptor (PR), and the human epidermal growth factor receptor 2 (HER2). It is well known that BC is heterogeneous and that its prognosis and treatment depend on the molecular subtype of the cancer cells. The VOCs arising from cellular metabolism were studied using HS-SPME/GC-MS: (a) by direct analysis of the headspace of the culture flasks after cell growth (these results are hereinafter designated as "Cells"); and (b) by analysis of the volatile metabolites from the culture media at different pH values. From the analysis of the chromatograms, it was possible to identify 60 VOCs belonging to distinct chemical groups, namely, alkanes, aldehydes, ketones, acids, alcohols and benzene derivatives.
VOC signature of BC cell lines and normal breast cells. We identified twenty-six VOCs belonging to several chemical groups (Fig. 2): namely, alkanes, aldehydes, ketones, acids, alcohols and benzene derivatives (Table 1).
Of these VOCs, 13 were common to all studied breast cells (both normal and cancerous), 5 were present only in normal breast cells (HMEC), and 2 were identified only in the MCF-7 breast cell line.
As can be observed in Fig. 2, the MCF-7 cell line showed the most complex volatile metabolomic signature in terms of the number of identified VOCs and total GC peak areas compared with the other cell lines. Moreover, for all BC cell lines, the major chemical group identified was the higher alcohols, represented mainly by 2-ethyl-1-hexanol and cyclohexanol. These VOCs have already been reported in previous studies using BC cell lines 18,19 and in urine 20,21 from cancer patients. It is believed that their endogenous origin is as byproducts of hydrocarbon metabolism 22,23. The obtained data indicated that the levels of both VOCs (2-ethyl-1-hexanol and cyclohexanol) were higher in all investigated BC cells than in normal cells (HMEC). This might be due to the production of lipid peroxidation biomarkers through hydroxylation mediated by cytochrome P450 12,18.
Similar results were reported by Peled and collaborators 24 when studying genetic mutations in lung cancer cells, and by Davies and collaborators 25, who compared the volatile profile from the headspace of lung cancer cells with genetic mutations in TP53 and KRAS. Most of the identified VOCs were common to all BC cell lines and normal human mammary epithelial cells, but six of them (2-pentanone, 2-heptanone, 3-methyl-3-buten-1-ol, ethyl acetate, ethyl propanoate, and 2-methyl butanoate) were detected only in BC cell lines. This finding justifies a more detailed investigation to evaluate these six VOCs as BC biomarkers.
The influence of pH on the VOCs identified from culture media. The pH is one of the parameters that influence the extraction efficiency of VOCs and therefore requires an optimization step. This was accomplished by assessing the volatile signature obtained from culture media at different pH values. At pH 2, the MCF-7 cells had the highest total GC peak area, with acids (hexanoic acid, octanoic acid and 2-ethyl-hexanoic acid) being the dominant chemical group. Aldehydes (benzaldehyde and 3,4-dimethyl-benzaldehyde) were the predominant chemical group in the T-47D and MDA-MB-231 cells (Fig. 3; Supplementary Table S2). At pH 7, for MCF-7 cells, alkanes, ketones and alcohols were the dominant chemical groups, represented by dodecane, 2-heptanone, and 2-ethyl-1-hexanol. For the other breast target cells, alcohols (cyclohexanol) were the most representative chemical group. Finally, at pH 10 the main chemical groups identified for MCF-7 cells were alkanes, ketones and alcohols (2-ethoxy-2-methyl-propane, acetone and 2-ethyl-1-hexanol). For T-47D and MDA-MB-231 cells, alcohols, represented by cyclohexanol, made the major contribution. As previously mentioned, it is believed that cancer cells have altered metabolisms leading to different volatile metabolomic patterns. This was observed in our study, where we identified some differences between BC cell lines and normal cells (Supplementary Table S2). Several VOCs were common to all breast cell lines under all conditions, including 2-ethoxy-2-methyl-propane, acetone, 2-methyl-2-propanol, cyclohexanol, 1,3-bis(1,1-dimethylethyl)-benzene and 2-ethyl-1-hexanol, which had higher levels in BC cells. Ethyl acetate was only present in the T-47D cell line (Supplementary Table S2). Kwak and collaborators 26 described a similar study using melanoma cells and identified higher concentrations of acetone in cancer cells.
The metabolomic origin of most VOCs is still unknown, as they rely on a variety of endogenous pathways and exogenous sources. Huang and collaborators 18 reported that cyclohexanol and 2-ethyl-1-hexanol had lower concentrations in BC cells and suggested that they were generated by endogenous hydrocarbon metabolism. Hydrocarbons can be metabolized to aldehydes or ketones in the body via alcohol dehydrogenase (ADH) and cytochrome P450 activities 12. The higher activity of cytochrome P450 may explain why BC cell lines have less cyclohexanol than normal breast cells 18. It can also induce a variety of biological responses, including the biotransformation of alkanes, alkenes and aromatic compounds 27. Furthermore, Philips et al. suggested that breast diseases are associated with increases in oxidative stress and a higher activity of P450 28. Nevertheless, 2-ethyl-1-hexanol was found at higher levels in BC cells than in normal cells. According to the Human Metabolome Database, 2-ethyl-1-hexanol is involved in cell signaling, membrane integrity/stability and energy storage, and it was also detected in lung cancer cell lines 16 at increased levels when compared with the medium. At pH 10, the levels of most of the VOCs were higher in BC cells than in normal breast cells, including those of acetone, 2-pentanone, cyclohexanol, 2-ethyl-1-hexanol and acetophenone.

PCA and PLS-DA analyses of VOCs.
To verify the significance of the identified VOCs from the headspaces of the culture flasks and the cell culture media at different pH conditions, a one-way ANOVA test was applied to the data matrix. Of the identified VOCs, a total of 23 (from cultured flask headspace), 52 (from culture media at pH 2), 34 (from the culture media at pH 7) and 43 (from the culture media at pH 10) showed significant differences (p < 0.05) (Supplementary Fig. S1). Principal component analysis (PCA) was performed for each condition to identify variables that differentiate the VOC pattern of the HMEC cells from those of the BC cell lines (MCF-7, T-47D and MDA-MB-231 cells), and from the VOC patterns obtained from the culture media at different pH values (Fig. 4). The statistical data summary of the PCAs is given in Supplementary Table S3. The differentiation between the above conditions is shown in plots of the first two principal components of the PCA. PCA is an unsupervised projection method used to visualize a dataset and display the similarities and differences between groups; in this case, it demonstrated that the variables used (scaled by standard deviation) were sufficient to describe subsets with similar characteristics.
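As a rough illustration of the PCA step described above, the sketch below scales each variable to unit standard deviation and extracts only the first principal component by power iteration. The peak-area values, the function names and the choice of power iteration are assumptions made for illustration; they are not the study's data or its software workflow.

```python
import math

def autoscale(rows):
    """Center each column and scale it to unit (sample) standard deviation."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def first_component(scaled, iterations=500):
    """Return (loadings, scores, explained_fraction) for PC1 via power iteration."""
    n, p = len(scaled), len(scaled[0])
    # Correlation matrix of the autoscaled data (p x p).
    corr = [[sum(r[i] * r[j] for r in scaled) / (n - 1) for j in range(p)]
            for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue; the trace of corr equals p.
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p))
                     for i in range(p))
    scores = [sum(a * b for a, b in zip(row, v)) for row in scaled]
    return v, scores, eigenvalue / p

# Hypothetical peak areas (rows: samples, columns: three VOCs).
# The first three rows stand in for normal cells, the last three for a BC line.
areas = [
    [1.0, 2.0, 5.0], [1.2, 2.1, 4.8], [0.9, 1.9, 5.2],
    [3.0, 4.0, 1.0], [3.2, 4.2, 0.8], [2.9, 3.9, 1.1],
]
scaled = autoscale(areas)
loadings, scores, explained = first_component(scaled)
```

With data this cleanly separated, the PC1 scores of the two hypothetical groups fall on opposite sides of zero, which is the kind of grouping a score plot makes visible. A full analysis would also extract PC2 and would need to handle variables with zero variance.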
These results demonstrated that the scores from the cancer cell lines and those from the normal breast cells exhibited separate trends in the plots. Figure 4A shows the loading scatterplots of the PCA obtained from the analysis of the VOCs in the headspace of cultured flasks. Three groups were formed: HMEC cells were clearly differentiated from the BC cell lines, separating from MCF-7 mainly along PC1 and from T-47D and MDA-MB-231 along PC2. Interestingly, the BC cell lines formed two separate groups according to molecular subtype (luminal A versus triple negative). However, no differentiation was achieved between the T-47D and MDA-MB-231 cell lines, which formed a single group, perhaps because they have similar molecular characteristics. The variables that explain the differentiation between cell lines are represented in [...]. Concerning the other tested conditions, the loading scatterplots of the PCA obtained from the analysis of VOCs from cell culture media at pH 2, pH 7 and pH 10, and the respective influence of the variables, are shown in Supplementary Fig. S2. Surprisingly, four groups encompassing all the breast cell lines under study were formed at pH 2 and pH 7, where MCF-7 was differentiated from the other cell lines mainly along PC1, while HMEC, T-47D and MDA-MB-231 were differentiated from each other along PC2.
At pH 10, the pattern of differentiation between cell lines was similar to that obtained for the headspace, with 3 groups (HMEC, T-47D/MDA-MB-231 and MCF-7) being formed. The differentiation between cell lines obtained at pH 2 and pH 7 may be due to changes in the molecular components released under more acidic conditions than those normally present in the culture medium (pH 7.3). However, to differentiate and discriminate between normal and oncological breast cell lines based on VOCs emitted under conditions as close as possible to those in human tissues, partial least squares analysis (PLS) and linear discriminant analysis (LDA) were performed only with data from the headspace of cell cultures. The statistical data summaries of PLS and LDA are given in Supplementary Tables S5 and S6, respectively. Sample classification by PLS showed that the differentiation between cell lines was explained by a single component. The PLS loading lineplot is presented in Fig. 5, in which four centroids can be observed, one for each cell line.
Similar to the PCA results, the HMEC centroid was clearly differentiated from the oncologic breast cell lines, and MCF-7 (triple negative type) was distinguished from the two luminal A type cell lines. On the other hand, T-47D and MDA-MB-231 remained very close to each other, with PLS values of 0.0588 and 0.0037, respectively. Regarding the influence of the variables on the PLS values of the cell lines, HMEC was highly influenced by 4-methyl-heptane, tetradecane, benzaldehyde and acetophenone; T-47D and MDA-MB-231 were influenced by cyclohexanone, 1,2,4-trimethylbenzene, ethylbenzene and 1,3-dimethylbenzene; and MCF-7 was influenced by the remaining VOCs.
Linear discriminant analysis (LDA) was applied as a supervised pattern recognition method in order to statistically discriminate the cell lines under study, with samples grouped according to molecular type as follows: N (HMEC), BL (T-47D and MDA-MB-231) and BTN (MCF-7). The LDA scatterplot of the cell line classification according to the canonical functions is shown in Fig. 6.
The cell line samples formed three clearly defined groups with a classification rate of 100%. Recognition ability, calculated as the percentage of members of the data set that were correctly classified, and prediction ability, calculated as the percentage of members that were correctly classified under leave-one-out cross-validation, were 100% in all cases. After applying LDA with backward removal (p < 0.05) of variables, only two VOCs proved to be significant for discrimination between the three previously defined groups, namely 1,2,4-trimethylbenzene and benzaldehyde. These compounds have already been identified in cancer cell studies by Brunner et al. 29 using PTR-MS, by Filipiak et al. 16 in lung cancer cells and by Mochalski et al. 30 in human hepatocellular carcinoma cells, where an increase in the release of this compound was observed. Moreover, these two VOCs appear to be promising biomarkers because they achieve a successful discriminant classification of samples according to the molecular type of the breast cell lines, demonstrating that the volatile metabolomic signature of breast cells can be a useful approach to identify potential BC biomarkers for early diagnosis of BC.
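The difference between recognition ability and prediction ability can be sketched in a few lines. The example below deliberately swaps LDA for a much simpler nearest-centroid classifier, so it illustrates only the two evaluation schemes (resubstitution versus leave-one-out), not the discriminant model itself. The group labels N, BL and BTN follow the text; the feature values and function names are hypothetical.

```python
def centroid(rows):
    """Mean vector of a list of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit(samples, labels):
    """Nearest-centroid 'model': one mean vector per class (a stand-in for LDA)."""
    groups = {}
    for sample, label in zip(samples, labels):
        groups.setdefault(label, []).append(sample)
    return {label: centroid(rows) for label, rows in groups.items()}

def predict(sample, model):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda label: dist2(sample, model[label]))

def recognition_ability(samples, labels):
    """Percent correctly classified when the model is fit on all samples."""
    model = fit(samples, labels)
    hits = sum(predict(s, model) == l for s, l in zip(samples, labels))
    return 100.0 * hits / len(samples)

def prediction_ability(samples, labels):
    """Percent correctly classified under leave-one-out cross-validation."""
    hits = 0
    for i, (s, l) in enumerate(zip(samples, labels)):
        train_s = samples[:i] + samples[i + 1:]
        train_l = labels[:i] + labels[i + 1:]
        hits += predict(s, fit(train_s, train_l)) == l
    return 100.0 * hits / len(samples)

# Hypothetical scaled peak areas for two variables (not the study's data).
samples = [
    [1.0, 5.0], [1.1, 5.2], [0.9, 4.9],                          # N
    [4.0, 1.0], [4.2, 1.1], [3.9, 0.9],
    [4.1, 1.2], [4.0, 0.8], [3.8, 1.0],                          # BL
    [8.0, 3.0], [8.2, 3.1], [7.9, 2.9],                          # BTN
]
labels = ["N"] * 3 + ["BL"] * 6 + ["BTN"] * 3
recognition = recognition_ability(samples, labels)
prediction = prediction_ability(samples, labels)
```

Because each held-out sample is classified by a model that never saw it, the leave-one-out figure is the more conservative of the two; the values coincide only when the groups are well separated, as in this toy data.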

Conclusions
This study demonstrated that HS-SPME/GC-MS is a simple, rapid, sensitive and solvent-free method that can be used to establish the volatile metabolomic patterns of normal and cancer breast cells. In addition, this study showed the potential of screening the in vitro VOCs associated with BC to identify potential volatile biomarkers to be used in early diagnosis. The headspace of culture media of normal and cancer cell lines was analyzed at different pH conditions. Sixty VOCs were identified as belonging to several chemical groups: namely, alkanes, aldehydes, ketones, acids, alcohols and benzene derivatives. Most of the identified VOCs are common to all BC cell lines and normal human mammary epithelial cells, but six of them, 2-pentanone, 2-heptanone, 3-methyl-3-buten-1-ol, ethyl acetate, ethyl propanoate, and 2-methyl butanoate, were detected only in the headspace of cancer cell lines. Multivariate statistical data obtained in this study revealed that combining in vitro assays with HS-SPME/GC-MS is a useful strategy to differentiate and discriminate the volatile metabolomic signature of normal cells and BC cell lines according to molecular type, thus contributing to the discovery of novel biomarkers of BC and investigations of the related metabolomic pathways, and thereby improving the diagnostic tools for BC.

Methods
Materials and reagents. Phosphate buffered saline (PBS) was purchased from Sigma-Aldrich (St. Louis, MO, USA), sodium chloride was obtained from Panreac (Barcelona, Spain), and the SPME holder for manual sampling of the SPME fiber [50/30 μm divinylbenzene/carboxen/polydimethylsiloxane (DVB/CAR/PDMS)] and the glass vials were purchased from Supelco (Bellefonte, PA, USA). The SPME fiber was conditioned according to the manufacturer's instructions. Before each daily analysis, the fiber was conditioned for 10 min in the injector port to prevent carryover. T75 glass flasks were purchased from Ningbo (Ja-Hely Technology, China).
Cell lines and cultivation conditions. The T-47D cell line was grown in 85% RPMI 1640 supplemented with 15% fetal bovine serum (FBS), 1% Antibiotic-Antimycotic solution and 10 μg/mL human insulin, while the MDA-MB-231 cell line was grown in 85% RPMI 1640 supplemented with 15% fetal bovine serum (FBS) and 1% Antibiotic-Antimycotic solution. Human mammary epithelial cells (HMEC) were purchased from Life Technologies (Gibco®) and grown in HUMEC serum-free medium supplemented with 20 μg/mL of Antibiotic-Antimycotic solution (Life Technologies, Gibco®). All cells were incubated in a humidified atmosphere containing 5% CO2 and 95% air at 37 °C. Culture media were changed every 2 days and the cultures were passaged with 0.25% trypsin-EDTA (Life Technologies, Gibco®) when 80% confluence was achieved.
VOC extraction from cell cultures. To extract VOCs from cell cultures, glass flasks were treated with collagen to promote cell adherence. Briefly, the glass flasks were covered with a collagen solution (0.2 mg/mL) for 30 min and then washed with PBS (3 times). The cells were then cultured in the T75 flasks for 48 h. After this period, volatile metabolites were extracted using a 50/30 μm divinylbenzene/carboxen/polydimethylsiloxane (DVB/CAR/PDMS) SPME fiber exposed in the headspace of the flasks for 45 min at 37 °C, followed by injection into the GC injection port for 10 min to allow the desorption of VOCs from the fiber. After these extractions, cell-free aliquots were collected from the flasks holding 10 mL of the culture medium with growing cells. They were centrifuged to remove any suspended cells, and then 1 mL aliquots were adjusted to pH 2, 7 or 10 with 1 M HCl or 1 M NaOH 26. After the addition of 200 mg NaCl and subsequent stirring (0.5 mm × 0.1 mm bar) at 800 rpm, the vials were capped with PTFE septa through which the SPME fiber was inserted into the headspace of the vial, and the vials were placed in a thermostatic bath at 37 °C for 45 min. After this, the fiber was withdrawn into the needle and inserted in the GC port (250 °C) for 10 min, during which the analytes were thermally desorbed and transferred to the analytical column. Control headspace samples were also collected from flasks containing only media, treated under the same incubation conditions, to determine the contribution of the background. The analyses were performed in triplicate.
GC-MS analysis. VOCs in the headspace were analyzed using an Agilent Technologies 6890N Network gas chromatograph system (Palo Alto, CA, USA) equipped with a BP-20 fused silica column (60 m × 0.25 mm I.D. × 0.25 μm film thickness, SGE, Dortmund, Germany) interfaced with an Agilent 5975 quadrupole inert mass selective detector. The following oven temperature profile was set: (a) 5 min at 45 °C; (b) increase temperature until 150 °C, at a rate of 2 °C min⁻¹ (hold for 10 min); (c) 150 °C for 10 min; (d) increase temperature until 220 °C, at a rate of 7 °C min⁻¹; and (e) 220 °C for 10 min. Column flow was constant at 1.0 mL/min using helium (He, N60, Air Liquide, Portugal) as the carrier gas. The injection port was maintained at 250 °C and operated in the splitless mode. Regarding MS analyses, the operating temperatures of the transfer line, quadrupole and ionization source were 270, 150 and 230 °C, respectively. The electron impact mass spectra were recorded at 70 eV with an ionization current of 10 μA, and data acquisition was performed in scan mode (30-200 m/z). Metabolites were identified by comparing their mass spectra with the NIST05 mass spectral library in the Agilent MS ChemStation Software (Palo Alto, CA, USA), using a similarity threshold higher than 80%, or with commercial standards when available. All experiments were performed in triplicate and the results were expressed as the mean ± standard deviation.
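The oven program above implies a fixed chromatographic run time, which can be worked out from the hold times and ramp rates. The short sketch below does that arithmetic. It assumes that the "hold for 10 min" in step (b) and step (c) describe the same 10 min hold at 150 °C; the function name and the segment encoding are illustrative only.

```python
def program_minutes(start_temp, segments):
    """Total duration of an oven program given hold and ramp segments."""
    temp, total = start_temp, 0.0
    for segment in segments:
        if segment[0] == "hold":
            total += segment[1]                  # ("hold", minutes)
        else:
            _, target, rate = segment            # ("ramp", target_C, C_per_min)
            total += (target - temp) / rate
            temp = target
    return total

# Steps (a) to (e), with (b)'s hold and (c) counted once (assumption).
segments = [
    ("hold", 5),          # (a) 5 min at 45 C
    ("ramp", 150, 2),     # (b) 45 -> 150 C at 2 C/min
    ("hold", 10),         # (c) 150 C for 10 min
    ("ramp", 220, 7),     # (d) 150 -> 220 C at 7 C/min
    ("hold", 10),         # (e) 220 C for 10 min
]
total_minutes = program_minutes(45, segments)
```

Under that assumption the two ramps take 52.5 min and 10 min, and the three holds add 25 min, for a run of 87.5 min per injection. If steps (b) and (c) were in fact two separate holds, the total would be 10 min longer.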
Statistical analysis. Statistical tests were performed using the StatSoft STATISTICA 12.0 (2013) software (Tulsa, OK, USA). Differences in VOCs between groups were tested with one-way ANOVA, and p < 0.05 was considered statistically significant. PCA, PLS and LDA were carried out on VOCs selected by ANOVA to evaluate differences among the studied groups. PCA was performed in order to differentiate the samples under study without prior classification, and PLS was used as a supervised linear pattern recognition algorithm for classifying the samples. PCA and PLS were performed with variables scaled to unit standard deviation, with a convergence criterion of 0.0001 and leave-one-out cross-validation to confirm accuracy. For LDA, a backward selection method was used with p < 0.05 according to the Wilks test. A leave-one-out strategy was used for cross-validation.
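The one-way ANOVA screening step can be illustrated with a small worked example that computes the F statistic for a single VOC across four cell lines measured in triplicate. The peak-area values and the function name are hypothetical; the p-value would then be read from the F distribution with the returned degrees of freedom, for example with a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the replicates around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical peak areas of one VOC, three replicates per cell line.
peak_areas = {
    "HMEC": [1.0, 2.0, 3.0],
    "T-47D": [2.0, 3.0, 4.0],
    "MDA-MB-231": [2.0, 3.0, 4.0],
    "MCF-7": [5.0, 6.0, 7.0],
}
f_stat, df_between, df_within = one_way_anova_f(list(peak_areas.values()))
```

For these values the between-group mean square is 9 and the within-group mean square is 1, so F = 9 with (3, 8) degrees of freedom. Repeating the calculation for each VOC and keeping those with p < 0.05 mirrors the variable selection described above.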