Metabolic Profiling of Volatile Organic Compounds (VOCs) Emitted by the Pathogens Francisella tularensis and Bacillus anthracis in Liquid Culture

We conducted comprehensive (untargeted) metabolic profiling of volatile organic compounds (VOCs) emitted in culture by bacterial taxa Francisella tularensis (F. tularensis) subspecies novicida and Bacillus anthracis (B. anthracis) Sterne, surrogates for potential bacterial bioterrorism agents, as well as selective measurements of VOCs from their fully virulent counterparts, F. tularensis subspecies tularensis strain SCHU S4 and B. anthracis Ames. F. tularensis and B. anthracis were grown in liquid broth for time periods that covered logarithmic growth, stationary, and decline phases. VOCs emitted over the course of the growth phases were collected from the headspace above the cultures using solid phase microextraction (SPME) and were analyzed using gas chromatography-mass spectrometry (GC-MS). We developed criteria for distinguishing VOCs originating from bacteria versus background VOCs (originating from growth-media-only controls or sampling devices). Analyses of collected VOCs revealed methyl ketones, alcohols, esters, carboxylic acids, and nitrogen- and sulfur-containing compounds that were present in the bacterial cultures and absent (or present at only low abundance) in control samples, indicating that these compounds originated from the bacteria. Distinct VOC profiles were observed for F. tularensis when compared with B. anthracis, while the observed profiles of each of the two F. tularensis and B. anthracis strains exhibited some similarities. Furthermore, the relative abundance of VOCs was influenced by bacterial growth phase. These data illustrate the potential for VOC profiles to distinguish pathogens at the genus and species level and to discriminate bacterial growth phases. The determination of VOC profiles lays the groundwork for non-invasive probes of bacterial metabolism and offers prospects for detection of microbe-specific VOC biomarkers from two potential biowarfare agents.


Results
The complexity of chromatographic peaks detected through GC-MS analysis of each sample is illustrated by an observed profile of Ft novicida sampled 24 hours post-inoculation, representing the early stationary phase (Fig. 1a). However, many peaks originated from the Mueller-Hinton media and the SPME sampling device (Fig. 1b) and were considered background. Peaks representative of the bacterial signature were of lower relative abundance, highlighted on a smaller chromatographic scale in Fig. 1c. This complexity was similar for Ft SCHUS4 and both B. anthracis taxa (not pictured).
Detection of thousands of volatile compounds in the various cultures and timepoints for each of the taxa (Table 1) necessitated specified data-filtering criteria (see Methods) for quality control purposes. For example, more than 2000 VOCs were detected across all Ft novicida samples. Eliminating VOCs that did not appear in at least two of the triplicate measurements (Criterion 1) narrowed the dataset to 121 VOCs, a reduction of approximately 95% (Fig. 2). Further elimination of VOCs with relative abundances less than 10x the average relative abundance in the negative controls (Criterion 2) narrowed the dataset to 18 putative volatile biomarkers that were confidently attributed to Ft novicida. The same criteria were applied to the data from other bacterial species studied here, resulting in 38 putative VOC biomarkers for Ft SCHUS4, 30 biomarkers in Ba Sterne, and 56 biomarkers in Ba Ames (Table 1).
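The two filtering criteria described above can be sketched in Python as follows. This is an illustration only and not the software workflow used in this study: the function name, the data layout (one list of replicate abundances per compound), and the choice to average only the replicates in which a compound was detected are assumptions, and the example values are invented.

```python
from statistics import mean


def filter_candidates(samples, controls, min_replicates=2, fold=10.0):
    """Apply two filters to candidate VOCs.

    samples:  dict mapping compound -> replicate abundances
              (None or 0 means "not detected in that replicate").
    controls: dict mapping compound -> abundances in negative controls.
    Criterion 1: detected in at least `min_replicates` replicates.
    Criterion 2: mean abundance is at least `fold` times the control mean.
    """
    kept = []
    for compound, replicates in samples.items():
        detected = [a for a in replicates if a is not None and a > 0]
        if len(detected) < min_replicates:
            continue  # fails Criterion 1
        control_values = controls.get(compound, [])
        control_mean = mean(control_values) if control_values else 0.0
        if mean(detected) >= fold * control_mean:
            kept.append(compound)  # passes Criterion 2
    return kept


# Invented abundances (arbitrary units) for three hypothetical compounds.
samples = {
    "voc_a": [120.0, 110.0, 0.0],   # 2 of 3 replicates, far above control
    "voc_b": [50.0, 0.0, 0.0],      # only 1 of 3 replicates
    "voc_c": [30.0, 28.0, 35.0],    # detected, but under 10x the control
}
controls = {"voc_a": [5.0, 7.0], "voc_c": [10.0, 12.0]}
putative_markers = filter_candidates(samples, controls)
```

In this toy input only the first compound survives both criteria. A compound absent from the controls is treated here as having a control mean of zero, which is one possible convention among several.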
Results from RG2 species. Candidate bacterial VOC biomarkers from all timepoints were annotated through examination of both mass spectral library matching scores using the NIST14 database and experimental retention indices. Since all metabolite annotations in this report are based on comparisons to literature spectra and retention index values, they should be considered as satisfying confidence level 2 of the Metabolomics Standards Initiative recommendations for identification of compounds 25. For Ft novicida (Table 2), 15 of the 18 biomarkers passed the set threshold of 70% match, while three were labeled as "unknowns" owing to poorer matches below that threshold. For Ba Sterne (Table 3), 18 of the 30 biomarkers passed the set threshold of 70% match, while the remainder were labeled "unknowns". Inter-species diversity in emitted VOC biomarkers was observed. The Ft novicida profile contains odd-chain, aliphatic methyl ketones, alcohols, nitrogen-containing, and sulfur-containing volatiles. The Ba Sterne volatile profile is comprised of branched methyl ketones, followed by esters, carboxylic acids, alcohols, and sulfur-containing volatiles.
Evaluation of potential markers requires assessment of the growth phase at each sampled timepoint post-inoculation of the culture flask. The logarithmic, stationary, and decline phases were identified based upon CFU measurements taken alongside SPME-VOC sampling. The data for both RG2 species, Ft novicida and Ba Sterne, are presented in Fig. 3. Logarithmic or "Log" phase, characterized by exponential bacterial growth, was observed to last for 20 hours and 8 hours, respectively. The bacterial counts rose approximately 3 orders of magnitude for both species, peaking at 1-2 × 10^9 CFU/mL for Ft novicida and 5 × 10^8 CFU/mL for Ba Sterne. For Ft novicida (Fig. 3a) the observed growth during that phase appeared rather variable. Ft cultures are known to be difficult to grow. Sampling more replicates may improve statistical confidence in future experiments. Stationary phase, occurring when the bacteria exhibit no additional growth due to a depleted nutrient source, was observed in both species. Ba Sterne measurements were completed at 24 hours post-inoculation while still in stationary phase. For Ft novicida, further growth phase changes were observed through a decline in viable bacterial growth to ~1 × 10^6 CFU/mL at 32 h and no observable growth at the 48 h and 52 h post inoculation. The limit of detection for concentrations of viable bacteria was less than 1000 CFU/mL. Regardless, the CFU/mL counts were fairly consistent across triplicate measurements in both taxa, allowing assessments of growth phase.
The observed profiles of the VOC biomarkers varied with growth phase. The averaged relative abundances of biomarkers from Ft novicida and Ba Sterne are listed across all measured timepoints and grouped by compound class (Tables 4 and 5; Supplemental Tables 1 and 2). The mean combined chromatographic peak areas of each marker compound class measured at each time point for the two RG2 species are shown in Fig. 4. As can be seen in Fig. 4, the relative contributions for each of the marker compound classes to the total VOC biomarker signal evolve over time. Of the tentatively identified markers, chemical diversity was observed in the presence of ketones, aldehydes, alcohols, esters, carboxylic acids, nitrogen- or sulfur-containing markers, and alkanes. Although more biomarkers were detected for Ba Sterne (30) versus Ft novicida (18), the combined peak areas (total signal) of markers from Ft novicida at its peak growth (32 hours post-inoculation, stationary phase) were approximately 5x the total combined peak area of Ba Sterne VOCs at its peak growth (8 hours, logarithmic phase), attributed to the ~10x higher concentration of bacteria (compare Figs. 3 and 4). There was only a moderate correlation between combined marker peak areas and the bacteria concentration at any single timepoint. Peak areas and bacterial counts rose during the logarithmic phase for both species, but cumulative peak areas were stagnant or dropped during stationary phase despite bacterial concentration remaining steady.
The biomarker peak areas for Ft novicida steadily increased throughout the logarithmic and stationary phases before decreasing during the decline phase. Alcohols steadily rose in relative abundance throughout the log phase and were dominant in the early log and log phases. While some alcohols persisted throughout the entire study, several were fully depleted at the longest timepoints measured (see 2-nonanol and 2-undecanol). Linear, odd-chain methyl ketones (or 2-ketones) were present throughout all growth phases, with ketones consisting of more than 13 carbons (longer than 2-tridecanone) being present only in the stationary phase and beyond. The contribution of methyl ketones peaked in stationary phase growth, and their decrease in the decline phase lowered total VOC relative abundances. Nitrogen-containing markers were present throughout the analysis of F. tularensis species due to the presence of 2,5-dimethylpyrazine, a marker that was a component of the growth media. However, the signal emitted from the bacterial cultures first exceeded 10x the signal in the media control at the 20-hour timepoint, prompting its inclusion as a potential F. tularensis marker. Combined with the signal from 2-methyl-3-isopropylpyrazine, nitrogen-containing markers comprised almost 70% of the chemical profile for the decline phase. Finally, Ft novicida noticeably displayed a large signal of dimethyltrisulfide as an abundant marker in the decline phase, comprising 7-8.5% of the total VOC signal.
Biomarker areas for Ba Sterne also changed dependent on growth phase, though fewer timepoints were measured compared to Ft novicida. Esters, carboxylic acids, and alcohols comprised a significant portion of the logarithmic phase VOC marker profiles. Esters were based on butanoic or propanoic acids, with methyl groups at the 2 or 3-carbon positions. Two carboxylic acids were also based on butanoic and propanoic acids, both methylated at the 2-carbon position. Alcohols were only present in the early log phase. The non-detection of these markers during the stationary phases (with the exception of 2-methyl-propanoic acid in early stationary phase) suggests use as precursors for further synthesis. Relative abundances of methyl ketones significantly increased during stationary phase. Moreover, while Ft novicida was dominated by straight-chain aliphatics, the methyl ketones in Ba Sterne contained methyl and aromatic substituents.
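The compound-class contributions quoted above (for example, a class comprising a given percentage of the total VOC signal) are fractions of the combined marker peak area at one timepoint. A minimal Python sketch of that arithmetic is given here for illustration. The function name and the peak-area values are hypothetical and are not taken from Tables 4 and 5.

```python
def class_shares(peak_area_by_class):
    """Return each compound class's fraction of the total marker signal."""
    total = sum(peak_area_by_class.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: area / total for name, area in peak_area_by_class.items()}


# Invented combined peak areas (arbitrary units) for a single timepoint.
example_timepoint = {
    "methyl ketones": 2.0e6,
    "alcohols": 0.5e6,
    "nitrogen-containing": 7.0e6,
    "sulfur-containing": 0.5e6,
}
shares = class_shares(example_timepoint)
```

With these invented numbers the nitrogen-containing class accounts for 70% of the total. Because the shares are normalized to the summed marker signal, they describe the composition of the profile and say nothing about absolute VOC amounts.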
Levels of VOC biomarkers for Ft novicida and Ba Sterne were subjected to principal component analysis (PCA) to visualize VOC profiles observed at different growth phases. The scores plots in Fig. 5a,b show clustering of all three culture replicates of the respective strains. The loadings plots, depicting the relative importance of individual markers towards sample positioning on PCs 1 and 2, are described in greater detail in Supplemental Fig. 1a,b. The PCA grouped timepoints into distinct clusters, similar to the groupings determined in Fig. 4a,b for each species. The PCA scores plot provides additional verification of the similarity of VOCs from culture replicates: profiles from each timepoint (same color) were positioned more closely to each other than to replicates of an adjacent timepoint, demonstrating fairly reproducible VOC marker profiles.
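To make the idea behind the scores plot concrete, the following Python sketch computes the first principal component of a small sample-by-marker matrix and projects each sample onto it. It is an illustration under stated assumptions, not the analysis performed for Fig. 5: it uses plain power iteration on the covariance matrix rather than the commercial software used in this study, it applies no scaling or log transformation (the preprocessing is not specified in this section), and all function names and data values are hypothetical.

```python
import math


def center(rows):
    """Subtract the column mean from every value (rows = samples)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_component(cov, iterations=500):
    """Leading eigenvector of `cov` by power iteration (unit length)."""
    p = len(cov)
    v = [1.0 / (i + 1) for i in range(p)]  # arbitrary non-symmetric start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # start vector carried no variance; give up
        v = [x / norm for x in w]
    return v


def project(centered, v):
    """Score of each sample on the component `v`."""
    return [sum(x * y for x, y in zip(row, v)) for row in centered]


# Invented abundances: 4 samples x 2 markers lying on one straight line.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = center(data)
pc1 = first_component(covariance(centered))
pc1_scores = project(centered, pc1)
```

Samples with similar marker profiles receive similar scores, which is why replicates of one timepoint cluster together in a scores plot. Power iteration can fail if the start vector happens to be orthogonal to the leading component, so a production analysis would normally rely on an established eigenvalue or SVD routine.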
Results from RG3 species. Determination of bacterial concentrations and VOC sampling of the RG3 pathogens grown in our BSL-3 laboratory were performed at select time points, as shown in Supplemental Table 3 for both Ft SCHUS4 and Ba Ames. Ba Ames exhibited growth throughout 24 hours, with bacterial concentrations rising to 2.5 × 10^7 CFU/mL at the 24 h time point. Meanwhile, the Ft SCHUS4 concentration remained stagnant at around 6.7 × 10^5 CFU/mL throughout 24 hours of culture, and the culture is hypothesized to have remained in a lag phase after inoculation.
The decontamination protocols developed for our work in the BSL-3 laboratory on both Ft SCHUS4 and Ba Ames included wiping the SPME fiber exterior casing using bleach (see Supplemental Information). There was a potential for VOCs adsorbed on the internal fibers to be inadvertently oxidized. However, comparison of the VOC profiles from B. anthracis taxa obtained using the BSL-2 protocol without bleach wiping and the BSL-3 protocol that included the bleach wiping revealed a range of similar markers and/or compound classes, with no evidence of oxidized by-products for the profiles obtained using the BSL-3 protocol.
The VOC marker profile of Ba Ames displayed similarities to its RG2 counterpart Ba Sterne and is detailed in Table 7, where 18 of the 56 putative markers were identified. The VOC marker profile at the 6-hour timepoint of Ba Ames resembled the logarithmic VOC marker profile of Ba Sterne. Esters were the most abundant identified markers, consisting of propanoic and butanoic acid esters. Five esters were shared between both B. anthracis taxa. Subsequent compound classes included methyl ketones and carboxylic acids. Conversely, the VOC marker profile at the 24-hour timepoint of Ba Ames more closely resembled the stationary VOC marker profile of Ba Sterne. Methyl ketones were the dominant markers, while all esters had been depleted. Four methyl ketones were shared between B. anthracis taxa. Principal component analysis (PCA) of the level of VOC biomarkers was also applied to Ba Ames to visualize VOC profiles at different growth phases. Similar to Ba Sterne, it appears that different chemical profiles can be associated with different growth phases of Ba Ames (not shown here). However, more data points would be needed to draw stronger conclusions.
Conversely, the profile of VOCs from Ft SCHUS4 (Table 6) had fewer similarities with Ft novicida. The majority of putative markers for Ft SCHUS4 were classified as unknowns, with only 5 markers passing the conservative identification criteria. While none of the observed compounds passing the filtering criteria were shared between the two taxa, the 6 and 24-hour timepoints both contained alcohols such as 4-methyl-3-heptanol and 1-dodecanol. Alcohols were also the dominant class of the logarithmic phase for Ft novicida. The lack of compound class similarities for determined markers could result from genetic differences between Ft SCHUS4 and Ft novicida. However, the bacterial concentration data shown in Supplemental Table 3 suggest that Ft SCHUS4 remained in a lag phase or a very early logarithmic phase throughout the first 24 h after inoculation. Additional measurements of the growth phases of Ft SCHUS4 over longer time periods are needed for a more comprehensive comparison of Ft SCHUS4 VOC markers against those of Ft novicida.

Discussion
The methodology and results described here provide initial groundwork for detection and identification of volatile biomarkers from bacterial pathogens including fully virulent RG3 strains. The application of this non-invasive methodology for VOC profiling applied to actively growing F. tularensis and B. anthracis bacterial cultures revealed dynamic profiles, influenced by both the bacterial growth phase and bacterial concentration. At any given timepoint, isolation of the bacterial biomarkers was complicated by background volatiles, and data processing was applied uniformly across all sample types to identify bacterial biomarkers.
As we discuss the VOC profiles observed here, one should keep in mind that measured VOC profiles are influenced by the sampling and detection methods used. For example, the type of sorbent material used can introduce a sampling bias (sampling efficiency is dependent on partition behavior of each compound), and the sampling time and mass spectral analysis method influence the sensitivity with which compounds can be detected. Generally, detection limits for the basic type of SPME-GC-quadrupole MS used here for untargeted analysis (scan, not selected ion mode) are on the order of 1 ng of a compound injected into a column. It is conceivable that we have detected only the most prevalent VOCs and that a larger number of relevant VOC biomarkers may be found if more efficient sampling techniques (e.g. thermal desorption tubes) and more sensitive mass spectrometry protocols, such as selected ion monitoring, are used. We explicitly note that absolute VOC quantification was not attempted here. Absolute quantification of VOCs in entire cultures is challenging for practical reasons, in part because the stable isotope-labeled internal standards that one would want to use are susceptible to metabolic degradation. Also, our cultures were not fully enclosed (we used flasks with vented caps) because they required gas exchange (oxygen) to sustain growth and needed to be vented to avoid buildup of pressure. However, relative abundances of compounds were compared among cultures by integrating chromatographic peaks across species and timepoints. The cumulative VOC profiles of Ft novicida, Ft SCHUS4, Ba Sterne, and Ba Ames determined here included representatives of different compound classes such as methyl ketones, alcohols, nitrogen-containing compounds, sulfur-containing compounds, carboxylic acids, esters, and various unidentified biomarkers.
Exhaustive identification of every pathway that produces these volatiles is beyond the scope of this discussion, but several likely routes of biosynthesis are enumerated below.
Ketones. Ketones were abundant markers, present in all pathogens except Ft SCHUS4, and largely as methyl ketones. The methyl ketones are likely formed by modifying products of the fatty acid biosynthesis pathway, specifically the β-oxidation of fatty acids 26 . Odd-chain methyl ketones can be formed through the decarboxylation of even-carbon β-keto acids. Conversely, even-carbon methyl ketones arise from odd-carbon fatty acids and occur with lower frequency 27 .
Interestingly, methyl ketones with straight-chain alkane branches were abundant in Ft novicida, while primarily branched and aromatic methyl ketones were prevalent in Ba Sterne and Ames. This difference may stem from B. anthracis being Gram-positive, whereas F. tularensis is Gram-negative. Synthesis of fatty acids in Gram-positive and Gram-negative bacteria is controlled by enzymes with different preferred substrates. For example, comparison of the enzyme β-ketoacyl-acyl carrier protein synthase III from Gram-negative E. coli and Gram-positive
Alcohols. While alcohols were present for both F. tularensis and B. anthracis, their number and relative abundances were greater in F. tularensis. Alcohols may be synthesized from the breakdown products of β-oxidation of fatty acids, for example after enzymatic reduction of carboxylic acids 27,29. The fatty acid chains observed for the alcohol class exhibited diversity, including straight-chain, branched chain, and aromatic substituents. 1-nonanol was likely formed by reduction of the fatty acid. The 2-alkanols (2-nonanol and 2-undecanol) are postulated to be derived from corresponding methyl ketones as reduced intermediates, since the corresponding methyl ketones were also detected at all timepoints where the 2-alkanols were detected, usually at a higher relative abundance. The aromatic alcohol phenylethyl alcohol is a widely occurring VOC produced by several bacterial species. Volatile alcohols have been shown to play a role in growth inhibition of several bacteria and fungi 30,31.
Sulfur-containing compounds. Dimethyltrisulfide was an abundant VOC uniquely present in Ft novicida during the decline phase, when no viable bacteria were detected. Sulfur-containing VOCs are attributed to breakdown of the amino acids cysteine and methionine 27. Dimethyltrisulfide has previously been observed as a product of human decomposition caused by bacteria.
Nitrogen-containing compounds. The detection of nitrogen-containing pyrazine markers produced by bacteria is complicated by endogenous pyrazine VOCs present in the growth media 27. The sterilization of growth media through autoclaving heats amino acids and reducing sugars, producing pyrazines via the Maillard reaction 32. The Mueller-Hinton growth media controls consistently produced 2,5-dimethylpyrazine, and a similar relative abundance was observed in the Ft novicida cultures through the first 16 h of growth. At 20 h of growth and beyond, the relative abundance of 2,5-dimethylpyrazine rose to more than 10x the relative abundance of the controls, suggesting the growing bacteria have active involvement in biosynthesis of pyrazines. An additional pyrazine, 2-methyl-3-isopropylpyrazine, was also observed. Isopropyl substituents to pyrazine compounds are not common constituents in bacterial volatiles 27. Therefore, we hypothesize both pyrazines originate from Ft novicida under the chosen growth conditions. Pyrazines have also been observed as volatile byproducts of bacterial metabolism, for example, in the genera Streptomyces 33 and Bacillus 34. Only one nitrogen-containing compound, tetramethylpyrazine, was observed in Ba Ames during the last observed timepoint, estimated to be in the logarithmic phase, but it was not as abundant as in Ft novicida. This comparison is complicated by the different growth media utilized, which emphasizes the need for careful evaluation when comparing biomarkers across different growth conditions and species.

Esters and carboxylic acid compounds. Esters and carboxylic acids were detected in both Ba Sterne and Ba Ames, but not in the F. tularensis strains. The identified B. anthracis markers contained either propanoic or butanoic acid as the backbone for methylated esters or the side-chain for carboxylic acids. The formation of esters and carboxylic acids can be derived from shared metabolic pathways occurring during normal bacterial growth, such as oxidation of fatty acids or amino acid metabolism. As the ester and carboxylic acid compounds were only observed during the logarithmic growth stage, this demonstrates a shift in B. anthracis metabolism once bacteria reach the stationary phase.

Evidence of dynamic metabolic processes.
For all four bacterial taxa studied here, their VOC marker profiles varied as a function of time after inoculation/culture start and, as observed for Ft novicida, Ba Sterne and Ba Ames, varied distinctly across their growth phases. For Ft novicida and Ba Sterne, this was also shown through application of PCA, which produced distinct groupings for the VOC markers of different growth phases. For example, in the Ft novicida cultures, the methyl ketones, once produced, were present throughout the remainder of the experiment. However, select alcohols (2-nonanol and 2-undecanol) were not detected after a mid-stationary (28-hour) timepoint. This suggests alcohols were depleted in the liquid culture, potentially as precursors in ongoing bacterial metabolism. Once the basic metabolism of isolated pathogens is determined, changes in the marker profiles when additional variables are added (e.g. different substrates) can help drive inferences on metabolic activity of complex systems.

Comparison of VOC markers for RG3 vs. RG2 strains.
In comparing the putative VOC biomarkers identified for Ba Ames (RG3) to those for Ba Sterne (RG2), we found some similarities, but also distinct differences. In contrast, the profile of VOCs from Ft SCHU S4 (RG3) had fewer similarities with Ft novicida (RG2). The relative similarities between Ba Ames and Ba Sterne may stem from the close genetic relationship of these two strains (Ba Sterne is missing one of the two plasmids that Ba Ames has but is otherwise genetically very similar to Ba Ames 35 ). In contrast, Ft SCHU S4 and Ft novicida are genetically more distinct 36 . If further confirmed in future studies, this may have implications for the use of RG2 "simulants" to develop sensors and algorithms for detecting exposures to the related RG3 pathogens. In the biodefense community, Ba Sterne is generally considered a good simulant for Ba Ames. For Ft, RG2 simulants other than Ft novicida may be considered.
Future VOC sampling should be performed on additional subspecies of F. tularensis (e.g., spp. holarctica) and B. anthracis (e.g., spp. Vollum or H9401) to investigate whether these profiles are unique to a subspecies, species, or bacterial pathogens in general. Several markers identified in this study have been previously reported as emissions of other bacterial types. For example, Chen et al. 14 reported 2-heptanone, 2-nonanone, and 2-undecanone in E. coli but did not detect higher carbon methyl ketones. Rees et al. 12,13 reported both even and odd-chain methyl ketones, including 2-hexanone, 2-heptanone, 2-nonanone, and 2-decanone products from Klebsiella pneumoniae, where the presence of even-chain methyl ketones suggests a different or complementary synthesis pathway for volatile production.
One of the long-term goals of our project seeks to use VOCs as breath-based diagnostic markers towards the detection of biowarfare agents in patients after a suspected biological attack. During a hypothetical pathogen infection in humans, the VOCs in breath may be derived from 1) the invading pathogen, 2) the human breath volatilome, or 3) interactions between the human host and pathogen. This study represents our first step in non-invasive methodology and data analysis optimization, extensively profiling two attenuated pathogen or RG2 species and screening their RG3 virulent counterparts in optimized growth media. The number of compounds identified in the human breath volatilome continues to grow through targeted and untargeted studies. A searchable database of breath-specific compounds in the "human volatilome" has been curated by the U.S. Environmental Protection Agency (EPA) and is continuously updated 37,38. A survey of this list against the F. tularensis RG2 and RG3 profiles found here revealed that 2-heptanone and 2-nonanone have been detected in human breath, while a comparison against the B. anthracis RG2 and RG3 profiles revealed that 2-methylpropanoic acid, 2-heptanone, 6-methyl-2-heptanone, and 5-methyl-5-hepten-2-one have been reported in human breath. Also, it is important to note that the volatiles in this database may not be commonly shared among all people, as human breath has been shown to be influenced by one's unique personal microbiome, external exposures, and immunological responses. The effects of shared volatiles between pathogens and a human host must be evaluated in further studies that better simulate an in vivo infection, as well as identifying markers unique to that interaction. Future work into "baseline" human breath signatures, a pathogen-specific volatilome, and host-microbe interactions is required for evaluation of VOCs as diagnostic tools for human health. Towards a pathogen-specific volatilome, further efforts include both expanding the number of bacterial species and evaluating the effects of chosen growth media on VOCs produced. Additionally, as animal model studies have established that a low bacterial count can establish infections (e.g. 10 bacteria for F. tularensis in primate models), optimization of signal detection will also be investigated, as the conditions employed here used relatively high bacterial counts. Finally, future work should make the transition from in vitro studies into experiments more closely aligned with in vivo studies, such as initiating bacterial infection of human lung cell cultures and analyzing the resultant profiles for discovery of overlapping volatile compounds that may serve as diagnostic markers of human exposures to biosecurity-relevant pathogens.
Table 7. Annotations of B. anthracis Ames-specific VOC markers and average relative abundances (n = 3) at 6 and 24 hours post inoculation. Notes: (a) VOC detected in 2/3 of triplicate measurements; (b) VOC detected in 1/3 of triplicate measurements. RI (Lit): Retention Index reported from NIST14. RI (Exp): Retention Index calculated from experiment.

Conclusions
This study adapted a SPME-GC-MS methodology for noninvasive profiling of VOCs emitted from actively growing pathogens, specifically potential biowarfare bacterial agents and their surrogates, in both BSL-2 and BSL-3 settings. The devised methodology detected volatile biomarkers that were reflective of both the presence and physiological growth phase of pathogens. The data processing employed distinguished signals from the pathogens against a complex chemical background, in this case aided by the use of powerful software (MassHunter, MPP) for compound annotation and visualization of GC-MS data. Although the devised methodology based on SPME-GC-quadrupole MS does not represent the pinnacle of sensitivity, a number of relatively robust and reproducible putative volatile biomarkers could be detected. Further confirmation of these markers should be pursued in more repeat experiments across a wider range of growth conditions. More efficient VOC collection methods and more sensitive mass spectral analysis techniques may also uncover additional markers in the future.
Detection and identification of metabolites specific to taxa or species provides the first steps to understanding their formation via various metabolic pathways and the genetic basis for these pathways. We acknowledge that the work presented here constitutes only initial scoping experiments. While this work demonstrates the applicability of this method and found a number of interesting volatile biomarkers, this work needs to be expanded to determine the influence of various experimental factors on markers. We recommend that future research include determining the dependence of pathogen-produced volatiles on environmental conditions (e.g. chosen growth media) and use of different VOC collection methods (e.g. thermal desorption tubes) to achieve lower detection limits. Elucidation of comprehensive bacterial profiles is expected to provide clues about bacterial metabolism in controlled environments, which can further inform research into metabolic processes when pathogens are in other settings (i.e. a host). Ultimately, such biomarkers may yield useful information about metabolism in bacterial taxa and may facilitate new applications in biodetection. Distinct volatile profiles have potential to be used for the detection of pathogens in the context of biosecurity-relevant exposures of humans during a biological attack.
Results from this work have implications in the larger volatilomics community, both within the field of pathogens study and beyond. While volatile compounds from B. anthracis have been previously studied, this is the first study, to our knowledge, to profile volatile emissions of F. tularensis. Future databases can incorporate biomarker signatures from various pathogen species for means of relevant comparisons.
Preparation of bacterial headspace. Bacterial colonies were selected after overnight incubation on agar plates and transferred to 10 mL of liquid modified MH media or BHI media, respectively. Bacteria were cultured in media under aerobic conditions with overnight incubation at 37 °C and 170 rpm shaking. For each species and experiment, three 100-µL aliquots were inoculated into three separate 20 mL portions of fresh liquid media (1:200 dilutions) and incubated in three 250-mL disposable polycarbonate Erlenmeyer flasks with vented caps at 37 °C and 170 rpm shaking. The VOC profiles from the headspaces of each of the triplicate bacterial cultures (replicates) and the number of viable bacteria were sampled and assessed at multiple timepoints. In addition to the three replicates of each pathogen species, an uninoculated liquid media flask was simultaneously prepared and VOCs sampled from it as a negative (media-only) control.
Sampling VOCs from bacterial headspace (RG2 strains). The VOC profiles of bacterial headspaces and media controls were sampled at different time intervals depending on bacterial growth rates and experimental setups using a protocol developed here that some of the authors also applied for headspace analysis of algal cultures in other work 40. Ft novicida cultures were sampled at the following timepoints: 0, 2, 4, 8, 12, 16, 20, 24, 28, 32, 48, and 52 hours post-inoculation. Ba Sterne cultures were sampled at the following timepoints: 0, 4, 8, 12, 20, and 24 hours post-inoculation. At the time of sample collection, Erlenmeyer flasks were removed from the incubator-shaker and transferred to a biosafety cabinet. Headspace VOCs were immediately collected for 30 minutes on a field-portable 2 cm solid-phase microextraction (SPME) fiber with a 65 µm polydimethylsiloxane/divinylbenzene (PDMS/DVB) coating (Supelco, Bellefonte, PA) with no agitation of the flask. At each timepoint, one unexposed SPME fiber (fiber remaining retracted behind the septum in the SPME housing) was placed within the biosafety cabinet where the SPME sampling of cultures was taking place. These fibers served as "travel blanks" to account for potential background volatiles leaking onto retracted fibers over time during storage or transportation to the GC-MS analysis laboratory. These "travel blanks" were analyzed concurrently with fibers exposed to cultures. After collection, all SPME fibers were stored in refrigerators at 2-4 °C until analysis. Data acquisition on the GC-MS occurred within 3 weeks of collection.
Sampling VOCs from Bacterial Headspace (RG3 strains) and Transfer of SPME Samples to BSL-2 Facility. The VOC profiles of RG3 Ft SCHU S4 and Ba Ames as well as corresponding media controls were sampled at the following timepoints: 0, 6, and 24 hours post-inoculation. Timepoints were chosen to capture the exponential and stationary growth phases in each species. At the time of sample collection, Erlenmeyer flasks were removed from the incubator-shaker and transferred to a biosafety cabinet (BSC) within the BSL-3 facility. Flasks were allowed to sit in the BSC for 30 minutes prior to sampling in order to allow any aerosols to settle. Headspace VOCs were collected for 30 minutes on SPME fibers with no agitation of the flask. After collection, SPME fiber devices were decontaminated by wiping the entirety of their external housing with bleach for 1 min apiece, and residual bleach was removed via wiping. The process of bleach wiping to prevent accidental transfer of pathogens out of the BSL-3 facilities was tested and validated. The overall protocol was approved by the Institutional Biosafety Committee (IBC) at LLNL (see Supplemental Protocol in Supplemental Information). SPME fibers were transferred from the BSL-3 to BSL-2 facilities and stored in refrigerators at 2-4 °C until analysis, as described above. In analyzing the samples collected in the BSL-3, we did not find any indication that bleach wiping altered the compounds detected, e.g., by introducing chlorinated compounds.
Determination of bacterial concentrations. The growth phase (logarithmic, stationary, decline) of each organism was estimated by monitoring the concentration of viable bacteria over the course of the experiment for all biological replicates. Aliquots (1 mL) of all bacterial cultures were collected immediately following VOC sampling at each of the timepoints, and the Erlenmeyer flasks were subsequently placed back into the incubator-shaker. The aliquots were serially diluted to between 10⁻² and 10⁻⁷ depending on the expected growth phase. A preliminary experiment was performed by plating 10-fold dilutions in duplicate to determine the appropriate serial dilution for each growth phase. The dilution factor was selected to achieve a target of 30-300 colonies per plate for counting. Dilutions were plated in duplicate (100-μL aliquots) on agar plates to determine the number of colony-forming units (CFU). Bacterial concentrations are reported as CFU counts per mL of liquid culture.
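The conversion from plate counts to CFU per mL can be illustrated with a minimal sketch. The function names, the averaging of duplicate plates, and the example counts are illustrative assumptions and are not taken from the original protocol; only the 100-µL plated volume and the 30-300 counting range come from the text.

```python
def cfu_per_ml(colony_counts, dilution_factor, plated_volume_ml=0.1):
    """Estimate viable concentration (CFU/mL) from replicate plate counts.

    colony_counts    -- colonies counted on each replicate plate of one dilution
    dilution_factor  -- fraction of the original culture in the plated
                        dilution (e.g. 1e-5 for a 10^-5 dilution)
    plated_volume_ml -- volume spread per plate (100 uL = 0.1 mL in the text)
    """
    if not colony_counts:
        raise ValueError("need at least one plate count")
    # Assumption: replicate plates are summarized by their arithmetic mean.
    mean_count = sum(colony_counts) / len(colony_counts)
    return mean_count / (dilution_factor * plated_volume_ml)


def in_countable_range(count, low=30, high=300):
    """Check a plate count against the 30-300 target range."""
    return low <= count <= high


# Hypothetical duplicate plates from a 10^-5 dilution.
counts = [142, 158]
estimate = cfu_per_ml(counts, dilution_factor=1e-5)
all_countable = all(in_countable_range(c) for c in counts)
```

With these hypothetical counts, the mean of 150 colonies from 0.1 mL of a 10⁻⁵ dilution corresponds to about 1.5 × 10⁸ CFU/mL.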
Data acquisition parameters. The data acquisition followed a procedure similar to the one previously described for algal VOCs and is briefly summarized here 40. VOC analyses were performed on an Agilent 5975T GC-MSD (Agilent Technologies, Santa Clara, CA) using an Agilent HP-5ms column (30 m × 250 µm × 0.25 µm) coupled to a single quadrupole mass analyzer with helium carrier gas at a constant flow rate of 1.2 mL/min. Volatiles absorbed by the SPME fiber were desorbed in the heated (250 °C) GC inlet for 60 s using splitless injection. The column temperature was programmed to start at 40 °C for 6 min, then heated at 8 °C/min from 40 to 280 °C and held for 4 min (total run time = 40 min). Ions were generated using electron ionization (EI) (70 eV) and acquired at 4 scans/s over m/z 35-450. Data acquisition was performed using ChemStation (version E.02.02). A commercial GC-MS reference standard (S-22329; AccuStandard, New Haven, CT) was used to evaluate day-to-day performance of the GC-MS system and to calculate retention indices.
Data processing. After data acquisition, data processing procedures and criteria were applied to detect and identify taxa-specific biomarkers, similar to the work previously described for algal VOCs 40. All ChemStation data files (consisting of data from biological replicates, media controls, and travel fibers) were translated using MassHunter GC/MS Translator B.07.05 for compatibility with Agilent's MassHunter Qualitative software (version B.07.00 SP2) and Mass Profiler Professional (MPP) 12.6.1 software. These programs enabled sophisticated organization of individual MS files into complex datasets for chemometric analyses.
Chromatographic deconvolution and visualization were performed in MassHunter Qualitative with a Retention Time window size factor of 90.0, a signal-to-noise ratio threshold of 2.00, and an absolute ion height filter of 1000 counts 40. For compounds that were not detected in a given sample, an arbitrary small signal value of 1 was assigned. Detected peaks were transferred into MPP and inter-aligned using a retention time tolerance of 0.15 minutes, a mass spectral match factor of 0.6 (of maximum 1.0), and a delta m/z tolerance of 0.2 Da. Annotation of the aligned compounds was performed by searching spectra against the NIST14 mass spectral database. Compounds with mass spectral matches ≥70% were subsequently identified by the name of the match with the highest score. Identifications with literature retention indices deviating by more than 5% from the experimental retention indices were rejected. Compounds that did not meet the mass spectral match or retention index criteria were annotated using the base peak m/z and retention index (e.g. "Unknown m/z 121_RI 1002").
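The annotation rules above can be summarized in a short sketch. The function name, its arguments, and the example values are hypothetical. Computing the percentage deviation relative to the experimental retention index is an assumption about how the 5% criterion is evaluated. The actual annotation was performed within the MassHunter/MPP software and a NIST14 library search, which this sketch does not reproduce.

```python
def annotate(match_name, match_score, library_ri, experimental_ri,
             base_peak_mz, min_score=70.0, max_ri_deviation=0.05):
    """Return a compound label following the annotation rules in the text.

    match_name      -- name of the best-scoring library hit
    match_score     -- mass spectral match score in percent (0-100)
    library_ri      -- literature retention index of the library hit
    experimental_ri -- retention index measured for the peak
    base_peak_mz    -- m/z of the most intense ion in the spectrum
    """
    if match_score >= min_score:
        # Assumption: deviation is taken relative to the experimental RI.
        deviation = abs(library_ri - experimental_ri) / experimental_ri
        if deviation <= max_ri_deviation:
            return match_name
    # Fallback label built from base peak m/z and retention index.
    return f"Unknown m/z {base_peak_mz}_RI {round(experimental_ri)}"


# Hypothetical peaks: a good match, a weak spectral match, and an RI mismatch.
accepted = annotate("2-Nonanone", 91.0, 1092, 1090, 58)
weak_match = annotate("2-Nonanone", 65.0, 1092, 1090, 58)
ri_mismatch = annotate("Compound X", 85.0, 1200, 1002, 121)
```

In this sketch, a peak is named only when both criteria are satisfied; otherwise it receives the "Unknown" label format used in the text.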
The reported abundance values in this work are relative abundances of compounds, obtained by integrating the signal in their chromatographic peaks. Relative abundances are compared between different measurements (timepoints, species). Absolute quantification of VOCs in the headspace above bacterial cultures is challenging with our method. For example, our culture vessels were not fully enclosed due to the use of vented caps designed to facilitate gas exchange and avoid pressure buildups, and some loss of VOCs may have occurred. The retention of analytes is also affected by sorbent material, sampling time, and potential saturation, whereas the desorption is affected by extraction time and temperature. Some relative quantitation could be achieved using internal standards, whether pre-loaded or spiked into cultures, but this approach also has a number of practical issues. Therefore, for the purposes of our work, absolute quantification was not attempted.
Two filtering criteria were used to identify relatively robust and reproducible VOCs as the most likely candidate compounds for potential taxa-specific biomarkers. The first criterion required detection of a potential biomarker in at least two of three culture replicates at a given sampling timepoint. This "2 out of 3 replicates" filter criterion was chosen as a compromise to require some level of reproducibility while also allowing for some biological variability that is often encountered in experiments involving live biological systems. Some of the detected compounds were present at fairly low concentrations and some biological variability could have easily pushed a compound below the detection threshold in one of the replicates. We chose this "2 out of 3" criterion in order to avoid missing some potentially interesting markers by applying too stringent a criterion. The second criterion concerned the presence or absence of a marker in a biological culture relative to the media and travel blank controls appropriate for each organism. A compound was removed from consideration as a potential candidate if its relative abundance in the biological replicates was less than ten times the relative abundance in the control.
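The two filtering criteria can be expressed as a minimal sketch for a single compound at a single timepoint. The function name and example abundances are hypothetical. The text does not specify how replicate abundances are summarized or how the media and travel blank controls are combined, so the mean of detected replicates and the larger of the control values are assumptions made here for illustration.

```python
NOT_DETECTED = 1.0  # placeholder signal assigned to non-detected compounds


def passes_filters(replicate_abundances, control_abundances,
                   min_detected=2, min_fold=10.0):
    """Apply the "2 out of 3 replicates" and 10-fold-over-control criteria.

    replicate_abundances -- peak areas in the culture replicates
    control_abundances   -- peak areas in the media control and travel blank
    """
    detected = [a for a in replicate_abundances if a > NOT_DETECTED]
    # Criterion 1: detected in at least `min_detected` culture replicates.
    if len(detected) < min_detected:
        return False
    # Assumption: culture level is the mean of the detected replicates.
    culture_level = sum(detected) / len(detected)
    # Assumption: the most abundant control sets the background level.
    control_level = max(control_abundances, default=NOT_DETECTED)
    # Criterion 2: culture abundance at least `min_fold` times the control.
    return culture_level >= min_fold * control_level


# Hypothetical peak areas for three compounds.
kept = passes_filters([52000, 48000, 1], [1, 1])
single_replicate = passes_filters([52000, 1, 1], [1, 1])
high_background = passes_filters([52000, 48000, 50000], [9000, 1])
```

Under these assumptions, the first compound is retained, the second is rejected for being detected in only one replicate, and the third is rejected because its abundance is less than ten times the media control.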
The VOCs identified as putative taxa-specific biomarkers were compared with regard to both individual markers and groups of markers encompassing a compound class. First, the presence or absence of these markers in each growth phase (logarithmic, stationary, and decline) was determined. Second, the calculated peak areas of markers, also referred to as relative abundances here, were compared amongst biological replicates to assess consistency of detection. Finally, principal component analysis (PCA) was used as a dimension-reduction strategy to visualize covariance in the dataset. Only markers that remained after the filtering criteria were applied were used. Prior to PCA, markers were individually mean-centered and variance-scaled using the MPP software. PCA was performed on the transformed dataset, and the results are presented as a scores plot of the first two principal components (PCs) and a loadings plot to elucidate the contribution of each marker to PC positioning. PCA was not performed on the RG3 taxa due to the limited number of acquired samples.
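The per-marker mean-centering and variance-scaling applied before PCA (often called autoscaling) can be sketched as follows. This is an illustration only and not the MPP implementation. The function name and the data are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator the software uses.

```python
from statistics import mean, stdev


def autoscale(matrix):
    """Mean-center and variance-scale each marker (column).

    matrix -- list of samples (rows), each a list of marker abundances
    Returns a new matrix in which every column has mean 0 and, unless the
    column is constant, unit sample standard deviation.
    """
    scaled_columns = []
    for column in zip(*matrix):
        center = mean(column)
        spread = stdev(column)  # assumption: sample (n - 1) estimator
        if spread == 0:
            # A constant marker carries no variance; map it to zeros.
            scaled_columns.append([0.0] * len(column))
        else:
            scaled_columns.append([(v - center) / spread for v in column])
    # Transpose back to samples-by-markers orientation.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical abundances: three samples (rows) by two markers (columns).
data = [[1.0, 100.0],
        [2.0, 300.0],
        [3.0, 200.0]]
scaled = autoscale(data)
```

Scaling each marker to unit variance prevents high-abundance compounds from dominating the principal components purely because of their signal magnitude.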

Data availability
The datasets generated and analyzed during the current study can be reproduced from the raw metabolomic data files that have been deposited to the EMBL-EBI MetaboLights database under the identifier MTBLS1737. The complete dataset can be accessed at https://www.ebi.ac.uk/metabolights/MTBLS1737.