Carbon monoxide (CO) is a chemically reactive trace gas that is produced through natural processes and anthropogenic pollution. The average global mixing ratio of this gas is ~90 ppbv in the troposphere (lower atmosphere), though this concentration greatly varies across time and space, with levels particularly high in urban areas [1,2,3,4]. Currently, human activity is responsible for ~60% of emissions, with the remainder attributable to natural processes [1]. Counteracting these emissions, CO is rapidly removed from the atmosphere (lifetime of 2 months) by two major processes: geochemical oxidation by atmospheric hydroxyl radicals (85%) and biological oxidation by soil microorganisms (10%) [1, 5]. Soil microorganisms account for the net consumption of ~250 teragrams of atmospheric CO [1, 5, 6]; on a molar basis, this amount is seven times higher than the amount of methane consumed by soil bacteria [7]. Aerobic CO-oxidizing microorganisms are also abundant in the oceans; while oceans are a minor source of atmospheric CO overall [8, 9], this reflects that substantial amounts of the gas are produced photochemically within the water column and the majority is oxidized by marine bacteria before it is emitted to the atmosphere [10].

Aerobic CO-oxidizing microorganisms are traditionally categorized into two major groups, the carboxydotrophs and carboxydovores [11]. The better studied of the two groups, carboxydotrophs grow chemolithoautotrophically with CO as the sole energy and carbon source when it is present at elevated concentrations. To date, this process has been reported in 11 bacterial genera from four classes (Table S1): Alphaproteobacteria [12,13,14,15], Gammaproteobacteria [12, 15,16,17,18], Actinobacteria [19,20,21], and Bacilli [22]. Genetic and biochemical studies on the model alphaproteobacterial carboxydotroph Oligotropha carboxidovorans have demonstrated that form I carbon monoxide dehydrogenases mediate aerobic CO oxidation [23,24,25]. The catalytic subunit of this heterotrimeric enzyme (CoxL) contains a molybdenum–copper center that specifically binds and hydroxylates CO [24, 25]. In such organisms, electrons derived from CO oxidation are relayed through both the aerobic respiratory chain to support ATP generation and the Calvin–Benson cycle to support CO2 fixation [11, 26]. With some exceptions [19], these CO dehydrogenases have a high catalytic rate but exhibit low affinity for their substrate (Km > 400 nM) [27]. Thus, carboxydotrophs can grow in specific environments with elevated CO concentrations, but often cannot oxidize atmospheric CO [11, 28].

Carboxydovores are a broader group of bacteria and archaea adapted to oxidize CO at lower concentrations, including atmospheric levels, in a broad range of environments. These microorganisms can oxidize CO but, in contrast to carboxydotrophs, require organic carbon for growth [11, 29]. Carboxydovores have been cultured from some 31 bacterial and archaeal genera to date (Table S1), spanning classes Alphaproteobacteria [29,30,31,32], Gammaproteobacteria [29, 33,34,35,36], Actinobacteria [18, 37,38,39,40], Bacilli [41], Thermomicrobia [41,42,43,44], Ktedonobacteria [44, 45], Deinococcota [41], Thermoprotei [46, 47], and Halobacteria [33, 48]. Carboxydovores are also thought to use form I CO dehydrogenases, but usually encode slower-acting, higher-affinity enzymes. In contrast to carboxydotrophs, carboxydovores usually lack a complete Calvin–Benson cycle, suggesting they can support aerobic respiration, but not carbon fixation, using CO [11]. A related enzyme family (tentatively annotated as form II CO dehydrogenases) was also proposed to mediate CO oxidation in carboxydovores [11, 29, 49], but recent studies suggest CO is not their physiological substrate [32].

The physiological role of CO oxidation in carboxydovores has remained unclear. It was originally thought that such microorganisms oxidize CO primarily to support mixotrophic growth [29, 30], but a recent study focused on the alphaproteobacterial carboxydovore Ruegeria pomeroyi showed that CO neither stimulated growth nor influenced metabolite profiles [31]. We recently developed an alternative explanation: consumption of atmospheric CO enables carboxydovores to survive carbon limitation [44, 50, 51]. This hypothesis is inspired by studies showing atmospheric H2 oxidation enhances survival [44, 52,53,54,55,56,57]. In support of this, CO dehydrogenases have been shown to be upregulated by five different bacteria during carbon limitation [38, 44, 53, 58, 59] and atmospheric CO is consumed by stationary-phase cells [44, 60]. Moreover, ecological studies have shown that CO is rapidly oxidized in ecosystems containing low organic carbon [51, 61, 62]. However, in contrast to atmospheric H2 [53,54,55, 57, 63], it has not yet been genetically or biochemically proven that atmospheric CO supports survival. To address this, we studied CO oxidation in Mycobacterium smegmatis, a genetically tractable representative of a globally abundant soil actinobacterial genus [64, 65]. This organism encodes a form I CO dehydrogenase and six other putative enzymes from the wider molybdenum-containing hydroxylase superfamily, but lacks a form II CO dehydrogenase [18, 29]. We show, through proteomic, genetic, and biochemical analyses, that its CO dehydrogenase is (i) strongly induced by organic carbon starvation, (ii) mediates aerobic respiration of atmospheric CO, and (iii) enhances survival of carbon-starved cells. On this basis, we confirm that atmospheric CO supports microbial survival and, with support from genomic, metagenomic, and metatranscriptomic analyses, propose a survival-centric model for the evolution and ecology of carboxydovores.

Materials and methods

Bacterial strains and growth conditions

Table S7 lists the bacterial strains and plasmids used in this study. Mycobacterium smegmatis mc²155 [66] and the derived strain ΔcoxL were maintained on lysogeny broth (LB) agar plates supplemented with 0.05% (w/v) Tween 80 (LBT). For broth culture, M. smegmatis was grown in Hartmans de Bont (HdB) minimal medium [67] supplemented with 0.05% (w/v) tyloxapol and 5.8 mM glycerol. Escherichia coli TOP10 cells were maintained on LB agar plates and grown in LB broth. Liquid cultures of both M. smegmatis and E. coli were incubated on a rotary shaker at 200 rpm, 37 °C unless otherwise specified. Selective LB or LBT media used for cloning experiments contained gentamycin at 5 µg mL−1 for M. smegmatis and 20 µg mL−1 for E. coli.

Mutant construction

A markerless deletion of the coxL gene (MSMEG_0746) was constructed by allelic exchange mutagenesis. Briefly, a 2245 bp fragment containing the fused left and right flanks of the MSMEG_0746 gene was synthesized by GenScript. This fragment was cloned into the SpeI site of the mycobacterial shuttle plasmid pX33 [68] with E. coli TOP10 and transformed into M. smegmatis mc²155 electrocompetent cells. To allow for temperature-sensitive vector replication, the transformants were incubated on LBT-gentamycin agar at 28 °C for 5 days until colonies were visible. Catechol-reactive colonies were sub-cultured onto LBT-gentamycin agar plates incubated at 40 °C for 3 days to facilitate the first recombination of the coxL flanks into the chromosome. To allow the second recombination and removal of the backbone vector to occur, colonies that were gentamycin-resistant and catechol-reactive were sub-cultured on LBT-sucrose agar and incubated at 40 °C for 3 days. The resultant colonies were screened by PCR to discriminate ΔcoxL mutants from wild-type revertants (Fig. S1). Whole-genome sequencing (Peter Doherty Institute, University of Melbourne) confirmed coxL was deleted and no other SNPs were present in the ΔcoxL strain. Table S8 lists the cloning and screening primers used in this study.

Shotgun proteome analysis

For shotgun proteome analysis, 500 mL cultures of M. smegmatis were grown in triplicate in 2.5 L aerated conical flasks. Cells were harvested at mid-exponential phase (OD600 ~ 0.25) and mid-stationary phase (72 h post ODmax ~ 0.9) by centrifugation (10,000 × g, 10 min, 4 °C). They were subsequently washed in phosphate-buffered saline (PBS; 137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4 and 2 mM KH2PO4, pH 7.4), recentrifuged, and resuspended in 8 mL lysis buffer (50 mM Tris-HCl, pH 8.0, 1 mM PMSF, 2 mM MgCl2, 5 mg mL−1 lysozyme, 1 mg DNase). The resultant suspension was then lysed by passage through a Constant Systems cell disruptor (40,000 psi, four times), with unbroken cells removed by centrifugation (10,000 × g, 20 min, 4 °C). To denature proteins, lysates were supplemented with 20% SDS to a final concentration of 4%, boiled at 95 °C for 10 min, and sonicated in a Bioruptor (Diagenode) using 20 cycles of ‘30 s on’ followed by ‘30 s off’. The lysates were clarified by centrifugation (14,000 × g, 10 min, room temperature). Protein concentration was confirmed using the bicinchoninic acid assay kit (Thermo Fisher Scientific) and equal amounts of protein were processed from both exponential and stationary phase samples for downstream analyses. After removal of SDS by chloroform/methanol precipitation, the proteins were proteolytically digested with trypsin (Promega) and purified using OMIX C18 Mini-Bed tips (Agilent Technologies) prior to LC-MS/MS analysis. Using a Dionex UltiMate 3000 RSLCnano system equipped with a Dionex UltiMate 3000 RS autosampler, the samples were loaded via an Acclaim PepMap 100 trap column (100 µm × 2 cm, nanoViper, C18, 5 µm, 100 Å; Thermo Scientific) onto an Acclaim PepMap RSLC analytical column (75 µm × 50 cm, nanoViper, C18, 2 µm, 100 Å; Thermo Scientific).
The peptides were separated by increasing concentrations of buffer B (80% acetonitrile/0.1% formic acid) for 158 min and analyzed with an Orbitrap Fusion Tribrid mass spectrometer (Thermo Scientific) operated in data-dependent acquisition mode using in-house, LFQ-optimized parameters. Acquired .raw files were analyzed with MaxQuant [69] to globally identify and quantify proteins across the two conditions. Data visualization and statistical analyses were performed in Perseus [70].

Activity staining

For CO dehydrogenase activity staining, 500 mL cultures of wild-type and ΔcoxL M. smegmatis were grown to mid-stationary phase (72 h post ODmax ~ 0.9) in 2.5 L aerated conical flasks. Cells were harvested by centrifugation, resuspended in lysis buffer, and lysed with a cell disruptor as described above. Following removal of unlysed cells by centrifugation (10,000 × g, 20 min, 4 °C), the whole-cell lysates were fractionated into cytosols and membranes by ultracentrifugation (150,000 × g, 60 min, 4 °C). The protein concentration of the lysates, cytosols, and membranes was determined using the bicinchoninic acid assay [71] against bovine serum albumin standards. Next, 20 µg protein from each fraction was loaded onto native Bis-Tris polyacrylamide gels (7.5% w/v running gel, 3.75% w/v stacking gel) prepared as described elsewhere [72] and run alongside a protein standard (NativeMark Unstained Protein Standard, Thermo Fisher Scientific) at 25 mA for 3 h. For total protein staining, gels were incubated in AcquaStain Protein Gel Stain (Bulldog Bio) at 4 °C for 3 h. For CO dehydrogenase staining [14], gels were incubated in 50 mM Tris-HCl buffer containing 50 µM nitroblue tetrazolium chloride (NBT) and 100 µM phenazine methosulfate in an anaerobic jar (100% CO v/v atmosphere) at room temperature for 24 h. Weak bands corresponding to CO dehydrogenase activity were also observed for wild-type fractions after 4 h.

Gas chromatography

Gas chromatography was used to determine the kinetics and threshold of CO dehydrogenase activity of M. smegmatis. Briefly, 30 mL stationary-phase cultures of wild-type and ΔcoxL M. smegmatis strains were grown in 120 mL serum vials sealed with butyl rubber stoppers. At 72 h post-ODmax, cultures were reaerated (1 h), resealed, and amended with CO (via 1% v/v CO in N2 gas cylinder, 99.999% pure) to achieve headspace concentrations of ~200 ppmv. Cultures were agitated (150 rpm) for the duration of the incubation period to enhance CO transfer to the cultures and maintain an aerobic environment. Headspace samples of 1 mL were periodically collected using a gas-tight syringe to measure CO. Gas concentrations in samples were measured by gas chromatography using a pulsed discharge helium ionization detector (model TGA-6791-W-4U-2, Valco Instruments Company Inc.) as previously described [44]. Concentrations of CO in each sample were regularly calibrated against ultra-pure CO gas standards of known concentrations to the limit of detection of 9 ppbv CO. Kinetic analysis was performed as described, except cultures were amended with six different starting concentrations of CO (4000, 2000, 1000, 500, 200, 50 ppmv) and oxidation was measured at up to five timepoints (0, 2, 4, 6, 8 h). Reaction velocity relative to the gas concentration was calculated at each timepoint and plotted on a Michaelis–Menten curve. Vmax app and Km app values were derived through a non-linear regression model (GraphPad Prism, Michaelis–Menten, least squares fit) and linear regressions based on Lineweaver-Burk, Eadie-Hofstee, and Hanes-Woolf plots.
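As an illustrative sketch of the three linearized fits named above (not the GraphPad Prism workflow used in the study), the Python below estimates apparent Vmax and Km from paired substrate concentrations and reaction velocities. The function names are hypothetical, and the example data are noise-free values generated from assumed parameters in arbitrary units, so all three transformations should recover the same estimates. With real, noisy measurements the three methods generally disagree, which is one reason to report them alongside a non-linear fit.

```python
from statistics import mean


def linear_fit(x, y):
    """Return (slope, intercept) of an ordinary least-squares line."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def lineweaver_burk(s, v):
    """Fit 1/v = (Km/Vmax) * (1/S) + 1/Vmax; return (Vmax, Km)."""
    slope, intercept = linear_fit([1 / si for si in s], [1 / vi for vi in v])
    vmax = 1 / intercept
    return vmax, slope * vmax


def eadie_hofstee(s, v):
    """Fit v = Vmax - Km * (v/S); return (Vmax, Km)."""
    slope, intercept = linear_fit([vi / si for si, vi in zip(s, v)], v)
    return intercept, -slope


def hanes_woolf(s, v):
    """Fit S/v = S/Vmax + Km/Vmax; return (Vmax, Km)."""
    slope, intercept = linear_fit(s, [si / vi for si, vi in zip(s, v)])
    vmax = 1 / slope
    return vmax, intercept * vmax


# Hypothetical, noise-free data (arbitrary units), for illustration only.
ASSUMED_VMAX, ASSUMED_KM = 3.13, 350.0
substrate = [50.0, 200.0, 500.0, 1000.0, 2000.0, 4000.0]
velocity = [ASSUMED_VMAX * s / (ASSUMED_KM + s) for s in substrate]

for name, fit in [
    ("Lineweaver-Burk", lineweaver_burk),
    ("Eadie-Hofstee", eadie_hofstee),
    ("Hanes-Woolf", hanes_woolf),
]:
    vmax_app, km_app = fit(substrate, velocity)
    print(f"{name}: Vmax app = {vmax_app:.2f}, Km app = {km_app:.0f}")
```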

Respirometry measurements

For respirometry measurements, 30 mL cultures of wild-type and ΔcoxL M. smegmatis were grown to mid-stationary phase (72 h post ODmax ~ 0.9) in 125 mL aerated conical flasks. Rates of O2 consumption were measured before and after CO addition using a Unisense O2 microsensor. Prior to measurement, the electrode was polarized at −800 mV for 1 h with a Unisense multimeter and calibrated with O2 standards of known concentration. Gas-saturated PBS was prepared by bubbling PBS with 100% (v/v) of either O2 or CO for 5 min. Initially, O2 consumption was measured in 1.1 mL microrespiration assay chambers sequentially amended with M. smegmatis cell suspensions (0.9 mL) and O2-saturated PBS (0.1 mL) that were stirred at 250 rpm at room temperature. After initial measurements, 0.1 mL of CO-saturated PBS was added into the assay mixture. Changes in O2 concentrations were recorded using Unisense Logger Software (Unisense, Denmark). Upon observing a linear change in O2 concentration, rates of consumption were calculated over a period of 20 s and normalized against total protein concentration.
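The rate calculation described above can be sketched as a least-squares slope over a 20 s window divided by the protein amount. This is a minimal illustration under stated assumptions: the function name, the units (µM O2, seconds, mg protein), and the traces are all hypothetical and are not measurements from this study.

```python
def o2_consumption_rate(times_s, o2_um, protein_mg):
    """Rate of O2 decrease (µM per s per mg protein) from a linear trace.

    The slope is estimated by ordinary least squares and negated so that
    a falling O2 concentration gives a positive consumption rate.
    """
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_c = sum(o2_um) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (c - mean_c) for t, c in zip(times_s, o2_um))
    return -(sxy / sxx) / protein_mg


# Hypothetical traces sampled every 5 s across a 20 s window.
times = [0, 5, 10, 15, 20]
trace_before_co = [250.0 - 0.02 * t for t in times]
trace_after_co = [240.0 - 0.10 * t for t in times]
protein = 2.0  # mg, assumed

basal_rate = o2_consumption_rate(times, trace_before_co, protein)
co_rate = o2_consumption_rate(times, trace_after_co, protein)
print(f"fold stimulation: {co_rate / basal_rate:.1f}")
```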

Gene expression analysis

To assess CO dehydrogenase gene expression by qRT-PCR, synchronized 30 mL cultures of M. smegmatis were grown in triplicate in either 125 mL aerated conical flasks or 120 mL sealed serum vials supplemented with 1% (v/v) CO. Cultures were quenched at mid-exponential phase (OD600 ~ 0.25) or mid-stationary phase (3 days post-ODmax ~ 0.9) with 60 mL cold 3:2 glycerol:saline solution (−20 °C). They were subsequently harvested by centrifugation (20,000 × g, 30 min, −9 °C), resuspended in 1 mL cold 1:1 glycerol:saline solution (−20 °C), and further centrifuged (20,000 × g, 30 min, −9 °C). For cell lysis, pellets were resuspended in 1 mL TRIzol Reagent, mixed with 0.1 mm zircon beads, and subjected to five cycles of bead-beating (4000 rpm, 30 s) in a Biospec Mini-Beadbeater. Total RNA was subsequently extracted using the phenol-chloroform method as per manufacturer’s instructions (TRIzol Reagent User Guide, Thermo Fisher Scientific) and resuspended in diethylpyrocarbonate (DEPC)-treated water. RNA was treated with DNase using the TURBO DNA-free kit (Thermo Fisher Scientific) as per the manufacturer’s instructions. RNA concentration, purity, and integrity were confirmed by using a NanoDrop ND-1000 spectrophotometer and running extracts on a 1.2% agarose gel. cDNA was then synthesized using SuperScript III First-Strand Synthesis System for qRT-PCR (Thermo Fisher Scientific) with random hexamer primers as per the manufacturer’s instructions. qPCR was used to quantify the copy numbers of the target gene coxL and housekeeping gene sigA against amplicon standards of known concentration. A standard curve was created based on the cycle threshold (Ct) values of coxL and sigA amplicons that were serially diluted from 10^8 to 10 copies (R² > 0.99). The copy number of the genes in each sample was interpolated based on each standard curve and values were normalized to sigA expression in exponential phase in ambient air.
For each biological replicate, all samples, standards, and negative controls were run in technical duplicate. All reactions were run in a single 96-well plate using the PowerUp SYBR Green Master Mix (Thermo Fisher Scientific) and LightCycler 480 Instrument (Roche) according to each manufacturer’s instructions.
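To make the standard-curve interpolation concrete, the sketch below fits Ct against log10(copies) for a ten-fold dilution series and then converts a sample Ct into a copy number. It is an illustration only: the function names, the slope and intercept used to generate the standards, and the sample Ct are assumed values, and the subsequent normalization to sigA is not shown.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(ct_values) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to interpolate a copy number."""
    return 10 ** ((ct - intercept) / slope)


# Ten-fold dilution series from 10^8 down to 10 copies.
standard_copies = [10 ** k for k in range(8, 0, -1)]
# Hypothetical Ct values generated from an assumed curve (slope -3.32).
standard_ct = [38.0 - 3.32 * math.log10(c) for c in standard_copies]

curve_slope, curve_intercept = fit_standard_curve(standard_copies, standard_ct)
sample_ct = 21.4  # assumed sample measurement
sample_copies = copies_from_ct(sample_ct, curve_slope, curve_intercept)
print(f"slope = {curve_slope:.2f}, copies = {sample_copies:.3g}")
```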

Growth and survival assays

For growth and survival assays, cultures were grown in 30 mL media in either 125 mL aerated conical flasks or 120 mL sealed serum vials containing an ambient air headspace amended with 20% (v/v) CO. Growth was monitored by measuring optical density at 600 nm (1 cm cuvettes; Eppendorf BioSpectrometer Basic); when OD600 was above 0.5, cultures were diluted ten-fold before measurement. All growth experiments were performed using three biological replicates. To count colony-forming units (CFU mL−1), each culture was serially diluted in HdB (no carbon source) and spotted onto agar plates in technical quadruplicates. Survival experiments were performed on two separate occasions using three biological replicates in the first experiment and six biological replicates in the second experiment. Percentage survival was calculated for each replicate by dividing the CFU mL−1 at each timepoint by the CFU mL−1 count at ODmax.
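The survival calculation above amounts to the following arithmetic. This is a minimal sketch: the function names, the spot volume, the dilution factor, and the colony counts are hypothetical, since the spotted volume is not stated here.

```python
def cfu_per_ml(colony_counts, dilution_factor, spot_volume_ml):
    """Mean colony count across technical replicates, scaled to CFU per mL."""
    mean_count = sum(colony_counts) / len(colony_counts)
    return mean_count * dilution_factor / spot_volume_ml


def percent_survival(cfu_at_timepoint, cfu_at_odmax):
    """Survival relative to the count at ODmax (day 0), as a percentage."""
    return 100.0 * cfu_at_timepoint / cfu_at_odmax


# Hypothetical technical quadruplicates from one biological replicate.
day0 = cfu_per_ml([20, 22, 18, 20], dilution_factor=1e5, spot_volume_ml=0.005)
day28 = cfu_per_ml([10, 9, 11, 10], dilution_factor=1e5, spot_volume_ml=0.005)
print(f"survival at day 28: {percent_survival(day28, day0):.0f}%")
```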

Glycerol quantification

Glycerol concentration in media was measured colorimetrically. Samples of 900 µL were taken periodically from triplicate cultures during growth; cells were pelleted (9500 × g, 2 min, room temperature) and the supernatant was collected and stored at −20 °C. Glycerol content for all supernatant samples was measured simultaneously in a single 96-well plate using a Glycerol Assay Kit (Sigma-Aldrich) as per manufacturer’s instructions. Absorbance was measured at 570 nm using an Epoch 2 microplate reader (BioTek). A standard curve was constructed using four standards of glycerol (0 mM, 0.3 mM, 0.6 mM, and 1 mM; R² > 0.99). Glycerol concentration was interpolated from this curve. Samples were diluted either five-fold or two-fold in UltraPure water such that they fell within the curve. All samples, standards and blanks were run in technical duplicate.

Genome survey

We compiled the amino acid sequences of the catalytic subunits of all putative form I CO dehydrogenases (CoxL) represented in the National Center for Biotechnology Information (NCBI) Reference Sequence (RefSeq) [73]. All sequences with greater than 55% sequence identity and 90% query coverage to CoxL sequences of Oligotropha carboxidovorans (WP_013913730.1), Mycobacterium smegmatis (WP_003892166.1), and Natronorubrum bangense (WP_006067999.1) were retrieved by protein BLAST [74]. Homologous sequences with less than 55% sequence identity encoded form II CO dehydrogenases and hence were not retrieved. The dataset was manually curated to dereplicate sequences within species and remove incomplete sequences. The final dataset contained a total of 709 CoxL sequences across 685 different bacterial and archaeal species (Table S3).
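The retrieval thresholds above can be expressed as a simple filter over BLAST hits. The sketch below is illustrative only: the record type, field names, and subject identifiers are hypothetical, the treatment of values exactly at a cutoff is an assumption, and the manual curation step is not modeled.

```python
from typing import NamedTuple


class BlastHit(NamedTuple):
    subject_id: str
    percent_identity: float
    query_coverage: float


def is_form_i_candidate(hit, min_identity=55.0, min_coverage=90.0):
    """Apply the identity and query-coverage cutoffs to one hit.

    Assumption: identity must exceed the cutoff, while coverage may equal it.
    """
    return hit.percent_identity > min_identity and hit.query_coverage >= min_coverage


# Hypothetical hits; identifiers are placeholders, not real accessions.
hits = [
    BlastHit("subject_A", 78.2, 99.0),  # passes both cutoffs
    BlastHit("subject_B", 41.5, 97.0),  # identity too low (form II-like)
    BlastHit("subject_C", 63.0, 62.0),  # coverage too low (partial sequence)
]
retained = [h.subject_id for h in hits if is_form_i_candidate(h)]
print(retained)
```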

Phylogenetic analysis

To construct phylogenetic trees, the retrieved sequences were aligned using ClustalW in MEGA7 [75]. Initially, the phylogenetic relationships of 709 sequences were visualized on a neighbor-joining tree based on the Poisson correction method and bootstrapped with 500 replicates. Subsequently, the phylogenetic relationships of a representative subset of 94 sequences were visualized on a maximum-likelihood tree based on the Poisson correction method and bootstrapped with 200 replicates. Both trees were rooted with five form II CO dehydrogenase catalytic subunit protein sequences (WP_012893108.1, WP_012950878.1, WP_013076571.1, WP_01359081.1, WP_013388721.1). We confirmed that trees of similar topology were produced upon using a range of phylogenetic methods, namely neighbor-joining, maximum-parsimony and maximum-likelihood in MEGA, MrBayes, PhyML, and IQ-TREE. In addition, equivalent trees were created by using the protein sequences of the CO dehydrogenase medium subunit (CoxM), small subunit (CoxS), or concatenations of all three subunits (CoxLMS). Varying the form II CO dehydrogenase sequence used also had no effect on the overall topology.

Metagenome and metatranscriptome analysis

Forty pairs of metagenomes and metatranscriptomes that encompassed a range of soil and marine sample types were selected and downloaded from the Joint Genome Institute (JGI) Integrated Microbial Genomes System [76] and the NCBI Sequence Read Archive (SRA) [77]. Table S5 provides details of the datasets used. Raw metagenomes and metatranscriptomes were subjected to quality filtering using NGS QC Toolkit [78] (version 2.3.3, default settings, i.e., base quality score and read length threshold are 20 and 70%, respectively). SortMeRNA [79] (version 2.1, default settings and default rRNA databases) was used to remove ribosomal RNA (rRNA) reads from metatranscriptomes. Each metagenome and metatranscriptome was subsampled to an equal depth of 5 million reads and 2 million reads, respectively, using seqtk seeded with parameter -s100. Subsampled datasets were then screened in DIAMOND (version, default settings, one maximum target sequence per query) [80] using the 709 CoxL protein sequences (Table S3) and the 3261 hydrogenase catalytic subunit gene sequences from HydDB [81]. Hits to CoxL were filtered with an amino acid alignment length over 40 residues and a sequence identity over 60%. Clade classification of the reads was based on their closest match to the CoxL sequence dataset. Hydrogenase hits were filtered with the same amino acid alignment length cutoff and a sequence identity over 50%. Group 4 [NiFe]-hydrogenase hits with a sequence identity below 60% were discarded.
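The hit-filtering rules above can be summarized in a single predicate. This is a sketch under stated assumptions rather than the pipeline used in the study: the function name and arguments are hypothetical, and "over" is read as strictly greater than the cutoff.

```python
def keep_hit(target, identity, aln_length, nife_group4=False):
    """Decide whether a read-level hit passes the filters described above.

    target       -- "CoxL" or "hydrogenase" (hypothetical labels)
    identity     -- percent amino acid identity of the alignment
    aln_length   -- alignment length in residues
    nife_group4  -- True for group 4 [NiFe]-hydrogenase hits
    """
    if aln_length <= 40:  # alignment must be over 40 residues
        return False
    if target == "CoxL":
        return identity > 60.0
    if target == "hydrogenase":
        # Group 4 [NiFe] hits below 60% identity are discarded.
        return identity >= 60.0 if nife_group4 else identity > 50.0
    raise ValueError(f"unknown target family: {target}")


print(keep_hit("CoxL", 75.0, 60))                           # passes
print(keep_hit("hydrogenase", 55.0, 45))                    # passes
print(keep_hit("hydrogenase", 55.0, 45, nife_group4=True))  # discarded
```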


Mycobacterium smegmatis synthesizes carbon monoxide dehydrogenase in response to organic carbon starvation

We first performed a proteome analysis to gain a system-wide context of the levels of CO dehydrogenase during growth and survival of M. smegmatis. Shotgun proteomes were compared for triplicate cultures grown in glycerol-supplemented minimal media under two conditions: mid-exponential growth (OD600 ~ 0.25; 5.1 mM glycerol left in medium) and mid-stationary phase following carbon limitation (72 h post ODmax ~ 0.9; no glycerol detectable in medium) (Fig. 1a). There was a major change in the proteome profile, with 270 proteins more abundant and 357 proteins less abundant by at least four-fold (p < 0.05) in the carbon-limited condition (Fig. 1b; Table S2).
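For readers reproducing this kind of comparison, the four-fold, p < 0.05 criterion can be sketched as below. The function name and the example values are hypothetical and are not taken from Table S2; fold change is defined as stationary-phase abundance divided by exponential-phase abundance.

```python
import math


def classify_protein(fold_change, p_value, min_fold=4.0, alpha=0.05):
    """Label a protein by its abundance change under carbon limitation."""
    if p_value >= alpha:
        return "unchanged"
    if fold_change >= min_fold:
        return "more abundant"
    if fold_change <= 1.0 / min_fold:
        return "less abundant"
    return "unchanged"


# Hypothetical (fold change, p value) pairs for illustration.
examples = {"protein_A": (54.0, 0.001), "protein_B": (0.1, 0.01), "protein_C": (2.0, 0.001)}
for name, (fold, p) in examples.items():
    # Volcano plots conventionally show log2 fold change on the x-axis.
    print(name, f"log2FC = {math.log2(fold):.2f}", classify_protein(fold, p))
```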

Fig. 1

Comparison of proteome composition of carbon-replete and carbon-limited cultures of Mycobacterium smegmatis. a Growth of M. smegmatis in Hartmans de Bont minimal medium supplemented with 5.8 mM glycerol. The glycerol concentration of the external medium is shown. Error bars show standard deviations of three biological replicates. Cells were harvested for proteomic analysis at OD600 = 0.25 (mid-exponential phase, glycerol-rich) and 3 days post ODmax (mid-stationary phase, glycerol-limited). b Volcano plot showing relative expression change of genes following carbon limitation. Fold change was determined by dividing the relative abundance of each protein in three stationary phase proteomes with that in the three exponential phase proteomes (biological replicates). Each protein is represented by a gray dot. Structural subunits of selected metabolic enzymes, including the form I CO dehydrogenase, are highlighted and their locus numbers are shown in subscript in the legend

The top 50 proteins with increased abundance included those involved in trace gas metabolism and amino acid catabolism. In line with our hypotheses, there was an increase in the structural subunits of a putative form I CO dehydrogenase, including a 54-fold increase in the catalytic subunit CoxL. Levels of the two uptake hydrogenases also increased, particularly the catalytic subunit of hydrogenase-2 (HhyL, 148-fold), in line with previous observations that mycobacteria persist on atmospheric H2 [54, 63]. There was also evidence that M. smegmatis generates additional reductant in this condition by catabolizing amino acid reserves: the three subunits of a branched-chain keto-acid dehydrogenase complex were the most differentially abundant proteins overall and there was also a strong induction of the proline degradation pathway, including the respiratory proline dehydrogenase (Fig. 1b).

The abundance of various enzymes mediating organic carbon catabolism decreased, including the respiratory glycerol 3-phosphate dehydrogenase (10-fold) and glycerol kinase (8-fold), in line with cultures having exhausted glycerol supplies (Fig. 1b). The proteome also suggests that various energetically-expensive processes, such as cell wall, ribosome, and DNA synthesis, were downregulated (Table S2). Overall, these results suggest that M. smegmatis reduces its energy expenditure and expands its metabolic repertoire, including by oxidizing CO, to stay energized during starvation.

Carbon monoxide dehydrogenase mediates atmospheric CO oxidation and supports aerobic respiration

Having confirmed that a putative CO dehydrogenase is present in stationary-phase M. smegmatis cells, we subsequently confirmed its activity through whole-cell biochemical assays. To do so, we constructed a markerless deletion of the coxL gene (MSMEG_0746) (Fig. S1). Native polyacrylamide gels containing fractions of wild-type M. smegmatis harvested in carbon-limited stationary-phase cells strongly stained for CO dehydrogenase activity in a 100% CO atmosphere; the molecular mass of the band corresponds to the theoretical molecular mass of a dimer of CoxLMS subunits (~269 kDa). However, no activity was observed in the ΔcoxL background (Fig. 2a).

Fig. 2

Comparison of carbon monoxide dehydrogenase activity of Mycobacterium smegmatis wild-type and ΔcoxL cultures. a Zymographic observation of CO dehydrogenase activity and localization. The upper gel shows enzyme activity stained with the artificial electron acceptor nitroblue tetrazolium chloride in a CO-rich atmosphere. The lower gel shows protein ladder and whole protein stained with Coomassie Blue. Results are shown for whole-cell lysates (L), cytosolic fractions (C), and membrane fractions (M) of wild-type (WT) and ΔcoxL cultures. b Gas chromatography measurement of CO oxidation to sub-atmospheric levels. Mixing ratios are displayed on a logarithmic scale, the dotted line shows the average atmospheric mixing ratio of CO (90 ppbv), and error bars show standard deviations of three biological replicates. c Apparent kinetic parameters of CO oxidation by wild-type cultures. Curves of best fit and kinetic parameters were calculated based on a Michaelis–Menten non-linear regression model. Vmax app and Km app values derived from other models are shown in Table S4. d Examples of traces from oxygen electrode measurements. O2 levels were measured before and after CO addition in both a wild-type and ΔcoxL background. e Summary of rates of O2 consumption measured using an oxygen electrode. Center values show means and error bars show standard deviations from three biological replicates. For all values with different letters, the difference between means is statistically significant (p < 0.001) based on Student’s t-tests

Gas chromatography measurements confirmed that M. smegmatis oxidized carbon monoxide at atmospheric concentrations. Stationary-phase cultures consumed the CO added to the headspace (~200 ppmv) to sub-atmospheric concentrations (46 ± 5 ppbv) within 100 h (Fig. 2b). The apparent kinetic parameters of this activity (Vmax app = 3.13 nmol gdw−1 min−1; Km app = 350 nM; threshold app = 43 pM) are consistent with a moderate-affinity, slow-acting enzyme (Fig. 2c; Table S4). Such rates are similar to those previously measured for hydrogenase-2 [63]. It is important to note, however, that measurements are based on whole-cell activities and may not reflect the kinetics of the purified enzyme. No change in CO mixing ratios was observed for the ΔcoxL strain (Fig. 2b), confirming that the form I CO dehydrogenase is the sole CO-oxidizing enzyme in M. smegmatis. In turn, these results provide the first genetic proof that form I CO dehydrogenases mediate atmospheric CO oxidation.

We performed oxygen electrode experiments to test whether CO addition stimulated aerobic respiration. In stationary-phase cultures, addition of CO caused a 15-fold stimulation of respiratory O2 consumption relative to background rates (p < 0.0001). This stimulation was observed in the wild-type strain, but not the ΔcoxL mutant, demonstrating it is dependent on CO oxidation activity of the CO dehydrogenase (Fig. 2d, e). Thus, while this enzyme is predominantly localized in the cytosol (Fig. 2a), it serves as a bona fide respiratory dehydrogenase that supports aerobic respiration in M. smegmatis.

Carbon monoxide is dispensable for growth and detoxification, but enhances survival during carbon starvation

We then performed a series of experiments to resolve the expression and importance of the CO dehydrogenase during growth and survival. Consistent with the proteomic analyses, expression levels of coxL were low in carbon-replete cultures (mid-exponential phase; 1.35 × 10^7 transcripts gdw−1) and increased 56-fold in carbon-limited cultures (mid-stationary phase; 7.48 × 10^8 transcripts gdw−1; p < 0.01). Addition of 1% CO did not significantly change coxL expression in either growing or stationary cultures (Fig. 3a). These profiles suggest that M. smegmatis expresses CO dehydrogenase primarily to enhance survival by scavenging atmospheric CO, rather than to support growth on elevated levels of CO.

Fig. 3

Expression and importance of carbon monoxide dehydrogenase during growth and survival of Mycobacterium smegmatis. a Normalized number of transcripts of the CO dehydrogenase large subunit gene (coxL; MSMEG_0746) in wild-type cultures harvested during exponential phase (carbon-replete) and stationary phase (carbon-limited) in the presence of either ambient CO or 1% CO. Error bars show standard deviations of four biological replicates. For all values with different letters, the difference between means is statistically significant (p < 0.01) based on Student’s t-tests. b Final growth yields (ODmax) and specific growth rates of wild-type and ΔcoxL strains. Strains were grown on Hartmans de Bont minimal medium supplemented with either 5.5 mM glycerol, 20% CO, or both 5.5 mM glycerol and 20% CO. Values labeled with different letters are significantly different (p < 0.05) based on Student’s t-tests. Error bars show standard deviations of three biological replicates. c Long-term survival of wild-type and ΔcoxL strains in Hartmans de Bont minimal medium supplemented with 5.5 mM glycerol. Percentage survival was calculated by dividing the colony-forming units (CFU mL−1) at each timepoint by those counted at ODmax (day 0). Error bars show standard deviations of nine biological replicates. For asterisked values, there was a significant difference in survival of ΔcoxL strains compared to the wild-type (p < 0.05) based on Student’s t-tests

These inferences were confirmed by monitoring the growth of the wild-type and ΔcoxL strains under different conditions. The strains grew identically on glycerol-supplemented minimal medium. Addition of 20% CO caused a slight increase in doubling time for both strains and did not affect growth yield (Fig. 3b). This suggests that M. smegmatis is highly tolerant of CO but does not require CO dehydrogenase to detoxify it. M. smegmatis did not grow chemolithoautotrophically on a minimal medium with 20% CO as the sole carbon and energy source (Fig. 3b). While carboxydotrophic growth was previously reported for this strain, the authors potentially observed CO-tolerant heterotrophic or mixotrophic growth, given the reported media contained metabolizable organic carbon sources [40]. Consistently, M. smegmatis lacks key enzymes of the Calvin–Benson cycle (e.g., RuBisCO, ribulose 1,5-bisphosphate carboxylase) typically required for carboxydotrophic growth.

Finally, we monitored the long-term survival of the two strains after they reached maximum cell counts upon exhausting glycerol supplies (Fig. 1a). The percentage survival of the ΔcoxL strain was lower than that of the wild-type at all timepoints, including by 45% after 4 weeks and 50% after 5 weeks of persistence. These findings were reproducible across two independent experiments and were significant at the 98% confidence level (Fig. 3c). Such reductions in relative percentage survival are similar to those previously observed for uptake hydrogenase mutants in M. smegmatis (47%) [53, 54] and Streptomyces avermitilis (74%) [57]. These experiments, therefore, provide genetic evidence that atmospheric CO oxidation mediated by form I CO dehydrogenases enhances bacterial persistence. It should be noted that we did not attempt to complement the observed phenotypes, though whole-genome sequencing confirmed that no other substitutions were present in ΔcoxL compared to the wild-type cells.
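
The percentage-survival arithmetic described in the Fig. 3 legend can be sketched as follows. This is a minimal illustration only: the CFU counts, the function names, and the choice to express the strain difference as a reduction relative to wild-type survival are assumptions made for the example, not data or code from this study.

```python
def percent_survival(cfu_timepoint: float, cfu_day0: float) -> float:
    """Percentage survival: CFU/mL at a timepoint divided by CFU/mL at ODmax (day 0)."""
    if cfu_day0 <= 0:
        raise ValueError("day-0 CFU count must be positive")
    return 100.0 * cfu_timepoint / cfu_day0


def relative_reduction(mutant_pct: float, wild_type_pct: float) -> float:
    """Reduction in mutant survival, expressed as a percentage of wild-type survival."""
    if wild_type_pct <= 0:
        raise ValueError("wild-type survival must be positive")
    return 100.0 * (1.0 - mutant_pct / wild_type_pct)


# Hypothetical CFU/mL values, chosen only to illustrate the arithmetic.
wild_type = percent_survival(cfu_timepoint=2.0e7, cfu_day0=1.0e8)
mutant = percent_survival(cfu_timepoint=1.0e7, cfu_day0=1.0e8)

print(f"wild-type survival: {wild_type:.1f}%")
print(f"mutant survival: {mutant:.1f}%")
print(f"relative reduction: {relative_reduction(mutant, wild_type):.1f}%")
```

With replicate cultures, the same functions would be applied per replicate before means and standard deviations are computed.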

Atmospheric carbon monoxide oxidation is an ancient, taxonomically widespread and ecologically important process

We subsequently surveyed genomic, metagenomic, and metatranscriptomic datasets to gain insights into the taxonomic and ecological distribution of atmospheric CO oxidation. This yielded 709 amino acid sequences of large subunits of the form I CO dehydrogenases (CoxL) across some 685 species, 196 genera, 49 orders, and 25 classes of bacteria and archaea (Table S3; Fig. 4a, b). The retrieved sequences encompassed all sequenced species, across seven phyla (Fig. 4b), that have previously been shown to mediate aerobic CO oxidation (Table S1). We also detected coxL genes in nine other phyla where aerobic CO oxidation has yet to be experimentally demonstrated (Fig. 4b). Hence, the capacity for aerobic CO respiration appears to be a much more widespread trait among aerobic bacteria and archaea than previously reported [49, 62]. It is particularly notable that coxL genes were detected in representatives of seven of the nine most dominant soil phyla [64, 82], namely Proteobacteria, Actinobacteriota, Acidobacteriota, Chloroflexota, Firmicutes, Gemmatimonadota, and Bacteroidota (Fig. 4b). While most species surveyed encoded a single copy, 16 actinobacterial species encoded two isozymes of CO dehydrogenase (Table S3).

Fig. 4

Distribution of carbon monoxide dehydrogenases in genomes, metagenomes, and metatranscriptomes. a Maximum-likelihood phylogenetic tree showing the evolutionary history of the catalytic subunit of the form I CO dehydrogenase (CoxL). Evolutionary distances were computed using the Poisson correction model, gaps were treated by partial deletion, and the tree was bootstrapped with 200 replicates. The tree was constructed using a representative subset of 94 CoxL amino acid sequences from Table S3; a neighbor-joining tree containing all 709 CoxL sequences retrieved in this study is provided in Fig. S2. The major clades of the tree are labeled, and the colored bars represent the phylum that each sequence is affiliated with. The tree was rooted with five form II CO dehydrogenase sequences (not shown). b Phylum-level distribution of the CoxL-encoding species and orders identified in this work. c Abundance of coxL genes and transcripts in environmental samples. In total, 40 pairs of metagenomes and metatranscriptomes (20 aquatic, 20 terrestrial) were analyzed from a wide range of biomes (detailed in Table S5). The abundances of hhyL genes and transcripts, encoding the high-affinity group 1h [NiFe]-hydrogenase, are shown for comparison. Box plots show the individual values and their mean, quartiles, and range for each dataset.

We constructed phylogenetic trees to visualize the evolutionary relationships of CoxL protein sequences (Fig. 4a; Fig. S2). The trees contained five monophyletic clades that differed in phylum-level composition, namely actinobacterial, proteobacterial, and halobacterial clades, as well as mid-branching major (mixed 1) and minor (mixed 2) clades of mixed composition containing representatives from seven and three different phyla, respectively. Clades were well-supported by bootstrap values, with the exception of the mixed 2 clade (Fig. 4a; Fig. S2). Trees with equivalent clades were produced when using seven distinct phylogenetic methods, using other CO dehydrogenase subunits (CoxM, CoxS, and CoxLMS concatenations), or varying the outgroup sequences. In all cases, major clades included CoxL proteins of at least one previously characterized carboxydotroph or carboxydovore (Table S1). Surprisingly, all clades also contained species that have been previously shown to oxidize atmospheric CO (Table S1). This suggests that atmospheric CO oxidation is a widespread and ancestral capability among CO dehydrogenases. In contrast, CO dehydrogenases known to support aerobic carboxydotrophic growth were sparsely distributed across the tree (Fig. 4a; Table S1).

To better understand the ecological significance of aerobic CO oxidation, we surveyed the abundance of coxL sequences across 40 pairs of metagenomes and metatranscriptomes (Table S5). Genes and transcripts for coxL were detected across a wide range of biomes. They were particularly abundant in the oxic terrestrial and marine samples surveyed (1 in every 8000 reads), for example, grassland and rainforest soils, coastal and mesopelagic seawater, and salt marshes (Table S6). In contrast, they were expressed at very low levels in anaerobic samples (e.g., groundwater, deep subsurface, peatland) (Fig. S3). Across all surveyed metatranscriptomes, the majority of the coxL hits were affiliated with the mixed 1 (40%), proteobacterial (25%), and actinobacterial (25%) clades, with minor representation of the mixed 2 (8%) and halobacterial (2%) clades (Table S6). The normalized transcript abundance of coxL was higher than that of the genetic determinants of atmospheric H2 oxidation (hhyL; high-affinity hydrogenase) in most samples (18-fold in aquatic samples, 1.2-fold in terrestrial samples) (Fig. 4c). Together, this suggests that CO oxidation is of major importance in aerated environments and is mediated by a wide range of bacteria and archaea.
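
The abundance figures above can be related to one another with simple arithmetic. The sketch below assumes, purely for illustration, that abundance is expressed as gene hits per total reads; the read counts and function names are hypothetical, and the normalization actually used in this study may differ.

```python
def hits_per_read(hits: int, total_reads: int) -> float:
    """Fraction of reads assigned to a gene (hits divided by total reads)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return hits / total_reads


def fold_difference(abundance_a: float, abundance_b: float) -> float:
    """Ratio of two normalized abundances (a relative to b)."""
    if abundance_b <= 0:
        raise ValueError("reference abundance must be positive")
    return abundance_a / abundance_b


# Hypothetical read counts for one metatranscriptome.
total_reads = 1_000_000
coxl = hits_per_read(hits=125, total_reads=total_reads)  # 125 hits per million reads
hhyl = hits_per_read(hits=50, total_reads=total_reads)   # 50 hits per million reads

print(f"coxL: 1 hit per {1 / coxl:.0f} reads")
print(f"coxL vs hhyL: {fold_difference(coxl, hhyl):.1f}-fold")
```

Under this assumed normalization, 125 hits per million reads corresponds to the "1 in every 8000 reads" scale quoted above.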


In this work, we validated that atmospheric CO oxidation supports bacterial survival during nutrient limitation. M. smegmatis increases the transcription and synthesis of a form I CO dehydrogenase by 50-fold in response to organic carbon limitation. Biochemical studies confirmed that this enzyme is kinetically adapted to scavenge atmospheric concentrations of CO and uses the derived electrons to support aerobic respiration. In turn, deletion of the genes encoding the enzyme did not affect growth under a range of conditions, but resulted in severe survival defects in carbon-exhausted cultures. These findings are reminiscent of previous observations that M. smegmatis expresses two high-affinity hydrogenases to persist by scavenging atmospheric H2 [53,54,55, 63]. In common with atmospheric H2, atmospheric CO is a high-energy, diffusible, and ubiquitous trace gas [28], and is, therefore, a dependable source of energy to sustain the maintenance needs of bacteria during persistence. Overall, the proteome results suggest that M. smegmatis activates CO scavenging as a core part of a wider response to enhance its metabolic repertoire; the organism appears to switch from acquiring energy organotrophically during growth to mixotrophically during survival by scavenging a combination of inorganic and organic energy sources.

Despite this progress in resolving the physiological role of CO oxidation in this organism, detailed mechanistic studies are required to understand how M. smegmatis and other carboxydovores gain energy from atmospheric CO oxidation. Firstly, it is unclear what enables CO dehydrogenase to bind and oxidize atmospheric CO. It is important to compare whole-cell kinetic parameters with those of the purified enzyme, given that enzyme activity is likely to be influenced by both structural features and cellular context. While it is probable that structural adaptations of the enzyme contribute to high-affinity binding, the only solved structures of molybdenum–copper CO dehydrogenases to date are from the apparent low-affinity carboxydotroph O. carboxidovorans [24, 25]. Secondly, it is unclear how CO dehydrogenase inputs electrons into the aerobic respiratory chain. Our study indicates that the CO dehydrogenase is primarily cytosolic, though it cannot be ruled out that it makes weak or transient associations with the cell membrane. The proteome data show that M. smegmatis expresses CoxG (MSMEG_0749), which is implicated as a membrane anchor for CO dehydrogenase in O. carboxidovorans [83,84,85]. Thirdly, further studies are required to resolve how M. smegmatis couples CO oxidation to O2 reduction, including with respect to reaction stoichiometry, electron flow, and terminal oxidase selectivity. In this regard, one discrepancy is that we observed a surprisingly high rate of O2 reduction (oxygen electrode measurements) compared to CO oxidation (gas chromatography); side-by-side analyses are required to determine whether these findings are physiologically relevant or instead reflect methodological differences.

Looking more broadly, it is probable that CO supports the persistence of many other bacterial and archaeal species. Atmospheric CO oxidation is a common trait among all carboxydovores tested to date and has been experimentally demonstrated in 18 diverse genera of bacteria and archaea [19, 29, 33, 36, 43, 44, 48]. In this regard, a recent study demonstrated that the hot spring bacterium Thermomicrobium roseum (phylum Chloroflexota) upregulates a form I CO dehydrogenase and oxidizes atmospheric CO as part of a similar response to carbon starvation [44]. It has also been demonstrated that the form I CO dehydrogenases of the known atmospheric CO scavenger Ruegeria pomeroyi [58] and a Phaeobacter isolate [59] from the marine Roseobacter clade (phylum Proteobacteria) are highly upregulated under energy-limiting conditions. The capacity for atmospheric CO uptake has also been demonstrated in four halophilic archaeal genera (phylum Halobacterota) [33, 48] and may extend to thermophilic archaea (phylum Crenarchaeota) [46, 47]. Moreover, two cultured aerobic methanotrophs harbor the capacity for aerobic CO respiration [86, 87]. Our study, by showing through a molecular genetic approach that CO oxidation enhances survival, provides a physiological rationale for these observations. Altogether, this suggests that most organisms encoding form I CO dehydrogenases use this enzyme to support survival rather than growth.

These results also have broader implications for understanding biogeochemical cycling and microbial biodiversity at the ecosystem level. It is well-established that soil bacteria are major net sinks for atmospheric CO and that marine bacteria mitigate geochemical oceanic emissions of this gas [10]. This study, by confirming the enzymes responsible and demonstrating that their activities support bacterial persistence, has ramifications for modeling these biogeochemical processes. In turn, we propose that CO is an important energy source supporting the biodiversity and stability of aerobic heterotrophic communities in terrestrial and aquatic environments. The genomic survey supports this by demonstrating that form I CO dehydrogenases, most of which are predicted to support atmospheric CO oxidation, are encoded by 685 species and 16 phyla of bacteria and archaea. Moreover, the metagenomic and metatranscriptomic analyses confirmed that coxL genes and transcripts are highly abundant in most aerated soil and marine ecosystems. The notably high abundance of coxL transcripts in pelagic samples of various depths suggests CO may be a major energy source for maintenance of marine bacteria. In soils, the oxidation of atmospheric CO may be of similar importance to that of atmospheric H2; this is suggested by the strength of the soil sinks for these gases [1, 88], the abundance of coxL and hhyL genes in soil metagenomes, and the distribution of these genes in the genomes of soil bacteria [89]. Atmospheric CO may be especially important for sustaining communities in highly oligotrophic soils, as indicated by previous studies in polar deserts [51], volcanic deposits [60, 62, 90], and salt flats [33, 91, 92]. Further work is now needed to understand which microorganisms mediate consumption of atmospheric CO in situ and how their activity is controlled by physicochemical factors.

Integrating these findings with the wider literature, we propose a new survival-centric model for the evolution of CO dehydrogenases. It was traditionally thought that aerobic CO oxidation primarily supports autotrophic and mixotrophic growth of microorganisms [11, 26]. However, the majority of studied CO-oxidizing bacteria are, in fact, carboxydovores, of which those that have been kinetically characterized can oxidize CO at sub-atmospheric levels (Table S1). In turn, our phylogenomic analysis revealed that atmospheric CO-oxidizing bacteria are represented in all five clades of the phylogenetic tree, suggesting that the common ancestor of these enzymes also harbored sufficient substrate affinity to oxidize atmospheric CO. On this basis, we propose that microorganisms first evolved a sufficiently high-affinity form I CO dehydrogenase to subsist on low concentrations of CO. The genes encoding this enzyme were then horizontally and vertically disseminated to multiple bacterial and archaeal genera inhabiting different environments. On multiple occasions, certain bacterial lineages evolved to support growth on CO in microenvironments where CO is present at elevated concentrations. This would have required relatively straightforward evolutionary innovations, namely acquisition of Calvin–Benson cycle enzymes (e.g., RuBisCO) and their integration with CO dehydrogenase. The modulation of CO dehydrogenase kinetics was likely not a prerequisite, given these enzymes efficiently oxidize CO at a wide range of substrate concentrations [19, 44], but may have subsequently enhanced carboxydotrophic growth. In this regard, it remains to be explored whether some cultivated carboxydotrophs can also support persistence using trace concentrations of CO. These evolutionary inferences differ from those made for hydrogenases, where high-affinity, oxygen-tolerant enzymes appear to have evolved from low-affinity, oxygen-sensitive ones [89]. However, it is probable that the processes of atmospheric CO and H2 oxidation evolved due to similar physiological pressures and over similar evolutionary timescales.