Main

Aerobic microorganisms usually oxidize their carbon sources to carbon dioxide and water. During this process, energy and intermediary metabolites required for biosynthesis are generated. In contrast, many acetic acid bacteria oxidize their substrates incompletely even under normal growth conditions. Among them is the α-proteobacterium G. oxydans, a Gram-negative, obligately aerobic and rod-shaped acidophilic organism1 that belongs to the family Acetobacteraceae. G. oxydans is known for its incomplete oxidation of a wide range of carbohydrates and alcohols in a process that is referred to as oxidative fermentation. The corresponding products (aldehydes, ketones and organic acids) are secreted almost completely into the medium. The organism is able to grow in highly concentrated sugar solutions and at low pH values. High oxidation rates correlate with low biomass production, making G. oxydans strains suitable objects for industrial application2.

Because these organisms oxidize many substrates regioselectively, G. oxydans strains are used for the production of L-sorbose from D-sorbitol (vitamin C synthesis)3 and 6-amino-L-sorbose from 1-amino-D-sorbitol for the synthesis of the antidiabetic drug Miglitol4. They can also be used industrially to produce D-gluconic acid and ketogluconic acids from D-glucose, and dihydroxyacetone from glycerol5. Strains of this genus also produce aliphatic, aromatic carboxylic and thiocarboxylic acids that find applications as flavoring ingredients6. Furthermore, Gluconobacter enzymes and whole cells are thought to be ideal for use as the biological element in sensor systems for the detection of alcohols, sugars and polyols7. Natural habitats of G. oxydans are flowers and fruits8. The organism is also found in alcoholic beverages (wines and beers) and soft drinks, where it causes off-flavors and spoilage9.

Here we report the complete genome sequence of G. oxydans 621H (DSM 2343). The genome data provide a rich source for metabolic reconstruction of the pathways leading to industrially important products derived from sugars and alcohols.

Results

General features of the G. oxydans genome

The entire genome of G. oxydans 621H was sequenced to obtain detailed insights into the oxidative potential of the organism and to elucidate the mechanisms of incomplete oxidation. The circular chromosome consists of 2,702,173 base pairs (bp) with a G+C content of 60.8% (Table 1). In addition, five plasmids were identified (163.1, 26.6, 14.6, 13.2 and 2.7 kb). Hence, the total size of the genome is 2,922,384 bp. There are a total of 2,664 predicted protein-encoding open reading frames (ORFs), four copies of rRNA operons and 55 genes encoding tRNAs. It is predicted that 89.9% of the DNA encodes proteins or stable RNAs. Biological roles were assigned to 1,877 ORFs (70.5%; see Supplementary Table 1 online). A further 446 ORFs showed similarities to hypothetical proteins from other organisms, and 341 ORFs were unique to G. oxydans (Table 1). A total of 1,650 proteins showed high homologies to proteins from other bacteria (threshold: blastp e-value <e−15). Of these proteins, 76.4% (1,260) were most homologous to proteins from α-proteobacteria (see Supplementary Table 2 online). Most prominent were Rhodospirillum rubrum and Magnetospirillum magnetotacticum. The GC skew analysis10 of the chromosome from G. oxydans indicated a bidirectional replication that starts at the proposed oriC sequence near the dnaA gene and ends near the calculated replication terminus at around 1.3 Mb (Supplementary Fig. 1 online).
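The GC skew analysis mentioned above can be illustrated with a minimal Python sketch. It is not the procedure of ref. 10; it assumes the common convention that the cumulative skew (G − C)/(G + C), summed over consecutive windows, reaches its minimum near the replication origin and its maximum near the terminus. The function names and the window size are hypothetical choices made for illustration.

```python
from itertools import accumulate


def gc_skew_windows(seq: str, window: int = 1000) -> list[float]:
    """Return (G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # Windows without any G or C get a skew of zero.
        skews.append((g - c) / (g + c) if (g + c) else 0.0)
    return skews


def origin_and_terminus(seq: str, window: int = 1000) -> tuple[int, int]:
    """Estimate origin and terminus coordinates from the cumulative skew.

    Assumption (labeled, not taken from the paper): the cumulative skew is
    lowest at the origin and highest at the terminus. Coordinates mark the
    end of the window at which the extreme value is reached.
    """
    cumulative = list(accumulate(gc_skew_windows(seq, window)))
    ori_idx = min(range(len(cumulative)), key=cumulative.__getitem__)
    ter_idx = max(range(len(cumulative)), key=cumulative.__getitem__)
    return (ori_idx + 1) * window, (ter_idx + 1) * window


if __name__ == "__main__":
    # Toy sequence: a C-rich stretch followed by a G-rich stretch.
    toy = "GCCC" * 500 + "GGGC" * 500
    print(origin_and_terminus(toy, window=1000))
```

For a real chromosome, the sequence would be read from a FASTA file and a larger window would normally be used. Because the chromosome is circular, the choice of coordinate zero shifts the curve but not the distance between the two extremes.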

Table 1 General features of the G. oxydans genome

Also present in the G. oxydans genome is a large number of repeated DNA elements, which are known to be involved in genomic rearrangements. We identified 82 insertion sequences (IS) and 103 transposase genes (Supplementary Table 1 online). Some of these copies are partially deleted and are therefore presumed to be defective. Most of the IS, however, appear to have functional copies and may be responsible for the genetic instability leading to deficiencies in various physiological properties, as observed in a variety of acetic acid bacteria11. Some IS elements can be grouped into the families IS12528 (ten copies) and IS1032 (eight copies)12.

G. oxydans has many two-component regulatory systems (15 sensor kinases and 16 response regulators, including 6 sensor-regulator pairs), which mediate the response to various environmental stimuli. The genome also codes for 75 regulatory proteins, many of which belong to previously described regulator families such as LysR (13), AraC (4), MarR (3), TetR (6), AsnC (2) and MerR (2). G. oxydans appears to have a simple chemosensory transducer protein system. The genome contains one complete set of chemotaxis genes that are organized in one gene cluster, and three copies of genes encoding methyl-accepting chemotaxis proteins (MCPs). The chemotactic response is accomplished by signal transmission between the receptor complexes and the flagellar motor complexes. There are two copies of CheY (GOX1551, GOX1556), which act as messenger proteins, transducing the signal from the MCPs to the flagella13. The genome contains 27 predicted flagellar genes, most of them organized in six gene clusters (see Supplementary Table 1 online). The flagellum is essentially composed of three substructures, a long helical filament that is connected via the flexible hook to the complex basal body, which is located in the cell wall. All genes encoding the structural proteins of the flagellum were found in the G. oxydans genome. These results are in accordance with morphological features of this organism that indicate the presence of peritrichous flagella14.

With respect to cell wall assembly, all essential genes involved in the biosynthesis of lipopolysaccharides and peptidoglycan were identified. The organism contains a gene cluster comprising 14 genes (GOX1477–1490) that is probably required for the production of a cell surface capsular polysaccharide. Furthermore, we identified all genes necessary for the biosynthesis of squalene and hopanoids (see Supplementary Table 1 online).

Plasmids and phages

None of the five plasmids shows homology to known plasmids of other G. oxydans strains (e.g., pAG5 (ref. 15) and pGO128 (M. Sievers, direct submission to GenBank; Gen ID: NC_003374)). However, pGOX5 revealed substantial similarities over 1.5 kb with plasmid pJK21 from Acetobacter europaeus16. A BLAST search with the RepA proteins of plasmids pGOX1, pGOX2, pGOX3 and pGOX5 indicated high homologies to the corresponding proteins encoded by genes of plasmids from other α-proteobacteria.

Around 70% of the ORFs of the G. oxydans 621H plasmids encode hypothetical proteins. Genes encoding proteins with putative functions include a DNA helicase II (uvrD), a restriction/modification system, a C4-dicarboxylate transporter, a heavy metal resistance system, proteins involved in plasmid replication and two alcohol dehydrogenases with unknown substrate spectra (GOX2594 and GOX2684; for additional information see Supplementary Table 3 online). Additionally, the megaplasmid contains the genetic information for DNA transfer via conjugation (Supplementary Table 1 online). We detected two regions (GOX2318–2357 and GOX1211–1226) with ORFs that show similarity to known phage genes. G. oxydans strains have been reported to contain dsDNA phages with contractile tails (Myoviridae, e.g., phage GW6210) and noncontractile phages with short tails (Podoviridae, e.g., JW2040)17. These phages were identified by electron microscopy, and their DNA sequences are not available.

Respiratory chain

Knowledge of the DNA sequence forms the basis for the elucidation of the respiratory chain as a key element of the oxidative metabolism in G. oxydans. The core system is rather simple, consisting of a non-proton-translocating NADH:ubiquinone oxidoreductase and two quinol oxidases of the bo3 and bd type, respectively (Fig. 1). The former protein has been characterized biochemically18. The identification of a bd-type quinol oxidase explains previous findings that G. oxydans contains a cyanide-insensitive terminal oxidase that is produced at low pH values19. The organism lacks a proton-translocating NADH:ubiquinone oxidoreductase (complex I) and a cytochrome c oxidase (complex IV). Therefore, the ability to translocate protons in the course of redox reactions is rather limited (Fig. 1). Genes encoding a ubiquinol:cytochrome c oxidoreductase (bc1 complex) were identified. However, the function of the corresponding protein complex is unclear because reduced cytochrome c cannot be reoxidized by complex IV, which is missing in this organism. The electrochemical proton gradient is used to generate ATP via an F1Fo-type ATP synthase. The corresponding genes are organized in two clusters, which are located at different positions on the chromosome. Cluster I (GOX1110–1113) encodes the hydrophobic membrane-bound subunits and cluster II (GOX1310–1314) contains the genes for the hydrophilic part of the enzyme (see Supplementary Table 1 online).

Figure 1: Respiratory chain and sugar/alcohol metabolism in G. oxydans 621H.

(1) Membrane-bound lactate dehydrogenase (GOX1253). (2) PQQ-dependent alcohol dehydrogenase (GOX1067–1068). (3) Membrane-bound acetaldehyde dehydrogenase (GOX0585–0587). (4) PQQ-dependent glucose dehydrogenase (GOX0265). (5) Membrane-bound gluconate-2-dehydrogenase (GOX1230–1232). (6) Glycerol/sorbitol dehydrogenase (SldAB; GOX854–855). (7) Sorbitol dehydrogenase (fructose forming; GOX2094–2097). (8) Uncharacterized membrane-bound oxidoreductases (see Supplementary Table 3 online). (9) Uncharacterized flavin-containing oxidoreductases (see Supplementary Table 3 online). (10) Uncharacterized PQQ-containing oxidoreductases (GOX0516, GOX1441, GOX1857, GOX1969). (11) Soluble glucose dehydrogenase (GOX2015). (12) Soluble gluconate dehydrogenase (GOX2187). (13) 2,5-diketogluconate reductase (GOX0644). (14) Uncharacterized cytoplasmic oxidoreductases (see Supplementary Table 3). (15) Soluble sorbitol dehydrogenase (fructose forming; GOX1432). (16) L-sorbose reductase (GOX0849). (17) Pyruvate decarboxylase (GOX1081). (18) Soluble acetaldehyde dehydrogenase (GOX2018). (19) Soluble alcohol dehydrogenase (GOX0313). OxPP indicates that the corresponding intermediates are further metabolized by the oxidative pentose-phosphate pathway. All soluble oxidoreductases are NAD(P) dependent and catalyze reversible reactions. The membrane-bound dehydrogenases transfer electrons to ubiquinone (UQH2). The left part of the figure shows the components of the respiratory chain in G. oxydans 621H: Membrane-bound transhydrogenase (GOX0310–0312); NADH-dehydrogenase (type II), nonproton translocating NADH:ubiquinone oxidoreductase (GOX1675); quinol oxidase (bo3), cytochrome bo3 ubiquinol oxidase (GOX1911–1914); quinol oxidase (bd), cytochrome bd ubiquinol oxidase (GOX0278–0279); bc1 complex, ubiquinol:cytochrome c oxidoreductase (GOX0565–0567), Cytc, cytochrome c (GOX0564). The question marks indicate that the function of the bc1 complex is not clear yet. 
The function of the enzymes was verified by BLAST and motif searches of the corresponding ORFs against public databases.

Oxidoreductases

In addition to the standard respiratory complexes mentioned above, G. oxydans possesses a large and diverse set of genes encoding membrane-bound dehydrogenases that channel electrons into the respiratory chain (Fig. 1) as well as many intracellular oxidoreductases.

One striking feature of the G. oxydans genome is that 75 ORFs were identified that encode putative dehydrogenases/oxidoreductases of unknown function (Fig. 1; Supplementary Table 3 online). Among these ORFs, 23 were predicted to encode proteins with at least one transmembrane helix and four to encode proteins that are probably located in the periplasm. The organism contains 15 putative oxidoreductases of the short chain dehydrogenase/reductase family that catalyze the reversible oxidation of alcohols to aldehydes with the concomitant reduction of NAD+. We identified four novel pyrroloquinoline quinone (PQQ)-dependent dehydrogenases, of which three are membrane-bound with five transmembrane helices in the N-terminal domain. The C-terminal domain faces the outside and comprises the PQQ binding site and the catalytic center. The remaining protein represents a periplasmic PQQ enzyme. Five putative oxidoreductases belong to the zinc-binding dehydrogenase family. These proteins use NADP+ or NAD+ and oxidize alcohols or reduce aldehydes. Six putative soluble oxidoreductases belong to the aldo-keto reductase family, which comprises a number of related monomeric NADPH-dependent oxidoreductases. Furthermore, we identified four FAD-linked oxidases, three of which are referred to as putative D-lactate dehydrogenases. Two of these enzymes (GOX1170 and GOX2071) are predicted to be located in the cytoplasm of G. oxydans and are homologs of the corresponding mitochondrial protein from Saccharomyces cerevisiae20. The third protein (GOX1253) is related to the respiratory lactate dehydrogenase from Escherichia coli21 and contains one transmembrane helix, indicating that it is membrane bound and located at the inner face of the cytoplasmic membrane. In addition, several other proteins were found that belong to different families of flavin-containing oxidoreductases (for a detailed description of all putative oxidoreductases, see Supplementary Table 3 online).
These findings may lead to the development of new strategies for the employment of G. oxydans in a much greater variety of incomplete oxidations. Future studies will indicate the substrate spectra and the catalytic activities of the uncharacterized dehydrogenases in G. oxydans.

To prove that the genes encoding uncharacterized oxidoreductases are transcribed, we analyzed the expression of a selected subset by real-time RT-PCR. This approach included genes encoding enzymes with known function as well as proteins that have been identified and annotated on the basis of bioinformatic evidence. The experiments showed that all genes tested were expressed during growth on glucose, some of them in relatively high amounts. The results indicated that the corresponding oxidoreductases were produced and had a metabolic function in G. oxydans (Supplementary Fig. 2 online).

A number of ORFs could be assigned to known oxidoreductases, for example, those containing the cofactor PQQ (Fig. 1). The alcohol dehydrogenase (involved in acetate formation), glucose dehydrogenase (involved in gluconate formation) and glycerol/sorbitol dehydrogenase (SldAB) belong to this group of quinoproteins22. The latter enzyme is the major polyol dehydrogenase in G. oxydans that exhibits a broad substrate specificity23. It catalyzes the oxidation of D-sorbitol, gluconate and glycerol, thereby producing L-sorbose, 5-ketogluconate and dihydroxyacetone, all of which are of great biotechnological interest24,25,26. Other proteins like gluconate-2-dehydrogenase and sorbitol dehydrogenase (SldSLC) contain flavins as prosthetic groups (Fig. 1). The latter protein catalyzes D-sorbitol oxidation to D-fructose and can be used for the biotechnological production of this sweetener5.

It has been shown that membrane-bound dehydrogenases transfer electrons to ubiquinone, and the resulting ubiquinol functions as electron donor for the quinol oxidases27. The active sites of the dehydrogenases are oriented towards the periplasm (Fig. 1). Thus, substances that are used as energy sources can be oxidized in the periplasmic space without the need to enter the cytoplasm2. The products of the oxidations are easily released into the medium via porins in the outer membrane of the Gram-negative bacteria. It has been shown that membrane-bound dehydrogenases are responsible for the rapid oxidation of biotechnologically important substrates27,28.

In addition to the membrane-bound dehydrogenases, there is an alternative route for the oxidation of sugars, alcohols and polyols in G. oxydans1,27. The second set of enzymes is located in the cytoplasm, indicating that the uptake of their substrates into the cell is required (Fig. 1). These proteins catalyze reversible reactions and are NAD(P)+ dependent. The substrates are oxidized and the resulting intermediates are phosphorylated and further metabolized via the pentose phosphate pathway. The soluble, NAD(P)+-dependent enzymes are believed to participate in the synthesis of biosynthetic precursors and are obviously involved in the maintenance of cells in the stationary growth phase27. A close inspection of the genome sequence indicated that G. oxydans possesses several genes encoding enzymes of this class for the oxidation of glucose, gluconate, ketogluconate, ethanol and acetaldehyde (Fig. 1). In addition, three proteins were identified that may play a critical role in biotechnological applications for the synthesis of L-sorbose (NADPH-dependent sorbose reductase and fructose-forming sorbitol dehydrogenase) and xylitol (xylitol dehydrogenase)29,30,31.

Intermediary metabolism

All genes encoding the enzymes of the oxidative pentose-phosphate and the Entner-Doudoroff pathways were identified. The oxidative pentose-phosphate pathway is thought to be the most important route for phosphorylative breakdown of sugars and polyols to CO2 (ref. 32). It is noteworthy that the aforementioned soluble dehydrogenases as well as the glucose-6-phosphate and the 6-phosphogluconate dehydrogenase involved in these pathways produce NADPH + H+ in the course of their reactions. It is still a matter of debate how this cofactor is oxidized. Interestingly, the genome data revealed the presence of a proton-translocating nicotinamide nucleotide transhydrogenase in G. oxydans 621H (Fig. 1). The amino acid sequences of the three subunits are highly homologous to the membrane-bound transhydrogenase from Rhodospirillum rubrum (pntAA, pntAB and pntB)33. Transhydrogenase, in animal mitochondria and bacteria, couples hydride transfer between NADH + H+ and NADP+ to proton translocation across a membrane34. Hence, the enzyme might have two functions in G. oxydans: first, the oxidation of NADPH derived from intermediary metabolism, and second, the translocation of protons across the membrane thereby contributing to maintenance of the electrochemical proton gradient (Fig. 1).

Within the genome, no ORF could be assigned that potentially encodes a phosphoenolpyruvate synthase or other phosphoenolpyruvate-synthesizing enzymes, indicating that G. oxydans cannot produce C6-sugars via gluconeogenesis. Hence, the formation of glucose from, for example, pentoses or glycerol must depend on the oxidative pentose phosphate pathway. In accordance with the literature35, the citrate cycle is incomplete because genes encoding the succinate dehydrogenase are missing.

We predict that G. oxydans has the capability to take up and to channel several polyols, sugars and sugar derivatives into the oxidative pentose phosphate pathway (Fig. 2). In most cases the substrates are first phosphorylated by specific kinases and are further degraded by dehydrogenases and isomerases. Further genome analysis showed that G. oxydans contains metabolic pathways for the de novo synthesis of all nucleotides, amino acids, phospholipids and most vitamins. Proteins for the synthesis of the important cofactor PQQ are encoded by the pqqA-E operon (GOX983–987), which is highly homologous to the one found in G. oxydans IFO 3293 (ref. 36). The organism contains several cytochrome c-type proteins. A close inspection of the genome revealed the presence of all genes for the synthesis and transport of heme c and the maturation of cytochrome c (GOX1648–1652) (Supplementary Table 1 online)37. Ammonia is taken up by a specific transporter (GOX743) and is introduced into biosynthetic pathways by the catalytic activity of glutamate synthase (GOX1851–1852) and glutamine synthetase (GOX1121). We identified a gene encoding a pyruvate decarboxylase (GOX1081), indicating that G. oxydans can produce acetaldehyde from pyruvate. Acetaldehyde is then oxidized to acetate by a soluble acetaldehyde dehydrogenase (GOX2018)38.

Figure 2: Reconstruction of pathways for the utilization of polyols, sugars and sugar derivatives in G. oxydans.

The enzymes catalyzing the reactions and the corresponding ORFs are indicated as follows. (1) Sugar ABC transporter (GOX2219–2221). (2) Dihydroxyacetone kinase (GOX2222). (3) Triosephosphate isomerase (GOX2217). (4) Polyol ABC transporter (GOX2182–2185). (5) Polyol dehydrogenase (GOX2181). (6) Ribulokinase (GOX2186). (7) Sugar permease (GOX1474). (8) N-acetylglucosamine kinase (GOX1475). (9) N-acetylglucosamine-6-phosphate deacetylase (GOX1473). (10) Glucosamine-6-phosphate deaminase (GOX1470). (11) Glycerol uptake facilitator protein (GOX2089). (12) Glycerol kinase (GOX2090). (13) Glycerol-3-phosphate dehydrogenase (GOX2088). (14) Polyol/sugar symporter (GOX0649, GOX0925, GOX1047, GOX2231). (15) Ribokinase (GOX2084). (16) Ribose-5-phosphate isomerase (GOX1708). (17) Sugar ABC transporter (GOX1179–1181). (18) Sugar kinase (GOX1182). (19) Aldose-phosphate isomerase (GOX1178). The final intermediates are further metabolized via the oxidative pentose phosphate pathway. DHAP, dihydroxyacetone phosphate; GAP, glyceraldehyde-3-phosphate.

Transporters

G. oxydans possesses various transporters for the uptake of substrates and ions. We identified 29 ABC transporters, 5 symporters and 37 other permeases. Among them were those that can transport metal ions, NH4+, phosphate, sulfate, amino acids, purines, pyrimidines, di- and oligopeptides, sugars (e.g., ribose, galactose), polyols (e.g., mannitol and sorbitol) and sugar acids (gluconate and glucarate). Glycerol is taken up by a facilitator protein (Fig. 2). In addition, some components of a phosphotransferase system were determined. However, the sugar-specific proteins (EIIB, EIIC) were not identified (see Supplementary Table 1 online).

Discussion

Bacteria belonging to the genus Gluconobacter are unique not only in their biochemistry, with respect to the process of incomplete oxidation, but also in their growth behavior and response to extreme culture conditions2. On the basis of the genomic sequence of G. oxydans we can now more extensively describe the process of incomplete oxidation and the physiology of organisms carrying out this process. The organisms contain many membrane-bound dehydrogenases that are part of their strategy to thrive and to survive in nutrient-rich environments. The enzymes rapidly form sugars or sugar acids that are difficult to assimilate for most microorganisms, whereas G. oxydans can easily take advantage of these substrates.

The oxidized compounds are taken up and reduced in the cytoplasm, the reactions being catalyzed by a soluble set of oxidoreductases. This deposit and withdrawal system of sugars and sugar alcohols is therefore a clever strategy adopted by G. oxydans to survive in mixed microbial populations. Furthermore, incomplete oxidation of glucose and other aldoses, which are abundant in the natural habitats of G. oxydans, leads to the formation of sugar acids and to a decrease in the pH value, thereby preventing propagation of many other microorganisms.

The respiratory chain is designed to accelerate this process because proton-translocating abilities are limited. This prevents an increase in the electrochemical membrane potential that would otherwise lead to the inhibition of membrane-bound redox reactions. However, the low energy-transducing efficiency results in very low growth yields39. The inability to degrade glucose and other sugars via the Embden-Meyerhof pathway and the incomplete citrate cycle contribute to the inadequate utilization of the substrates. In summary, G. oxydans shows an extreme adaptation to its nutrient-rich habitats, which allows it to outcompete other microorganisms. Furthermore, the unique metabolism makes it an ideal organism for microbial process development.

Methods

The G. oxydans 621H (DSM 2343) genome sequence was determined by a whole-genome shotgun approach using plasmid and cosmid libraries as described previously40. The raw sequencing data were obtained from Integrated Genomics Inc. All generated sequences (53,000) were assembled into contigs, and misassembled regions caused by repetitive sequences were resolved as described40. Closure of sequencing gaps was accomplished through primer walking using plasmids, PCR products and cosmids as templates. Sequences from cosmids served for verification of the orientation, order and linkage of the contigs. Initial ORF prediction was accomplished automatically using the program YACOP41. The ERGO software package42 (Integrated Genomics, Inc., http://www.integratedgenomics.com/) was used for automatic annotation. All annotations were inspected manually through searches against the PFAM, PROSITE, PRODOM and COG databases, in addition to BLASTP searches against the GenBank/EMBL and SWISSPROT databases. Putative frameshifts were checked and corrected manually.
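To make the notion of ORF prediction concrete, the following Python sketch scans both strands of a DNA sequence for open reading frames. It is not YACOP and does not reproduce its method; it is a naive six-frame scan under stated assumptions: only ATG counts as a start codon, the sequence contains only A, C, G and T, and the minimum length of 100 codons is an arbitrary illustrative default. All function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG"}  # assumption: real gene finders also consider GTG and TTG

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def _scan_strand(seq: str, min_codons: int) -> list[tuple[int, int]]:
    """Find ORFs in the three frames of one strand (0-based, half-open)."""
    hits = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon in START_CODONS:
                start = pos  # first start codon after the previous stop
            elif start is not None and codon in STOP_CODONS:
                end = pos + 3  # the stop codon is included in the ORF
                if (end - start) // 3 >= min_codons:
                    hits.append((start, end))
                start = None
    return hits


def find_orfs(seq: str, min_codons: int = 100) -> list[tuple[str, int, int]]:
    """Return (strand, start, end) tuples in forward-strand coordinates."""
    seq = seq.upper()
    n = len(seq)
    orfs = [("+", s, e) for s, e in _scan_strand(seq, min_codons)]
    # Map reverse-strand hits back onto forward-strand coordinates.
    orfs += [("-", n - e, n - s)
             for s, e in _scan_strand(reverse_complement(seq), min_codons)]
    return sorted(orfs, key=lambda orf: (orf[1], orf[2]))


if __name__ == "__main__":
    toy = "CC" + "ATG" + "AAA" * 3 + "TAA" + "G"
    print(find_orfs(toy, min_codons=5))
```

A scan of this kind overpredicts heavily in a genome with 60.8% G+C, where long stop-free stretches occur by chance, which is one reason dedicated gene finders and the manual inspection described above are needed.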

The complete G. oxydans genome sequence has been deposited in GenBank (http://www.ncbi.nlm.nih.gov; accession nos. CP000009 (chromosome), CP000004 (plasmid pGOX1), CP000005 (plasmid pGOX2), CP000006 (plasmid pGOX3), CP000007 (plasmid pGOX4) and CP000008 (plasmid pGOX5)) and in the Göttingen Genomics Laboratory database (http://www.g2l.bio.uni-goettingen.de/).

Note: Supplementary information is available on the Nature Biotechnology website.