Hyperactive nanobacteria with host-dependent traits pervade Omnitrophota

Candidate bacterial phylum Omnitrophota has not been isolated and is poorly understood. We analysed 72 newly sequenced and 349 existing Omnitrophota genomes representing 6 classes and 276 species, along with Earth Microbiome Project data to evaluate habitat, metabolic traits and lifestyles. We applied fluorescence-activated cell sorting and differential size filtration, and showed that most Omnitrophota are ultra-small (~0.2 μm) cells that are found in water, sediments and soils. Omnitrophota genomes in 6 classes are reduced, but maintain major biosynthetic and energy conservation pathways, including acetogenesis (with or without the Wood-Ljungdahl pathway) and diverse respirations. At least 64% of Omnitrophota genomes encode gene clusters typical of bacterial symbionts, suggesting host-associated lifestyles. We repurposed quantitative stable-isotope probing data from soils dominated by andesite, basalt or granite weathering and identified 3 families with high isotope uptake consistent with obligate bacterial predators. We propose that most Omnitrophota inhabit various ecosystems as predators or parasites.

The conclusions of methanotrophy drawn from the SURF_12 genome [6] apply to the OLB16 lineage, not Omnitrophota. Momper et al. (2017) [6] discovered a methane monooxygenase within the genome SURF_12. From this gene, the authors inferred methane oxidation as a possible metabolism of SURF_12. The authors also noted a paraphyly of the OP3/Omnitrophica genomes included in their study. This paraphyly forms two separate clades: one clade includes SURF_12 and OLB16, while the other corresponds to many of the same Omnitrophota genomes discussed here. Using the GTDB, a follow-up analysis of the SURF_12 genome [7] revised its classification and affirmed its affiliation with the GTDB phylum OLB16. This resolves the paraphyly and maintains the conclusion of Momper et al. (2017) [6]: methanotrophy may indeed be a feature of the OLB16 lineage. No signal of methane oxidation was otherwise observed among the Omnitrophota as defined here, so this metabolism is unlikely to be present in the phylum.

Supplementary Note 3: Extended discussion of Omnitrophota physiology
Many catabolic and anabolic pathways were complete. Genes involved in the three-carbon module of glycolysis (M00002) are extremely common among members of Omnitrophota, with nearly every genome encoding a full pathway. Genes involved in the cleavage and interconversion of hexoses were less common, however, suggesting some conserved gaps in Embden-Meyerhof glycolysis. Fructose-bisphosphate aldolase (K01623, K01624, K01625) (Table S5) shows some degree of variability between classes: these genes are common in Gorgyraia and 4484-213, rare in Omnitrophia and Velamenicoccia, and absent in 2-02-FULL-51-18 and Aquiviventia. Glucose-6-phosphate isomerase (K01810, K06859, K13810, or K15916), which catalyzes the interconversion of glucose-6-phosphate and fructose-6-phosphate in glycolysis, is absent from 2-02-FULL-51-18 and uncommon among Aquiviventia. Despite this, genes encoding the non-oxidative phase of the pentose phosphate pathway (M00007) are plentiful across the phylum, providing a possible alternative to glycolysis for the cleavage of hexoses where one or more of the necessary genes are absent. Numerous multiple-sugar ABC transporters (KEGG: K02025-27, K10112, K10117-19) are commonly encoded by members of the phylum. Across the phylum, however, the phosphotransferase system was missing or incomplete, suggesting that glucose utilization is uncommon in Omnitrophota. Together, these features suggest a conserved chemoheterotrophic metabolic potential by which non-glucose di- and oligosaccharides are used as carbon and energy sources. 4484-213 stands as a stark exception: the two genomes from this lineage lack multiple-sugar transporters entirely, but uniformly possess an array of amino acid transporters. Members of this class may, therefore, instead use a chemoheterotrophic metabolism centered around the interconversion and subsequent catabolism of amino acids. However, these conclusions will likely be revised pending the expansion of genomic representation for 4484-213.
The oxidative phase of the pentose phosphate pathway (M00006; Supplementary Table 5) and purine or pyrimidine degradation pathways (M00546, M0004; Supplementary Table 5) were absent or extremely rare across the phylum. The non-oxidative phase and subsequent steps in the biosynthesis of LPS and nucleotides were common across the phylum. This suggests that members of Omnitrophota synthesize nucleotides rather than catabolize them, and that nucleotides are not a preferred carbon source for any Omnitrophota.
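The presence/absence reasoning above can be illustrated with a minimal sketch. It assumes each genome is represented as a set of annotated KEGG ortholog (KO) identifiers and treats a step as present when at least one of the alternative KOs named in the text is found. The function names and the example genomes are hypothetical and are not data or code from this study.

```python
# Each enzymatic step maps to the alternative KOs listed in the text;
# a step counts as present if the genome carries at least one of them.
STEPS = {
    "fructose-bisphosphate aldolase": {"K01623", "K01624", "K01625"},
    "glucose-6-phosphate isomerase": {"K01810", "K06859", "K13810", "K15916"},
}


def step_presence(genome_kos):
    """Return {step: True/False} for one genome's set of annotated KOs."""
    return {step: bool(kos & genome_kos) for step, kos in STEPS.items()}


def fraction_encoding(genomes, step):
    """Fraction of genomes (dict: name -> KO set) that encode the given step."""
    if not genomes:
        return 0.0
    hits = sum(1 for kos in genomes.values() if STEPS[step] & kos)
    return hits / len(genomes)


# Hypothetical annotations, for illustration only (not data from the study).
example_genomes = {
    "genome_A": {"K01623", "K01810"},
    "genome_B": {"K06859"},
    "genome_C": set(),
}

if __name__ == "__main__":
    for name, kos in example_genomes.items():
        print(name, step_presence(kos))
    print(fraction_encoding(example_genomes, "glucose-6-phosphate isomerase"))
```

Grouping genomes by class before calling `fraction_encoding` would give per-class prevalence of the kind summarized as "common", "rare" or "absent" above; the thresholds behind those labels are not stated in the text and are not assumed here.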
Genes encoding energy conservation pathways were ubiquitous in Omnitrophota genomes, and each class followed either an acetogenic or respiratory scheme (Figures 3, 4; Extended Data Figures 5, 6). Genomes of most species assigned to the classes Gorgyraia (32/39) and Velamenicoccia (73/113) encode the key genes for sugar transport, Embden-Meyerhof glycolysis, ferredoxin reduction via pyruvate:ferredoxin oxidoreductase, and acetogenesis via phosphotransacetylase (Pta) and acetate kinase (Ack) (Figures 3, 4a). This acetogenic pathway yields ATP from the oxidation of sugars via glycolysis and the ATP-yielding conversion of acetyl-CoA to acetate via acetyl-P [8]. These genomes also encode a highly conserved Rnf complex. When used by acetogenic bacteria, Rnf complexes serve to restore NAD+ and oxidized ferredoxin pools [9], or in reverse, to generate an electrochemical gradient capable of powering an ATPase [10]. PEP carboxykinase provides another possible source of ATP while simultaneously generating oxaloacetate as a connection between glycolysis (Supplementary Table 4) and a "horseshoe"-type TCA cycle (Figure 4a, Supplementary Table 6), as described by Williams et al. [11]. Fumarate hydratase and malate dehydrogenase are present and serve either in redox balance [11] or to recycle fumarate generated through arginine [11] or ADP biosynthesis. Additionally, although very few genes for sulfur metabolism were annotated by KEGG or METABOLIC (Supplementary Table 5, Supplementary Fig. 9), 23 of these genomes encode one or more putative reversible desulfoviridin-type dissimilatory sulfite reductases (DsrA, COG2221), which could either reduce sulfite to sulfide via hydrogenases or oxidize sulfide to sulfur [12], as has been suggested for SKK-01 and supported by an abundance of intracellular sulfur globules [13,14].
Considering species-representative genomes, acetogenesis was thus predicted in 96 species, mapping to seven of the ten Gorgyraia families and eight of the ten Velamenicoccia families (Figure 3, Supplementary Table 5). These patterns also held for lower-completeness genomes (Supplementary Fig. 10); however, these putative acetogens differ in the presence and completeness of the Wood-Ljungdahl pathway (WLP) and in the types of hydrogenases (Supplementary Fig. 11) and ATPases present.
The most basic acetogenic pathway enabling fermentation of sugars is present in genomes of 14 species in the Gorgyraia families Gorgyraeaceae, Taenaricolaceae, and JABMRG01 and the Velamenicoccia family 4484-1171, as exemplified by Makaraimicrobium thalassicum (Extended Data Figure 5a). In M. thalassicum, the oxidative and ATP-yielding reactions of the Embden-Meyerhof-Parnas pathway and acetogenesis could be coupled to redox balancing and cation pumping reactions of the Rnf complex and the biomass precursor-generating reactions of the "horseshoe"-type TCA cycle. Absent any electron transport chain complexes, the metabolism of these organisms is based on the fermentation of sugars, with acetate and possibly fumarate as likely products.
All other putative acetogens have this basic metabolism but also encode components of the bacterial WLP, which could enable another route for acetogenesis or for fixation of CO and, in some cases, CO2. The methyl branch of the WLP was complete in these genomes (i.e., Fhs, FolD, MetF) except for formate dehydrogenase (K05299, K15022), suggesting formate [15] as a substrate for the methyl branch rather than CO2 (Figures 3, 4a; Supplementary Table 5). The completeness of the carbonyl branch was variable. Twenty-four species in the Velamenicoccia families Velamenicoccaceae, Profunditerraquicolaceae, and DTHP01 and the Gorgyraia families FEN-1320 and CAIMPC01 have a truncated carbonyl branch lacking AcsA and its homolog CooS/F, as exemplified by V. archaeovorus (Extended Data Figure 5b). AcsA and CooS/F catalyze the conversion of CO2 to CO; thus, in the absence of AcsA and CooS/F, the carbonyl branch could utilize CO for acetogenesis or biosynthesis via the carbon monoxide dehydrogenase AcsB-E complex, but CO2 cannot be fixed. Like M. thalassicum, V. archaeovorus could manage redox balance and energize the membrane via an Rnf complex; however, it also encodes a 4g nickel-iron membrane-bound hydrogenase and a cytoplasmic A3 iron-only hydrogenase (Figures 3, 4a; Extended Data Figure 5b). 4g hydrogenases have been proposed to couple the oxidation of H2 or methylated organic compounds to ferredoxin reduction for CO2 fixation [16], yet other group-4 hydrogenases reverse this process, coupling H2 production to the oxidation of carbon monoxide or formate to CO2 [17]. Group A3 FeFe hydrogenases are electron-bifurcating hydrogenases that are used for redox balance. The WLP enzymes can also catalyze the reverse reaction of acetogenesis (fixation of acetate to acetyl-CoA) and can serve to ligate coenzyme A to propionate or other short-chain fatty acids [18].
Direct utilization of propionate via acetate kinase (Figure 4a) and phosphate transacetylase is consistent with the enrichment of Omnitrophota in an anaerobic reactor community fed propionate at a high dilution rate [19]. All Omnitrophota 16S rRNA phylotypes in the reactor community mapped to the family Profunditerraquicolaceae within the Velamenicoccia (Genomes 318-418, Figure 1, Supplementary Table 1), which encode this pathway (Supplementary Table 5). Overall, the metabolic scheme of these organisms is an acetogenic pathway with the same core components as M. thalassicum for sugar utilization, but with additional capacity to conserve energy and incorporate additional carbon sources via a simplified WLP.
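The distinctions drawn among putative acetogens in this note can be summarized as a simple rule set. The sketch below is an illustrative simplification rather than the annotation procedure used in this study: it assumes each genome is given as a set of gene names, uses `acsB` as a hypothetical stand-in for the whole AcsB-E complex, and uses category labels chosen here for readability.

```python
def classify_acetogen(genes):
    """Assign a coarse category from the set of gene names annotated in one genome."""
    # Pta + Ack: acetyl-CoA -> acetyl-P -> acetate (core acetogenic route in the text).
    has_pta_ack = {"pta", "ack"} <= genes
    # Methyl-branch markers named in the text (formate dehydrogenase is not required).
    methyl_branch = {"fhs", "folD", "metF"} <= genes
    # AcsA or CooS performs the CO2 <-> CO step of the carbonyl branch.
    co2_to_co = bool({"acsA", "cooS"} & genes)
    # Assumption: a single marker gene represents the AcsB-E complex.
    acs_complex = "acsB" in genes

    if not has_pta_ack:
        return "not a Pta/Ack acetogen"
    if not (methyl_branch and acs_complex):
        return "basic sugar-fermenting acetogen"
    if co2_to_co:
        return "acetogen with WLP, carbonyl branch can reduce CO2"
    return "acetogen with WLP, truncated carbonyl branch (CO only)"


if __name__ == "__main__":
    # Hypothetical gene sets, for illustration only.
    print(classify_acetogen({"pta", "ack"}))
    print(classify_acetogen({"pta", "ack", "fhs", "folD", "metF", "acsB"}))
```

Under these assumptions, a gene set resembling the M. thalassicum scheme would fall in the first acetogenic category and one resembling V. archaeovorus in the truncated-carbonyl-branch category; real assignments would also depend on genome completeness, which this sketch ignores.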
In contrast with the predominant metabolism of most Gorgyraia and Velamenicoccia, many species within the Gorgyraia order Pluralincolimonadales (e.g., Pluralincolimonas frigidipaludosa; Figures 1, 3) and Velamenicoccia order Zapsychrales (e.g., Fredricksoniimonas spp.; Figures 1, 3) lack genes encoding acetate kinase and phosphate transacetylase and instead encode an acetyl-CoA synthetase along with diverse catabolic pathways. The four Pluralincolimonadales species all encode a reversible acetyl-CoA synthetase (TIGR02717) as well as a simplified electron transport chain including respiratory complex I (NADH dehydrogenase (PF00346)) and an F-type ATPase complex (M00157). Of the 39 species clusters in the Zapsychrales, 18 encode respiratory complex I, 13 encode respiratory complex II (succinate dehydrogenase (M00149)), and 37 encode an F-type ATPase complex. Only the species branching at the basal node of the order Zapsychrales, represented by Genome 314, encodes the WLP, suggesting loss of this pathway early in the evolution of the order. Thirteen species of Zapsychrales still encode a reversible acetyl-CoA synthetase (TIGR02717), suggesting acetogenesis or acetate utilization. The presence of cytochrome bd ubiquinol oxidase (M00153; 6 species) or cytochrome c oxidase (M00154; 1 species) indicates the genomic potential for aerobic respiration among some members of the order [20] (Figure 3, Supplementary Table 5). Genes encoding oxidoreductases that act on particular terminal electron acceptors for anaerobic respiration, such as nitrate, nitrite, and metals, indicate a patchwork of respiratory systems in Zapsychrales and Pluralincolimonadales with little evidence of vertical inheritance.
Omnitrophia, Aquiviventia, and 2-02-FULL-51-18 lack the WLP altogether and instead encode diverse respiratory capabilities. Out of the 43 species clusters, respiratory complex I is encoded by 34, yet complex II is encoded by only 21. Complex II is also rarely complete; genes encoding its membrane anchor component (K00242, K18859, or K18860) are missing from all Omnitrophota except members of Aquiviventia (Supplementary Table 5). Capacity for aerobic respiration via cytochrome bd ubiquinol oxidase (4 species) or cytochrome c oxidase (18 species) was also highly variable, although a large fraction of the Aquiviventia (10/13) appear to be aerobic via cytochrome c oxidase.

Supplementary Figure 16. Family-level quantitative stable isotope probing in diverse soils with control taxa included. Y-axis shows the percent difference between atom fraction excess (AFE) for a given taxon (P), compared to all non-predatory (NP) taxa from the same sample. Boxes display the median and inner quartiles while whiskers extend to the 95 percent confidence interval of the distribution of AFE ratios for a given taxon within each experimental group. N = 114 qSIP experiments.
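The quantity on the y-axis can be sketched as follows. The caption does not state how the AFE values of non-predatory taxa are summarized, so this example assumes their arithmetic mean is used as the baseline; the function name and the example values are hypothetical.

```python
from statistics import mean


def percent_difference(afe_taxon, afe_nonpredatory):
    """Percent difference between one taxon's AFE and the baseline AFE of
    non-predatory taxa from the same sample (baseline = mean, an assumption)."""
    baseline = mean(afe_nonpredatory)
    if baseline == 0:
        raise ValueError("baseline AFE is zero; percent difference is undefined")
    return 100.0 * (afe_taxon - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical AFE values, for illustration only.
    print(percent_difference(1.0, [0.25, 0.75]))
```

A positive value indicates higher isotope uptake in the taxon than in the non-predatory baseline of the same sample, and a negative value indicates lower uptake.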