Carboxydotrophy potential of uncultivated Hydrothermarchaeota from the subseafloor crustal biosphere

The exploration of Earth’s terrestrial subsurface biosphere has led to the discovery of several new archaeal lineages of evolutionary significance. Similarly, the deep subseafloor crustal biosphere also harbors many unique, uncultured archaeal taxa, including those belonging to Candidatus Hydrothermarchaeota, formerly known as Marine Benthic Group-E. Recently, Hydrothermarchaeota was identified as an abundant lineage of Juan de Fuca Ridge flank crustal fluids, suggesting its adaptation to this extreme environment. Through the investigation of single-cell and metagenome-assembled genomes, we provide insight into the lineage’s evolutionary history and metabolic potential. Phylogenomic analysis reveals the Hydrothermarchaeota to be an early-branching archaeal phylum, branching between the superphylum DPANN, Euryarchaeota, and Asgard lineages. Hydrothermarchaeota genomes suggest a potential for dissimilative and assimilative carbon monoxide oxidation (carboxydotrophy), as well as sulfate and nitrate reduction. There is also a prevalence of chemotaxis and motility genes, indicating adaptive strategies for this nutrient-limited fluid-rock environment. These findings provide the first genomic interpretations of the Hydrothermarchaeota phylum and highlight the anoxic, hot, deep marine crustal biosphere as an important habitat for understanding the evolution of early life.


Introduction
Over the past several years, the advancement of culture-independent techniques has prompted the discovery and genomic characterization of several new archaeal lineages (see refs. [1,2] and references within). Each new phylum fills gaps in the genomic tree of life, allowing for the continuous reevaluation of the evolutionary models that lead from a common ancestor [2][3][4][5] to the splitting of the domains [6][7][8][9]. Similarly, analyzing genome annotations of uncultivated lineages has provided valuable insight into each group's metabolic potential and prospective geochemical role within its environment [10][11][12].
Recently, the first genomes from a unique, uncultivated lineage of Archaea, known as Candidatus Hydrothermarchaeota (previously Marine Benthic Group-E or MBG-E [13,14]), were documented through metagenomic sequencing of crustal fluids collected from the deep subseafloor environment of the Juan de Fuca Ridge flank (JdFR; Figure S1) [15]. Samples were acquired using subseafloor borehole observatories called CORKs (Circulation Obviation Retrofit Kits) that were installed during Integrated Ocean Drilling Program (IODP) Expedition 327 and provide access to the oceanic crust and the fluids circulating therein [16]. In the JdFR environment, fluid circulating between outcrops undergoes extensive fluid-rock reactions [17][18][19], becoming warm (64°C), depleted in oxygen and nitrate, and enriched in dissolved metals and reduced gases [19,20]. Eventually, the chemically altered fluids escape from discharge outcrops and hydrothermal vents, connecting both abiotic and biologically mediated water-rock reactions to global biogeochemical cycles [21,22].
The fluid circulating through the JdFR basement environment harbors microbial cell densities on the order of 10^4 cells per ml [23,24]. The Hydrothermarchaeota appear particularly abundant in the JdFR environment, comprising up to half of the archaeal 16S ribosomal RNA (rRNA) gene amplicons and one-third of the single-amplified genomes (SAGs) sorted from the total community [23]. These abundances are significantly greater than those observed from sedimentary environments, but similar to some hydrothermal vent structures (Table S1, Figure S2). Thus, some clades of Hydrothermarchaeota may be well adapted to life in the warm crustal biosphere, and detailing the potential metabolisms of these organisms should advance our understanding of how life survives in this energy-limited environment.
This study combined the analysis of Hydrothermarchaeota metagenome-assembled genomes (MAGs [15]) with several newly generated SAGs from the same JdFR crustal fluids to evaluate Hydrothermarchaeota's evolutionary relationship to other archaeal groups and assess the functional potential of the lineage. Our data reveal an early-branching archaeal candidate phylum arising between the Euryarchaeota superphyla and the superphylum containing Micrarchaeota, Altiarchaeota, UAP2, and Nanoarchaeota (DPANN: Diapherotrites, Parvarchaeota, Aenigmarchaeota, Nanoarchaeota, and Nanohaloarchaeota [25,26]). This basal evolutionary position is supported by the coding potential for early-evolved enzymes for anaerobic sulfate and nitrate reduction.

Observatory description and fluid sampling
During the IODP Expedition 327 in 2010, borehole observatories were placed in subseafloor basement at IODP Holes U1362A and U1362B (Table S2, Figure S1). These observatories feature epoxy-coated steel observatory casing to minimize corrosion and mitigate impact on in situ processes, Teflon-lined "umbilical" tubes for pristine fluid collection from isolated subsurface intervals, and a 4-inch diameter "free flow" ball valve at the wellhead for additional fluid sampling [27][28][29]. Each borehole penetrates approximately 235 m of sediment. Hole U1362B has an umbilical for fluid collection 30 m below the sediment-basement interface (meters sub-basement (msb)), while Hole U1362A has fluid collection horizons at approximately 30 and 190 msb.
In July 2011, fluid samples were collected using equipment on the remotely operated vehicle (ROV) Jason II from the Research Vessel Atlantis (Table S2). Fluids for metagenomic analyses were sampled from the umbilicals that accessed 190 and 30 msb at Holes U1362A and U1362B, respectively, as described previously [15]. At the seafloor, a mobile pumping system filtered approximately 124 and 70 l of crustal fluid from boreholes U1362A and U1362B, respectively, using Steripak-GP20 (Millipore, Billerica, MA, USA) polyethersulfone filter cartridges containing 0.22 μm pore-sized membranes [15]. Before filtering, at least three times the volume of the umbilical line was flushed through the system to remove any stagnant fluids. Fluids for single-cell genomic analyses were sampled from the 190 msb horizon of Hole U1362A using the same mobile pumping system [23] and from the open ball valve on the wellhead at Hole U1362B [16], after a long period of free flow to flush out the borehole dead volume, using a syringe cleaned with bleach and dilute trace-metal-grade acid. Temperatures as high as 62°C were recorded with the ROV thermistor inside the ball valve opening. Immediately upon recovery, fluid was fixed with glycerol-Tris-EDTA (glyTE) buffer and frozen at −80°C in cryovials for single-cell sorting [30].

Single-cell sorting, genome sequencing, and assembly
The generation, identification, sequencing, and de novo assembly of SAGs were performed at the Bigelow Laboratory for Ocean Sciences Single Cell Genomics Center (scgc.bigelow.org). The cryopreserved samples were thawed, prescreened through a 40 μm mesh size cell strainer (Becton Dickinson) and incubated with 5 μM (final concentration) SYTO-9 DNA stain (Thermo Fisher Scientific) for 10-60 min. Fluorescence-activated cell sorting, cell lysis, multiple displacement amplification, sequencing (using Illumina technology), de novo genome assemblies, and quality control were performed using the workflow benchmarked in ref. [30]. Contigs >2 kbp in length were uploaded to the Joint Genome Institute (JGI) Integrated Microbial Genomes & Microbiomes (IMG/M) comparative data analysis system (Table S3 [31]) for gene prediction and annotation using the genome annotation pipeline [32].

Metagenome sequencing, assembly, and annotation
Metagenome sequencing, assembly, binning, and annotation have been reported previously [15]. Briefly, quality-filtered raw sequence reads from the crustal fluids of Hole U1362A (IMG/M ID 330002481) and Hole U1362B (IMG/M ID 3300002532) were assembled using SOAPdenovo version 1.05 with default settings, binned using CONCOCT [33] and curated within the Anvi'o package, version 1.1.0 [34]. In total, 98 MAGs were produced, of which 3 were identified as Hydrothermarchaeota. Hydrothermarchaeota MAGs JdFR-16, JdFR-17, and JdFR-18 were uploaded to IMG/M for gene prediction and annotation using the genome annotation pipeline [32]. Completeness and contamination estimates for SAGs and MAGs were made by comparing annotated protein sequences against the Euryarchaeota marker list within CheckM [35]. Average Nucleotide Identity (ANI) comparisons were calculated using the IMG/M pairwise ANI tool [31].
A phylogenomic tree based on 43 single-copy marker genes was created from all publicly available genomes from the archaeal domain from IMG/M, National Center for Biotechnology Information (NCBI), and other repositories of data (Tables S4, S5). MAGs and SAGs with CheckM-generated completeness, contamination, and strain heterogeneity information (version 1.0.11 [35]) were used as input to dRep (version 2.0.5 [41]) to produce a dereplicated set of genomes for phylogenomic analysis. Most default dRep parameters were used, but the required completeness was reduced to 50% to account for genes that are systematically absent from single-copy marker gene sets in archaeal groups of critical importance to the phylogeny (e.g., Hadesarchaeaota). From the dereplicated genomes (n = 1198) and the three most complete Hydrothermarchaeota genomes (JdFR-17, JdFR-18, and SAG AC-708-L17), the 43 single-copy marker gene alignment produced by CheckM was used as input to FastTree [42] with the WAG amino acid substitution model. Phylogenetic lineages were identified and collapsed in ARB (version 6.0.4 [43]) with the guided assistance of taxonomic information generated using GTDB-Tk (version 0.0.7) using the classify workflow with database release 83 [26].
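The quality screen that precedes dereplication can be illustrated with a minimal sketch. It assumes that CheckM-style completeness and contamination percentages are already available for each genome. Only the 50% completeness threshold comes from the text; the contamination cutoff, the `GenomeQuality` record, and the example genomes are hypothetical placeholders, and the code does not reproduce dRep's interface or defaults.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class GenomeQuality:
    """Per-genome quality estimates in percent (e.g. as reported by CheckM)."""
    name: str
    completeness: float
    contamination: float


def filter_for_dereplication(
    genomes: Iterable[GenomeQuality],
    min_completeness: float = 50.0,   # threshold stated in the text
    max_contamination: float = 10.0,  # hypothetical cutoff, for illustration only
) -> List[GenomeQuality]:
    """Keep genomes that satisfy both quality thresholds."""
    return [
        g for g in genomes
        if g.completeness >= min_completeness
        and g.contamination <= max_contamination
    ]


# Hypothetical genomes; names and values are invented for this example.
candidates = [
    GenomeQuality("genome_A", completeness=92.1, contamination=1.3),
    GenomeQuality("genome_B", completeness=55.0, contamination=4.0),
    GenomeQuality("genome_C", completeness=42.7, contamination=0.5),
    GenomeQuality("genome_D", completeness=88.0, contamination=15.2),
]
kept = filter_for_dereplication(candidates)
print([g.name for g in kept])
```

Lowering `min_completeness` retains partial genomes such as `genome_B`, which a stricter threshold would exclude; this mirrors the rationale given above for keeping lineages with systematically missing marker genes.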

Amplification of mcrA gene
Amplification of the mcrA gene on a sorted, whole-genome-amplified Hydrothermarchaeota cell was attempted using primers qmcrA [44] and ML 5 ([45], see supplemental information).

Thermodynamic calculations of Gibbs free energy
The potential energy yields for sulfate reduction coupled to various electron donors were calculated according to the Gibbs energy of reaction (see supplemental information).
Phylogenetic analyses revealed that the JdFR Hydrothermarchaeota 16S rRNA gene sequences grouped most closely with environmental sequences from other crustal environments (Fig. 1a). The majority of sequences clustered with a sequence from black rust that formed on the exterior of a leaking subseafloor observatory at nearby Hole 1026B, where the black rust was still exposed to hydrothermal fluids leaking from the observatory [48]. MAG JdFR-18 branched separately with a sequence identified within crustal fluids collected from the hydrothermal vent of the Southern Mariana Trough [49].
Genome-wide ANI values corroborate the 16S rRNA gene phylogeny (Table S7). With the exception of SAG AC-708-N22, the SAGs are very similar (mean 98.6 ± 1.4% s.d.; n = 4). SAG AC-708-N22 is most similar to MAG JdFR-17, sharing an ANI value of 98.2%. Consistent with the 16S rRNA gene phylogeny and GC content, MAG JdFR-18 is equally dissimilar to all other genomes (mean 68 ± 0.8% s.d.; n = 8). These genomic relationships highlight the connectivity between the two different borehole locations and depth horizons (Table S2) and the diversity within the Hydrothermarchaeota community.
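Summary statistics of the kind reported above (mean ± s.d. of ANI) can be obtained from a table of pairwise values. The following is a minimal sketch, assuming that the statistic is taken over all unordered genome pairs and that the sample standard deviation is used; neither choice is stated in the text. The genome labels and ANI values are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev
from typing import Dict, FrozenSet, Sequence, Tuple


def summarize_pairwise_ani(
    genomes: Sequence[str],
    ani: Dict[FrozenSet[str], float],
) -> Tuple[float, float, int]:
    """Return (mean, sample s.d., number of pairs) over all unordered pairs."""
    values = [ani[frozenset(pair)] for pair in combinations(genomes, 2)]
    return mean(values), stdev(values), len(values)


# Hypothetical ANI values (percent) for three placeholder genomes.
labels = ["g1", "g2", "g3"]
pairwise = {
    frozenset(("g1", "g2")): 98.0,
    frozenset(("g1", "g3")): 99.0,
    frozenset(("g2", "g3")): 100.0,
}
ani_mean, ani_sd, n_pairs = summarize_pairwise_ani(labels, pairwise)
print(f"mean {ani_mean:.1f} ± {ani_sd:.1f}% s.d.; pairs = {n_pairs}")
```

Keying the table by `frozenset` makes the lookup independent of genome order, so each pair needs to be stored only once.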

Evolutionary placement of the Hydrothermarchaeota
This study sheds light on the taxonomic placement of Hydrothermarchaeota within the archaeal tree of life. Phylogenies based on 16S rRNA genes as well as concatenated alignments of 43 single-copy phylogenetic marker genes place the Hydrothermarchaeota lineage toward the root of the DPANN superphylum (Fig. 1b). Currently, there are several predictions for the root of the archaeal tree, including placement within the Euryarchaeota [4,50] or between DPANN and the remaining archaeal lineages [5]. Regardless, the placement of Hydrothermarchaeota [26] between Euryarchaeota and DPANN lineages suggests that Hydrothermarchaeota represents an early-branched lineage that should be considered in future evolutionary models. Likewise, additional exploration of the JdFR crustal ecosystem is important for interpreting the metabolic potential of this early-branched lineage. Similar to other proposed early-life analogs, the crustal aquifer presents a hot, anoxic environment protected from sunlight and oxygen [51], but is drastically understudied relative to hydrothermal vent systems.

Terminal electron acceptors for Hydrothermarchaeota
By leveraging metagenomics with single-cell genomics, we were able to validate the binning of the previously constructed MAGs by confirming the presence of important genes within the partial SAGs. Here, the functional attributes of each genome are evaluated individually (Figure S4) in order to provide a collective summary of the lineage's metabolic potential (Fig. 2).
The Hydrothermarchaeota genomes contain genes for the use of several different terminal electron acceptors including sulfate, nitrate, and potentially metal oxides (Figure S4), suggesting versatility in the choice of oxidant. Phylogenetic analyses of key sulfate and nitrate reductase subunits suggest that these genes represent some of the earliest evolved forms of these enzymes found to date, further supporting Hydrothermarchaeota as a deeply branching lineage (Figure S5 [52]).
The capacity for sulfate reduction is evident in the JdFR Hydrothermarchaeota (Fig. 2, Table S9). Most of the genomes had at least one sulP permease, suggesting that sulfate can readily enter Hydrothermarchaeota cells, while MAG JdFR-18 also included coding regions for sulfate adenylyltransferase (sat), the enzyme responsible for converting sulfate to adenosine-5′-phosphosulfate (APS), and APS reductase for reducing APS to sulfite. The identification of genes for dissimilatory sulfite reductase (dsrAB) in SAG AC-708-L17 and the three MAGs suggests that sulfite reduction to sulfide is also likely. The dsrA genes found in SAG AC-335-L21 and the three MAGs are most closely related to dsrA from Moorella species, "Candidatus Rokubacteria", and "Candidatus Aigarchaeota", and appear to be an early-evolved form of the sulfate reductase gene, as suggested previously [52]. The dsrA genes of Hydrothermarchaeota are not monophyletic: additional dsrA genes present in MAG JdFR-18, and SAGs AC-334-L17 and AC-334-K11 are similar to genes from bacteria and may have been obtained via horizontal gene transfer [52]. The electrons for sulfate reduction are possibly passed from the menaquinone loop using membrane-bound dsrMK, and then to a soluble, unidentified electron carrier protein. For example, the protein DsrC has been suggested to be an electron carrier in past studies of Archaeoglobus [53]. The possible roles of other sulfur species as electron donors or acceptors for Hydrothermarchaeota are limited, and no evidence of thiosulfate reductase was found in any of the Hydrothermarchaeota genomes.
There is ample evidence for microbial sulfate reduction within the JdFR crustal fluids. Fluids are replete with sulfate (~18 mM [19,20]), demonstrate measurable sulfate reduction, and contain dissimilatory dsrAB genes [54]. dsrAB genes were also observed in JdFR rocks, along with pyrite sulfur stable isotope values that indicate microbial sulfate reduction [55]. We hypothesize that Hydrothermarchaeota contribute to the sulfate reduction potential in this ecosystem, along with the Deltaproteobacteria, Firmicutes, and Archaeoglobus microbial community members previously identified in this ecosystem [15,54].
Hydrothermarchaeota may have metabolic flexibility in terminal electron acceptors for respiration, as evidenced by the presence of genes for nitrate reduction. Genomes contain subunits for two different types of nitrate reductase: nap, the periplasmic dissimilatory nitrate reductase, and nar, the cytoplasmic membrane-bound nitrate reductase (Fig. 2, S4, Table S10). Phylogenetic analysis of napA genes suggests that the Hydrothermarchaeota napA gene represents an early-evolved form, supporting other evidence for Hydrothermarchaeota's basal placement in the archaeal tree of life (Figure S5). However, the possibility for nitrate reduction in this ecosystem is unclear. Nitrate is rapidly exhausted after entering the ocean crust [18] and measured concentrations within the JdFR fluids over multiple years have been below detection or at nanomolar concentrations [19,20]. Plausibly, trace amounts of nitrate could be intensely cycled, as has been observed in continental subsurface systems with cryptic N cycling [56] and suggested from other JdFR metagenomic interpretations [15]. Thus, the possibility of nitrate reduction in this crustal ecosystem warrants further attention.
Most genomes also possess various subunits for cytoplasmic membrane-bound nitrate reductase (subunits narGHIJ, Table S10). The carboxydotroph A. fulgidus also has this nitrate reductase, but has not demonstrated nitrate reduction in the laboratory [53]. Interestingly, transcripts of A. fulgidus show an upregulation of narA when reducing sulfate, indicating that this nitrate reductase complex might be accepting electrons from ferredoxin to reduce menaquinone [53]. However, additional laboratory studies are necessary to decipher the true potential of nitrate reductase in A. fulgidus and JdFR Hydrothermarchaeota.

Fig. 1 ... (Table S8). b Phylogenomic associations of Ca. Hydrothermarchaeota genomes among archaeal genomes publicly available in Integrated Microbial Genomes (IMG), National Center for Biotechnology Information (NCBI), and other repositories, using classifications suggested by the Genome Taxonomy Database [26] (Table S4). Tree represents the concatenation of 43 single-copy marker proteins (Table S5)

Hydrothermarchaeota genomes include genes for many different c-type cytochromes, which are likely involved in terminal electron transferring processes (Table S11). Within the Archaea, microorganisms known to contain c-type cytochromes are restricted to the orders Archaeoglobales, Methanosarcinales, Halobacteriales, and Thermoplasmatales [57], where they serve as electron transfer proteins. Given the metabolisms of neutrophilic anaerobes within these orders, we hypothesize that the c-type cytochromes in Hydrothermarchaeota are either (1) generating a proton gradient through the use of an electron transport system, as observed in methanogens and methanotrophs of the Methanosarcinales order [57], or (2) reducing extracellular ferric oxide species, as observed in Ferroglobus placidus and Geoglobus ahangari [57], iron-reducing archaea of other marine hydrothermal systems.
Gene subunits for the ferredoxin:NADP+ oxidoreductase (fqo) complex may represent a potential electron shunt into the membrane-bound respiratory chain, linking CO oxidation to an external electron acceptor (Table S14 [53,63]). Genes for the biosynthesis of menaquinone suggest that menaquinone redox reactions could continue to transfer electrons along the respiratory chain to an ultimate acceptor, while translocating protons across the membrane (Table S14).
Hydrothermarchaeota SAGs and MAGs also contain one to two bifunctional CO dehydrogenase/acetyl-CoA synthase (CODH/ACS) complexes (cdhABC and cdhCDE; Fig. 2, S12), which couple the reversible reduction of CO2 to CO oxidation to form acetate (acetogenesis), a key step of the Wood-Ljungdahl pathway [64]. The CODH/ACS complex is hypothesized to be an early-evolved complex [58], and thus its presence in this early-branching lineage is consistent with its proposed ancestry. The presence of these complexes and other Wood-Ljungdahl enzymes suggests that the Hydrothermarchaeota can also assimilate carbon monoxide to biomass, again similar to A. fulgidus. Several of the SAGs and the three MAGs also contain formate dehydrogenase (Fig. 2, Table S13), suggesting that formate production during CO oxidation or the use of formate as an electron donor is possible, as observed with cultures of A. fulgidus [62,65].
Hydrothermarchaeota do not appear to be involved in methane cycling, although they possess some genes known to be involved in methyl cycling. For example, the genomes contain genes that encode the methanogenic tetrahydromethanopterin S-methyltransferase subunit (mtrH, Table S13), and MAG JdFR-18 also encodes genes for methyl transferases of methylamide compounds (Table S15). These enzymes are all involved with the methylation of methyl-coenzyme M (methyl-SCoM); during methanogenesis, methanogens reduce methyl-SCoM with the enzyme methyl-coenzyme M reductase (MCR). No subunits of the mcr gene were found within the Hydrothermarchaeota genomes, suggesting that these organisms cannot produce methane. To verify that the absence of the mcr subunits did not result from a lack of recovery, a PCR reaction targeting the mcrA gene was performed on amplified SAG DNA from a sorted cell that was identified as Hydrothermarchaeota but not genome sequenced. No PCR product was observed (data not shown). A negative result supports the absence of mcrA in these Hydrothermarchaeota genomes, but cannot rule out possible biases related to the amplification reactions. Nevertheless, the presence of mtrH and lack of mcrA has been recognized in Theionarchaea (SAG DG-70, [66]) and Euryarchaeota genomes (A. fulgidus, IMG/M IDs 2588253768/638154502; Archaeoglobus sulfaticallidus, IMG/M ID 2522125074, and a partial MSBL1 genome [67]). This suggests that the methyl-SCoM metabolite may have an alternate fate in these organisms. Given that the last common ancestor has been hypothesized to be a methanogen [3,68], the placement of Hydrothermarchaeota outside the Euryarchaeota (many of which are methanogens) may further advance our understanding of the distribution and evolution of this ancient and fundamental metabolism.
While no known CO-induced hydrogenases were identified, genes for Ni-Fe hydrogenases (large and small subunits) were found in SAG AC-344-K11 and MAGs JdFR-16 and JdFR-17 (Fig. 2, Table S16). These hydrogenases may serve as a sink for the electrons produced during CO oxidation. However, it is also possible that the Ni-Fe hydrogenases may oxidize hydrogen for the hydrogenotrophic reduction of sulfate, as previously observed as an alternative to carbon monoxide oxidation in a culture of A. fulgidus [69]. In addition to Ni-Fe hydrogenases, most JdFR Hydrothermarchaeota contained genes encoding for the F420-reducing hydrogenase beta subunit, a hydrogenase required for the Wood-Ljungdahl pathway, as well as genes for hydrogenase biosynthesis proteins (hypABCDEF) and hydrogenase maturation proteins.
Given the evidence for CO oxidation and sulfate reduction, Gibbs free energy yields were estimated for sulfate reduction with CO oxidation and compared to various other electron donors (equation S1, Tables S17-S19), following approaches described previously [70][71][72]. When possible, in situ concentrations were used for the calculations (Table S18 [19,71]). To our knowledge, concentrations of CO and acetate have not been measured within JdFR crustal fluids; thus, calculations were based on a range of concentrations for these analytes (10 nM to 100 µM), based on reported in situ CO concentrations in other marine and fluid-rock reaction environments ([73,74], Table S18). Of the reactions tested, CO oxidation coupled with sulfate reduction yielded the most exergonic conditions when normalized per electron transferred (Fig. 3, Table S19). This supports the interpretation of dissimilatory carboxydotrophic metabolism in Hydrothermarchaeota, which, as autotrophs, may feed the microbial food web and drive carbon cycling in this warm crustal biosphere.
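The type of calculation described above can be sketched with the standard relation ΔG_r = ΔG°_r + RT ln Q, followed by normalization per electron transferred. This is a minimal illustration and may differ in detail from equation S1, which is not reproduced here. It assumes that activities can be approximated by molar concentrations and that ΔG°_r already refers to the in situ temperature. The reaction shown (4 CO + SO4^2- + H+ -> 4 CO2 + HS-) is one balanced way to write CO oxidation coupled to sulfate reduction. The standard-state value and the CO2, HS-, and H+ activities are hypothetical placeholders; only the temperature (64°C), the sulfate concentration (~18 mM), and the CO range (10 nM to 100 µM) are taken from the text.

```python
import math
from typing import Dict

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def gibbs_energy(
    delta_g0: float,
    temperature_k: float,
    activities: Dict[str, float],
    stoichiometry: Dict[str, float],
) -> float:
    """Delta G_r = Delta G0_r + R*T*ln(Q), with Q = prod(a_i ** nu_i).

    Stoichiometric coefficients are negative for reactants and positive
    for products; delta_g0 (kJ/mol) must already refer to temperature_k.
    """
    ln_q = sum(nu * math.log(activities[s]) for s, nu in stoichiometry.items())
    return delta_g0 + R_KJ * temperature_k * ln_q


def per_electron(delta_g: float, electrons_transferred: int) -> float:
    """Normalize a reaction energy to kJ per mole of electrons transferred."""
    return delta_g / electrons_transferred


# 4 CO + SO4^2- + H+ -> 4 CO2 + HS-   (8 electrons transferred)
STOICH = {"CO": -4, "SO4": -1, "H+": -1, "CO2": 4, "HS-": 1}
DELTA_G0_PLACEHOLDER = -200.0  # kJ/mol; hypothetical, NOT a literature value
T_IN_SITU = 64.0 + 273.15      # K, fluid temperature given in the text

for co in (1e-8, 1e-6, 1e-4):  # CO range considered in the text (mol/l)
    acts = {
        "CO": co,
        "SO4": 18e-3,  # ~18 mM sulfate, from the text
        "H+": 1e-7,    # hypothetical
        "CO2": 1e-3,   # hypothetical
        "HS-": 1e-6,   # hypothetical
    }
    dg = gibbs_energy(DELTA_G0_PLACEHOLDER, T_IN_SITU, acts, STOICH)
    print(f"CO = {co:.0e} M: {per_electron(dg, 8):.1f} kJ per mol e-")
```

Because CO enters the reaction quotient with a coefficient of 4, the computed energy yield is sensitive to the assumed CO concentration, which is why a range of concentrations is evaluated rather than a single value.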

Metabolic pathways for essential metabolites and biomolecules
The JdFR Hydrothermarchaeota contain pathways for sugar, amino acid, nucleic acid, and lipid metabolism. Collectively, the genomes have many of the genes required for the gluconeogenesis/glycolysis pathway, missing only pyruvate kinase (Table S20). Instead of being converted to pyruvate, we hypothesize that phosphoenolpyruvate is likely converted to oxaloacetate by phosphoenolpyruvate carboxylase. The presence of fumarate hydratase, succinate dehydrogenase, succinyl-CoA synthetase, 2-oxoglutarate ferredoxin reductase, and isocitrate dehydrogenase suggests the presence of an incomplete tricarboxylic acid (TCA) cycle (Table S21), which is similar to other anaerobic archaeal groups such as methanogens [75]. In anaerobic organisms, TCA-related genes provide for the potential synthesis of several important biosynthetic intermediates such as fumarate, succinate, succinyl-CoA, and 2-oxoglutarate. These intermediates can then serve as the building blocks for amino acid, pyrimidine, and purine metabolisms (Table S22). The Hydrothermarchaeota possess many genes for synthesis of isoprenoid-based lipids using the mevalonate pathway, including hydroxymethylglutaryl-CoA synthase, isopentenyl phosphate kinase, isopentenyl-diphosphate delta-isomerase and mevalonate kinase (Table S23). Transporters for trace elements (Co, Ni, Mo, W) and the vitamin biotin were identified, along with transporters for branched-chain amino acids (Table S24). These amino acids could then serve as a potential source of nitrogen and organic carbon for the cell.
Some JdFR Hydrothermarchaeota genomes contain RuBisCO genes (Fig. 2), which were aligned against other RuBisCO genes to understand their potential metabolic function. These Hydrothermarchaeota RuBisCO genes group phylogenetically with form III-a RuBisCOs (Figure S6), which are known to fix CO2 for the synthesis of metabolites (including nucleic acids and sugars) using the reductive hexulose-phosphate (RHP) cycle [76]. Hydrothermarchaeota include many of the genes necessary for the RHP cycle, including a gene for a fused hexulose-6-phosphate/formaldehyde activating enzyme (Table S25). When these two enzymes act in concert, they produce methylene-H4MPT from 3-arabino-hexulose-6-phosphate, an important metabolite for the Wood-Ljungdahl pathway. However, phosphoribulokinase, an important RHP enzyme, has yet to be identified within the JdFR Hydrothermarchaeota genomes, and thus the potential for the RHP cycle cannot be confirmed.

Motility as an adaptive strategy of Hydrothermarchaeota
JdFR Hydrothermarchaeota partial genomes contain more chemotaxis and motility-related genes than many of the archaeal SAGs publicly available in IMG/M (Tables S26-S27). Specifically, when compared to archaeal genomes of marine environments, and accounting for relative completeness, JdFR genomes generally have more motility genes than genomes from sedimentary environments (Fig. 4a). A similar trend can be observed when comparing community-wide metagenomic samples (Fig. 4b). The relative abundance of motility genes within metagenomes from crustal environments (Juan de Fuca and the Mid-Atlantic Ridge [77]) is greater than that of sediment environments and comparable to that of ocean water column samples.
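A relative abundance of motility genes can be computed from per-gene COG annotations. The following is a minimal sketch, assuming that motility genes are those assigned to COG functional category N (cell motility) and that the denominator is the number of COG-annotated genes; both choices are assumptions for illustration and may not match the exact definitions behind Fig. 4. The example annotation list is hypothetical.

```python
from typing import Iterable

MOTILITY_CATEGORY = "N"  # COG functional category "Cell motility"


def motility_fraction(cog_categories: Iterable[str]) -> float:
    """Fraction of COG-annotated genes assigned to the cell-motility category.

    Each item is the COG category string of one gene; multi-letter strings
    such as "NU" mean that the gene belongs to several categories.
    """
    categories = list(cog_categories)
    if not categories:
        return 0.0
    hits = sum(1 for c in categories if MOTILITY_CATEGORY in c)
    return hits / len(categories)


# Hypothetical per-gene COG category strings for one partial genome.
example_genome = ["N", "C", "NT", "J", "E", "C", "NU", "K", "L", "P"]
print(f"{100 * motility_fraction(example_genome):.1f}% of annotated genes")
```

Expressing the count as a fraction of annotated genes, rather than as a raw count, is one way to compare partial genomes of different completeness.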
Motility genes found in the Hydrothermarchaeota genomes include those for the archaellum and chemotaxis. It is unclear if Hydrothermarchaeota use the archaellum solely for motility or to attach to surfaces, leading to biofilm formation. The processes for attachment are diverse within Archaea [78], and thus the presence of archaellum-related genes does not necessarily indicate or exclude potential biofilm production. In either case, archaellum rotation would likely be driven by adenosine triphosphate hydrolysis [79]. This poses a problem for microorganisms living in subsurface low-energy environments, which are thought to be surviving just slightly above their minimum energy requirements [72,80]. Subsurface cells are expected to focus on maintenance over growth, suggesting that motility would be impractical [80]. However, the possibility of motility in sediment is suggested from metatranscriptomes analyzed from sediments of the Peru Margin [81]. This prior work suggested that motility decreases with decreasing porosity, a trend that supports the comparisons of Fig. 4. Given that the crustal biosphere is relatively porous compared to a sedimentary environment, the advantages of motility or biofilm production in this fluid environment can be appreciated, even though the energy requirements remain a paradox. Assuming that nutrients and concentrations of electron acceptors exist as patches that disperse with time (e.g., decaying cells, marine snow particles), the energetic gain of motility would depend on the size and concentration of the chemoattractant, degree of fluid mixing, the distance between the cell and nutrient packet, and the speed at which a cell could travel to the nutrient packet [82]. Alternatively, surface attachment and biofilm production can provide protection and foster metabolic interdependencies [78]. Either way, the abundance of genes related to motility and chemotaxis in these Hydrothermarchaeota genomes, and crustal metagenomes in general, suggests that organisms of the deep crustal biosphere have adopted a strategy for balancing these energetic gains and costs.

Conclusions
Single-cell and metagenome-assembled genomes from the uncultivated Hydrothermarchaeota lineage were derived from warm, anoxic subsurface crustal fluids collected from the Juan de Fuca Ridge flank. Comparative genomic analysis provided the first evolutionary and metabolic characterization of this lineage. These genomic datasets revealed Hydrothermarchaeota to be an early-branching archaeal phylum, arising near the central branch points of the DPANN and Euryarchaeota groups. The genomes harbor evidence for many early-evolved metabolisms including ancient forms of sulfate and nitrate reductases. These observations underscore the significance of the hot, deep marine crustal biosphere as an important habitat for understanding the evolution of early life. Hydrothermarchaeota appear to be carboxydotrophs, highlighting this often-overlooked metabolic pathway as playing an important role in subsurface fluid-rock reaction environments, as suggested recently [73]. The presence of chemotactic and motility genes suggests that Hydrothermarchaeota may be capable of seeking favorable redox conditions or other nutrients. Despite a small average genome size, the versatility afforded by the inferred metabolic and phenotypic characteristics of the Hydrothermarchaeota may represent important survival strategies for life in the warm crustal biosphere.

Fig. 4 The relative abundance of motility genes in publicly available marine-related genomes and metagenomes as defined by COG (Clusters of Orthologous Groups) annotations. a Relative abundance of motility genes in single-amplified genomes and metagenome-assembled genomes collected from various marine environments relative to their genome completeness (considering genomes that are at least 10% complete, Table S28): sediments (brown triangle), hydrothermal vents (purple circle), crustal aquifers (orange circles), and ocean water column samples (blue squares). Candidatus Hydrothermarchaeota genomes are highlighted as yellow diamonds. b The relative abundance of motility genes in metagenomes collected from various marine environments (Table S29). The box and whiskers represent the range of relative abundance as defined by quartiles

User Facility, is supported under contract no. DE-AC02-05CH11231. This is C-DEBI contribution number 451, HIMB contribution number 1747, and SOEST contribution number 10573.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.