Brockarchaeota, a novel archaeal phylum with unique and versatile carbon cycling pathways

Geothermal environments, such as hot springs and hydrothermal vents, are hotspots for carbon cycling and contain many poorly described microbial taxa. Here, we reconstructed 15 archaeal metagenome-assembled genomes (MAGs) from terrestrial hot spring sediments in China and deep-sea hydrothermal vent sediments in Guaymas Basin, Gulf of California. Phylogenetic analyses of these MAGs indicate that they form a distinct group within the TACK superphylum, and thus we propose their classification as a new phylum, ‘Brockarchaeota’, named after Thomas Brock for his seminal research in hot springs. Based on the MAG sequence information, we infer that some Brockarchaeota are uniquely capable of mediating non-methanogenic anaerobic methylotrophy, via the tetrahydrofolate methyl branch of the Wood-Ljungdahl pathway and reductive glycine pathway. The hydrothermal vent genotypes appear to be obligate fermenters of plant-derived polysaccharides that rely mostly on substrate-level phosphorylation, as they seem to lack most respiratory complexes. In contrast, hot spring lineages have alternate pathways to increase their ATP yield, including anaerobic methylotrophy of methanol and trimethylamine, and potentially use geothermally derived mercury, arsenic, or hydrogen. Their broad distribution and their apparent anaerobic metabolic versatility indicate that Brockarchaeota may occupy previously overlooked roles in anaerobic carbon cycling. Geothermal environments are hotspots for carbon cycling. Here, De Anda et al. reconstruct archaeal genomes from terrestrial and deep-sea geothermal sediments, and propose the classification of these microbes as a new phylum, ‘Brockarchaeota’, with unique metabolic capabilities including non-methanogenic anaerobic methylotrophy.

Advances in DNA sequencing and computational approaches have accelerated the reconstruction of metagenome-assembled genomes (MAGs) from natural communities 1 . This approach has revealed many novel lineages on the tree of life and is advancing our understanding of the ecological roles of uncultured microbes [1][2][3] . For example, many new archaeal phyla have been described from hot springs, including Geoarchaeota 4 , Marsarchaeota 5 , and Aigarchaeota 6 , and several Asgard phyla have been described from deep-sea hydrothermal vents [7][8][9][10][11][12] . However, diversity surveys have demonstrated that there are many novel taxa left to be explored 13 . Moreover, there are several gaps between our knowledge of active biogeochemical processes and the metabolic mechanisms and taxa mediating them. For example, the description of microbes mediating anaerobic methylotrophy is still limited, and it is unclear which non-methanogenic heterotrophs utilize methylated compounds on the anoxic seafloor 14 . Little is known about the microorganisms or pathways mediating this process 15 .
Methylotrophs are organisms that are capable of using simple organics, including single-carbon (C1; e.g., methanol) and methylated (e.g., trimethylamine) compounds, as a source of energy and carbon 16,17 . In nature, the most prevalent of these compounds are methanol and methylamines, which are derived from a variety of sources such as phytoplankton, plants, and the decay of organic matter 15,18,19 . As a result, they are ubiquitous in oceans and atmosphere and are important components of the global carbon and nitrogen cycles 15 . In oxic environments, methanol is converted to formaldehyde by the classical pyrroloquinoline quinone (PQQ)-linked methanol dehydrogenase pathway found in aerobic methylotrophs 15,18 . In anoxic settings, these compounds are used as substrates for methylotrophic methanogenesis [20][21][22][23] and sulfate reduction 24 . Anaerobic methylotrophs utilize the methyltransferase system (MT) to break and transfer the methyl residue to coenzyme M (in the case of methanogens) or tetrahydrofolate (H4F) (in acetogens and sulfate reducers) [20][21][22][23][24] and conserve energy via the Wood-Ljungdahl pathway (WLP). Methylotrophic archaea include methanogenic orders (in Euryarchaeota): Methanosarcinales, Methanobacteriales, Methanomassiliicoccales, and the recently discovered uncultured methylotrophic phylum, Verstraetearchaeota 20 . Methylotrophy has not been described in archaeal lineages outside of these methanogenic groups.
Here we describe a new archaeal phylum, the Brockarchaeota, whose members are metabolically versatile and can be found in geothermal environments around the world. The Brockarchaeota appear to possess diverse pathways for carbon cycling including fermentation of complex organic carbon compounds, anaerobic methylotrophy, and chemolithotrophy.

Results
Genomic reconstruction. Metagenomic sequencing, assembly, and binning of sediments from seven terrestrial hot springs in Tibet (up to 70°C) and Tengchong, Yunnan, China (up to 86°C) and from deep-sea hydrothermally heated Guaymas Basin (GB) sediments (10-34°C) resulted in the reconstruction of 15 draft metagenome-assembled genomes (MAGs) estimated to be 67-92% complete (Table 1). These MAGs range from 0.78 to 2.32 Mbp (average 1.47 Mbp) (Table 1). The two MAGs from GB (B48_G17 and B27_G9, temperature 33.62 and 10.4°C, respectively) were originally designated as "GB-AP1" in a prior study 25 . Although the GB genomes were obtained from lower temperatures, these sediments experience increases in temperature due to hydrothermal circulation 25 . Thus, the organisms from which these genomes were derived likely prefer hot geothermal ecosystems and anoxic conditions (Supplementary Data 1). All of these MAGs are rare members of the communities they were recovered from, representing 0.1 to 9% of the genomic reads (Supplementary Data 2).
Phylogeny and distribution in nature. Phylogenetic analyses of these MAGs based on a concatenated alignment of 37 conserved proteins (see Methods), revealed they form a distinct group within the TACK superphylum, basal to Aigarchaeota and Thaumarchaeota ( Fig. 1 and Supplementary Data 3). A comparison of average amino acid identities (AAI) across 250 available TACK genomes (Supplementary Data 4), revealed that Brockarchaeota are distinct from neighboring phyla (Geoarchaeota, Aigarchaeota, and Thaumarchaeota) and share up to 99% genome-wide nucleotide similarity to one another (Supplementary Fig. 1 and Supplementary Data 5). The two GB MAGs (B48_G17 and B27_G9) are distinct from the hot springs at the AAI level (<50% similar to each other), and <45% AAI to members of Geoarchaeota, Thaumarchaeota, and Aigarchaeota, which is consistent with their phylogenetic placement. Phylogeny of 16S rRNA genes also indicated that Brockarchaeota do not fall within any described archaeal phyla ( Fig. 2A), with <78% similarity to other TACK members. Together, these results support the classification of these MAGs as a new phylum. We propose that the phylum be named "Brockarchaeota", after Thomas Brock, an American microbiologist known for his groundbreaking research in hot springs microbiology.
Interestingly, only three 16S rRNA gene sequences with similarity (92-96%) to Brockarchaeota sequences have been described in PCR-based surveys, highlighting the inherent bias for primer choice in diversity studies. Therefore, we searched publicly available metagenomic databases to examine the geographic distribution of this phylum. Notably, we almost exclusively found 16S rRNA gene sequences related to Brockarchaeota in sequence data generated from other hot springs from around the world (China, USA, South Africa; Fig. 2A), revealing Brockarchaeota are globally distributed in hot springs (Fig. 2B). Three sequences, which cluster together, were recovered from lake sediments in Rwanda and the Gulf of Boni in Indonesia (28°C ) (see Supplementary Data 8), suggesting that some Brockarchaeota are mesophilic as well.
Unique anaerobic methylotrophic pathways. To begin to understand the metabolism of the Brockarchaeota, we compared the predicted proteins encoded by these genomes with a variety of functional databases (see Methods). This revealed a potentially unique type of anaerobic methylotrophic metabolism not yet described in archaea. They contain the methyltransferase system (MT), which has been shown to be essential for anaerobic methylotrophy 26 and is composed of two key steps. First, specific methyltransferases break C-R bonds in a variety of substrates (MtaB for methanol, MtmB for monomethylamine, MtbB for dimethylamine, and MttB for trimethylamine) and transfer the methyl moieties to subunit MtaC. The second methyltransferase (MtaA for methanol, MtbA for methylamines) transfers the methyl group from the corrinoid protein to coenzyme M in methanogens, or tetrahydrofolate in acetogens 24 . Brockarchaeota from hot springs encode proteins predicted to be methanol-cobalamin methyltransferases (MtaB) and trimethylamine-corrinoid protein methyltransferases (MttB) for the utilization of methanol and trimethylamine (TMA), respectively. B12-binding corrinoid protein genes are always colocalized with mtaB (Supplementary Data 9), suggesting co-transcription of both subunits of the methyltransferase complex. Due to the lack of MtaA proteins, we suggest that another undescribed protein may be involved in the transfer of the methyl group from the corrinoid protein to tetrahydrofolate (see details in Supplementary Discussion). Searches for MtaB and MttB proteins within the TACK superphylum indicate that the methanol-MT system is a unique feature of Brockarchaeota (Fig. 1). Phylogenetic analyses of MtaB and MttB revealed that Brockarchaeota form new branches within methanol and TMA methyltransferases (Fig. 3).
These results expand the distribution of anaerobic methylotrophy. Anaerobic methanol utilization has only been described in Euryarchaeota and TACK (Verstraetearchaeota 20 , Korarchaeota 27 ) archaea, and some bacteria (Firmicutes and Deltaproteobacteria) 28 . However, Brockarchaeota genomes do not possess the key genes for methanogenesis, including methyl-coenzyme M reductase (MCR), found in other archaea that have the MT system (Supplementary Data 10). To ensure that Brockarchaeota mcr genes were not overlooked, we searched metagenomic datasets from each of their communities and did not find any (see Supplementary Data 11). In addition to lacking Mcr, they lack key enzymes of the H4MPT methyl branch of the WLP, involved in the transfer and reduction of C1 moieties for methane production. The H4MPT branch is the natural pathway for degrading methylated compounds in methanogenic archaea 20 and a key module for methylotrophy, carbon assimilation, and formaldehyde detoxification in methylotrophic bacteria 16,17,29,30 . Furthermore, they lack the key enzyme for the carbonyl branch of the WLP, the CO dehydrogenase/acetyl-CoA synthase complex (CODH/ACS) 31 . Finally, Brockarchaeota do not encode the pyrroloquinoline quinone (PQQ)-linked methanol dehydrogenase pathway for aerobic methylotrophy (Supplementary Fig. 2), which was recently found in deep-sea anaerobic bacteria 24 .
The lack of previously described methylotrophic pathways raises the question of how C1 compounds are assimilated in Brockarchaeota. Our findings suggest they metabolize formaldehyde, methanol, and trimethylamine via convergent assimilatory pathways for biosynthesis and energy conservation. A detailed comparative analysis suggests that Brockarchaeota assimilate methanol and TMA via the convergent action of the tetrahydrofolate (H4F) methyl branch of the WL pathway (H4F-WLP) and the reductive glycine pathway (rGlyP). [...] 3-hexulose-6-phosphate synthase (HPS) and 6-phospho-3-hexuloisomerase (PHI). The RuMP pathway was originally found in methylotrophic bacteria that are able to use C1 compounds as a sole source of carbon and energy; however, it is currently recognized as a widespread prokaryotic pathway for formaldehyde fixation and detoxification 32 . The RuMP pathway functions as a highly efficient system for trapping free formaldehyde at relatively low concentrations. The presence of HPS and PHI in Brockarchaeota genomes suggests that formaldehyde can be fixed and detoxified via the RuMP pathway. Further oxidation of formaldehyde to formate can be carried out by the oxygen-sensitive tungsten-dependent aldehyde ferredoxin oxidoreductase (AFOR), which is present in all MAGs except those obtained from Tengchong hot springs (DRTY). In this way, Brockarchaeota can potentially play a key role in controlling formaldehyde consumption and therefore maintaining viability of the microbial community in geothermally active environments.
Formate, methanol, and trimethylamine assimilation. In Brockarchaeota, formate, derived from the oxidation of formaldehyde or obtained from their geothermal surroundings 33 [...]. Then pyruvate can be further oxidized to acetate, producing ATP by substrate-level phosphorylation for energy conservation and biomass production (see Fig. 4).
An unknown pathway for butanol metabolism. A phylogenetic analysis of alcohol dehydrogenases from the hot spring genomes revealed that they encode a butanol dehydrogenase (BDH) (Supplementary Fig. 4) that may catalyze the reversible [...] 36,37 , which is one of the few organisms that produces butanol as a fermentation product, and Saccharomyces cerevisiae 38 , involved in butanol and isopropanol production. We found that Brockarchaeota genomes lack the key enzymes involved in the fermentation of pyruvate to butanol (butanal dehydrogenase, butyryl-CoA dehydrogenase, enoyl-CoA dehydratase, 3-hydroxyacyl-CoA dehydrogenase). However, most of the genomes encode a putative aldehyde dehydrogenase that could convert butyraldehyde to butyric acid. We also found a putative enoyl-CoA hydratase/isomerase protein, encoded by one bin (JZ-1.89), which could be involved in further converting butyric acid to acetyl-CoA. Our results suggest an alternative pathway for butanol oxidation that remains unresolved (Supplementary Fig. 5).
Utilization of extracellular organic carbon and detrital proteins. Brockarchaeota may be able to degrade a variety of organic carbon compounds. They may utilize hexoses via the Embden-Meyerhof-Parnas (EMP) pathway (Supplementary Fig. 6) and pentoses (xylose isomerase XylA and xylulose kinase XylB) via the isomerase pathway (Supplementary Fig. 7). These enzymes were previously only found in bacterial thermophiles and halophilic archaea that ferment complex compounds and degrade xylose, suggesting a similar physiology in Brockarchaeota 39 . Once transported into the cell, complex carbon compounds could enter the central metabolism and be converted to acetate and H2 via acetogenic fermentation. The ATP-conserving step for sugar or pyruvate fermentation to acetate could be catalyzed by acetate-CoA ligase in the hot spring genomes.
Acetate can also be converted to acetyl-CoA by acetyl-CoA synthetase (ACS); thus, acetate might be a source of carbon and energy in the absence of other substrates in hot spring Brockarchaeota. The presence of pyruvate ferredoxin oxidoreductase (PFO), which couples pyruvate oxidation to H2 production and generates acetyl-CoA, could support fermentative metabolism via degradation of acetate, pyruvate, hexoses, or pentoses. Brockarchaeota genomes encode a wide repertoire of ATPases, such as the plasma-membrane proton-efflux P-type ATPase (only present in the hot spring genomes), the Zn2+/Cd2+-exporting ATPase (present in DRTY7.37), and the V/A-type H+/Na+-transporting ATPase (in most of the genomes). The presence of ATPases in Brockarchaeota suggests that members of these genotypes have the additional ability to couple acetogenic fermentation to membrane potential generation. To complement their ability to degrade xylose, Brockarchaeota also contain a relatively high number of carbohydrate-active enzymes (average of 27 CAZymes per genome), which is 3 times what has been observed in other TACK archaea (Supplementary Data 12 and Fig. 5). Ten of the 15 Brockarchaeota genomes have genes with similarity to α-L-fucosidase, which is involved in the degradation of xyloglucan, the major component of hemicellulose in plant cell walls 40 . All the hot spring genomes encode GH3 family proteins, which may be involved in biomass degradation, although these proteins can also play other roles in cell wall remodeling, energy metabolism, and pathogen defense 41 . The hot spring genotypes contain a wider repertoire of CAZymes than the deep-sea GB genomes. Among these are four predicted to be extracellular glycoside hydrolases, which are involved in the breakdown of high-molecular-weight plant-derived polysaccharides, primarily xylans, cellulose, and starch.
Comparison of the CAZymes across the TACK superphylum revealed that 17 extracellular enzymes, including those for the degradation of xylans, are unique to Brockarchaeota (Supplementary Data 12). The diversity and abundance of CAZymes in members of the TACK superphylum highlight that, despite the low number of sequenced Brockarchaeota genomes (15 described in this study) compared to Thaumarchaeota (89), Brockarchaeota contain a greater variety and number of CAZymes than their closest relatives. Brockarchaeota also encode a wide range of peptidases. With the exception of some bins (B48_G17, DRTY-6.200, DRTY7_35_44, JZ-1.89, JZZ.4, and QC4_43), they all encode potential extracellular peptidases, indicating a potential role of Brockarchaeota in protein remineralization. Many of these are so unique that it was not possible to identify their specific substrates (Supplementary Data 13).
Mercury, sulfur, arsenate, and hydrogen chemolithotrophy. Geothermal ecosystems such as shallow and deep-sea vents, volcanoes, geysers, hot springs, and fumaroles are natural sources of mercury (Hg) 42 . Hg-resistant microorganisms are known to be enriched in deep-sea hydrothermal vents and in terrestrial geothermal springs. Three hot spring Brockarchaeota genomes (DRTY735_44, DRTY-1.18 and DRTY7.37) encode mercuric reductase (MerA), the central enzyme in the microbial mercury detoxification system. MerA transforms the extremely toxic Hg(II) to metallic Hg(0) and is thus potentially involved in mercury detoxification 42 . Brockarchaeota MerA and closely related proteins were used to generate a mercury reductase phylogeny (Supplementary Fig. 8), which indicates that Brockarchaeota possess a previously uncharacterized class of MerA that is related to those of other archaea (Crenarchaeota, Methanomicrobia, DPANN, and Asgard).
Interestingly, Brockarchaeota also encode the arsenic detoxification system, which decreases the intracellular arsenic concentration by pumping out arsenate that enters the cell, thus preventing the metals from accumulating and denaturing proteins 43 . The intracellular arsenate reductase (ArsC, K03741), which catalyzes the reduction of arsenate (AsO4 3−) to arsenite (As(OH)3) (Fig. 4), is present in most hot spring genomes (Supplementary Data 6). Phylogeny of Brockarchaeota ArsC (Supplementary Fig. 9) indicates that they belong to a deep uncharacterized branch of the thioredoxin-coupled clade, which has been mainly described in Firmicutes 44 . In agreement with the geothermal origin of Brockarchaeota genomes, homologous ArsC sequences have been recovered from geothermally active environments. These sequences were assigned to uncultured Bathyarchaeota or Thaumarchaeota 45,46 , which could potentially be Brockarchaeota or have a similar arsenate metabolism. The presence of ArsC and the related energy-dependent detoxification proteins could also indicate that hot spring Brockarchaeota use arsenate as a terminal electron acceptor, as seen in some bacteria, yet the exact molecular mechanism of this process is unknown 43,47 .
Furthermore, similar to other heterotrophic fermentative hyperthermophilic archaea 48,49 , Brockarchaeota might be able to reduce elemental sulfur during fermentative growth and produce H2S due to the presence of [NiFe] Group 3b hydrogenases (Supplementary Fig. 10). During carbohydrate fermentation in the absence of sulfur, [NiFe] Group 3b hydrogenase can catalyze the production of H2 with NADPH or NAD(P)H as the electron donor. However, in the presence of sulfur, Brockarchaeota might have the ability to reduce sulfur using H2 or organic substrates as electron donors, a widespread physiology in hyperthermophilic archaea living in geothermally active environments (volcanic habitats, hot springs, or marine sediments). Hydrogen is also abundant in geothermally active systems due to volcanic processes 50 . Brockarchaeota might be able to use Group 3b [NiFe]-hydrogenases for H2 oxidation with NADP+ or NAD(P)+ as an electron acceptor 51 . The hot spring genomes also encode oxygen-tolerant Group 3d [NiFe]-hydrogenases, which may allow them to transfer electrons between NAD(P)H and H2 depending on the availability of electron acceptors. Group 3d [NiFe]-hydrogenases are abundant in metagenomes from hot springs, where microbial communities are relatively stable despite fluctuations in the partial pressure of oxygen 52 . Group 3b [NiFe]-hydrogenases may also make it possible for these archaea to reduce elemental sulfur to H2S during fermentative growth. During carbohydrate fermentation in the absence of sulfur, Group 3b [NiFe]-hydrogenases might catalyze the production of H2 with NADPH or NAD(P)H as the electron donor.
Therefore, Brockarchaeota might have the ability to reduce sulfur, using H 2 or organic substrates as electron donors, which is common in hyperthermophilic archaea living in geothermally active environments 53 .

Discussion
Brockarchaeota gene content indicates they are facultative or obligate anaerobic fermentative organisms that produce acetate, CO2, and H2 as byproducts (see Supplementary Information for details). Some Brockarchaeota have unique pathways for non-methanogenic methylotrophy. This puts them in a unique ecological position in nature, where they degrade abundant methylamines in anoxic environments without the production of methane (Fig. 6). Brockarchaeota are also able to degrade complex carbon compounds such as xylan. Xylan is a major structural polysaccharide in plant cells and is the second most abundant polysaccharide in nature after cellulose, accounting for approximately one-third of all renewable organic carbon on Earth 54,55 . This suggests that Brockarchaeota are players in organic matter degradation in geothermally active environments. Interestingly, detrital proteins can be used as a substrate by Brockarchaeota, indicating a potential role in protein remineralization in geothermally active environments.
The protein repertoires of GB and hot spring genomes have some important distinctions that reflect different anaerobic metabolisms. GB genomes appear to belong to obligately fermentative organisms that rely mostly on substrate-level phosphorylation, since they lack all the complexes of the respiratory chain with the exception of the ATPase. In contrast, hot spring genomes appear to have mechanisms to increase their ATP yield, including the use of geothermally derived inorganic substrates such as mercury (Hg), arsenic (As), and hydrogen (H2) as possible terminal electron acceptors. Deep-sea hydrothermal vents, hot springs, and fumaroles are natural sources of Hg 42 , H2 52 , arsenic 56 , and sulfur 57 .
The discovery of Brockarchaeota genomes from sediments around the world, overlooked by conventional rRNA gene diversity approaches, highlights the need for further exploration of subsurface microbial communities. Although they are relatively low in abundance in the communities described here, the addition of these genomes to public databases will enhance their detection in future environmental studies, as with other recently described novel archaeal lineages 1,58 . A lack of recognition of their existence prior to this limited our ability to fully describe sediment community structure and function. Given their broad distribution and versatile carbon metabolism, they are likely key players in global carbon cycling. However, this first description is limited to genomic characterization; thus, culturing or activity measurements are needed to confirm their physiological activities 59 . Overall, the description of this new phylum enhances our understanding of the biodiversity of archaea and suggests they are mediating unique roles in anoxic carbon cycling.

Methods
Metagenomic assembly and binning. Two MAGs (B48_G17 and B27_G9) were obtained from Guaymas Basin sediments (Gulf of California; 27°N 0.388, 111°W 24.560) as part of a larger study of these hydrothermal marine sediments 25 . Both samples were collected from the same location, but G9 was sampled from 0-3 cm and G17 from 12-15 cm depth. The sediment cores from which these two MAGs were binned were collected during Alvin dive 4571_4 in 2009 using polycarbonate cores (45-60 cm in length, 6.25 cm interior diameter), subsampled into cm layers under N2 gas in the ship's laboratory, and immediately frozen at −80°C. Details on the sampling site and metagenomic sequencing effort are provided in Dombrowski et al. 25 . Briefly, total DNA from ≥10 g of sediment from each sample was extracted using the MoBio PowerMax soil kit following the manufacturer's instructions and adjusted to a final concentration of 10 ng/µl for each sample (using a total amount of 100 ng). Libraries for paired-end Illumina (HiSeq-2500 1TB) sequencing were prepared by the Joint Genome Institute (JGI). Sequencing was performed on an Illumina HiSeq 2500 machine using the paired-end 2 × 125 bp run-type mode. All runs combined provided a total of ~280 gigabases of sequencing data. Quality control and sequence assembly were performed by JGI. Briefly, sequences were trimmed and screened for low-quality sequences using bbtools (https://jgi.doe.gov/data-and-tools/bbtools/) and assembled using megahit v1.0.6 with the following options: --k-list 23,43,63,83,103,123. For further binning, only scaffolds ≥2000 bp were included.
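The length cutoff applied before binning can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names `read_fasta` and `filter_scaffolds` are hypothetical, and the input is assumed to be plain FASTA text with one header line per scaffold.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_scaffolds(lines: Iterable[str], min_length: int = 2000) -> Dict[str, str]:
    """Keep only scaffolds whose length is at least min_length bp.

    The default of 2000 bp mirrors the cutoff stated in the text;
    everything else here is an illustrative assumption.
    """
    return {h: s for h, s in read_fasta(lines) if len(s) >= min_length}


# Hypothetical toy input: one 1500 bp and one 2400 bp scaffold.
demo = [">scaffold_1", "A" * 1500, ">scaffold_2", "ACGT" * 500, "ACGT" * 100]
kept = filter_scaffolds(demo)
print(sorted(kept))
```

The same function could be applied with `min_length=2500` to mimic the cutoff described for the hot spring assemblies.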
Metagenomic binning was performed on individual assemblies using the binning tools ESOM, Anvi'o (v2.2.2) 60 and Metabat (v1) 61 [...] Sequencing was done on an Illumina HiSeq4000 (Beijing Novogene Bioinformatics Technology Co., Ltd). These samples were assembled using metaSPAdes (version 3.9.1), with a k-mer set of "21, 33, 55, 77, 99, 127". For each sample, only scaffolds larger than 2500 bp were binned using MetaBAT (v.2.12.1) with default parameters, considering both tetranucleotide frequencies (TNF) and scaffold coverage information. The scaffolds from the obtained bins and the unbinned scaffolds were visualized using ESOM with a minimum length of 2500 bp and maximum length of 5000 bp as previously described 62 , and the bins were modified by removing any out-of-range scaffolds (indicated by sequence points) or adding any unbinned scaffolds using ESOM-related scripts 37 . MAGs from Tibet hot springs with scaffolds ≥1000 bp were uploaded to ggKbase (http://ggkbase.berkeley.edu/), and the bins from ESOM analyses were evaluated and modified manually at ggKbase based on GC content, coverage, and taxonomic information of scaffolds. MAGs from Tengchong hot springs were reassembled using SPAdes (version 3.9.1) under the "careful" mode with the same k-mers. During this step, the reads used for the assemblies were recruited by mapping clean reads to the curated genome bins using BBmap (v35.85; http://sourceforge.net/projects/bbmap/). The quality of all the MAGs was evaluated by calculating the percentage of completeness and gene duplications using CheckM lineage_wf (v1.0.5).
Phylogenetic analyses. A phylogenetic tree was generated as recently described in ref. 1 . Briefly, 37 conserved marker proteins were extracted using phylosift 63 from a genomic dataset containing 3549 archaeal genomes, including Brockarchaeota, and 40 bacterial genomes. An alignment of the proteins extracted from a total of 3599 genomes was generated using MAFFT v7.450 64 (algorithm autoselection) with BLOSUM62 scoring and contains 4962 characters after masking gaps present in at least 50% of the taxa. A tree was constructed with IQtree 65 v1.6.11 with a best-fit LG + F + R10 model selected using the Bayesian Information Criterion (BIC), and bootstraps are based on 1,000 replicated trees (command-line options: -bb 1000 -bnni). The bacterial genomes were used as an outgroup. The 16S rRNA sequences were extracted from Brockarchaeota genomes using Barrnap v0.9 (https://github.com/Victorian-Bioinformatics-Consortium/barrnap) with the following parameters: --kingdom arc --lencutoff 0.2 --reject 0.3 -evalue. The obtained sequences (Supplementary Data 7) were used for a 16S rRNA gene phylogeny that included sequences derived from metagenomic surveys (NCBI accession EU924237, KX213943, and KX213897) and the IMG database. We used the 16S rRNA genes from the MAGs to search these databases to identify additional Brockarchaeota genes. The rRNA phylogeny was generated using RAxML within the ARB software package (v. 2.5b) using default parameters.
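The gap-masking step applied to the concatenated alignment can be sketched in Python as follows. This is only an illustration under stated assumptions: the source does not say which tool performed the masking, the function name `mask_gappy_columns` is hypothetical, and the alignment is assumed to be a mapping from taxon name to an aligned sequence of equal length.

```python
from typing import Dict


def mask_gappy_columns(
    alignment: Dict[str, str],
    min_gap_fraction: float = 0.5,
    gap_chars: str = "-.",
) -> Dict[str, str]:
    """Remove alignment columns that are gaps in >= min_gap_fraction of taxa.

    The 0.5 default follows the "at least 50% of the taxa" criterion in the
    text; the choice of gap characters is an assumption.
    """
    if not alignment:
        return {}
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    n_taxa = len(seqs)
    # Keep a column only if its gap fraction is below the threshold.
    keep = [
        i
        for i in range(length)
        if sum(s[i] in gap_chars for s in seqs) / n_taxa < min_gap_fraction
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Hypothetical four-taxon toy alignment; the third column is 75% gaps.
toy = {"taxon_A": "MK-L", "taxon_B": "M--L", "taxon_C": "MKAL", "taxon_D": "-K-L"}
masked = mask_gappy_columns(toy)
print(masked)
```

Columns are judged independently, so taxa that consist mostly of gaps after masking remain in the alignment and would need to be handled separately.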
Metabolic predictions. Gene predictions for individual genomes were performed using Prodigal 66 [...] 73 . For KofamKOALA, only hits above the predefined threshold for individual KOs were selected. Hydrogenases were extracted using the reference database described in Greening et al. and Søndergaard et al. 52,69 . Where there was conflict, the protein was manually reanalyzed using BLAST against the non-redundant protein database, and genomic organization and annotation were confirmed using the web-based tool Operon Mapper 74 . The detected hydrogenases were used to generate a phylogenetic tree as previously described in Seitz et al. 7 . Hits for key metabolic marker genes were verified across different databases (KofamKOALA, PFAM v31, TIGRFAMs, and HydDB) and were further verified using BLASTP on the NCBI web server. Genes encoding carbohydrate degradation enzymes described in the carbohydrate-active enzymes (CAZymes) database 75 were identified by only retaining hits recovered by ≥2 tools. Protein localization of the selected CAZymes and peptidases was determined with the command-line version of Psort v3.0 76 using the options -a and -terse for archaeal genomes in tabular format files. Finally, the presence of specific protein families was obtained with MEBS. The annotation was performed on a genomic dataset of 250 publicly available TACK genomes (Supplementary Data 3) that were also used for the CAZymes annotation.
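The rule of retaining only CAZyme hits recovered by ≥2 tools amounts to a simple consensus filter, sketched below in Python. The tool names, protein identifiers, and the function `consensus_hits` are hypothetical placeholders, and a hit is assumed here to be a (protein ID, CAZyme family) pair; the study's actual tools and file formats are not reproduced.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

Hit = Tuple[str, str]  # (protein identifier, CAZyme family), an assumed representation


def consensus_hits(hits_by_tool: Dict[str, Set[Hit]], min_tools: int = 2) -> Set[Hit]:
    """Return the hits reported by at least min_tools independent tools."""
    support = defaultdict(set)
    for tool, hits in hits_by_tool.items():
        for hit in hits:
            support[hit].add(tool)  # record which tools back this hit
    return {hit for hit, tools in support.items() if len(tools) >= min_tools}


# Hypothetical predictions from three placeholder tools.
predictions = {
    "tool_a": {("prot_1", "GH3"), ("prot_2", "GH29")},
    "tool_b": {("prot_1", "GH3"), ("prot_3", "GT2")},
    "tool_c": {("prot_2", "GH29"), ("prot_1", "GH3")},
}
retained = consensus_hits(predictions)
print(sorted(retained))
```

Requiring agreement on both protein and family is the stricter reading of the rule; matching on protein ID alone would retain more hits.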
Methyl coenzyme M reductase screening. The mcrA gene was identified using GraftM v0.10.2 77 across metagenome assemblies where Brockarchaeota genomes from hot springs were detected 78 . The mcrA-containing scaffolds with sequence length <2.5 Kbp were discarded since scaffolds with short length were not used during the genome binning step. The taxonomic information of the corresponding bins which contain mcrA genes were determined using either GTDBtk v0.3.2 79 or phylogenetic placement (as reported in Supplementary Data 11). The mcrABGCD genes were identified in metagenome assemblies from deep-sea assemblies previously described 25 (Guay17 and Guay9; IMG genome ID 3300014887 and 3300013103, respectively).
Relative abundance of Brockarchaeota in communities. The relative abundance of the MAGs from deep-sea samples was obtained from Supplementary Data 3 in Dombrowski et al. 25 . Only samples G9 and G7 are shown, and the data are sorted according to the relative abundance of those corresponding samples. GB MAGs include a post-publication manual refinement in the taxonomy according to the recent archaea tree of life 1 . The relative abundance of MAGs from Tengchong samples was computed using the bin_abundance.py script (https://github.com/valdeanda/MetaGaia). For each MAG, the total length of mapped reads for individual scaffolds (reads mapped using the BWA algorithm) is summed and the total is then divided by the MAG size in bp. This number is then divided by the total number of reads to obtain the relative abundance. The final relative abundance is multiplied by 100,000,000 for readability purposes. For the genome bins obtained from a given Tibet sample, the sequencing coverage was determined by read mapping using Bowtie2 and coverage calculation using the jgi_summarize_bam_contig_depths script from MetaBAT 61 . The relative abundance of a given genome bin was calculated as its sequencing coverage divided by the total sequencing coverage of all genome bins in the corresponding sample (Tengchong samples as previously described) 6 .
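The two abundance calculations described above reduce to short arithmetic, illustrated in the following Python sketch. It is not the bin_abundance.py script itself: the function names, argument names, and toy numbers are hypothetical, and read mapping is assumed to have been done beforehand.

```python
from typing import Dict


def scaled_abundance(
    mapped_bases_per_scaffold: Dict[str, float],
    mag_size_bp: int,
    total_reads: int,
    scale: float = 100_000_000,
) -> float:
    """Sum mapped read length over scaffolds, normalize by MAG size and
    by total read count, then multiply by a readability factor."""
    if mag_size_bp <= 0 or total_reads <= 0:
        raise ValueError("MAG size and total read count must be positive")
    mapped_total = sum(mapped_bases_per_scaffold.values())
    return mapped_total / mag_size_bp / total_reads * scale


def coverage_fraction(bin_coverage: Dict[str, float]) -> Dict[str, float]:
    """Express each bin's coverage as a fraction of the summed coverage
    of all bins in the same sample."""
    total = sum(bin_coverage.values())
    if total <= 0:
        raise ValueError("total coverage must be positive")
    return {name: cov / total for name, cov in bin_coverage.items()}


# Hypothetical toy values, chosen only to make the arithmetic easy to follow.
abundance = scaled_abundance({"s1": 3000, "s2": 1000}, mag_size_bp=2000, total_reads=1_000_000)
fractions = coverage_fraction({"bin_a": 30.0, "bin_b": 10.0})
print(abundance, fractions)
```

Note that the two measures are not on the same scale: the first is normalized by the whole read set, the second only by the binned fraction of the community.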
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The final assembled and annotated genomic sequences of Brockarchaeota from deep-sea sediments (B27_G9 and B48_G17) have been deposited in NCBI under BioProject ID PRJNA362212, with BioSample IDs SAMN09215183 and SAMN09214986, respectively. Sequence data and sample information of Brockarchaeota from hot springs are available at NCBI under BioProject ID PRJNA544494. All the NCBI accession numbers for the MAGs described in this study are provided in Supplementary Data 1. Raw data for Figs. 1 and 2 are provided as Supplementary Data 3 and 7, respectively. Any other datasets generated and/or analyzed during the current study are available from the corresponding authors on request.