Genomics and metatranscriptomics of biogeochemical cycling and degradation of lignin-derived aromatic compounds in thermal swamp sediment

Thermal swamps are unique ecosystems where geothermally warmed waters mix with decomposing woody biomass, hosting novel biogeochemical-cycling and lignin-degrading microbial consortia. Assembly of shotgun metagenome libraries resolved 351 distinct genomes from hot-spring (30–45 °C) and mesophilic (17 °C) sediments. Annotation of 39 refined draft genomes revealed metabolism consistent with oligotrophy, including pathways for degradation of aromatic compounds, such as syringate, vanillate, p-hydroxybenzoate, and phenol. Thermotolerant Burkholderiales, including Rubrivivax ssp., were implicated in diverse biogeochemical and aromatic transformations, highlighting their broad metabolic capacity. Lignin catabolism was further investigated using metatranscriptomics of sediment incubated with milled or Kraft lignin at 45 °C. Aromatic compounds were depleted from lignin-amended sediment over 148 h. The metatranscriptomic data revealed upregulation of des/lig genes predicted to specify the catabolism of syringate, vanillate, and phenolic oligomers in the sphingomonads Altererythrobacter ssp. and Novosphingobium ssp., as well as in the Burkholderiales genus, Rubrivivax. This study demonstrates how temperature structures biogeochemical cycling populations in a unique ecosystem, and combines community-level metagenomics with targeted metatranscriptomics to identify pathways with potential for bio-refinement of lignin-derived aromatic compounds. In addition, the diverse aromatic catabolic pathways of Altererythrobacter ssp. may serve as a source of thermotolerant enzymes for lignin valorization.


Introduction
Few geothermally influenced wetlands exist on Earthgeothermal hot springs that feed marshlands can be found in Yellowstone National Park (USA) [1][2][3], Great Rift Valley (Kenya), Iceland, and New Zealand [4]-fewer still are forested. The Liard River Hot Springs in northern British Columbia, Canada feature geothermally warmed springs (30-55°C) with organic sediments and biomats that receive inputs from the surrounding mixed-forest vegetation. Based on these characteristics, we hypothesized that this environment harbors thermotolerant microbiota capable of degrading lignin-derived aromatic compounds.
Herein, we used genome-resolved shotgun metagenomics and metatranscriptomics to characterize the phylogenetic and metabolic diversity of a thermal swamp. We exploited the negative, unimodal relationship between temperature and microbial diversity [27][28][29] to accomplish high-coverage genome reconstruction from the hot-spring microbiome. We then used metatranscriptomics to test the hypothesis that thermal swamp communities contained active microbial populations involved in aromatic catabolism. Genomic reconstruction revealed diverse chemoautotrophic metabolism in the thermal microbiome but only partially elucidated aromatic pathways for lignin-derived compounds, while metatranscriptomics and analytical chromatography provided evidence for thermotolerant catabolism of aromatic compounds. In particular, this study characterized transcriptomes from oligotrophic sphingomonads associated with degradation of S-and G-lignin at elevated temperature.

Field site and sampling
Liard River Hot Springs (59.431N, 126.1W) is a complex of carbonate-hosted springs that feed geothermally warmed meteoric waters to extensive swampland underlaid by sandstone and shale [30,31] (Supplementary Fig. 1). The pools are surrounded by unique thermal meadow vegetation, and mixed forest containing Populus tremuloides, Betula papyrifera, and Picea glauca. Four hot springs were sampled on September 27, 2018: the Alpha (50-55°C), Beta (30-35°C), Delta (30-40°C), and Epsilon (40-45°C) pools. Alpha pool contains a recreational bathing area, and was previously measured to have a pH of 6.5-6.8 and concentrations of 0.9-1.0 mg l −1 O 2 , 180-187 mg l −1 HCO 3 , 554-592 mg l −1 SO 4 , 0.4-33.8 mg l −1 H 2 S, < 1 mg l −1 NO 3 , and <0.005 mg l −1 Fe [31,32]. Gas in the water column was primarily N 2 (93.9%) with trace CO 2 (4.2%) and methane (0.3%) [32]. Water cools as it traverses the marshland via the Epsilon and Alpha riverine outflows to about 17°C at our final sampling location (Cool). In each pool, three replicatẽ 500 ml bulk samples of sediment were removed with a syringe pump into autoclaved 1 l Nalgene ® bottles. Bottles were filled to the brim with spring water, and the pump was rinsed once with 70% ethanol and thrice with autoclaved distilled water between each sample. In addition,~5 ml from each sampling location was placed in a Falcon ® tube and immediately frozen on dry ice for DNA extraction.
DNA extraction, library preparation, and sequencing DNA was extracted twice per sample, using 0.25 g sediment, soil, or biofilm, with NucleoSpin Soil kits (Machery-Nagel, Düren, Germany) then pooled. DNA quality was assessed with a 1% agarose gel and concentration was measured with Qubit dsDNA high-sensitivity assays (ThermoFisher, Waltham, USA). 1 ng of DNA from three replicates of selected sediment samples (Cool, Beta, Epsilon) was prepared for shotgun sequencing on one Next-Seq550 (Illumina) run in High Output mode using NexteraXT library preparation (Illumina, San Diego, USA). Additional sampling and 16S rRNA gene sequencing are described in the Supplementary Methods.

Lignin incubations
2 g of sediment from Epsilon, the warmest undisturbed pool, was incubated in 5 ml of M9-Goodies minimal medium [33,34] with 2 mg of vanillin (VAN), Eucalyptus milled-wood lignin (EMWL) (see Supplementary Methods), eucalyptus kraft lignin (EKL), or coniferyl alcohol dehydrogenase-polymer model lignin (DHP). Eucalyptus wood chips and EKL were provided by Suzano Canada. DHP was synthesized as in [35]. Control incubations were conducted with no exogenous carbon. All incubations took place in 50-ml sealed serum bottles at 45°C and 150 RPM mixing in the dark. During incubations, substrate-induced respiration (SIR) was measured by adding 1 ml air to the vial headspace with a 1-ml glass syringe, mixing, and removing 1 ml headspace, which was manually injected into a 5890 Series II gas chromatograph (Agilent Technologies, Santa Clara, USA) equipped with a flame ionization detector and methanizer, and quantified against CO 2 standards (Praxair, Danbury, USA). Three separate incubations for each substrate were destructively sampled at 0, 48, 96, and 148 h, except for VAN, which was sampled only at 24 h. At each time point, samples were centrifuged at 4°C for 5 min at 18,000 × g, and the supernatant was stored at −80°C prior to high-pressure liquid chromatography (HPLC). RNA was immediately extracted from the remaining sediment.

RNA extraction, library preparation, and sequencing
Approximately 0.5 g of sediment was separated into 2-ml screw-top microtubes containing 740 µl 0.1 M Na 2 PO 4 -NaH 2 PO 4 pH 7.3, 60 µl 10% SDS and 800 µl 25:24:1 phenol:chloroform:isoamyl alcohol (Sigma Aldrich, St. Louis, USA), modified from [36]. RNases from thermophilic organisms are highly resistant to inhibition or denaturation. To recover high-quality RNA from thermophilic communities, 100 µl of 200 mM ribonucleoside vanadyl complex (RVC) (Sigma Aldrich) was added to the extraction buffer [37,38]. RVC was essential for recovery of RNA from these samples (Supplementary Fig. 2A). Bead beating at 5.5 m/s for 30 s (twice) was followed by centrifugation for 5 min at 4°C and 18,000 × g. The supernatant was added to 800 µl of chloroform:isoamyl alcohol (Sigma Aldrich), mixed by hand, and centrifuged again. The supernatant was purified using the RNeasy ® Mini Kit (Qiagen), and DNA was removed using Turbo DNase (ThermoFisher). RNA concentration was measured using the Qubit™ RNA HS assay (ThermoFisher), and purified RNA stored at −80°C for <1 week. RNA integrity was measured using an Agilent 2100 Bioanalyzer (Agilent Technologies) before and after rRNA removal with Ribo-Minus™ Transcriptome Isolation Kit, bacteria (Thermo-Fisher). Sequencing libraries were prepared using SuperScript™ Double-Stranded cDNA synthesis (Thermo-Fisher) and NexteraXT (Illumina). Twelve 100-bp pairedend libraries were sequenced per NextSeq550 (Illumina) High Output run.

Liquid chromatography
Monoaromatic compounds in incubation supernatants were analyzed by HPLC against 2,6-dimethoxy-1,4-benzoquinone, vanillic acid, syringic acid, VAN, and syringaldehyde standard curves obtained by injecting between 1 and 20 μM of authentic standards in 50 mM sodium phosphate. Full details are provided in Supplementary Methods.

Assembly of genomes from metagenomes
Shotgun metagenome libraries were quality filtered and adapter trimmed using Trimmomatic 0.38 with default settings [42] prior to MegaHit 1.1.3 co-assembly with "metalarge" presets [43]. Contig open reading frames (ORFs) were predicted using Prodigal 2.6.3 [44]. Taxonomy of each ORF was annotated using Kaiju 1.7.2 [45], which, in turn, was used to calculate contig consensus taxonomy if 50%+1 of contig ORFs shared taxonomic identity. Contig coverage was calculated with BBMap 38.22 (https://sourceforge.net/ projects/bbmap/) prior to binning with Metabat2 [46]. CheckM 1.0.13 [47] was used to identify phylogenetic markers and calculate bin completeness, contamination, and heterogeneity. High-quality MAGs were manually refined (Supplementary Methods) and CheckM was re-run on the resulting refined genomes, followed by taxonomic classification and placement in whole-genome phylogenetic trees using GTDB-Tk [48] (github.com/Ecogenomics/GTDBTk). A metatranscriptome co-assembly was performed as above but using only 25-mers and binned without the support of coverage calculations.

Metabolic pathway annotation
Genes were annotated using a set of previously compiled Pfam and TIGRFAMs profile HMMs representing enzymes involved in biogeochemical cycling and energy metabolism [49][50][51], as well as Kofamscan profile HMMs [52]. Hits for ammonia oxidation (amoA) and nitrite oxidoreductase genes (nxrA) were validated by phylogenetic placement. Additional archaeal ammonia oxidation genes were identified using local alignment against Nitrososphaera viennensis EN76 amoABC [53]. Carbohydrate-active enzyme (CAZy) gene annotations were defined using dbCAN2 [54] with an E value of 1e−60. To facilitate the identification aromatic compound degradation and anoxygenic photosystem genes, custom thresholds were set for 25 KEGG HMMs (Supplementary Data 1, Supplementary Methods).

Calculating genomic abundance
Quality-filtered read files from metagenome and transcriptome libraries were aligned to the 39 refined genomes with BBMap. To capture potential biodiversity missed by the assembly approach, unmapped reads were subsequently aligned with 351 remaining contig bins and 24,668 reference genomes [48]. Genomes with >1% coverage were included in subsequent genomic analysis. To visualize the distribution of each genome across temperatures, feature scaling was used to normalize counts in each metagenome library between 0 and 1 for all genomes independently.

Statistical analysis
Comparison of OTU richness, assembly coverage, and genome richness across sequencing libraries was made using unpaired t-tests in R v3.6.2 [55] with a significance cutoff of p < 0.05. Headspace CO 2 was also compared with unpaired t-tests. Genomic and transcriptomic bins were analyzed for transcription levels in incubations with DHP (48 h), EKL (96 h), EMWL (168 h), VAN (24 h), and NC (96 h), coinciding with sampling times when maximum RNA was recovered ( Supplementary Fig. 2B). Transcript counts from metatranscriptome libraries were compared as transcripts per million with generalized linear models (GLM) and the multcomp Tukey-HSD implementation in R against NC controls. Differential gene expression was analyzed across metatranscriptomes using DESeq2 [56], using log 2 fold change (L 2 FC) values, and a significance cutoff of p adj < 0.05 following false-discovery rate correction.
MAGs can contain erroneously binned sequences that can lead to faulty ecological interpretation. While MetaBat2 performs favorably compared to other binning algorithms [46], we nonetheless manually refined 39 MAGs with contamination < 5% and completeness > 80% (unless otherwise noted) into our final "refined genome" data set ( Supplementary Fig. 4). Subsequent analyses focused on the refined genomes because: (1) they are taxonomically representative of the larger MAG data set; (2) many MAGs have incomplete or erroneous assembly; and (3) the refined genomes are the most abundant in the metagenome libraries based on read coverage. Only one Burkholderiaceae genome (Rubrivivax sp. L.E.AP.16) was included in the refined genome pool, despite a completion estimate of 56%, due to its unique metabolic capacity and possible thermotolerance. Sphingomonadaceae MAGs had a mean completeness of only 28%, and none were included in the analysis of refined genome.

Environmental distribution of refined genomes
The temperature at which each refined genome was found to have the highest relative abundance was used to infer their potential thermotolerance. We quantified the relative abundance of the refined genomes across a thermal gradient by aligning quality-filtered reads from shotgun metagenome libraries to refined genomes. We additionally assessed genomes potentially missed by our approach by aligning remaining reads against a collection of species-level reference genomes. Both refined and reference genomes were used to evaluate relative abundance and diversity across thermal samples. Comammox Nitrospira genomes were abundant at 17°C (see Supplementary Figs. 5 and 6 for Amo and Nxr phylogenies), highlighting the oligotrophy of this environment. Twelve of 41 genomes used in this analysis were most abundant at 30°C (Fig. 2). These included many of the Proteobacteria, excluding L.E.AP.16, and the manganeseoxidizing thermophile, Caldimonas manganoxidans [57]. Inferring genome metabolic potential across temperature ranges The metabolic capacity of each genome was inferred using profile HMMs from select Pfam [50], TIGRFAMs [49,51], CAZy [58], and KEGG [52] protein families (Supplementary Data 1), as well as local alignment against the Uni-Ref90 database [59]. Metabolic pathways were positively annotated when they contained essential enzymes such as monooxygenases or dioxygenases, and also contained at least 50%+1 of all genes identified using the above pHMMs (Fig. 3). Gene-level annotation is provided in Supplementary Fig. 7 and Supplementary Data 4.
Multiple refined genomes were found only at 30 or 45°C, while 14 were similarly abundant in both pools (Fig. 2). The genomes from three temperature categories (30,45, and 30-45°C) encoded similar biogeochemical cycling capacity, with differences in the taxonomy of microorganisms involved in nitrogen, sulfur, hydrogen, and halogen metabolism (Figs. 3 and 4). At 30°C, a phylogenetically diverse group of genomes encoded transformation of nitrogen, including a cyanobacterium, Fischerella sp. (L.E.CL.63), which encoded capacity for both N-fixation and urea mineralization. A Nitrososphaera genome (L.E.AR.1), containing a full complement of archaeal ammoniamonooxygenase genes ( Fig. 3 and Supplementary Figs. 5 and 6), was maximally abundant at 30°C. The abundance of Comammox Nitrospira sp. and group I.1b Thaumarchaeota ammonia-oxidizing archaea indicates highly oligotrophic conditions, i.e., [NH 3 + NH 4 + ] < 5 µM [7]. The capacity for dissimilatory nitrogen respiration was encoded by two alphaproteobacterial genomes, L.E.AP.45 and L.E.AP.39, which also contained pathways for aerobic aromatic catabolism via meta-and ortho-cleavage of protocatechuate (Fig. 4), indicating the potential for facultative anaerobic respiration and aerobic biomass decomposition. The genomes that showed equivalent abundance at 30 and 45°C (30-45°C) may represent thermotolerant organisms. Burkholderiales genomes of L.E.AP.16 (Rubrivivax sp.) and C. manganoxidans encoded aromatic degradation capacity though the catechol meta-cleavage and protocatechuate metaand ortho-cleavage pathways (Figs. 3 and 4). The capacity for vanillate degradation was encoded in two-component vanillate O-demethylase (vanAB) in C. manganoxidans, rather than in the tetrahydrofolate-dependent vanillate/syringate O-demethylase (ligM/desA) found in a Caulobacterium genome from 30°C. The vanAB genes were not detected in the L.E.AP.16 genome, despite expression of their Fig. 2 Refined genome phylogeny, assembly statistics, and abundance in Epsilon (45°C) and Beta (30°C) hot springs and mesophilic sediment (17°C) metagenomic libraries. Whole-genome phylogeny assessed as in Fig. 1. Tree scale represents estimated site divergence and green dots indicate branching calculated with bootstrapped confidence > 75%. Completeness (yellow) and contamination (red) were estimated with CheckM 1.0.13. Genome abundance calculated by alignment of quality-filtered reads against refined genomes and GTDB reference genomes using BBMap 38.22 and retaining targets > 0.5% coverage (blue). Abundance counts normalized using feature scaling between 0 and 1 for each genome. homologs in Rubrivivax transcriptomes (see below). The Burkholderiales genomes abundant in the 30-45°C range also showed substantial carbohydrate degradation capacity based on the presence of high numbers of glycoside hydrolases (GHs) (Fig. 4 and Supplementary Fig. 8).
Genomes found only 45°C may represent moderate thermophiles. The Chloroflexota genome L.E.CH.5 encoded the ability to hydroxylate alkylated phenols with an actinobacterial-like two-component monooxygenase [16], in addition to catechol meta-cleavage. In total, thermotolerant and thermophilic Burkholderiales genomes encoded a large portion of the biogeochemical cycling and aromatic degradation at elevated temperatures, with a range of possible electron donors and acceptors allowing substantial metabolic flexibility.

Lignin incubations
To characterize thermotolerant modification of lignin and the catabolism of lignin-derived aromatic compounds, we incubated Epsilon pool (45°C) sediment with one of three preparations of lignin (EMWL, EKL, and DHP lignin), VAN, or a no-carbon control. Epsilon sediment was chosen as it was the warmest undisturbed sampling location. 2D HSQC NMR analysis revealed that EMWL, generated by extracting lignin from enzymatically treated, milled Eucalyptus wood chips, was much less modified than EKL, generated by the Kraft process (Fig. 5A). Specifically, EMWL contained 51 β-aryl ether bonds per 100 aromatic subunits, while EKL contained only eight. Further, β-5 bonds were detected in EMWL but not EKL, and EMWL contained a higher S:G ratio (7:3) than EKL (6:4). EMWL, and to a lesser extent EKL, contained residual carbohydrates (annotated with an "X" in Fig. 5A). The synthetic DHP lignin contained a higher β-5 ratio than EMWL and, consistent with its synthesis, no carbohydrate and only Glignin units. Finally, EMWL and EKL contained small amounts of monoaromatic compounds, such as vanillate and syringate, as well as aromatic oligomers, while DHP lignin was relatively free of these compounds. EKL specifically contained about three times as much S-ligninderived monoaromatics than G-lignin-derived compounds.
SIR during lignin incubations was monitored by quantifying CO 2 in the headspace. CO 2 production significantly increased by 168 h in the presence of EKL (240%, p = 0.0022), EMWL (170%, p = 0.079), DHP (180%, p = 0.0019), and VAN (150%, p = 0.0079) relative to NC control incubations (Fig. 5B). The aromatic constituents of the incubation supernatant were analyzed with HPLC. At time zero, we detected monoaromatic compounds in the lignin-amended samples, and a large peak provisionally assigned to aromatic oligomers based on compound size. The area of this large peak decreased~50% by 48 h. In EKL, syringate concentrations were reduced by 94% after 48 h, syringaldehyde disappeared from the supernatant after 96 h, and VAN was completely removed after 48 h.
The initial concentrations of monoaromatic compounds in lignin-amended samples (t = 0) were in the low µM range. As this is much lower than the concentration of VAN (~1.4 mM), their presence alone does not explain the measured respiration, as the lignins induced more CO 2 production than VAN. As SIR is a very broad measure of metabolic activity, we also cannot rule out the impact of the potential toxicity of VAN [60] on the CO 2 production, or the possible effect of added lignins on the degradation of endogenous carbon. Vanillate and VAN were only detected after several days of incubation with EMWL and DHP, consistent with depolymerization. However, further analysis of lignin substrates would be required to confirm the possibility of depolymerization. Such analysis was not possible in the current study as the samples were sacrificed to obtain RNA yields suitable for metatranscriptomics analysis described in the next section.

Metatranscriptomics of lignin-incubated thermal sediment
To understand how thermotolerant communities can modify lignin, and to identify mineralization pathways for LMW aromatic compounds, we sequenced samples from all incubation time points with >100 ng total RNA. Transcriptome contig taxonomy was assessed with Kaiju ("Material and methods"), as well as analysis of SGCs in MetaBat2 "transcriptomes" (Fig. 6A). Taxonomically classified transcriptomes were added to the refined and reference genomes for read alignment and quantification as above. Transcriptomic reads mapped to 37 genomes and 6 transcriptomes. GLM analysis with Tukey post hoc testing indicated that four bins had higher normalized readmapping rates in EKL relative to controls (Fig. 6B). These included two Altererythrobacter transcriptomes, LS.9 and AB.1, and one Burkholderiales genome, MB2.215. Two Burkholderiales showed higher overall gene expression in VAN incubations: the Rubrivivax transcriptome CO.1, and the C. manganoxidans genome.
Differential expression analysis was performed with DESeq2 to determine transcript-level expression patterns during lignin incubations (Fig. 6C-E). Twenty-six transcripts in aromatic degradation pathways were found to be significantly upregulated on EKL using a cutoff of p adj < 0.05, compared to only nine in EMWL and three in VAN incubations (Fig. 6C). Several upregulated transcripts were identified as homologs of syringate or vanillate degradation pathway genes (Fig. 6G). Phylogenetic analysis (Supplementary Figs. 9-14) and syntenic conservation (Fig. 6F) of the protocatechuate meta-cleavage pathway genes were assessed to validate these assignments. Transcripts assigned to this pathway were found in AB.1, LS.6, LS.9, NO.1, CO.1, and CO.2 transcriptomes (Supplementary Data 5).
The protein phylogenies derived from metagenomic and metatranscriptomic sequences for vanillate O-demethylase (LigM), syringate O-demethylase (DesA), protocatechuate 4,5-dioxygenase α (LigA) and β (LigB) subunits were assessed by sequence alignment and phylogenetic tree estimation ( Supplementary Figs. 9-14). The AB.1 and LS.9 bins encoded all of these enzymes, except LigM, which was only encoded in AB.1. Differences existed between these two bins, despite both being classified as Altererythrobacter on the basis of SCG phylogeny. The two bins contained identical desA genes; however, LS.9 encoded protocatechuate 4,5-dioxygenase subunits that clustered with those of SYK-6, while AB.1 encoded homologs that clustered with those of Altererythrobacter. Furthermore, the presence of an uncharacterized ORF directly upstream of the ligA was detected in only Altererythrobacter reference genomes and AB.1, while the corresponding genes in LS.9 were syntenous to those of SYK-6 and N. aromaticivorans (Fig. 6F).

Discussion
Thermotolerant Alphaproteobacteria and Gammaproteobacteria from a geothermal environment expressed aromatic degradation pathways for lignin-derived monoaromatic compounds and aromatic oligomers in incubation with lignin. This finding is highly relevant to understanding the ecology of carbon cycling in geothermal environments, and has substantial biotechnology implications. In this study, two Altererythrobacter ssp. transcriptomes expressed complete aromatic monomer and oligomer degradation pathways during incubation with lignin (Fig. 6), despite not being fully reconstructed by genomic analysis. The syringate meta-cleavage pathway was primary expressed on EKL, which contained about three times higher concentration of dimethoxylated S-lignin-derived substrates than Glignin monomers. Significantly elevated expression of ligAB in the Altererythrobacter ssp. transcriptomes suggests that syringate meta-cleavage in these strains proceeds via 3-Omethylgallate. Expression of desB in EKL, EMWL, and DHP also suggests that some syringate degradation occurs via gallate oxidation. Strain-level characterization of Altererythrobacter ssp. should resolve the relative contributions of these pathways to syringate catabolism, and characterization of the thermotolerance of these enzymes may provide new targets for biological upgrading of lignin-derived compounds.
No clear extracellular lignin-depolymerization systems were expressed in the metatranscriptomes, consistent with the apparent low mineralization of intact lignins. However, transcripts of two CopA-type LCMOs and Q23D were present in the Altererythrobacter transcripts. Although CopA-type LCMOs are predicted to play a role in copper detoxification, several of these enzymes have shown lignin oxidation activity [65,66]. Similarly, a Q23D was associated with lignin depolymerization in Pseudomonas putida KT2440 [64], but the physiological role and substrate range of these enzymes remain uncharacterized. Finally, Altererythrobacter expressed genes associated with the intracellular degradation of the lignin-derived oligomers, e.g., guaiacylglycerol β-guaiacyl ether and DDVA. To date, aromatic oligomer depolymerization pathways have been identified only in the widely studied sphingomonads, SYK-6, and N. aromaticivorans DSM 12446 [19,20,61,62,[67][68][69]. Altererythrobacter ssp. may also use Nu-class GSTs to break β-O-4 bonds [62]. As in other sphingomonads, aryl O-demethylation may supply essential C1 metabolites in Altererythrobacter strains, based on the prevalence of expressed tetrahydrofolate-dependent O-demethylases during lignin degradation [70]. Altererythrobacter spp. are typically associated with oligotrophic environments such as deep-sea sediments [71] and 45°C hot springs [72]. Our metatranscriptomic experiment provides evidence that Altererythrobacter spp. catabolizes lignin-derived monoaromatic compounds via syringate and protocatechuate meta-cleavage pathways, and lignin-derived aromatic oligomers through several pathways including GST β-etherases.
Genes encoding aromatic catabolic pathways, including ring-cleaving dioxygenases and GSTs, are associated with genomic adaptation to oligotrophic environments [12]. These enzymes may serve to supply carbon and energy, or carry out detoxification reactions. While we assembled protocatechuate dioxygenases in both 30 and 45°C pools, the genes encoding ortho-and meta-cleavage enzymes were, respectively, encoded in Alphaproteobacteria (caulobacteria, Xanthobacteraceae) and Burkholderiales at each temperature. Additionally, the putative catabolism of alkylated phenols was encoded in thermophilic Chloroflexota. This is the first report of homologs for alkylated phenol hydroxylation encoded in Chloroflexota genomes, although catechol meta-cleavage potential was previously reported for pelagic Chloroflexota genomes [78]. These data demonstrate that biochemical transformations may be encoded by phylogenetically distinct organisms adapted to specific temperatures. The abundant Burkholderiales described herein encode multiple protocatechuate cleavage pathways and diverse energy generation mechanisms, and appear to be adapted to these highly oligotrophic geothermal environments. However, our metatranscriptomics data indicate that syringate and protocatechuate meta-cleavage pathways are the primary mechanisms for aromatic catabolism in these thermal swamp communities.
In this study, we demonstrate that genomic reconstruction could only partially distinguish aromatic degradation pathways, which co-occurred with pathways for facultative-photoautotrophy, halogen respiration, and arsenate detoxification in refined Burkholderiales (Rubrivivax sp.) and Chloroflexota (Roseiflexaceae) genomes. However, the targeted combination of metatranscriptomics and metabolite analysis indicated that significantly elevated expression of Rubrivivax sp. and sphingomonad (Altererythrobacter sp., Novosphingobium sp.) aromatic degradation pathways were associated with removal of lignin-derived substrates at 45°C. These results supported our hypothesis that thermal swamp communities contain untapped sources of thermotolerant enzymes for lignin valorization, including copper oxidases, GSTs, and Q23Ds. The apparent preference for G-and S-lignin substrates between strains in families Sphingomonadaceae and Comamonadaceae, respectively, has implications for the use of these organisms in upgrading lignin-derived compounds. Additional approaches to characterizing hot-spring microorganisms, including isolation of identified strains, and characterization of putative lignin-depolymerization enzymes, will be critical to improve our understanding of the contribution of thermotolerant bacteria to degradation of lignin-derived aromatic compounds.

Data availability
Sequence accessions are provided in Supplementary Data 2 and as part of NCBI BioProject PRJNA564648 (SRR10095339-SRR10095374). Refined genome information and NCBI accessions are provided in Supplementary Data 3.
Author contributions DJLB designed and implemented field sampling and experiments, performed genomic and transcriptomics analysis, and wrote the paper. AH performed metabolic pathway annotation and edited the paper. RR performed HPLC metabolite analysis and edited the paper. LYL and SR performed lignin chemistry analysis. LDE designed experiments and edited the paper. WWM designed experiments and co-wrote the paper.

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons. org/licenses/by/4.0/.