Carbon fixation rates in groundwater similar to those in oligotrophic marine systems

The terrestrial subsurface contains nearly all of Earth’s freshwater reserves and harbours the majority of our planet’s total prokaryotic biomass. Although genetic surveys suggest these organisms rely on in situ carbon fixation, rather than the photosynthetically derived organic carbon transported from surface environments, direct measurements of carbon fixation in the subsurface are absent. Using an ultra-low level 14C-labelling technique, we estimate in situ carbon fixation rates in a carbonate aquifer. We find these rates are similar to those measured in oligotrophic marine surface waters and up to six-fold greater than those observed in the lower euphotic zone. Our empirical carbon fixation rates agree with nitrification rate data. Metagenomic analyses reveal abundant putative chemolithoautotrophic members of an uncharacterized order of Nitrospiria that may be behind the carbon fixation. On the basis of our determined carbon fixation rates, we conservatively extrapolate global primary production in carbonate groundwaters (10% of global reserves) to be 0.11 Pg carbon per year. These rates fall within the range found for oligotrophic marine surface waters, indicating a substantial contribution of in situ primary production to subsurface ecosystem processes. We further suggest that, just as phototrophs are for marine biogeochemical cycling, such subsurface carbon fixation is potentially foundational to subsurface trophic webs.

The continental subsurface is the planet's largest carbon reservoir 1, housing up to 19% of its total biomass 2,3 and 99% of its freshwater 4. Despite accounting for only 6% of total stores, modern groundwater (the fraction accrued in aquifers over the past 50 years) is the single most significant source of potable water. Carbonate karst aquifers alone are thought to supply people with nearly 10% of their drinking water 5. Unfortunately, modern groundwater is also the most vulnerable to anthropogenic and climatic impacts 4. While subsurface ecosystems have long fascinated ecologists 6, and more recently microbiologists 7, limited accessibility, enormous spatial heterogeneity and challenges in interpreting process rate measurements have obscured a meaningful understanding of their contributions to global biogeochemical cycles 8.
The widespread recognition that Earth's biosphere extends deep into the subsurface occurred only recently 9. Historically, carbon supply in such environments was thought to be limited to the trickling of surface-produced organic matter into the shallow subsurface 10 or what was stored within sedimentary rocks 11. By stark contrast, a wealth of compelling genetic evidence suggests that in situ carbon fixation is critical for sustaining highly diverse microbial metabolic networks in groundwater, in both the shallow and the deep subsurface 12-19. Despite the implications of gene-based surveys, the empirically derived activity measurements required to corroborate such inferences, constrain biogeochemical fluxes, understand system dynamics and integrate processes into regional and global models have yet to be reported. Here we report our use of a novel radiocarbon method to derive empirical carbon fixation rates and place them in the context of global groundwater.

Groundwater CO2 fixation rates resemble marine waters
In the shallow subsurface (groundwater wells 5-90 m deep), experimental carbon fixation rates varied from 0.043 ± 0.01 to 0.23 ± 0.10 μgC l−1 d−1 (mean ± 1 standard deviation (s.d.); Fig. 1a, Extended Data Table 1 and Supplementary Information). The ultra-low level 14C-labelling approach developed in this investigation exploits the high sensitivity of accelerator mass spectrometry, thereby minimizing impacts to groundwater hydrochemical equilibria and affording shorter incubation times. This method is particularly useful within a carbonate geological setting, where high dissolved inorganic carbon (DIC) backgrounds and a scarcity of microbes warrant greater sensitivity than is achievable via scintillation-based 14C-labelling approaches. Rates resulting from our new labelling technique probably approximate net primary productivity rather than gross productivity, as has been reported for marine systems 20,21, and we further expect them to be conservative estimates for carbon fixation as they consider only contributions from the planktonic portion of the community (Supplementary Information).
We compared these carbon fixation rates, which were measured in groundwater of varying biogeochemical characteristics 22, with the only other subsurface 14CO2 assimilation measurements reported: those of a deep (830-1,078 m) groundwater borehole from crystalline bedrock in Sweden 23. To do so, we converted the published rates of isotopic incorporation to carbon equivalents, revealing the lower but overlapping range of 0.0095 to 0.0560 μgC l−1 d−1.
To better understand the relevance of the rates measured, we compared them with those of well-documented oligotrophic marine surface waters. Unlike our samples, the carbon fixed in these waters was sourced almost entirely by bacterial photoautotrophs 24,25. When compared directly with a comprehensive dataset compiled by ref. 26, our rates overlapped with those of global marine waters at depths to 140 m, equating to roughly 10% of the reported global median for 0-20 m depths (2.65 μg l−1 d−1, interquartile range (IQR) = 1.74, 6.02) and 20% of the median for 20-140 m depths (1.2 μg l−1 d−1, IQR = 0.6, 1.7). Comparisons with the extensively studied Sargasso Sea in the Bermuda Atlantic Time-series Study (BATS) 27,28 and the Hawaii Ocean Time-series (HOT) 29 datasets yielded similar findings (Fig. 2). Our rate measurements ranged between 3% and 23% of the median reported net primary productivity in the upper euphotic zones (down to ~120 m) and between 20% and 600% of the median of the lower euphotic region (100-120 m).
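The comparison with the global marine medians can be retraced with simple arithmetic. The following is a minimal sketch, not part of the original analysis; it uses only the rate range and medians quoted above, and the function name is illustrative.

```python
# Measured groundwater range and marine medians as quoted in the text (μgC l−1 d−1).
GROUNDWATER_RANGE = (0.043, 0.23)
MARINE_MEDIANS = {"0-20 m": 2.65, "20-140 m": 1.2}


def percent_of_median(rate, median):
    """Express a groundwater rate as a percentage of a marine median rate."""
    return 100.0 * rate / median


for layer, median in MARINE_MEDIANS.items():
    low, high = (percent_of_median(r, median) for r in GROUNDWATER_RANGE)
    print(f"{layer}: {low:.1f}% to {high:.1f}% of the marine median")
```

The upper end of the measured range comes out at about 9% of the 0-20 m median and about 19% of the 20-140 m median, which corresponds to the "roughly 10%" and "20%" quoted in the text.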
We also considered contributions to existing particulate organic carbon (POC) stocks and new carbon inputs per microbial cell count. After normalizing for estimated total bacterial cell numbers, groundwater yielded 0.3-10.8 fg fixed carbon per bacterial cell per day (Extended Data Table 2), which matched estimates of 0.25-12.1 fgC per bacterial cell per day across the marine photic zone (5-150 m). However, groundwater received new daily carbon inputs of only 0.47% ± 0.22% (mean ± s.d.; Extended Data Table 2) of its existing POC, much lower than the marine system's 2.6% ± 2.9% gain in the lower euphotic zone and 22% ± 18% at the surface 30,31. This disparity might stem from the larger recalcitrant fraction of POC in groundwater compared with oligotrophic oceans, which is supported by deviations in 14C and 13C signatures of total groundwater POC concentrations compared with lipid signatures of resident microbes 32.

An ecosystem dominated by chemolithoautotrophs
To identify dominant microbial primary producers, a dereplicated and quality-controlled set of 1,224 metagenome-assembled genomes (MAGs) was generated from groundwater samples. Of these, 102 putative chemolithoautotrophs exhibiting at least 50% completion scores for carbon fixation pathways were identified (Fig. 3). Almost exclusively bacterial (101), these MAGs represented 17 distinct phyla, 21 classes and 35 families (Fig. 3). The single non-bacterial MAG carried gene products for the 4-hydroxybutyrate/3-hydroxypropionate pathway and was not relatively abundant (<5× maximum normalized coverage). Three chemolithoautotrophic pathways were detected (Fig. 1b): the Calvin-Benson-Bassham (CBB), Wood-Ljungdahl (WL) and reverse tricarboxylic acid (rTCA) cycles were present in 37, 50 and 15 MAGs, respectively. The summed and normalized relative coverages of MAGs equipped with these metabolic pathways aligned with the carbon fixation rates measured in wells H52, H32 and H14 while contrasting with rate data from wells H41 and H43 (Fig. 1, Extended Data Fig. 2 and Supplementary Information). The greatest relative abundances of predicted chemolithoautotrophs were detected in oxic well H41 and anoxic well H52.

Uncharacterized microbes influence CO2 fixation potential
The most abundant putative chemolithoautotrophic populations represented by MAGs generated from anoxic groundwater were of poorly studied and/or uncharacterized microbial lineages. Those most abundant in oxic groundwaters, however, were phylogenetically and metabolically similar to well-characterized microbes (Supplementary Information). In both cases, metabolic reconstructions suggested that dominant subpopulations could access a diverse suite of (in)organic electron acceptors and donors. We mapped previously generated transcripts to these MAGs to confirm the active expression of gene products involved in energy acquisition and carbon fixation. As opposed to the broad distributions posited by DNA-based abundances, transcript data revealed far more restrictive ranges in which specific gene products were favoured (Fig. 3). Given their metabolic versatility and the results of previous cultivation-based analyses 33, these populations are expected to be mixotrophic (capable of supplementing carbon requirements with available organic matter). Overall, carbon fixation in anoxic groundwater was predicted to be fuelled by reduced sulfur, and three highly abundant, putative sulfur-oxidizing MAGs were identified, each accounting for >2% of the total metagenomic reads in some samples (100-400× normalized coverages). Diverse reduced sulfur species fuelling these metabolisms are released from pyrite weathering of the karst rock 34.
The most abundant MAG (H51-bin250-1) encountered in this study belongs to a deep-branching order, 9FT-COMBO-42-15, of class Nitrospiria and is the first representative of this class thought to fix carbon via the WL pathway (Fig. 3, Fig. 4a and Supplementary Information). As there is precedent for autotrophic WL-utilizing bacteria within phylum Nitrospirota, and Candidatus Magnetobacterium was characterized with an equally flexible metabolism, these traits may be more widespread within the phylum than previously thought 35. In addition, two MAGs with the potential to couple sulfur oxidation to carbon fixation via the CBB cycle were identified as members of the Sulfurifustaceae family of Proteobacteria (Supplementary Information and Fig. 4b; H32-bin014, H32-bin069). These MAGs recruited tenfold more transcripts than their Nitrospirota counterparts and were among the most transcriptionally active putative chemolithoautotrophic genomes detected. Planctomycetota MAGs, predicted to couple anaerobic ammonium oxidation to carbon fixation via the WL pathway, exhibited mean transcriptional activities on par with their Sulfurifustaceae MAG counterparts (Figs. 3 and 4c, Extended Data Fig. 5 and Supplementary Information). The elevated transcriptional activity of gene products within the CBB and WL pathways suggests that these taxa play a disproportionately large role in chemolithoautotrophy relative to their DNA-based abundances. Surprisingly, all putative anammox MAGs detected were transcriptionally active in oxic groundwater (Figs. 3 and 4b; wells H41 and H51). Anammox reactions are typically inhibited in the presence of oxygen 39, although microbes will still express critically important genes in low oxygen environments 40,41.

N-based rate measurements validate carbon fixation rates
To evaluate the relationship between anammox and carbon fixation in anoxic groundwaters, we compared the rates of each in a well harbouring the greatest relative abundance of putative anaerobic ammonium-oxidizing bacteria (well H52). Here, anammox rates of 1.2 ± 0.5 nmol N2 l−1 d−1 were measured, similar to rates in another freshwater aquifer 42. Empirical stoichiometric data demonstrate that 1.02 moles of N2 is produced via anaerobic ammonium oxidation for every 0.066 moles of carbon reduced to biomass (CH2O0.5N0.15) 43. Assuming equivalent stoichiometry, the rate of carbon fixation via anammox in groundwater would be 0.93 ± 0.39 ngC l−1 d−1, more than 200 times lower than the 220 ngC l−1 d−1 measured. This result is corroborated by metagenomic data that suggest the high rate of carbon fixation in anoxic groundwater is more likely driven by reduced sulfur than by reduced nitrogen.
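The stoichiometric conversion from the anammox rate to an equivalent carbon fixation rate can be written out explicitly. The sketch below is illustrative only: it assumes a carbon atomic mass of 12.011 g mol−1, one carbon atom per biomass formula unit, and linear scaling of the reported standard deviation; the function name is hypothetical.

```python
# Stoichiometry quoted in the text: 1.02 mol N2 produced per 0.066 mol C fixed.
N2_PER_REACTION = 1.02
C_PER_REACTION = 0.066
C_ATOMIC_MASS = 12.011  # g mol-1, numerically equal to ng nmol-1 (assumed value)


def anammox_carbon_fixation(n2_rate_nmol_per_l_day):
    """Convert an anammox N2 production rate (nmol l-1 d-1) to ngC l-1 d-1."""
    carbon_nmol = n2_rate_nmol_per_l_day * C_PER_REACTION / N2_PER_REACTION
    return carbon_nmol * C_ATOMIC_MASS


rate = anammox_carbon_fixation(1.2)        # mean anammox rate in well H52
spread = anammox_carbon_fixation(0.5)      # s.d. scaled linearly (assumption)
measured_fixation = 220.0                  # ngC l-1 d-1 measured in well H52

print(f"anammox-supported fixation: {rate:.2f} ± {spread:.2f} ngC l-1 d-1")
print(f"measured / anammox-supported: {measured_fixation / rate:.0f}-fold")
```

With these inputs the conversion gives 0.93 ± 0.39 ngC l−1 d−1, and the measured rate exceeds it by roughly 236-fold, in line with the "more than 200 times" stated above.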
Metagenomic and metatranscriptomic data predicted that nearly all the organic carbon produced under oxic conditions in well H41 would be coupled to nitrification. To test this, we monitored the rate of aerobic ammonium oxidation in this well and recorded a mean production of 125.8 ± 5.9 nmol NO2− l−1 d−1. Since the most abundant nitrifiers detected were most closely related to complete ammonium-oxidizing bacteria (Supplementary Information), we based our calculations on the 394 mg protein per mol of ammonia growth yields of Nitrospira inopinata, a comammox organism 44. Assuming a cellular composition of C5H7O2N (ref. 45) and 55% protein content, we estimated a rate of 48.5 ± 1.9 ngC l−1 d−1, which was well within the range of error for our measured rate of 43 ± 13 ngC l−1 d−1 and confirms the importance of nitrification for carbon fixation at this site. Furthermore, the stoichiometry determined for oligotrophic marine rTCA nitrifiers 46 of 0.0216 mol C/mol N fell within the error of our calculated ratio of 0.0276 ± 0.0084, indicating that such nitrifiers are responsible for most of the fixed carbon.
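The yield-based estimate can be approximated as follows. This is a sketch under stated assumptions rather than the authors' calculation: it treats the 125.8 nmol value as a per-litre, per-day rate, uses standard atomic masses for C5H7O2N, and takes the 55% protein content as a mass fraction of dry biomass. It is not expected to match the reported 48.5 ngC l−1 d−1 to the last digit.

```python
# Inputs quoted in the text.
PROTEIN_YIELD_MG_PER_MOL_N = 394.0   # Nitrospira inopinata growth yield
PROTEIN_FRACTION = 0.55              # protein share of biomass (assumed mass fraction)

# Standard atomic masses (assumed values) for the C5H7O2N biomass formula.
MASS = {"C": 12.011, "H": 1.008, "O": 15.999, "N": 14.007}
BIOMASS_FORMULA = {"C": 5, "H": 7, "O": 2, "N": 1}


def carbon_mass_fraction(formula):
    """Mass fraction of carbon in a biomass formula."""
    total = sum(MASS[element] * count for element, count in formula.items())
    return MASS["C"] * formula["C"] / total


def carbon_fixed_ng(ammonia_oxidized_nmol):
    """Estimate ngC fixed for a given amount of ammonia oxidized (nmol)."""
    biomass_mg_per_mol = PROTEIN_YIELD_MG_PER_MOL_N / PROTEIN_FRACTION
    carbon_mg_per_mol = biomass_mg_per_mol * carbon_mass_fraction(BIOMASS_FORMULA)
    # 1 mg mol-1 equals 1e-3 ng nmol-1.
    return ammonia_oxidized_nmol * carbon_mg_per_mol * 1e-3


estimate = carbon_fixed_ng(125.8)
print(f"nitrification-supported fixation: about {estimate:.1f} ngC l-1 d-1")
```

Under these assumptions the estimate is close to 48 ngC l−1 d−1, which sits near the reported value and within the error of the measured 43 ± 13 ngC l−1 d−1.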

Global estimates for groundwater primary productivity
There are an estimated 22.6 million km3 of groundwater on Earth 4, 2.26 and 12.66 million km3 of which are housed in carbonate and crystalline aquifers, respectively. If we assume that our average rates accurately represent carbonate groundwater systems, then 0.108 ± 0.069 PgC (mean ± s.d.) is fixed every year in this global ecosystem (Extended Data Table 3). If the values reported from crystalline aquifers 23 are representative of this environment, then another 0.15 ± 0.11 PgC would be fixed there annually. Collectively, the net primary productivity of ~66% of the planet's groundwater reservoirs would total 0.26 PgC yr−1, approximately 0.5% that of marine systems and 0.25% of global NPP estimates 47. As these projections exclude the contributions from groundwaters within siliciclastic and volcanic geologic settings and the activities of attached microorganisms, global contributions to the carbon cycle are expected to be many-fold higher.
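The upscaling from a volumetric rate to an annual global flux is a unit conversion, sketched below. The sketch is illustrative and rests on assumptions not stated in the text: a 365-day year, and the two endpoints of the crystalline-aquifer rate range (0.0095 and 0.0560 μgC l−1 d−1) used as example inputs. The carbonate mean rate is not given in the main text, so it is back-calculated here from the reported 0.108 PgC yr−1.

```python
LITRES_PER_KM3 = 1e12
GRAMS_PER_PG = 1e15
DAYS_PER_YEAR = 365  # assumption


def annual_fixation_pg(rate_ug_per_l_day, volume_million_km3):
    """Scale a rate in μgC l-1 d-1 over a water volume to PgC yr-1."""
    litres = volume_million_km3 * 1e6 * LITRES_PER_KM3
    grams_per_year = rate_ug_per_l_day * 1e-6 * DAYS_PER_YEAR * litres
    return grams_per_year / GRAMS_PER_PG


def implied_rate_ug_per_l_day(pg_per_year, volume_million_km3):
    """Invert the scaling: the mean rate implied by a global annual flux."""
    litres = volume_million_km3 * 1e6 * LITRES_PER_KM3
    return pg_per_year * GRAMS_PER_PG / (litres * DAYS_PER_YEAR) * 1e6


# Crystalline aquifers (12.66 million km3): range endpoints from the text.
print(annual_fixation_pg(0.0095, 12.66), annual_fixation_pg(0.0560, 12.66))

# Carbonate aquifers (2.26 million km3): mean rate implied by 0.108 PgC yr-1.
print(implied_rate_ug_per_l_day(0.108, 2.26))

# Share of global groundwater (22.6 million km3) covered by both settings.
print((2.26 + 12.66) / 22.6)
```

The crystalline endpoints give roughly 0.04 to 0.26 PgC yr−1, bracketing the reported 0.15 ± 0.11 PgC; the implied carbonate mean rate is about 0.13 μgC l−1 d−1, within the measured range; and the two settings together account for about 66% of global groundwater.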
We showed that conservative estimates of carbon fixation rates in a carbonate aquifer reached 10% of the median rates reported in oligotrophic marine surface waters and were up to six-fold greater than those observed in the lower euphotic zone. Within oxic groundwaters, our carbon fixation method was independently validated by nitrification rate measurements. Normalizing carbon fixation rates by estimated bacterial numbers revealed equivalent carbon input (0.3-12 fgC per cell) for both marine and groundwater systems, despite the fact that daily inputs of new POC were 40 times greater in marine waters. This disparity makes sense since trophic webs are simpler in the subsurface, and the export of organic matter is constrained by long water residence times within the aquifer. Complementary metagenomic analyses revealed that groundwater carbon fixation is not dominated by a single functional guild but rather has contributions from diverse pathways and versatile microorganisms that are setting specific. As the majority of photosynthetically derived carbon in marine systems is labile (half-life <1 day), the findings of this study solicit new hypotheses regarding carbon cycling in the subsurface, particularly those positing newly synthesized carbon rather than surface-derived organic matter as the primary source of fuel for microbiota. Indeed, subsurface primary producers need to be considered as important to ecosystem processes as marine phototrophs are known to be in the surface ocean. Applying these rates of carbon fixation to ecosystem processes alters the way we think about these environments, challenges the importance of surface-derived organic matter fluxes on shallow subsurface functioning and establishes a framework broadly applicable across groundwater systems.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41561-022-00968-5. Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. © The Author(s) 2022

Methods
Site description. Groundwater samples were collected from the Hainich Critical Zone Exploratory (NW Thuringia, Germany) 22,48,49. This aquifer assemblage consists of a multistory fractured system composed of alternating layers of limestone and mudstone that developed along a hillslope of Upper Muschelkalk bedrock 22. The primary aquifer, represented in this study by wells H41 and H51, is oxic and lies within the Trochitenkalk Formation (moTK). Primarily suboxic to anoxic, mudstone-dominated overhanging strata lie within the Meissner Formation (moM) and are represented here by wells H14 (moM-substory 1), H32 (moM-5, 6, 7), H43 (moM-8) and H52 (moM-3, 4). Geochemically, H32 and H41 coalesce into a single cluster while each of the other wells represents a distinct regime. Consistent with previous microbiological characterizations, however, each well studied represented a distinct community state 50.

14C-DIC incorporation assay. This method, similar to a sensitive methane oxidation technique previously described 51, is a modification of traditional 14C-CO2 primary productivity approaches 52 predicated on the sensitivity offered by accelerator-based mass spectrometry. Groundwater was collected in July 2020 during sampling campaign PNK130, as described by ref. 19. After approximately three well volumes had been discharged and physicochemical parameters stabilized, groundwater was collected directly into nine pre-sterilized 2 l borosilicate bottles, from the bottom up. Bottles were then overfilled with greater than two volumes and sealed with gas-tight rubber stoppers. Triplicate samples from each well were then subjected to three treatments. A labelling treatment consisted of 6.77 × 10−7 mmol C-NaHCO3 that contained 200 Bq of activity (50 μCi; American Radiolabeled Chemicals) diluted to 9.38 Bq μl−1 with sterilized milliQ water, adjusted to pH 10 and verified using a scintillation counter.
An advantage of this 14C technique is that the small amount of tracer added (representing 0.000006% of the total DIC) did not change the substrate concentration or influence conditions such as pH that could affect microbial populations. Kill controls were prepared in the same way, except 10 ml 50% ZnCl2 (w/v; final concentration 36.7 mM) was added to inhibit microbial activity. Unamended groundwater was also used as a control. All bottles were incubated in the dark at near in situ temperature for ~24 hours. Entire volumes were acidified to pH 4 with 3 M HCl, bubbled with N2 for one hour to remove DIC and then filtered through pre-baked (550 °C, eight hours) quartz fibres (47 mm, 0.3 µm pore size, Macherey-Nagel QN-10) using pre-baked filter stands (EMD Millipore).
Filters were vacuum dried, sealed in quartz tubes with cupric oxide wire under vacuum and combusted at 900 °C for two hours. Evolved CO2 was purified cryogenically, measured as pressure in a known volume to determine C content and reduced to graphite for measurement by accelerator mass spectrometry at the WM Keck Carbon Cycle Accelerator Mass Spectrometry facility 53. From the label incorporation and amount of carbon retained on the filters (Supplementary Data File 2), fixation rates were calculated using equation (1):

The technical variation was at most 3.6% (median = 0.78%) of the biological variation for the 14C measurements and was not considered in standard error of the mean calculations. Standard error of the mean was determined for both the 14C-based measurements (difference between two sets of triplicates, label and control, or label and kill controls) and POC measurements (all nine bottles from each well), separately. These errors were then propagated to yield the final error estimations. Analyses of variance and post hoc Tukey honestly significant difference (HSD) tests were conducted on resulting summary statistics (mean ± s.e.m.) using the following utility 54. All 14C enrichment values were calculated using the differences between the 200 Bq-labelled samples and the 200 Bq-labelled kill controls. Rates calculated on the basis of no-label addition controls are presented in Extended Data

N-isotope incubation experiments. Groundwater from wells H41 and H52 was collected in September 2018 and November 2018 to measure nitrification rates and anammox rates, respectively. Briefly, groundwater was collected into sterile glass bottles, from the bottom up, using a sterile tube. Bottles were then overfilled with three volume exchanges and sealed headspace free with silicone septa. Each sample was collected in triplicate alongside one control bottle per well. Samples were kept at 4 °C until they were processed (no more than 2 hours post-collection).
For nitrification measurements, 10 ml was removed from each sampling bottle (total volume 0.5 l) and replaced with N2 to analyse inorganic nitrogen and pH. Groundwater from control bottles was sterile filtered through a 0.2 µm filter (Supor, Pall Corporation). Sterile filtered 15N-ammonium sulfate solution (98%, Cambridge Isotope Laboratories), serving as a substrate for ammonia-oxidizing prokaryotes, was then added to a final concentration of 50 µM. Samples were incubated at 15 °C in the dark without agitation for five days. Ten-millilitre fractions were removed via filtration through 0.2 µm filters and replaced with N2 at the outset of the experiment and after 12, 24, 48, 70 and 120 hours; these fractions were stored at -20 °C for isotopic ratio mass spectrometry analyses. Additional 10 ml fractions were removed at intervals to monitor pH and inorganic nitrogen during the incubation.
For anammox rate measurements, sampling bottles (total volume 1 l) were flushed with N2 under sterile conditions for 30 minutes to remove all remnants of oxygen. Five-millilitre fractions were removed and replaced with N2 from each sample (and control) bottle to assess background 14NH4+ concentrations. Subsequently, samples were spiked with either (1) 50 µM 15NH4+ + 5 µM 14NO2− or (2) 5 µM 15NO2− as previously described 58. Control bottles, serving as abiotic controls, were sterile filtered (0.2 µm filters; Supor, Pall Corporation) before flushing and the addition of nitrogen compounds. To facilitate destructive sampling at eight time points, groundwater (30 ml; in triplicate) was dispensed into sterile serum bottles leaving ~8 ml of headspace. Bottles were immediately sealed with butyl septa and crimp sealed and the headspace was purged with He. All bottles were then incubated in the dark at 15 °C without agitation, and incubations were terminated after 0, 12, 24, 36, 48, 60, 72 and 96 hours by adding 300 µl 50% (w/v) aqueous zinc chloride solution.
Nitrification rates were determined on the basis of 15NO2− + 15NO3− production in incubations with 15NH4+. 15NO2− and 15NO3− were converted to N2 via cadmium reduction followed by a sulfamic acid addition 59,60. The N2 produced (14N15N and 15N15N) was analysed on a gas chromatography isotope ratio mass spectrometer as previously described 61. Rates were evaluated from the slope of the linear regression of 15N produced with time and corrected for the fraction of the NH4+ pool labelled in the initial substrate pool. The production of 15N-labelled N2 from anammox was analysed on the same isotopic ratio mass spectrometer as for nitrification rates and calculated as described 62. Note that denitrification was not detected in any of the 15NO2− incubations. We applied t-tests (P < 0.05) to assess whether rates were significantly different from zero (Extended Data Fig. 3).
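The rate calculation described above (regression slope of 15N produced against time, corrected for the labelled fraction of the NH4+ pool) can be sketched as follows. This is an illustration under assumptions, not the authors' code: the time series and the labelled fraction of 0.96 are hypothetical values, the time points follow the nitrification sampling schedule, and an ordinary least-squares fit is assumed.

```python
def least_squares_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return sxy / sxx


def nitrification_rate_per_day(hours, n15_produced, labelled_fraction):
    """Daily rate from the hourly slope, corrected for the labelled NH4+ fraction."""
    slope_per_hour = least_squares_slope(hours, n15_produced)
    return slope_per_hour * 24.0 / labelled_fraction


# Sampling times (hours) from the nitrification incubations.
hours = [0, 12, 24, 48, 70, 120]
# Hypothetical 15N product concentrations (nmol l-1), perfectly linear for clarity.
n15_produced = [1.0 + 0.2 * h for h in hours]

rate = nitrification_rate_per_day(hours, n15_produced, labelled_fraction=0.96)
print(f"nitrification rate: {rate:.2f} nmol N l-1 d-1")
```

Dividing by the labelled fraction scales the 15N-based slope up to total nitrogen turnover, since unlabelled 14NH4+ already present in the groundwater is oxidized alongside the tracer.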
DNA extraction and sample preparation. Samples used to generate metagenomic libraries were collected in January 2019 during sampling campaign PNK 110. For each sample replicate, approximately 50-100 l of groundwater was filtered sequentially through 0.2-µm- and 0.1-µm-pore-size polytetrafluoroethylene (PTFE) filters (142 mm, Omnipore Membrane, Merck Millipore; Supplementary Data File 3). With the exception of H32 (which did not yield sufficient volumes), each well was sampled in triplicate. H32 was duplicated using a sample previously collected during campaign PNK108 (November 2018). Filters were frozen on dry ice and stored at -80 °C before extraction. DNA was extracted using a phenol-chloroform-based method, as previously described 63, and resulting DNA extracts were purified using a Zymo DNA Clean & Concentrator kit. Metagenome libraries were generated with a NEBNext Ultra II FS DNA library preparation kit, in accordance with manufacturer's protocols. DNA fragment sizes were estimated using an Agilent Bioanalyzer DNA 7500 instrument with High Sensitivity kits depending on DNA concentrations and recommendations of protocols.

Metagenomic assembly and binning. Adaptors were trimmed and raw sequences subjected to quality control processing using BBduk v.38.51 64. Assembly and binning were performed as previously described 65. Briefly, all libraries were independently assembled into scaffolds using metaSPAdes v.3.12 66, all of which were taxonomically classified per ref. 65. For individual assemblies, open reading frames (ORFs) were identified using Prodigal v.2.6.3 in meta mode 67. To generate coverage profiles, all quality-assessed and quality-controlled (QAQC) sequences from each of the 32 metagenomic libraries were mapped back to each of the 32 scaffold databases using Bowtie2 v.2.3.4.3 in the sensitive mode 68.
Characterizations of the MAGs. ORFs originating from all of the resulting MAGs were annotated using kofamscan 74 with the 'detail' flag, and KO annotations were filtered using a custom script (https://git.io/JtHVw). This utility preserves hits with scores of at least 80% of the kofamscan defined threshold, as well as those exhibiting a score >100 if there is no threshold. We elected to relax the default thresholds since all MAGs representing putatively chemolithoautotrophic microbes were verified manually, and we noticed that the best reciprocal blast hits with known reference sequences routinely scored below the kofamscan thresholds; that is, we favoured false positives over false negatives since we included a secondary verification step.
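The relaxed filtering rule described above can be expressed as a small predicate. The sketch below is not the custom script referenced in the text; the function name, parameter names and the representation of a missing threshold as None are assumptions made for illustration.

```python
RELAXATION = 0.8        # keep hits scoring at least 80% of the defined threshold
FALLBACK_SCORE = 100.0  # minimum score when no threshold is defined


def keep_hit(score, threshold=None):
    """Return True if a KO annotation hit passes the relaxed filter.

    `threshold` is the kofamscan-defined score threshold for the KO,
    or None when the KO has no defined threshold (assumed representation).
    """
    if threshold is None:
        return score > FALLBACK_SCORE
    return score >= RELAXATION * threshold


# Hypothetical hits: (KO identifier, score, threshold).
hits = [
    ("K00001", 85.0, 100.0),   # kept: at least 80% of the threshold
    ("K00002", 79.9, 100.0),   # dropped: below 80% of the threshold
    ("K00003", 120.0, None),   # kept: no threshold, score above 100
    ("K00004", 100.0, None),   # dropped: no threshold, score not above 100
]
kept = [ko for ko, score, threshold in hits if keep_hit(score, threshold)]
print(kept)
```

Relaxing the cut-off in this way trades specificity for sensitivity, which is acceptable here only because every putative chemolithoautotroph was subsequently verified manually.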
KEGGDecoder 75 was used to assess the metabolic potential of five of the primary chemolithoautotrophic pathways: the CBB cycle, the WL pathway, the reverse citric acid cycle, the 4-hydroxybutyrate 3-hydroxypropionate pathway and the 3-hydroxypropionate bi-cycle. MAGs were examined in greater depth if a given pathway was >50% complete. The MAGs representing potential chemolithoautotrophs were re-annotated using the online BlastKoala server 76 with essential steps verified through blast 77 against the RefSeq database. A collection of HMM models was used to determine which form of Rubisco was detected, along with potential hydrogenases 37 . Using blastp 77 , dissimilatory bisulfite reductases (dsrAB) were compared with a database compiled by ref. 78 to predict whether the pathway operated in an oxidative or reductive manner. Blast was used to compare gene hits for narGH/nxrAB (nitrate reductase/nitrite oxidoreductase) with a custom database based on sequences presented within ref. 79 .
All QAQC reads were remapped to a database consisting of only contigs of dereplicated MAGs. Normalized coverages for each of the MAGs were determined by scaling the resulting Anvi'o-determined coverages on the basis of the number of RNA polymerase B (rpoB) genes identified in the QAQC filtered reads. RpoB sequences were identified using ROCker with the precomputed model 80. Scaling factors were calculated by dividing the maximum number of rpoB identified in the 32 metagenomic libraries by the number of rpoB detected in each sample. Reported values represent averages of the triplicates/replicates, unless stated otherwise. The taxonomy of each MAG was evaluated using the GTDB-Tk toolkit 81 in concert with the Genome Taxonomy Database (release 89) 82,83 and its associated utilities 67,84-88. Single-copy marker genes were identified and aligned with GTDB-Tk for all bacterial MAGs, and a phylogenetic tree of the concatenated alignment was constructed using FastTree2 v.2.1.10 in accordance with the JTT + CAT evolutionary model. The resulting phylogenetic tree was then imported into iTOL 89 for visualization, and all MAGs were subjected to growth rate index analysis within each metagenomic library 90.
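The rpoB-based normalization amounts to multiplying each library's coverages by a library-specific scaling factor. The following sketch illustrates the calculation with hypothetical library names, rpoB counts and coverages; it is not the authors' pipeline.

```python
def rpob_scaling_factors(rpob_counts):
    """Scaling factor per library: maximum rpoB count divided by that library's count."""
    max_count = max(rpob_counts.values())
    return {library: max_count / count for library, count in rpob_counts.items()}


def normalize_coverage(coverage, library, factors):
    """Scale a raw MAG coverage by its library's rpoB scaling factor."""
    return coverage * factors[library]


# Hypothetical rpoB read counts for three metagenomic libraries.
rpob_counts = {"lib_A": 1000, "lib_B": 500, "lib_C": 800}
factors = rpob_scaling_factors(rpob_counts)

# A MAG with 10x raw coverage in the shallowest library is scaled up to 20x.
print(factors)
print(normalize_coverage(10.0, "lib_B", factors))
```

Scaling to the library with the most rpoB reads makes coverages comparable across libraries of different sequencing depth, because rpoB serves as a proxy for the number of genomes sampled.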
Previously generated mRNA-enriched and post-processed metatranscriptomic libraries were procured from project PRJEB28783 91. The groundwater source of these metatranscriptomes was collected in August and November 2015. QAQC filtered reads were mapped to MAGs using Bowtie2 v.2.3.5 in sensitive mode 68, and the total number of rpoB transcripts from each metatranscriptomic library was determined, as described in the preceding for metagenomes. The transcriptomic coverages for each ORF from each MAG were determined using Anvi'o v.6 and normalized via scaling-factor calculations based on the total number of rpoB reads from the original metatranscriptome library (the coverage of each ORF from each MAG was normalized to a community-wide estimate of the transcriptional activity of a housekeeping gene in each sample). Means were determined considering all of the metatranscriptomes generated from a given well, including different sampling time points. While well H32 was sampled only once, mean values from all other wells account for three to four metatranscriptome coverages each. In addition, an average of the resulting normalized coverages for each MAG from each sample (sum of the MAG transcriptional coverage divided by the number of ORFs) was determined to estimate the relative transcriptional activity of the MAGs across the transect. Data were compiled and processed using R v.3.5.2 with RStudio v.1.1.463 92,93 and the tidyverse package 94, and colour schemes were generated using the RColorBrewer utility 95. All MAGs were deposited in project PRJEB36505's data repository.

Data availability
The metagenomic raw data for this study were uploaded to the European Nucleotide Archive (ENA) under project PRJEB36505, while the individual sample assemblies and the metagenome-assembled genomes were deposited into the ENA under project PRJEB36523. To improve ease of access, we also uploaded all MAGs and their associated data to the Open Science Framework (OSF) project https://osf.io/4ceqs/. All raw and summarized accelerator mass spectrometry (AMS) data are available in Supplementary Data File 2, which was also uploaded to the same OSF project as the MAGs.