Reconstructing a hydrogen-driven microbial metabolic network in Opalinus Clay rock

The Opalinus Clay formation will host geological nuclear waste repositories in Switzerland. Gas pressure is expected to build up due to hydrogen production from steel corrosion, jeopardizing the integrity of the engineered barriers. In an in situ experiment located in the Mont Terri Underground Rock Laboratory, we demonstrate that hydrogen is consumed by microorganisms, fuelling a microbial community. Metagenomic binning and metaproteomic analysis of this deep subsurface community reveal a carbon cycle driven by autotrophic hydrogen oxidizers belonging to novel genera. Necromass is then processed by fermenters, followed by complete oxidation to carbon dioxide by heterotrophic sulfate-reducing bacteria, which closes the cycle. This microbial metabolic web can be integrated in the design of geological repositories to reduce pressure build-up. This study shows that Opalinus Clay harbours the potential for a chemolithoautotrophy-based system, and provides a model of the microbial carbon cycle in deep subsurface environments where hydrogen and sulfate are present.

[...] days, the gas-permeable membrane became clogged. For this reason, it was decided to inject H2 directly into the borehole, thus creating a gas phase. (b) Each increase in H2 concentration is due to an injection during a sampling event (red diamonds), and each decrease highlights H2 consumption in the borehole during the sulfate reduction phase (Fig. 3).

[...] during the phase when water was recirculated and H2 continuously injected, (b) during the phase when water was not recirculated and H2 was delivered through discrete injections into the borehole. (a) The sample at day 0 was recovered prior to any H2 amendment. The first planktonic cell density peak (at day 14) corresponds to the suboxic phase, when O2 is still present in the borehole, and when the Xanthomonadaceae and Pseudomonas populations are at their maximum (Fig. 4). After 27 days, O2 concentration dropped to zero (Fig. 3), causing the decrease of suboxic microorganisms. At 56 days, a second peak coincides with the maximal Fe(II) concentration measured (Fig. 3), and highlights the growth of anaerobic microorganisms. However, after 56 days, H2 transfer stopped suddenly (Supplementary Fig. 2), impacting microbial growth. A third peak appears immediately after replacing the H2-permeable membrane (black cross at 98 days), but because the membrane clogged again soon thereafter, planktonic cell density rapidly dropped. (b) For each sampling event (red diamonds), sampled borehole water was replaced with synthetic porewater (sterile and anoxic). This is why every second measurement of planktonic cell density is lower: it was carried out after the borehole water had been replaced. After a few days, however, planktonic cell density always increased (except once, between 238 and 242 days), indicating that microorganisms were growing during the sulfate reduction phase (Fig. 3). Planktonic cell density is missing for the sampling event of day 276 because, unlike all other samples, which were measured right after they were recovered, these samples were stored for two days at 4 ˚C, leading to biased results.

[...] non-recirculation mode with discrete H2 injections into the borehole. Dashed lines represent temporary connections between gas tanks and elements of the experiment. (a) In circulation mode, water is withdrawn from the borehole by the upper green line and goes through a first needle valve (needle valve 1), 5 cylinders that act as sediment traps (only three are indicated in the cartoon), a DO probe, a peristaltic pump (5 mL/min), a flowmeter, a gas-permeable membrane in contact with a H2 reservoir, and a second needle valve (needle valve 2), before being re-injected into the bottom of the borehole. The third line (in orange) served as a pressure valve, in order to keep a stable pressure of 0.5 bar (relative) in the borehole. Additional water produced by the borehole was removed and collected into a bag. Here, surface equipment is kept under anoxic conditions in an argon-flushed cabinet in order to prevent O2 from contaminating borehole water. (b) In non-recirculation mode, only needle valves are used for borehole water sampling, artificial water injection (not shown in the cartoon) and H2 injection, which created a gas phase at the top of the borehole chamber.

[...] Fig. 6A and grey shading highlights other MAGs.

Supplementary Table 7. List of proteins involved in processes described in Fig. 5. In the second column, P stands for proteome and G for genome.

Instrumented H2 injection borehole
A 25 m long borehole was drilled from the gallery floor (borehole BRC-3). A hydraulic neoprene packer was installed at the bottom of the borehole in order to create a 2.74 m long chamber isolated from the oxic gallery atmosphere, where porewater constantly produced by the borehole (at a rate of 20 mL/day) accumulated. Multiple polyamide lines were placed to connect this chamber to the surface equipment, allowing water recirculation and sampling. To prevent the lines from clogging with particles, a PVC screen was also installed in the chamber. An artist's rendering of the borehole equipment is presented in Fig. 2.
The surface equipment, through which borehole water was recirculated, consisted of PEEK lines connected in a circulation loop to a plexiglas sediment trap (originally designed as sampling cylinders), a peristaltic pump (with Pharmed BPT tubing), a flow-meter, a dissolved oxygen probe, a gas-permeable membrane connected to a 500 mL reservoir filled with 100% H2, and two needle valves, the first one placed right after the borehole and the second one right before the borehole, in the direction of water flow (Supplementary Fig. 7). To protect the experiment from oxygen contamination while borehole water was recirculated, a plexiglas cabinet was installed and regularly flushed with argon. In addition to the circulation loop, another line was dedicated to borehole pressure monitoring, releasing water when pressure exceeded 0.5 bar (relative pressure). Pure H2 was later injected directly and non-continuously into the borehole chamber, thus creating a gas phase. More details concerning the experimental set-up of the recirculation and non-recirculation modes can be found in Supplementary Fig. 7 and in the paragraphs describing sampling procedures (below).
It was not possible to install this experimental setting under truly sterile conditions. However, great care was taken to limit contamination. These steps include, depending on the material, autoclaving, ethanol flaming, rinsing with 70 % ethanol, 1 M HCl (overnight), or 7 % bleach, before rinsing with sterile water.
During recirculation mode, samples were recovered by connecting a sterile and anoxic 1 L bottle to needle valve 1 while the pump was shut down. This sample, which consisted of several hundred mL of borehole water, was mainly used for DNA extraction, but also served for some chemical analyses. For some other analyses, borehole water was sampled directly from the needle valves. More information about this particular type of sampling can be found in the next section. Borehole overpressure was used as the driving force for water sampling. To keep borehole pressure constant, sterile and anoxic artificial porewater (APW), whose chemical composition mimics that of the natural porewater and which was amended with HCO3- right before use, was injected into needle valve 2 from a sealed bottle connected to a sterile argon flux at a pressure of 0.5 bar (relative). In this way, pressure and water volume in the borehole stayed constant during sampling. To avoid sampling this artificial porewater, all connections between needle valves 1 and 2 in the surface equipment were closed, forcing the water to flow from the artificial porewater bottle to the interval and then to the sampling bottle. Tracer tests were carried out to determine how much borehole water can be sampled before injected artificial porewater is sampled.
During non-recirculation mode, in order not to sample the water contained in the dead volume of the lines, either the pump was run for 20 min at 5 mL/min or the first 100 mL of water were discarded. The pressure-regulating line was kept closed. The higher borehole pressure caused by the injection of H2 allowed for a larger sampling volume even before any artificial porewater injection. However, to sample the last hundred milliliters, when borehole pressure approached atmospheric pressure, artificial porewater was injected, as described earlier, to increase borehole pressure. This injection continued until after sampling was complete, in order to replace all the sampled water. After this injection, 1 to 3 L of sterile H2 was injected into the borehole through needle valve 2.
We used a third mode, which consisted of a transition between the two modes described above: borehole water was recirculated but H2 was injected non-continuously into the borehole. In terms of sampling and monitoring, this mode is similar to the recirculation mode.
APW composition is given in the Supplementary material; it was based on ion chromatography measurements of BRC-3 porewater sampled before the experiment started.

Chemical sampling and assays
To measure organic acids, a 1.5 to 6 mL aliquot was withdrawn from the 1 L sampling bottle, filtered through a 0.2 µm pore size filter and stored at -20 ˚C until analysis. Before analysis, up to 4.5 mL of APW II was added (if needed) to bring the sample volume to 6 mL.
For measuring dissolved gases (H2 and CO2), 5 to 10 mL of borehole sample were recovered in a pre-prepared sealed serum bottle. The bottle contained 3.7 mg of mercury(II) chloride and was flushed with N2, and its internal volume and pressure (null or slightly positive relative to atmospheric pressure) were measured. The samples were recovered directly from the needle valves, using a needle. The exact amount of water sampled was determined from the difference in bottle weight before and after sampling. Each sample was stored on its side at 37 ˚C for 1 to 2 days for gas phase equilibration. After this incubation, the gas phase was sampled using a gas-tight syringe and 1 mL was injected in a GC-FID (Varian 450-GC, Agilent, Santa Clara, USA). The H2 was separated on a 1.5 m molecular sieve 13X 80/100 mesh column, and the CH4 was separated on a 2 m HayeSep Q 80/100 mesh column.
In order to back-calculate the dissolved gas concentration, first the volume of sampled water was determined.
Then, headspace volume can be calculated.
Then, the final pressure of the sampling bottle was calculated.
The following calculations were performed for each analyzed gas. The gas partial pressure was calculated.
Then, the gas concentration in headspace was calculated.
Then, the number of moles of gas in the headspace was calculated.
Then, Henry's law constant kH,cc, which gives the ratio between gas concentration in water and gas concentration in headspace at equilibrium and for a given temperature, was calculated for each gas and for the incubation temperature 4 .
Then, gas concentration in water after equilibration can be calculated.
Then, gas amount in water after equilibration can be calculated.
Then, the total amount of gas before equilibrium can be calculated.
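The equations for this back-calculation are not reproduced in the text, so the chain of steps (water volume, headspace volume, final bottle pressure, partial pressure, headspace concentration and amount, dissolved concentration and amount via kH,cc, total amount, and sample concentration) can only be sketched under stated assumptions. The Python sketch below assumes ideal gas behaviour, a water density of 1000 g/L, isothermal compression of the N2 headspace by the injected water (Boyle's law), and a GC result expressed as a mole fraction of the headspace. The function and parameter names are hypothetical, and the kH,cc value must be supplied by the user for the gas and incubation temperature of interest.

```python
R = 8.314462618  # ideal gas constant, J mol^-1 K^-1


def dissolved_gas_concentration(mass_before_g, mass_after_g, bottle_volume_l,
                                initial_pressure_pa, gas_mole_fraction, kh_cc,
                                temperature_k=310.15,
                                water_density_g_per_l=1000.0):
    """Return the dissolved gas concentration of the sample (mol/L).

    All formulas are assumptions standing in for the equations that are
    missing from the text; kh_cc is the dimensionless ratio
    (concentration in water) / (concentration in headspace).
    """
    # Volume of sampled water from the bottle weight difference.
    v_water = (mass_after_g - mass_before_g) / water_density_g_per_l
    # Headspace volume left in the bottle.
    v_head = bottle_volume_l - v_water
    if v_water <= 0 or v_head <= 0:
        raise ValueError("water volume must be positive and smaller than the bottle")
    # Final bottle pressure: the initial gas is compressed into the headspace.
    p_final = initial_pressure_pa * bottle_volume_l / v_head
    # Partial pressure of the analysed gas from its measured mole fraction.
    p_gas = gas_mole_fraction * p_final
    # Headspace concentration (mol/m^3 converted to mol/L).
    c_head = p_gas / (R * temperature_k) / 1000.0
    # Amount of gas in the headspace.
    n_head = c_head * v_head
    # Concentration and amount remaining in the water at equilibrium.
    c_water = kh_cc * c_head
    n_water = c_water * v_water
    # Total amount before equilibration, referred back to the water volume.
    n_total = n_head + n_water
    return n_total / v_water
```

The default temperature corresponds to the 37 ˚C incubation. Water vapour in the headspace is ignored in this sketch, which may matter for precise work.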
Finally, sample gas concentration can be calculated.

-Add the filter to a Lysing Matrix E tube.
-Add 650 L Sodium Phosphate Buffer to sample in Lysing Matrix E tube.
-Gently mix the tube by invert it by hand.
-Incubate the tube 5 minutes at 60 C.
-Homogenize in the FastPrep Instrument for 45 seconds at a speed of 5.5 m/s.
-Centrifuge at 12'000 g for 2 minutes to pellet debris. 22 -Transfer 400 l supernatant to a clean 2.0 mL microcentrifuge tube.
-Add 400 L Sodium Phosphate Buffer to sample in Lysing Matrix E tube.
-Gently mix the tube by slowly turning it over.
-Incubate the tube 5 minutes at 60 C.
-Vigorously shake the Lysing Matrix E tube for 45 seconds.
-Transfer 500 L supernatant to the 2.0 ml microcentrifuge tube.
-Mix by shacking the tube by hand 10 times.
-Transfer supernatant to a clean 15 mL tube.
-Resuspend Binding Matrix suspension and add 1.0 mL to supernatant in 15 mL tube.
-Place the tubes on a rack and invert them by hand for 2 minutes to allow the DNA to bind.
-Incubate the tube for 3 minutes at room temperature to allow settling of silica matrix.
-Remove and discard 500 L of supernatant being careful to avoid settled Binding Matrix.
-Resuspend Binding Matrix in the remaining amount of supernatant. Transfer approximately 600 L of the mixture to a SPIN Filter and centrifuge at 12'000 g for 1 minute. Empty the catch tube.
-Repeat last step again till all the Binding Matrix have been transferred and centrifuged in the SPIN Filter.
-Add 500 L prepared SEWS-M (with ethanol added) and gently resuspend the pellet using the force of the liquid from the pipet tip.
-Centrifuge at 12'000 g for 1 minute. Empty the catch tube and replace. 23 -Without any addition of liquid, centrifuge a second time at 12'000 g for 2 minutes to dry the matrix of residual wash solution. Discard the catch tube and replace with a new, clean catch tube.
-Air dry the SPIN Filter for 5 minutes at room temperature.
-Gently resuspend Binding Matrix (above the SPIN Filter) in 100 L of 10 mM Tris-HCl, pH 7.5.
-Mix, close the tube, and incubated 2 minutes at room temperature.
-Centrifuge at 12'000 g for 1 minute to bring eluted DNA into the clean catch tube.
Discard the SPIN Filter.
-Separate the DNA in two samples, in 500 L tubes. One tube can be stored at -80C and the other can be used or stored at -20C. If it is used in the 2 days, it can be stored in the fridge.
An extra purification step was subsequently carried out, using the standard protocol of the Genomic DNA Clean & Concentrator purification kit (Zymo Research, Irvine, USA).
Water samples from BRC-3 (where the in situ experiment took place) were extracted using a second method, phenol-chloroform extraction followed by ethanol precipitation. The FastDNA SPIN Kit for Soil method recovered DNA of poor quality (in terms of fragment length) from samples containing S(-II) and black precipitates. Because this low-quality DNA was obtained with a mechanical lysis approach, it was decided to use a method that does not involve bead-beating. The procedure starts with the recovery of biomass from filtration membranes.
After the sample had been filtered and stored in LifeGuard, the membrane was placed in a 60 mL sterile bag containing 0.6 mL of TE buffer pH 7.5-8.0. The bag was closed and biomass was transferred to the TE buffer by rubbing the membrane with one's fingers from the outside of the bag. The TE buffer containing the recovered biomass was transferred to a new tube, and combined with the pellet obtained by centrifuging the LifeGuard solution (after having removed the membrane) at 7'000 g for 10 minutes. For DNA extraction, the following procedure was carried out:
-Add lysozyme to 150 mg/L.
-Treat the lysate with Proteinase K (100 mg/L) for 1 hour at 55 ˚C.
-Mix by rapidly inverting the tubes.
-Pipet the aqueous phase to a fresh tube.
-Repeat the last 4 steps with the aqueous phase until no protein is visible at the interface of the phases.
-Add one volume of chloroform/isoamyl alcohol to the aqueous phase.
-Mix by rapidly inverting the tubes.
-Pipet the aqueous phase to a fresh tube.
-Add 1 µL of a 20 g/L glycogen solution.
-Mix and incubate overnight at 4 ˚C or -20 ˚C, or for 1 hour at -20 ˚C or -80 ˚C.
-Carefully discard the supernatant.
-Carefully remove the supernatant and remove all drops around the pellet.
-Let it dry for 10 minutes.
-Incubate at 60 ˚C for 15 minutes.
-Freeze the sample at -20 ˚C if it needs to be stored for more than 2 days (otherwise store at 4 ˚C).

16S rRNA gene sequencing
iTag reads with 3 or more Ns or with an average quality score of less than Q20 were removed. In addition, reads shorter than 50 bp were removed.
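The three read-filtering criteria can be sketched as a single predicate. This is only an illustration: it assumes that the character lost before "s" in the first criterion is N (ambiguous base calls), that the average quality is the arithmetic mean of per-base Phred scores, and that qualities are Phred+33 encoded. The function names are hypothetical and are not taken from the pipeline actually used.

```python
def mean_phred(quality, offset=33):
    """Arithmetic mean of the Phred scores in a FASTQ quality string."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def passes_filter(sequence, quality, max_n=2, min_mean_q=20.0, min_length=50):
    """Return True if a read survives all three filters."""
    # Reads shorter than 50 bp are removed.
    if len(sequence) < min_length:
        return False
    # Reads with 3 or more ambiguous bases (N) are removed.
    if sequence.upper().count("N") > max_n:
        return False
    # Reads with an average quality below Q20 are removed.
    return mean_phred(quality) >= min_mean_q
```

For example, `passes_filter("A" * 60, "I" * 60)` keeps a 60 bp read whose bases all have quality Q40 under the Phred+33 assumption.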

Proportion of microorganisms in microbial community
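The abundance of each bin greater than 500 kb in size, compared with the abundance of the entire microbial community, is defined as follows:

$$A_b = \frac{\sum_{i=1}^{n_b} \bar{c}_{b,i}\, l(c_{b,i})}{\sum_{i=1}^{n_b} l(c_{b,i})}$$

where $n_b$ is the number of contigs in bin $b$, $\bar{c}_{b,i}$ the mean coverage of contig $c_{b,i}$, and $l(c_{b,i})$ the length of contig $c_{b,i}$. The abundance of a bin is thus expressed as its mean coverage. The relative abundance $R_b$, or contribution, of each bin was defined as

$$R_b = \frac{A_b}{\sum_{j=1}^{N} A_j}$$

where $N$ is the number of bins.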
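One plausible reading of this definition is that the abundance of a bin is the length-weighted mean coverage of its contigs, and that the relative abundance is this value divided by the sum over all bins larger than 500 kb. The sketch below implements that reading only; the weighting scheme, the data layout, and the function names are assumptions rather than the authors' code.

```python
def bin_abundance(contigs):
    """Length-weighted mean coverage of a bin.

    contigs: list of (length_bp, mean_coverage) tuples, one per contig.
    """
    total_length = sum(length for length, _ in contigs)
    weighted = sum(length * coverage for length, coverage in contigs)
    return weighted / total_length


def relative_abundances(bins, min_bin_size=500_000):
    """Relative abundance of each bin larger than min_bin_size (in bp).

    bins: dict mapping a bin name to its list of (length_bp, mean_coverage).
    """
    kept = {
        name: contigs
        for name, contigs in bins.items()
        if sum(length for length, _ in contigs) > min_bin_size
    }
    abundance = {name: bin_abundance(contigs) for name, contigs in kept.items()}
    total = sum(abundance.values())
    return {name: value / total for name, value in abundance.items()}
```

With this reading, a small bin with very high coverage does not inflate the denominator, because bins at or below the size cutoff are excluded before normalization.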

Metaproteomics
Two borehole water samples (0.45 L each) were collected and filtered through Sterivex 0.22 µm polyethersulfone membranes (Millipore, Billerica, USA) 483 days after the first H2 injection and directly frozen in dry ice. The frozen filters were cut into small pieces, pooled together and immersed in the detergent-based lysis buffer described by Chourey et al. 24. The cells trapped on the filters were heat-lysed and processed as described earlier 25. A single aliquot of 75 µg peptide mix was loaded onto a biphasic resin-packed column [SCX (Luna, Phenomenex, Torrance, USA) and C18 (Aqua, Phenomenex, Torrance, USA)] as described earlier 26,27. Following sample loading, the column was washed offline for 15 min as described by Sharma 28 and connected to the C18-packed nanospray tip (New Objective, Woburn, USA) mounted on a Proxeon (Odense, Denmark) nanospray source as described earlier 28. Peptides were subjected to a 24 h multi-step chromatographic separation via the Ultimate 3000 HPLC system (Dionex, Sunnyvale, USA) connected to the mass spectrometer, and measurements were done using the Multi-Dimensional Protein Identification Technology (MuDPIT) approach as described earlier 26-28. Peptide fragmentation was executed and recorded via an LTQ-Orbitrap-Elite mass spectrometer (ThermoFisher Scientific, Germany) operated in data-dependent mode, via Thermo Xcalibur software V2.1.0. Each full scan (1 microscan) was followed by collision-activated dissociation (CID) based fragmentation, using 35% collision energy, of the 20 most abundant parent ions (1 microscan), with a mass exclusion width of 0.2 m/z and a dynamic exclusion duration of 60 s. The peptide sample was analyzed via MuDPIT as two independent runs (technical duplicates).
For protein identification, the raw spectra were searched against the protein database generated by groundwater sample sequencing (as described above), via the MyriMatch v2.1 algorithm 29 (omictools.com/myrimatch-tool) set to parameters described previously 30 with minor modifications such as omission of static cysteine and dynamic oxidation modifications.
Identification of at least two peptides per protein sequence (one unique and one non-unique) was set as a prerequisite for protein identification. Common contaminant peptide sequences from trypsin and keratin were concatenated to the database. Reversed database sequences were also included as decoy sequences to calculate the false discovery rate (FDR).
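How decoy sequences yield an FDR estimate can be illustrated with a minimal sketch. It assumes one common convention, FDR = (decoy matches) / (target matches) among peptide-spectrum matches at or above a score threshold; the exact estimator used with the search engine is not stated here, and the function names and data layout are hypothetical.

```python
def fdr_at_threshold(psms, threshold):
    """Estimated FDR among matches scoring at or above the threshold.

    psms: list of (score, is_decoy) tuples, one per peptide-spectrum match.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold_below_fdr(psms, max_fdr=0.01):
    """Most permissive score threshold whose estimated FDR is below max_fdr."""
    best = None
    # Walk thresholds from strict to permissive and keep the last one that passes.
    for threshold in sorted({score for score, _ in psms}, reverse=True):
        if fdr_at_threshold(psms, threshold) < max_fdr:
            best = threshold
    return best
```

Other conventions exist (for instance, doubling the decoy count when targets and decoys are searched together), so the threshold returned here should be read as illustrative.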
The false discovery rate (FDR) cutoff for peptide-to-spectrum identification was maintained at < 1%. For downstream data analysis, spectral counts of identified peptides were normalized as described before 31 to obtain the normalized spectral abundance factor (NSAF 31), which was then multiplied by 10^5 to obtain normalized spectral counts (nSpc). The sample was analyzed in duplicate, and the average nSpc values of the technical duplicates were taken as the total proteome profile of the borehole water sample. The proteins were ranked from high to low nSpc to indicate high to low protein abundance in the sample.
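The normalization and ranking can be sketched as follows, assuming the commonly used NSAF definition in which each protein's spectral count is divided by its length and then by the sum of these ratios over all proteins in the run. Whether the cited method matches this definition in every detail is an assumption, and the function names and input layout are hypothetical.

```python
def nsaf(spectral_counts, lengths):
    """NSAF per protein for one run.

    spectral_counts: dict protein -> spectral count in this run.
    lengths: dict protein -> protein length in amino acids.
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def ranked_nspc(run_a, run_b, lengths, scale=1e5):
    """Average nSpc (NSAF x 10^5) over two technical duplicates, ranked high to low."""
    nsaf_a = nsaf(run_a, lengths)
    nsaf_b = nsaf(run_b, lengths)
    proteins = set(run_a) | set(run_b)
    # A protein missing from one run contributes zero for that run.
    nspc = {
        p: scale * (nsaf_a.get(p, 0.0) + nsaf_b.get(p, 0.0)) / 2
        for p in proteins
    }
    return sorted(nspc.items(), key=lambda item: item[1], reverse=True)
```

Dividing by protein length before normalizing keeps long proteins, which yield more peptides, from appearing more abundant than they are.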