Niche differentiation of sulfur-oxidizing bacteria (SUP05) in submarine hydrothermal plumes

Hydrothermal plumes transport reduced chemical species and metals into the open ocean. Despite their considerable spatial scale and impact on biogeochemical cycles, niche differentiation of abundant microbial clades is poorly understood. Here, we analyzed the microbial ecology of two bathy- (Brothers volcano; BrV-cone and northwest caldera; NWC) and a mesopelagic (Macauley volcano; McV) plumes on the Kermadec intra-oceanic arc in the South Pacific Ocean. The microbial community structure, determined by a combination of 16S rRNA gene, fluorescence in situ hybridization and metagenome analysis, was similar to the communities observed in other sulfur-rich plumes. This includes a dominance of the vent characteristic SUP05 clade (up to 22% in McV and 51% in BrV). In each of the three plumes analyzed, the community was dominated by a different yet uncultivated chemoautotrophic SUP05 species, here, provisionally named, Candidatus Thioglobus vadi (McV), Candidatus Thioglobus vulcanius (BrV-cone) and Candidatus Thioglobus plumae (BrV-NWC). Statistical analyses, genomic potential and mRNA expression profiles suggested a SUP05 niche partitioning based on sulfide and iron concentration as well as water depth. A fourth SUP05 species was present at low frequency throughout investigated plume samples and may be capable of heterotrophic or mixotrophic growth. Taken together, we propose that small variations in environmental parameters and depth drive SUP05 niche partitioning in hydrothermal plumes.


Sample collection
Monitoring the primary plume tracers of turbidity and ORP in real-time directed the discrete sampling of both hydrothermal plume and non-plume affected water.
Turbidity anomalies are reported as dimensionless Nephelometric Turbidity Units [1] above the regional background (ΔNTU). The ORP sensor detects reduced hydrothermal chemical species (e.g., Fe 2+ , HS -, H2) that oxidize rapidly in dispersing neutrally-buoyant hydrothermal plumes [2,3]. ORP anomalies are identified by a negative change in potential (E, mV) and are expressed as either the time derivative (dE/dt, mV) that shows a sustained decreasing trend greater than the ambient drift over time, or as the magnitude and duration of the overall drop in value for any given signal (ΔE, mV).
After retrieval of the CTD rosette on board, water samples were immediately processed for further analyses: A small volume of each sample (2 l) was filtered through 0.22 pore size polycarbonate (PC) membrane filters (4.7 mm diameter, Millipore, Darmstadt, Germany). In addition, from selected samples, large volumes of 10-30 litre were filtered through 14.2 cm diameter PC filters (0.22 µm pore size), with the exception of 54CTD_b18 which was filtered through a polyethersulfone (PES; Millipore, Darmstadt, Germany) filter (0.22 µm pore size).
All filters were sliced and preserved for different downstream analyses: Subsamples for DNA and RNA extraction were either directly stored at -80°C, or preserved with RNAlater prior to freezing. For fluorescence in situ hybridization (FISH), samples were fixed with formaldehyde (1% final concentration) for 10-16 h at 4°C. Subsequently, cells were collected by filtration on 0.22 µm pore size PC membrane filters, dried and frozen until use.
DOC and SPE-DOC, and nutrients were measured on samples, collected at the same time and depth in parallel Niskin bottles. For example, Niskin bottles 04CTD_b1 and b2 were closed at the same time. Subsequently, the 04CTD_b1 sample was used for geochemical analyses, whereas 04CTD_b2 was processed for culture-independent molecular techniques in this study.
16S rRNA gene amplification DNA was extracted either from 1⁄4 − 1⁄3 of a 4.7 cm PC filter piece or a 1-2 cm 2 piece of a 14.2 cm PC membrane filter using the PowerSoil DNA Isolation Kit (MoBio, CA, USA) according to the manufacturer's instructions. The V3-V4 region of the 16S rRNA gene was amplified using the primer combination Bakt_341F and Bakt_805R [4] using barcoded primers [5].
Phusion-PCR was conducted as follows: Each 25 µl PCR reaction mix was composed of 0.75- polymerase (Takara) in 1-fold concentrated Ex Taq buffer. The PCR cycling conditions were as follows: 3 min denaturation at 96°C followed by 30 cycles of 1 min denaturation at 96°C, 1 min annealing at 55°C and 2 min elongation at 72°C, and a 10 min final elongation step at 72°C. PCR reactions were checked using an E-GeI iBase and E-Gel Safe lmager system (Thermo Fisher Scientific) and amplicons were purified using AMPure XP (Beckmann Coulter, Brea, CA, USA). PCR product concentrations were determined with a Qubit 3.0 fluorometer and the Qubit dsDNA HS Assay Kit (Invitrogen). Equimolar amounts of PCR product were pooled for amplicon sequencing on an Ion Torrent Personal Genome Machine (PGM) System (Thermo Fisher Scientific) using Ion PGM Hi-Q chemistry. Part of DNA extractions, PCR reactions and amplicon sequencing runs were already conducted on-board of cruise SO253 and parts were sequenced in the home laboratory.

Metagenome sequencing and analysis
The protocol "Ovation Ultralow System v2 1-16" (NuGEN, Redwood City, CA, US) was followed for end-repair, ligation and ligation purification.  [8] depth was conducted, in order to compare the abundance of functional genes between these plumes. Genes were normalized using the ribosomal gene rpoB.
Mariner and Kilo Moana plume [6] as well as Woody Crack buoyant plume [7] metagenomes were chosen due to the chemical similarity of the vent fluids to those emitted at BrV and McV.
Functional analyses were done by gene comparison against the Uniref100 Release 2019_04 database using DIAMOND v0. 8.24.86 (e-15) [9] and visualization of the results using MEGAN v6.6.0 [10]. MEGAN was further used for a neighbor joining analysis of the dissimilarities and a biplot PCoA. Relative numbers of sox and cytochrome genes were compared between all metagenomes.
All MAGs were visualized and manually refined using anvi'o v4 and v6 [14]. Completeness and contamination of the MAGs was analyzed by CheckM [15]. A genome-based maximum likelihood phylogenetic tree of SUP05 MAGs and publicly available genomes [16] was calculated using an alignment of 120 bacterial marker genes by HMMER [17] in GTDB-Tk v1.3.0 [18]. The tree was visualized using ITOL [19]. Average nucleotide identity of retrieved MAGs was calculated by JSpeciesWS [20]. MAGs were automatically annotated with RAST [21] and Prokka [22]. Metabolic capabilities of MAGs were manually checked and genes in agreement between pipelines were depicted as present. Carbohydrate-active enzymes were checked by blasting against the Carbohydrate Active Enzymes Database (default parameters; http://www.cazy.org/) [23]. Iron-related genes were checked using the program FeGenie [24]. Viral genes were mined using VirSorter v2 [25].

Metatranscriptome sequencing
Total RNA was extracted from 4-15 cm² membrane filter pieces according to the following protocol: Samples were thawn on ice and RNAlater was removed carefully (exception: 54CTD_B12 was not preserved in RNAlater, but directly frozen and stored at -80°C after filtering). Filter pieces were transferred to a 5 ml polypropylene (PP) tube. 1.8 ml ROTI Aqua-P/C/I (Carl Roth, Karlsruhe, Germany), 1.8 ml extraction buffer [120 mM sodium phosphate, pH 8, 1% acid washed polyvinylpolypyrrolidone (PVPP)], 260 µl 10% SDS and the content of 1 tube Lysing matrix E (MP Biomedicals, Irvine, CA, USA) were added. Tubes were shaken on a Vortex Genie (Scientific Industries, Bohemia, NY, USA) at maximum speed for 10 min, centrifuged at 5242xg for 10 min at room temperature, and the aqueous phase was transferred to a fresh tube. The organic phase was re-extracted with 500 µl 100 mM sodium phosphate buffer (pH 8), including an additional 10 min vortex step, 10 min incubation at 60°C, and a further 3 min vortex step. The aqueous phases of both extractions were combined, an equal volume of ROTI Aqua-P/C/I (Carl Roth) added, and the solution was mixed and centrifuged as before. The aqueous phase was again transferred to a fresh PP tube and nucleic acids were precipitated by adding 0.1 volume 3 M sodium acetate and 0.7 volumes isopropanol, mixing, incubating for 1 h on ice and 30 min centrifugation at 21000xg and 4°C.
The pellet was washed with ice cold 75% ethanol, and the nucleic acids were resuspended in 50 µl RNase free water. Furthermore, DNA was removed by DNase treatment (Turbo DNase, Thermo Fisher Scientific) and total RNA was purified by columns (RNA clean and concentrator, Zymo Research, Irvine, CA, USA). Capillary electrophoresis (Picochip, Agilent Bioanalyser; Agilent, Santa Clara, CA, USA) was used for quality assessment. An Illumina-compatible library was generated with the NEBNext Ultra II Directional RNA Library Prep kit (NEB, Ipswich, MA, USA) followed by sequencing (2x250 bp) on a HiSeq 2500 system (Illumina) at the Max Planck Genome Centre (Cologne, Germany). We received more than 24 million reads per sample (Table S3).

Phylogenetic reconstruction
SUP05 16S rRNA sequences were aligned to a curated SILVA SSU132 NR99 database (alignment quality >85, pintail >50) using SINA [28]. Phylogenetic trees were constructed based on almost full length SUP05 16S rRNA genes reconstructed with phyloFlash [13] (length > 900 bp) and 60 related published sequences (quality >98) in ARB [29]. Several treeing algorithms such as: Neighbour-joining [29], PhyML [30] and RAxML [31] and different position conservation filters (no filter, 30%, 50%) were applied. Shorter 16S rRNA amplicon sequences and 16S rRNA gene sequences retrieved from MAGs were added to the trees using maximum parsimony, without allowing changes of the overall tree topology. Sequences were exported from the tree and used to calculate the abundance of the SUP05 subgroups: CTA, CTT and CPS. Species representative of each group are CTA [32], CTT [33] and CPS [34].

CARD-FISH
CARD-FISH analysis was conducted according to Pernthaler et al. [35]. Cell permeabilization was conducted at 37°C for 60 min with 10 mg/ml lysozyme. Additionally, archaeal cell walls were permeabilized with 15 µg/ml proteinase K for 3 minutes at room temperature.
Depending on the abundance of the cells and the intensity of the signal 5 rows of a grid or a whole grid (up to 1000 cells) were counted in 8-10 different fields of view. Statistical analyses were performed in R. PerMANOVA [42] was used to test the redundancy analysis and to analyse the impact of environmental conditions. Linear regression using the package "stats" v3.6.2 [43] was conducted to model the relationship between specific clades and chemical parameters. Linear model assumption was validated using the gvlma v1.0.0.3 package [44]. Similarity percentages breakdown analysis was performed to discriminate species contributing to the dissimilarity using "Simper" function [45].

DOC and TDN
Dissolved organic carbon (DOC) and total dissolved nitrogen (TDN) were quantified on triplicate 10 mL water samples by high temperature catalytic combustion (Shimadzu TOC-VCPH) in the Marine Chemistry lab at the ICBM of the University of Oldenburg following onboard filtration (0.2 µm, GHP Aerodisc), acidification (pH2, 0.01 M HCl) and dark, refrigerated transport. Additional water volumes were solid-phase extracted (SPE) following the procedure described in Dittmar et al. [46] to acquire a methanolic extract containing a subset of DOC (SPE-DOC) which is less prone to contamination and can be further characterized on a molecular level. SPE-DOC was measured as bulk DOC following evaporation of an extract aliquot and re-dissolution in 0.01 M HCL as described in Hansen et al. [47]. Accuracy (< 2% deviation of expected reference value), precision (RSD% < 4%) and LOD (1 µmol L-1) of the analysis were monitored via repeated measurements of ultrapure water and a deep-sea  Table S1. Main environmental parameters for samples used in this study. Geo-chemical parameters were measured on board using in-line temperature and pH probes and in the lab after sample collection. As sulfide was below the detection limit in the plume, the highest measured concentration of DFe and H2S in the fluid coupled with maximum temperature were used to calculate plume sulfide concentrations. The maximal concentrations were chosen, due to our sampling strategy, which was geared towards sampling lower-T fluids, subsequently lowering the Fe and H2S average in the vent fluid.    Table S4. MAG completeness, contamination and relative abundance in the metagenomic datasets. a) Relative abundance and statistics of SUP05-related MAGs in the raw read datasets of different metagenomes. MAGs with more than 99% ANI between each other are clustered together: SUP05-1-4 (MAG-1 to -4); SUP05-5 (MAG-5 and MAG 5_1); SUP05-6 (MAG-6, MAG-6_1 and MAG-6_2). The left panel shows the percentage of raw reads mapped to binned contigs with 97% identity. The middle panel shows completeness and contamination of MAGs calculated by CheckM, and the right panel shows MAG composite information. All MAGs were retrieved from single assemblies, apart from MAG-5 and MAG-6. b) Relative abundance of MAGs which affiliated taxonomically with other taxa. c) Statistics of relative abundance of SUP05 clusters in the metatranscriptomes, with minimum identity of 97% identity, calculated as read percentage and RPKM. The darker color of the heatmap signals higher RPKM numbers.