Time-series transcriptomics from cold, oxic subseafloor crustal fluids reveals a motile, mixotrophic microbial community

The oceanic crustal aquifer is one of the largest habitable volumes on Earth, and it harbors a reservoir of microbial life that influences global-scale biogeochemical cycles. Here, we use time series metagenomic and metatranscriptomic data from a low-temperature, ridge flank environment representative of the majority of global hydrothermal fluid circulation in the ocean to reconstruct microbial metabolic potential, transcript abundance, and community dynamics. We also present metagenome-assembled genomes from recently collected fluids that are furthest removed from drilling disturbances. Our results suggest that the microbial community in the North Pond aquifer plays an important role in the oxidation of organic carbon within the crust. This community is motile and metabolically flexible, with the ability to use both autotrophic and organotrophic pathways, as well as function under low oxygen conditions by using alternative electron acceptors such as nitrate and thiosulfate. Anaerobic processes are most abundant in subseafloor horizons deepest in the aquifer, furthest from connectivity with the deep ocean, and there was little overlap in the active microbial populations between sampling horizons. This work highlights the heterogeneity of microbial life in the subseafloor aquifer and provides new insights into biogeochemical cycling in ocean crust.


RNA Extraction and Library Preparation
Cells were lysed in mirVana RNA extraction kit lysis buffer by vortexing filters with RNA Powersoil beads for 10 minutes, followed by incubation with homogenate additive for 10 minutes at -20°C. The lysate was removed by centrifugation, and nucleic acids were purified with acid:phenol chloroform and washed with 100% ethanol. RNA was captured on a provided filter cartridge and eluted using two 50-ul aliquots of elution solution. Remaining DNA was removed using a Turbo-DNAase kit (Ambion) and extracts were cleaned with a RNeasy MinElute Cleanup kit (QIAGEN). RNA was then reverse transcribed to cDNA as described in the Methods.
cDNA was sheared to 400 bp using a Covaris focused-ultrasonicator, and purified using Agencourt AMPure magnetic beads (Beckman Coulter) using the Ovation kit protocol. End repair, ligation, and amplification (15 cycles) were carried out according to the kit manufacturer's instructions, using nuclease-free water to elute cDNA from the Agencourt beads. A KAPA Library Amplification Kit (Illumina) was used to further amplify (3 cycles) libraries with low yields. cDNA fragments between 400 and 570 bp were selected from the libraries using a Pippin Prep (Sage Science). For the U1383C Shallow (2012) and U1383C Deep (2017) samples, processing resulted in an overall smaller size distribution, and thus 200-370 bp fragments were selected for sequencing.

Metagenome Assembly
Assemblies were constructed using IDBA-UD version 1.1.3 [27] with a minimum contig length of 450 bp. For comparative purposes, multiple assemblies were produced, using iterative kmer values from 110 to 150 and from 20 to 150, produced stepwise by 10. Assembly statistics were computed for each metagenome with MetaQUAST v5.0.2 [28]. The N50, number of contigs, and total length of the assemblies were compared for each kmer step, and assembly performance was calculated as the product of the N50 (in kilobases) and read mapping rate (in percent) [88]. The 150-kmer assembly from the 110 to 150 kmer run was chosen as the best assembly using this assembly performance metric.
The Keck sequencing facility at the Marine Biological Laboratory, resulting in an average read length of ~110 bp. These metagenomes were reassembled using the same pipeline as the 2017 metagenomes, but the assemblies were produced for comparison using iterative kmer values from 110 to 150, 90 to 130, and 70 to 100. The 100 kmer assembly step from the 70 to 100 kmer run was chosen as the highest-quality assembly using the assembly performance metric in Vollmers et al. [88] (Supplementary Data).
Quality filtering was performed on the metatranscriptomes in the same manner as for the 2017 metagenomes (Supplementary Data).

Mapping MAGs to Metagenomes and Metatranscriptomes
A bowtie2 index was built from a concatenated file of all MAG contigs using bowtie2-build. Then, the quality-filtered metagenome reads and metatranscriptome reads were mapped against the MAG bowtie2 index using bowtie2 with the -no-unal flag. The resulting SAM files (containing only the reads which mapped) were converted to BAM files, then BamM (https://github.com/Ecogenomics/BamM) was used to remove all alignments with <95% identity and <75% alignment coverage. The number of reads in each remaining BAM file was counted using Binsanity-profile (v0.3.3) [36]. The read counts were then used to calculate the normalized relative fraction of the metagenomes or metatranscriptomes that mapped to the MAGs using the following equation:

Discerning Nitrite Oxidation/Nitrate Reduction and Methane/Ammonia Oxidation
Because genes for nitrite oxidation (nxrAB) and nitrate reduction (narGH) are annotated under the same KEGG orthology group, we discerned between these processes using three sources of evidence: (1) the taxa annotations of the ORFs called by IMG and mapped by Kallisto (Supplemental Table 4), (2) a phylogenetic tree of nxrA and narG proteins from the UniProt database [98], and (3) nxrA and narG protein annotations in the MAGs. A global alignment of nxrA/narG protein sequences from the MAGs and known nxrA and narG proteins from UniProt was performed using Geneious v. 9.0.5 (https://www.geneious.com), and a neighbor-joining tree was produced using the Jukes-Cantor genetic distance model and 100 bootstrap replicates (Supplemental Figure 4).
Particulate methane monooxygenase (pmoA) was distinguished from ammonia monooxygenase (amoA) using the taxa annotations of the ORFs called by IMG and mapped by Kallisto. Figure 1. A) Relative abundance of taxa associated with the small subunit (16S/18S) and large subunit (23S/28S) ribosomal genes annotated in the metagenomes. Ribosomal genes identified using SortMeRNA and annotated using UCLUST. B) Normalized abundance of key genes for carbon, oxygen, hydrogen, nitrogen, sulfur, phosphate, and iron within the metagenome at each site over three sampling years, in transcripts per million reads (TPM). TPM normalizes for both gene length and sequencing depth for more accurate comparison between samples. Here, because metagenomes are compared, TPM refers to "genes" and not "transcripts."