Metabolic potential of uncultured bacteria and archaea associated with petroleum seepage in deep-sea sediments

The lack of microbial genomes and isolates from the deep seabed means that very little is known about the ecology of this vast habitat. Here, we investigate energy and carbon acquisition strategies of microbial communities from three deep seabed petroleum seeps (3 km water depth) in the Eastern Gulf of Mexico. Shotgun metagenomic analysis reveals that each sediment harbors diverse communities of chemoheterotrophs and chemolithotrophs. We recovered 82 metagenome-assembled genomes affiliated with 21 different archaeal and bacterial phyla. Multiple genomes encode enzymes for anaerobic oxidation of aliphatic and aromatic compounds, including those of candidate phyla Aerophobetes, Aminicenantes, TA06 and Bathyarchaeota. Microbial interactions are predicted to be driven by acetate and molecular hydrogen. These findings are supported by sediment geochemistry, metabolomics, and thermodynamic modelling. Overall, we infer that deep-sea sediments experiencing thermogenic hydrocarbon inputs harbor phylogenetically and functionally diverse communities potentially sustained through anaerobic hydrocarbon, acetate and hydrogen metabolism.

bioRxiv preprint first posted online Aug. 28, 2018; doi: http://dx.doi.org/10.1101/400804. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
In this study, we used culture-independent approaches to study the role of microbial communities in the degradation of organic matter, including both detrital biomass and petroleum hydrocarbons. We performed metagenomic, geochemical and metabolomic analyses of deep seabed sediments (water depth ~3 km). Samples were chosen from three sites exhibiting evidence of different levels of migrated thermogenic hydrocarbons. Metagenomes generated from sediment samples of each site were assembled and binned to obtain metagenome-assembled genomes (MAGs) and to reconstruct metabolic pathways for dominant members of the microbial communities. Complementing this genome-resolved metagenomics, a gene-centric analysis was performed by directly examining unassembled metagenomic data. Through the combination of metagenomics with geochemistry and metabolomics, supported by thermodynamic modelling, we provide evidence that (1) deep-sea sediments harbor phylogenetically diverse heterotrophic and lithotrophic microbial communities; (2) some members of the candidate phyla are engaged in degradation of aliphatic and aromatic compounds; and (3) microbial community members are likely interconnected via acetate and hydrogen metabolism.

Results

Hydrocarbon migration in seabed sediments
This study tested three petroleum-associated near-surface sediments (referred to as Sites E26, E29 and E44; see map in Supplementary Figure 1) sampled from the Eastern Gulf of Mexico [20]. Migrated thermogenic hydrocarbon content in the piston cores was analyzed for each of the three sites (Table 1). All three sediments contained high concentrations of aromatic compounds and liquid alkanes; aromatic compounds were most abundant at Site E26, while liquid alkane concentrations were on average 2.5-fold higher at Sites E26 and E29 than at Site E44. Alkane gases were abundant only at Site E29 and were almost exclusively methane (CH4). CH4 sources can be inferred from stable isotopic compositions of CH4 and molar ratios of CH4 to higher hydrocarbons [15]. Ratios of C1/(C2+C3) were greater than 1,000 and δ13C values of methane were more negative than -60‰, indicating that the CH4 in these sediments was predominantly [...] sites. Such UCM signals correspond to degraded petroleum hydrocarbons and may indicate the occurrence of oil biodegradation at these sites [23].
[...] Figure 1a). While the three sites share a broadly similar community composition, notable differences include higher relative abundances of Ca. Bathyarchaeota and Proteobacteria at the sites associated with more hydrocarbons (E29 and E26; Table 1), whereas the inverse is true for Actinobacteria, the Patescibacteria group, and Ca. Aerophobetes, which are all present in higher relative abundance at Site E44, where associated hydrocarbon levels are lower. Additional sampling is required to determine whether these differences are due to the presence of hydrocarbons or other factors.
In summary, while there are considerable community-level differences between the three sample locations, the recovered MAGs share common taxonomic affiliations at the phylum and class levels. Guided by associated geochemistry from the three sediment cores (Table 1 and Supplementary Note 2), we subsequently analyzed the metabolic potential of these MAGs to understand how bacterial and archaeal community members generate energy and biomass in these natural petroleum-associated deep-sea environments. Hidden Markov models (HMMs) and homology-based models were used to search for the presence of different metabolic genes in both the recovered MAGs and unbinned metagenomes. Where appropriate, findings were further validated through metabolomic analyses, phylogenetic visualization, and analysis of gene context.
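The gas-source criteria quoted in this section (C1/(C2+C3) above 1,000 and δ13C of methane more negative than -60‰) can be written as a simple decision rule. The sketch below is illustrative only: the `GasSample` type, the function name, and the thermogenic limits (ratio below 100, δ13C above -50‰) are assumptions introduced here, not values from the study. The "biogenic" label follows the way the methane is described later in the Results.

```python
from dataclasses import dataclass


@dataclass
class GasSample:
    """Molecular and isotopic composition of one interstitial gas sample."""
    c1: float        # methane concentration (same unit for c1, c2, c3)
    c2: float        # ethane concentration
    c3: float        # propane concentration
    d13c_ch4: float  # delta-13C of methane, in per mil


def classify_methane_source(sample: GasSample) -> str:
    """Classify methane origin from C1/(C2+C3) and delta-13C of CH4."""
    higher = sample.c2 + sample.c3
    # With no detectable ethane or propane the ratio is treated as infinite.
    ratio = float("inf") if higher == 0 else sample.c1 / higher
    # Criteria quoted in the text: ratio > 1,000 and delta-13C < -60 per mil.
    if ratio > 1000 and sample.d13c_ch4 < -60:
        return "biogenic"
    # Assumed thermogenic limits (hypothetical, not from the text).
    if ratio < 100 and sample.d13c_ch4 > -50:
        return "thermogenic"
    return "mixed or ambiguous"


# Hypothetical measurement, for illustration only.
print(classify_methane_source(GasSample(c1=5000, c2=2, c3=1, d13c_ch4=-72.0)))
```

Samples that meet neither pair of limits are reported as mixed or ambiguous rather than forced into a class.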

Detrital biomass and hydrocarbon degradation
In deep-sea marine sediments, organic carbon is supplied either as detrital matter from the overlying water column or as aliphatic and aromatic petroleum compounds that migrate upwards from underlying petroleum-bearing sediments [11]. With respect to detrital matter, genes involved in carbon acquisition and breakdown were prevalent across both archaeal and bacterial MAGs. These include genes encoding intracellular and extracellular carbohydrate-active enzymes and peptidases, as well as relevant transporters and glycolysis enzymes (Figure 3 and Supplementary Table 6). The ability to break down fatty acids and other organic acids via the beta-oxidation pathway was identified in 13 MAGs, including members of Chloroflexi, Deltaproteobacteria, Aerophobetes and Lokiarchaeota (Figure 3 and Supplementary Table 6). Metabolomics data supported these genomic predictions and showed a surprising degree of consistency between the geographically distinct sampling sites (Figure 4). Over 50 metabolites from eight pathways were detected in all the samples, including metabolites of carbohydrate metabolism (e.g. glucose), amino acid metabolism (e.g. glutamate), and beta-oxidation (e.g. 10-hydroxydecanoate). Together, the metagenomic and metabolomic data suggest that seabed microorganisms are involved in recycling of residual organic matter, including complex carbohydrates, proteins and lipids.
To identify the potential for microbial degradation of hydrocarbons, we focused on functional marker genes encoding enzymes that initiate anaerobic hydrocarbon biodegradation by activating mechanistically stable C-H bonds [25]. We obtained evidence that two of the four known pathways for oxygen-independent C-H activation [25-28] were present: hydrocarbon addition to fumarate by glycyl-radical enzymes [28] and hydroxylation with water by molybdenum cofactor-containing enzymes [25]. Glycyl-radical enzymes proposed to mediate hydrocarbon addition to fumarate were found in 13 MAGs (Chloroflexi, Aminicenantes, Aerophobetes, Actinobacteria, Bathyarchaeota, Thorarchaeota and Lokiarchaeota) (Figure 3). The sequences identified are phylogenetically distant from canonical alkyl-/arylalkylsuccinate synthases, but form a common clade with the glycyl-radical enzymes proposed to mediate alkane activation in the anaerobic alkane degraders Vallitalea guaymasensis L81 and Archaeoglobus fulgidus VC-16 [29-31] (Supplementary Figure 3). Based on quality-filtered reads, canonical AssA (n-alkane succinate synthase) and BssA (benzylsuccinate synthase) enzymes are also encoded at these sites and were most abundant at Site E29 (Supplementary Table 7 [...] Table 7). The latter result is expected due to the low concentrations of oxygen in the top 20 cm of organic-rich seabed sediments [11].
Our results also provide evidence that aromatic compounds can be anaerobically degraded via [...] Various compounds that can be activated to form benzoyl-CoA were detected, including benzoate, benzylsuccinate, 4-hydroxybenzoate, phenylacetate, acetophenone, and phenol. The downstream metabolite glutarate was also highly abundant (Figure 4). Benzoyl-CoA can be [...]
[...] fermentation, including acetate production, in deep-sea sediments [12, 36]. Acetate can also be produced by acetogenic CO2 reduction through the Wood-Ljungdahl pathway using a range of inorganic and organic substrates [15]. Partial or complete sets of genes for the Wood-Ljungdahl pathway were found in 50 MAGs ([...] reversible, but generally support fermentation in anoxic environments by coupling NAD(P)H reoxidation to fermentative H2 evolution [42-44]. Group 3c hydrogenases mediate a central step in hydrogenotrophic methanogenesis, bifurcating electrons from H2 to heterodisulfides and ferredoxin [45]; their functional role in bacteria and non-methanogenic archaea remains unresolved [44], yet the corresponding genes frequently co-occur with heterodisulfide reductases across multiple archaeal and bacterial MAGs (Figure 3). Various Group 1 [NiFe]-hydrogenases were also detected, which are known to support hydrogenotrophic respiration in conjunction with a wide range of terminal reductases. This is consistent with previous studies in the Gulf of Mexico that experimentally measured the potential for hydrogen oxidation catalyzed by hydrogenase enzymes [46].
Given the genomic evidence for hydrogen and acetate production in these sediments, we investigated whether any of the MAGs encoded terminal reductases to respire these compounds. In agreement with porewater sulfate concentrations (16-27 mM; see Supplementary Note 2), the key genes for dissimilatory sulfate reduction (dsrAB) were widespread across the metagenome reads, particularly at Site E29 (Supplementary Table 7); however, probably due to incompleteness of genomes or insufficient binning, these genes were identified in only two MAGs, affiliated with Deltaproteobacteria and Dehalococcoidia (Supplementary Table 6). We also identified 31 putative reductive dehalogenase genes (rdhA) across 22 MAGs, mainly from Aminicenantes and Bathyarchaeota (Figure 3 and Supplementary Table 6); this suggests that organohalides, which can be produced through abiotic and biotic processes in marine ecosystems [47], may be electron acceptors in these deep-sea sediments. At least two thirds of the MAGs corresponding to putative sulfate reducers and dehalorespirers encoded the capacity to completely oxidize acetate and other organic acids to CO2 using either the reverse Wood-Ljungdahl pathway or the TCA cycle (Figure 3 and Supplementary Table 6). Several of these MAGs also harbored the capacity for hydrogenotrophic dehalorespiration via Group 1a and 1b [NiFe]-hydrogenases (Figure 3). In addition to these dominant uptake pathways, one MAG belonging to the epsilonproteobacterial genus Sulfurovum (E29_bin29) included genes for the enzymes needed to oxidize H2 (Group 1b [NiFe]-hydrogenase), elemental sulfur (soxABXYZ), and sulfide (sqr), using nitrate as an electron acceptor (napAGH); this MAG also has a complete set of genes for autotrophic CO2 fixation via the reductive TCA cycle (Figure 3 and Supplementary Table 6).
The capacity for methanogenesis appears to be relatively low. The genes for methanogenesis were detected in quality-filtered unassembled reads in all three sediments and were mainly affiliated with acetoclastic methanogens at Site E29, and with hydrogenotrophic methanogens at the other two sites (Figure 1d). However, none of the MAGs contained mcrA genes. Overall, the collectively weak mcrA signal in the metagenomes suggests that the high levels of biogenic methane detected by geochemical analysis (Table 1) [...]

Thermodynamic modelling
Together, the geochemical, metabolomic, and metagenomic data strongly indicate that anaerobic degradation of aliphatic and aromatic compounds occurs in these deep-sea sediments (Table 1 and Figures 3-4). Recreating the environmental conditions needed to cultivate the organisms represented by the retrieved MAGs is challenging, which prevents further validation of the capacity to degrade these compounds (and of other metabolisms) among the majority of the lineages represented by the MAGs retrieved here [41]. Instead, we provide theoretical evidence that anaerobic degradation of aliphatic and aromatic compounds is feasible in this environment by modelling whether these processes are thermodynamically favorable under the conditions typical of deep-sea sediments, namely high pressure and low temperature.
As concluded from the genome analysis and supported by metabolomics (Figures 3-4), it is likely that anaerobic degradation occurs through an incomplete oxidation pathway. However, due to incompleteness of the reconstructed genomes, we cannot exclude the possibility that complete oxidation of aliphatic and aromatic compounds to CO2 occurs through, for example, coupling with sulfate reduction (Figure 3). Additionally, several recent studies indicate that some aliphatic and aromatic compounds can be incompletely oxidized to acetate via the Wood-Ljungdahl pathway [37, 40, 49]. Therefore, we compared the thermodynamic constraints on anaerobic biodegradation for three plausible scenarios (Table 2) [...] could also take place in theory (Table 2 and Figure 5). This suggests that acetate- and H2-scavengers, by making acetogenic and hydrogenogenic degradation more thermodynamically favorable, may support the activity of anaerobic degraders in the community.
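The effect of acetate and H2 scavenging on reaction favorability can be illustrated with the standard relation ΔG = ΔG° + RT ln Q. The sketch below is not the model used in the study: it ignores pressure corrections and activity coefficients, and the function name, the standard free energy of +50 kJ/mol, the stoichiometry, the activities, and the temperature of 277.15 K are all hypothetical placeholders chosen only to show the direction of the effect.

```python
import math
from typing import Mapping, Tuple

R = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1

# Each species maps to (activity, stoichiometric coefficient).
Species = Mapping[str, Tuple[float, float]]


def gibbs_energy(dg0: float, temp_k: float,
                 products: Species, reactants: Species) -> float:
    """Return dG = dG0 + R*T*ln(Q) in kJ per mol of reaction."""
    ln_q = sum(n * math.log(a) for a, n in products.values())
    ln_q -= sum(n * math.log(a) for a, n in reactants.values())
    return dg0 + R * temp_k * ln_q


# Hypothetical reaction: substrate -> 2 acetate + 2 H2 (placeholder numbers).
T = 277.15    # assumed cold deep-sea sediment temperature, in kelvin
DG0 = 50.0    # hypothetical standard free energy, kJ/mol
substrate = {"substrate": (1e-3, 1)}

accumulated = gibbs_energy(
    DG0, T, {"acetate": (1e-3, 2), "H2": (1e-3, 2)}, substrate)
scavenged = gibbs_energy(
    DG0, T, {"acetate": (1e-3, 2), "H2": (1e-6, 2)}, substrate)

print(f"H2 accumulated: {accumulated:.1f} kJ/mol")
print(f"H2 scavenged:   {scavenged:.1f} kJ/mol")
```

Because product activities enter ln Q with positive coefficients, any consumer that keeps acetate or H2 low shifts ΔG toward more negative values, which is the qualitative point made in the text.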

Discussion
In this study, metagenomics revealed that most of the bacteria and archaea in the deep-sea sediment microbial communities sampled belong to candidate phyla that lack cultured representatives and sequenced genomes (Figures 1 and 2). As a consequence, it is challenging to link phylogenetic patterns with the microbial functional traits underpinning the biogeochemistry of deep seabed habitats. Here, we were able to address this by combining de novo assembly and binning of metagenomic data with petroleum geochemistry, metabolite identification, and thermodynamic modelling. Pathway reconstruction from 82 MAGs recovered from the three deep-sea near-surface sediments revealed that many community members were capable of acquiring and hydrolyzing residual organic matter (Figure 3), whether supplied as detritus from the overlying water column or as autochthonously produced necromass. Heterotrophic fermenters and acetogens were in considerably higher relative abundance than heterotrophic respirers, despite the abundance of sulfate in the sediments (Supplementary Note 2). For example, while genomic coverage of putative sulfate reducers is relatively low (< 1% of the communities), putative acetogenic heterotrophs were the most abundant community members at each site. Therefore, microbial communities in the deep seabed are likely shaped more by the capacity to utilize available electron donors than by the availability of oxidants. In line with the different geochemical profiles at the three sites (Table 1), some differences in the composition of microbial communities and the abundance of key metabolic genes were observed (Figure 1 and Supplementary Table 7). However, metabolic capabilities such as fermentation, acetogenesis, and hydrogen metabolism were conserved across diverse phyla at each site (Figure 3). This suggests some functional redundancy in these microbial communities, similar to that recently inferred in a study of Guaymas Basin hydrothermal sediments [50].
In this context, multiple lines of evidence indicate that aliphatic or aromatic compounds serve as carbon and energy sources for anaerobic populations in these deep-sea hydrocarbon seep environments (Tables 1-2; Figures 3-5). Whereas the capacity for detrital organic matter degradation is a common feature of the genomes retrieved in this study, and of genomes from many other environments in general [51], anaerobic degradation of aliphatic or aromatic compounds is a more exclusive feature that was detected in 19 out of 82 MAGs. In all three sediments, there was [...] finding that Bathyarchaeota and other archaeal phyla are potentially capable of anaerobic degradation of aliphatic or aromatic compounds extends the potential substrate spectrum for [...] CoA reductases [37]. Acetogenic oxidation may explain why the Class I reaction is energetically favorable; thermodynamic modelling indicates that the Gibbs free energy yield of acetogenic oxidation of benzoate is much higher than that of its hydrogenogenic or complete oxidation (Table 2).

Sample selection based on geochemical characterization
The three marine sediment samples used in this study were chosen from among several sites sampled as part of a piston coring seafloor survey in the Eastern Gulf of Mexico, as described previously [20]. Piston cores penetrating 5 to 6 meters below sea floor (mbsf) were sectioned in 20 cm intervals on board the research vessel immediately following their retrieval. Three intervals from the bottom half of the core were chosen for geochemical analysis, and were either frozen immediately (for liquid hydrocarbon analyses), or flushed with N2 and sealed in hydrocarbon-free gas-tight metal canisters and then frozen until analysis (for gaseous hydrocarbon analysis). Interstitial gas analysis was later performed on the headspace in the canisters using gas chromatography with a flame ionization detector (GC-FID). Sediment samples for gas/liquid chromatography and stable isotope analysis were frozen, freeze-dried and homogenized, then extracted using accelerated solvent extraction (ACE 200). Extracts were subsequently analyzed using GC/FID, GC/MS, a Perkin-Elmer Model LS 50B fluorometer, and Finnigan MAT 252 isotope mass spectrometry as detailed elsewhere [54]. On the basis of total scanning fluorescence (TSF) and unresolved complex mixture (UCM) concentration thresholds described previously [20], core segments from E26 and E29 were qualified and core segments from E44 were disqualified for unambiguous occurrence of thermogenic liquid hydrocarbons. Additionally, interstitial hydrocarbon gases were observed in the core segments of E29. Samples from the surface 0-20 cm interval of these three cores were further analyzed as described below.

Porewater geochemistry
Porewater sulfate and chloride concentrations were measured in a Dionex ICS-5000 reagent-free [...] Organic acids were identified by comparison with the retention times of known standards, and the limit of detection for acetate was determined to be 2.5 µM.

Metabolomic analysis
For the analysis of metabolites, sediment was centrifuged at 21,100 × g for 10 minutes at room temperature, and the supernatant was collected, diluted 1:1 in pure methanol, and filtered through 0.2 µm Teflon syringe filters. Each sediment was subsampled five times to assess technical variability across the sample. Metabolites present in the extracts were separated using ultra high-[...]
[...] Bact-0785-aA21 and SD-Arch-0519-aS15/SD-Arch-0911-aA20, respectively [58], as described previously [20] on a ~15 Gb 600-cycle (2 × 300 bp) sequencing run.

Metagenomic assembly and binning
Raw reads were quality-controlled by (1) clipping off primers and adapters and (2) filtering out artifacts and low-quality reads as described previously [59]. Filtered reads were assembled using metaSPAdes version 3.11.0 [60] and short contigs (<500 bp) were removed. Sequence coverage was determined by mapping filtered reads onto assembled contigs using BBmap version 36 [...] Table 8) (cutoffs: e-value, 1e-20; sequence identity, 30%). Hydrogenases were identified and classified using the web-based hydrogenase classifier HydDB [67].
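Two of the filtering steps named in this section, removal of contigs shorter than 500 bp and the homology-search cutoffs (e-value 1e-20, sequence identity 30%), can be sketched as below. This is an illustration under stated assumptions rather than the pipeline actually used: the function names are hypothetical, contigs are assumed to be available as (name, sequence) pairs, hits are assumed to be in 12-column BLAST-style tabular format, and both cutoffs are treated as inclusive.

```python
from typing import Iterable, Iterator, List, Tuple


def filter_contigs(records: Iterable[Tuple[str, str]],
                   min_len: int = 500) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs whose sequence is at least min_len bp."""
    for name, seq in records:
        if len(seq) >= min_len:
            yield name, seq


def filter_hits(lines: Iterable[str], max_evalue: float = 1e-20,
                min_identity: float = 30.0) -> Iterator[List[str]]:
    """Yield tabular hits that pass the e-value and identity cutoffs.

    Assumes 12 tab-separated columns in BLAST-style order, where column 3
    is percent identity and column 11 is the e-value.
    """
    for line in lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        pident, evalue = float(fields[2]), float(fields[10])
        # Both cutoffs are treated as inclusive (an assumption).
        if evalue <= max_evalue and pident >= min_identity:
            yield fields
```

Both functions are generators, so they can be chained over large files without loading every contig or hit into memory.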

Phylogenetic analyses
For taxonomic classification of each MAG, two methods were used to produce genome trees that were then used to validate each other. In the first method, the tree was constructed using concatenated protein sequences of up to 16 syntenic ribosomal protein genes following procedures reported elsewhere [72]; the second tree was constructed using concatenated amino acid sequences of up to 43 conserved single-copy genes following procedures described previously [73]. Both trees were calculated using FastTree version 2.1.9 (-lg -gamma) [74] [...]
