Taxonomic and functional heterogeneity of the gill microbiome in a symbiotic coastal mangrove lucinid species

Lucinidae clams harbor gammaproteobacterial thioautotrophic gill endosymbionts that are environmentally acquired. Thioautotrophic lucinid symbionts are related to metabolically similar symbionts associated with diverse marine host taxa and fall into three distinct phylogenetic clades. Most studies on the lucinid–bacteria chemosymbiosis have been done with seagrass-dwelling hosts, whose symbionts belong to the largest phylogenetic clade. In this study, we examined the taxonomy and functional repertoire of bacterial endosymbionts at an unprecedented resolution from Phacoides pectinatus retrieved from mangrove-lined coastal sediments, which are underrepresented in chemosymbiosis studies. The P. pectinatus thioautotrophic endosymbiont expressed metabolic gene variants for thioautotrophy, respiration, and nitrogen assimilation distinct from previously characterized lucinid thioautotrophic symbionts and other marine symbionts. At least two other bacterial species with different metabolisms were also consistently identified in the P. pectinatus gill microbiome, including a Kistimonas-like species and a Spirochaeta-like species. Bacterial transcripts involved in adhesion, growth, and virulence and mixotrophy were highly expressed, as were host-related hemoglobin and lysozyme transcripts indicative of sulfide/oxygen/CO2 transport and bactericidal activity. This study suggests the potential roles of P. pectinatus and its gill microbiome species in mangrove sediment biogeochemistry and offers insights into host and microbe metabolisms in the habitat.


Introduction
Chemosymbiosis is widespread in marine habitats, where endosymbiotic or episymbiotic chemolithoautotrophs use inorganic chemical energy for the synthesis of organic compounds that benefit their hosts [1,2]. One of the most ancient examples of marine chemosymbiosis is found in the bivalve family Lucinidae [3], which has a fossil record arguably dating back to the Silurian period [4]. Despite being capable of suspension feeding, all living lucinids studied to date fulfill a considerable proportion of their nutritional needs through obligate chemosymbiotic associations with gammaproteobacterial endosymbionts occupying bacteriocytes in their gills [3]. Lucinid species examined so far acquire their thioautotrophic endosymbionts from free-living environmental bacterial populations [5][6][7][8][9]. Enzymatic assays, stable isotope analyses, and clone-based amplicon sequencing methods demonstrate that lucinid endosymbionts mainly use energy derived from the oxidation of reduced sulfur compounds to fix inorganic carbon for their hosts [10]. Other reported functions of lucinid endosymbionts included mixotrophy, denitrification, assimilation of nitrogenous compounds, and diazotrophy [11][12][13][14].
Because of the widespread distribution of lucinids in marine habitats, ranges in host and endosymbiont phylogenetic diversity, as well as the possibility that lucinids may harbor non-thioautotrophic symbionts [15][16][17], the lucinidbacteria chemosymbiotic system has the potential to address fundamental cellular to ecological questions about hostsymbiont interactions, cues, and communication across individual hosts, among species, and within populations. However, there is still relatively poor understanding of lucinid and gill microbiome diversity and metabolic functions. For instance, although 16S rRNA gene sequences of thioautotrophic lucinid endosymbionts form a paraphyletic group consisting of three distinct clades [10,18], only the genomes, transcriptomes, and proteomes of two lucinid endosymbiont species from clade A have been sequenced [13,14]. Clade A symbionts are mainly associated with diverse seagrass-dwelling lucinids, but symbiont clades B and C are from predominately mangrove-dwelling Anodontia spp. and Phacoides pectinatus, respectively [18]. Almost no diversity or functional diversity study has centered on either of these bacterial clades.
To begin to fill these gaps, our study characterizes the metabolic repertoire of the host and gill-associated thioautotrophic bacterial endosymbiont from P. pectinatus Gmelin 1791 (syn = Tellina pectinata Gmelin 1791, Lucina pectinata (Gmelin 1791), Anodontia pectinatus (Gmelin 1791), Lucina jamaicensis Lamarck 1801, Lucina funiculata Reeve 1850). Possibly the only extant species of its genus, P. pectinatus possesses morphological features distinct from other lucinid bivalves, such as high levels of three types of hemoglobin in gill pigment granules, sulfur bodies, and large lysosomes [19,20]. Molecular phylogeny studies place P. pectinatus as a deeply branching genus within the Lucinidae [21] and the thioautotrophic endosymbiont distant from seagrass-or other mangrove-associated lucinid endosymbionts [18,22,23]. This lucinid inhabits organicrich seagrass and mangrove sediments [24] and has a widespread tropical geographic distribution that ranges from the Caribbean Sea and Gulf of Mexico to the Atlantic Ocean seaboard of South America to Brazil [25]. The unusual morphological features, phylogeny, and habitat distribution of P. pectinatus and its distinct thioautotrophic endosymbiont belonging to clade C have led to the hypothesis that symbiont metabolic pathways in this species are different than in other lucinid endosymbionts [8]. To test this hypothesis, we assessed gill microbiome diversity within P. pectinatus using 16S rRNA gene sequencing, quantitative PCR (qPCR), metagenomic sequencing, and metatranscriptomic sequencing and compared the expression profiles from P. pectinatus and its gill microbiome species to previously sequenced seagrass-associated lucinid endosymbiont species from clade A, including Ca. Thiodiazotropha endoloripes within Loripes orbiculatus [14] and Ca. Thiodiazotropha endolucinida within Codakia orbicularis [13].

Sample collection
Phacoides pectinatus populations at Wildcat Cove, St. Lucie County, FL, USA ( Figure S1), as well as their ecology, sediment geochemistry, and microbiology, have previously been investigated [23,26] and briefly described in SI. For this study, research excursions were completed in February 2011, June 2013, July 2014, and November 2017, and live specimens were sieved from sediments hand-dug to 30 cm depth, approximately 3 m from the shoreline of Rhizophora mangle (red mangrove). Specimens were temporarily stored in Whirl-Pak® Bags (Nasco, Fort Atkinson, WI, USA) filled with surface water from the habitat and maintained at ambient temperature before dissection. During dissection, gill and foot tissues were separated from other body tissues. Tissues used for 16S rRNA gene sequencing and metagenomics were dissected within the same day of collection and fixed in 100% molecular-grade ethanol. Tissues used for metatranscriptomics were dissected within 30 min of collection and fixed in RNAlater. Tissues used for microscopy were fixed in 2% paraformaldehyde (pH 7) made with artificial sea water prepared using Difco™ Marine Broth 2216 formula (Becton Dickinson and Company, Franklin Lakes, NJ, USA) for 3 h at 4°C prior to washing, sucrose infiltration, storage, hematoxylin-eosin staining, and fluorescence in situ hybridization (FISH) procedures described in Table S1 and SI. Total nucleic acids were extracted from partial gill and foot tissues using the Qiagen's (Valencia, CA, USA) DNeasy Blood and Tissue Kit (2011 and 2013 sample collection) or Allprep DNA/RNA Mini Kit (2014 samples) after mechanical homogenization of the tissues with a motorized pestle and mortar (Argos Technologies, Elgin, IL, USA) or tissue grinder (Wheaton, Millville, NJ, USA). For further lysis, the sample was passed through a 21-gauge (0.8 mm) needle attached to a 3 mL syringe (Becton, Dickinson and Company, Franklin Lakes, NJ, USA) at least ten times and incubated at 60°C for at least 10 min. Extracted nucleic acid concentrations were quantified fluorometrically with Qubit™ dsDNA HS and RNA assays (Life Technologies, Austin, TX, USA).

Data analysis
Mothur v1.39.5 [27] was used for data processing for the 16S rRNA gene dataset. Operational Taxonomic Unit (OTU) clustering was performed at 99% sequence identity for higher species resolution [28] and taxonomic classification was performed against the Silva v132 database [29]. The final dataset was subsampled to the library with the smallest four-digit number size (n = 1269). The 16S rRNA gene analysis pipeline and qPCR procedures used to validate analysis results are documented in Table S2 and SI. Trimmed Illumina-sequenced metagenomic reads from each sequenced sample were individually assembled using IDBA-UD v1.0.9 [30]. Additionally, reads from the most complete gammaproteobacterial assembly were coassembled with unprocessed Nanopore reads using the hybridSPAdes algorithm [31] of the SPAdes genome assembler (v3.11.1). For each assembly, contigs >1500 bp long were binned with MetaBat v0.32.4 using the ensemble binning approach [32], after read mapping with Bowtie2 v2.2.7 [33] (very sensitive local and dovetail mode) and SAMtools v0.1.19 [34]. All metagenome-assembled genomes (MAGs) were annotated with NCBI's Prokaryotic Genome Annotation Pipeline [35]. MAGs with >90% completeness were also annotated with Rapid Annotation using Subsystem Technology (RAST) FIGfam release 70 [36]. Methods for the evaluation of sequence and MAG quality, read trimming, sequencing depth analyses, sequence comparisons with published reference genomes, and PCR validation of sequencing results are in Table S2 and SI.
Metatranscriptomic assembly and downstream analyses were performed with Trinity v2.5.1 [37]. Trimmed reads from all three metatranscriptomic libraries were coassembled into one metatransciptome de novo with Trinity's default parameters (k = 20). The co-assembly standardizes transcript IDs, lengths, and clusters across libraries for efficient downstream quantification and cross-sample comparisons [37]. Trinity's Chrysalis module clusters transcripts with at least k−1 bases overlap and with sufficient reads spanning the join across both transcripts, and the Butterfly module refines the clustering and uses these transcript clusters as proxy for genes [37,38]. Reads were mapped to the co-assembly using Bowtie2 v2.2.7's [33] nomixed, no-discordant, end-to-end options reporting up to 200 alignments per read (-k 200) and disallowing gaps within 1000 nucleotides of read extremes (-gbar 1000). Isoform and gene-level abundances were estimated by RNA-Seq Expectation-Maximization (RSEM) that maximizes the probability of observed variables, including read lengths, quality scores, and sequences based on RSEM's directed graph statistical model [39]. The probability value for each isoform/gene was divided by the effective transcript/gene length, which is the average number of possible start positions of a transcript of a given length or the abundance-weighted average effective transcript lengths of a gene's isoforms [39]. The resulting length-normalized value for each transcript/gene was divided by the sum of length-normalized values for all transcripts/genes in each sample to calculate the transcript fraction value, which was then multiplied by 10 6 to derive the transcript per million (TPM) measure [39]. For cross-sample comparisons, TPM values were further normalized with the trimmed means of M-values (TMM) factor that minimizes log-fold changes across samples [40] using the edgeR Bioconductor package [41].
All assembled host and bacterial transcripts, as well as unbinned contigs from metagenomic assemblies, were annotated with Trinotate v3.1.1 (https://trinotate.github.io/), which uses the manually curated but less representative Swissprot [42] database as reference. rRNA transcripts were predicted with SortMeRNA v2 [43] using SILVA's v119 [29] collection of archaeal, bacterial, and eukaryotic 16S rRNA, 23 S rRNA, 18 S rRNA, and 28 S rRNA gene sequences as references. Host and bacterial genes of interest were analyzed at the level of transcript clusters loosely equivalent to genes. To map transcript clusters to symbiont genes, a pan-genome for the thioautotrophic endosymbiont from P. pectinatus, named Candidatus Sedimenticola endophacoides (explained in the Results section), was created by extracting and concatenating nucleotide sequences of RAST-annotated PEGs and RNAs from six >90% complete MAGs, followed by de-duplication with CD-HIT v4.6 [44] at a global sequence identity threshold of 100%. The de-duplicated dataset was searched against the Trinity assembly using NCBI's Basic Local Alignment Search Tool (BLAST) v2.6.0+ local blastn package [35,45] and only the top hit was reported (-max_target_seqs 1). Similar local blastn searches were performed on other MAGs of interest for transcript cluster to gene mapping. Functions of transcript clusters of interest were inferred by comparing Trinotate's transcript annotations with web blastp, blastn, or blastx search results [45] against the more representative NCBI's non-redundant (nr) protein sequence or nucleotide (nt) databases [35] using the same 10 −3 e-value threshold as Trinotate. For each transcript within a transcript cluster, a blastp search was performed if a likely peptide sequence was predicted by Transdecoder v5.1.0 (http://transdecoder. github.io/) based on a minimum open reading frame (ORF) length and a log-likelihood score related to the reading frame where the ORF was located. If the blastp search returned negative results or if no likely peptide sequence was predicted for a transcript, then blastn and blastx searches were performed instead. Functions of transcript clusters mapping to more than one gene were assigned based on annotations of transcript(s) within the cluster with the highest TMM-normalized TPM value(s). A transcript cluster was considered multi-mapping if more than one transcript within the cluster shared high TMM-normalized TPM values but different predicted functions and their corresponding genes were not adjacent to each other in the reference MAG.

Site characterization
Live P. pectinatus had clumped distributions at Wildcat Cove in all sample years, with the highest concentrations being near the mangrove-lined coast where total organic carbon content in the sediment was highest [26]. Overall, live abundances averaged over 40 individuals per square meter [26]. Porewater dissolved sulfide and oxygen concentrations were measured and reported by Doty [26] from low-flow fluid sampling of piezometers installed near to where specimens were recovered, according to previously described methods [46]. Dissolved sulfide concentrations at Wildcat Cove (18-56 μmol/L) were an order of magnitude higher than concentrations measured from intertidal zone porewater occupied by the lucinid Lucinoma borealis [47]. Dissolved oxygen concentrations ranged from 78 to 125 μmol/L at quadrats adjacent to where P. pectinatus were collected [26].

Gill microbiome diversity
To examine P. pectinatus microbiome diversity, we sequenced 16S rRNA genes and used metagenomic and metatranscriptomic content from gill and foot samples collected in 2011, 2014, and 2017. All amplicon-sequenced DNA and cDNA samples were dominated by one gammaproteobacterial Sedimenticola-like species (OTU1), occurring at average 84 ± 11% relative abundance (Fig. 1a). MAGs of this species were binned from 14 separate assemblies and three co-assemblies (Table 1) and shared 100% sequence identity in the 16S rRNA gene V4 region with OTU1, as well as 99.8 ± 0.4% average nucleotide identity (ANI) and 99 ± 1% average amino acid identity (AAI; Figure S2a) with each other. These gammaproteobacterial MAGs were at least 20% smaller, and with at least 11% higher G+C content, than previously sequenced clade A thioautotrophic lucinid endosymbiont species Ca. Thiodiazotropha endoloripes [14] and Ca. Thiodiazotropha endolucinida [13] and Sedimenticola spp. [48,49,118] ( Table 1). FISH using a newly designed SED642 probe targeting the 16S rRNA gene of this Sedimenticola-like species confirmed that the P. pectinatus gill bacteriocytes contained cells that matched the gammaproteobacterial MAGs ( Figure S3). Results of phylogenetic analyses using 16S rRNA gene sequences ( Figure S4) and ten single-copy marker genes ( Figure S2b) corroborated previous reports on the distinct phylogenetic position of the thioautotrophic P.
Besides the thioautotrophic symbiont species, we also observed lower relative abundances of a gammaproteobacterial Kistimonas-like OTU (average 13 ± 12%; OTU2) belonging to the order Oceanospirillales in all ampliconsequenced DNA and cDNA samples and a Spirochaeta-like OTU (average 0.2 ± 0.2%; OTU5) in 25 out of the 33 gill DNA and cDNA samples (Fig. 1a). The transcriptional activity of the Sedimenticola-like, Kistimonas-like, and Spirochaeta-like species was confirmed by absolute qPCR quantification, where copy numbers of the OTUs in matched DNA and cDNA samples were consistent with their OTU relative abundances (Fig. 1b). Deep metagenomic sequencing of one 2014 P. pectinatus gill sample also binned a Kistimonas-like MAG (3% of reads), a Spirochaeta-like MAG (0.4% of reads), a Ca. Sedimenticola endophacoides MAG (58% of reads; Table 1), and 12 other bins with 0% completeness and no taxonomic classification. These three MAGs contained 16S rRNA gene sequences with perfect matches to their corresponding OTU sequences. Unbinned contigs comprised 89% (527,385/591,741) of all assembled contigs from this sample, out of which only 11% (59,232/527,385) had predicted protein-coding regions (SI). Reads from all sequenced metagenomic and metatranscriptomic libraries could be mapped to MAGs of the Kistimonas-like (0.4 ± 0.4% of MiSeq metagenomic reads and 0.1 ± 0.04% of metatranscriptomic reads) and Spirochaeta-like species (1 ± 0.3% of MiSeq metagenomic reads and 0.008 ± 0.003% of metatranscriptomic reads) at lower sequencing depths compared to the Ca. Sedimenticola endophacoides MAG (8 ± 4% of MiSeq metagenomic reads and 1 ± 0.6% of metatranscriptomic reads; Fig. 1c).
MAGs of Ca. Sedimenticola endophacoides, the Kistimonas-like species, and the Spirochaeta-like species shared <70% ANI and <56% AAI with each other ( Figure S2a). Phylogenetic analyses using 16S rRNA gene sequences clustered the Kistimonas-like OTU sequences with potentially pathogenic K. scapharcae from a dead ark clam Anadara broughtonii [51], skin-associated K. asteriae from the starfish Asterias amurensis [52], and gill-associated Oceanospirillales from the limid bivalve Acesta excavata [53] ( Figure S4). The Spirochaeta-like OTU sequence was most closely related to spirochete endosymbionts in the gutless marine worm Olavius [54,55] and loosely associated with spirochete sequences retrieved from a L. kazanilike lucinid [56] (Figure S4). Genomic sequences of these closest relatives of both species are not yet available in public databases. Foot microbiome diversity in P. pectinatus is described in SI.

Metagenomic and metatranscriptomic analyses
Sequenced gill cDNA libraries showed consistent read coverages of the co-assembled metatranscriptome and pairwise Pearson correlations of >0.8 across replicates ( Figure S5). A total of 1,563,787 transcripts were assembled, out of which 85% (average length 364 ± 262 bp) were without protein-coding region and functional annotation (SI). In all, 11% of the total transcripts (average length 989 ± 1181 bp) could be mapped to gene/protein homologs. These were grouped into 91,465 transcript clusters (loosely equivalent to genes), from which a subset (3%) mapped to the bacterial MAGs of interest. As such, it should be noted that the quality of gene/transcript annotations is heavily dependent on the completeness of the MAGs and the reference databases used. Although we made every effort to search for absent genes and pathways in the unbinned gill metagenomes, incompletely binned MAGs used to make inferences may still contain missing genes and functions. The lack of host genomic data and the high abundances of unclassifiable sequences in the gill metagenomes and metatranscriptomes imply that functional analyses can be skewed toward annotated genes/transcripts that would overlook novel genes [57]. Also, gene/transcript annotations based on homology may not be accurate predictors of reaction mechanisms [57] and even function (in the case of novel paralogs). Transcript quantification can also be influenced by swift changes in mRNA expression occurring between sample collection and fixation, as well as mRNA turnover that causes rapidly degrading mRNAs to exhibit inaccurately low transcripts per million (TPM) values.

Endosymbiont functions
Sixteen of the 30 most abundant bacteria-related transcript clusters could be mapped to Ca. Sedimenticola endophacoides, while 8 mapped to the species' relatives (Fig. 3a).
'+' denotes a feature annotated in a genome, whereas '−' denotes a feature not yet sequenced in a genome Function Ca. Sedimenticola endophacoides S. thiotaurini S. selenatireducens Ca.

Discussion
Systems-level approaches utilizing next-generation sequencing technologies successfully reveal host-microbe and microbe-microbe interactions in different invertebrate symbioses [73][74][75][76] but have not been widely applied to lucinid-bacteria chemosymbioses. Currently, the lack of genomic, transcriptomic, and proteomic data for lucinids hosting gammaproteobacterial clades B and C thioautotrophic endosymbionts results in a poor understanding of the metabolism, inter-and intra-species diversity, and molecular interactions between these partners that may impact their surrounding coastal habitat and other organisms in the environment. In this study, we focused on describing the gill microbiomes of the mangrove-dwelling P. pectinatus that hosts the poorly characterized clade C lucinid endosymbiont species. This is the first investigation to describe the functional repertoire of (1) a lucinid symbiont species belonging to clade C, (2) a lucinid clam, and (3) other bacterial species in a lucinid gill microbiome. Our comparative genomics analyses showed thioautotrophy, respiration, and nitrogen assimilation metabolic differences among the clade C P. pectinatus endosymbionts, clade A lucinid symbionts, and other thioautotrophic marine symbionts, while host transcriptomes revealed candidate genes putatively involved in symbiont/microbiome selection, regulation, and nutrient transfer. Metagenomic and metatranscriptomic analyses also uncovered consistency among members of the gill microbiome, including a Kistimonaslike species and a Spirochaeta-like species that have previously been associated with a variety of marine invertebrates but not yet been comprehensively studied in lucinid GOGAT glutamine oxoglutarate aminotransferase (glutamate synthase), Fd-GOGAT ferrodoxin-dependent glutamate synthase, Pst phosphate-specific transport, Pho phosphate regulon, PolyP polyphosphate granule, TBDT TonB-dependent transporter, ABC ATPbinding cassette transporters, DcuB C4-dicarboxylate uptake family transporter, SDH succinate dehydrogenase, FRD fumarate reductase, TCA cycle tricarboxylic acid cycle, TRAP tripartite ATP-independent periplasmic transport, ECF energy-coupling factor transporter clams. Additional insights into the lucinid-bacteria chemosymbiosis is now possible, and these findings may help in species conservation, habitat management [77][78][79], and even in fisheries productivity [80], which are areas of ongoing research.
Compared to previously sequenced lucinid clade A endosymbiont species and other thioautotrophic symbionts, Ca. Sedimenticola endophacoides encoded a unique combination of low-affinity type VI Sqr that functions best at high sulfide concentrations [81,82], form II RuBisCO that is less efficient at discriminating between oxygen and CO 2 [83], and the high affinity cbb3-type terminal oxidase that performs best at low oxygen concentrations [84]. These genomic differences suggest that Ca. Sedimenticola endophacoides experiences a more oxygen-poor extracellular and/or intracellular environment compared to Ca. Thiodiazotropha spp. Although pore water sulfide concentrations at Wildcat Cove were higher than previous studies [47], pore water dissolved oxygen concentrations were similar to those from sub-tropical coastal mangroves [85] and seagrass rhizomes [86] that have the potential to harbor lucinids. Sulfide and oxygen levels in the clam gills are likely regulated through hemoglobins, which can be partially saturated with oxygen [87]. As such, sulfide-reactive hemoglobin 1, which has a higher oxygen dissociation rate than oxygen-reactive hemoglobins 2 and 3, may be confined to the symbiotic mollusc gills [88]. In support of previous literature, we observed high expression levels of host-related hemoglobin 1, 2, and 3 genes responsible for sulfide and oxygen transport [88][89][90]. Despite genomic evidence for the maintenance of low intracellular oxygen that would be conducive for nitrogen fixation, which can contribute to the lucinid's diet and seagrass health [13,14,91], Ca. Sedimenticola endophacoides, unlike Ca. Thiodiazotropha spp., is likely incapable of diazotrophy. In lieu of nitrogen fixation, we speculate that Ca. Sedimenticola endophacoides may utilize urea and ammonium as its nitrogen source because these transcripts were detected.
Expression levels of autotrophy-related transcripts encoding RuBisCO and Calvin cycle enzymes in relation to other transcripts were much lower for Ca. Sedimenticola endophacoides than previously reported in Ca. Thiodiazotropha endoloripes [14] and other symbiotic bivalve species that expressed RuBisCO form Iaq [75,92], where these transcripts were among the most abundant in the transcriptomes. Low RuBisCO protein levels (~1%) were similarly observed in the tubeworm Riftia pachyptila thioautotrophic symbiont, which was discovered to produce proteins involved in an additional oxygen-sensitive reductive TCA cycle [93][94][95]. Although Ca. Sedimenticola endophacoides expressed genes encoding 2-oxoglutarate oxidoreductase that may reverse the 2-oxoglutarate to succinyl-CoA step in the TCA cycle, we did not identify any gene for citrate lyase or citryl-coenzyme A synthetase subunit that potentially converts citrate to oxaloacetate or acetate [93,95]. Mixotrophy has previously been inferred in Ca. Thiodiazotropha endoloripes [14], as well as thioautotrophic symbionts in a variety of other marine organisms [96][97][98], and is a likely possibility for Ca. Sedimenticola endophacoides because of encoded and expressed genes associated with the dicarboxylate transport and TCA cycle, as well as the correlation of P. pectinatus live abundances to sediment organic carbon content [26]. However, gene expression and geochemical data are insufficient support for proven mixotrophy, and more carbon assimilation experiments will be needed to determine such mechanisms in Ca. Sedimenticola endophacoides.
Besides Ca. Sedimenticola endophacoides, we also identified genes and transcripts belonging to other bacterial taxa in the P. pectinatus gill metagenomes and metatranscriptomes. Transcripts mapped to Ca. Thiodiazotropha endoloripes were noted in the gill metatransciptomes and could originate from unbinned contigs in the gill metagenomes or closely related species co-occurring in the gill microbial population. In all sequenced gill samples, we observed the consistent presence of a Kistimonas-like species related to the metabolically versatile Oceanospirillales species that can be symbiotic [99][100][101][102], parasitic [103], or pathogenic [51,104]. In bivalves, parasitic Oceanospirillales have been identified from nuclei in the vent mussel Bathymodiolus spp. [103]. Another Oceanospirillales species with unknown functions was also reported in gills from A. excavata [53]. Consistent with previous genomic reports on Oceanospirillales species, we observed high expression of various families of transposases in the Kistimonas-like species, which may facilitate rapid adaptation to new hosts or environments [105][106][107]. We also identified lower relative abundances of a Spirochaeta-like species in most gill samples, as well as transcriptional evidence of their activity. Spirochete species have been associated with a L. kazanilike lucinid [56], the symbiotic gutless oligochete worm Olavius [55,108], and episymbionts of the hydrothermal vent worm Alvinella pompejana [109].
Metatranscriptomic analyses showed that these three bacterial species may utilize distinct carbon sources. Specifically, Ca. Sedimenticola endophacoides may participate in mixotrophy in addition to thioautotrophy, whereas the Kistimonas-like species performs fermentation and fatty acid catabolism, and the Spirochaeta-like species breaks down chitin, sugars, and dicarboxylate compounds. To identify cellular locations of the Kistimonas-like and the Spirochaeta-like species within the host gill tissue, we designed multiple FISH probes targeting various 16S rRNA gene regions of the Kistimonas-like and Spirochaeta-like species, as these species showed positive DNA and cDNA amplification from gill specimens. However, in contrast to positive FISH signals for Ca. Sedimenticola endophacoides, we repeatedly failed to get unambiguous true positive signals for the Kistimonas-like and Spirochaeta-like species. This could be because of the low abundances of these species within the tissue samples, the hybridization efficiency of the designed probes, the resolution of the confocal microscopy, and/or other technical issues. Without microscopic data, we are unable to determine the location of these species and entirely rule out that they could be environmental contaminants, transient gill-filtered bacteria, pathogens, or parasites. More sensitive techniques, such as catalyzed reporter deposition-FISH [110] and hybridization chain reaction [111], should be performed to validate the presence of these bacteria species in the gills of P. pectinatus.
Our gill metatranscriptomic analyses also revealed potential host-microbiota interactions involved in the establishment and maintenance of lucinid-bacteria relationships. In P. pectinatus, transfer of nutrients, including carbon and possibly B vitamins and cofactors, from symbiont to host may predominantly occur via host lysosomal digestion. The high abundances of host-associated lysozyme-encoding transcripts observed in this study may indicate the presence of active lysosomes, supporting previous reports of lysosomes in the host gills [19] and in the vent mussel Bathymodiolus azoricus [75]. We speculate that host selection may include the secretion of bactericidal lysozyme and other compounds (SI), which can be countered by gill microbiome species. Presumably to decrease competition from closely related species/strains, as speculated in the Eupyrmna-Vibrio symbiosis [112], Ca. Sedimenticola endophacoides encoded and expressed genes for the production and secretion of bactericidal colicin [113], which were also annotated in the Kistimonas-like species MAG. A strongly expressed transcript cluster encoding a hypothetical filamentous hemagglutinin N-terminal domaincontaining iron-responsive protein responsible for adhesion to host tissues [66] was also observed in Ca. Sedimenticola endophacoides, while fatty acid synthesis and catabolismrelated genes encoding isocitrate lyase, BCKDH, and proteins within the methylcitrate cycle in Kistimonas-like species have been attributed to growth and virulence in other bacterial taxa [68][69][70][71][72]. Other genes associated with virulence and bacterial secretion systems were also detected in the genomes and transcriptomes of Ca. Sedimenticola endophacoides. However, their significance in the lucinidbacteria chemosymbiosis is unclear. Nevertheless, the speculated roles of bactericidal, adhesion, and virulence compounds would have to be tested using experimental approaches to better understand host selection and microbiome persistence.
Overall, this study provides insight into the metabolic functions and interactions of P. pectinatus, its thioautotrophic symbiont, and other gill microbiome species. Our discovery of distinct metabolic differences between the clade C endosymbiont, clade A lucinid symbionts, and other marine thioautotrophic symbionts, as well as the consistent presence and activity of other bacterial taxa in the gills, suggests that lucinid gill microbiome diversity is currently underrepresented in the literature and should warrant more investigative efforts, including additional host-microbiome meta-omics, imaging, and experimental studies. It is well established that the lucinid gill microbiome and their interactions with the host and/or the environment contribute to nutrient cycles in coastal marine sediments; however, many details have been lacking. Our metagenomic and metatranscriptomic analyses of mangrove-associated lucinid host and gill microbiome functions provide a systems biology perspective of host and microbiome physiology that is relevant to host-microbe and microbe-microbe interactions. histology staining; Dr. Terri Bruce and Rhonda Reigers Powell at the Clemson Light Imaging Facility for microscopy advice; Louie Alexander for assistance with FISH; Dr. Vincent Richards for MiSeq sequencing resources and advice on RNA-seq; and Clemson University's Palmetto cluster for computing resources.
Author contributions ASE, BJC, and LCA secured the funding for this study, supervised sample collection, and put in research efforts; SJL, BJC, ASE, and LCA collected the samples used in the study; SJL and BJC conceived the experiments; SJL performed most of the experiments, software implementation, data analyses, and wrote the manuscript. BGD performed qPCR analyses on the thioautotrophic symbiont and Kistimonas-like species; DEG and JW performed qPCR and PCR analyses on the Spirochaeta-like species; JW performed transcriptomic analyses on the Spirochaeta-like species; EN performed PCR analyses on the Kistimonas-like species; SJL maintains the NCBI sequence data and LCA curates the metadata and maintains specimens of dissected tissues and valves. BJC, ASE, and LCA reviewed and edited the manuscript; and all authors approved the final manuscript.
Data and specimen availability Sequence data were deposited in the National Center for Biotechnology Information [35] under the Bio-Project ID PRJNA368737. Accession numbers are listed in Table S3. Dissected specimen tissues and valves are cataloged at the South Dakota School of Mines and Technology, Museum of Geology, with details provided through the iDigBio portal (https://www.idigbio.org/ portal/recordsets/db3181c9-48dd-489f-96ab-a5888f5a938c).

Compliance with ethical standards
Conflict of interest The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons. org/licenses/by/4.0/.