Sponges have garnered considerable scientific attention for their ecological, evolutionary, and microbiological significance. These sessile benthic filter-feeders are the oldest extant multicellular animals [1] and collectively process thousands of liters of water per day per kilogram of tissue, facilitating the transfer of essential nutrients between the benthic and pelagic zones in aquatic ecosystems [2]. In the past few decades, marine sponges have additionally become known for harboring dense and phylogenetically diverse communities of microorganisms. In some sponges, sponge-associated microbes comprise up to 40% of sponge body weight and reach population densities up to 2–4 orders of magnitude higher than ambient seawater [3]. These microbial communities, often housed within the sponge mesohyl, contribute to host metabolism and produce antimicrobials and other biologically active compounds that are beneficial to the sponge host [2].

The vast majority of sponge microbiome studies have focused on marine sponges, which are highly diverse, widely distributed, and comprise more than 95% of documented sponge species [4]. Comparatively little attention has been given to the 248 currently described species of freshwater sponges [5], which provide similar ecosystem services in a diversity of freshwater habitats [6]. Most freshwater sponge research has focused on microbes associated with endemic sponges in Lake Baikal [7,8,9,10,11,12,13], though isolated studies have examined microbial communities and activity in Ephydatia fluviatilis [14,15,16], E. muelleri [17], Spongilla lacustris [18, 19], Eunapius carteri [20], Corvospongilla lapidosa [20], Tubella variabilis [5], and Metania reticulata [21]. These studies have found that, like marine sponges, freshwater sponges are primarily populated by Proteobacteria, Bacteroidetes, and Actinobacteria, though the resident microbial communities still differ substantially from those of marine sponges [5]. However, fewer than ten studies [5, 7, 8, 11, 17, 20] have examined freshwater sponge-associated microbial communities using next-generation sequencing approaches, and high-resolution taxonomic data on these communities therefore remains limited. In addition, no study has yet determined if freshwater sponge microbiomes differ from adjacent biofilms. This is another foundational consideration in sponge microbiome research, given that differences in microbiome composition between sponges and water could reflect differences between benthic and planktonic communities rather than functionally meaningful sponge-microbe associations.

The paucity of data on the freshwater sponge microbiome also means that key hypotheses regarding host-microbe specificity, host-microbe metabolic partnerships, and the environmental drivers of microbiome structure (reviewed in ref. [2]) remain poorly tested in freshwater sponges. For example, several studies have observed a distance-decay relationship in the marine sponge microbiome in which geographically proximate sponges (i.e., those from the same ocean or within a few hundred kilometers of each other) have more similar microbiota [22,23,24]. However, it is unclear if or how such relationships would be preserved in non-contiguous freshwater habitats. Tests for geographic variation in the freshwater sponge microbiome could therefore help identify conserved sponge-microbe relationships and elucidate the mechanisms through which freshwater sponges acquire their microbiomes.

Metagenomic and multi-omic studies of marine sponge microbiomes have additionally demonstrated characteristic functional features that are consistent among sponge-associated bacteria [25, 26], but no such techniques have yet been applied to freshwater sponges. Marine sponge-associated bacteria contribute to host metabolism by providing their hosts with fixed carbon and essential vitamins [27] and degrading sponge-derived ammonium [28]. They also have unique genomic features, including restriction/modification (RM) systems, transposases, clustered regularly interspaced short palindromic repeats (CRISPR), and eukaryotic-like protein domains [27, 29, 30], which are all believed to promote microbial survival in the sponge by helping the sponge discriminate among pathogens, mutualists, and food [2, 31]. Whether any of these features are preserved in freshwater sponges is unknown, and a better understanding of the freshwater sponge microbiome therefore requires the supplementation of 16S rRNA gene sequencing data with shotgun metagenomic data to provide more detailed analyses of functional potential.

In this study, we used 16S rRNA gene amplicon sequencing complemented with shotgun metagenomics to test three foundational hypotheses about the bacterial community associated with the freshwater sponge Ephydatia muelleri: (1) this species harbors bacterial communities that are distinct from both ambient water and adjacent biofilms; (2) these sponge-associated microbial features are conserved across ecologically similar but geographically isolated sponge populations; and (3) the freshwater sponge microbiome is enriched in many of the same symbiont-associated genetic features that characterize marine sponge microbiomes. We chose E. muelleri for these analyses because of its cosmopolitan distribution and extensive history in both in situ and laboratory studies [17]. Overall, our study represents the largest and most comprehensive investigation of freshwater sponge microbiomes to date and provides an important foundation for future research on the composition, function, and ecological role of freshwater sponge microbiomes.

Materials and methods

Sample collection

We collected sponge, water, and biofilm samples from the Sooke, Cowichan, and Nanaimo Rivers on southeastern Vancouver Island, British Columbia, Canada over three days in July 2018 (Fig. 1). All three rivers are short (<50 km) first- through third-order freshwater systems situated within temperate old-growth forests. The rivers are in separate watersheds and are not connected at any time of year. At our sampling locations, the rivers have rocky banks and cobble beds, interspersed with bedrock and boulders that provide substrate for various sponge species and other freshwater fauna and flora.

Fig. 1: Locations for sampling the freshwater sponge Ephydatia muelleri.

Map depicts southern Vancouver Island (British Columbia, Canada). Samples were collected in July 2018 from suitable habitat in the Sooke, Nanaimo, and Cowichan Rivers. Watersheds are shaded blue and sampling locations within each river are indicated in red.

All samples and measurements were taken within 50 m stretches of each river (Fig. 1). At each site, we measured water temperature, pH and physicochemistry as outlined in the Supplementary Methods. We then collected two sponge tissue samples approximately 1 cm in diameter from each of five individual sponges at depths of 0–2 m. Samples were collected from adult sponges with encrusting growth forms and included both the pinacoderm and endoderm. Sponges were chosen randomly and varied in color from tan to green (Fig. S1). One tissue sample of each pair, designated for microbiome analysis, was rinsed gently with distilled water to remove any attached debris and then flash-frozen in a dry ice and isopropanol slurry before being stored at −20 °C. The second tissue sample of each pair was fixed in 100% ethanol for species identification and stored at 4 °C. Species identifications were confirmed using both histologic and genomic methods (see Supplementary Methods).

To compare the sponge microbiome to its ambient environment, we also collected paired water and biofilm samples for each individual sampled sponge. We used sterile spatulas to scrape biofilm samples from the bedrock adjacent to each sponge and flash-froze the samples as before. A 500-ml water sample was collected next to each sponge using Whirl-Pak bags (Nasco, Canada). Water samples were stored in a cool and dark place for 2–4 hours during transport to a laboratory, where they were vacuum filtered through 0.2-μm cellulose acetate (CA) filters (MilliporeSigma, USA). We filtered 500 ml of distilled water as a negative control prior to filtering the samples. Filters were flash-frozen as before.

DNA extraction

Whole community DNA was extracted from the sponge tissue, membrane filters, and biofilm samples using the FastDNA Spin Kit for Soil (MP Biomedicals, Santa Ana, CA) following the manufacturer’s instructions, with an added 5-minute incubation at 50 °C prior to the final elution to maximize DNA yield. For the extractions, we thawed sponge samples and measured 100 mg of tissue (wet weight) into extraction tubes, and we aseptically cut the membrane filters into small pieces to extract DNA directly from the filters. All biofilm samples weighed less than 100 mg, so DNA was extracted from the entire available sample. Due to low yields of biofilm DNA, multiple biofilm samples from the same river were pooled and concentrated using an ethanol precipitation procedure to obtain a sufficient quantity of DNA for sequencing [32]. This process resulted in final biofilm sample numbers of 4, 3, and 1 for the Sooke, Nanaimo, and Cowichan Rivers, respectively.

16S rRNA gene sequencing and analysis

DNA samples were submitted to Microbiome Insights (Vancouver, BC) for 16S rRNA gene amplicon sequencing using the barcoded primers 515F and 806R [33] following previously described methods [34]. All DNA extraction and sequencing steps were performed with appropriate positive and negative controls to ensure the accuracy and replicability of sequencing results (see Supplementary Methods and Figs. S2, S3). Sequencing data were processed using the R package dada2 v1.6.0 [35] to produce amplicon sequence variants (ASVs) following a previously described protocol for read filtering, taxonomic assignment, contaminant removal, and phylogenetic tree construction [34] (see also Supplementary Methods). ASV abundances were then centered log ratio (CLR)-transformed to account for the compositional nature of sequencing data. The final feature table was imported into the R package phyloseq [36] for statistical analyses.
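As an illustration only, the CLR transformation mentioned above can be sketched in Python as follows. The study itself processed data in R; the pseudocount value, the function name, and the example counts are assumptions introduced here and are not taken from the study.

```python
import math

def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio (CLR) transform of one sample's ASV counts."""
    # Assumption: a pseudocount is added to every count so that zero
    # counts have a defined logarithm; 0.5 is an illustrative choice.
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    # Each CLR value is the log count minus the mean log count.
    return [value - mean_log for value in logs]

# Hypothetical ASV counts for a single sample.
example_counts = [120, 30, 0, 850]
example_clr = clr_transform(example_counts)
```

CLR values within a sample sum to zero, and the Euclidean distance between two CLR-transformed samples is the Aitchison distance referred to in the statistical analyses.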

Metagenome sequence analysis

Whole community DNA from three paired sponge and water samples from the Sooke River was additionally subjected to shotgun metagenome sequencing to explore the functional potential of the freshwater sponge microbiome. Paired-end genomic DNA libraries were prepared using the Nextera XT DNA Sample Preparation Kit (Illumina Inc., CA, USA). Libraries were multiplexed and sequenced using the NextSeq platform (Illumina) to generate 150-bp paired-end reads. Library preparation and sequencing were performed by the University of British Columbia Sequencing and Bioinformatics Consortium.

Read quality control was performed in four steps: adapter removal, read trimming, low complexity read removal, and host sequence removal. Procedural details for these processing steps are provided in the Supplementary Methods. Processed, merged reads were assembled with Megahit v1.0.3 [37], resulting in 1,225,502 contigs with a minimum length of 200 bp and a maximum length of 258,389 bp (N50 = 819 bp). We additionally recovered and annotated 25 non-redundant metagenome-assembled genomes (MAGs) using the compiled results from three different genome assembly pipelines as described in the Supplementary Methods. Protein-coding genes were functionally annotated against the Clusters of Orthologous Groups (COGs) of Proteins database [38]. For downstream analyses, the abundance of each gene or COG was imputed as the abundance of the contig to which it was assigned.
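For readers unfamiliar with the N50 statistic reported above, the following minimal Python sketch shows how it is conventionally computed from a list of contig lengths. The function name and the example lengths are hypothetical; this is not the assembler's own code.

```python
def n50(contig_lengths):
    """Return the N50: the contig length L such that contigs of
    length >= L together contain at least half of all assembled bases."""
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    # Accumulate lengths from the longest contig downward.
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")

# Hypothetical contig lengths in base pairs.
example_lengths = [200, 300, 500, 819, 1200, 2500]
example_n50 = n50(example_lengths)
```

An N50 close to the minimum contig length, as in the assembly described here, indicates that most assembled bases sit in short contigs.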

Statistical analysis

All statistical analyses were performed in R 3.6.2 [39] and are described in detail in the Supplementary Methods. We tested for sponge-specific bacterial communities (hypothesis #1) by analyzing the 16S rRNA amplicon data across our three sample types, independent of river, and we tested for geographic variation among sampling locations (hypothesis #2) by comparing the amplicon data across sponge and water samples separately for our three different rivers. Species richness and Shannon diversity were calculated for all samples using iNEXT [40] (see Fig. S4), and significant differences among sample types and rivers were evaluated using an ANOVA with Tukey’s post hoc test. We tested for differences in microbiome composition among sample types and rivers using Aitchison distance-based permutational analyses of variance (PERMANOVA) and random forest models. Taxa that were differentially abundant among sample types and rivers were also identified at the phylum, class, family, and genus levels by pooling ASVs in common taxa. Differential abundance was tested using ALDEx2 [41], and p-values were adjusted using the Benjamini-Hochberg correction. We lastly used Spearman’s correlation to assess the relationship between ASV relative abundances in sponges compared to water and biofilm samples. All taxon relative abundances are reported in percentages as means ± standard deviation, and statistical significance for all comparisons was defined at p < 0.05.
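The Benjamini-Hochberg correction named above can be illustrated with a short Python sketch. It follows the standard step-up procedure; the function name and the example p-values are hypothetical, and the study performed this adjustment in R rather than with this code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four taxa.
example_p = [0.01, 0.04, 0.03, 0.20]
example_adjusted = benjamini_hochberg(example_p)
```

A taxon would then be called differentially abundant when its adjusted p-value falls below the 0.05 threshold stated above.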

We expanded our geographic comparison of sponges in different rivers (hypothesis #2) to include publicly available 16S rRNA amplicon sequencing data from 11 other E. muelleri samples collected from North America and the United Kingdom [17]. These included three unhatched gemmule samples (clusters of embryonic cells produced as a form of asexual reproduction), five gemmule samples hatched in a laboratory setting, and three adult sponges. To minimize any potential noise caused by different extraction and sequencing methodologies [42, 43], sequencing data from the two experiments were clustered into operational taxonomic units (OTUs) at 97% identity (see Supplementary Methods). Samples from the two experiments were then compared using the same analyses described above.

We used the metagenome data from Sooke River samples to search for shared functional characteristics among sponge-associated taxa (hypothesis #3). We tested for significant taxonomic and functional differences between sponges and water samples using Aitchison and Euclidean distance-based PERMANOVAs, respectively, and identified differentially abundant COGs using edgeR [44]. Functional categories that were significantly over-represented among the differentially abundant COGs were identified using a hypergeometric over-representation test. We also searched for genomic features linked to taxa implicated in the 16S rRNA gene amplicon analysis, including chitin degradation, steroid degradation, and nitrogen cycle genes, using hidden Markov models (HMMs) or BLASTp analysis as described in the Supplementary Methods.
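A hypergeometric over-representation test of the kind described above asks how likely it is to see at least k COGs from a given functional category among n differentially abundant COGs, when K of the N tested COGs belong to that category. The Python sketch below is one way to compute that upper-tail probability; the parameter names and the example numbers are hypothetical and do not come from the study.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """Upper-tail probability P(X >= k) for a hypergeometric draw.

    N: total COGs tested; K: COGs in the category of interest;
    n: differentially abundant COGs; k: those that fall in the category.
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the probabilities of observing k, k+1, ..., upper category members.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical example: 20 COGs tested, 5 in the category,
# 6 differentially abundant, 3 of which are in the category.
example_p_value = hypergeom_enrichment_p(k=3, n=6, K=5, N=20)
```

In practice the resulting p-values for all categories would also be adjusted for multiple testing.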

We identified key symbiosis features in the MAGs by classifying MAGs as either sponge- or water-associated based on the sample type in which they were more relatively abundant. We then compared sponge- and water-associated MAGs by (1) testing for significant differences in basic genome features (genome size, etc.); (2) performing PCA on whole-genome COG profiles to test for broad genomic differences; (3) using logistic regression models to identify COGs most strongly associated with sponge-associated MAGs independent of any underlying phylogenetic differences among taxa; and (4) searching for a “core” set of COGs present in all sponge-associated MAGs and a “unique” set of COGs absent from water-associated MAGs. We also obtained COG profiles for the closest relatives of our sponge-associated MAGs and used paired t-tests to identify genomic features enriched in sponge-associated MAGs relative to these references. These analyses are described in the Supplementary Methods.
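Step (4) above reduces to set operations on per-MAG COG profiles, which the following Python sketch illustrates. The function name, the `min_sponge_mags` parameter, the MAG names, and the COG identifiers are placeholders introduced for illustration; the actual procedure is the one described in the Supplementary Methods.

```python
from collections import Counter

def core_and_unique_cogs(sponge_mags, water_mags, min_sponge_mags=1):
    """Each argument maps a MAG name to the set of COG IDs it encodes."""
    # Count how many sponge-associated MAGs carry each COG.
    counts = Counter(cog for cogs in sponge_mags.values() for cog in cogs)
    # Core: COGs present in every sponge-associated MAG.
    core = {cog for cog, n in counts.items() if n == len(sponge_mags)}
    # Unique: COGs in enough sponge MAGs and absent from all water MAGs.
    in_water = set().union(*water_mags.values())
    unique = {
        cog for cog, n in counts.items()
        if n >= min_sponge_mags and cog not in in_water
    }
    return core, unique

# Placeholder MAG names and COG identifiers.
sponge = {
    "MAG_A": {"COG0001", "COG0002", "COG0003"},
    "MAG_B": {"COG0001", "COG0003", "COG0004"},
}
water = {"MAG_C": {"COG0001", "COG0004", "COG0005"}}
core, unique = core_and_unique_cogs(sponge, water)
```

Raising `min_sponge_mags` restricts the unique set to COGs shared by several sponge-associated MAGs, which guards against features carried by a single genome.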


Results

We sampled sponges, water, and biofilms from the Sooke, Nanaimo, and Cowichan Rivers. Due to low DNA yield from some samples, final sample sizes for statistical analyses were 15, 14, and 8 for sponges, water, and biofilms, respectively. All three rivers exhibited similar limnological parameters, with the Sooke River containing slightly more total nitrogen and total dissolved solids (Table S1). Morphological analysis of sponge spicules confirmed that the sponges were E. muelleri (Fig. S5), and species identification was further confirmed by the detection of two 258-bp 16S rRNA gene amplicons that were present at high abundances in all sponge samples (>2800 reads per sample) and shared 100% identity with 16S rRNA genes in the E. muelleri mitochondrial genome (Fig. S6). In addition, mitochondrial genomes assembled from the three Sooke River sponges used for metagenome sequencing were 99.8% identical to the published E. muelleri mitochondrial genome [45] (Fig. S6).

Microbiome diversity, structure, and composition

We first tested whether the diversity or composition of the freshwater sponge bacterial microbiome differed from the ambient environment. Our 16S rRNA gene sequencing efforts resulted in 2545 unique ASVs from an average of 18,077 ± 6334 high-quality reads per sample. Based on extrapolated species richness estimates, sponges hosted an average of 881 ± 215 ASVs per individual, significantly fewer than both the paired water samples and adjacent biofilms (Fig. 2a, Table S2, ANOVA F = 9.04, df = 2, p < 0.001). Shannon diversity was also significantly lower in sponges than in both water and biofilm samples (Fig. 2b, Table S2; ANOVA F = 16.73, df = 2, p < 0.001). With respect to community composition (beta diversity), we observed strong and significant clustering of sample types (sponge, water, and biofilm) independent of sampling location (Fig. 2c; PERMANOVA F = 6.33, df = 2, R² = 0.27, p < 0.001), with sponges and water also showing significantly lower multivariate dispersion than biofilms (Table S3; F = 8.28, df = 2, p = 0.003). These clusters were robust to the choice of dissimilarity index and were also evident when each sampling site was evaluated separately (Figs. S7, S8). Random forest models were able to discriminate sponge samples from water and biofilms with 97.3% accuracy (Fig. S9).

Fig. 2: Alpha and beta diversity across sponges, water, and biofilms.

Boxplots of (a) ASV richness and (b) Shannon diversity calculated via rarefaction and extrapolation, shown for all samples and for each sampling location. Letters represent significant pairwise contrasts (p < 0.05) between sample types (sponge, water, biofilm) within each grouping. Pairwise comparisons were performed using Tukey’s HSD post hoc test. The single biofilm sample from the Cowichan River was not included in any pairwise comparisons, and detailed results for all pairwise comparisons are presented in Table S2. (c) Aitchison distance-based ordination showing significant differences among sample types. Colored lines show 95% confidence ellipses around sample types, and black dashed lines show 95% confidence ellipses around sponge and water samples separately for each river.

With respect to individual taxa, ASV relative abundances were more strongly correlated between sponge and water samples (Spearman’s r = 0.384) than between sponges and biofilms (r = 0.119; Fig. 3), though all samples were dominated by the same five phyla (Proteobacteria, Bacteroidetes, Actinobacteria, Verrucomicrobia, and Cyanobacteria; Fig. 4). Overall, sponges harbored a core microbiome of 92 ASVs present in at least 12 sponge samples (Table S4), as well as 134 ASVs not detected in water or biofilm samples (Fig. S10, Table S5). Taxa that were present in the core microbiome and significantly more relatively abundant in sponges included Comamonas, Diaphorobacter, Methylotenera, Rhodoferax, Rhodospirillales, and Sediminibacterium (Table S6). Three of these taxa accounted for 58% of the sponge microbiome: unclassified Rhodospirillales (22.9% ± 7.4), Sediminibacterium (22.9% ± 19.0), and Comamonas (12.5% ± 7.2). A sequence similarity analysis using the BLASTN algorithm revealed that the most abundant Sediminibacterium and Rhodospirillales sequences closely resembled uncultured bacteria from Lake Baikal sponges (sequence identities ≥98.42%). Notably, the dominant Sediminibacterium ASV in sponge-associated communities was different from the dominant Sediminibacterium ASV in water or biofilm samples (Table S7).
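Spearman's correlation, used above to compare ASV abundances between sample types, is the Pearson correlation of the ranked values. The Python sketch below shows this with average ranks for ties, which matter here because many ASVs have zero abundance in one sample type. The function names and example abundances are hypothetical and are not the study's data.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)

# Hypothetical mean relative abundances (%) of five ASVs.
sponge_abund = [22.9, 12.5, 0.0, 1.0, 0.2]
water_abund = [3.1, 0.4, 2.2, 0.0, 0.0]
rho = spearman(sponge_abund, water_abund)
```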

Fig. 3: ASV abundance correlations between sponges and water or biofilms.

ASV abundances in sponges were plotted as a function of ASV abundances in water (left) or biofilms (right). Spearman’s correlation coefficient is given on top of each graph. Colored dots indicate ASVs that were significantly more abundant in sponges (orange), water (blue) or biofilms (green) after Benjamini-Hochberg correction. Selected ASVs are labeled with their lowest taxonomic affiliation. Hash marks along the x- and y-axes indicate ASVs with zero abundance in one sample type but non-zero abundance in the other.

Fig. 4: Microbial composition of sponge, water, and biofilm samples.

Average relative abundances (in percent) of different bacterial phyla based on ASV counts for each sample type at each sampling location. The two most abundant phyla, Bacteroidetes and Proteobacteria, are further subdivided into classes.

Geographic variation

We also tested the extent to which the sponge-associated microbiome was conserved across different freshwater habitats. Richness estimates for the sponge samples were only marginally different among rivers (Fig. 2a; ANOVA F = 3.69, df = 2, p = 0.056) despite considerable variation in water samples among rivers (F = 92.7, df = 2, p < 0.001). Shannon diversity of both sponge and water samples varied significantly among rivers (Fig. 2b; F = 15.5, df = 2, p < 0.001). Overall microbiome composition differed significantly among sponges from different rivers (Fig. 2c, Fig. S8; PERMANOVA F = 4.5, df = 2, R² = 0.363, p < 0.001), and random forest models perfectly discriminated sponges based on their river of origin (Fig. S9). Our samples also differed in microbiome composition relative to E. muelleri sponges and gemmules from a previous study covering a wider geographic range [17]. ASV richness and diversity estimates were consistent between samples from both studies; however, the microbiome of samples from Kenny et al. [17] most closely resembled the microbiome of biofilms from this study in beta-diversity analyses (Fig. S11). Gemmule samples from Kenny et al. that were collected upstream of the Sooke River samples described here were also microbiologically distinct from our Sooke River sponge samples (Figs. S12, S13).

In this study, sponges from different rivers could be distinguished based on significantly higher relative abundances of select bacterial taxa: Pseudarcicella, Flavobacterium, and Fluviicola in Sooke sponges; Polynucleobacter in Nanaimo sponges; and Sediminibacterium and Parcubacteria in Cowichan sponges (Fig. S14). There were also 45 ASVs present with ≥0.01% mean relative abundance in at least four sponges from one river but undetectable in sponges from the other rivers (Fig. S10). Most of these “river-specific” ASVs were assigned to Proteobacteria or Bacteroidetes, and three were not detected in any water or biofilm samples (Table S8). Notably, among-river differences in the sponge microbiome were not consistently reflected in water samples: for example, the relative abundance of Sediminibacterium varied by 14-fold among sponges from different rivers but was not significantly different among water samples. Similarly, Cowichan River sponges contained significantly more Parcubacteria and Turneriella than other sponges, but these taxa were not significantly more relatively abundant in Cowichan River water samples compared to other rivers (Fig. S14).

Functional signatures

We explored the functional potential of the sponge microbiome by performing shotgun metagenome sequencing on three sponge and three water samples from the Sooke River. Host and eukaryotic DNA accounted for 88.8% ± 2.4 of reads from sponge samples and 0.41% ± 0.15 of reads from water samples (Table S9). After removing these sequences, we obtained an average of 1.10 ± 0.44 Gbp per sponge sample and 3.29 ± 1.69 Gbp per water sample. Bacteria comprised >99% of the sequences in each sample and were distributed among the same dominant phyla as were the amplicon sequences (Fig. S15).

Metagenome-based taxonomic and functional profiles were significantly different between sponges and water in ordination analyses (Fig. S15). Remarkably, 924 of the 4,125 predicted functions (COGs) in the metagenomes were significantly more relatively abundant in sponges, including the COG categories “mobilome: prophages, transposons” and “defense mechanisms” (Fig. S16; Table S10). In comparison, only 493 COGs were significantly more relatively abundant in water samples, mostly related to the COG class “cell motility” (Fig. S16; Table S10). The most abundant sponge-enriched COGs included bacterial defense mechanisms (CRISPR proteins, type IV secretion systems, RM systems, and transposases), eukaryote-like motifs (zinc fingers, ankyrin repeats, WD40 repeats), glutamine synthetase, and serine/threonine protein kinase (Fig. 5, Fig. S17). Conversely, the sponge microbiome was significantly depleted in various drug efflux pumps and spore/capsule genes (Fig. 5, Fig. S17).

Fig. 5: COGs that were differentially abundant between sponge and water samples.

Heat map shows natural log-transformed RPKM values for a subset of the 1417 COGs that were differentially abundant (|log2 fold-change| > 1 and Benjamini-Hochberg-adjusted p < 0.05) between sponge (SKE) and water (SKEW) samples. This subset of COGs was chosen because the associated functions are consistently implicated in studies of the marine sponge microbiome (e.g., [27, 57, 58, 71]). COGs are organized by general functional categories.

Many sponge-enriched functions were associated with diverse Proteobacterial lineages, with Alphaproteobacteria carrying most of the vitamin B12 biosynthesis genes, Comamonadaceae most of the CRISPR genes, and Burkholderiales and Comamonadaceae most of the transposase-related genes (Fig. S18). We noted that transposase-related COGs were also abundant in Bacteroidia, Actinobacteria also carried many vitamin B12 synthesis genes, and CRISPR genes were conspicuously absent from Planctomycetes. Sponge-associated bacteria have been implicated in the nitrogen cycle, and we found that genes for nitrogen fixation and nitrate reduction were highly abundant in both sponges and water. Nitrate reduction genes were more commonly associated with Bacteroidetes in sponge samples than in water samples (Fig. S19, Table S11).

Differential abundance analysis of COGs assigned to the three most abundant sponge-associated taxa (Chitinophagaceae, Comamonadaceae, and Rhodospirillaceae) showed that sponge-associated Chitinophagaceae were enriched in transposons, sponge-associated Comamonadaceae in defense mechanisms (including CRISPR and intracellular trafficking), and sponge-associated Rhodospirillaceae in defense mechanisms and cell motility COGs (Table S12). Sponge-associated Sediminibacterium contigs did not carry any COGs that were significantly differentially abundant compared to COGs from water-associated contigs. Some Sediminibacterium species can degrade chitin [46] and members of the Comamonas testosteroni species can degrade steroids [47]; however, there was no evidence that chitin or steroid degradation genes were enriched in sponges (Fig. S20).

Metagenome-assembled genomes

Our sequencing efforts also produced 25 medium- to high-quality MAGs from four of the most abundant bacterial phyla in our study, including two MAGs assigned to Sediminibacterium (Table S13). The MAGs collectively encoded 3077 of the 4125 functions found in the shotgun metagenomes, and twelve of the MAGs were more relatively abundant in sponge samples (“sponge-associated MAGs”) (Fig. 6). Sponge-associated MAGs had slightly smaller genomes with lower GC content than water-associated MAGs, though these differences were not significant (Table S14). COG profiles varied significantly among MAGs from different phyla and were weakly linked with sample association (sponge vs. water) in both constrained and unconstrained ordination analyses (Fig. S21; Table S15). All sponge-associated MAGs shared a core genome of 134 COGs that primarily included genes for general cellular functions such as transcription, translation, and DNA replication (Table S16).

Fig. 6: Genomic composition of metagenome-assembled genomes.

Relative abundances of each metagenome-assembled genome (MAG) in sponge and water samples are shown on the left. On the right, the number of copies of each COG in each MAG is shown for select groups of COGs that were differentially abundant between sponge and water samples (see Fig. 5). Abbreviations: Str. m., structural motifs; R/M, restriction/modification.

The phylogenetic diversity, small number, and moderate completeness values (59–98%) of these MAGs precluded detailed taxon-wise phylogenomic analysis [48]; however, multiple lines of evidence demonstrated that sponge-associated MAGs were generally enriched in many of the same genes implicated in the shotgun metagenome analysis, including CRISPR-related genes, vitamin B12 biosynthesis genes, serine/threonine protein kinase, transposases, and various secretory and transport proteins (Fig. 6). Specifically, these genes were (i) implicated in logistic regression models that differentiated sponge- and water-associated MAGs while controlling for natural genomic differences among phyla (Figs. S22, S23 and Tables S17, S18); (ii) present among the 25 COGs that were found in at least four sponge-associated MAGs but absent from all water-associated MAGs (Tables S18, S19); and (iii) enriched in sponge-associated MAGs relative to closely related reference genomes from other freshwater environments (Fig. S24; Tables S18, S20). Many of these genes were distributed across MAGs from both Proteobacteria and Bacteroidetes (Fig. 6). Nutrient metabolisms also varied among MAGs, but nutrient-cycling genes were not consistently implicated in any tests for gene enrichment in sponge- or water-associated MAGs. However, we noted that Betaproteobacteria and Bacteroidetes MAGs encoded several carbohydrate degradation pathways, including chitin degradation, that were largely absent from Alphaproteobacteria MAGs (Fig. S25).


Discussion

We found that bacterial communities associated with the freshwater sponge Ephydatia muelleri in Vancouver Island rivers were (1) significantly different from communities found in ambient water and adjacent biofilms and (2) largely conserved among rivers, though they could still be distinguished by their river of origin based on a small number of microbial features. Metagenome and MAG profiles from a subset of our samples further showed that (3) freshwater sponge-associated bacteria carry many of the same genetic signatures that have been considered indicative of symbiosis in the marine sponge microbiota, including an abundance of genes related to bacterial defense (e.g., CRISPR-associated genes, RM systems, and secretion systems) as well as genes to produce vitamin B12, which is an essential vitamin for the sponge host [2, 25,26,27, 29]. Overall, our observation that the structure, composition, and metagenomic potential of the E. muelleri microbiome are largely consistent with previous studies of freshwater sponges in Lake Baikal [7,8,9,10,11,12,13] and elsewhere [14,15,16, 18,19,20,21], and with studies of marine sponge microbiomes [3, 49,50,51], supports the existence of evolutionarily conserved sponge-bacteria associations with ecological implications in freshwater environments.

The consistent dominance of Sediminibacterium, Rhodospirillales, and Comamonas in the microbiome of healthy E. muelleri, and the similarities between these sequences and sequences from sponges in Lake Baikal, suggest that these three taxa likely participate in geographically conserved host-microbe interactions. Although contigs and MAGs assigned to these taxa were universally enriched in defense genes, as is common for sponge symbionts [52], we found only moderate evidence for lineage-specific nutrient metabolisms. For example, some Sediminibacterium species can degrade chitin and other complex polymers [46], which form the skeleton of many marine and freshwater sponges (reviewed in refs. [53, 54]). We found that chitinase genes were not significantly enriched in sponges, though we did find that one of the two sponge-associated Sediminibacterium MAGs encoded several carbohydrate degradation genes that were absent from closely related reference genomes.

Sponge-specific metabolic activities in Rhodospirillales and Comamonas species were equally difficult to identify. Several members of the Comamonas testosteroni species are known steroid degraders [47, 55], but steroid degradation genes were neither significantly enriched in sponges nor present in the two sponge-associated Comamonadaceae MAGs. Instead, these two MAGs encoded digestive enzymes for various complex carbohydrates and cellulose derivatives, which are a major component of the dissolved organic matter filtered by sponges [56, 57]. Rhodospirillales have been shown to metabolize sponge-derived sulfur and nitrogen compounds [58], but genes encoding this metabolism were not strongly represented in our metagenome or MAG analyses. Overall, we believe that the absence of strong signatures for sponge-specific metabolic activity in these taxa, combined with their consistent genomic signatures of sponge symbionts, indicates that the sponge-associated representatives of these taxa may perform metabolic activities similar to those of their free-living counterparts but have genetically adapted to a symbiotic lifestyle. Deeper sequencing to produce more MAGs, along with metatranscriptomic studies of in situ activity, will be needed to conclusively assign functions to these highly abundant taxa.

Although the E. muelleri microbiome was largely conserved among different rivers, we also found river-specific microbial signatures that were not matched by differences in the ambient environment. Similar patterns of phylum-level similarity and phylotype-level variation have been observed in several marine sponge species across much larger geographic distances [24, 59,60,61]. In some cases, site-specific microbial signatures can be driven by environmental factors; for example, sponges that receive more sunlight harbor more phototrophic symbionts [62], and changes in the sponge microbiome help sponges tolerate lower pH [63]. Although water physicochemical profiles were broadly similar among our three sample sites, it remains possible that environmental factors we did not measure, including variation in light regimes, nutrient inputs, host genotype, and sponge health, contribute to geographic variation in microbiome composition (e.g., [23, 24, 51]). Microbial dispersal limitation may also drive this variation: because the rivers we studied are not hydrologically connected, microbial dispersal among rivers is extremely limited, and sponges are mainly exposed to the taxa present in the river they inhabit.

These river-specific microbial signatures thus raise the additional question of how E. muelleri acquires its microbial complement. Sponge-associated bacteria can generally be acquired via a combination of horizontal transmission directly from the ambient environment and vertical transmission from parent to progeny [64,65,66,67], though our results implicate horizontal transmission as the primary mechanism of symbiont acquisition in E. muelleri. Because E. muelleri reproduces via gemmules, vertically transmitted symbionts would need to be located inside or on the surface of the gemmule (e.g., [68, 69]). However, the gemmules sequenced by Kenny et al. [17] most closely resembled the rock-associated biofilms analyzed here rather than the adult sponges we surveyed, suggesting that gemmules are primarily colonized by ambient biofilm-forming bacteria. Moreover, only a small number of sponge-associated ASVs in our study were absent from the ambient environment, and the correlation between ASV abundances in sponge and water samples further suggests that sponge-associated microbes are acquired from the water. Even so, the presence of Sediminibacterium sequences in the Sooke River gemmules analyzed by Kenny et al. [17], and the fact that the dominant sponge-associated Sediminibacterium ASVs were distinct from those found in ambient water, hint at the potential for vertical transmission of select taxa, which should be further investigated.

Our metagenomic and MAG-based analyses additionally demonstrated that sponge-associated microbes are enriched in many of the genomic features that are considered indicators of a symbiotic lifestyle in marine sponge-associated bacteria. Features that were shared across diverse sponge-associated taxa included eukaryote-like proteins (e.g., ankyrin repeats and WD40 repeats), which are believed to mediate sponge-microbe interactions [25,26,27] and protect sponge symbionts from being digested by the host [31], as well as various defense mechanisms against foreign DNA, such as RM enzymes and CRISPR genes [29, 52]. One notable exception was the absence of defense-related genes in Planctomycetes, a finding that has also been reported in marine sponges and suggests that sponge-associated Planctomycetes carry some other mechanism for avoiding infection [57]. Sponge-associated Proteobacteria and Actinobacteria were enriched in genes for vitamin B12 synthesis, indicating a possible role in provisioning this essential vitamin to the sponge host. Although previous genomic analyses have identified biosynthesis genes for several B-vitamins in marine sponge microbiomes [27, 70], genes for thiamine (B1), riboflavin (B2), and biotin (B7) biosynthesis were not differentially abundant between sponges and water in our study. In addition, most sponge-associated lineages contained significantly fewer genes for flagellar biosynthesis, which suggests adaptation to a stationary lifestyle [71]. Overall, the presence of these genomic signatures in freshwater sponge-associated bacteria provides strong evidence for mutualistic associations with sponge hosts, rather than commensal associations or acquisition by selective filter-feeding.

There were also several features of the sponge microbiomes that appear to be unique to, and driven by, the freshwater environment we studied. We found no evidence that nitrogen metabolism-related functions were enriched in sponge-associated bacteria, and although nitrifiers and ammonia-oxidizing bacteria are ubiquitous in marine sponge microbiomes [28, 72], they were nearly absent from both the sponge and water samples in our study. Instead, genes for nitrogen fixation and dissimilatory nitrate reduction, leading to bioavailable ammonium, were over 100 times more abundant than amoA in both sponge and water samples. In marine sponges, nitrification is associated with symbiont removal and uptake of ammonia-rich sponge waste [25, 73, 74]; we hypothesize that the ammonium-producing activity in the freshwater environment arises from the oligotrophic nature of the system (see Table S1) relative to marine sponge habitats [75, 76], as both sponge and water bacterial communities assimilate the scarce supply of inorganic nitrogen. Multidrug efflux pumps and sporulation-associated proteins are also frequently found to be abundant in marine sponge-associated bacteria [29, 77, 78] but were less abundant in the E. muelleri microbiome than in the ambient water, indicating that the rivers may also be less rich in antimicrobial compounds and stressors than the marine environment.


Although marine sponge microbiomes have been the foundation of sponge microbiome research over the past decades, here we have shown that the freshwater sponge E. muelleri also harbors a unique and geographically variable microbiome with genomic features common to the marine sponge microbiota. Our results suggest that, as with marine sponges, the freshwater sponge microbiome may degrade sponge-derived compounds and provide nutrients to the sponge host while also carrying genes that enable microbe-eukaryote mutualisms and protect the bacteria from host immune defenses. These microbial activities may promote sponge health and, consequently, the health of freshwater ecosystems. However, specific functional relationships between freshwater sponges and their microbiota remain to be delineated before the ecological and evolutionary significance of these largely understudied microbial communities can be fully understood.