Introduction

Organisms are hosts to complex communities of microorganisms collectively known as the microbiota. Species in the microbiota may provide beneficial functions to the host and other species in the microbiota, including nutrient acquisition and uptake1, production of accessible metabolites2, host immune system priming3, and direct pathogen protection4. However, characterizing beneficial host-microbe and microbe-microbe interactions in biologically relevant systems is difficult due to the number of core species present in the microbiota and the variation in an individual organism’s microbiota over time from dietary changes or differential environmental exposure5.

The zooplankton Daphnia magna provides a useful model for studying functional relationships between microbes and their hosts. Daphnia species are used as a model system in ecology, ecotoxicology, and host-parasite dynamics due to their well-documented life cycle and rapid asexual reproduction6. The ability to raise Daphnia magna clonally allows for the use of genetically identical hosts, reducing the impact of genetic variation on the microbiota. Furthermore, their indiscriminate filter feeding allows for control over food input. Daphnia are colonized with bacteria throughout their entire body cavity and gut7,8. Composition of the Daphnia microbiota appears to be similar in spatially distinct populations8,9, suggesting mechanisms of acquisition and cultivation of these microbes by the host. The Daphnia microbiota is relatively simple at the class level, with β-proteobacteria, γ-proteobacteria, and Flavobacteriia consistently identified at high relative abundances9,10. However, relatively little work has extended past coarse-level 16S rRNA analysis for identification and characterization of bacterial taxa present in the Daphnia microbiota7. The genomic content of Daphnia-associated bacteria is largely unknown due to the amplicon sequencing methods used in prior research, and we cannot make statements about the metabolic potential of these bacteria due to the relative scarcity of full genomes for closely related species.

Beyond identifying the composition of the microbiota, it is clear that this composition affects host fitness. The Daphnia microbiota has been implicated in nutrient acquisition and breakdown of toxic compounds11, and survival and growth are affected by the presence of, and perturbations to, the microbiota12,13,14. Host fecundity has been specifically tied to Limnohabitans, an abundant genus in the Daphnia microbiota15. However, the functions underlying the beneficial Limnohabitans-Daphnia association are unknown, and how other species in the Daphnia microbiota contribute to host life history and what functions these species may be providing is entirely unclear.

Here, we use shotgun metagenomics to characterize the bacterial species present in the Daphnia magna microbiota. The metagenome-assembled genomes (MAGs) generated from this data were then used to examine potential metabolic functions of the microbiota in total and for each species assembled. This study is the first to report MAGs of bacteria in the Daphnia magna microbiota and their associated gene content. This increased resolution allows us to formulate testable hypotheses about the metabolic interactions happening between the host and microbes, and among microbes, that impact host fitness. This functional knowledge will provide a new lens for studying this important ecological model system.

Results

Shotgun sequencing, assembly, and binning

Shotgun sequencing resulted in 20.19 Mbp of paired-end Illumina read data, which was then reduced to 9.64 Mbp after quality trimming and host genome filtering. A co-assembly of all samples generated 174,991 contigs (N50 of 1,923 bp; longest assembled contig 226,522 bp). Summaries of assembly statistics for the co-assemblies and the individual sample assemblies can be found in Table 1. Taxonomic classification of high-quality reads and assembled contigs of ≥1000 bp using Kraken, Kaiju, and MetaPhlAn2 indicated 18 genera that each accounted for more than 1% of the sample (Supplementary Fig. 1). Of the 18 genera, only Limnohabitans was identified by all three tools in reads and contigs. Other abundant genera included Pedobacter, Flavobacterium, Polaromonas and unclassified Burkholderiaceae. 16S rRNA sequencing of the library preparation and DNA extraction kits resulted in <50 reads (Fig. 1a). 16S rRNA community profiles of the Daphnia magna food source, Chlamydomonas reinhardtii, the COMBO culture media for Daphnia magna, and samples of 5 healthy adult Daphnia magna showed differences in composition (Fig. 1a), which were then confirmed using a principal coordinate analysis of the unweighted UniFrac distances between samples (Supplementary Fig. 2). Chlamydomonas samples showed reduced relative abundance of Proteobacteria as compared to adult Daphnia and COMBO; COMBO showed higher relative abundance of Actinobacteria; and healthy Daphnia magna were primarily colonized by Proteobacteria and Bacteroidetes.
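The N50 reported above is a length-weighted summary of contig sizes. As a minimal sketch of how such a value is typically computed (the function name and the toy contig lengths are illustrative assumptions, not data from this study):

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    The N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Toy example with made-up contig lengths (bp), not values from this study.
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))  # 400: the 500 bp and 400 bp contigs cover 900 of 1500 bp
```

Because the statistic is weighted by length, a few long contigs can yield a high N50 even when most contigs are short, which is why it is reported alongside the contig count and the longest contig.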

Table 1 Summary of assembly statistics for individual sample assemblies and co-assemblies of all four samples (master co-assembly), the two adult samples (adult co-assembly), and the two juvenile samples (juvenile co-assembly).
Figure 1

Comparison of bacterial community profiles of Daphnia magna, Chlamydomonas reinhardtii, COMBO media, and negative controls using 16S rRNA sequencing. (a) Phylum-level 16S rRNA profiles of Chlamydomonas reinhardtii (Chlamy, n = 4), COMBO culture media (Culture, n = 4), and the DNA extraction kit and library preparation kit used for sequencing (Kit (−), n = 2). (b) Relative abundance of phyla generated from 16S rRNA community profiles in Chlamydomonas reinhardtii, healthy adult Daphnia magna, and the culture media (all n = 4).

CONCOCT clustering of contigs from the master co-assembly resulted in 19 bins of draft and incomplete genomes. Manual curation, refinement, and merging using Anvi’o resulted in 15 bins, with 7 meeting the ≥50 quality metric. The adult-only co-assembly generated 12 bins, which were further refined to 5 bins, with 4 meeting the ≥50 quality metric (Supplementary Fig. 5). Three of the four high-quality bins were also present in the master co-assembly and one unique bin was identified. The juvenile-only co-assembly generated 15 bins, which were further refined to 7 bins, with 4 meeting the ≥50 quality metric (Supplementary Fig. 6). Because the master co-assembly generated bins with higher coverage or higher completeness, we used GTDB-Tk on the seven high- and medium-quality bacterial MAGs identified from the master co-assembly only. Two MAGs were placed in the Limnohabitans genus, one in the Polaromonas genus, one could not be placed below the Burkholderiaceae family, one in the Pedobacter genus, one in the Emticicia genus, and one could not be placed below the Chitinophagales order (Supplementary Table 1; Supplementary Fig. 4). None of the MAGs mapped to previously identified species in the GTDB-Tk database. Average nucleotide identity between the two Limnohabitans MAGs was 86.6%. This is less than the 95% cutoff generally accepted for strain-level similarity, suggesting the MAGs are likely separate species16.
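The interpretation of the ANI value above is a simple threshold comparison. A minimal sketch follows; the function name and constant are illustrative assumptions, and only the 95% cutoff and the 86.6% value come from the text:

```python
# Cutoff cited in the text: genomes at or above 95% ANI are treated as
# belonging to the same species.
ANI_CUTOFF_PERCENT = 95.0


def likely_same_species(ani_percent, cutoff=ANI_CUTOFF_PERCENT):
    """Return True if a pairwise ANI value meets the cutoff."""
    if not 0.0 <= ani_percent <= 100.0:
        raise ValueError("ANI must be a percentage between 0 and 100")
    return ani_percent >= cutoff


# ANI reported between the two Limnohabitans MAGs.
print(likely_same_species(86.6))  # False: consistent with two separate species
```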

To examine potential functions of the Daphnia magna microbiota, we focused on the three most complete MAGs: unknown Burkholderiaceae, 99.28% complete, 11.5X mean coverage; Pedobacter sp., 98.56% complete, 14.87X mean coverage; and Polaromonas sp., 82.73% complete, 15.3X mean coverage. We also examined the two Limnohabitans MAGs: sp1, 78.42% complete, 29.99X coverage; sp2, 60.43% complete, 14.16X coverage (Fig. 2). Though the Limnohabitans MAGs were not over 90% complete, coverage of both exceeded 30x in the juvenile D. magna samples and prior work has indicated the importance of this genus in Daphnia, suggesting unique functions may be present in these genomes. We extracted the nearest matching reference genome for each of the five MAGs above and calculated ANI for each to further investigate if these MAGs were novel or just new strains of already sequenced species (Table 2).

Figure 2

Anvi’o display of the two adult and two juvenile samples with the metagenome-assembled genomes highlighted in color. The innermost dendrogram represents similarity among contigs based on sequence composition, GC content (green inner ring), and differential coverage (black bars across the four grey rings). The outermost layer shows the genome bins. Genome bin identification was performed with GTDB-Tk.

Table 2 Average nucleotide identity of the five MAGs of interest to their nearest sequenced relative. The closest sequenced relative was identified from placement within GTDB-Tk’s reference phylogeny output.

Functional profiling of the five high- and medium-quality MAGs

Contigs from the five MAGs were analyzed for potential coding sequences (CDS) using Prokka. All identified CDS were then queried against the KEGG database and identified orthologs were grouped into KEGG functional categories (Fig. 3). We focused on pathways associated with respiration, carbohydrate metabolism, amino acid metabolism, other energy metabolism, and transport. These functional categories indicate not only general bacterial metabolism but also potential functions involved in host association. From the five MAGs’ total 7,453 CDS that mapped to KEGG pathways, 707 (9.48%) were associated with carbohydrate metabolism, 642 (8.71%) were associated with amino acid metabolism, and 358 (4.80%) were associated with other energy metabolism (Fig. 3a,b; Supplementary Fig. 3). Although C. reinhardtii has a relatively high concentration of lipids17, only 188 (2.52%) of all CDS were associated with lipid metabolism. To understand how many genes were shared among MAGs, we clustered identified genes with OrthoVenn. The two Limnohabitans MAGs shared the highest number of genes (641), and across all MAGs 325 genes were shared (Fig. 4). The Pedobacter MAG had the most unique genes in its gene set (117), while the Burkholderiaceae had so much overlap with other MAGs that it had relatively few unique genes (21).
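The shared and unique gene counts described above amount to set operations on ortholog cluster membership. The sketch below illustrates the idea with Python sets. It assumes each gene has already been assigned to an ortholog cluster (the step OrthoVenn performs), and the cluster identifiers and MAG contents are made up for illustration, not taken from this study.

```python
def core_clusters(clusters_by_mag):
    """Ortholog clusters present in every MAG."""
    return set.intersection(*clusters_by_mag.values())


def unique_clusters(clusters_by_mag):
    """For each MAG, the clusters found in no other MAG."""
    unique = {}
    for name, clusters in clusters_by_mag.items():
        others = set().union(
            *(c for other, c in clusters_by_mag.items() if other != name)
        )
        unique[name] = clusters - others
    return unique


# Hypothetical cluster assignments (illustration only).
toy = {
    "Limnohabitans_1": {"c1", "c2", "c3", "c4"},
    "Limnohabitans_2": {"c1", "c2", "c3", "c5"},
    "Pedobacter": {"c1", "c6", "c7"},
}

print(sorted(core_clusters(toy)))                               # ['c1']
print(len(toy["Limnohabitans_1"] & toy["Limnohabitans_2"]))     # 3 shared pairwise
print({k: sorted(v) for k, v in unique_clusters(toy).items()})
```

In this toy input the two Limnohabitans entries share the most clusters pairwise, mirroring the qualitative pattern reported for the real MAGs.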

Figure 3

Annotated genes in each of the five MAGs associated with specific KEGG metabolic pathways. (a) Number of genes associated with carbohydrate metabolism pathways in the five MAGs. Genes within each MAG bin were annotated with Prokka, assigned KEGG orthology, and mapped to KEGG metabolic pathways. Genes assigned to pathways were then counted. (b) Number of genes associated with KEGG amino acid metabolism pathways in the five MAGs.

Figure 4

Shared and unique genes with KEGG orthologs in each MAG. Unique gene sets are highlighted in gold; shared genes are in red. Intersections between MAGs, or numbers of genes in each MAG’s gene set shared with the other MAGs connected by lines, are noted in the bottom matrix. The total number of genes in each MAG is listed on the left side of the matrix.

After examining broad gene content of the five MAGs, we examined complete metabolic pathways annotated by KEGG. Here, we describe some of the complete pathways annotated, specifically focusing on pathways involved in critical functions for bacteria such as nutrient uptake and biosynthesis, as well as pathways that may be involved in host or environment interaction. A description of each MAG’s complete pathways can be found in Supplementary Materials and in Supplementary Tables 2–5. We separate pathway descriptions into those encoded by all or multiple MAGs, which could indicate functional redundancy or common metabolites accessible to multiple species, and pathways uniquely encoded by single MAGs, indicating potential niche differentiation within the microbiota18. Finally, we highlight some of the functions as signals of host association, providing potential explanations for the presence of these genera in the Daphnia magna microbiota.

Shared functions of the Daphnia magna microbiota

Nutrient uptake and major biosynthesis pathways

All five MAGs shared genes involved in some key metabolic pathways (Supplementary Tables 2, 3, & 5), including those for a complete TCA cycle. Transporters for multiple TCA cycle intermediates were present across the MAGs, including a C4-dicarboxylate transport system and transporters for alpha-glucosides, malate, fumarate, 2-oxoglutarate, succinate, and aspartate. Genes encoding a cytochrome c oxidase were identified, suggesting all have the capacity for aerobic respiration. All encoded for lipopolysaccharide transport and lipoprotein release. All MAGs except the Pedobacter encoded for a complete glyoxylate cycle, and the non-oxidative phase of the pentose phosphate pathway was present in all MAGs except the Limnohabitans species.
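The distinction between pathways shared by all MAGs and pathways missing from some can be represented as a presence/absence matrix. The following is a minimal sketch that encodes only the three pathways described in the paragraph above; the data structure and function names are illustrative assumptions, not the authors' analysis code.

```python
MAGS = [
    "Burkholderiaceae",
    "Pedobacter",
    "Polaromonas",
    "Limnohabitans 1",
    "Limnohabitans 2",
]

# Pathway -> MAGs in which the text reports the pathway as absent.
ABSENT_IN = {
    "TCA cycle": set(),
    "glyoxylate cycle": {"Pedobacter"},
    "pentose phosphate pathway (non-oxidative phase)": {
        "Limnohabitans 1",
        "Limnohabitans 2",
    },
}


def presence_matrix(mags, absent_in):
    """Build {pathway: {mag: True/False}} from the absence lists."""
    return {
        pathway: {mag: mag not in missing for mag in mags}
        for pathway, missing in absent_in.items()
    }


def shared_by_all(matrix):
    """Pathways present in every MAG."""
    return [p for p, row in matrix.items() if all(row.values())]


matrix = presence_matrix(MAGS, ABSENT_IN)
print(shared_by_all(matrix))  # ['TCA cycle']
for pathway, row in matrix.items():
    print(pathway, [mag for mag, present in row.items() if present])
```

Extending `ABSENT_IN` with further pathways from the supplementary tables would give the full shared-versus-specific breakdown used to organize the following sections.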

Transport and biosynthesis of other essential molecules were shared across MAGs. All MAGs except Pedobacter shared transport systems for phospholipids, phosphate, and branched-chain amino acids. All except for Limnohabitans MAG 2 encoded for the elongation step of fatty acid biosynthesis, but only the Burkholderiaceae, Polaromonas, and Pedobacter MAGs encoded for the initiation step. The Limnohabitans MAGs shared multiple sialic acid TRAP transporter genes (siaM, siaT, siaQ).

All MAGs encoded for the transport of multiple necessary vitamins, including riboflavin (B2, ribX, ribY, ribZ), pantothenate (B5, panS), and cobalamin (B12, btuB). Genes involved in the cobalamin salvage pathway were present in both Limnohabitans MAGs and in the Polaromonas MAG (cobO, cobP). Some species were able to biosynthesize vitamins: the Pedobacter and Polaromonas MAGs encoded for biotin biosynthesis, and all MAGs except Pedobacter encoded for tetrahydrofolate biosynthesis. All MAGs encoded for a pyridoxine 5′-phosphate synthase (PdxJ), indicating likely biosynthesis of pyridoxine (B6). Biosynthesis and transport of amino acids varied across MAGs, with no shared amino acid transport systems present across all species. However, all MAGs except Pedobacter encoded for arginine and ornithine biosynthesis pathways. The Burkholderiaceae, Limnohabitans 1, and Polaromonas MAGs encoded for threonine, cysteine, lysine, proline, phenylalanine, tyrosine, and glutathione biosynthesis.

Host and environment interaction

Superoxide dismutase (sodA, sodB) and catalase-peroxidase (katG) were found in all MAGs, potentially acting as defense mechanisms against host-produced reactive oxygen species. The universal minimal Tat system was encoded by all MAGs (tatA, tatC), allowing all of these species to transport folded proteins across cell membranes. Both Limnohabitans MAGs and the Polaromonas MAG encoded for complete adhesin transport systems and a gene associated with type IV pilus biosynthesis (pilQ). These MAGs also encoded two membrane proteases related to aminoglycoside resistance (HtpX, FtsH)19. All MAGs except Pedobacter encoded for the QseC-QseB quorum sensing regulatory system. Though the Polaromonas MAG was the only genome to encode for a complete flagellar assembly pathway, the Limnohabitans MAGs encoded for multiple genes involved in flagellar assembly (flhA, flhB, flgBL). A gene for chemotaxis protein CheA was present in these three MAGs.

Some amino acid importers were present across the MAGs. These include arginine (artM), proline (proP), cystine (yecS) in Limnohabitans MAGs, glutamine (glnQ, glnM) and glutathione (gsiC) in Limnohabitans MAGs and in the Polaromonas MAG, and histidine (hisP, hisM, hisQ) in the Burkholderiaceae MAG and Limnohabitans MAG 1. It is unclear whether food (e.g., Chlamydomonas reinhardtii) or other bacteria is the source of these imported amino acids. As some MAGs are able to biosynthesize the amino acids others require, cross-feeding among bacteria here may be occurring.

Environmental stress tolerance mechanisms were shared among genomes. All encoded for the phosphate starvation response regulatory system PhoR-PhoB. The Limnohabitans MAGs and the Polaromonas MAG shared the EnvZ-OmpR osmotic stress response system, allowing the bacterial cells to respond to changes in osmolality. Limnohabitans MAG 2 and the Polaromonas MAG shared CusS-CusR, a copper tolerance regulatory system.

MAG-specific functions of the Daphnia magna microbiota

Nutrient uptake and major energy pathways

Other respiratory pathways were encoded by some MAGs. The Burkholderiaceae encoded a nitrate/nitrite transporter (narK), a respiratory nitrate reductase (narGHI), and the nitrate respiration two-component regulatory system NarX-NarL. The Polaromonas encoded for thiosulfate transport (cysT, cysW, cysP) and potential use via a subunit of the Sox complex (soxA). We found a supplementary bacteriochlorophyll biosynthesis energy pathway in the Limnohabitans MAGs (pufLM, bchY). This pathway has been documented in other Limnohabitans species20.

Differences in carbohydrate metabolism among MAGs were apparent. The Pedobacter MAG was the only high-quality genome to encode for glycolysis and glycogen biosynthesis; the Burkholderiaceae MAG for the Leloir pathway; and Limnohabitans MAG 2 for the Entner-Doudoroff pathway. The Polaromonas MAG encoded for a complete beta-oxidation system, enabling fatty acid metabolism. Different carbohydrate transporters were found in the MAGs. Limnohabitans MAG 1 uniquely encoded for the transport of glucose, mannose, and glycerol (gtsABC, malK), L-arabinose (araPQ), and for a semiSWEET general sugar transporter. The Burkholderiaceae MAG encoded for methyl-galactoside transport (mglABC) and a galactose processing pathway. The Pedobacter MAG uniquely encoded for chitin degradation to fructose-6-phosphate (chiA, chb, nagB), suggesting potential utilization of chitin.

Other pathways for vitamin and amino acid import, biosynthesis, and degradation were uniquely present in some genomes. Limnohabitans MAG 1 encoded for taurine import (tauA, tauB). This MAG also encoded for degradation of histidine to glutamate. Limnohabitans MAG 2 could transport thiamine into the cell via a putative thiamine transport system (KEGG Module M00192). Uniquely, Polaromonas could synthesize cobalamin (cobA, cobQ, cbiB, cobP, cobC), and potentially could synthesize riboflavin via the purine biosynthesis pathway.

Host and environment interaction

Multiple genes and pathways involved in antibiotic resistance and detoxification were found in the bacterial MAGs, as were a multitude of secretion and regulatory systems. Pedobacter encoded the multidrug efflux pump MdlAB/SmdAB and genes for macrolide export (macA, macB). The Polaromonas MAG encoded an AcrAB-TolC/SmeDEF efflux pump. Polaromonas was the only MAG to encode for a type I secretion system (RaxAB-RaxC). The Burkholderiaceae MAG encoded the BaeS-BaeR envelope stress response two-component regulatory system, as well as an alpha-hemolysin and cyclolysin secretion system and a type IV secretion system (virB1–5, 10, 11). It also encoded for an osmotically-inducible protein, OsmY, part of an osmoprotectant ABC transporter complex, and a putrescine transporter (potFGHI).

Genomic signs of host association in the Daphnia magna microbiota

Because the Daphnia magna microbiota has been shown to differ significantly from the surrounding aquatic environment at the genus level9,21, we examined the MAGs for any potential indicators of host association. Multiple instances of host immune system evasion or tolerance were noted in the MAGs. Both Limnohabitans MAGs encoded for the stealth protein CpsY, implicated in host immune system evasion22. Genes involved in dTDP-L-rhamnose biosynthesis were present in all MAGs, indicating the potential for use of this alternative cell wall polysaccharide. L-rhamnose has been linked to bacterial viability and virulence23, and similarly structured polysaccharides have been shown to modulate host immune systems24. A gene involved in quorum quenching (disruption of signaling among microbial species or between microbes and the host) was identified in Limnohabitans MAG 1 and the Pedobacter MAG (ytnP). Other virulence factors and regulators were identified in both Limnohabitans MAGs and the Pedobacter MAG (cvfB, bvgS/bvgA).

A potential benefit these bacterial species may provide is amino acid and vitamin biosynthesis and export for host use, as eukaryotes must acquire essential molecules from their diet or from heterotrophic microorganisms. Six amino acids have been demonstrated as essential for Daphnia magna’s close relative, Daphnia pulex: arginine, histidine, leucine, phenylalanine, isoleucine, and tryptophan25, and are likely essential for Daphnia magna. All of these are biosynthesized by at least one of the MAGs, and some may export them as well. The Limnohabitans MAGs and the Burkholderiaceae MAG encode for an arginine exporter (argO), Limnohabitans sp. 1 encodes for a threonine exporter (rhtA), and the Burkholderiaceae MAG a threonine-serine exchanger (steT). A beneficial vitamin for Daphnia magna is cobalamin, which is not produced by Chlamydomonas reinhardtii26. However, the Polaromonas MAG encodes for cobalamin biosynthesis. Other vitamins are biosynthesized by at least one of the MAGs, including tetrahydrofolate, biotin, and pyridoxine. Supplementation of Daphnia magna growth media with biotin and cobalamin has been shown to increase host fitness27, though it is unclear if these vitamins are essential for Daphnia survival.

Other host-microbe interactions were also present in the MAGs. The Limnohabitans MAG 2 encoded for heme binding proteins and a heme transport system (ccmB, ccmC), implicated in use of host-synthesized heme28. The Pedobacter MAG encoded for an N-acetylneuraminate lyase (nanA) and all five MAGs contain genes involved in transport of sialic acids. Sialic acids are found in complex host tissue29, indicating potential cleavage of sialic acids from host cells for import and use by the bacteria. Pathways involved in host invasion and colonization were also present in the MAGs. The Polaromonas MAG encoded a suite of type IV pilus and fimbriae associated genes, including type IV pilus biogenesis factors (pilY1, pilQ), fimbrial proteins (pilE), the type IV fimbriae synthesis two-component regulatory system PilS-PilR, and a complete set of flagellar assembly genes. This MAG, along with the two Limnohabitans MAGs, also encoded for twitching motility (pilT). Both Limnohabitans MAGs encoded for swarming motility (swrC, rssA), and all three for other genes involved in chemotaxis (pctA, cheA, cheY, tar, cheB, mcp4, tsr, cheW). Though genes for adhesin production were not identified, genes involved in adhesin transport were identified in both Limnohabitans MAGs and in the Burkholderiaceae MAG (bmaC, ehaG, ata).

Discussion

To date, studies examining the Daphnia magna microbiota have only sequenced the 16S rRNA marker gene to understand broad-level interactions and functions of the microbiota. Here, we were able to assemble the first metagenome-assembled genomes from the Daphnia magna microbiota to elucidate genome-specific functions associated with highly abundant members of the bacterial community. Our 16S rRNA gene level data shows that the Daphnia magna bacterial community is structured differently from the surrounding culture environment and from the microbiome of their food, which agrees with results from earlier studies9,30. Our shotgun sequencing revealed five MAGs that were distinct from each other and distinct from their closest sequenced relatives based on average nucleotide identity. While generalizations of the importance of these species to the host may be specific to the single host genotype used in our study, Limnohabitans spp., Pedobacter spp. and other betaproteobacteria have been well-documented as abundant constituents of the Daphnia magna microbiota in laboratory cultures of different host genotypes sourced from different geographical locations and in field samples, suggesting that these are common across Daphnia magna genotypes31. We found that these MAGs were high- or medium-quality, meaning they contained some or most of the single-copy genes found in all bacteria. With metagenomic short-read shotgun sequencing it is unlikely that the MAGs assembled here contained the full gene sets associated with these species; however, bins of relatively high quality suggest that we found many important genes within each assembled genome. Most studies of the Daphnia magna microbiota using marker-based sequencing have identified more OTUs or ASVs and higher diversity than the 12 bins we identified, likely due to the higher sequencing depth necessary for shotgun sequencing to identify rare taxa32. However, the five MAGs assembled from our shotgun sequencing are mostly consistent with genera found in 16S rRNA sequencing from other studies and from our own sequencing.

The Burkholderiales order has been demonstrated as the most abundant in the Daphnia gut and whole organism microbiota9,13, and four of the five MAGs assembled here were identified to families or genera within this order. Two Limnohabitans MAGs were assembled and found at high coverage in juveniles; this genus has been reported to be highly abundant in the Daphnia microbiota and has been implicated in increasing host fecundity12,15. Here, we show that there are two distinct Limnohabitans MAGs that encode for different metabolic pathways. Only one other study has definitively identified more than one OTU in this genus13. We also identify two other MAGs in the Burkholderiales order, including a Polaromonas species and one unclassified Burkholderiaceae. Surprisingly, we also identify a Pedobacter MAG, which has previously only been reported as a rare taxon in the Daphnia microbiota33.

Analysis of annotated genes and pathways across the five MAGs showed overlap and differences in metabolism. Key pathways such as the TCA cycle were shared across all MAGs, and multiple ABC transporters of key vitamins and amino acids were identified. Many genes for the use of different carbohydrates were encoded across the MAGs. Microalgae are generally rich in carbohydrates34 and serve as a major food source for Daphnia magna35, potentially allowing these microbes easy access to myriad nutrient sources. The two Limnohabitans MAGs shared a high proportion of annotated genes, suggesting some functional similarity between them. Indeed, multiple metabolic pathways were shared between these two MAGs, including the glyoxylate cycle, bacteriochlorophyll biosynthesis, and biosynthesis of some vitamins and amino acids. The Limnohabitans MAGs also shared many genomic features with the Polaromonas MAG and the Burkholderiaceae MAG, including multiple TCA cycle intermediate importers and several transport systems. There is also potential flexibility among the taxa in encoded respiration, notably in the Polaromonas MAG’s ability to use thiosulfate and the Burkholderiaceae MAG’s ability to utilize nitrate under hypoxic conditions. Along with the variety of two-component regulatory systems found within each MAG, the wide range of potential respiratory pathways may allow functions to be sustained by different bacterial species even when stressful environments cause fluctuations in the abundance of species within the microbiota.

The difference between taxa identified in the Daphnia magna microbiota and its culture environment suggests that there are some key interactions between the host and its associated microbes that establish and maintain these microbial populations. Furthermore, there may be some interactions between bacterial species in the microbiota that could impact the host. We found many genes in the Limnohabitans MAGs and the Polaromonas MAG involved in flagellar assembly, type IV pilus biogenesis and production, and biofilm formation, all of which have been implicated in host colonization and successful adhesion to host-associated surfaces36,37,38. All MAGs encode for L-rhamnose production, which has been implicated in adhesion to other cells23. Genes in secretion systems implicated in host cell adhesion, particularly Type I, II, and IV, were also encoded39. We found many genes involved in host immune system evasion or modification, which may allow these bacteria to persist within the host species40. Notably, superoxide dismutase and catalase were encoded by multiple MAGs, suggesting the bacteria could defend against reactive oxygen species produced by the host as a defense mechanism41. Also present across MAGs are genes involved in the detoxification of antibiotics and toxins, including multidrug efflux transporters and pumps (MdlAB/SmdAB, AcrAB-TolC/SmeDEF), macrolide export (macA, macB), and stress tolerance to antibiotics (BaeS-BaeR).

Biosynthesis and provision of amino acids by bacteria to their host is a well-documented set of interactions that is known to confer fitness benefits to the host42,43,44. Here, we find that the Limnohabitans MAGs and the Burkholderiaceae MAG encode for the export of arginine and threonine, essential amino acids for the host Daphnia25. Similarly, biosynthesis or metabolism of vitamins and minerals by bacteria and provisioning to the host has been well-documented in the microbiota of other organisms45,46,47. Many genes involved in vitamin B biosynthesis were found across all MAGs. Media supplemented with cobalamin (B12) is used to successfully culture Daphnia magna48, and we find the Polaromonas MAG encoded for a complete cobalamin biosynthesis pathway, suggesting potential vitamin provisioning to the host from this species. We also find a potential microbe-microbe interaction, in which the Pedobacter MAG encodes for cleavage of sialic acids from host tissue, which can then be transported and utilized by other species as a carbohydrate source via a sialic acid TRAP transporter. The breakdown of sialic acid to metabolites that are accessible to the host by microbiome-associated species has been shown to increase host fitness49. If Limnohabitans are able to use sialic acid as a nutrient source, this may be the basis for a microbe-host-microbe interaction, where Limnohabitans provides essential amino acids to the host using energy generated from metabolism of molecules provisioned from the host.

In total, our data shows that there is much versatility in metabolism among the MAGs, but some overlap in function. As Daphnia magna are indiscriminate filter feeders, they may feed on a wide variety of particulate matter with variable nutrient profiles. The versatility in metabolism encoded by these MAGs indicates that they are able to utilize this unpredictable range of nutrients both in the digestive tract and on the carapace. The specific functions of certain MAGs, particularly in amino acid and vitamin biosynthesis and export, seem critical in providing nutritional benefits to the host zooplankton.

Conclusions

Daphnia magna is an important model system for multiple facets of ecology, and has recently become an organism of interest for understanding fundamental questions about the microbiota. Our metagenomic sequencing and subsequent analysis characterizes the Daphnia magna microbiota to the species level and finds some genomic features that allow core bacterial species to acquire and biosynthesize nutrients, and to potentially interact with their host via amino acid and vitamin export. By examining this relatively simple microbiota via MAGs, we can begin to investigate metabolic interactions between the host and its associated microbes. Future work to further elucidate functions of these MAGs will involve long-read metagenome sequencing to complete genome assemblies and pure, single-isolate sequencing to understand strain variation within the microbiota. Furthermore, transcriptomics and metabolomics could be used to understand which of these encoded genes are functioning under different environmental and host conditions, and will direct future hypotheses on host-microbiota interactions. For example, how much of the differences in Daphnia life history and population dynamics across food environments50 can be attributed to differences in microbiota composition? These results will help to inform future work studying the effects of the microbiota on host health and population dynamics across ecological contexts. Moreover, as more populations of Daphnia and their microbiota are sequenced, it will become possible to examine the coevolutionary relationships between hosts and microbiota, and this functional information will be essential for making sense of those relationships.

Methods

Sample collection and extraction

Daphnia magna clone 8A, isolated from a pond at Kaimes Farm, Leitholm, Scottish Borders51, was used in this sequencing effort. Two samples of 100 21-day-old adult Daphnia magna and two samples of 6-day-old juvenile Daphnia were collected from laboratory cultures of 20–40 individuals maintained in 400 mL jars in defined COMBO medium48. Laboratory cultures were fed the green alga Chlamydomonas reinhardtii (CPCC 243) at a standardized volume, calculated with the BioTek Epoch Microplate Spectrophotometer, to provide 0.25 mg C/mL/day. Laboratory cultures were maintained in a 19 °C controlled environment under a 16 h:8 h light:dark cycle. Samples were ground immediately after collection in sterile 1.5 mL microcentrifuge tubes. To separate bacterial cells from host cells, a protocol modified from Benson et al. was used52. Samples were suspended in 2 mL PBE buffer and layered on a cushion of 50% sucrose, then centrifuged at 4,000 × g for 10 minutes. DNA was extracted from the pellets at the base of the sucrose fractions following the Qiagen DNeasy Blood & Tissue Kit spin-column protocol for purification of total DNA from animal tissues (Qiagen, Hilden, Germany).

Library preparation and sequencing

Shotgun sequencing libraries from the two adult samples and two technical replicates of one juvenile sample were prepared and multiplexed using the Illumina Nextera XT kit and protocol. Input DNA was quantified using the Qubit dsDNA system. Libraries were checked using the Agilent Technologies 2100 Bioanalyzer. Libraries were normalized manually because the final library yield was under 10 nM. Paired-end sequencing was performed on an Illumina MiSeq using a MiSeq Reagent Kit V2.

Quality filtering and metagenome assembly

Reads were demultiplexed using the built-in Illumina MiSeq Reporter. Quality of demultiplexed reads was checked using FastQC v0.11.5. Reads were trimmed using Trimmomatic v0.3653 with the parameters ILLUMINACLIP:NexteraPE-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:25 MINLEN:36 to remove adapter sequences and to remove segments of reads where the average quality in a 4-base window fell below 25. Trimmed reads were mapped to the Daphnia magna draft genome54 using BWA55, Samtools56, and BEDtools57, and reads with greater than 80% identity over 50% of the read were removed as host-derived. The remaining paired unmapped reads were assembled de novo using metaSPAdes in SPAdes v3.1158. A co-assembly of all four samples (the master co-assembly), a co-assembly of the adult samples, a co-assembly of the juvenile samples, and individual assemblies for each sample were created.
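The trimming parameters above can be illustrated with a simplified Python sketch. It mimics the intent of LEADING:3, TRAILING:3, SLIDINGWINDOW:4:25, and MINLEN:36 on a list of Phred scores; it is not Trimmomatic's implementation, which differs in detail (for example in exactly where it cuts within the failing window), and adapter clipping is omitted. The function names and the example quality values are hypothetical.

```python
from typing import List, Optional, Tuple


def sliding_window_keep(quals: List[int], window: int = 4, min_mean: float = 25.0) -> int:
    """Return how many 5' bases to keep: cut at the start of the first
    window whose mean quality drops below min_mean (simplified rule)."""
    for start in range(0, len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < min_mean:
            return start
    return len(quals)


def trim_read(quals: List[int], leading: int = 3, trailing: int = 3,
              window: int = 4, min_mean: float = 25.0,
              minlen: int = 36) -> Optional[Tuple[int, int]]:
    """Return (start, end) coordinates of the retained segment, or None
    if the trimmed read is shorter than minlen."""
    start = 0
    while start < len(quals) and quals[start] < leading:   # LEADING
        start += 1
    end = len(quals)
    while end > start and quals[end - 1] < trailing:       # TRAILING
        end -= 1
    end = start + sliding_window_keep(quals[start:end], window, min_mean)
    if end - start < minlen:                               # MINLEN
        return None
    return start, end


# Hypothetical read: two poor leading bases, a good core, a decaying tail.
example_quals = [2, 2] + [35] * 40 + [10] * 6 + [2]
kept = trim_read(example_quals)
```

In this sketch a read is reported as a pair of coordinates rather than a trimmed sequence, so the same function could be applied to both the base string and the quality string.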

Taxonomy-independent sequence binning and identification of individual MAGs

To resolve genomes of identified organisms and to discover genomes of organisms not present in read-based taxonomy identification programs or entirely new organisms, contigs from the master co-assembly with reads mapped from each sample were binned using CONCOCT within Anvi’o59, accepting contigs over 2500 bp in length. Contigs greater than 20 kb in length were split into 20 kb fragments prior to running CONCOCT. Bins from the master co-assembly were assessed using Anvi’o, using Parks et al.’s quality score (genome completeness − 5 × estimated redundancy or contamination) of ≥50 as a cutoff for further refinement59. Bins meeting this quality cutoff were manually curated within Anvi’o: contigs that deviated dramatically from the mean GC content or mean coverage of their bin were removed. Bins were merged when merging increased completeness, did not raise redundancy above 10%, and the bins were similar in GC content. Bins that were of neither high (>90% completion) nor medium quality (>50% completion) after merging and refining were not analyzed further. After merging and refining, bins that still met the quality score cutoff were assigned taxonomy using GTDB-Tk60,61,62. GTDB-Tk uses average nucleotide identity and placement in a reference phylogeny to find the closest genomic relative in its database. The same process was repeated for the juvenile co-assembly and the adult co-assembly to confirm species presence and attempt to resolve species identity at different host life stages. Similarity between MAGs was calculated using the average nucleotide identity tool in Pyani63.
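As a worked illustration of the bin-quality criteria described above, the following Python sketch computes the quality score (completeness minus five times redundancy) and applies the ≥50 cutoff together with the completion tiers. The function names and the example completeness and redundancy values are hypothetical and are not results from this study.

```python
def quality_score(completeness: float, redundancy: float) -> float:
    """Quality score: completeness (%) minus 5 x redundancy/contamination (%)."""
    return completeness - 5.0 * redundancy


def quality_tier(completeness: float) -> str:
    """Completion tiers as stated in the text: >90% high, >50% medium."""
    if completeness > 90.0:
        return "high"
    if completeness > 50.0:
        return "medium"
    return "low"


def keep_bin(completeness: float, redundancy: float, cutoff: float = 50.0) -> bool:
    """A bin is retained if it meets the score cutoff and is at least medium quality."""
    return (quality_score(completeness, redundancy) >= cutoff
            and quality_tier(completeness) != "low")


# Hypothetical bins: (completeness %, redundancy %)
example_bins = {"bin_1": (95.0, 2.0), "bin_2": (70.0, 5.0), "bin_3": (55.0, 0.5)}
retained = {name for name, (c, r) in example_bins.items() if keep_bin(c, r)}
```

Here bin_2 is dropped despite 70% completion, because 5% redundancy lowers its score to 45; the penalty on redundancy is what makes the score stricter than completeness alone.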

Read-based taxonomic classification

Kaiju v1.564, Kraken v1.065, and MetaPhlAn2 v2.666 were used to assign taxonomy to reads. We used all three classifiers due to Kaiju’s high rate of false identification67 and MetaPhlAn2’s use of specific marker genes from reference organisms rather than entire genomes. All programs were used with their built-in databases. Kraken results were confirmed via cross-comparison of abundant species with both Kaiju and MetaPhlAn2. All three taxonomy profilers were also used to assign taxonomy to contigs assembled from each sample. Visualization of each sample’s community composition was performed in R.
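One way to picture the cross-comparison step is as a set intersection over the abundant taxa reported by each classifier. The Python sketch below is an assumption about how such a comparison could be organized, not the procedure used here; in particular, how "abundant" was defined is not specified above, and the taxon labels are placeholders.

```python
from typing import Iterable, Set


def confirmed_by_all(primary: Iterable[str], *others: Iterable[str]) -> Set[str]:
    """Return taxa from the primary classifier that every other classifier also reports."""
    confirmed = set(primary)
    for calls in others:
        confirmed &= set(calls)
    return confirmed


# Placeholder taxon labels standing in for abundant species calls.
kraken_calls = {"taxon_A", "taxon_B", "taxon_C"}
kaiju_calls = {"taxon_A", "taxon_B", "taxon_D"}
metaphlan2_calls = {"taxon_A", "taxon_B"}

confirmed = confirmed_by_all(kraken_calls, kaiju_calls, metaphlan2_calls)
unconfirmed = kraken_calls - confirmed  # Kraken-only calls needing closer inspection
```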

Functional profiling

Contigs from each MAG and from the adult, juvenile, and master co-assemblies were annotated using Prokka v1.1268. Genes annotated in Prokka were assigned KOs (KEGG Orthologs) using the GhostKOALA tool of the Kyoto Encyclopedia of Genes and Genomes (KEGG)69. KOs identified from GhostKOALA were mapped to standard KEGG categories and metabolic pathways using the KEGG Pathway Mapper and KEGG Module tools to examine pathway completeness and identify pathways of interest70. Genes identified using KEGG and GhostKOALA in each MAG were compared using OrthoVenn71. Overlapping and unique orthologs were compared using custom R scripts and the ‘UpSetR’ package72.
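The comparison of overlapping and unique orthologs amounts to counting, for each combination of MAGs, the orthologs found in exactly that combination, which is the quantity an UpSet plot displays. The following Python sketch illustrates the counting under that assumption; it is not the custom R code used in this study, and the MAG names and ortholog labels are placeholders.

```python
from collections import Counter
from typing import Dict, FrozenSet, Set


def exclusive_intersections(orthologs_by_mag: Dict[str, Set[str]]) -> Counter:
    """Count orthologs by the exact set of MAGs in which they occur."""
    membership: Dict[str, Set[str]] = {}
    for mag, orthologs in orthologs_by_mag.items():
        for ortholog in orthologs:
            membership.setdefault(ortholog, set()).add(mag)
    return Counter(frozenset(mags) for mags in membership.values())


# Placeholder ortholog sets for three hypothetical MAGs.
example = {
    "MAG_1": {"ko_a", "ko_b", "ko_c"},
    "MAG_2": {"ko_a", "ko_b"},
    "MAG_3": {"ko_a", "ko_d"},
}
counts = exclusive_intersections(example)
shared_by_all = counts[frozenset(example)]       # orthologs present in every MAG
unique_to_mag_1 = counts[frozenset({"MAG_1"})]   # orthologs found only in MAG_1
```

Counting exclusive rather than inclusive intersections keeps each ortholog in exactly one category, so the counts sum to the total number of distinct orthologs.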

16S rRNA gene sequencing and identification of contaminant taxa

DNA was extracted with the same Qiagen DNeasy Blood & Tissue Kit and reagents used for the shotgun sequencing samples, and the V4 hypervariable region of the 16S rRNA gene was sequenced on the Illumina MiSeq using a MiSeq Reagent Kit V2. Four samples of five adult Daphnia magna were sequenced to compare the community composition found by 16S sequencing with that found by shotgun sequencing. As negative controls, we also sequenced four samples of the COMBO medium in which Daphnia magna cultures are raised, four samples of Chlamydomonas reinhardtii, and two samples of the DNA sequencing kit and library preparation kit. Paired-end reads were analyzed in R using the ‘dada2’ package to trim primer sequences, identify amplicon sequence variants, and assign taxonomy73. Taxonomy was assigned using the RefSeq + RDP taxonomic training data set formatted for dada274. Further analysis of community composition and visualization were carried out using the ‘phyloseq’ package in R75.