Epibionts dominate metabolic functional potential of Trichodesmium colonies from the oligotrophic ocean

Trichodesmium is a genus of marine diazotrophic colonial cyanobacteria that exerts a profound influence on global biogeochemistry, by injecting ‘new’ nitrogen into the low nutrient systems where it occurs. Colonies of Trichodesmium ubiquitously contain a diverse assemblage of epibiotic microorganisms, constituting a microbiome on the Trichodesmium host. Metagenome sequences from Trichodesmium colonies were analyzed along a resource gradient in the western North Atlantic to examine microbiome community structure, functional diversity and metabolic contributions to the holobiont. Here we demonstrate the presence of a core Trichodesmium microbiome that is modulated to suit different ocean regions, and contributes over 10 times the metabolic potential of Trichodesmium to the holobiont. Given the ubiquitous nature of epibionts on colonies, the substantial functional diversity within the microbiome is likely an integral facet of Trichodesmium physiological ecology across the oligotrophic oceans where this biogeochemically significant diazotroph thrives.


Introduction
The colonial, diazotrophic cyanobacterium Trichodesmium has a cosmopolitan distribution throughout the tropical and subtropical oceans where it has a keystone role in oligotrophic ecosystems because of its ability to supply biologically available nitrogen through N 2 fixation and fixed carbon through photosynthesis (Capone et al., 1997). Models suggest that Trichodesmium N 2 fixation accounts for roughly half of the total 100-200 Tg of biologically fixed N 2 annually (Bergman et al., 2013), a supply that fuels the uptake of carbon by the broader community of photoautotrophs and ultimately the export of carbon to the deep sea (Arrigo, 2005). In the oligotrophic oceans where Trichodesmium occurs, there is intense competition for resources such as phosphorus and iron, which can limit Trichodesmium N 2 fixation (Sañudo-Wilhelmy et al., 2001) and growth (Krishnamurthy et al., 2007).
Trichodesmium cells grow as filaments, which aggregate to form colonies up to a millimeter in diameter, creating stable substrates that concentrate fixed carbon and nitrogen relative to surrounding seawater (Capone et al., 1997). A hallmark of these colonies is their ubiquitous association with a diverse assemblage of microorganisms that are dominated by heterotrophic bacteria, as well as photosynthetic and even other N 2 fixing bacteria (Sheridan et al., 2002;Hmelo et al., 2012;Momper et al., 2014;Rouco et al., 2016). Collectively, these tightly associated organisms are referred to as epibionts, and they form the Trichodesmium microbiome, a distinct community that is taxonomically different from planktonic microbes in surrounding seawater .
Despite the global biogeochemical significance of Trichodesmium and the ubiquitous presence of a community of tightly associated microorganisms, eco-physiological studies of Trichodesmium have rarely considered this consortia of co-occurring organisms as a holobiont. The microbiome's taxonomic diversity, functional diversity and the interplay between host and epibionts within the holobiont is only beginning to be explored (Hewson et al., 2009;Van Mooy et al., 2012;Rouco et al., 2016), but could help explain the fundamental unknowns that persist regarding Trichodesmium distribution and activities across different environments. Exploring the metabolic functional potential contained within the Trichodesmium microbiome is a key step toward gaining a mechanistic understanding of how this relationship influences the fitness of the holobiont and subsequently the fate of fixed carbon and nitrogen in the oligotrophic ocean. Here we use metagenomic sequencing of Trichodesmium colonies collected from stations along a gradient of phosphorus in the western North Atlantic to examine microbiome composition, functional diversity and metabolic contributions to the holobiont.

Materials and methods
Field sampling Trichodesmium colonies were collected with surface water net tows along a cruise transect in the western North Atlantic aboard the R/V Atlantic Explorer (AE1409) during May 2014. Sampling occurred at the same time each day (~0730-0830 hours) using nets with a mesh size of 130 μm. Nets were deployed and hauled through the surface water column six times before recovery, such that each sample represented thousands of liters of water. Individual Trichodesmium colonies were isolated and washed three times by successive transfer through fresh 0.2 μm sterile-filtered local surface seawater to remove all but tightly associated epibionts. A pooled sample of colonies was isolated and processed from each station. For each sample, an average of~30 cleaned colonies were transferred onto 47 mm 5 μm pore size polycarbonate filters, gently vacuum filtered to remove excess liquid, flash frozen and stored in liquid nitrogen until extraction and sequencing. There were no discernable changes in average colony size from one station to another across the transect. In order to broadly assess the microbiome composition of the North Atlantic Trichodesmium populations, colony composition was sampled to reflect the distribution of Trichodesmium colony morphology found in net tows. At all stations, raft type colonies were much more abundant than puff or bowtie variants with approximately 30 rafts to 2 puff/bowtie colonies. As such, the data largely reflect the dominant raft morphology.

Chemical analyses
Total dissolved phosphorus was determined on 0.2 μm filtrates of surface water (~5 m depth) samples collected by a Niskin rosette outfitted with conductivity, temperature, and depth sensors into acid-clean polycarbonate bottles. Samples were processed at the SOEST Laboratory for Analytical Biogeochemistry at the University of Hawaii, Honolulu, HI, USA, according to facility protocols. Alkaline phosphatase activity samples were obtained by placing 2-5 cleaned Trichodesmium colonies on 5 μm PC filters, gently vacuum filtering away excess liquid, then storing in 47 mm plastic Petri dishes at -20°C until analysis. Samples were processed as previously described (Dyhrman and Ruttenberg, 2006) using 6,8-difluoro-4-methylumbeliferyl phosphate (DiMufP) on a Synergy H1 Hybrid plate reader using the Gen5 software package (BioTek, Winooski, VT, USA) (Dyhrman and Ruttenberg, 2006). N 2 fixation was measured using the acetylene reduction technique as previously described (Capone, 1993;Paerl, 1994). Briefly, approximately 20 Trichodesmium colonies were placed in a 60 ml polycarbonate bottle containing 60 ml of filtered seawater. A 1 ml aliquot of acetylene was injected into the bottle through a septum cap, the bottle was gently inverted and allowed to incubate in an on-deck incubator at ambient temperature and light. The headspace of the bottle was analyzed for ethylene approximately every 30 min and the rate of ethylene production through acetylene reduction was determined by linear regression. All incubations were conducted in triplicate and harvested between approximately local noon and 1400 hours.

DNA extraction and sequencing
Total genomic DNA was extracted from samples using the MoBio Power Plant Pro DNA Isolation Kit (MoBio Laboratories, Inc., Carlsbad, CA, USA) following the manufacturer instructions. An average concentration of 30 ng μl -1 of genomic DNA for each sample was sequenced at the Argonne National Lab (Lemont, IL, USA). Genomic DNA was quantified using the Invitrogen (Carlsbad, CA, USA) qubit and sheared using the Covaris Sonicator (Woburn, MA, USA) to the desired size range. Libraries were then generated using WaferGen's Apollo324 automated library system and Illumina (San Diego, CA, USA) compatible PrepX ILMN DNA library kits following the manufacturer's instructions. Resulting libraries were then size-selected using the Sage BluePippin (Beverly, MA, USA) and sequenced on one 2 × 100 bp lane of the Illumina HiSeq2000. During library preparation, an average insert size of approximately 750 base pairs (bp) was targeted. Metagenomic reads from the six samples are available on the NCBI Sequence Read Archive under BioProject number PRJNA330990.

Sequence assembly and analysis
Metagenomic reads were first trimmed using Sickle with default settings (https://github.com/najoshi/ sickle). Trimmed forward and reverse reads were then converted to fasta with the fq2fa command in IDBA-UD (Peng et al., 2012). Reads from the six samples were assembled into scaffolds to create a merged assembly, using IDBA-UD under default parameters in order to yield robust assembly of the western North Atlantic Trichodesmium holobiont, modeling our approach after similar environmental metagenomic investigations (Handley et al., 2012;Dombrowski et al., 2016). This merged assembly of pooled colony metagenomic reads from across six stations was used for the bulk of the analyses presented herein.
Scaffolds produced by the merged assembly were clustered into genome bins by tetranucleotide frequency and read coverage of individual samples using MaxBin 2.0 set with default parameters (Wu et al., 2015). Genome completeness was estimated at 465% using MaxBin, resulting in robust gene set comparisons for the majority of genome bins. Relative abundance estimates were calculated after Aylward et al. (2014) by multiplying the length of contigs in each bin by the number of reads recruited (coverage), then summing across genome bins. This method has shown good correlation with 16S-based results in other metagenomic data sets (Aylward et al., 2014). Binned scaffolds were translated into predicted proteins using Prodigal on the metagenomic setting (Hyatt et al., 2010). The resulting protein sequences were annotated using the blastp program of DIAMOND against the NCBI nr database (Suzek et al., 2007;Buchfink et al., 2015). The organismal identity of each bin was determined by assessing the nr database taxonomic affiliation of the best hits, with identity determined as the majority (470%) taxonomic affiliation of predicted proteins in a bin. Proteins from each epibiont bin were also annotated against the SEED subsystems using the RAST online annotation program to assess differences in functional categories (Aziz et al., 2008;Overbeek et al., 2014). Although the majority of our analyses were based on the merged, six station metagenome assembly, we also examined regional difference in epibiont community structure by grouping northern (n = 2) and southern (n = 4) stations as replicates. Variance in this north-south epibiont community structure (species relative abundance) was visualized using principal component analysis in R (www.r-project.org), using the rda function in the vegan package (https://github.com/ vegandevs/vegan) (Oksanen et al., 2015). A permutational multivariate analyses of variance test (Po0.1) was performed to examine differences in community structure between the two northern stations versus the four southern stations, using a Bray-Curtis dissimilarity distance matrix (Anderson, 2001). Unpaired t-tests were calculated using GraphPad (La Jolla, CA, USA).
To prepare for orthologous group (OG) clustering, the six individual station assemblies were translated into predicted proteins as described above and merged together. These proteins were then filtered to remove sequences o70 amino acids and clustered into OGs by performing a reciprocal blastp with DIAMOND and then using MCL (Markov cluster algorithm), set to an inflation parameter of 1.4 as previously described (Bertrand et al., 2015). Taxonomic composition of the individual station assembly OG clustered proteins was determined by DIAMOND blastp using the previously assembled genome bins as a reference. OGs were classified as 'epibiont only' if no genes making up the group had best blast hits to the four Trichodesmium genome bins or to proteins that are taxonomically affiliated with Trichodesmium. OGs defined as 'both' were composed of epibiont and Trichodesmium identified proteins.
Functional annotation of the predicted proteins clustered into OGs followed a tiered protocol. First, the size filtered and clustered proteins were annotated using blastp search with DIAMOND against the UniRef90 database (Buchfink et al., 2015) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) with the online Automatic Annotation Server using the single-directional best-hit method targeted to prokaryotes and with the metagenomic option selected. Single functional annotations for entire OGs were determined by taking the majority annotation for all proteins clustered into that group. To refine the annotations of select proteins, curated databases and protein models were used. The alkaline phosphatase-identified OG annotations were refined by performing DIAMOND searches against representative proteins from COGs 3211 (PhoX), 1785 (PhoA) and 3540 (PhoD) (Luo et al., 2009), as well as the protein sequences of three previously identified putative alkaline phosphatases in the IMS101 genome (PhoA: YP723031, PhoX: YP723360, and PhoX2: YP723924) (Orchard et al., 2009). Putative alkaline phosphatase metal cofactors were determined based on previous investigations (Luo et al., 2009;Rodriguez et al., 2014;Yong et al., 2014). PepM OGs were obtained by DIAMOND blast against representative proteins from PFAM 13714 (PEP_mutase). Blast results were accepted if the e-value was o1 × 10 − 5 with a bit score 450. For comparisons against the putative PhoA gene from IMS101, OGs within the metagenomes were considered homologous to this protein if they passed the blast requirements above, and contained UniRef blast homologs to the YP723031 gene from T. erythraeum IMS101 and other putative alkaline phosphatase genes identified through KEGG or UniRef annotation.

Results and discussion
Composition of the Trichodesmium holobiont High-throughput paired-end sequencing (Supplementary Table 1) was performed on total genomic DNA extracted from Trichodesmium colonies collected at six stations in the western North Atlantic (Figure 1a). A merged genomic DNA assembly of sequences from all stations was used to identify 12 unique taxonomic genome bins conserved across all 6 North Atlantic stations, 9 of which were estimated to be over~65% complete based on single copy marker gene presence (Supplementary Figure 1), results that are similar to recoveries from other environmental metagenomic data sets (Handley et al., 2012;Dombrowski et al., 2016). All bins were identified down to the lowest definitive taxonomic level possible. Of the 12 taxonomic bins, there were 4 identified as Trichodesmium, which suggests that multiple Trichodesmium species could be present in these samples. Consistent with this observation, analyses of surface water and colonies along a similar transect in the western North Atlantic detected species from at least three co-occurring clades of Trichodesmium in this region (Rouco et al., 2014(Rouco et al., , 2016. The remaining eight genomic bins were identified as heterotrophic epibionts including two in the Bacteroidetes genus Microscilla, one Gammaproteobacterium, one Alphaproteobacterium in the order Rhodobacterales and four Alphaproteobacteria in the order Rhodospirillales ( Figure 1b). These results contribute new information about the Trichodesmium microbiome, building upon a previous 16S clone library survey of epibiont diversity from the North Atlantic , recent highthroughput assessment of epibiont taxonomic diversity across three ocean basins (Rouco et al., 2016), and metatranscriptomic profiles of Trichodesmium communities from the South Pacific (Hewson et al., 2009). Similarities in the taxonomic groups dominating samples in all of these aforementioned studies indicate that the epibiont metagenomes detected here are likely from core members of the Trichodesmium holobiont. In addition, this microbiome community does not merely represent general particle-associated microbes, as the community recovered here was distinct from those previously found on sinking particles, which have been shown to be enriched with Deltaproteobacteria, Planctomyces and Bacteroidetes in genera other than Microscilla (Fontanez et al., 2015).
To estimate changes in relative abundance of holobiont members across the cruise transect, reads from each station were mapped to the taxonomic bins after Aylward et al. (2014). All microbiome members were detected at each station, confirming that these epibionts represent core components of the microbiome, although the relative abundance was variable from station to station (Figure 1c). Using the relative abundance of different epibiont members, microbiome communities clustered according to their sampling site, with the two northernmost stations separated from the southernmost stations along the PC1 axis of a principal component analysis (Figure 2). A permutational multivariate analyses of variance analysis (P = 0.067), confirmed a differential community structure between the two northern and four southern stations. This may be driven in part by the abundance of Microscilla (bins 4 and 6), which was  significantly different (t-test, P = 0.010) between the two northern stations and the four southern stations. There were no significant differences between the northern and southern stations in N 2 fixation rate, colony PO 4 turnover or alkaline phosphatase activity, however, the two northernmost stations had significantly lower total dissolved phosphorus compared with the four southern stations (0.08 and 0.2 μM total dissolved phosphorus, respectively, t-test, P = 0.0157; Supplementary Table 1). As such, Microscilla relative abundance was higher at the stations with increased oligotrophy. Although further work measuring more parameters over a greater range of conditions is warranted to address the consistency of these relationships, the differential distribution of members of the core microbiome between regions suggests that variations in epibiont community structure could be modulated by the geochemical environment.
In order to examine the diversity of metabolic pathways in the Trichodesmium microbiome, the genome bins were functionally annotated against the SEED subsystems to assess differences in broad functional categories between epibionts (Overbeek et al., 2014). This functional analysis showed that while key capacities like ammonia assimilation, phosphate metabolism, transport of organic compounds and aerobic metabolic pathways like the tricarboxylic acid cycle were largely uniformly present in all epibionts, there were certain SEED subsystems that were enriched or uniquely present in discrete groups (Figure 3). The Microscilla genome bins were enriched in functional categories related to nitrogen metabolism, such as nitrate, and nitrogen stress functions, as well as the synthesis and utilization of reduced phosphorus compounds (Figure 3), which could help explain their relative abundance at the most oligotrophic stations (2 and 5) (Supplementary Table 1). The Gammaproteobacterium and Rhodobacterales (bins 7 and 8) were enriched with functions related to the uptake and exchange of genetic information (gene transfer agents and bacterial secretion systems), whereas the Rhodospirillales bins were enriched in motility-related functions, the uptake of tungstate and the utilization of plant-derived sugars like fructose (Figure 3). In marine bacteria, secretion systems and motility functions have been implicated in the transfer of toxins between adjacent cells and pathogenicity (Salomon et al., 2015) and the modulation of these activities could influence relationships between epibionts or the nature of the host-microbiome relationship. Finally, homologs to genes encoding the light-mediated proton pump proteorhodopsin were found in Microscilla, as well as the Rhodospirillales (Figure 3). In sum, these data show that there are differences in functional metabolic capacity between epibiont groups present in the core microbiome.
Iron and phosphorus-related SEED subsystems within the microbiome (Figure 3), like siderophore and heme-related functions, or reduced phosphorus utilization pathways, may be particularly critical to Trichodesmium physiological ecology given that iron and phosphorus are known drivers of Trichodesmium activities in the study region (Sañudo-Wilhelmy et al., 2001;Chappell et al., 2012). The relative proportion of microbiome genome bins in which key marker genes were found for phosphonate metabolism, heme and siderophore utilization, were compared with free-living microbial communities in the Sargasso Sea region of the western North Atlantic. The Trichodesmium microbiome was enriched nearly twofold in the phosphonate utilization marker phnJ relative to genome equivalents in the Global Ocean Survey data set from the Sargasso Sea (Karl et al., 2008) (Supplementary  Table 2). Phosphonates are known to be an important source of bioavailable phosphorus for Trichodesmium in the North Atlantic  and the enrichment of phnJ in the microbiome indicates the importance of this bond class of phosphorus to the holobiont relative to freeliving microbes in the upper water column. Heme transporters and siderophore/vitamins transporters were present in all core microbiome genome bins, relative to communities of free-living Sargasso Sea microbes in which these functions were present in roughly 2% and 18% of genome equivalents, respectively (Tang et al., 2012;Supplementary Table 2). This enrichment is again suggestive of the importance of iron to the holobiont relative to free-living microbes in the upper water column. The apparent enrichment of phosphorus and iron functions in the microbiome may provide a competitive advantage to the holobiont relative to co-occurring free-living microbes and underlie possible syntrophic interactions between Trichodesmium and the microbiome, but this enrichment could also enhance internal competition within the holobiont. Taken together, the taxonomic and functional diversity within the Trichodesmium microbiome, and its variation along the transect, raises the possibility that the fitness of holobiont members or regional biogeochemistry could influence the distribution and activities of the holobiont.

The microbiome dominates holobiont functional diversity
We evaluated the distribution of and diversity of physiological capabilities in the Trichodesmium host (genome bins 1-3 and 9) compared with members of their microbiome by functionally analyzing OGs of proteins. The taxonomic composition of predicted proteins in OGs was assessed to investigate the extent to which the Trichodesmium microbiome augments the global metabolic potential of the holobiont. Of the total 55 738 unique OGs, 4546 OGs were composed solely of predicted proteins identified as Trichodesmium and 2252 OGs were composed of predicted proteins from Trichodesmium and epibiont genome bins ( Figure 4). The Trichodesmium spp. present in the field samples contained over two times the number of gene families (6798) present in T. erythraeum IMS101 genome, which is predicted to encode 5076 proteins (Walworth et al., 2015) and yielded 2982 OGs following the clustering protocol described herein. Trichodesmium does not exhibit genome streamlining like other oligotrophic cyanobacteria (Walworth et al., 2015), and the disparity in the number of OGs between type strain and field samples could be driven by the presence of multiple species with varying gene contents, or the fact that T. erythraeum is not common in Trichodesmium populations in the North Atlantic (Rouco et al., 2014).
Given the taxonomic diversity of the microbiome, we would expect the number of epibiont OGs to exceed that of Trichodesmium. Nearly 90% of the total 55 738 OGs were composed only of epibiontidentified proteins (Figure 4). This number exceeds what would be expected from the eight core microbiome members together and likely comes from less abundant epibionts that were not sequenced deeply enough to yield genome bins. These epibiont-only OGs represent functions without homologs in Trichodesmium, suggesting recruitment and selection in the microbiome could expand and alter the metabolic repertoire of the holobiont. To more conservatively examine functional content, OGs were assigned functional annotations using KEGG, and the Figure 3 Enrichment of functional pathways recovered from epibiont genome bins. The distribution is based on RAST annotation against the SEED subsystems (Aziz et al., 2008;Overbeek et al., 2014). The contribution of each epibiont to a given SEED subsystem is scaled relative to the percentage of genes within each subcategory found in each genome bin. Ammonifi., Ammonification; ABC, ABC transporter; AA, amino acid. Alpha., Alphaproteobacteria; Bacter., Bacteroidetes; Gamma., Gammaproteobacterium. epibiont-only OGs were found to have twice as many unique functions relative to those shared between epibionts and Trichodesmium, and nearly 10 times as many unique functions as those found in Trichodesmium alone (Supplementary Figure 2). These differences are most notable within the genetic information processing module, consistent with the broad taxonomic diversity within the microbiome and to some extent in the carbohydrate and lipid metabolism module (Supplementary Figure 2), which is represented by an abundance of carbohydrate active enzymes not present in Trichodesmium (see below).
Specific OGs enriched or unique to the microbiome included proteins with functions related to cell-cell signaling and the processing of organic matter ( Figure 5). Homologs to the proteins responsible for sensing and responding to quorum-sensing molecules like acyl homoserine lactone (AHL), the LuxR family, were detected in all epibionts ( Figure 5). Although putative LuxR homologs identified as belonging to Trichodesmium genome bins shared sequence identity along the DNA-binding domain, it has been previously determined that these Trichodesmium genes do not share the characterized AHL-binding residues of these proteins (Vannini et al., 2002;Patankar and Gonzalez, 2009;Van Mooy et al., 2012). This indicates that Trichodesmium is either not involved in quorum sensing, or that the LuxR variant present in Trichodesmium is responding to a different quorum-sensing molecule. Homologs to proteins in the LuxQ family, which is part of a sensor kinase complex that detects the autoinducer-2 (AI-2) signaling molecule (Miller and Bassler, 2001) were detected in the Rhodospirillales (bins 5 and 11) ( Figure 5). Homologs of LuxI, a gene responsible for synthesis of AHL quorum-sensing molecules, and a homolog to acyl-sn-glycerol-3-phosphate acyltransferase, a putative AHL synthase designated HdtS and characterized in Pseudomonas fluorescens (Case et al., 2008), were found in the Rhodobacterales and Rhodospirillales epibionts ( Figure 5). Quorumsensing circuits, and subsequently the activities they modulate, can be broken through the secretion of molecules that break down AHLs. This process, termed quorum quenching, can be driven by quorum quenching enzymes like metallo-beta-lactamases, which degrade antibiotics and AHLs (Hong et al., 2012;Fetzner, 2014). Homologs of this enzyme were detected in Microscilla, Rhodospirillales and Rhodobacterales ( Figure 5). Taken together, these data indicate that a suite of different quorum-sensing pathways are present within members of the microbiome. In fact, quorum-sensing molecules and quorum-quenching activity have been detected in Trichodesmium colonies from environmental samples, and quorum-sensing molecules added to sinking particles from Trichodesmium rich environments stimulate the hydrolysis of organic matter Figure 4 Distribution of OGs in the Trichodesmium holobiont. A total of 264 073 predicted proteins (470 amino acids) were clustered into 55 738 OGs. OGs were considered 'Trichodesmium only' or 'epibiont only' if they were composed of predicted proteins solely from those organisms. The 'both' category refers to OGs composed of predicted proteins from Trichodesmium and epibiont genome bins.
OGs with homologs of nitric oxide synthase (NOS) and the sensing protein (H-NOX) were found in both Microscilla-identified genome bins ( Figure 5). The presence of both pathways have only previously been detected in an Alphaproteobacterial lineage (Rao et al., 2015), and their presence here in Microscilla, a genus in the Bacteroidetes class, suggests that the ability to both produce and sense NO is more widespread in marine bacteria than previously thought. The NO signaling pathway can be triggered by Trichodesmium to induce biofilm formation (Rao et al., 2015), and concomitant quorum sensing is used to modulate a range of responses including N 2 fixation, siderophore and enzyme biosynthesis, as well as motility, aggregation and biofilm formation in other systems (Bassler, 2012).
Finally, homologs were detected for the efflux and sensing of auxin, a family of plant hormones implicated in stimulating a suite of plant processes, including growth and division, which can be synthesized by both plants and bacteria (Spaepen and Vanderleyden, 2011) (Figure 5). In the microbiome, the Microscilla epibionts contained an auxin responsive gene, and both Microscilla and the Rhodospirillales contained genes for the degradation of phenylacetic acid ( Figure 5), an auxin-like compound produced by certain plant-associated bacteria with known antimicrobial activity (Somers et al., 2005). For heterotrophic bacteria, auxin compounds have been shown to induce resistance to stress agents and biofilm formation, upregulation of the tricarboxylic acid cycle and amino-acid biosynthesis, and increased enzyme activity (Spaepen and Vanderleyden, 2011 and references therein). The known roles of auxin, NO and AHLs in cell-cell signaling suggests that these gene targets could be used to query potential interactions, such as mutualism or parasitism within the holobiont. Overall, the presence and known activation of a number of different signaling pathways in the Trichodesmium microbiome suggests that physiological activities may be modulated and coordinated within the holobiont, and could vary depending on the relative abundance of epibionts and with fluctuations in taxonomic composition of the microbiome or the fitness of the Trichodesmium host.
Another set of OGs enriched in the microbiome were identified as members of the glycoside hydrolase (GHs) family of carbohydrate active enzymes (CAZymes). The majority of GH OGs were unique to epibionts with over 80% (21 out of 26) of the unique GH-identified OGs composed solely of epibiont proteins (Supplementary Table 3). These GHs are predicted to be active against a diverse suite of compounds including xylans, lichenin, chitin, xylose and arabinose (Supplementary Table 3). GH enzymes have been implicated in the bacterial processing of algal-derived polysaccharides (Teeling et al., 2012) and the processing of Trichodesmium exudates by epibionts has been suggested to affect the rate of carbon and nitrogen transfer to the deep ocean (Herbst and Overbeck, 1978;Nausch, 1996;Hmelo et al., 2012). As such, microbiome metabolism and its potential modulation through cell-cell signaling, may influence organic matter processing within the holobiont, competition for resources, and the concomitant impact on the fate of carbon and nitrogen. The abundance of these GHs and other epibiont-only OGs representing key functions highlights how the microbiome of Trichodesmium expands holobiont functional diversity. The extent to which this metabolic potential increases or decreases host fitness may influence the cycling of both nitrogen and carbon.
Resource niche partitioning in the Trichodesmium holobiont Iron and phosphorus bioavailability have been shown to influence the distribution and activities of diazotrophic communities (Deutsch et al., 2007;Krishnamurthy et al., 2007), and both are major drivers of Trichodesmium abundance and N 2 fixation rate in the Atlantic (Wu et al., 2000;Sañudo-Wilhelmy et al., 2001;Sohm and Capone, 2006;Webb et al., 2007;Moore et al., 2008Moore et al., , 2009. To evaluate the contribution of the microbiome to potential iron and phosphorus cycling in the holobiont, OGs were additionally screened for the presence of key processes including the metabolism of organic phosphorus, processing of reduced phosphorus, and metabolism and transport of organic iron. The alkaline phosphatase enzyme hydrolyzes phosphate from phosphoester bond dissolved organic phosphorus, and is central to microbial phosphorus bioavailability in the western North Atlantic (Mahaffey et al., 2014), where dissolved organic phosphorus concentration is higher than the inorganic phosphate concentration (Lomas et al., 2010). Homologs for the alkaline phosphatases PhoA, PhoD and PhoX were found in both Trichodesmium and the microbiome ( Figure 5). PhoX was found in Trichodesmium, as well as all epibionts with the exception of one Rhodospirillales (bin 12). Within the core microbiome, PhoX was more common than PhoA, which was only found in bins 5, 6, 7 and 10, and PhoD, which was only detected in the Microscilla genome bins (bins 4 and 6). This observation agrees with previous molecular surveys that found PhoX to be more prevalent than PhoA among oligotrophic planktonic marine bacteria (Sebastian and Ammerman, 2009).
OGs with proteins for the uptake and metabolism of reduced phosphorus were also present in both Trichodesmium and the microbiome ( Figure 5). The phosphite dehydrogenase PtxD was detected in Trichodesmium and the microbiome, and a homolog of an alternate phosphite dehydrogenase enzyme was detected in the Rhodospirillales and in Microscilla in the form of HtxA, a gene characterized in Pseudomonas stutzeri that has a similar function to PtxD (Metcalf and Wolfe, 1998) (Figure 5). When paired to the ABC transporter PtxABC, PtxD allows Trichodesmium to use phosphite as a sole phosphorus source (Polyviou et al., 2015). In addition, Trichodesmium consortia at station 9 took up radiolabeled phosphite (Van Mooy et al., 2015) and the data herein suggest that phosphite is a substrate for both the Trichodesmium and multiple members of the microbiome, which could contribute to resource competition between holobiont members.
Other pathways to metabolize reduced phosphorus, like those involved in phosphonate transport (Phn E) and hydrolysis were detected in both Trichodesmium and the microbiome. PhnJ, a component of the broad specificity C-P lyase, was found in Trichodesmium, consistent with its presence in the T. erythraeum IMS101 genome , two Rhodospirillales epibionts and the Gammaproteobacterium ( Figure 5). Finally, phosphoenolpyruvate mutase (PepM), the enzyme that catalyzes formation of a carbon-phosphorus bond through the conversion of phosphoenolpyruvate to phosphonopyruvate and thought to be a major source of biological phosphonate production (Metcalf and van der Donk, 2009), was detected in Microscilla bin 4, an unbinned epibiont species, as well as one of the Trichodesmium genome bins ( Figure 5). Van Mooy et al. (2015) showed elevated rates of phosphonate compound biosynthesis at station 5, and the metagenome findings suggest that phosphonate biosynthesis may be driven by both Trichodesmium and members of the microbiome, in particular Miroscilla in this system. Overall, the abundance and diversity of phosphorus-related functions shared between Trichodesmium and the microbiome is consistent with the importance of active phosphorus cycling within the holobiont, as may be expected given the known role of phosphorus in limiting Trichodesmium populations in this region (Sañudo-Wilhelmy et al., 2001;Dyhrman et al., 2002).
Similar to phosphorus-related OGs, iron-related OGs were also enriched in the holobiont (Supplementary Table 2), however, the majority of these were present only in epibionts, and with subtle differences in specific genes contributing to each OG between different members of the microbiome ( Figure 5). Homologs to a siderophore biosynthesis protein were found exclusively in one Microscilla (bin 6) ( Figure 5). In addition, homologs to heme group biosynthesis and exporter proteins were made up exclusively from those found within the Microscilla and Rhodospirillales ( Figure 5). Homologs of the TonB-dependent transporter system predicted to transport heme-and siderophore-bound iron, were detected in Trichodesmium and all epibiont genome bins ( Figure 5; Supplementary Table 2), suggesting that while not all holobiont members produce these molecules, they could be taken up and potentially utilized by the core microbiome and Trichodesmium. Trichodesmium is rarely maintained in axenic culture, so direct comparisons of iron uptake between epibionts and the host are challenging. Roe et al. (2012) found that siderophore-bound iron was not as accessible to the Trichodesmium holobiont as inorganic forms, at least relative to select epibiont isolates, which could readily utilize most iron sources. However, uptake of different iron forms has been shown to vary between colonies (Achilles et al., 2003), suggestive of the fact that alterations in epibiont composition could influence the accessibility of different iron forms to the holobiont, and could increase competition for this nutrient in low iron, oligotrophic conditions. Overall, the production and uptake of different organic iron complexes by epibionts could shape the dynamics of irondependent processes like N 2 fixation, particularly in the many low iron regions of the ocean. In contrast to the redundant phosphorus-related functions, the epibiont-only iron functions highlight how the microbiome contributes diverse and unique functions that have the potential to uniquely modulate the geochemical microenvironment within colonies. Iron is relatively more abundant in the study region than in other areas like the North Pacific (Sohm et al., 2011 and references therein), and additional surveys of the functional composition of holobiont metagenomes from diverse ocean regimes would identify the consistency of the iron and phosphorusrelated OG distributions within the Trichodesmium holobiont.

Conclusions
The importance of microbiomes to marine metazoans is well established (for example, Hentschel et al., 2012), but the role of the Trichodesmium microbiome in shaping the distribution, activities and concomitant biogeochemical impact of these consortia is still in its infancy. Here we show that the substantial majority of the metabolic potential in the Trichodesmium holobiont is contained within the microbiome. This finding suggests that within the microbiome, there is a palette of functional diversity that could modulate host fitness and subsequent biogeochemical impact across different environments. This study also provides an annotated holobiont metagenome that could serve as a template for future metatranscriptomic and metaproteomic investigations focused on tracking physiological activities or interactions within the holobiont. The microbiome may underpin Trichodesmium's success in oligotrophic systems and could be an important facet determining its resilience in a future ocean that is likely to bring increased oligotrophic conditions (Riebesell et al., 2009). Overall, these results suggest that Trichodesmium should not be considered in isolation, but rather studied as a dynamic microbial holobiont.