TbasCO: trait-based comparative ‘omics identifies ecosystem-level and niche-differentiating adaptations of an engineered microbiome

McDaniel, E. A.; van Steenbrugge, J. J. M.; Noguera, D. R.; McMahon, K. D.; Raaijmakers, J. M.; Medema, M. H.; Oyserman, B. O.

doi:10.1038/s43705-022-00189-2


Article
Open access
Published: 07 November 2022


E. A. McDaniel ORCID: orcid.org/0000-0003-4692-0913^1,2^na1,
J. J. M. van Steenbrugge^3,4,5^na1,
D. R. Noguera⁶,
K. D. McMahon^1,6,
J. M. Raaijmakers^4,7,
M. H. Medema^3,7 &
…
B. O. Oyserman ORCID: orcid.org/0000-0001-9052-2651^3,4

ISME Communications volume 2, Article number: 111 (2022)


Abstract

A grand challenge in microbial ecology is disentangling the traits of individual populations within complex communities. Various cultivation-independent approaches have been used to infer traits based on the presence of marker genes. However, marker genes are not linked to traits with complete fidelity, nor do they capture important attributes, such as the timing of gene expression or coordination among traits. To address this, we present an approach for assessing the trait landscape of microbial communities by statistically defining a trait attribute as a shared transcriptional pattern across multiple organisms. Leveraging the KEGG pathway database as a trait library and the Enhanced Biological Phosphorus Removal (EBPR) model microbial ecosystem, we demonstrate that a majority (65%) of traits present in 10 or more genomes have niche-differentiating expression attributes. For example, while many genomes containing high-affinity phosphorus transporter pstABCS display a canonical attribute (e.g. up-regulation under phosphorus starvation), we identified another attribute shared by many genomes where transcription was highest under high phosphorus conditions. Taken together, we provide a novel framework for unravelling the functional dynamics of uncultivated microorganisms by assigning trait-attributes through genome-resolved time-series metatranscriptomics.


Introduction

A longstanding cornerstone of deterministic ecological theory is that the environment selects for traits. Traits may be defined as any physiological, morphological, or genomic signature that affects the fitness or function of an individual [1]. Trait-based approaches have become indispensable in macroecological systems to describe fitness trade-offs and the effects of biodiversity on ecosystem functioning [2,3,4,5]. Recently, trait-based frameworks have been proposed as an alternative to taxonomy-based methods for describing microbial ecosystem processes [6, 7]. Connecting microbial traits and their phylogenetic distributions to ecosystem-level functions can provide powerful insights into the ecological and evolutionary dynamics underpinning community assembly, microbial biogeography, and organismal responses to changes in the environment [8,9,10]. Additionally, pinpointing the organismal distribution of traits and the ecological selective pressures that enrich them may be leveraged to reproducibly and rationally engineer stable, functionally redundant ecosystems [11,12,13,14,15]. However, applying trait-based approaches to microbial communities is challenging due to the difficulty in identifying and measuring relevant ecological traits for a given ecosystem [16].

High-throughput sequencing technologies and multi-omics techniques are now routinely used to describe the diversity, activity, and functional potential of uncultivated microbial lineages [17,18,19,20,21]. Improvements in bioinformatics algorithms, and in particular metagenomic binning methods, have allowed for genome-resolved investigations of microbial communities rather than gene-based analyses of assembled contigs [22]. These (meta)genomes are subsequently leveraged to detect the presence of key genes or pathways and predict specific traits of the whole community [19, 23]. Integrating metatranscriptomics data addresses a key limitation of this approach, as expression patterns better reflect the actual functional dynamics of a trait than gene presence alone. Here, we present TbasCO, a software package and statistical framework for Trait-based Comparative ‘Omics to identify expression attributes. We adopt the term attribute for a hierarchically structured feature of a trait and assert that statistically similar transcriptional patterns of traits across multiple organisms be treated as attributes (Fig. 1). This new terminology addresses two key semantic challenges. First, it extends the current usage of the term “trait” from the presence and absence of pathways to the corresponding transcriptional patterns. Second, it addresses a limitation of the terminology of “co-expression”, which becomes biologically inaccurate when comparing across independent populations of organisms within a community. In this manner, the identification of expression-based attributes provides a high-throughput and intuitive framework for extending trait-based methods to time-series expression patterns in microbial communities.
We implement this trait-based approach to classify transcriptional attributes in a microbial community performing Enhanced Biological Phosphorus Removal (EBPR), a globally important biotechnological process implemented in numerous wastewater treatment plants (WWTPs).

**Fig. 1: Overview of trait-based comparative transcriptomics approach**

The fundamental feature of the engineered EBPR ecosystem is the decoupled and cyclic availability of an external carbon source and terminal electron acceptor. This cycling is often referred to as “feast-famine” conditions and provides a strong selective pressure for traits such as polymer cycling. Accumulation of intracellular polyphosphate through cyclic anaerobic-aerobic conditions ultimately results in net phosphorus removal and accomplishes the EBPR process [24, 25]. One of the most well-studied polyphosphate accumulating organisms (PAOs) belongs to the uncultivated bacterial lineage ‘Candidatus Accumulibacter phosphatis’ (hereafter referred to as Accumulibacter) [24, 26]. Numerous genome-resolved ‘omics methods have been used to investigate the physiology and regulation of this model PAO enriched in engineered lab-scale enrichment bioreactor systems [27,28,29,30,31,32,33,34,35]. However, novel and putative PAOs have been discovered that remove phosphorus without exhibiting the hallmark traits of Accumulibacter [36,37,38,39,40,41]. Additionally, although these lab-scale systems are designed to specifically enrich for Accumulibacter, a diverse bacterial community persists in these environments [27], and the ecological roles of its members have largely remained unexplored. As a result, the general adaptations of microbial lineages inhabiting the EBPR community are not well understood. Using genome-resolved metagenomics and metatranscriptomics, we assembled 66 species-representative genomes spanning several significant EBPR lineages and identified the distribution of expression-based attributes. We show that while some expression attributes are distributed in few genomes, many are redundant and shared across many lineages.
Furthermore, we find that a majority of core traits (as defined by the presence of marker genes) have multiple attributes, suggesting that identifying niche-differentiating expression attributes can reveal a large metabolic versatility that remains hidden when investigating genomic data alone.

Materials and methods

Metagenomic assembly, annotation, and metatranscriptomic mapping

Three metagenomes sampled from an EBPR bioreactor in May of 2013 with linked time-series metatranscriptomics data were sequenced [42]. Samples were collected and DNA extracted according to the Supplemental Methods. Metagenomic samples were processed and assembled into 66 species-representative bins as described in detail in the Supplemental Methods. All bins are greater than 75% complete and contain less than 10% contamination, with a large majority (44/66) >95% complete and <5% redundant as calculated by CheckM [43] and are all described in Table 1.

Table 1 Genome quality statistics and relative abundance calculations for all 66 EBPR SBR MAGs.


Each bin was functionally annotated using the KEGG database through an HMM-based approach under KEGG release 93.0 using the command-line KofamKOALA pipeline [44, 45], selecting annotations that were significant hits above the specific HMM threshold. This resulted in 117,657 total annotations with 5,228 unique annotations. We used a metatranscriptomic dataset of six timepoints collected over a single EBPR cycle from Oyserman et al. 2016 [42], with three timepoints from the anaerobic phase and three from the aerobic phase. Raw reads were quality filtered using BBtools suite v38.07 [46] and ribosomal RNA (rRNA) was removed from each sample using SortMeRNA [47]. Reads from each sample were mapped against the concatenated set of open reading frames from all 66 bins using kallisto v0.44.0 and parsed using the R package tximport [48, 49].

TbasCO method implementation

The TbasCO package identifies expression-based attributes of predefined traits using time-series (meta)transcriptomics data (Fig. 1). As expression patterns are determined by the time-points assessed in an experiment, it is important to design the sampling regime to capture relevant ecophysiological changes within the ecosystem. In general, traits are defined by the presence of a pathway or other collection of genes from an externally provided database. A weighted distance metric between expression patterns for all genes that define a trait is calculated, and statistically significant similarity is determined based on the background distribution of a trait of equal size. Thereby, two or more organisms with a statistically similar expression pattern for a trait share an attribute. As the expression profiles of genes within a trait are compared across genomes independently, co-expression of genes within a genome is not a pre-requisite for identifying an attribute.

Input and preprocessing

The input accepted by TbasCO is a table of RNAseq counts in csv format. Each row is treated as a gene, with columns for the gene/locus name, counts per sample, the genome the gene belongs to, and the KEGG Orthology (KO) identifier. The RNAseq counts table may be provided pre-normalized or can be normalized by the program. The default normalization method is designed to minimize compositional bias in the differential abundance and activity of constituent populations in metatranscriptomics studies. RNA expression counts are therefore normalized relative to each genomic bin separately for each sample [42]. After normalization, a pruning step is introduced to filter genes that have zero counts or a mean absolute deviation of less than one across all time points. To make inter-organismal comparisons of the relative contribution of a gene to total measured organismal RNA, an additional statistic is calculated by ranking the expression counts from each sample from highest to lowest. The ranks for each sample are then normalized by dividing them by the maximum rank value in that sample. This normalization is applied to make ranks comparable between organisms with different genome sizes.
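The pruning and rank-normalization steps described above can be illustrated with a minimal Python sketch. This is not the TbasCO implementation (TbasCO is distributed as a software package with its own code); the function names `prune` and `normalized_ranks` are hypothetical, the pruning rule is read here as "drop a gene if any time point has a zero count or if its mean absolute deviation is below one", and ties in rank are broken arbitrarily. All three choices are assumptions made for illustration.

```python
from statistics import mean


def mean_absolute_deviation(values):
    """Mean absolute deviation of a list of counts around their mean."""
    m = mean(values)
    return mean(abs(v - m) for v in values)


def prune(genes, min_mad=1.0):
    """Keep genes with no zero counts and a mean absolute deviation >= min_mad.

    `genes` maps a gene/locus name to its normalized counts across time points.
    Reading "zero counts" as "any zero count" is an assumption.
    """
    return {
        gene: counts
        for gene, counts in genes.items()
        if all(c > 0 for c in counts)
        and mean_absolute_deviation(counts) >= min_mad
    }


def normalized_ranks(sample_counts):
    """Rank the genes of one genome in one sample, highest count first (rank 1),
    then divide each rank by the maximum rank in that sample.

    Ties are broken by sort order, which is a simplification.
    """
    ordered = sorted(sample_counts, key=sample_counts.get, reverse=True)
    max_rank = len(ordered)
    return {gene: (i + 1) / max_rank for i, gene in enumerate(ordered)}
```

Dividing by the maximum rank places every gene on a scale from just above 0 to 1, so a gene ranked tenth in a small genome is not treated the same as a gene ranked tenth in a large one.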

To assess the statistical significance of the calculated distances between the expression patterns of all genes within a trait, random background distributions are created for (1) individual genes and (2) traits of N genes. For individual genes, three different distributions are calculated, based on the distances between randomly sampled open reading frames, randomly sampled genes with an annotation (but not necessarily the same annotation), and randomly sampled genes with the same annotation. The background distribution for a trait of N genes is based on the distances between randomly composed sets of genes. For each gene pair, two distance metrics are calculated: the Pearson Correlation (PC) and the Normalized Rank Euclidean Distance (NRED). In practice, it is often found that a certain annotation is assigned to multiple genes in the same genome. If this occurs, there is an option to use either a random selection or the highest scoring pair. In the latter case, a correction for multiple testing is implemented. This process is repeated N times, where N equals the number of genes in any given trait. The background distribution for traits is determined by first randomly sampling two genomes, identifying the overlap in annotations, and finally artificially defining a trait containing N annotations. For each annotation in the trait, the distances are calculated between genome A and genome B, as described in the previous section. As modules vary in size, this process is repeated for traits of different sizes.
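To make the idea of a random background distribution concrete, the following Python sketch samples random gene pairs from different genomes and summarizes their Pearson correlations. It covers only the simplest of the gene-level backgrounds (randomly sampled open reading frames) and only the PC metric; the function names, the default of 1,000 pairs, and the fixed seed are assumptions for illustration and are not taken from TbasCO.

```python
import random
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles.

    Assumes neither profile is constant (pruning removes such genes).
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def background_pc(profiles, n_pairs=1000, seed=0):
    """Return (mean, sd) of PC between randomly paired genes of different genomes.

    `profiles` is a list of (genome_name, expression_profile) tuples.
    """
    if len({genome for genome, _ in profiles}) < 2:
        raise ValueError("need genes from at least two genomes")
    rng = random.Random(seed)
    values = []
    while len(values) < n_pairs:
        (genome_a, prof_a), (genome_b, prof_b) = rng.sample(profiles, 2)
        if genome_a == genome_b:
            continue  # only inter-genome pairs contribute to the background
        values.append(pearson(prof_a, prof_b))
    return mean(values), stdev(values)
```

The resulting mean and standard deviation are what a later step would need to convert an observed PC into a Z score; an analogous background would be built for NRED.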

Identifying attributes

TbasCO provides both a cluster-based and pair-wise approach to identify attributes. In both methods, the distance between expression patterns of a trait between two genomes is first calculated based on a composite Z score of the PC and NRED for each gene composing the trait. In the cluster-based analysis, the distances are subsequently clustered using the Louvain clustering algorithm to identify trait attributes. To determine if an attribute is significantly similar or not, a one-sided T-test between the attribute and the random background distribution of traits is conducted. This is done for both cluster-based and model-based comparisons. Many traits are complex and represented in databases such as KEGG by numerous alternative routes. To deal with this complexity, each pathway is expanded into all possible alternative routes. Due to the extremely high number of alternative routes for some traits, attributes are pruned based on a strict requirement of 100% completion.
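The expansion of a pathway into all alternative routes, followed by the 100% completion requirement, can be sketched as follows. The representation is a simplifying assumption: a pathway is modelled as an ordered list of steps, each step as a list of alternatives, and each alternative as a tuple of KO identifiers that must all be present. The identifiers `KO_A` to `KO_F` are placeholders, not real KEGG entries, and the function names are hypothetical.

```python
from itertools import product


def expand_routes(steps):
    """Expand a pathway into every alternative route.

    `steps` is a list of steps; each step is a list of alternatives; each
    alternative is a tuple of KO identifiers required together.
    Returns one frozenset of KOs per route.
    """
    routes = []
    for choice in product(*steps):
        routes.append(frozenset(ko for alternative in choice for ko in alternative))
    return routes


def complete_routes(routes, genome_kos):
    """Keep only routes for which the genome carries every required KO."""
    return [route for route in routes if route <= genome_kos]


# Hypothetical three-step pathway: steps 1 and 3 each have two alternatives.
example_steps = [
    [("KO_A",), ("KO_B",)],
    [("KO_C", "KO_D")],
    [("KO_E",), ("KO_F",)],
]
example_routes = expand_routes(example_steps)
```

Because the number of routes is the product of the number of alternatives at each step, it grows quickly for large pathways, which is consistent with the need for pruning noted above.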

Distance calculations

To determine the similarity in expression patterns between genes, two dissimilarity metrics are calculated: the PC between RNAseq counts across samples, and the NRED, where ranks are a measure of relative abundance of RNA in each sample, normalized to the abundance of RNA in the corresponding genome. These distance scores are converted to Z scores using a background distribution of distances between randomly sampled genes as previously described. To determine statistically significant similarities in expression patterns of a trait, a composite score is calculated. For each gene in the trait, the PC and NRED are calculated, transformed to Z scores, and combined as (−1*PC + NRED). The distance of the trait between two genomes is defined as the average of these composite distance scores. If traits being compared do not have 100% overlap in gene content, then the dissimilarity score is normalized by the Jaccard distance between gene content of the trait.

$$\left( -\mathrm{PC} + \mathrm{NRED} \right) \times \left( 1 - d_J \right)$$
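A minimal Python sketch of this composite distance is given below, assuming the per-gene PC and NRED values and the background means and standard deviations have already been computed. The function names and the dictionary layout are hypothetical; only the combination rule (negated PC Z score plus NRED Z score, averaged over genes, then multiplied by one minus the Jaccard distance) follows the text.

```python
from statistics import mean


def z_score(value, bg_mean, bg_sd):
    """Standardize a value against a background mean and standard deviation."""
    return (value - bg_mean) / bg_sd


def jaccard_distance(genes_a, genes_b):
    """Jaccard distance between the gene content of a trait in two genomes."""
    a, b = set(genes_a), set(genes_b)
    return 1 - len(a & b) / len(a | b)


def trait_distance(per_gene, genes_a, genes_b, background):
    """Average composite score over shared genes, scaled by (1 - Jaccard distance).

    `per_gene` maps each shared annotation to its (PC, NRED) pair.
    `background` maps "pc" and "nred" to (mean, sd) tuples.
    """
    composite = [
        -z_score(pc, *background["pc"]) + z_score(nred, *background["nred"])
        for pc, nred in per_gene.values()
    ]
    return mean(composite) * (1 - jaccard_distance(genes_a, genes_b))
```

Under this sign convention, a high correlation and a small rank distance both push the score down, so more negative values indicate more similar expression; incomplete overlap in gene content shrinks the score towards zero.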

Statistical assessment of trait attributes

In both model-based and pair-wise approaches, the distance is first calculated between expression patterns of a trait between two genomes based on the composite Z score of the PC and NRED for each gene composing the trait. In the clustering-based analysis, the distances are subsequently clustered using the Louvain clustering algorithm to identify trait-attributes. To determine if attributes are significantly similar, a one-sided T-test is conducted between the attribute and a background distribution of randomly sampled traits with the same number of genes. To derive the random background distributions, multiple distributions are calculated ranging in gene numbers from the smallest trait to the largest trait in the dataset as described previously. For each background distribution, N (default: 10,000) traits are randomly composed. The distances between these artificial traits are calculated in the same way as for the actual traits. In addition to a statistical pruning step, the attributes are pruned based on a strict requirement of 100% completion of each module. A benchmarking analysis to examine the effects of different parameters, including the presence of zero counts, was conducted to determine their influence on the number of attributes identified and may be found in the supplementary materials (Supplementary Table 1, Supplementary Figs. 2–4).
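The one-sided comparison of an attribute against its background can be sketched with the Python standard library. This sketch computes a Welch-style t statistic and, because the standard library has no Student t distribution, approximates the one-sided p value with a standard normal; that approximation is a stand-in chosen here, not the test as implemented in TbasCO, and it is anti-conservative for small attributes. The direction of the test (attribute distances lower than background) assumes that lower composite distances mean more similar expression.

```python
from statistics import NormalDist, mean, variance


def one_sided_test(attribute_distances, background_distances):
    """Test whether attribute distances are lower than background distances.

    Returns (t, p), where t is a Welch-style statistic and p is a normal
    approximation to the one-sided p value (an assumption of this sketch).
    """
    n_a, n_b = len(attribute_distances), len(background_distances)
    standard_error = (
        variance(attribute_distances) / n_a
        + variance(background_distances) / n_b
    ) ** 0.5
    t = (mean(attribute_distances) - mean(background_distances)) / standard_error
    p = NormalDist().cdf(t)  # probability of a statistic this low or lower
    return t, p
```

In practice the background would contain the randomly composed traits of matching size (10,000 by default according to the text), so the background variance term contributes little to the standard error.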

Results and discussion

Reconstructing a diverse EBPR SBR community

To explore trait-based transcriptional dynamics of a semi-complex microbial community, we applied genome-resolved metagenomics and metatranscriptomics to an EBPR sequencing-batch reactor (SBR) ecosystem (Fig. 2). We previously performed a metatranscriptomics time-series experiment over the course of a normally operating EBPR cycle to investigate the regulatory controls of Accumulibacter gene expression [42]. In this experiment, six samples were collected for RNA sequencing: three from the anaerobic phase and three from the aerobic phase (Fig. 2A). Additionally, three metagenomes were collected from the same month as the metatranscriptomic experiment, including a sample from the same date as the experiment. We reassembled contemporary Accumulibacter clade IIA and IA genomes that were previously assembled from the same bioreactor system [27, 28]. The genomes of Accumulibacter clades IA and IIA share approximately 85% average nucleotide identity [28, 31], which is well below the common species-level cutoff of 95%, and these groups have recently been designated as separate species (Candidatus Accumulibacter regalis and Candidatus Accumulibacter phosphatis, respectively) [35]. However, we maintain references to the Accumulibacter clade nomenclature based on polyphosphate kinase (ppk1) sequence identity throughout the manuscript (CAPIA and CAPIIA) [31, 50, 51]. During the experiment, the bioreactor was highly enriched in Accumulibacter clade IIA, which accounted for approximately 50% of the mapped metagenomic reads and the highest transcriptional counts (Fig. 2B, C) [42]. In contrast, Accumulibacter clade IA was present at low abundance but was within the top 10 genomes with the highest total transcriptional counts (Fig. 2C).

**Fig. 2: Genome-resolved metatranscriptomics approach of an EBPR system.**

Although this bioreactor system was highly enriched in Accumulibacter, a diverse bacterial community persisted and was active in this ecosystem (Fig. 2B, C). We reconstructed representative population genomes of the microbial community of the SBR system, resulting in 64 metagenome-assembled genomes (MAGs) of the (non-Accumulibacter) bacterial community. Interestingly, we recovered genomes of experimentally verified and putative PAOs previously not detected in these bioreactors, including two Tetrasphaera spp. (TET1 and TET2), ‘Candidatus Obscuribacter phosphatis’ (OBS1), and Gemmatimonadetes (GEMMA1). Pure cultures of Tetrasphaera have been experimentally shown to cycle polyphosphate without incorporating PHA [37], deviating from the hallmark Accumulibacter PAO model. The first cultured representative of the Gemmatimonadetes phylum, Gemmatimonas aurantiaca, was isolated from an SBR simulating EBPR and was shown to accumulate polyphosphate through Neisser and DAPI staining [52]. Additionally, Ca. Obscuribacter phosphatis has been hypothesized to cycle phosphorus based on the presence of genes for phosphorus transport, polyphosphate incorporation, and potential for both anaerobic and aerobic respiration [38], and was enriched in a photobioreactor EBPR system [53]. The Tetrasphaera spp. TET1 and TET2, OBS1, and GEMMA1 groups all exhibit higher relative abundance than CAPIA but have similar relative transcriptional levels (Fig. 2B, C, Table 1).

Numerous SBR MAGs among the Actinobacteria and Proteobacteria contain the high-affinity phosphorus transporter pstABCS system, polyphosphate kinase ppk1, and the low-affinity pit phosphorus transporter (Supplementary Fig. 5). Additionally, select MAGs within the Alphaproteobacteria, Betaproteobacteria, and Gammaproteobacteria contain all required subunits for polyhydroxyalkanoate synthesis (Supplementary Fig. 5). Other abundant and transcriptionally active groups in the SBR ecosystem that are not predicted to be PAOs are members of the Bacteroidetes such as CHIT1 within the Chitinophagaceae, and Cytophagales members Runella sp. RUN1 and Leadbetterella sp. LEAD1 (Fig. 2B, C, Table 1). Interestingly, an uncharacterized group within the Bacteroidetes, represented by BAC1, contributed the third most to the pool of transcripts (Fig. 2C), and did not show phylogenetic similarity to MAGs assembled from Danish full-scale wastewater treatment systems [40] (Supplementary Fig. 1). Other groups for which we assembled MAGs but that do not exhibit clear roles in EBPR systems were Chloroflexi ANAER1 and HERP1 MAGs, Armatimonadetes FIMBRI1, Firmicutes FUSI1, and Patescibacteria SACCH1. Members of the Chloroflexi are filamentous bacteria that have been associated with bulking and foaming events in full-scale WWTPs [54,55,56], but also aid in forming the scaffolding around floc aggregates and degrade complex polymers [56,57,58]. The Patescibacteria (formerly TM7) are widespread but low-abundance members of natural and engineered ecosystems, have reduced genome sizes, and may contribute to filamentous bulking in activated sludge [22, 59]. To summarize, lab-scale SBRs designed to enrich for Accumulibacter contain diverse bacterial microorganisms [27, 32], but their ecological functions and putative interactions remain to be fully understood in the context of the EBPR ecosystem.

Identifying expression-based trait attributes among the EBPR SBR community with TbasCO

Current metatranscriptomics analyses often employ either a gene-centric [31, 60,61,62] or genome-centric approach [42, 63,64,65]. In both approaches, highly, differentially, or co-expressed genes are identified and tested for enrichment of specific functions. Enrichment- or annotation-based approaches are employed in numerous metatranscriptomics tools such as MG-RAST, MetaTrans, SAMSA2, COMAN, IMP, and Anvi’o [66,67,68,69,70,71]. Here, we expand on the use of molecular markers as traits by defining expression attributes, leveraging a priori knowledge from predefined trait libraries, such as the KEGG database [72], to statistically assess inter-species expression patterns of genes that together form a trait (Fig. 1). First, our results showed that there is statistically significant transcriptional conservation of genes at the community level; genes that share an annotation were significantly more similar than expected using two different distance metrics (NRED: p value <2.2e–16, PC: p value <2.2e–16). Extending this statistical analysis to the trait level, we identified 1674 attributes distributed across the 66 genomes. On average, we identified 9.12 genomes per attribute (SD = 5.22), with a minimum of 3 genomes and a maximum of 35 (Fig. 3B). Based on these statistics, we defined redundant attributes as those two standard deviations above the mean (19 genomes). With this cutoff applied, we identified 79 redundant trait attributes mostly belonging to pathways among carbohydrate metabolism, purine metabolism, and fatty acid metabolism categories (Table 2). Of 290 traits, 97 (33%) had two or more attributes. Traits present in 10 or more genomes were twice as likely to have two or more attributes (65%), suggesting that divergent expression patterns for a trait are common and may represent a niche-differentiating feature (Fig. 3A).
Henceforth, when multiple attributes are identified for a trait, we refer to these as niche-differentiating attributes.
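As a quick check on the cutoff and proportion quoted above, using only the figures reported in this section:

$$9.12 + 2 \times 5.22 = 19.56, \qquad \frac{97}{290} \approx 0.33$$

The stated cutoff of 19 genomes is consistent with this value taken down to a whole number of genomes; the exact rounding rule is not given in the text, so this reading is an assumption.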

**Fig. 3: Clustering and distribution of trait attributes across EBPR SBR community members.**

Table 2 KEGG pathways for core trait-attributes present in greater than 19 genomes.


From the ecosystem perspective, a clear phylogenetic signal is observed in the distribution of attributes, as genomes cluster together by shared trait attributes by phylum with some exceptions, such as genomes belonging to the Bacteroidetes, Actinobacteria, and Proteobacteria clustering together, respectively (Fig. 3C). For simplicity, we filtered the network to only include nodes with more than 5 connections. Highly redundant trait attributes belonged to modules in the lipid metabolism, energy metabolism, and nucleotide metabolism KEGG functional categories. In contrast, more specialized trait attributes on the periphery of the network or amongst group-specific clusters such as within the Actinobacteria or subsets of the Proteobacteria belonged to amino acid metabolism, biosynthesis of terpenoids and polyketides, metabolism of cofactors and vitamins, and carbohydrate metabolism KEGG modules. Pathways of note that showed a high level of redundancy include the TCA cycle, isoleucine biosynthesis, acyl-CoA synthesis, threonine biosynthesis, and cytochrome c oxidase activity (Table 2). Large pathways with hundreds of possible routes such as glycolysis, the TCA cycle, gluconeogenesis, and the pentose phosphate pathway are not included in the main network and are displayed as individual networks (Supplementary Fig. 6).

We next explored the distribution of non-redundant attributes (i.e. those shared by 3–18 genomes) (Fig. 3B). A total of 796 trait attributes with low redundancy were identified belonging to pathways involved in carbohydrate, cofactor, and vitamin metabolism, including glycolysis, gluconeogenesis, parts of the TCA cycle, tetrahydrofolate biosynthesis, tryptophan biosynthesis, and the pentose phosphate pathway (Table 3). Different sets of low redundancy trait attributes were identified within respective phyla (Supplementary Fig. 7). Between genomes belonging to the Actinobacteria, Alphaproteobacteria, Bacteroidetes, Betaproteobacteria, and Gammaproteobacteria, low redundancy attributes (belonging to less than half of the total genomes within the phylum) include carbohydrate metabolism, amino acid metabolism, and metabolism of cofactors and vitamins (Supplementary Fig. 7). Redundant trait attributes within individual phyla belong to core energy metabolism pathways, fatty acid biosynthesis, and carbohydrate metabolism. However, even within individual phyla, non-redundant attributes include different amino acids and cofactors (Extended Table 1 - available on Figshare https://figshare.com/articles/dataset/Lineage-Specific_Core_and_Niche_Differentiating_Traits/15001200).

Table 3 KEGG Pathways for differentiating trait-attributes present between 3 and 18 genomes.


As noted previously, one of the most striking findings is that a majority (65%) of traits present in 10 or more genomes have multiple expression attributes. Thus, while the presence of marker genes suggests that many organisms share a particular trait, the presence of niche-differentiating expression profiles suggests an alternative story: that there is a level of hidden metabolic diversity. For example, central carbon metabolism and energy pathways such as the TCA cycle, glycolysis, gluconeogenesis, and the pentose phosphate pathway are oftentimes considered core traits when only analyzing the presence and/or absence of individual markers belonging to these pathways. Among over 1000 high-quality MAGs assembled from full-scale Danish WWTPs, the TCA cycle and pentose phosphate pathway are highly represented among the abundant microorganisms, with glycolysis less so [40]. Although the TCA cycle and pentose phosphate pathway are present among a high number of genomes in the EBPR SBR community, different routes or parts of these pathways have niche-differentiating distributions (Supplementary Fig. 6, Tables 2 and 3). These finer-scale differences in expression of “core” traits may explain the persistence of a diverse community when solely fed acetate, as different lineages could employ similar carbon utilization pathways differently or in more versatile ways. Another salient aspect of this analysis is the astonishingly high number of possible routes within individual pathways, here represented by their Disjunctive Normal Forms. For example, accounting for all alternative routes and enzymes, the glycolysis pathway has hundreds of possible routes. Layering the many expression attributes on top of these routes reveals a large hidden metabolic versatility.

Dimensionality of the high-affinity phosphorus transporter system PstABCS

The EBPR ecosystem is characterized by its highly dynamic phosphorus cycles. To explore how different lineages respond to fluctuating phosphorus concentrations, we examined the expression-based attributes for the KEGG module of the high-affinity phosphorus transporter pstABCS (Fig. 4). The pstABCS system is an ABC-type transporter that strongly binds phosphate with high affinity under phosphorus-limiting conditions, and therefore we expected that the highest expression levels would be at the end of the aerobic cycle [73]. In contrast, we found that pstABCS expression was characterized by two different trait attributes. In the first attribute shared by 14 community members, all pstABCS components displayed the highest activity towards the end of the aerobic cycle, when phosphorus concentrations were depleted (Fig. 4, Attribute 1). Conversely, 11 community members displayed an alternate attribute where the highest activity of pstABCS was at the transition from anaerobic to aerobic phases when phosphorus concentrations are highest (Fig. 4, Attribute 2).

**Fig. 4: Trait attributes of the high-affinity phosphorus transporter system *pstABCS*.**

Interestingly, the two Accumulibacter clades IA and IIA are split amongst these separate pstABCS attributes. These results are in agreement with previous results showing that Accumulibacter clade IIC has a canonical pstABCS expression pattern (as in Fig. 4, Attribute 1), whereas Accumulibacter clade IA has a non-canonical expression pattern (as in Fig. 4, Attribute 2) [31]. By assigning trait attributes, we can extend these findings beyond Accumulibacter to other community members in the SBR ecosystem, suggesting that there are conserved ecological pressures driving niche-differentiating expression patterns in pstABCS within the EBPR community.

Distribution and expression of truncated denitrification steps among EBPR community members

Denitrification gene induction is an important ecosystem property linked to the redox status of an environment. In EBPR communities, we find many genomes with diverse and incomplete denitrification pathways, such that the denitrification steps expected in denitrifying systems are distributed across many lineages (Fig. 5). Among all 66 MAGs, we did not identify any single MAG with a complete denitrification pathway consisting of the genetic repertoire necessary to fully reduce nitrate to nitrogen gas (Supplementary Fig. 5). Instead, we identified multiple groups of organisms with truncated denitrification pathways, with steps distributed among cohorts of community members (Fig. 5).

**Fig. 5: Expression dynamics of distributed denitrification routes.**

For the first steps of reducing nitrate to nitrite, we examined expression attributes of the napAB and narGH pathways (Fig. 5B, C). For the narGH pathway, two attributes were identified (Fig. 5B). The first narGH attribute was characterized by high expression in the anaerobic phase, with decreasing transcript levels by the second time point of the anaerobic phase. Genomes containing this attribute included the experimentally verified and putative PAOs Tetrasphaera (TET1 and TET2) and Ca. Obscuribacter (OBS1), respectively. The second attribute was exhibited among members of the Actinobacteria (PROP2, PHYC2, PROP3, and NANO1), Proteobacteria (BEIJ4), and Bacteroidetes (BAC1). The attribute identified for napAB was also more highly expressed anaerobically and included CAPIA, CAPIIA, ALIC1, REYR2, RUBRI1, and BEIJ3. Interestingly, this napAB attribute had expression patterns that quickly decreased in the first aerobic time point, suggesting a tighter regulation than Attribute 1 for narGH. Together, this suggests that the regulation of denitrification within the EBPR ecosystem is a niche-differentiating feature whereby the induction of denitrification pathways occurs either anaerobically or only after anaerobic carbon contact.

A smaller cohort contained the genetic repertoire to reduce nitrite to nitrogen gas and exhibited hallmark anaerobic-aerobic expression patterns (Fig. 5E). These members within the Proteobacteria (OTTO2, BEIJ3, VITREO1, and ZOO1) contained the nirS nitrite reductase, the norBC nitric oxide reductase, and nosZ, and showed highest expression of these subunits towards the beginning of the anaerobic cycle, slowly decreasing over the aerobic period to their lowest at the end of the aerobic cycle. Although BEIJ2 lacked the norBC system, it contained the nirS nitrite reductase and nosZ, and exhibited expression patterns similar to others in this cohort. Other Proteobacteria lineages (RHODO2, FLAVO1, RHIZO1, and LEAD1) contained only the norBC subunits, which were expressed in a similar fashion (Fig. 5D). Accumulibacter clades IA and IIA as well as ALIC1 were the only lineages with near-complete denitrification pathways. These lineages contained the napAB nitrate reductase system as mentioned above, the nirS nitrite reductase, norB (missing a confident hit for the norC subunit), and nosZ. These three lineages also exhibited hallmark upregulation of all steps in the anaerobic phase, with decreased activity after aerobic contact (Fig. 5F).

Interestingly, Accumulibacter clade IA exhibited a higher level of transcripts associated with denitrification steps than clade IIA when expression levels were normalized, supporting the hypothesis that denitrification is a niche-differentiating feature among clades [28, 31, 74], and possibly a strain-specific trait, since denitrification traits cannot be predicted from ppk1 clade designations [32]. For example, independent observations of denitrification activity among strains within Accumulibacter clade IC are inconsistent [34, 75]. Within the same bioreactor environment, coexisting Accumulibacter clades differ in denitrification abilities and expression profiles [31,32,33]. Truncated denitrification pathways have also been previously shown to be distributed among community members, with the complete denitrification genetic repertoire present in only a few members [32, 33], which could be due to extensive horizontal gene transfer of genes comprising denitrification steps [32, 76]. Although this experiment was not conducted under denitrifying conditions, our approach could be applied to denitrifying EBPR systems to further understand the distribution of denitrification traits among community members and how to selectively enrich for diverse DPAOs.

Biosynthetic potential and expression dynamics of amino acid and vitamin synthesis pathways

Although SBRs are designed to enrich for Accumulibacter by providing acetate as the sole carbon source, a diverse bacterial community persists in these setups [27, 32]. One hypothesis for the persistence of these community members is that cooperative interactions arise from underlying auxotrophies for amino acid and vitamin biosynthetic pathways in Accumulibacter. Amino acids and vitamin cofactors are metabolically expensive to synthesize, and auxotrophies have been widely documented among microbial communities [77, 78]. Specifically, auxotrophies for vitamin cofactors have been shown to fuel bacterial and cross-kingdom interactions with de novo synthesizers [79, 80]. To explore this hypothesis in the EBPR SBR community, we analyzed the presence of amino acid and vitamin biosynthetic pathways and their expression patterns among the top 15 genomes based on transcript abundance (Fig. 6).

**Fig. 6: Biosynthetic potential compared to expression of amino acid and vitamin synthesis pathways for top 15 expressed MAGs.**

Within Accumulibacter, there are a few key vitamin cofactor and amino acid auxotrophies that could fuel potential interactions with other community members. Both Accumulibacter clade genomes are missing the riboflavin pathway for FAD cofactor synthesis, as well as known pathways for serine and aspartic acid (Fig. 6A). The biosynthetic pathway for aspartic acid is distributed among members of the Bacteroidetes and Proteobacteria, whereas only TET2 contains the pathway for serine synthesis (Fig. 6A). The lack of serine biosynthesis pathways in Accumulibacter and other genomes is striking given that serine is one of the least metabolically costly amino acids to synthesize [81]. Interestingly, Accumulibacter clade IIA (strain CAPIIA) does not contain the biosynthetic machinery for thiamine and pantothenate synthesis, whereas clade IA (strain CAPIA) does (Fig. 6A). Only the CAULO1, HYPHO1, and PSEUDO1 genomes within the Proteobacteria can synthesize thiamine, whereas several other members can synthesize pantothenate (Fig. 6A). The absence of the pantothenate biosynthetic pathway in Accumulibacter CAPIIA is particularly interesting given that coenzyme A is essential for polyhydroxyalkanoate biosynthesis, which fuels the rapid and extensive polymer cycling PAO phenotype of Accumulibacter [24].

Just as other community members may support the growth of Accumulibacter because of its underlying auxotrophies, the reverse may also hold. Both Accumulibacter clades contain the pathways for synthesizing tyrosine and phenylalanine, which are missing in a majority of the top 15 active non-Accumulibacter bacterial genomes (Fig. 6A). Only two other members within the Proteobacteria can synthesize these amino acids: RAM1 can synthesize both, and PSEUDO1 only phenylalanine. Interestingly, phenylalanine and tyrosine are the second and third most metabolically expensive amino acids to synthesize, respectively, with tryptophan being the most costly [81]. Additionally, a few highly active non-Accumulibacter bacterial community members lack the biosynthetic machinery for several vitamin cofactors and amino acids, such as FLAVO1 and BAC3 within the Bacteroidetes and the putative PAO Ca. Obscuribacter phosphatis OBS1 (Fig. 6A). In particular, RAM1 within the Proteobacteria is missing the biosynthetic machinery for all vitamin cofactors but can synthesize most amino acids, including the most metabolically expensive ones mentioned above.
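The presence/absence reasoning in the two paragraphs above can be expressed as a small lookup: for a focal genome, list the pathways it lacks and the other genomes that encode them. The sketch below is illustrative only. The genome names and pathway inventories are hypothetical and do not reproduce Fig. 6A, and encoding a pathway does not by itself show that a metabolite is exchanged.

```python
def missing_pathways(focal, presence):
    """Pathways found in at least one genome but absent from the focal genome."""
    all_pathways = set().union(*presence.values())
    return sorted(all_pathways - presence[focal])


def potential_suppliers(focal, presence):
    """Map each pathway the focal genome lacks to the genomes that encode it."""
    return {
        pathway: sorted(
            genome
            for genome, pathways in presence.items()
            if genome != focal and pathway in pathways
        )
        for pathway in missing_pathways(focal, presence)
    }


# Hypothetical pathway inventories, for illustration only.
toy_presence = {
    "genome_A": {"tyrosine", "phenylalanine", "cobalamin"},
    "genome_B": {"riboflavin", "pantothenate"},
    "genome_C": {"riboflavin", "cobalamin"},
}

for focal in toy_presence:
    print(focal, potential_suppliers(focal, toy_presence))
```

Running the lookup for every genome in turn gives both directions of the argument: which members could complement a focal genome, and which members that genome could complement.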

We next analyzed the distribution of trait-attributes of vitamin and amino acid pathways among these genomes to understand how these biosynthetic pathways are expressed similarly or differently in the EBPR SBR ecosystem (Fig. 6B, C). Members of the Proteobacteria containing thiamine and cobalamin biosynthetic pathways all express these traits similarly (Fig. 6B). However, the pantothenate synthesis pathway contains two trait-attributes and is expressed differently among two cohorts. In the first attribute, RUN1, TET1, CAULO1, CAPIA, and PSEUDO1 express the pantothenate pathway similarly. However, OBS1 and TET2 express the pantothenate pathway differently (Fig. 6B). Because tetrahydrofolate can be synthesized through different metabolic routes, we analyzed the differences in trait attribute expression for all routes in genomes that contained sufficient coverage of this trait. Bacteroidetes and Proteobacteria members mostly cluster together among tetrahydrofolate attributes, whereas the TET1 and TET2 genomes are differentiated (Fig. 6B).
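To make the idea of genomes "expressing a pathway similarly" concrete, the sketch below groups genomes whose expression profiles for one trait are strongly correlated over the time series. It is a simplified stand-in and not the statistical procedure implemented in TbasCO: the Pearson correlation, the single-linkage grouping, the threshold `min_r`, and the toy profiles are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def group_into_attributes(profiles, min_r=0.9):
    """Single-linkage grouping: genomes joined by any pair with r >= min_r share a group."""
    parent = {genome: genome for genome in profiles}

    def find(genome):
        # Follow parent links to the group representative, compressing the path.
        while parent[genome] != genome:
            parent[genome] = parent[parent[genome]]
            genome = parent[genome]
        return genome

    for a, b in combinations(profiles, 2):
        if pearson(profiles[a], profiles[b]) >= min_r:
            parent[find(a)] = find(b)

    groups = {}
    for genome in profiles:
        groups.setdefault(find(genome), []).append(genome)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical pathway-level expression over six time points, for illustration only.
toy_profiles = {
    "genome_A": [9, 7, 5, 3, 2, 1],
    "genome_B": [18, 15, 10, 6, 4, 3],
    "genome_C": [1, 2, 4, 6, 8, 9],
    "genome_D": [2, 3, 5, 8, 9, 11],
}

print(group_into_attributes(toy_profiles))
```

Because correlation ignores absolute magnitude, `genome_A` and `genome_B` fall in the same group despite a twofold difference in expression level. A threshold-based grouping like this carries no significance estimate, which is one reason it should not be read as equivalent to a statistically defined attribute.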

Expression of various groups of amino acids shows more differentiated patterns among genomes with these pathways. Several amino acids can also be synthesized through different metabolic routes, and we analyzed all trait attributes for each amino acid for all routes grouped by type (Fig. 6C). For the charged amino acids arginine, histidine, and lysine, Proteobacteria and Bacteroidetes members cluster within their respective phylogenetic groups, with lysine and histidine expressed differently among these groups (Fig. 6C). In contrast, arginine is expressed similarly among all Proteobacteria genomes. Among the polar uncharged amino acids, TET2 is the only genome among the top 15 genomes that contains the pathway to synthesize serine (Fig. 6A). Several groups contain the pathway for threonine synthesis, and expression of the different threonine routes is differentiated among the Proteobacteria, Bacteroidetes, and Tetrasphaera spp., though they mostly cluster phylogenetically (Fig. 6C). Notably, the expression patterns for the cysteine and proline biosynthetic pathways do not cluster phylogenetically; for example, both Tetrasphaera genomes express the proline pathway more similarly to Proteobacteria and Bacteroidetes members (Fig. 6C). The few lineages that can synthesize tyrosine and phenylalanine (CAPIA, CAPIIA, RAM1, PSEUDO1) show different expression patterns. These results show that beyond the presence or absence of key vitamin cofactor and amino acid biosynthetic pathways, EBPR SBR organisms also display coherent and differentiated expression patterns for these traits, the functional consequences of which remain to be further understood.

Conclusions and future perspectives

In this work, we applied a novel trait-based ‘omics pipeline to a semi-complex, engineered bioreactor microbial community to explore ecosystem-level and niche-differentiating traits. Through recovering 66 MAGs from the EBPR SBR community and using a time-series metatranscriptomics experiment, we were able to extend functional predictions such as identifying multiple attributes of high-affinity phosphate transporters beyond hypotheses made from traits alone. We extended this framework to other significant traits that are distributed among community members such as denitrification and amino acid metabolism. Specifically, we demonstrate that traits with similar expression profiles may be clustered into attributes providing a new layer to trait-based approaches.

We believe that identifying expression-based attributes will be a powerful tool to explore microbial traits in natural, engineered, and host-associated microbiomes. Outside of activated sludge systems, trait-based approaches could illuminate how similar secondary metabolite clusters are expressed among different species in a community [82, 83], how auxotrophies for amino acid and vitamin cofactors govern interactions [84], how rhizosphere microorganisms respond to day-night cycles, and which putative traits universally exhibit ecosystem-level or niche-differentiating patterns across ecosystems [19, 23]. Importantly, our trait-based approach can be used to screen for expected expression patterns of a key trait compared to a model organism, and then prioritize specific microbial lineages for downstream experimental verification with techniques such as Raman-FISH [85, 86].

Data availability

All supplementary files including functional annotations and transcriptome count files are available at https://figshare.com/projects/EBPR_Trait-Based_Comparative_Omics/90437. Sixty-four of the 66 genomes have been deposited in NCBI at Bioproject PRJNA714686. The remaining two reassembled Accumulibacter genomes have not been deposited in NCBI to avoid confusion with the original CAPIA and CAPIIA assemblies [27, 28]. These contemporary assemblies are available at the Figshare repository. The three metagenomes and six metatranscriptomes used in this study are available on the JGI/IMG at accession codes 3300026302, 3300026286, 3300009517, and 3300002341-46, respectively. All code for performing metagenomic assembly, binning, and annotation can be found at https://github.com/elizabethmcd/EBPR-MAGs. The TbasCO method has been implemented as a reproducible R package and can be accessed at https://github.com/Jorisvansteenbrugge/TbasCO.

References

Violle C, Navas M-L, Vile D, Kazakou E, Fortunel C, Hummel I, et al. Let the concept of trait be functional! Oikos. 2007;116:882–92.
Lavorel S, Garnier E. Predicting changes in community composition and ecosystem functioning from plant traits: revisiting the Holy Grail. Funct Ecol. 2002;16:545–56.
Hooper DU, Chapin FS, Ewel JJ, Hector A, Inchausti P, Lavorel S, et al. Effects of biodiversity on ecosystem functioning: A consensus of current knowledge. Ecol Monogr. 2005;75:3–35.
Pianka ER. On r- and K-selection. Am Nat. 1970;104:592–7.
Wright IJ, Reich PB, Westoby M, Ackerly DD, Baruch Z, Bongers F, et al. The worldwide leaf economics spectrum. Nature. 2004;428:821–7.
Krause S, Le Roux X, Niklaus PA, Van Bodegom PM, Lennon JT, Bertilsson S, et al. Trait-based approaches for understanding microbial biodiversity and ecosystem functioning. Front Microbiol. 2014;5:251.
Malik AA, Martiny JBH, Brodie EL, et al. Defining trait-based microbial strategies with consequences for soil carbon cycling under climate change. ISME J. 2020;14:1–9. https://doi.org/10.1038/s41396-019-0510-0
Guittar J, Shade A, Litchman E. Trait-based community assembly and succession of the infant gut microbiome. Nat Commun. 2019;10:512.
Wolfe BE, Button JE, Santarelli M, Dutton RJ. Cheese rind communities provide tractable systems for in situ and in vitro studies of microbial diversity. Cell. 2014;158:422–33.
Enke TN, Datta MS, Schwartzman J, Barrere J, Pascual-García A, Cordero OX. Modular assembly of polysaccharide-degrading marine microbial communities. Curr Biol. 2019;29:1528–35.
Herrera Paredes S, Gao T, Law TF, Finkel OM, Mucyn T, Teixeira PJPL, et al. Design of synthetic bacterial communities for predictable plant phenotypes. PLOS Biol. 2018;16:e2003962.
Lindemann SR, Bernstein HC, Song H-S, Fredrickson JK, Fields MW, Shou W, et al. Engineering microbial consortia for controllable outputs. ISME J. 2016;10:2077–84.
Oyserman BO, Medema MH, Raaijmakers JM. Road MAPs to engineer host microbiomes. Curr Opin Microbiol. 2018;43:46–54.
Lawson CE, Harcombe WR, Hatzenpichler R, Lindemann SR, Löffler FE, O’Malley MA, et al. Common principles and best practices for engineering microbiomes. Nat Rev Microbiol. 2019;17:725–41.
Gutierrez CF, Sanabria J, Raaijmakers JM, Oyserman BO. Restoring degraded microbiome function with self-assembled communities. FEMS Microbiol Ecol. 2020;96:fiaa225.
Allison SD. A trait-based approach for modelling microbial litter decomposition. Ecol Lett. 2012;15:1058–70.
Tringe SG, von Mering C, Kobayashi A, Salamov AA, Chen K, Chang HW, et al. Comparative metagenomics of microbial communities. Science. 2005;308:554–7.
Tyson GW, Chapman J, Hugenholtz P, Allen EE, Ram RJ, Richardson PM, et al. Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature. 2004;428:37–43.
Anantharaman K, Brown CT, Hug LA, Sharon I, Castelle CJ, Probst AJ, et al. Thousands of microbial genomes shed light on interconnected biogeochemical processes in an aquifer system. Nat Commun. 2016;7:13219.
Woodcroft BJ, Singleton CM, Boyd JA, Evans PN, Emerson JB, Zayed AAF, et al. Genome-centric view of carbon processing in thawing permafrost. Nature. 2018;560:49–54.
McDaniel EA, Wahl SA, Ishii S, Pinto A, Ziels R, Nielsen PH, et al. Prospects for multi-omics in the microbial ecology of water engineering. Water Res. 2021;205.
Albertsen M, Hugenholtz P, Skarshewski A, Nielsen KL, Tyson GW, Nielsen PH. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nat Biotechnol. 2013;31:533–8.
Wrighton KC, Thomas BC, Sharon I, Miller CS, Castelle CJ, VerBerkmoes NC, et al. Fermentation, hydrogen, and sulfur metabolism in multiple uncultivated bacterial phyla. Science. 2012;337:1661–5.
Hesselmann RPX, Werlen C, Hahn D, van der Meer JR, Zehnder AJB. Enrichment, phylogenetic analysis and detection of a bacterium that performs enhanced biological phosphate removal in activated sludge. Syst Appl Microbiol. 1999;22:454–65.
Seviour RJ, Mino T, Onuki M. The microbiology of biological phosphorus removal in activated sludge systems. FEMS Microbiol Rev. 2003;27:99–127.
Crocetti GR, Hugenholtz P, Bond PL, Schuler A, Keller J, et al. Identification of polyphosphate-accumulating organisms and design of 16S rRNA-directed probes for their detection and quantitation. Appl Environ Microbiol. 2000;66:1175–82.
Martín HG, Ivanova N, Kunin V, Warnecke F, Barry KW, McHardy AC, et al. Metagenomic analysis of two enhanced biological phosphorus removal (EBPR) sludge communities. Nat Biotechnol. 2006;24:1263–9.
Flowers JJ, He S, Malfatti S, del Rio TG, Tringe SG, Hugenholtz P, et al. Comparative genomics of two ‘Candidatus Accumulibacter’ clades performing biological phosphorus removal. ISME J. 2013;7:2301–14.
Oyserman BO, Moya F, Lawson CE, Garcia AL, Vogt M, Heffernen M, et al. Ancestral genome reconstruction identifies the evolutionary basis for trait acquisition in polyphosphate accumulating bacteria. ISME J. 2016;10:2931–45.
Wilmes P, Andersson AF, Lefsrud MG, Wexler M, Shah M, Zhang B, et al. Community proteogenomics highlights microbial strain-variant protein expression within activated sludge performing enhanced biological phosphorus removal. ISME J. 2008;2:853–64.
McDaniel EA, Moya-Flores F, Keene Beach N, Camejo PY, Oyserman BO, Kizaric M, et al. Metabolic Differentiation of Co-occurring Accumulibacter Clades Revealed through Genome-Resolved Metatranscriptomics. mSystems. 2021;6:474–95.
Gao H, Mao Y, Zhao X, Liu WT, Zhang T, Wells G. Genome-centric metagenomics resolves microbial diversity and prevalent truncated denitrification pathways in a denitrifying PAO-enriched bioprocess. Water Res. 2019;155:275–87.
Wang Y, Gao H, Wells GF. Integrated omics analyses reveal differential gene expression and potential for cooperation between denitrifying polyphosphate and glycogen accumulating organisms. Environ Microbiol. 2021;23:3274–93.
Camejo PY, Oyserman BO, McMahon KD, Noguera DR. Integrated omic analyses provide evidence that a “Candidatus Accumulibacter phosphatis” strain performs denitrification under microaerobic conditions. mSystems. 2019;4:e00193–18.
Petriglieri F, Singleton CM, Kondrotaite Z, Dueholm MKD, McDaniel EA, McMahon KD, et al. Reevaluation of the Phylogenetic Diversity and Global Distribution of the Genus ‘Candidatus Accumulibacter’. mSystems. 2022;7:e00016-22.
Kong Y, Nielsen JL, Nielsen PH. Identity and ecophysiology of uncultured actinobacterial polyphosphate-accumulating organisms in full-scale enhanced biological phosphorus removal plants. Appl Environ Microbiol. 2005;71:4076–85.
Kristiansen R, Nguyen HTT, Saunders AM, Nielsen JL, Wimmer R, Le VQ, et al. A metabolic model for members of the genus Tetrasphaera involved in enhanced biological phosphorus removal. ISME J. 2013;7:543–54.
Soo R, Skennerton CT, Sekiguchi Y, Imelfort M, Paech S, Dennis P, et al. An expanded genomic representation of the phylum cyanobacteria. Genome Biology Evolut. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4040986/. Accessed 11 Jul 2020.
Petriglieri F, Singleton C, Peces M, Petersen JF, Nierychlo M, Nielsen PH. “Candidatus Dechloromonas phosphoritropha” and “Ca. D. phosphorivorans”, novel polyphosphate accumulating organisms abundant in wastewater treatment systems. ISME J. 2021;15:3605–14.
Singleton CM, Petriglieri F, Kristensen JM, Kirkegaard RH, Michaelsen TY, Andersen MH, et al. Connecting structure to function with the recovery of over 1000 high-quality metagenome-assembled genomes from activated sludge using long-read sequencing. Nat Commun. 2021;12:2009.
Singleton CM, Petriglieri F, Wasmund K, Nierychlo M, Kondrotaite Z, Petersen JF, et al. The novel genus, ‘Candidatus Phosphoribacter’, previously identified as Tetrasphaera, is the dominant polyphosphate accumulating lineage in EBPR wastewater treatment plants worldwide. ISME J. 2022;1–12.
Oyserman BO, Noguera DR, del Rio TG, Tringe SG, McMahon KD. Metatranscriptomic insights on gene expression and regulatory controls in Candidatus Accumulibacter phosphatis. ISME J. 2016;10:810–22.
Parks DH, Imelfort M, Skennerton CT, Hugenholtz P, Tyson GW. CheckM: assessing the quality of microbial genomes recovered from isolates, single cells, and metagenomes. Genome Res. 2015;25:1043–55.
Kanehisa M, Sato Y, Morishima K. BlastKOALA and GhostKOALA: KEGG tools for functional characterization of genome and metagenome sequences. J Mol Biol. 2016;428:726–31.
Aramaki T, Blanc-Mathieu R, Endo H, Ohkubo K, Kanehisa M, Goto S, et al. KofamKOALA: KEGG ortholog assignment based on profile HMM and adaptive score threshold. Bioinformatics. 2020;36:2251–2.
Bushnell B, Rood J, Singer E. BBMerge – Accurate paired shotgun read merging via overlap. PLoS One. 2017;12:e0185056.
Kopylova E, Noé L, Touzet H. SortMeRNA: fast and accurate filtering of ribosomal RNAs in metatranscriptomic data. Bioinformatics. 2012;28:3211–7.
Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol. 2016;34:525–7.
Soneson C, Love MI, Robinson MD. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Research. 2015;4:1521.
He S, Gall DL, McMahon KD. ‘Candidatus Accumulibacter’ population structure in enhanced biological phosphorus removal sludges as revealed by polyphosphate kinase genes. Appl Environ Microbiol. 2007;73:5865–74.
Camejo PY, Owen BR, Martirano J, Ma J, Kapoor V, Santo Domingo J, et al. Candidatus Accumulibacter phosphatis clades enriched under cyclic anaerobic and microaerobic conditions simultaneously use different electron acceptors. Water Res. 2016;102:125–37.
Zhang H, Sekiguchi Y, Hanada S, Hugenholtz P, Kim H, Kamagata Y, et al. Gemmatimonas aurantiaca gen. nov., sp. nov., a Gram-negative, aerobic, polyphosphate-accumulating micro-organism, the first cultured representative of the new bacterial phylum Gemmatimonadetes phyl. nov. Int J Syst Evol Microbiol. 2003;53:1155–63.
McDaniel EA, Wever R, Oyserman BO, Noguera DR, McMahon KD. Genome-resolved metagenomics of a photosynthetic bioreactor performing biological nutrient removal. Microbiol Resour Announc. 2021;10:e00244-21.
Speirs LBM, Rice DTF, Petrovski S, Seviour RJ. The phylogeny, biodiversity, and ecology of the Chloroflexi in activated sludge. Front Microbiol. 2019;10:2015.
Andersen MH, McIlroy SJ, Nierychlo M, Nielsen PH, Albertsen M. Genomic insights into Candidatus Amarolinea aalborgensis gen. nov., sp. nov., associated with settleability problems in wastewater treatment plants. Syst Appl Microbiol. 2019;42:77–84.
Nierychlo M, Miłobȩdzka A, Petriglieri F, McIlroy B, Nielsen PH, McIlroy SJ. The morphology and metabolic potential of the Chloroflexi in full-scale activated sludge wastewater treatment plants. FEMS Microbiol Ecol. 2019;95:fiy228.
McIlroy SJ, Karst SM, Nierychlo M, Dueholm MS, Albertsen M, Kirkegaard RH, et al. Genomic and in situ investigations of the novel uncultured Chloroflexi associated with 0092 morphotype filamentous bulking in activated sludge. ISME J. 2016;10:2223–34.
Kragelund C, Levantesi C, Borger A, Thelen K, Eikelboom D, Tandoi V, et al. Identity, abundance and ecophysiology of filamentous Chloroflexi species present in activated sludge treatment plants. FEMS Microbiol Ecol. 2007;59:671–82.
Kindaichi T, Yamaoka S, Uehara R, Ozaki N, Ohashi A, Albertsen M, et al. Phylogenetic diversity and ecophysiology of Candidate phylum Saccharibacteria in activated sludge. FEMS Microbiol Ecol. 2016;92:1–11.
Mann E, Wetzels SU, Wagner M, Zebeli Q, Schmitz-Esser S. Metatranscriptome sequencing reveals insights into the gene expression and functional potential of rumen wall bacteria. Front Microbiol. 2018;9:43.
Jiang Y, Xiong X, Danska J, Parkinson J. Metatranscriptomic analysis of diverse microbial communities reveals core metabolic pathways and microbiome-specific functionality. Microbiome. 2016;4:2.
Linz AM, Aylward FO, Bertilsson S, McMahon KD. Time-series metatranscriptomes reveal conserved patterns between phototrophic and heterotrophic microbes in diverse freshwater systems. Limnol Oceanogr. 2019;65:101–12.
Lawson CE, Wu S, Bhattacharjee AS, Hamilton JJ, McMahon KD, Goel R, et al. Metabolic network analysis reveals microbial community interactions in anammox granules. Nat Commun. 2017;8:15416.
Aylward FO, Eppley JM, Smith JM, Chavez FP, Scholin CA, DeLong EF. Microbial community transcriptional networks are conserved in three domains at ocean basin scales. Proc Natl Acad Sci. 2015;112:5443–8.
Hao L, Michaelsen TY, Singleton CM, Dottorini G, Kirkegaard RH, Albertsen M, et al. Novel syntrophic bacteria in full-scale anaerobic digesters revealed by genome-centric metatranscriptomics. ISME J. 2020;14:906–18.
Glass EM, Meyer F. The metagenomics RAST server: a public resource for the automatic phylogenetic and functional analysis of metagenomes. Handb Mol Microb Ecol I Metagenomics Complement Approaches. 2011;8:325–31.
Martinez X, Pozuelo M, Pascal V, Campos D, Gut I, Gut M, et al. MetaTrans: an open-source pipeline for metatranscriptomics. Sci Rep. 2016;6:26447.
Westreich ST, Treiber ML, Mills DA, Korf I, Lemay DG. SAMSA2: a standalone metatranscriptome analysis pipeline. BMC Bioinformatics. 2018;19:175.
Ni Y, Li J, Panagiotou G. COMAN: a web server for comprehensive metatranscriptomics analysis. BMC Genomics. 2016;17:622.
Narayanasamy S, Jarosz Y, Muller EEL, Heintz-Buschart A, Herold M, Kaysen A, et al. IMP: a pipeline for reproducible reference-independent integrated metagenomic and metatranscriptomic analyses. Genome Biol. 2016;17:260.
Eren AM, Esen ÖC, Quince C, Vineis JH, Morrison HG, Sogin ML, et al. Anvi’o: an advanced analysis and visualization platform for ‘omics data. PeerJ. 2015;3:e1319.
Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. KEGG as a reference resource for gene and protein annotation. Nucleic Acids Res. 2016;44:D457–D462.
Wanner BL. Gene regulation by phosphate in enteric bacteria. J Cell Biochem. 1993;51:47–54.
Flowers JJ, He S, Yilmaz S, Noguera DR, McMahon KD. Denitrification capabilities of two biological phosphorus removal sludges dominated by different ‘Candidatus Accumulibacter’ clades. Environ Microbiol Rep. 2009;1:583–8.
Rubio-Rincón FJ, Weissbrodt DG, Lopez-Vazquez CM, Welles L, Abbas B, Albertsen M, et al. “Candidatus Accumulibacter delftensis”: A clade IC novel polyphosphate-accumulating organism without denitrifying activity on nitrate. Water Res. 2019;161:136–51.
Parsons C, Stüeken EE, Rosen CJ, Mateos K, Anderson RE. Radiation of nitrogen-metabolizing enzymes across the tree of life tracks environmental transitions in Earth history. Geobiology. 2021;1:18–34.
Gómez-Consarnau L, Sachdeva R, Gifford SM, Cutter LS, Fuhrman JA, Sañudo-Wilhelmy SA, et al. Mosaic patterns of B-vitamin synthesis and utilization in a natural marine microbial community. Environ Microbiol. 2018;20:2809–23.
Hamilton JJ, Garcia SL, Brown BS, Oyserman BO, Moya-Flores F, Bertilsson S, et al. Metabolic Network Analysis and Metatranscriptomics Reveal Auxotrophies and Nutrient Sources of the Cosmopolitan Freshwater Microbial Lineage acI. mSystems. 2017;2:e00091–17.
McClure RS, Overall CC, Hill EA, Song H-S, Charania M, Bernstein HC, et al. Species-specific transcriptomic network inference of interspecies interactions. ISME J. 2018;1:2011–23.
Croft MT, Lawrence AD, Raux-Deery E, Warren MJ, Smith AG. Algae acquire vitamin B12 through a symbiotic relationship with bacteria. Nature. 2005;438:90–93.
Akashi H, Gojobori T. Metabolic efficiency and amino acid composition in the proteomes of Escherichia coli and Bacillus subtilis. Proc Natl Acad Sci USA. 2002;99:3695–3700.
Lozano GL, Bravo JI, Diago MFG, Park HB, Hurley A, Peterson SB, et al. Introducing THOR, a model microbiome for genetic dissection of community behavior. MBio. 2019;10:e02846–18.
Crits-Christoph A, Diamond S, Butterfield CN, Thomas BC, Banfield JF. Novel soil bacteria possess diverse genes for secondary metabolite biosynthesis. Nature. 2018;558:440–4.
Zengler K, Zaramela LS. The social network of microorganisms—how auxotrophies shape complex communities. Nat Rev Microbiol. 2018;16:383–90.
Fernando EY, McIlroy SJ, Nierychlo M, Herbst FA, Petriglieri F, Schmid MC, et al. Resolving the individual contribution of key microbial populations to enhanced biological phosphorus removal with Raman–FISH. ISME J. 2019;13:1933–46.
Petriglieri F, Petersen JF, Peces M, Nierychlo M, Hansen K, Baastrand CE, et al. Quantification of biologically and chemically bound phosphorus in activated sludge from full-scale plants with biological P-removal. Environ Sci Technol. 2022;56:5132–40.
Chaumeil P-A, Mussig AJ, Hugenholtz P, Parks DH. GTDB-Tk: a toolkit to classify genomes with the Genome Taxonomy Database. Bioinformatics. 2019;36:1925–27.
Seemann T. Prokka: Rapid prokaryotic genome annotation. Bioinformatics. 2014;30:2068–69.


Acknowledgements

We thank Caitlin Singleton for providing early access to high-quality genomes from a full-scale WWTP to compare our MAGs against. Metagenomic and metatranscriptomic sequencing was provided through a Joint Genome Institute Community Science Proposal (Proposal ID 873). This work was supported by funding from the National Science Foundation (MCB-1518130) to K.D.M. and D.R.N. Funding was provided to E.A.M. by a fellowship through the Department of Bacteriology at the University of Wisconsin – Madison. Funding for B.O.O. was in part provided by the Technology Foundation of the Dutch National Science Foundation (NWO-TTW). This research was performed in part using the Wisconsin Energy Institute computing cluster, which is supported by the Great Lakes Bioenergy Research Center as a part of the U.S. Department of Energy Office of Science (DE-SC0018409).

Author information

These authors contributed equally: E. A. McDaniel, J. J. M. van Steenbrugge.

Authors and Affiliations

Department of Bacteriology, University of Wisconsin—Madison, Madison, WI, USA
E. A. McDaniel & K. D. McMahon
Microbiology Doctoral Training Program, University of Wisconsin—Madison, Madison, WI, USA
E. A. McDaniel
Bioinformatics Group, Wageningen University and Research, Wageningen, The Netherlands
J. J. M. van Steenbrugge, M. H. Medema & B. O. Oyserman
Microbial Ecology, Netherlands Institute of Ecological Research, Wageningen, The Netherlands
J. J. M. van Steenbrugge, J. M. Raaijmakers & B. O. Oyserman
Laboratory of Nematology, Wageningen University, Wageningen, The Netherlands
J. J. M. van Steenbrugge
Department of Civil and Environmental Engineering, University of Wisconsin—Madison, Madison, WI, USA
D. R. Noguera & K. D. McMahon
Institute of Biology, Leiden University, Leiden, Netherlands
J. M. Raaijmakers & M. H. Medema

Contributions

EAM and JJMVS contributed equally to this work. EAM performed metagenomic assembly and genome curation, metatranscriptomic and functional analysis, and software testing. JJMVS and BOO designed the TbasCO analysis software, and JJMVS performed benchmark testing. DRN, KDM, JMR, and MHM provided critical feedback on the manuscript. EAM, JJMVS, and BOO performed the analyses and interpretation and wrote the manuscript with input from all coauthors.

Corresponding authors

Correspondence to E. A. McDaniel, J. J. M. van Steenbrugge or B. O. Oyserman.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplemental Material

Supplementary Figure 1

Supplementary Figure 2

Supplementary Figure 3

Supplementary Figure 4

Supplementary Figure 5

Supplementary Figure 6

Supplementary Figure 7

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

McDaniel, E.A., van Steenbrugge, J.J.M., Noguera, D.R. et al. TbasCO: trait-based comparative ‘omics identifies ecosystem-level and niche-differentiating adaptations of an engineered microbiome. ISME COMMUN. 2, 111 (2022). https://doi.org/10.1038/s43705-022-00189-2

Received: 20 May 2022
Revised: 29 September 2022
Accepted: 10 October 2022
Published: 07 November 2022
DOI: https://doi.org/10.1038/s43705-022-00189-2

This article is cited by

Trait biases in microbial reference genomes
- Sage Albright
- Stilianos Louca
Scientific Data (2023)
