High functional diversity among Nitrospira populations that dominate rotating biological contactor microbial communities in a municipal wastewater treatment plant

Nitrification, the oxidation of ammonia to nitrate via nitrite, is an important process in municipal wastewater treatment plants (WWTPs). Members of the Nitrospira genus that contribute to complete ammonia oxidation (comammox) have only recently been discovered and their relevance to engineered water treatment systems is poorly understood. This study investigated distributions of Nitrospira, ammonia-oxidizing archaea (AOA), and ammonia-oxidizing bacteria (AOB) in biofilm samples collected from tertiary rotating biological contactors (RBCs) of a municipal WWTP in Guelph, Ontario, Canada. Using quantitative PCR (qPCR), 16S rRNA gene sequencing, and metagenomics, our results demonstrate that Nitrospira species strongly dominate RBC biofilm samples and that comammox Nitrospira outnumber all other nitrifiers. Genome bins recovered from assembled metagenomes reveal multiple populations of comammox Nitrospira with distinct spatial and temporal distributions, including several taxa that are distinct from previously characterized Nitrospira members. Diverse functional profiles imply a high level of niche heterogeneity among comammox Nitrospira, in contrast to the sole detected AOA representative that was previously cultivated and characterized from the same RBC biofilm. Our metagenome bins also reveal two cyanase-encoding populations of comammox Nitrospira, suggesting an ability to degrade cyanate, which has only been shown previously for several Nitrospira representatives that are strict nitrite oxidizers. This study demonstrates the importance of RBCs as model systems for continued investigation of environmental factors that control the distributions and activities of AOB, AOA, comammox Nitrospira, and other nitrite oxidizers.


Introduction
Municipal wastewater contains ammonium that is removed by wastewater treatment plants (WWTPs) to prevent eutrophication, oxygen depletion, and toxicity to aquatic animals in receiving waters. Nitrification involves the sequential oxidation of ammonia to nitrate, via nitrite, and these two enzymatic steps were historically thought to be mediated by distinct microorganisms, with aerobic ammonia oxidation conducted by ammonia-oxidizing bacteria (AOB) or ammonia-oxidizing archaea (AOA), and nitrite oxidation catalyzed by nitrite-oxidizing bacteria (NOB) [1][2][3]. Microorganisms capable of oxidizing both ammonia and nitrite via complete ammonia oxidation (comammox) were predicted over a decade ago to be slow growing and to inhabit biofilms exposed to relatively low ammonium concentrations [4]. These predictions were confirmed by the discovery of Nitrospira members capable of catalyzing comammox [5,6], and all known comammox Nitrospira belong to lineage II of the genus Nitrospira. Although two major clades of comammox Nitrospira have been described (i.e., clades A and B), based on ammonia monooxygenase (amoA) gene phylogeny, all enriched and cultivated species of comammox Nitrospira belong to clade A [5,6].
The current study examined full scale tertiary rotating biological contactors (RBCs; Fig. 1a) treating post-aeration basin municipal waste from~132,000 residents of Guelph, Ontario, Canada. Comprising a total biofilm surface area of 440,000 m 2 and processing~53,000 m 3 of wastewater daily, the RBCs are organized into four "trains" that are each composed of eight individual RBC "stages" (Fig. 1b). The RBC-associated AOA and AOB communities were characterized previously, demonstrating that the sole detectable AOA population increased in abundance along the RBC flowpath [29]. This enriched species is named "Candidatus Nitrosocosmicus hydrocola" (formerly "Candidatus Nitrosocosmicus exaquare") and affiliates with the class Nitrososphaeria (formerly phylum Thaumarchaeota) [30]. Inconsistent in situ activity data obtained using differential inhibitors for AOB and an abundance of labeled Nitrospira cells, as viewed by microautoradiography and fluorescence in situ hybridization, raised the possibility that some of the RBC-associated Nitrospira may contribute to comammox [30]. Due to the predicted low ammonium niche and biofilm growth of comammox bacteria [4,5], we hypothesized that comammox Nitrospira would dominate the RBC biofilm and that the relative abundance of comammox Nitrospira would increase as ammonium concentrations decrease along the RBC flowpath, as demonstrated previously for "Ca. N. hydrocola" [29]. To test these hypotheses, we assessed the relative abundance, diversity, and temporal stability of comammox Nitrospira, in relation to AOA and other AOB, throughout the tertiary treatment system, using a combination of quantitative PCR (qPCR), 16S rRNA gene sequencing, and metagenomics.

Sampling
Using ethanol-cleaned spatulas, biofilm was sampled from the RBC trains in Guelph, Ontario, Canada (Fig. 1a, b). Samples were stored on dry ice until delivered to the lab and were stored at −70°C. All RBCs were sampled in October 2016, except for RBC 1 of the southeast (SE) train and RBC 2 of the southwest (SW) train, which were not operational at the time of sampling (Table S1). We included biofilm samples that were collected previously for temporal comparison [29]. These reference samples were collected in February, June, and September 2010 from RBCs 1 and 8 of the northeast (NE) train (Table S1)

Water chemistry
Ammonium was measured fluorometrically using orthophthaldialdehyde reagent [31] as described previously [32] and nitrite/nitrate was assessed colorimetrically with the Griess reagent [33]. Measurements represent total ammonium as nitrogen, and nitrite and nitrate as nitrogen; details are described in the supplemental methods. Based on data obtained from WWTP operators, monthly and yearly ammonium concentrations (measured as total ammonia as nitrogen) of secondary effluent (i.e., RBC influent) and water from the SE RBC train were compiled for the years 2010-2017. Plant influent and effluent data were retrieved from Guelph WWTP annual reports for 2010-2017 (https:// guelph.ca/living/environment/water/wastewater/).

DNA extractions
Extractions were performed from 0.25 g (wet weight) of biofilm with the PowerSoil DNA Isolation Kit (Mo Bio, Carlsbad, CA, USA) as described previously [30]. Total isolated DNA was visualized on a 1% agarose gel and quantified using the NanoDrop 2000 (Thermo Scientific, Waltham, MA, USA) and the Qubit dsDNA HS Assay kit (Thermo Scientific).
Extracted DNA from RBC 1 and 8 biofilm samples of the NE train from 2010 was used for quantitative PCR (qPCR) and metagenome sequencing (Table S1). The 2010 biofilm samples were those collected and analyzed by qPCR previously [29], but DNA extractions were repeated while processing the 2016 samples. The DNA extracts from all RBCs within the four trains sampled in 2016 were analyzed with qPCR, whereas DNA from RBCs 1 and 8 alone were used for 16S rRNA gene and metagenome sequencing (Table S1).

Quantitative PCR
The 16S rRNA genes of Nitrososphaeria members and bacteria were quantified using the primers 771F/957R [34] and 341F/518R [35], respectively (Table S2). Quantification of AOB amoA genes was carried out using the primers amoA1F/amoA2R [36] (Table S2). Comammox Nitrospira clade A and clade B amoA genes were amplified using equimolar primer mixes of comaA-244f (a-f) and comaA-659r (a-f), and comaB-244f (a-f) and comaB-659r (a-f), respectively [7] (Table S2). Primers for each of comammox Nitrospira clades A and B amoA genes were initially tested with end-point PCR to check for a single dominant band in all amplifications, but subsequent qPCR was performed with clade A primers only because clade B primers produced no amplicons (data not shown). All qPCR amplifications were carried out as technical duplicates on a CFX96 Real-Time PCR Detection System (Bio-Rad, Hercules, CA, USA). Additional thermal cycling conditions and standards are described in the supplemental methods.

16S rRNA gene amplicon sequencing
The V4-V5 regions of 16S rRNA genes were amplified using primers 515F [37] and 926R [38] with Illumina adapters. The sequence data were produced by the US Department of Energy Joint Genome Institute (JGI) using standard operating procedures. Triplicates were combined, quantified, and sample amplicons were then pooled equally. The pooled library was sequenced on a MiSeq (Illumina) with 2 × 300 base reads. Sequence analysis was performed with QIIME2 version 2019.1.0 [39]. Sequences were trimmed to remove primer and adapter sequences with cutadapt [40]. Quality trimming, denoising, errorcorrection, paired-end read merging, chimera removal, and dereplication was performed with DADA2 [41], producing an amplicon sequence variant (ASV) table with 80,180-125,219 assembled sequences per sample (Supplemental file 1). The ASVs were taxonomically classified according to the SILVA database release 132 [42] using the scikit-learn classifier [43].

Metagenome sequencing and analysis
Shearing of DNA, library preparation, and sequencing were performed at The Centre for Applied Genomics (TCAG) in Toronto, Ontario. Extracted DNA was quantified using the Qubit dsDNA HS Assay kit (Thermo Scientific, Waltham, MA, USA) and 500 ng of input DNA was sheared intõ 550 bp fragments using a LE220 Focused-ultrasonicator (Covaris, Woburn, MA, USA). Library preparation was performed using the TruSeq PCR-free Library Prep Kit (Illumina). Paired-end sequencing (2 × 250 bases) was performed on a HiSeq 2500 (Illumina) using the HiSeq Rapid SBS Kit v2 (500 cycle; Illumina), resulting in a total of~250 million paired-end reads with an average of 18.1 million paired-end reads per sample (Table S3).
Quality trimming and removal of adapter sequences were performed using AdapterRemoval version 2.2.2 [44], and the quality of the reads was checked with FastQC version 0.11.5 [45]. Reports were combined using MultiQC version 1.0 [46]. Open reading frames were predicted on the unmerged and unassembled trimmed forward reads using FragGeneScan-Plus [47]. Profile hidden Markov models (HMMs) for taxonomic marker (i.e., rpoB) and functional genes (i.e., amoA_AOA, amoA_AOB, and nxrB), downloaded from FunGene [48], were used to quantify the relative abundances and taxonomic affiliations of nitrifiers from the unassembled reads using MetAnnotate [49]. The HMM e-value threshold used by MetAnnotate for gene detection was 10 -3 , and the e-value threshold used for the USEARCH-based taxonomic classification step was 10 −6 . Within MetAnnotate, the database used for taxonomic classification was RefSeq release 80 (March 2017). MetAnnotate results were analyzed using the custom R script metannotate_barplots.R version 0.9 (available at https://github.com/jmtsuji/metannotate-analysis). To allow for approximate between-sample and between-HMM comparisons, hits for each HMM were normalized both to HMM length and total number of length-normalized HMM hits for rpoB (for more details, see the Github README).

Assembly and binning of metagenome sequence reads
Sequence reads were processed with the ATLAS pipeline (version 2.0.6), which includes quality control, assembly, annotation, binning, bin dereplication, and read mapping [50,51]; for further details, see supplemental methods and the run configuration file (Supplementary file 2). Taxonomy was assigned to bins according to the Genome Taxonomy Database [52] release 86, version 3, using the Genome Tree Database Toolkit version 0.2.2 [53]. Relative abundances of genome bins in metagenome sequence data were approximated by calculating the number of mapped reads to genome bins, normalized to the total number of assembled reads per metagenome. Each read required a minimum alignment score ratio of 0.908 (i.e.,~95% sequence identity to the aligned region) to be considered and was mapped to as many as the ten best aligned sites across the full genome set. Principal coordinate analysis ordination of samples used Bray-Curtis dissimilarities of relative abundance data using a custom R script.

Analysis of genome bins
Average nucleotide identity (ANI) was calculated for the dereplicated AOA bin and Nitrospira bins compared to reference genomes using FastANI version 1.1 [54]. The Nitrospira bins that were at least 80% complete were further searched for functional genes using reciprocal BLASTP [55] using the BackBLAST pipeline [56], version 2.0.0-alpha2 (https://doi.org/10.5281/zenodo.3465955). A concatenated core protein phylogeny, using a set of 74 core bacterial proteins, was generated to assess the phylogenetic placement of Nitrospira genome bins (see supplemental methods). The phylogenetic tree was constructed using IQ-TREE version 1.6.9 [57]. All predicted protein sequences from assembled contigs were compiled and clustered into 99% identity threshold groups within ATLAS; an HMM was used to identify AmoA sequences. Nitrospira AmoA were aligned along with reference sequences using MUS-CLE [58] and a phylogenetic tree was built using MEGA7 [58]. Cyanase protein sequences (CynS) identified via BackBLAST were further analyzed phylogenetically as described further in the supplemental methods.

Results and discussion
Prior to sequence-based analysis of RBC nitrifying biofilm sample communities, ammonia oxidizer cohort sizes were assessed using quantitative PCR (qPCR; Fig. 2). The results showed distinct nitrifying community compositions, both along the treatment train (spatially; RBC 1 versus RBC 8) and between sampling time points (temporally; 2010 and 2016). Within each RBC train, the measured relative abundance of comammox Nitrospira genes, as well as the proportion of comammox Nitrospira within the total community, were consistently higher in RBC 1 than RBC 8 ( Fig. 2, Table S5), as were ammonium concentrations in water samples collected from below the sampled RBCs (Fig. 1c, Table S4). Comammox Nitrospira amoA genes were more abundant than those of other ammonia oxidizers for most samples and time points (Fig. 2, Fig. S1, Table S5). In addition to comammox Nitrospira, the qPCR data also indicated a higher relative abundance of Nitrososphaeriaassociated 16S rRNA genes in RBC 8 than RBC 1 in the 2010 samples. In contrast, in 2016, Nitrososphaeria-associated 16S rRNA gene abundances were higher in RBC 1 than RBC 8 for all four trains and were roughly equivalent in relative abundance to comammox Nitrospira (Fig. 2). Although the abundances of AOB amoA genes were higher in RBC 1 than RBC 8 in both sampling years, the abundances of these genes were one to two orders of magnitude lower in 2016 than in 2010. Higher resolution sampling of all 2016 RBC stages showed decreasing amoA gene abundances across all four RBC flowpaths (Fig. S1).
Although 16S rRNA genes cannot be used to differentiate between comammox Nitrospira and strict nitrite oxidizers [7], the overall relative abundances of Nitrospira amplicon sequence variants (ASVs; Fig. 3) were consistent with observations from qPCR data (Fig. 2), showing higher relative abundances of Nitrospira in RBC 1 compared with RBC 8 for each sample pair. Eight ASVs present at ≥1% relative abundance were classified as Nitrospira (Fig. 3, Supplemental file 1) and, taken together, these Nitrospira ASVs dominated all other microbial ASVs detected in the RBC biofilm samples. In addition to Nitrospira, five ASVs present at ≥1% relative abundance were classified into the Nitrosomonadaceae family (Fig. 3), and four that corresponded to the genus Nitrosomonas were affiliated only with 2010 samples. Although a fifth Nitrosomonadaceae ASV was detected in 2016 samples (ASV ID 4), subsequent metagenome analysis suggested that this was not an ammonia oxidizer due to lack of a detectable amo gene system in the corresponding genome bin and phylogenetic distance from canonical AOB (data not shown). Thus, qPCR data were consistent with ASV data, showing fewer AOB for 2016 samples overall. We detected a single Nitrososphaeria-associated ASV that was identical to the known AOA population within the RBCs (Nitrosocosmicus genus; [30]), and this ASV was present in all samples (Supplemental file 1) but at highest relative abundance (i.e., >1%) for 2016 samples (Fig. 3), as expected based on qPCR data ( Fig. 2). Single copy taxonomic marker gene analysis of unassembled metagenome sequence data was consistent with qPCR and 16S rRNA gene sequence data, revealing that Nitrospira sequences dominated all samples and comprised between 8 and 32% of the total rpoB gene sequences among RBC samples from the two sampling years (Fig. 4). The Nitrospira-affiliated rpoB gene relative abundances were higher in RBC 1 than RBC 8 for each train sampled. In addition, the normalized relative abundances of comammox Nitrospira amoA genes correlated well with the relative abundances calculated from qPCR data for the proportion of their amoA genes within the RBC biofilm communities (Spearman's rank correlation, r s = 0.79, p < 0.001).
Normalized relative abundances of functional genes detected in unassembled read data allowed us to further estimate the relative community contributions of nitrifying community members. Assuming one amoA gene copy per genome, comammox Nitrospira represented between~5 -30% of the total microbial community for all samples based on normalized amoA HMM hits (Fig. 5b). Nitrospiraassociated rpoB and normalized amoA gene abundances were similar (Figs. 4, 5b), suggesting that the majority of Nitrospira present in the RBCs were comammox bacteria. The HMM search for nxrB/narH revealed that sequences affiliated with this model were also prevalent in RBC biofilm metagenomes, with combined hits at over 50% abundance normalized to total rpoB hits (Fig. 5c). Based on phylogenetic placement of assembled genes in our metagenome sequence data that were detected using the nxrB/ narH HMM, the majority of non-Nitrospira HMM hits likely represent denitrifying organisms encoding narH (data not shown), whereas Nitrospira HMM hits likely represent nxrB-encoding nitrifiers. The closest BLASTP search result for some non-Nitrospira affiliated nxrB/narH HMM hits was to a related protein (e.g., dimethyl sulfoxide reductase; data not shown), meaning that the normalized relative abundance we detected may be an overestimate of the true community contribution of nxrB/narH. Nevertheless, the high relative abundance of Nitrospira-affiliated nxrB gene sequences, which were more abundant in RBC 1 than RBC 8 metagenome sequences across all samples (Fig. 5c), is further evidence for the dominance of comammox or NOB Nitrospira in the RBCs. Normalized relative abundances of Nitrospira-affiliated nxrB sequences that exceed the corresponding relative abundances of Nitrospira-affiliated rpoB sequences may be due to multiple copies of nxrB common to Nitrospira genomes [59], in contrast to single copies common for rpoB genes [60], or may be due to known biases that occur during HMM hit count normalization.  Consistent with qPCR data, Nitrosomonas-affiliated rpoB gene sequences were detected at ≥1% in the 2010 samples only. The rpoB gene of AOA was not detected at ≥1% relative abundance in any of the samples but ranged from 0.01-0.3% across all samples (data not shown). The HMM search for the amoA gene of AOA similarly revealed low relative abundance in the metagenome dataset (<2% relative to rpoB for all samples; Fig. 5a). For 2010 samples, amoA gene sequences affiliated with the Nitrososphaeria were more abundant in RBC 8 than RBC 1 samples, whereas in 2016 they were more abundant in RBC 1 than RBC 8 samples, except in the NE 2016 sample. The HMM hits for AOA amoA genes in the unassembled metagenome sequences were low, with only 64 hits to this HMM for all samples combined, but the AOA amoA normalized relative abundance pattern was nevertheless correlated with the corresponding proportion of AOA 16S rRNA genes within the total community determined via qPCR (Spearman's rank correlation, r s = 0.67, p = 0.009). The amoA HMM for AOB detected sequences of both comammox Nitrospira and AOB. Overall, the relative abundances of AOB amoA genes were higher in RBC 1 than RBC 8 of the corresponding sample pairs, in both sampling years (Fig. 5b). In 2010, Nitrosomonas spp. and Nitrospira spp. were at roughly equal relative abundances, whereas Nitrospira spp. were the dominant ammonia oxidizers detected in 2016 metagenome sequences. The amoA genes of the AOB Nitrosospira were also detected, but always at <1% relative abundance. Overall, the relative abundance patterns for AOB amoA genes correlated well with their corresponding relative abundances determined by qPCR for the proportion of their amoA genes within the total community (Spearman's rank correlation, r s = 0.97, p < 0.001).  Metagenome-assembled genomes (MAGs) obtained from RBC metagenome sequence data were also examined for the temporal and spatial distributions of Nitrospira populations. Summed together, Nitrospira MAGs (see taxonomic classification description below) generally had a higher relative abundance in RBC 1 than RBC 8 for each sample pair with the exception of NE1/NE8 (June 2010) and NW1/NW8 (October 2016; Fig. 6). This pattern was also the case for the bins classified as clade A comammox Nitrospira, which comprised a majority of recruited reads to Nitrospira MAGs across most samples. At the same time, individual Nitrospira MAGs displayed distinct temporal and spatial distributions. Several of the comammox MAGs had different abundance patterns in 2010 than in 2016. Specifically, RBC035, RBC100, and RBC069 were present at higher relative abundances in 2010 samples than in 2016 samples, and relative abundances of RBC001 and RBC083 were generally higher in 2016 than 2010 samples (Fig. 6, Supplementary file 3). Multiple comammox Nitrospira populations were found in the same RBC, and such high diversity of comammox Nitrospira contrasts with the observed low AOA diversity (i.e., only one Nitrosocosmicus species; see below) for all RBCs sampled. Although 0.5-3.1% (median 1.1%) of assembled reads associated with Nitrospira MAGs were mapped to more than one MAG (data not shown), this level of crossover read mapping is insufficient to explain the high observed diversity of comammox Nitrospira for most RBCs, meaning that the high observed diversity is likely not an artifact of read mapping. Similar to an ordination prepared from 16S rRNA gene sequence data (Fig. S2A), the overall microbial community profiles generated by using all of the dereplicated bins of the samples grouped by year and RBC number (Fig. S2B). This demonstrates that distinct microbial communities were present in each of the two sampling years and along the RBC flowpaths.
The dominance of comammox Nitrospira in biofilm samples collected from RBCs treating municipal wastewater of Guelph, Ontario, Canada is in contrast to several previous studies of full-scale activated sludge WWTPs that report comammox Nitrospira at lower relative abundances than AOA and other AOB [7,[18][19][20][21], albeit for systems with ammonium concentrations and operational parameters distinct from those in the RBCs. This current study, along with other recent studies [12,15,17,18,22], shows that comammox Nitrospira can dominate municipal wastewater treatment systems. Although previous analyses of the Guelph WWTP did not test for these newly discovered ammonia oxidizers [29,30], the presence of comammox Nitrospira was predicted previously from activity data and an abundance of Nitrospira cells that actively assimilated labeled bicarbonate when incubated with ammonium [30]. This discovery of abundant and diverse comammox Nitrospira populations in the Guelph RBCs, which together provide post-aeration basin tertiary treatment, suggests a correspondingly important role for comammox within the  440,000 m 2 of biofilm associated with this full-scale wastewater treatment system. High relative abundances of comammox Nitrospira have been reported for other engineered environments with relatively low ammonium levels, such as drinking water treatment systems. For example, comammox Nitrospira were found in a drinking water treatment plant [8] and distribution systems [10], and they were more abundant than other ammonia oxidizers in a groundwater well studied previously [7]. Associated with drinking water treatment plants, comammox Nitrospira were the most abundant nitrifiers in groundwater-fed rapid sand filters [13]. In those filters, clade B comammox Nitrospira dominated, in contrast to clade A members that we detected in the sampled RBCs. A study examining recirculating aquaculture system biofilters found that comammox Nitrospira were more abundant than AOA and AOB [61]. Along with these other studies, the dominance of comammox Nitrospira in the RBCs suggests that they compete well in engineered environments with relatively low ammonium concentrations. At the Guelph WWTP, the RBCs are located downstream of activated sludge aeration basins. Ammonium concentrations entering this tertiary treatment system are approximately two orders of magnitude lower than within aeration basins. For example, on the day of sampling in 2016, the ammonium concentration in aeration basin influents averaged 2.5 mM NH 3 -N, which is typical of aerobic secondary aeration basin conditions reported elsewhere [e.g., [20][21][22]62]. A large surface area for attached growth may be an important factor explaining the dominance of comammox bacteria, given their high prevalence in wastewater treatment systems with attached growth components [18]. The predicted low ammonium niche and biofilm-specific growth of comammox Nitrospira [4,5] may help explain the dominance of these nitrifiers in the tertiary treatment system RBCs in Guelph, Ontario.
Although relatively low ammonium concentrations likely contribute to the overall success of comammox Nitrospira within RBC biofilm, a gradient of decreasing ammonium along the RBC flowpaths could also account for relative abundance changes of nitrifying microbial communities, both temporally and spatially. We predicted initially that comammox Nitrospira abundances in RBC 8 samples would exceed those in RBC 1 samples due to the predicted low ammonium niche of comammox Nitrospira [4], but the opposite trend was observed for both sample years (Figs. 2, 5, 6). However, ammonium concentrations detected beneath RBC 1 were already relatively low (e.g., <18 µM for all 2016 samples; Fig. 1c), and it may be that ammonium concentrations in RBC 8 influent (e.g., <4 µM for all 2016 samples; Fig. 1c) were below the optimum for comammox Nitrospira. Consistent with an "oligotrophic lifestyle", Kits et al. [63] reported a high apparent affinity of "Ca. Nitrospira inopinata" for ammonium (K m(app) of 0.65-1.1 µM total ammonium), and V max reached by ammonium concentrations of~5 µM, which would be similar to ammonium concentrations observed for water samples collected from RBCs sampled in this study, especially for RBC 1 stages. The affinities for ammonium of the comammox Nitrospira detected in the RBCs are unknown but may be similar to those reported for "Ca. N. inopinata".
In addition to targeting nitrification-associated genes to identify comammox Nitrospira in extracted DNA or unassembled metagenome sequences, we explored Nitrospira diversity within assembled and binned metagenome sequence data. From a final dereplicated dataset with 101 genome bins (Supplementary file 3), 12 MAGs were classified within the Nitrospirota phylum, and had high completeness (≥80%) and low contamination (<5% ; Table S6). Three other dereplicated MAGs were classified within the Nitrospirota phylum, but they were only 75-80% complete, and so were not included in our analysis. All but one of the recovered Nitrospirota MAGs were classified under the Genome Taxonomy Database taxonomy within the Nitrospira or Nitrospira A genera of the Nitrospiraceae family; the final MAG (Nitrospira bin RBC003) was classified within the uncultured UBA8639 lineage (genus and family name) of the Nitrospirales order. Using a concatenated set of 74 core bacterial proteins, these MAGs were further classified according to traditional Nitrospira phylogeny (Fig. 7, S3). A majority (8 of 12) of the Nitrospirota MAGs clustered within clade A comammox Nitrospira (sublineage II), three clustered within sublineage I (strict NOB), and one (RBC003) clustered within sublineage IV (strict NOB) of the Nitrospira. Thus, the set of 12 Nitrospirota MAGs will be referred to as Nitrospira MAGs for conciseness. This high diversity of Nitrospira MAGs is consistent with multiple Nitrospira ASVs detected with 16S rRNA gene amplicon sequencing (Fig. 3). Using ANI analysis, the 12 Nitrospira MAGs were compared with one another and to Nitrospira genomes from enrichment cultures and metagenomic surveys (Fig. S3). Only one MAG, RBC035, had ≥95% ANI to previously recovered comammox genomes (Nitrospira bin UW_LDO_01 [27] and "Ca. Nitrospira nitrosa" [6]); all other comammox Nitrospira MAGs had <95% ANI to other reference genomes. This suggests that these MAGs represent distinct comammox Nitrospira species from most other previously described comammox species.
Based on 99% identity clustering of amino acid sequences, all assembled amoA genes of Nitrospira spp. grouped phylogenetically within comammox Nitrospira clade A (Fig. S4), aside from RBC group F (i.e., a truncated sequence which was dropped from phylogenetic analysis). This grouping within comammox Nitrospira clade A is consistent with other studies reporting comammox Nitrospira in WWTPs [7,12,[14][15][16][17][18]. Clade B comammox Nitrospira were not detected and the factors affecting the presence and absence of clades A and B remain unknown. Relatedness of Nitrospira bins based on amoA phylogeny (Fig. S4) was similar to that obtained with the concatenated protein set, but not identical (Fig. 7, S3). For example, RBC069 (RBC group I amoA) clustered together with "Candidatus Nitrospira nitrosa" and Nitrospira sp. UW-LDO-01 in both phylogenies, but the phylogenetic relationship between RBC001 (RBC group C amoA) and Nitrospira sp. SG-bin2 differed between these two phylogenies. Recently, Xia et al. [14] proposed that clade A could be further subdivided into clade A.1, which includes the cultivated comammox Nitrospira, and clade A.2, which contains only uncultivated comammox amoA sequences, including those from drinking water systems (OQW38018 and OQW37964) [10]. The Nitrospira amoA gene sequences detected in the RBCs were classified into both of these clade A subdivisions (Fig. S4). Translated sequences of clade A.2 amoA genes from the RBCs were most closely related to amoA sequences obtained from metagenome sequences of drinking water systems [10], or to a clone library sequence from a rice paddy (Genbank accession AKD44274). Several comammox Nitrospira related to the rice paddy clone have been detected in other full-scale WWTPs [22].
Using a reciprocal BLASTP analysis, the Nitrospira MAGs were further evaluated for the presence of amo, hao, and nxr genes to assess potential contributions to nitrification [5,6]. Nitrospira bins RBC069, RBC001, and RBC044 were missing nxrB genes but contained the other genes for ammonia and nitrite oxidation (including nxrA) and had a Rh50 type ammonium transporter (Fig. 7). Both the nxrA and nxrB subunit genes were missing from Nitrospira bins RBC083, RBC042, and RBC093, but these bins encoded  Fig. 7 Phylogeny and gene pathways in Nitrospira bins and reference genomes. The bins recovered from this study are indicated in bold, an asterisk indicates that the genome is from an enrichment or pure culture. Phylogenomic analysis was performed using a concatenated set of 74 core bacterial proteins. All bootstrap values are 100% except those shown, and the scale bar represents the proportion of amino acid change. Gene annotation was determined using reciprocal BLASTP against four reference Nitrospira genomes ("Ca. N. inopinata", Nitrospira moscoviensis, Nitrospira lenta, Nitrospira japonica), and amino acid identity to the reference gene is indicated in two or more amo genes (Fig. 7). The missing nxr genes likely are a result of incomplete genome binning, although we cannot exclude the possibility that the latter three MAGs missing nxrAB might represent yet-undescribed strict ammonia oxidizing population of Nitrospira. In total, there were four clade A comammox MAGs that contained the amoA gene, which is used as a marker gene for comammox Nitrospira. The other four MAGs that clustered phylogenetically within clade A comammox Nitrospira contained other genes for ammonia oxidation, supporting their phylogenetic placement on the tree. Remaining Nitrospira MAGs contained genes only expected in NOB. Overall, three Nitrospira MAGs contained a near-complete gene pathway required for comammox (i.e., missing nxrB), four MAGs contained only genes expected for NOB, and five MAGs could potentially represent comammox Nitrospira but lacked the complete gene pathway. Cells of Nitrospira within RBC biofilm samples were previously demonstrated to assimilate labeled bicarbonate in the presence of ammonium [30], indicating autotrophic activity of these bacteria and supporting the possibility of ammonia oxidation as demonstrated by these genomic data.
High functional diversity was observed in the RBC Nitrospira MAGs. In the current study, the presence of a complete set of ure genes (Fig. 7) in several Nitrospira MAGs indicates that comammox Nitrospira of the RBCs could hydrolyze urea to ammonia, which they could then use for ammonia oxidation. The presence of these genes is consistent with clade A comammox Nitrospira genomes and enrichment cultures that use urea [5,6,69,70]. Because the comammox Nitrospira MAGs did not contain any fdh genes for formate dehydrogenase (Fig. 7), the RBC populations represented by these MAGs likely cannot use formate as an alternative electron donor, which can be used by NOB Nitrospira and most clade B comammox bacteria [70][71][72][73]. Most clade A comammox bacteria described so far lack the genes for formate dehydrogenase [70], with the exception of Nitrospira bin RCA [74]. Also consistent with previous findings [18,70], several of our comammox Nitrospira MAGs contained partial pathways for group 3b [Ni-Fe] sulfur-reducing hydrogenase genes (hyb and hyd genes; Fig. 7), which indicates that dihydrogen/protons may have alternative electron donor/acceptor roles for some of these comammox Nitrospira. Although several of these MAGs also encoded genes for a putative group 4 hydrogenase (hyf), the amino acid sequences of the putative large subunit hydrogenase genes (hyfG) did not contain the two CxxC motifs needed for ligating the NiFe center, implying a potential alternative function of these gene products other than hydrogen oxidation [75], as has been suggested previously for other bacteria [76]. Multiple clade A comammox Nitrospira MAGs contained genes for accessory proteins for hydrogenases (hyp) but none of the Nitrospira MAGs contained group 2a [Ni-Fe] hydrogenase genes   Fig. 8 Phylogeny and relative abundances of cyanases (CynS) detected in RBC metagenome sequencing data. a Maximum likelihood phylogeny of CynS primary sequences detected in Nitrospira genome bins from this study compared with reference sequences from NCBI. Sublineages of the Nitrospira clade are highlighted by the colored vertical bar, including potential clade A comammox bacteria, which are highlighted in orange. Genome bins recovered from the RBCs are bolded, and an asterisk appears after the names of genomes of cultivated organisms. Bootstrap values over 50% are shown, and the scale bar represents the proportion of amino acid change. To the right of the phylogeny, a portion of the CynS primary sequence alignment is shown along with overall sequence similarity based on the BLO-SUM62 metric. Residues with 100% conservation are marked with an asterisk. Blue asterisks are used to mark known catalytic residues of CynS, namely Arg96, Glu99, and Ser122. b Relative abundances of all RBC genome bins containing cynS among RBC metagenomes. Genome bins are collapsed at the genus level into stacked bars, and relative abundances are based on the proportion (%) of mapped assembled reads to genome data.
Surprisingly, two clade A comammox Nitrospira bins (RBC069 and RBC093) that contained genes for ammonia oxidation also encoded the cyanase gene (cynS; Figs. 7, 8). Prior to this observation, only canonical NOB Nitrospira were known to possess this gene [70,73,74,77,78]. The cynS genes were on long contigs (i.e., 182 kb and 121 kb, RBC069 and RBC093, respectively), and the primary sequences of most genes adjacent to the cynS genes, when queried against RefSeq using BLASTP, were most similar to genes belonging to known comammox Nitrospira. This implies that the presence of cynS in RBC comammox MAGs was not caused by an error in genome binning. Both cyanase genes were distinct from those of other strict NOB Nitrospira, sharing <75% identity (BLASTN) to Nitrospira homologues in the nonredundant nucleotide database. Nevertheless, the predicted primary sequences of the cynS genes contained the key active site residues for cyanases (Arg96, Glu99, and Ser122) that were proposed previously [79], further supporting the functional role of the corresponding gene products (Fig. 8a).
Comammox Nitrospira cyanases were basal to all other known cyanases among canonical NOB Nitrospira (Fig. 8a). This basal placement does not match housekeeping gene phylogenies of Nitrospira (e.g., Fig. 7), indicating a potential role for lateral gene transfer as a mechanism for cyanase acquisition. Indeed, we detected two toxin-antitoxin genes in the vicinity of the cyanase gene for RBC093, supporting the potential for a recent lateral gene transfer event [27]. Bins containing cynS together represent a substantial proportion of the RBC microbial community (i.e.,~10-25%; Fig. 8b). If these genes are expressed and produce functional gene products, our results indicate that Nitrospira, including novel cynScontaining comammox Nitrospira, dominate cyanate degradation to ammonia within RBC biofilm for a direct supply of ammonia for comammox Nitrospira or for supplying ammonia to other nitrifiers through reciprocal feeding [77].
The comammox Nitrospira in the RBCs lack traditional cyanate transporters (cynABD), though one cyanate transporter gene (cynD) seemed to have moderate homology to another non-orthologous ABC transport gene, as indicated by the low amino acid identity hits to cynD among certain Nitrospira (Fig. 7). These hits may represent another ABC transport protein, a nitrate/sulfonate/bicarbonate ABC transporter (accession number WP_121988621). If the comammox Nitrospira are unable to transport cyanate, our results may suggest that these Nitrospira degrade cyanate generated internally, for example from the degradation of carbamoyl phosphate or from a spontaneous isomeric conversion of urea to cyanate [80][81][82][83][84]. However, it is also possible that alternate transporters could be used for uptake of cyanate, such as nitrate permease [85], or the nitrite/ nitrate transporter NarK, which is proposed to also transport cyanate [77]. If confirmed, this would represent an undescribed method of nitrogen uptake and utilization for comammox bacteria.
Multiple populations of comammox Nitrospira within the same RBC system suggests that these nitrifiers do not compete directly but instead occupy unique niches within the RBCs as a result of encoded functional diversity, for example because of distinct energy sources (e.g., hydrogen, urea, and cyanate) or microenvironments within the biofilm that vary in oxygen or ammonium concentrations. Indeed, the abundances of sequences that recruited to comammox Nitrospira (based on read recruitment to bins) varied temporally and spatially among RBCs (Fig. 6), implying that complex physicochemical factors govern comammox in the RBCs. Several RBC comammox Nitrospira MAGs may even represent closely related strains (Fig. S3), adding further interest to understanding the factors influencing the distribution and ecology of these bacteria.
Genome bins of other ammonia oxidizers were also recovered. Following dereplication, four bins were classified as Nitrosomonas (Supplementary file 3). These populations were present at a higher relative abundance in 2010 samples than 2016 samples. A single dereplicated Nitrosocosmicus (AOA) bin was detected (RBC071; Supplementary file 3), which indicates the presence of a single Nitrososphaeria-associated population and matches previous data from the RBCs [29,30]. In 2010, this species was present at a higher relative abundance in RBC 8 than RBC 1, with the opposite pattern occurring in 2016 samples. Classified as "Ca. N. hydrocola" under GTDB taxonomy, the AOA bin had an ANI of 96.1% between it and the "Ca. N. hydrocola" genome (GenBank accession number CP017922.1), which indicated that this AOA MAG corresponded to the same archaeon enriched and characterized from an RBC biofilm sample.
The observed pattern of AOA amoA abundances for 2010 samples analyzed in this study (Fig. 2, Fig. 5a) was consistent with previous data obtained from separate DNA extractions of the same samples [29], demonstrating that AOA were indeed more abundant in RBC 8 than RBC 1 samples in 2010. Although AOA in 2016 samples were at a lower relative abundance in RBC 8 compared with RBC 1, this matches data from more recent samples collected from the RBCs in December 2015 [30]. Lower ammonium concentrations in recent years (Fig. S5) may have shifted the "Ca. N. hydrocola" AOA population toward RBC 1, and cleaning of the RBC trains in 2014 would have impacted the microbial communities. Lower ammonium concentrations beneath RBCs and RBC cleaning may have impacted AOB populations as well. Both qPCR (Fig. 2) and metagenome sequence data (Fig. 5b) indicated an overall decrease in AOB amoA gene relative abundances in the 2016 samples compared with 2010. Most AOB have a relatively low ammonia affinity [63,86,87] and may be outcompeted, when ammonium concentrations are low, by ammonia oxidizers with higher ammonia affinities (i.e., comammox Nitrospira and AOA).

Conclusions
This study used complementary methods to demonstrate that comammox Nitrospira dominate microbial communities in biofilms from this engineered WWTP system. Detected comammox Nitrospira exhibited a high level of taxonomic and functional diversity, in contrast to the single population of AOA found in the same systems. The evolutionary history of comammox Nitrospira prior to establishing within the WWTP, perhaps within heterogeneous soil or sediment samples, may be a factor that contributed to the high diversity observed within RBCs. There is some phylogenetic evidence that the AOA selected within the RBCs may have originated on skin surfaces [30,88], which would offer relatively low habitat heterogeneity. In addition, other studies have found the coexistence of closely related Nitrospira in WWTPs [17,89,90], suggesting that Nitrospira may be able to occupy many different niches in engineered nitrifying environments. Although comammox Nitrospira were numerically dominant, their contributions to nitrification in the RBC environment remain unclear. Future studies using differential inhibitors, isotope tracer studies, and laboratory cultivation may be useful for elucidating the contributions of comammox Nitrospira to nitrification in the RBC biofilms. In addition, the comammox Nitrospira that fall within the newly proposed clade A.2, which lacks cultured representatives [14], along with the cynS-containing comammox Nitrospira populations, would be ideal targets for cultivation.
The Guelph WWTP RBCs represent a unique and useful system to study the ecology of comammox Nitrospira as a result of inbuilt environmental gradients, and the combined presence of comammox Nitrospira, AOA, AOB, and NOB. Tertiary treatment systems such as these RBCs can be used to produce high quality effluents, which may be particularly important when WWTPs discharge into environments that have low assimilative capacities or are ecologically sensitive. A longer-term study of the microbial community and operating parameters (e.g., hydraulic retention time, oxygen concentration, and organic loading) would help elucidate the factors that affect the microbial community, with a focus on the ammonia oxidizing microbial community. Linking the performance of the RBCs (measured by ammonia removal) to the ammonia oxidizing microbial community would help determine which members of this community have important roles in the ammonia removal process. The activity of the nitrifiers could be further assessed with differential inhibitor and isotope labeling experiments. Understanding the microorganisms that are present within water treatment systems is an important step towards optimizing operational practices for improved effluent quality in municipal, industrial, and aquacultureassociated water treatment systems.

Data availability
Amplicon sequence data are available on the JGI genome portal under sequencing project ID 1137811. Metagenome sequencing data and high completeness, low contamination genome bins were deposited in the European Nucleotide Archive under study accession number PRJEB30654.