Introduction

Dissolved oxygen (O2) concentration is a primary driver of nutrient and energy flow patterns within marine ecosystems (Diaz and Rosenberg, 2008; Diaz et al., 2009). O2-deficiency selects for microbial groups capable of utilizing alternative respiratory substrates including nitrate (NO3−), nitrite (NO2−), manganese (Mn), iron (Fe), sulfate (SO42−) or carbon dioxide (CO2) (Zehnder and Stumm, 1988). Within O2-deficient waters, the use of NO2− or NO3− as alternative electron acceptors results in the production of nitrous oxide (N2O) and dinitrogen gas (N2) (Lam and Kuypers, 2010). Similarly, reduction of SO42− and CO2 under anoxic conditions results in the production of toxic hydrogen sulfide (H2S) (Teske, 2010) and methane (CH4), respectively (Naqvi et al., 2010). Recent studies of microbial community structure and systems metabolism within marine O2 minimum zones (OMZs) indicate a versatile capacity to produce and consume climate active trace gases (defined as gases making up <1% of the atmosphere that absorb the electromagnetic energy resulting from reflection of solar radiation from the Earth's surface) or to limit accumulation of H2S within the surrounding water column (Lavik et al., 2009; Walsh et al., 2009; Canfield et al., 2010; Zaikova et al., 2010). Although prokaryotic (bacteria and archaea) microorganisms are the primary drivers of these biogeochemical transformations (Arrigo, 2005), it is likely that microbial eukaryotes (protists) act as important biological controls through predation on (Taylor, 1982), parasitism of (Chambouvet et al., 2008), and symbioses with (Edgcomb et al., 2011c) different microbes.

Information regarding the influence of OMZ formation on marine protists remains incremental, although recent studies using next generation sequencing approaches point to complex and diverse protistan communities in these habitats (Stoeck et al., 2009; Behnke et al., 2010). Restructuring of these protistan communities in response to changing levels of water column O2-deficiency likely influences prokaryotic population structure and activities, with resulting feedback on nutrient and climate active trace gas cycling. Several recent studies provided insight into the biogeography, species richness, endemicity and habitat specialization of protists along O2 gradients (Behnke et al., 2006; Zuendorf et al., 2006; Behnke et al., 2010; Edgcomb et al., 2011a, 2011b; Orsi et al., 2011b). Work in the Cariaco Basin revealed a specialization of many protistan taxa to different biogeochemical niches and sites in the basin (Orsi et al., 2011b), and the estimated protistan species richness there was found to be exceptionally high (Edgcomb et al., 2011a). A recent study of Framvaren Fjord found evidence for seasonal fluctuations in protistan community structure (Behnke et al., 2010). Although major taxonomic lineages remained consistent throughout the time course of the study, subgroup diversity changed extensively from season to season among and between sampling depths, consistent with dynamic recruitment from diverse low abundance populations. A similar finding was also reported by an additional study of microbial eukaryotes in the western North Atlantic and eastern North Pacific oceans (Caron et al., 2009).

In the present study, we monitored changes in protistan community structure in Saanich Inlet, a seasonally anoxic fjord on the coast of Vancouver Island, Canada, using small subunit ribosomal RNA gene (SSU rRNA gene, 18S rRNA gene) clone library sequencing. We charted the spatiotemporal variability of protists in relation to dissolved gases and nutrients, and employed multivariate statistical approaches to identify potential relationships between compositional profiles, taxonomic groups and environmental parameters at different stages of water column stratification and renewal. The resulting data sets are compared with related studies in anoxic Cariaco Basin and Framvaren Fjord and used to identify common and unique patterns of community composition.

Materials and methods

Sample collection and processing

Samples from Saanich Inlet were collected and processed as described previously (Zaikova et al., 2010; Walsh and Hallam, 2011) as part of a monthly monitoring program aboard the MSV John Strickland (JS) or CCGS John P Tulley (JPT). Briefly, water samples and environmental parameter data were collected from station S3 (48°35.30N, 123°30.22W). Approximately 20 l from 10, 100, 120 and 200 m depth intervals representing oxic, dysoxic, suboxic and anoxic regions of the water column was prefiltered through 2.7 μm GF/D prefilters onto 0.22 μm Sterivex filters for downstream molecular analyses. Biomass samples were accompanied by higher resolution physical and chemical data spanning 16 depth intervals including cell counts, temperature, salinity, O2, NO3−, PO43− (phosphate), SiO4− (silicate), NO2−, N2O, NH4+ (ammonium), CO2, CH4 and H2S concentration measurements. Aspects of this workflow are presented as a series of on-line video protocols including: (1) seawater collection and environmental sampling (URL: http://www.jove.com/index/Details.stp?ID=1159) (Zaikova et al., 2009), (2) small volume filtration (URL: http://www.jove.com/index/Details.stp?ID=1163) and large volume filtration (URL: http://www.jove.com/index/Details.stp?ID=1161) (Walsh et al., 2009a, 2009b), and (3) genomic DNA extraction and purification (URL: http://www.jove.com/index/Details.stp?ID=1352) (Wright et al., 2009). Additional information relating to hydrology and water column chemistry in Saanich Inlet is available through the Saanich undersea array, a streaming cabled observatory node situated on the seafloor near the mouth of the Inlet (URL: venus.uvic.ca).

PCR amplification of 18S rRNA genes

DNA extracts from 10, 100, 120 and 200 m depth intervals were amplified using small subunit ribosomal DNA primers targeting the eukaryotic domain: Euk515F (5′-GTGCCAAGCAGCCGCGGTAA) and Euk1209R (5′-GACGGGCRGTGWGTRCA) under the following PCR conditions: 2 min at 95 °C followed by 20 cycles of 95 °C for 40 s, 55 °C for 30 s, 72 °C for 90 s and final extension of 7 min at 72 °C. Each 50 μl reaction contained 1 μl of template DNA and 1 μl each 0.4 μM forward and reverse primer added to a PCR Master Mix (Stratagene, Santa Clara, CA, USA, Cat #600640). Reactions were aliquoted into 3 × 15 μl reactions before PCR (to minimize bias) and re-pooled after PCR. For specific details see URL: http://my.jgi.doe.gov/general/protocols/SOP_16S18S_rRNA_PCR_Library_Creation.pdf

Clone library construction and sequencing

Resulting amplicons were gel purified using the MinElute Gel Extraction Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. Ligation, transformation and sequencing steps were performed as described in Zaikova et al. (2009). The number of resulting transformants per ligation ranged between ∼100 000 and 800 000 colony forming units. One 384-well plate per sample ligation was sequenced with variable success using M13F and M13R primers as described at http://jgi.doe.gov/sequencing/protocols/prots_production.html.

Quality control of 18S SSU rRNA data set

The sequences were checked for chimeras using the Bellerophon Chimera Check and the Check_Chimera utilities (Ribosomal Database Project) (Cole et al., 2003). After removal of putative chimeras, bacterial, archaeal and metazoan sequences, the remaining sequences were grouped into operational taxonomic units (OTUs) based on 98% rRNA gene sequence similarity levels. This was achieved by first making all possible pairwise sequence alignments by using ClustalW (Thompson et al., 1994), calculating percentage sequence identities, followed by clustering the sequences by using the unweighted pair group method with arithmetic mean as implemented in the OC clustering program (http://www.compbio.dundee.ac.uk/Software/OC/oc.html). OTUs clustered at the 98% identity threshold were subjected to ordination and multivariate statistical analyses.
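The clustering step described above can be sketched in a few lines. The following is a minimal illustration, not the OC program itself: it assumes a precomputed matrix of pairwise fractional identities (here a toy matrix, not study data), converts identities to distances, and applies average-linkage (UPGMA) clustering from SciPy, cutting the tree at a distance of 1 − 0.98. The function name `cluster_otus` is hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_otus(identity, threshold=0.98):
    """Assign sequences to OTUs by UPGMA on (1 - identity) distances.

    identity: square, symmetric matrix of pairwise fractional identities.
    threshold: identity cutoff (0.98 corresponds to the 98% level).
    """
    distance = 1.0 - np.asarray(identity, dtype=float)
    np.fill_diagonal(distance, 0.0)
    condensed = squareform(distance)            # condensed form expected by linkage
    tree = linkage(condensed, method="average")  # average linkage = UPGMA
    # Cut the tree where the between-cluster distance exceeds 1 - threshold.
    return fcluster(tree, t=1.0 - threshold, criterion="distance")


# Toy identity matrix for four sequences (illustrative values only).
identity = np.array([
    [1.00, 0.99, 0.90, 0.89],
    [0.99, 1.00, 0.91, 0.90],
    [0.90, 0.91, 1.00, 0.985],
    [0.89, 0.90, 0.985, 1.00],
])
labels = cluster_otus(identity)
print(labels)  # sequences 0-1 share one OTU, sequences 2-3 another
```

How the OC program resolves ties and applies its cutoff may differ from SciPy's behavior, so OTU counts from this sketch should not be expected to match the published values exactly.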

Community composition analysis

18S rRNA gene sequences from Saanich Inlet (19 samples), Cariaco Basin (16 samples) (Edgcomb et al., 2011a; Orsi et al., 2011b) and Framvaren Fjord (nine samples) (Behnke et al., 2010) were obtained from the GenBank-nt database and were aligned using the Needleman–Wunsch algorithm implemented in mothur version 1.18 (Patrick Schloss, Detroit, MI, USA) (Schloss, 2009), with a gap penalty of –1 and a k-mer size of 9. The resulting alignment file was used as input to generate 6768 OTUs based on a 98% similarity threshold, and a representative sequence for each OTU was selected using the recommended distance-based method, Get.oturep (http://www.mothur.org/wiki/Get.oturep). The resulting cluster file and list of representative sequences for each OTU were combined to generate an OTU table containing the number of sequences in each OTU across the 44 samples. Representative sequences were then used in a BLASTn search against the Silva SSU reference database (Bremen, Germany) version 10.6 with sequences affiliated with the Bacteria and Archaea removed. The results of clustering were also used to calculate non-parametric Chao, Shannon and Simpson indices of α diversity (Supplementary Table S2).
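For readers unfamiliar with the α diversity indices named above, the following sketch computes them from a list of per-OTU sequence counts. It uses common textbook forms (bias-corrected Chao1, Shannon entropy with natural logarithm, and Simpson's index without replacement); the text does not state which variants were applied, so the formulas and the toy counts are assumptions for illustration only.

```python
import math


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-OTU counts."""
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)                      # observed OTUs
    f1 = sum(1 for c in counts if c == 1)    # singletons
    f2 = sum(1 for c in counts if c == 2)    # doubletons
    return s_obs + f1 * (f1 - 1) / (2.0 * (f2 + 1))


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i)."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def simpson(counts):
    """Simpson index D = sum(n_i * (n_i - 1)) / (N * (N - 1))."""
    n = sum(counts)
    return sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Toy per-OTU counts for a single sample (not study data).
toy_counts = [10, 5, 2, 1, 1, 1]
print(chao1(toy_counts), shannon(toy_counts), simpson(toy_counts))
```

Chao1 depends only on the numbers of singletons and doubletons, which is why non-parametric estimates can fall well below parametric ones when many rare OTUs remain unsampled.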

Circos visualization

The BLASTn output and the OTU table were combined to generate a community composition table containing the number of sequences in each sample belonging to a specific taxonomic group. From this table, the relative abundance of each protistan group in a given sample was calculated as a percentage value by dividing the raw number of sequences associated with that taxon by the total number of sequences in the sample. Circos was used to generate circular link diagrams illustrating community composition differences among and between samples (Krzywinski et al., 2009).
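The relative abundance calculation described above amounts to a per-sample normalization. A minimal sketch follows, assuming the community composition table is held as a nested dictionary; the function name `relative_abundance`, the sample label and the counts are hypothetical.

```python
def relative_abundance(table):
    """Convert raw counts to percentages within each sample.

    table: {sample_id: {taxon: raw_sequence_count}}
    returns: {sample_id: {taxon: percent_of_sample_total}}
    """
    result = {}
    for sample, taxa in table.items():
        total = sum(taxa.values())
        result[sample] = {
            taxon: (100.0 * count / total if total else 0.0)
            for taxon, count in taxa.items()
        }
    return result


# Hypothetical sample with illustrative counts (not study data).
toy_table = {"sample_A": {"Ciliophora": 66, "Euglenozoa": 11, "Other": 23}}
percentages = relative_abundance(toy_table)
print(percentages["sample_A"])
```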

Histoheatmap generation

The same table used in Circos visualization was also used to generate combined histograms and heatmaps (that is, histoheatmaps) illustrating the presence of different OTUs across each sample, using the R statistical software package (http://www.r-project.org). The corresponding R script is available upon request from the authors.

Phylogenetic analysis

For our phylogenetic analyses, we focused on ciliate and euglenozoan-affiliated sequences in the July 200 m sample. We focused on this sample because it contained the most novel sequences of all clone libraries/samples in our study. For these analyses, we used representative sequences from each ciliate and euglenozoan-affiliated OTU clustered at the 98% sequence identity level. Representative sequences were compared against the GenBank-nt database using BLASTn in search of their closest relatives and the highest scoring cultured and uncultured sequence relatives were retrieved. Sequences were aligned using the ARB automated aligner (Bremen, Germany) (Ludwig et al., 2004), the alignment was manually refined using secondary structure information, and only unambiguous positions were used to construct phylogenetic trees. Phylogenetic trees were constructed using Bayesian inference (Ronquist and Huelsenbeck, 2003) and PhyML (Montpellier, France) (Guindon et al., 2005).

Statistical analyses

After clustering of our Sanger sequence data set, we obtained ‘frequency count’ data at the 90%, 95%, 98% and 99% sequence identity levels. These are the numbers of OTUs registered once (the ‘singletons’), or twice (the ‘doubletons’), etc. Using these data, we estimated the total number of OTUs at each level of sequence identity, representing the sum of seen (empirically registered) and unseen OTUs (present but undetected due to limited sequencing effort). This was performed using the program CatchAll (Ithaca, NY, USA) (Bunge, 2010) to compute eight parametric (Poisson; negative binomial; inverse Gaussian, Pareto and lognormal-mixed Poisson; and mixtures of one, two or three geometrics) estimators as described previously (Hong et al., 2006).
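The 'frequency count' data described above can be derived directly from the sizes of the clustered OTUs. The sketch below is illustrative only: it tabulates how many OTUs were observed exactly once, twice, and so on, from a toy list of OTU sizes. It does not reproduce CatchAll's input format or its parametric estimators, and the function name `frequency_counts` is hypothetical.

```python
from collections import Counter


def frequency_counts(otu_sizes):
    """Return {j: number of OTUs observed exactly j times}, sorted by j."""
    tally = Counter(size for size in otu_sizes if size > 0)
    return dict(sorted(tally.items()))


# Toy OTU sizes: three singletons, two doubletons and one OTU seen five times.
toy_sizes = [1, 1, 1, 2, 2, 5]
freq = frequency_counts(toy_sizes)
observed_otus = sum(freq.values())                    # OTUs actually registered
total_sequences = sum(j * f for j, f in freq.items())  # sequences they contain
print(freq, observed_otus, total_sequences)
```

Richness estimators then model the shape of this table to infer the number of OTUs with a count of zero, that is, those present but unseen.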

Canonical correspondence analysis

Canonical correspondence analysis (CCA) was used to elucidate relationships between protistan community structure and concentrations of dissolved O2, NO3−, CH4 and H2S. Multiresponse permutation procedure (MRPP) was used to test for a statistically significant influence of season, depth, NO3−, sulfide and O2 on the observed OTU distribution. A Monte Carlo test was also used to assess the null hypothesis of no relationship between OTU distributions and environmental variables. All ordination and multivariate statistical analyses were performed on our data set clustered at the 98% sequence identity threshold. Monte Carlo tests, MRPP and CCA were implemented using the PC-ORD software package (MjM Software Design, Gleneden Beach, OR, USA).
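To make the logic of MRPP concrete, the following sketch computes the weighted mean within-group distance (delta) and compares it with the values obtained after shuffling group labels. It is a simplified illustration under stated assumptions: groups are weighted by n_i/N, the P-value comes from explicit label permutation, and the distance matrix and grouping are toy values. It is not a reproduction of the PC-ORD implementation, whose internals are not described here.

```python
import numpy as np


def mrpp(distance, groups, n_perm=999, seed=0):
    """Permutation-based MRPP sketch.

    distance: square matrix of pairwise distances between samples.
    groups: group label for each sample (for example season or depth).
    Returns (observed delta, permutation P-value).
    """
    distance = np.asarray(distance, dtype=float)
    groups = np.asarray(groups)
    rng = np.random.default_rng(seed)

    def delta(labels):
        total = 0.0
        for g in np.unique(labels):
            idx = np.flatnonzero(labels == g)
            if len(idx) < 2:
                continue  # a single-member group has no within-group distance
            sub = distance[np.ix_(idx, idx)]
            upper = sub[np.triu_indices(len(idx), k=1)]
            total += (len(idx) / len(labels)) * upper.mean()
        return total

    observed = delta(groups)
    permuted = np.array([delta(rng.permutation(groups)) for _ in range(n_perm)])
    # Small delta means tight groups; count permutations at least as tight.
    p_value = (1 + np.sum(permuted <= observed)) / (n_perm + 1)
    return observed, p_value


# Toy data: two well-separated groups along one axis (not study data).
values = np.array([0.0, 0.1, 0.2, 0.3, 10.0, 10.1, 10.2, 10.3])
toy_distance = np.abs(values[:, None] - values[None, :])
toy_groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
observed_delta, p_value = mrpp(toy_distance, toy_groups)
print(observed_delta, p_value)
```

A small observed delta relative to the permuted values indicates that samples within a group are more similar to one another than expected by chance.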

Principal component and hierarchical clustering analysis

To determine the correlation between protistan community structure and the O2 concentration in each sample, a table containing the raw number of sequences associated with each major taxon was used as an input for principal component analysis (PCA) using the FactoMineR module (http://factominer.free.fr). Based on the first two principal components calculated from the PCA, samples were hierarchically clustered using the Manhattan distance method with complete linkage implemented in the same software module. The results of the analysis were visualized as dendrograms with dot plots using the custom perl script, bubble.pl (http://www.cmde.science.ubc.ca/hallam/bubble.php).
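The two-step procedure above (PCA followed by hierarchical clustering on the first two components) can be sketched with NumPy and SciPy. This is an illustrative stand-in for the FactoMineR workflow, not a port of it: it assumes columns are centered and scaled to unit variance before PCA, and the toy table, function names and choice of two clusters are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def pca_scores(counts, n_components=2):
    """Sample scores on the leading principal components (scaled PCA via SVD)."""
    x = np.asarray(counts, dtype=float)
    x = x - x.mean(axis=0)
    std = x.std(axis=0)
    std[std == 0] = 1.0          # leave constant columns unscaled
    x = x / std
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def cluster_samples(scores, n_clusters):
    """Complete-linkage clustering on Manhattan (cityblock) distances."""
    tree = linkage(pdist(scores, metric="cityblock"), method="complete")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Toy samples-by-taxa count table (rows are samples; not study data).
toy_counts = np.array([
    [90, 5, 5],
    [85, 10, 5],
    [5, 80, 15],
    [10, 75, 15],
])
scores = pca_scores(toy_counts)
sample_clusters = cluster_samples(scores, n_clusters=2)
print(scores.shape, sample_clusters)
```

Clustering on only the first two components discards variance carried by later components, so the resulting groups reflect the dominant compositional gradients rather than every difference between samples.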

Results

Community diversity measures

We sampled four depth intervals over 3 years (2006–2008) during the months of February, April, July and November, representing different water column redox states resulting from seasonal stratification and renewal (Walsh et al., 2009; Zaikova et al., 2010). A total of 4987 18S rRNA gene sequences recovered from 19 different clone libraries were analyzed (Table 1). Clustering of these sequences at the 99%, 98%, 95% and 90% sequence identity thresholds resulted in a total of 1217, 993, 596 and 244 OTUs, respectively. We estimated OTU richness using a statistical tool designed for estimating microbial species richness with a reliable standard error (Hong et al., 2006; Edgcomb et al., 2011a; Orsi et al., 2011b). At the 99% and 98% sequence identity thresholds, we estimated 13 442 (−/+: 7963–23 373 CI (95% confidence interval)) and 8176 (−/+: 4861–14 333 CI) taxa, respectively. At 95% and 90% sequence identity, we estimated 2687 (−/+: 1440–5778 CI) and 510 (−/+: 376–781 CI) taxa, respectively (Table 2). Using the same statistical tool, we estimated taxonomic richness within the anoxic Framvaren Fjord using recently published 18S rRNA gene data sets from this environment (Behnke et al., 2006; Behnke et al., 2010). The numbers of OTUs estimated to exist in Framvaren Fjord at the 99%, 98%, 95% and 90% sequence identity levels amount to 28%, 13%, 17% and 45% of the OTUs predicted for Saanich Inlet, respectively (Table 2). Non-parametric methods estimated similar richness and diversity for Saanich Inlet and Framvaren Fjord, although predicted richness was lower than in the Cariaco Basin. Non-parametric estimates for all three sites were lower than parametric estimates (Supplementary Table S2).

Table 1 Sample index for Saanich Inlet, Cariaco Basin and Framvaren Fjord microbial eukaryotes
Table 2 Predicted richness of protistan assemblages in Saanich Inlet, Cariaco Basin and Framvaren Fjord

Protistan community structure in Saanich Inlet

CCA showed a clear division of Saanich Inlet protistan communities into different clusters associated with oxic (>90 μM), dysoxic (20–90 μM), suboxic (1–20 μM) or anoxic/sulfidic (<1 μM/±sulfide) water column conditions (Figure 1). The anoxic samples from 200 m clustered together, separate from all other samples on the biplot (Figure 1). However, the 200-m sample taken in November grouped with dysoxic samples. A Monte Carlo test of the null hypothesis of no relationship between the OTU distribution and the measured environmental variables O2, NO3−, CH4 and sulfide had a P-value of 0.01. MRPP tests of influence of season, depth, O2, NO3−, CH4 and sulfide on the OTU distribution all yielded P-values ⩽0.01.
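The O2 categories used above can be written as a simple threshold function. In this sketch the handling of the boundary values (exactly 20 μM and exactly 1 μM) is an assumption, since the stated ranges share their endpoints, and the function name `classify_o2` is hypothetical.

```python
def classify_o2(o2_um, sulfide_detected=False):
    """Assign a water column redox category from dissolved O2 (in μM).

    Boundary handling is assumed: 20 μM counts as dysoxic, 1 μM as suboxic.
    """
    if o2_um > 90:
        return "oxic"
    if o2_um >= 20:
        return "dysoxic"
    if o2_um >= 1:
        return "suboxic"
    return "anoxic/sulfidic" if sulfide_detected else "anoxic"


# Illustrative concentrations (not measured values).
for value in (200, 50, 5, 0):
    print(value, classify_o2(value))
```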

Figure 1

CCA of the Saanich Inlet 18S rRNA gene sequence data set clustered at the 98% identity threshold. Samples are represented in the biplot by dots, the size and color of which indicate the presence and concentration of dissolved O2. Axes 1 and 2 explained 8% and 7.7% of the variance in OTU distribution, respectively. A Monte Carlo test for significance of the eigenvalues yielded a P-value of 0.03.

Changes in taxonomic representation at 120 and 200 m coincided with the annual renewal cycle in fall and re-stratification of the water column in summer (Table 1) (Walsh et al., 2009). At 120 and 200 m, only 10% and 3% of OTUs, respectively, were detected in both July and November. At 120 m in July, clone libraries were dominated by Stramenopile-affiliated sequences, while in November an increase in Cercozoan- and Dinophyceae-affiliated sequences was observed (Figures 2 and 3). At 200 m in July, 66% and 11% of sequences were affiliated with the Ciliophora and Euglenozoa, respectively (Figures 2 and 3), while in November, 70% of sequences were affiliated with the Dinophyceae, 80% of which were affiliated with the subgroup Syndiniales (Figures 2 and 3). The majority of sequences recovered from anoxic waters exhibited low (<92%) sequence identities with their closest described relatives (Supplementary Table S1) and formed new lineages based on phylogenetic trees (Figures 4 and 5). The uncultured environmental sequences with the highest identities to many of these sequences were recovered from O2-deficient marine waters from the Cariaco Basin (Stoeck et al., 2003), Framvaren Fjord (Behnke et al., 2006; Behnke et al., 2010) and Mariager Fjord (Denmark) (Zuendorf et al., 2006).

Figure 2

Phylum-level taxonomic affiliations of Saanich Inlet 18S rRNA gene sequences (see Table 1 for sample_id information). The color-coded outer histogram represents the abundance of 17 major taxonomic groups identified. The relative abundance (percentage of total) of sequences affiliated with each taxonomic group within a sample is indicated by the thickness of the colored area at the perimeter of the circle. Black circles represent individual samples. The concentration of O2 within each sample is represented by a black bar, and gray bars indicate samples with no detectable O2. The height of each bar is scaled according to the value of the O2 concentration (in μM) normalized using natural logarithm (ln).

Figure 3

Combined histogram and heatmap describing the diversity of Ciliophora, Cercozoa, Fungi and Euglenozoa OTUs among and between sampling depths and locations (see Table 1 for sample_id information). The heatmap shows the number of sequences in each OTU that are affiliated with each color-coded protistan group. Color intensity of each cell is proportional to the log-corrected number of sequences in the OTU. The histogram shows the total number of uncorrected sequences in the corresponding OTU (i.e., the sum of sequences in a given OTU across all the samples). Only those OTUs are shown for which the total number of sequences is ⩾1% of the total sequences affiliated with a given taxon. The count legends indicate the number of cells in each heatmap that contain the designated log-corrected value.

Figure 4

Phylogenetic relationships of ciliate-affiliated 18S rRNA gene sequences. The tree was constructed under maximum likelihood using an alignment of 757 unambiguous positions under the GTR+I+Gamma model of sequence evolution. Bootstrap (PhyML, 1000 iterations) and posterior probability (5 000 000 generations with 25% of trees discarded as burn-in) values >50% are shown at the nodes in the order PP/ML (posterior probability/maximum likelihood bootstrap). Black circles at nodes represent full posterior probability and bootstrap support. OTUs from our study appear in bold font. The number of sequences per OTU recovered from oxic, dysoxic, suboxic and anoxic samples is represented by circles. The size and color of the circles denote the number of sequences and O2 concentration, respectively.

Figure 5

Phylogenetic relationships of Euglenozoa-affiliated 18S rRNA gene sequences. The tree was constructed under maximum likelihood using an alignment of 809 unambiguous positions under the GTR+I+Gamma model of sequence evolution. Bootstrap (PhyML, 1000 iterations) and posterior probability (5 000 000 generations with 25% of trees discarded as burn-in) values >50% are shown at the nodes in the order PP/ML. Black circles at nodes represent full posterior probability and bootstrap support. OTUs from our study appear in bold font. The number of sequences per OTU recovered from oxic, dysoxic, suboxic and anoxic samples is represented by circles. The size and color of the circles denote the number of sequences and O2 concentration, respectively.

At 10 m, Dinophyceae- and Stramenopile-affiliated sequences dominated during all seasons (Figure 2; Supplementary Figure S4). The majority (90%) of Dinophyceae-affiliated sequences were related to the parasitic Syndiniales. The same observation was made at 100 m in February, April and November, while in July, radiolarian-affiliated sequences were most abundant (Figure 2; Supplementary Figure S4). The number of cercozoan-affiliated sequences increased in April at 10 and 100 m, but these sequences were less abundant during the rest of the year (Figures 2 and 3). The occurrence of larger-sized protists (such as the Radiolaria and Cercozoa) in pico-eukaryote size clone libraries has also been reported in previous studies (Not et al., 2009).

Taxonomic relationships between Saanich Inlet, Framvaren Fjord and Cariaco Basin

To better constrain how protistan community structure and dynamics respond to changing levels of O2 and sulfide on a global scale, we compared available 18S rRNA sequence data sets from Saanich Inlet, Cariaco Basin and Framvaren Fjord. Overall, taxonomic assignments for the Cariaco Basin and Framvaren Fjord (Supplementary Figures S1 and S2) data sets were congruent with previous studies (Behnke et al., 2010; Edgcomb et al., 2011a; Orsi et al., 2011b). Sequences affiliated with the Ciliophora, Euglenozoa, Choanoflagellata, Stramenopiles, Fungi and Dinophyceae were well represented in anoxic samples from all three sites (Figure 3; Supplementary Figures S1–S5). However, many taxa were differentially represented. Examples include Stramenopile-affiliated sequences that were more prevalent in Cariaco Basin and Framvaren Fjord than Saanich Inlet (Supplementary Figures S1–S4), as well as Polycystinea- and fungal-affiliated sequences that were more abundant in Cariaco Basin relative to Saanich Inlet and Framvaren Fjord (Figures 2 and 3; Supplementary Figures S1–S5). Sequences affiliated with the Stramenopiles, Cercozoa, Dinophyceae, Polycystinea and Acantharea were represented in oxygenated samples from both the Cariaco Basin and Saanich Inlet (Figures 2 and 3; Supplementary Figures S1–S5).

Combined principal component analysis (PCA) and hierarchical cluster analysis of the Saanich Inlet, Cariaco Basin (Edgcomb et al., 2011a; Orsi et al., 2011b) and Framvaren Fjord (Behnke et al., 2010) data sets revealed biogeographic and niche-specific clustering patterns. For the most part, protistan communities clustered by location according to depth and O2 concentration. The majority of oxic, dysoxic and suboxic samples from Saanich Inlet formed a nested series in one cluster, with anoxic Cariaco Basin and Framvaren Fjord samples forming independent clusters. Interestingly, anoxic/sulfidic samples from Saanich Inlet and one anoxic deep sample from the Cariaco western subbasin (site BC) clustered with the hyper-sulfidic Framvaren Fjord samples (Figure 6).

Figure 6

PCA and hierarchical clustering of the 38 samples from Saanich Inlet, Cariaco Basin and Framvaren Fjord (see Table 1 for sample_id information). The x and y axes of the grid represent the first and second principal components, respectively. Each dot represents one of the 38 samples used in the analysis. The visual properties of each dot can be divided into three categories. The shape of each dot represents sample location, Saanich Inlet (circle), Cariaco Basin (hexagon) and Framvaren Fjord (square). Each dot is color-coded based on dissolved O2 concentration, oxic (red), dysoxic (green), suboxic (light blue) and anoxic (purple). The size of each dot is scaled according to the value of the O2 concentration (in μM) normalized using natural logarithm (ln). The clustering pattern is further linked to the dendrogram generated from hierarchical clustering.

Fluctuations in rare microbial populations

Our comparison of microbial populations during stratification and renewal suggests selection for growth of many low abundance taxa once the preferred conditions arise. Examples include an increase in the number of Ciliate- and Stramenopile-affiliated sequences recovered at 200 m during periods of anoxia (Figures 2 and 3; Supplementary Figure S4). One Stramenopile-affiliated OTU was detected 228 times in July and 11 times in November (Supplementary Figure S4). Furthermore, an increase in Dinophyceae- and Cercozoan-affiliated sequences at 120 and 200 m was observed after the oxygenated renewal event in autumn (Figures 2 and 3; Supplementary Figure S4). Several Dinophyceae-affiliated OTUs detected in July and November at 200 m increased significantly in size in November after the renewal event (Supplementary Figure S4).

Discussion

Saanich Inlet provides a model ecosystem for studying microbial community responses to changes in dissolved O2 concentrations, that is, OMZ formation, because the bottom 80 m of the water column (120–200 m) annually fluctuates between oxygenated and reduced states (Zaikova et al., 2010). In summer months, restriction of movement by a shallow glacial sill at the mouth of the inlet results in water mass stability. High primary production at the surface fuels aerobic respiration of microorganisms, which in turn leads to a progressive loss of dissolved O2 as the resulting organic matter sinks and is remineralized. The resulting chemical transformation of the water column produces a redoxcline between 100 and 120 m, with anoxic waters stretching from 120 m depth to the seafloor. During autumn–winter, nutrient-rich oxygenated waters flow over the glacial sill, shoaling anoxic waters upward.

We applied an 18S rRNA gene sequencing approach to assess the influence of OMZ formation on community structure of marine protists in Saanich Inlet over a 3-year period. Phylogenetic and multivariate statistical analyses of the resulting data set revealed defined shifts in OTUs that correlated with changes in water column chemistry. Taxonomic and multivariate comparisons, as well as comparisons of statistically estimated richness, among Saanich Inlet, the anoxic Cariaco Basin and Framvaren Fjord provide insights into how different communities of marine protists respond to changing levels of water column oxygen-deficiency.

However, it is important to note that differences in sample processing across studies add potential bias to this meta-analysis. Water samples in the current survey of Saanich Inlet were filtered through a 2.7-μm prefilter, whereas no prefilter was applied in the studies of the Cariaco Basin and Framvaren Fjord. Furthermore, biomass from Saanich Inlet waters was collected on filters with a pore size of 0.22 μm, while in Cariaco and Framvaren filters with pore sizes of 0.65 and 0.45 μm were used. Additionally, different primer combinations were used in all three studies, increasing potential amplification biases. Given these biases, we argue that while the current meta-analysis reveals broad patterns of similarity in protistan community responses to low O2 conditions, interpretation of site-specific differences must be made with extreme caution.

It is known that O2 and sulfide concentrations have a strong influence on microbial distributions in anoxic marine environments such as the Cariaco Basin (Taylor et al., 2001; Li et al., 2008; Lin et al., 2008; Edgcomb et al., 2011a; Orsi et al., 2011b) and the Framvaren Fjord (Behnke et al., 2006; Stoeck et al., 2009, 2010). These findings are validated by our CCA, MRPP and Monte Carlo analyses that indicate O2 and sulfide, as well as CH4, to be the primary drivers of protistan distribution in Saanich Inlet (see Results and Figure 1). Ours is the first investigation into the influence of CH4 on the distribution of protistan communities in low O2 marine environments and our results suggest that CH4 may have a stronger influence than sulfide, based on the differential length of the CH4 and sulfide vectors in the CCA (Figure 1). Thus, the selective influence of CH4 on protistan communities should be considered in future studies of OMZs. Interestingly, methanogenic archaea were not detected in Saanich Inlet water samples that incorporated a 2.7-μm prefilter (Zaikova et al., 2010). While diffusive flux from underlying sediments is one likely source of CH4 in the water column, new CH4 production originating from methanogens associated with the anaerobic ciliates detected in our study may also contribute to CH4 accumulation in basin waters. Methanogenic symbionts of ciliates are well known (Embley and Finlay, 1993; Fenchel and Finlay, 1995; van Hoek et al., 2000; Edgcomb et al., 2011c). This finding suggests that such symbioses may have the potential to contribute to climate active trace gas cycling in low O2 and anoxic marine environments.

The CCA also suggests that unique assemblages of protists inhabit different niches along the redoxcline in Saanich Inlet throughout the year depending on the intensity of renewal events. The November 200 m sample collected during renewal does not group with the other 200 m samples from February, April and July on the biplot (Figure 1). Rather, this sample groups with dysoxic samples on the biplot, most likely reflecting the physical movement of an oxygenated water mass into basin waters (Table 1). This is supported by the minimal overlap in OTUs observed at 200 m between July and November and the MRPP analysis of the influence of season and depth. These findings confirm that OMZ formation in Saanich Inlet has a strong influence on the protistan community, with different OTUs being selected for as a result of the annual stratification and renewal cycle.

At 200 m during November and February, the number of 18S rRNA gene sequences affiliated with the Syndiniales and Stramenopiles increased relative to July and April (Figure 2; Supplementary Figure S4). The appearance of these groups is not unexpected as the upwelling water originates from coastal marine sources, a habitat in which representatives of the Syndiniales and Stramenopiles have been detected previously (Lin et al., 2006; Massana et al., 2006; Guillou et al., 2008). The majority of Stramenopile sequences were affiliated with the uncultured MAST (Marine Stramenopiles), which have been shown to exhibit a range of trophic modes and specificity for different prey species and sizes (Massana et al., 2009). Thus, as MAST stramenopiles have been detected in the anoxic waters of Framvaren Fjord and Cariaco Basin (Behnke et al., 2010; Orsi et al., 2011b), they likely play a role in regulating the abundances of microbial populations that mediate biogeochemical cycling in OMZs.

After re-stratification of the water column in July (Table 1), the majority of sequences in the anoxic portion of the water column were affiliated with the Ciliophora and Euglenozoa (Figures 2 and 3). Similar observations have been made in the Cariaco Basin where waters below the oxic/anoxic interface contain over twice the number of ciliate- and euglenozoan-affiliated OTUs relative to oxygenated waters (Orsi et al., 2011b). However, as no other studies of eukaryotic communities in seasonal OMZs have been conducted, we can only speculate at this point that such shifts may represent seasonally related succession within the protist community. Furthermore, we can only speculate that the dominant ciliates and euglenozoans found at 200 m in July represent species that survive periodic exposure to O2 during renewal events by becoming less active (and less numerous) until favorable conditions are restored. Overall, these survey results indicate that Ciliophora and Euglenozoa are selected for in Saanich Inlet during periods of water column stratification. Both contain many species of anaerobes and microaerophiles, and indeed most of the closest described relatives of the sequences affiliated with these groups (that is, Calkinsia, Cyclidium, Strombidium and Nyctotherus) fall into this category (Supplementary Table S1).

Our phylogenetic analyses of ciliate- and euglenozoan-affiliated OTUs recovered from anoxic waters (Figures 4 and 5), as well as the relatively low (<92%) identities of most of these sequences to their closest described species in public databases, suggest that OMZ formation in Saanich Inlet selects for novel lineages within these phyla. The new Symbiontida-affiliated lineages (Figure 5) with low (<90%) identities to the euglenozoan Calkinsia aureus (Yubuki et al., 2009) (Supplementary Table S1) may correspond to protists exhibiting symbioses with bacteria. C. aureus is a euglenozoan flagellate recently recovered from the Santa Barbara Basin (California) (Bernhard et al., 2000; Yubuki et al., 2009; Edgcomb et al., 2011c) with a cortex that is completely covered by epibiotic bacteria belonging to the genus Arcobacter, a group that includes chemoautotrophs and chemoorganotrophs capable of NO3− reduction and sulfide oxidation (Edgcomb et al., 2011c). Phylogenetic analyses of ciliate-affiliated sequences reveal two clades branching basal to the novel ciliate class Cariacotrichea (Orsi et al., 2011a) recovered from the Cariaco Basin, suggesting that this newly discovered taxon is highly diverse.

Despite the use of a 2.7-μm prefilter, sequences from ciliates >2.7 μm were recovered in Saanich Inlet samples, indicating the presence of DNA from lysed cells. Aside from their high copy number of ribosomal RNA genes (Prescott, 1994), the increase in abundance of ciliate-affiliated sequences may be induced by O2 depletion and accumulation of sulfide that selects for ciliates adapted to such conditions. Also, ciliates, being significant grazers of bacteria, may be responding to spikes in prey species that occur after OMZ formation. Ciliates can act as primary bacterial grazers (Sherr and Sherr, 2002) and may regulate abundances of denitrifying and anammox bacteria responsible for the production of N2O that are known to exist in the Inlet at this depth (Zaikova et al., 2010). Because N2O is a greenhouse gas and causes ozone depletion, a potentially important relationship may exist between the abundances of ciliate grazers, denitrifying and anammox bacteria and the release of N2O from the surrounding water column.

Our data set reveals a possible linkage between environmental perturbations and a response from microbial populations present at relatively low abundances, also termed ‘the rare biosphere’ (Pedros-Alio, 2007). While sequence abundance in clone libraries is by no means an exact indicator of cell numbers at the time of sampling, gross differences can be used as a proxy for cellular abundance (Not et al., 2009). Fluctuations in sequence representation within OTUs affiliated with the Stramenopiles, Dinophyceae, Ciliophora and Euglenozoa (see Results and Figures 2 and 3; Supplementary Figure S4) all suggest that some temporally rare microbial populations become abundant in Saanich Inlet after preferred conditions arise. The potential impact of seasonally abundant ciliate grazers and heterotrophic flagellates on the release of biologically produced N2O during periods of anoxia would serve as a prime example of the potential ecological importance of such a ‘rare biosphere.’

Comparisons of parametric and non-parametric richness estimates for the Saanich Inlet, Cariaco Basin and Framvaren Fjord data sets need to be interpreted with caution because of methodological differences in sample collection. Although potentially biased, this comparison suggests that the Cariaco Basin contains roughly twice the number of species-level (defined as OTUs sharing 98–99% sequence identity; see Caron et al., 2009 for discussion) and genus-level (defined as OTUs sharing 90–95% sequence identity) taxa estimated for Saanich Inlet and roughly 10 times the number estimated for Framvaren Fjord (Table 2). Non-parametric estimations, which typically underestimate microbial diversity (Chao and Bunge, 2002), resulted in smaller predictions, as expected (Supplementary Table S2), and the differences among the three locations were proportional to those in the parametric richness estimates. Non-parametric richness and diversity estimates (Chao, Simpson and Shannon indices) indicate a general trend of decreasing microbial richness with increasing depth (and anoxia) in the Saanich Inlet water column (Supplementary Table S2). The same trend is apparent in the empirically registered OTU richness (Supplementary Table S2). Together with phylogenetic and multivariate analyses (Figures 1, 4 and 5), these data suggest that the anoxic waters of Saanich Inlet house a genetically distinct, albeit less rich, protistan community relative to that present in the shallower, oxygenated waters. Differences in estimated richness among Framvaren Fjord, Saanich Inlet and Cariaco Basin (Table 2) are supported by a multivariate comparative analysis (Figure 6), indicating that the two fjord communities are more similar to one another under anoxic/sulfidic conditions than either is to Cariaco communities. However, differences in library size may influence diversity estimates (Chao and Bunge, 2002). Consequently, the relatively low taxon richness detected in Saanich Inlet and Framvaren Fjord may reflect, at least in part, undersampling.
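
For readers unfamiliar with the non-parametric indices named above, the following is a minimal sketch of how they are commonly computed from per-OTU sequence counts. It is not the software or the exact formulation used in this study: the function names, the choice of the bias-corrected form of Chao1, the natural-log Shannon index and the Gini-Simpson form (1 minus the sum of squared proportions) are all assumptions made for illustration, and the counts in the usage line are invented.

```python
import math

def chao1(counts):
    """Bias-corrected Chao1 richness estimate (illustrative form, an assumption).

    counts: iterable of sequence counts per OTU in one clone library.
    """
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)                       # observed OTU richness
    f1 = sum(1 for c in counts if c == 1)     # singletons
    f2 = sum(1 for c in counts if c == 2)     # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

def shannon(counts):
    """Shannon index H' using natural logarithms."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts)

def gini_simpson(counts):
    """Gini-Simpson index, 1 - sum(p_i^2); higher values mean more even libraries."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

# Hypothetical library: 7 observed OTUs, 3 singletons, 2 doubletons.
example_counts = [10, 5, 2, 2, 1, 1, 1]
estimated_richness = chao1(example_counts)
```

Because Chao1 depends on the numbers of singleton and doubleton OTUs, it is sensitive to library size, which is one reason estimates from libraries of different sizes should be compared with caution.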

The differences among the communities at the three locations are likely due to several parameters that differ between oceanographic provinces. First, a likely reason for the variation in richness is the large difference in size among the Cariaco Basin, Framvaren Fjord and Saanich Inlet. Protistan communities in the eastern and western subbasins of the Cariaco contain widely divergent assemblages, and this phenomenon is likely driven in part by differences in primary production, riverine inputs and trophic responses to differential prey items (Orsi et al., 2011b). The presence of additional niches attributable to the larger size of the Cariaco Basin may permit higher protist diversity relative to Saanich Inlet and Framvaren Fjord. Second, the difference in climate and seawater temperatures among Saanich Inlet, Framvaren Fjord and Cariaco may contribute to the differences in richness, as temperature has been shown to be a significant driver of the diversification of marine microbial populations (Fuhrman et al., 2008). Third, unlike Saanich Inlet and Cariaco, the oxic/anoxic interface of Framvaren Fjord lies in the photic zone and contains significantly higher sulfide levels (Table 1). Finally, the Cariaco Basin has remained anoxic for millions of years (Schubert, 1982) and experiences only limited O2 intrusion events (Lin et al., 2008). This timeframe has likely allowed for expanded speciation and diversification and could explain, in part, the higher richness of this environment. It may also contribute to the differential representation of taxonomic groups recovered in Cariaco, such as Fungi and Polycystinea (Supplementary Figure S2), as well as the separation of the fjord communities from Cariaco on the PCA biplot (Figure 6).

Our analysis revealed common and unique responses of protists to water column O2-deficiency in space and time. Similar to studies of Framvaren Fjord and Cariaco Basin, we observed that protistan taxon representation in Saanich Inlet changed in response to environmental perturbations associated with altered redox status. However, differences in taxonomic representation and diversity estimations among the three locations indicated patterns of endemism not fully explained by sampling and detection biases alone. At the same time, we obtained evidence for temporal fluctuations in rare protistan populations, particularly anaerobic ciliates, that may be of significant importance to biogeochemical cycling within OMZs.