Introduction

A central debate in microbial biogeography has contrasted species distributions as either ubiquitous or restricted in space and time, but the coexistence of both patterns is now well supported (for example, van der Gast, 2015). In addition to observations of cosmopolitan distributions for some species, multiple examples of endemism have been found for terrestrial and aquatic archaea, bacteria and eukaryotic microbes, including protists (Foissner, 2008; Fontaneto, 2011). Even marine planktonic microbes, which are thought to have high dispersal potential due to large population sizes and the apparent lack of barriers in the ocean, include both globally-distributed and geographically-isolated taxa (Pommier et al., 2007; Casteleyn et al., 2010). However, a major factor influencing our understanding of biogeography is the way we define and study taxa (Hanson et al., 2012), thus raising the question whether our current perceptions are real or semantic.

In protists, two main characteristics have been used to track taxon distributions: morphologies and molecules. As opposed to bacteria and archaea, many protist taxa have structural traits that allow species identification. For instance, some marine protists build shells with morphological features that have allowed taxonomic classification by standard microscopy (for example, foraminiferans, radiolarians and tintinnid ciliates). In these groups, morphologically-defined species (morphospecies) have historically been found to display distinct distribution patterns in both the horizontal and vertical dimensions (for example, Boltovskoy, 1999). Nevertheless, biogeographical patterns based on morphospecies may be obscured by cryptic and polymorphic species, given that conserved, convergent or plastic morphologies can mask major molecular, biological, physiological and ecological differences (Dolan, 2015). The increasing combination of morphological and DNA sequence analyses in single specimens has revealed, for instance, cryptic species with different distributions and likely dissimilar ecology in marine plankton (Weiner et al., 2012; Ishitani et al., 2014; Santoferrara et al., 2015). Furthermore, the characterization of natural communities by high-throughput sequencing (HTS) is now offering the potential to reveal new distribution patterns based on the detection of rare species and the discovery of novel, atypical taxa that may not be recognized by microscopy, even in some groups with a long tradition of morphological description (Lecroq et al., 2011). At the same time, these conspicuous groups are useful to test known issues of environmental HTS, as parallel morphospecies identification in the microscope helps to differentiate meaningful operational taxonomic units (OTUs based on sequence similarity) from potential HTS errors (for example, Bachy et al., 2013; Santoferrara et al., 2014).

We use tintinnid ciliates as a model to compare morphologies and molecules. The lorica is the main character used to delimit tintinnid morphospecies, which agree partially with OTUs defined by ribosomal DNA sequences (Santoferrara et al., 2013). Tintinnids are major consumers of phytoplankton, and their feeding ecology relates to morphology given that prey size is determined by the diameter of the lorica aperture (Dolan et al., 2013a, 2013b). Biogeographical patterns based on worldwide morphological data are known for about half of the 75 extant genera, including both restricted (coastal versus oceanic and warm-temperate versus polar) and cosmopolitan examples (Dolan and Pierce, 2013). Indeed, assemblages of tintinnid morphospecies differ markedly in the bathymetric, latitudinal and vertical profiles (for example, Alder, 1999; Modigh et al., 2003; Santoferrara and Alder, 2012), as well as in the seasonal cycle of temperate coasts (for example, Bojanić et al., 2012). Structuring of morphospecies assemblages has been explained by environmental selection in a coastal site (Sitran et al., 2009) or random dispersal in open waters (Dolan et al., 2007, 2009, 2013b), but the processes that affect assembly in the transitions between environments are unknown. Likewise, it is not known whether patterns and processes based on morphologies and molecules agree, as few distribution studies have used DNA sequences to target tintinnids and aloricate sister lineages (Doherty et al., 2010; Tamura et al., 2011; Bachy et al., 2014; Grattepanche et al., 2015).

We studied the fluctuation of the tintinnid assemblage in a coast-to-ocean gradient of the NW Atlantic Ocean (sites from ca. 30 to 1000 m deep) using microscopy and HTS. First, we questioned whether morphospecies and OTUs agree in terms of assemblage composition, distribution patterns and correlation with environmental variables. Given the complex relationship between morphologies and molecules, our hypothesis was that the two approaches would provide different answers. Second, we explored which ecological processes may explain assembly in the transition between coast and ocean, under the hypothesis that the environmental gradient across shelf waters selects assemblage members.

Materials and methods

Sampling

We sampled 23 stations regularly separated by 7.4 km in a transect from 13 to 176 km off the coast of Rhode Island (USA) during 6–9 July 2012 on board the R/V Cape Hatteras (Figure 1a; Grattepanche et al., 2015). Vertical profiles of temperature, salinity, oxygen concentration and chlorophyll fluorescence were measured with a Seabird CTD profiler mounted on a rosette. Seawater was collected with Niskin bottles at four depths: surface, pycnocline, chlorophyll maximum depth (CMD) and deepest (ca. 5 m above the bottom, except for stations 21–23 sampled at 100–150 m). From the CMD, 250 ml of seawater was concentrated on glass microfiber GF/F filters, which were stored in the dark at −80 °C until high-performance liquid chromatography (HPLC) analysis of phytoplankton accessory pigment composition and concentration. Three pigment-based ratios were estimated as proxies for the relative abundances of micro-, nano- and pico-phytoplankton (Supplementary Table S1).

Figure 1

The environmental gradient analyzed. (a) Stations analyzed by microscopy only (circle) or microscopy and HTS (square; modified from Grattepanche et al., 2015). (b) Principal component analysis (PCA) showing a coast-to-ocean gradient (represented by clearer-to-darker colors) at four depth layers. For each environmental variable, eigenvectors and their highest values on the corresponding PCs are shown. (c–e) HPLC-estimated chlorophyll a concentration and phytoplankton size fraction ratios at the CMD, and mesozooplankton biomass in the water column, respectively. Linear fits and significant Pearson’s coefficients (R, P<0.05) against the distance to the coast are shown.

From each of the four depths sampled, two series of samples were collected for ciliates. For microscopy, 250 ml of seawater was preserved with acid Lugol’s solution (2% f.c.). For molecular analysis, 1 l of seawater was screened through an 80-μm mesh to remove metazoa, and then sequentially filtered through 10 μm and 2 μm polycarbonate filters, which were stored in 0.5 ml of DNA buffer at 4 °C. Additionally, surface samples collected with a 20-μm-mesh net were preserved with non-acid Lugol’s solution (2% f.c.) for single-cell sequencing. To estimate the biomass of potential predators at each station, a 150-μm-mesh net was towed vertically from the deepest waters reached by the rosette to the surface. Samples were fixed with formalin (10% f.c.), collected on glass fiber filters in duplicate, dried at 60 °C for 24 h and weighed.

Microscopy and single-cell sequencing

Tintinnids were studied with an inverted microscope. For each of 91 acid Lugol’s samples, 150 ml was settled and entirely surveyed. Loricae containing their cell were counted, measured and identified following original descriptions and the work of Alder (1999). Morphospecies found in non-acid Lugol’s samples were isolated and sequenced using the method by Santoferrara et al. (2013). The small subunit ribosomal DNA (SSU rDNA) sequences from 10 morphospecies, 5 of them newly sequenced, were deposited in NCBI GenBank (accession codes KT792924 to KT792933; Supplementary Results).

HTS and sequence analysis

HTS focused on 28 samples corresponding to the four depths sampled at seven stations regularly distributed across the continental shelf (Figure 1a; Supplementary Table S2). DNA was extracted from one-half of each filter and only samples from the 10-μm filters were used in this study given the prevalence of known tintinnid species above this size. Library preparation, HTS, sequence quality filtering and OTU identification were based on published methods (Santoferrara et al., 2014; see details in the Supplementary Methods). In brief, an SSU rDNA region was amplified with primers specific for tintinnids, other choreotrichs and oligotrichs (Tamura et al., 2011). Then, five independent PCR products per sample were pooled and sequenced (454 Life Sciences, Roche, Branford, CT, USA). Pyrosequences were deposited in NCBI Sequence Read Archive (accession code SRP064473).

QIIME v. 1.8.0 (Caporaso et al., 2010) was used for quality filtering and OTU clustering. OTU clustering was done de novo using the UCLUST algorithm (Edgar, 2010) and a 100% similarity cutoff. Chimeras were identified de novo with UCHIME as implemented in USEARCH (Edgar et al., 2011) and removed. OTUs derived from only five or fewer reads, as well as OTUs that occurred in only one sample, were excluded as potential errors. Non-tintinnid OTUs were removed (Supplementary Methods).
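
The read-count and occurrence filter described above can be illustrated with a minimal sketch. It is not the QIIME workflow used in this study; the table layout (OTU identifier mapped to per-sample read counts), the function name and the example values are hypothetical and serve only to show the two exclusion rules.

```python
from typing import Dict

# Hypothetical OTU table: OTU id -> {sample id -> read count}.
OtuTable = Dict[str, Dict[str, int]]


def filter_otus(table: OtuTable, min_reads: int = 6, min_samples: int = 2) -> OtuTable:
    """Drop OTUs with five or fewer reads in total, or found in only one sample."""
    kept = {}
    for otu, counts in table.items():
        total_reads = sum(counts.values())
        n_samples = sum(1 for c in counts.values() if c > 0)
        if total_reads >= min_reads and n_samples >= min_samples:
            kept[otu] = counts
    return kept


# Illustrative values only (not data from this study).
example = {
    "otu_a": {"s1": 40, "s2": 12},  # enough reads, two samples: kept
    "otu_b": {"s1": 3, "s2": 2},    # five reads in total: dropped
    "otu_c": {"s1": 50, "s2": 0},   # present in one sample only: dropped
}
filtered = filter_otus(example)
```

Both thresholds are exposed as parameters because stricter or looser cutoffs trade sensitivity to rare taxa against retention of potential sequencing errors.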

A self-curated reference database of morphospecies sequences was used to identify putative tintinnid OTUs in two approaches. First, the tintinnid OTUs were combined with the reference sequences and used in a phylogenetic approach. Sequences of aloricate choreotrichs, oligotrichs and stichotrichs were added as outgroups. An alignment was done with MUSCLE (Edgar, 2004), and refined with Clustal Omega (Sievers et al., 2011). Positions containing more than 70% gapped characters were removed. A Maximum Likelihood tree was built with RAxML (Stamatakis, 2006) using 1000 bootstrap replicates and the GTR model of evolution with a gamma model of rate heterogeneity and a proportion of invariable sites, as previously identified with jModelTest under the Akaike Information Criterion (Darriba et al., 2012). In the second approach, the tintinnid OTUs were contrasted against the reference sequences using BLASTN (Altschul et al., 1990). Unidentified environmental sequences in GenBank were excluded from the analyses, except for the BLASTN comparison of OTU groups that did not include any known species.
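
The removal of alignment positions with more than 70% gapped characters can be sketched as follows. This is an illustration under simple assumptions (sequences of equal length, gaps coded as "-"), not the tool actually used for trimming; the function name and the toy alignment are hypothetical.

```python
from typing import List


def drop_gappy_columns(alignment: List[str], max_gap_fraction: float = 0.70) -> List[str]:
    """Remove columns whose fraction of '-' characters exceeds max_gap_fraction."""
    if not alignment:
        return []
    n_seqs = len(alignment)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    keep = [
        i for i in range(length)
        if sum(seq[i] == "-" for seq in alignment) / n_seqs <= max_gap_fraction
    ]
    return ["".join(seq[i] for i in keep) for seq in alignment]


# Toy alignment: the third column is 100% gaps and is removed;
# the second column is 50% gaps and is retained.
toy = ["AC-T", "A--T", "A--G", "AG-T"]
trimmed = drop_gappy_columns(toy)
```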

Sample comparison and multivariate analyses

To compare HTS data among samples, the number of sequences per sample was standardized to the minimum obtained (4783 sequences; station 4, surface) using a random sub-sampling without replacement in QIIME (Caporaso et al., 2010). Standardization was not needed among microscopy samples, because identical volumes were always examined. However, inherent technical differences prevent accurate standardization of sampling effort between HTS and microscopy (Supplementary Figure S1; see below).
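
Random sub-sampling without replacement to a common depth can be illustrated with the sketch below. It is not QIIME's implementation; the function name, the seed handling and the example counts are assumptions made for illustration.

```python
import random
from typing import Dict


def rarefy(counts: Dict[str, int], depth: int, seed: int = 0) -> Dict[str, int]:
    """Draw `depth` reads without replacement from one sample's OTU counts."""
    # Expand the counts into one entry per read, then sample from that pool.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    rarefied: Dict[str, int] = {}
    for otu in rng.sample(pool, depth):
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


# Illustrative sample with 100 reads, sub-sampled to 40 reads.
sample = {"otu_a": 70, "otu_b": 25, "otu_c": 5}
sub = rarefy(sample, depth=40)
```

In practice the depth would be set to the smallest library size among the samples compared (4783 sequences here), so that every sample contributes the same number of reads.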

Multivariate analyses of HTS data were done with QIIME (Caporaso et al., 2010) or with R using the vegan package (Oksanen et al., 2014); the latter was used also for microscopy and environmental data. Ordination of all samples based on normalized environmental data (temperature, salinity, oxygen concentration and chlorophyll fluorescence) was done with a principal components analysis (PCA) based on Euclidean distance. Correlation between the distance to the coast and HPLC-estimated chlorophyll a concentration and phytoplankton size fraction ratios, as well as mesozooplankton biomass, was tested with the Pearson’s coefficient. Ordination of samples based on tintinnids was done by non-metric multidimensional scaling using Bray-Curtis dissimilarity matrices of either morphospecies or OTUs. Additional ordinations were done by principal coordinates analysis based on either weighted or unweighted Unifrac distance matrices of OTUs. Clustering of samples based on tintinnids was done with each of these matrices using the unweighted pair-group method with arithmetic averages and 1000 replicates for jackknife analyses. The main morphospecies and OTUs driving Bray-Curtis dissimilarities among sample groups were identified.
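
For readers unfamiliar with the Bray-Curtis index, the sketch below computes the dissimilarity between two samples as the sum of absolute abundance differences divided by the total abundance in both samples. It is a plain-Python illustration, not the QIIME or vegan code used here; the function name and the example abundances are hypothetical.

```python
from typing import Dict


def bray_curtis(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Bray-Curtis dissimilarity between two samples (taxon -> abundance)."""
    taxa = set(a) | set(b)
    numerator = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    denominator = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    if denominator == 0:
        raise ValueError("both samples are empty")
    return numerator / denominator


# Illustrative abundances only.
inshore = {"sp1": 6, "sp2": 4}
offshore = {"sp1": 4, "sp2": 6}
d = bray_curtis(inshore, offshore)  # (2 + 2) / 20
```

The index ranges from 0 (identical composition and abundances) to 1 (no shared taxa), and applying it to presence/absence data only requires recoding abundances as 0 or 1.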

Overall correlations between Bray-Curtis β-diversity and matrices of pairwise environmental distance (normalized Euclidean distance), horizontal distance (between stations) or vertical distance (between depths sampled) were tested using simple and partial Mantel tests with 10 000 permutations. Significance levels were corrected for multiple comparisons. The best subset of environmental variables producing the maximum correlation with β-diversity was estimated (BIOENV test).
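
The logic of a simple Mantel test can be sketched as follows: the Pearson correlation between the upper triangles of two distance matrices is compared with correlations obtained after jointly permuting the rows and columns of one matrix. This is a one-sided illustration only; it does not cover the partial Mantel test or the multiple-comparison correction, it is not the implementation used in this study, and the function names and the default of 999 permutations (rather than 10 000) are assumptions.

```python
import random
from math import sqrt
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(m: Matrix, order: Sequence[int]) -> List[float]:
    """Upper-triangle entries of m after reordering rows and columns by `order`."""
    n = len(order)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(d1: Matrix, d2: Matrix, permutations: int = 999, seed: int = 0) -> Tuple[float, float]:
    """Return (r, p) for a one-sided simple Mantel test (H1: r > 0)."""
    n = len(d1)
    identity = list(range(n))
    x = upper_triangle(d1, identity)
    r_obs = pearson(x, upper_triangle(d2, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute sample labels of the second matrix
        if pearson(x, upper_triangle(d2, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example: five samples along a line; the second matrix is a scaled copy
# of the first, so the correlation is perfect.
positions = [0, 1, 2, 3, 4]
geo = [[abs(a - b) for b in positions] for a in positions]
dissim = [[2 * v for v in row] for row in geo]
r, p = mantel(geo, dissim)
```

Permuting whole rows and columns, rather than individual matrix entries, preserves the dependence among the pairwise distances that share a sample, which is why the Mantel procedure is used instead of an ordinary correlation test.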

Results and Discussion

Environmental conditions

The environmental variables we measured showed a gradient from coastal to oceanic waters (Figure 1). The PCA showed continuous changes both horizontally and in the vertical profile (PC1 and 2, respectively; Figure 1b). These changes were linked mainly to a decrease in chlorophyll fluorescence and oxygen concentration from inshore to offshore, a decrease in temperature from surface to the deepest waters, and an increase in salinity in the two dimensions (Figure 1b). Distance to the coast was significantly correlated with a decrease in chlorophyll a concentration and changes in phytoplankton size fraction proxies (the microphytoplankton ratio decreased as the nanophytoplankton ratio increased) from inshore to offshore in the CMD (Figures 1c and d) as well as with a decrease in mesozooplankton biomass in the water column (Figure 1e). These trends suggest a productivity decrease toward open waters and a switch in phytoplankton size fractions in relation to the vertical structure of the water column (Supplementary Figure S2A), in agreement with known dominance of larger cells in coastal mixed waters and smaller cells offshore (for example, Li, 2002).

Comparison of tintinnid morphospecies and OTUs

By contrasting morphospecies and OTUs in a coast-to-ocean gradient from surface to bottom waters, we provide a framework to explore some of the strengths and weaknesses of HTS in environmental surveys and expand the spatial scale of tintinnid assemblage comparisons by modern and classical methods (Bachy et al., 2013; Santoferrara et al., 2014). An inherent limitation of such comparisons is the difficulty in standardizing sampling effort between methods due to the different steps involved in each procedure (Supplementary Figure S1). For microscopy, a precisely measured volume of sample is settled and examined (generally based on cell density and work effort). For HTS, an unknown portion of the sample volume ends up contributing to the final data set: only a fraction of extracted DNA is used for non-quantitative PCR, then only a fraction of amplified DNA is used for HTS, and then only a fraction of sequences is kept after bioinformatic quality filtering. Although HTS is (ideally) designed for representativeness and sensitivity, the volume actually examined is not exactly comparable to the one studied by microscopy, even starting with identical sample sizes. This prevents the strict comparison of α-diversity values between both techniques (Magurran, 2003), which is also limited by other factors (for example, different susceptibility to cell destruction during preservation or filtration; Santoferrara et al., 2014). Still, we were able to extract valuable information by evaluating the overall performance of each method and comparing a) assemblage composition, b) distribution patterns and c) correlations with environmental variables in terms of morphospecies and OTUs.

a) Assemblage composition

We found a taxonomic composition that was less diversified in terms of morphospecies than OTUs (Figure 2). Among the 28 samples analyzed by both approaches, morphospecies and OTUs shared four taxonomic groups (Steenstrupiella/Amphorides, Salpingella/Amphorellopsis, Eutintinnus and other tintinnids), but there were four additional OTU groups. Of these OTU groups, three may correspond to microscopically-observed species with unknown sequences or to novel taxa (S1, S2 and Eu); the fourth is known (Tintinnidium) but was detected in very low proportion (0.04% reads). Furthermore, of the 30 morphospecies detected by microscopy, 17 have known SSU rDNA sequence and all of them were found also by HTS. There were no cases of morphospecies with known sequence detected by microscopy but not by HTS. In contrast, of the 509 putative tintinnid OTUs detected, 23 are identical to sequences of known morphospecies, the 17 above and 6 additional OTUs (<0.2% reads) that were not found by microscopy (Figure 2, Supplementary Tables S4 and S5). Although differences in species detection may be related to sample volumes (Supplementary Figure S1), our HTS approach showed high sensitivity for tintinnid studies.

Figure 2

Tintinnid assemblage composition by microscopy and HTS. Left, sketch of a RAxML tree based on sequences of known morphospecies and pyrosequenced OTUs (Supplementary Figure S4). Underlined species were newly sequenced (Supplementary Results, Supplementary Figure S3, Supplementary Table S3). Circles indicate nodes with bootstrap support >50%, although the short sequence analyzed mostly prevents statistically-relevant inferences. Morphospecies of known sequence detected by both HTS and microscopy are highlighted in yellow, based on both identity in the RAxML tree and 100% similarity in BLASTN (¹99.7% similarity, equivalent to one base substitution). BLASTN results are shown for three groups of OTUs that did not include any sequence in GenBank (despite their similarity to known genera, these groups may belong to other taxa according to the RAxML tree) and for the only dominant OTU with unknown morphology but 100% similar to an environmental GenBank sequence. Right, bubble plots representing relative abundance of cells or pyrosequences (values ≥3% are indicated). Arrows indicate taxa with contribution ≥3% in both approaches (black), only by microscopy (white), or only by HTS (gray). Microscopy results are shown for ²28 samples also studied by HTS and ³91 total samples. Not detected (−), unknown sequence or morphology (?).

Relative abundances agree only partially for morphospecies and OTUs (Figure 2). Nine morphospecies and nine OTUs were dominant (≥3% cells or reads, respectively), but only four of them were common to both approaches (black arrows in Figure 2). An effect of either species dimensions compared with the pore size used for sample filtering or primer mismatches is not evident, but there could be other PCR bias (for example, selective amplification of closely-related species), taxon-specific susceptibility to DNA extraction or variability in the copy number of the rDNA operon (Medinger et al., 2010; Heywood et al., 2011). This supports known inconsistencies in relative abundances estimated by HTS and microscopy for ciliates and other protists (Bachy et al., 2013; Egge et al., 2013; Santoferrara et al., 2014; Stoeck et al., 2014).

The total number of OTUs we found (Figure 2) represents almost half of the number of described tintinnid species (ca. 1200). Although many of the unidentified OTUs may be real (for example, cryptic or rare species not detected in the microscope), others may be artifactual variants of the dominant OTUs, which cause richness inflation in the HTS data. Each dominant OTU (generally correlated to the distribution of their corresponding morphospecies, if known) was associated with multiple, closely-related OTUs (that is, same cluster in RAxML tree and similarity >99%) that usually occurred in the same samples but with many fewer reads (Figures 3a–e). These variants may result from the known intra-individual and intra-specific variability in tintinnid SSU rDNA (Gong et al., 2013; Santoferrara et al., 2013) or from HTS errors (Reeder and Knight, 2009), in part retained in the data due to our OTU clustering at 100% similarity. A lower cutoff is not adequate for our amplicon because that can cause sequences of separate known species to be clustered into the same OTU (Grattepanche et al., 2014; Santoferrara et al., 2014).

Figure 3

Occurrence of dominant OTUs and their corresponding closely-related OTUs. Some identified OTUs co-occurred with multiple, much less abundant closely-related OTUs (a–d). OTU-4245 from the group S2 co-occurred with all but 12 of its closely-related OTUs (e). In the case of Salpingella OTUs (f), four dominant (≥3% reads) and two moderately-abundant (1–2% reads) OTUs were closely-related but had different distributions (OTU-4040 was found in most samples, OTU-6989 peaked in the sample 7d and OTUs 5650, 3982, 2810 and 4106 were detected only in stations 12–23); there were also multiple, scarce, close OTUs. The distribution of dominant OTUs was also compared with their corresponding morphospecies, if known (r=Spearman’s coefficient; P<0.01**).

Our 100% OTU clustering strategy, however, did allow for the detection of potentially novel diversity within known genera. For example, closely-related Salpingella OTUs that had high abundances in distant samples could represent different species (Figure 3f). These OTUs could not be linked with the Salpingella spp. observed in the microscope, because their sequence is unknown. Furthermore, some morphospecies with small loricae (diameter 4–14 μm) were difficult to differentiate, did not match any described species (Supplementary Figure S3, Supplementary Table S4) and were probably underestimated by both microscopy (Figure 2) and HTS (as we used DNA samples from 10-μm filters). Also in the Mediterranean, where tintinnids have been characterized by morphology and molecules, some Salpingella OTUs are only detected by environmental sequencing (Bachy et al., 2013). Thus, Salpingella may include closely-related species that cannot be discriminated by light microscopy. Cryptic species with non-overlapping distributions possibly linked to niche separation have been detected in tintinnids (Santoferrara et al., 2015) and may be more common than expected.

b) Distribution patterns

Morphospecies and OTUs captured similar patterns of spatial distribution: inshore and offshore sample groups were detected by all the analyses done (Figure 4, Supplementary Figure S6). A deep sample group was detected using morphospecies and non-quantitative OTU data (Figures 4a and c; Supplementary Figure S6A and C), but it was not shown by quantitative OTU data (Figures 4b and d; Supplementary Figure S6B and D). The inconsistent detection of the deep group (mostly the deepest samples from stations 14 to 23) may be related to the lower number of samples examined by HTS compared with microscopy, although an influence of sample reduction was not expected given that morphological findings were very similar for the 28 samples studied by both methods and the total 91 samples (Figure 2). A more likely explanation for differences is the impact of biased HTS quantification on β-diversity (Lozupone and Knight, 2008). Using phylogenetic information, unweighted (non-quantitative) and weighted (quantitative) Unifrac ordinations were significantly correlated (Procrustes analysis, m²=0.75, P=0.001), but the latter was less consistent even in separating inshore and offshore samples (Figures 4c and d). Bray-Curtis (non-phylogenetic, quantitative) analyses showed an intermediate performance, with inshore and offshore separation in the ordination analysis (Figure 4b), but less support according to cluster analysis (Supplementary Figure S6B). Despite the differences, two-thirds of the samples were always assigned to the same group regardless of the approach used (Supplementary Figure S7).

Figure 4

Ordination of samples based on morphospecies and OTUs. Results were compared with cluster analyses (Supplementary Figure S6) to establish sample groups: inshore (green), offshore (light blue) and deep (dark blue; not detected in b and d). Outliers and missing samples correspond to very low occurrence (<5 cells) or non-detection of tintinnids, respectively. (a, b) Non-metric multidimensional scaling based on Bray-Curtis similarity matrices of morphospecies and OTUs, respectively. (c, d) Principal coordinates analysis based on unweighted and weighted Unifrac distance matrices of OTUs, respectively. Samples examined by both microscopy and HTS or only by microscopy are indicated by black and gray circles, respectively. Samples are labeled with station number and depth (s=surface, p=pycnocline, c=chlorophyll maximum, d=deepest). In (a), a part of the ordination was magnified (incomplete line; see all labels in Supplementary Figure S5). In (c, d), the proportion of variance explained by each axis (Pc) is shown.

Tintinnid assemblages clearly differed among sample groups, despite some differences in the identities of the main morphospecies and OTUs driving the groupings (Figure 5). These results were driven by the most abundant taxa and reflected the previously mentioned differences in relative abundances between microscopy and HTS (Figure 2). However, because the most abundant taxa were also the most frequent within each group, they were still found as the main contributors to the grouping using presence/absence data (Figure 5). This supports the idea that rare OTUs (that is, with low abundance and occurrence, such as many of our potential artifacts) have a weak effect on β-diversity detection, unless they are studied separately from the abundant OTUs (Zinger et al., 2012; Logares et al., 2014; Lynch and Neufeld, 2015). Also, given that the detection of rare taxa is the most affected by sample size, presence/absence data suggest that our β-diversity findings were not strongly impacted by the different sample volume examined by microscopy and HTS (Supplementary Figure S1).

Figure 5

Assemblage composition in inshore, offshore and deep waters. (a, b) Proportion of cells and pyrosequences by morphospecies and OTUs, respectively, in each sample group (see values in Supplementary Table S6). Numbers on the bars indicate the proportion of group dissimilarity attributable to the relative abundance (or, in brackets, presence/absence) of each morphospecies or OTU. Only values >5% for OTU presence/absence or >10% for the rest are shown.

Our cross-shelf data supported patterns of tintinnid species ubiquitous in both coastal and open waters, as well as species typical of only one type of environment (Alder, 1999; Dolan and Pierce, 2013). Some morphospecies and OTUs occurred both inshore and offshore, and even in the deepest waters we examined (for example, Eutintinnus perminutus and the Salpingella OTUs 4040 and 6989; Supplementary Table S6). This, and a pattern of distinct inshore and offshore distributions, agrees with the findings for the most abundant choreotrich and oligotrich ciliates in the same samples (Grattepanche et al., 2015). The tintinnid morphospecies or OTUs we found inshore were nearly or completely undetectable offshore, and vice versa (Supplementary Table S6), with a boundary at approximately 35 km from shore, near the 50-m isobath (Station 4, Figure 1a). Limited distributions on the scale we analyzed (tens of km) do not mean that these species are spatially restricted in a global context. Indeed, our identified OTUs restricted to inshore and offshore waters were usually 100% identical to the same morphospecies sequenced in other coastal and oceanic environments, respectively (Supplementary Table S7). Although data on markers less conserved than SSU rDNA are still needed, this suggests ubiquitous dispersal within each kind of environment (Finlay, 2002). However, truly restricted species are also known (for example, the Antarctic Cymatocylis spp.; Alder, 1999), so tintinnids may support a model of microbial biogeography where some species are cosmopolitan and others are endemic (Foissner, 2008).

c) Correlation with environmental variables

Environmental factors explained tintinnid β-diversity according to the Mantel tests, although significant results were more frequent for morphospecies than for OTUs (Table 1). Because environmental variables can covary with spatial distance in both the horizontal and vertical dimensions (Martiny et al., 2006), we controlled distance effects using partial Mantel tests, which still indicated significant results for morphospecies, but not for OTUs (Table 1). Compared with morphospecies, the weaker OTU results may be linked to their lower consistency to detect β-diversity patterns in our analyses (see previous section). This may also explain the unclear relationship between environmental variables and OTU distribution found for choreotrichs and oligotrichs (Doherty et al., 2010; Tamura et al., 2011; Grattepanche et al., 2015).

Table 1 Simple and partial Mantel tests for the correlations of tintinnid assemblage dissimilarity versus environmental, horizontal or vertical distance (Pearson’s coefficient=R)

Ecological processes linked to spatial patterns

The variables that best explained significant relationships between β-diversity and environmental variables in the CMD were proxies for food resources (chlorophyll a concentration and microphytoplankton ratio), which showed a higher relevance than hydrographic variables (temperature, salinity and oxygen) or the biomass of potential predators (Table 1). Although we estimated food composition proxies only at the CMD, the general results also support a role of potential prey as β-diversity drivers. The morphospecies associations in each sample group differed in the size of the lorica aperture (that is, oral diameter), considering either the CMD or all samples (Figure 6). Given that lorica oral diameter and prey size are correlated and tintinnids can consume prey as large as about half of their oral diameter (Dolan, 2010; Dolan et al., 2013b), the species that dominated inshore and offshore would consume mainly microplankton and nanoplankton, respectively (Figure 6). Thus, competitive exclusion by dominant species could be linked to the changes in phytoplankton size fraction ratios estimated in the CMD and expected across the shelf as vertical mixing decreases (Figure 1d, Supplementary Figure S2A). Small species from deep waters may even consume bacteria (Figure 6), which would be an alternative resource given the scarcity of phytoplankton below the photic zone (Supplementary Figure S2B; Grattepanche et al., 2015).
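
The reasoning that links lorica oral diameter to the size fraction of potential prey can be made explicit with a small sketch. The ratio of 0.5 is the approximate upper limit cited above; the size-class boundaries (picoplankton below 2 μm, nanoplankton from 2 to below 20 μm, microplankton from 20 μm) are the conventional plankton size classes and are an assumption here, as are the function names.

```python
def max_prey_size_um(oral_diameter_um: float, ratio: float = 0.5) -> float:
    """Approximate largest prey size (um) for a given lorica oral diameter."""
    return oral_diameter_um * ratio


def size_class(cell_size_um: float) -> str:
    """Conventional plankton size class for a cell size in micrometres (assumed)."""
    if cell_size_um < 2:
        return "pico"
    if cell_size_um < 20:
        return "nano"
    return "micro"


# Oral diameters within the ranges that prevailed inshore and offshore.
inshore_class = size_class(max_prey_size_um(45))   # 22.5 um
offshore_class = size_class(max_prey_size_um(35))  # 17.5 um
```

Under these assumptions, oral diameters of 41–50 μm give an upper prey size within the microplankton range, while 31–40 μm gives one within the nanoplankton range, which matches the inshore-to-offshore switch in dominant phytoplankton size fractions.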

Figure 6

Proportion of cells by their corresponding lorica oral diameter. Loricae with oral diameters of 41–50, 31–40 and ≤30 μm prevailed in the inshore, offshore and deep sample groups, respectively. Size-scaled pictures exemplify size fractions and main taxonomic groups in Figure 2 (from the largest to the smallest: Stenosemella, Amphorides, Eutintinnus and Salpingella; loricae >50 μm excluded).

Consequently, it is possible that tintinnid distribution is structured by the use of different food resources across the shelf. This supports environmental selection, particularly food partitioning, as a mechanism for assemblage structuring in planktonic ciliates (Sitran et al., 2009; Claessen et al., 2010; Wickham et al., 2011). To further explore the processes that structure tintinnid assemblages, we used the distance–decay relationship, which seeks to correlate assemblage dissimilarity with distance between samples (Hanson et al., 2012; Nemergut et al., 2013). Mantel tests indicated that assemblage dissimilarity increased significantly as horizontal distance between samples increased (although CMD results became non-significant when controlling for environmental variables), while it was not significantly affected by vertical distance (Table 1; Supplementary Figure S8). Given that dispersal between neighboring inshore and offshore waters is potentially unlimited, the distance–decay relationship across the shelf supports contemporary environmental selection as more important than evolutionary processes in assemblage structuring (Hanson et al., 2012; Monier et al., 2015). Even if water mixing is expected to transport species across the shelf, environmental selection could prevent establishment (growth) when conditions are non-optimal, and thus limit the successful dispersal (colonization) in the new environment. An additional or alternative process operating on the shelf may be drift (for example, random dispersal), especially in determining the distribution of ecologically similar species within inshore, offshore and deep waters (Dolan et al., 2007, 2009).

Conclusions

Combining DNA sequences with morphological data (by parallel microscope examination and/or via barcoded morphospecies included in reliable databases) is critical to realize the full extent of protist diversity. Contrary to our expectations, OTUs and morphospecies show a general agreement in assemblage composition, distribution patterns and relationships with environmental variables, with the exception of known biases in HTS quantification. Data integration, however, is crucial because each approach provides unique insights. On one hand, molecular OTUs detected by HTS include hidden diversity, even within well-characterized taxa. These OTUs comprise real species: all the ones observed in the microscope, and also others represented only by few environmental sequences. Furthermore, the distribution of unique OTUs suggests that cryptic species are not distinguished by regular morphological and molecular analyses, unless custom bioinformatics is used (for example, 100% OTU clustering for our tintinnid amplicon). Remaining HTS uncertainties (for example, intra-specific variability and sequencing errors) will be lowered as reference databases become populated with known DNA sequences, thus consolidating HTS usefulness to study, for example, the rare biosphere. On the other hand, morphological characterization helps to understand the processes that structure assemblages, for example, by associating distribution with potential food resources. Our cross-shelf data allow tracking assemblage fluctuations over an environmental gradient and support food partitioning as a mechanism for community assembly in marine plankton. Tintinnids exemplify the necessity of linking data from genotypes (for example, SSU rDNA and more variable markers) and phenotypes (for example, morphology, physiology and ecology) in order to understand the patterns and processes of microbial biogeography.