Main

Recent advancements in high-throughput sequencing of marker genes, such as the 16S rRNA gene, have provided microbial ecologists with the tools to accurately infer the relative composition of microbial communities (Franzosa et al., 2015). This has resulted in widespread application of the technology in longitudinal studies where shifts in community structure are related to environmental variables and functional outputs (Faust et al., 2015; Wilhelm et al., 2015). An inherent limitation of the sequencing technology is that the calculated taxon abundances are relative values (Widder et al., 2016). Hence, caution must be taken with the biological interpretation of these values, since inter-sample differences in cell density are not considered. To our knowledge, there are no descriptive studies that assess the extent to which relative abundances deliver a skewed image of the actual microbial community dynamics. In this study, we combined robust cell density measurements from flow cytometry (Prest et al., 2013; Van Nevel et al., 2013) with the relative abundances derived from 16S rRNA gene amplicon sequencing. We performed two extensive longitudinal surveys on the central water reservoir of a cooling water system. This engineered freshwater ecosystem was subjected to highly controlled operational phases (Supplementary Information and data set). We quantified the absolute taxon abundances and assessed whether additional insights could be attained with the combined approach.

Based on the sample-specific total cell density, the absolute taxon abundances were calculated for each time point. Individual taxon densities ranged from 0.5 to 1 679 cells per μl. Several inter-taxon differences became apparent when ordinary least squares regression analysis was performed between the relative and absolute abundances. We focused on the three most abundant taxa, which fully represented two distinct freshwater clades in the community (Figure 1; Newton et al., 2011). We identified a significant difference between OTU1 (betI-A clade) and both OTU2 and OTU3 (bacI-A clade taxa; P<0.05); however, no significant difference was observed between OTU2 and OTU3 (P=0.51). These findings suggest that identical relative abundances of different taxa may require taxon-dependent biological interpretation because they do not necessarily reflect the same absolute abundances. To further verify the limitations of relative abundances, we closely inspected the temporal trajectories of both clades (Figure 2). Throughout the two surveys, the betI-A clade (OTU1) displayed similar variation in relative abundance (coefficient of variation (CVrel)=54%) and cell density (CVdens=50%), whereas the bacI-A clade (OTU2 and OTU3) displayed distinct transient behaviour (CVrel=124%, CVdens=172%). Overall, both surveys were characterized by dynamic shifts in community composition and density that were interspersed with stable periods (that is, beginning and end of each survey).
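The calculation described above can be illustrated with a minimal sketch. It assumes that the absolute abundance of a taxon is its relative abundance multiplied by the sample-specific total cell density, and that the coefficient of variation is the sample standard deviation divided by the mean. The function names and all numerical values are hypothetical and are not taken from the study data.

```python
from statistics import mean, stdev


def absolute_abundance(relative_abundance, total_density):
    """Absolute taxon density (cells per μl) = relative fraction x total cell density."""
    return relative_abundance * total_density


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, expressed in %."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical time series for a single OTU (not study data).
relative = [0.50, 0.50, 0.50, 0.50]            # fraction of reads per sample
total_density = [100.0, 200.0, 400.0, 800.0]   # total cells per μl per sample

absolute = [absolute_abundance(r, d) for r, d in zip(relative, total_density)]

cv_rel = coefficient_of_variation(relative)    # variation in relative abundance
cv_dens = coefficient_of_variation(absolute)   # variation in absolute abundance

print(absolute)
print(round(cv_rel, 1), round(cv_dens, 1))
```

In this constructed case the relative abundance is perfectly stable while the absolute abundance increases eightfold, so the two coefficients of variation diverge even though they describe the same taxon.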

Figure 1

Scatter plot of the absolute and relative abundance of the three most abundant OTUs registered at 79 time points throughout two time-separated 40-day surveys of a secondary cooling water circuit that operates on a nuclear test reactor. The variance in the relation between absolute and relative abundances increases at elevated values (Breusch–Pagan test, P<0.0001). OTU1 belongs to the betI-A clade. OTU2 and OTU3 belong to the bacI-A clade. Coloured dashed lines depict ordinary least squares regression lines for each OTU. These regressions were used solely for statistical inference and do not necessarily represent the optimal predictive models for these data.
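For readers unfamiliar with the heteroscedasticity test named in this caption, the following sketch shows one possible form of it. It assumes the studentized (Koenker) variant of the Breusch–Pagan test with a single regressor, in which the statistic is n times the R² of a regression of the squared residuals on the predictor; the caption does not state which variant was applied. The function names and the data are hypothetical.

```python
from math import erfc, sqrt


def ols(x, y):
    """Return (intercept, slope) of a simple ordinary least squares fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(x, y):
    """Coefficient of determination of the simple OLS fit of y on x."""
    b0, b1 = ols(x, y)
    my = sum(y) / len(y)
    ss_res = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan(x, y):
    """Studentized Breusch-Pagan test (assumed variant), one regressor.

    Returns (LM statistic, p-value), where LM = n * R^2 of the auxiliary
    regression of squared residuals on x, compared with chi-square (1 df).
    """
    b0, b1 = ols(x, y)
    squared_residuals = [(b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y)]
    lm = len(x) * r_squared(x, squared_residuals)
    p_value = erfc(sqrt(lm / 2.0))  # chi-square survival function for 1 df
    return lm, p_value


# Hypothetical data in which the scatter around the trend grows with x.
x = [float(i) for i in range(1, 21)]
y = [xi * (12.0 if i % 2 else 8.0) for i, xi in enumerate(x)]

lm, p_value = breusch_pagan(x, y)
print(lm, p_value)
```

A small p-value indicates that the residual variance depends on the predictor, which is the pattern the caption reports for elevated abundance values.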

Figure 2

Temporal dynamics for taxa of the two most abundant freshwater clades (that is, bacI-A (OTU2, red; OTU3, orange) and betI-A (OTU1, blue)) during two time-separated 40-day surveys of a secondary cooling water circuit that operates on a nuclear test reactor. The top panel displays the relative abundances (in %) inferred from the 16S rRNA gene amplicon sequencing data. The bottom panel displays the absolute OTU abundances (in cells per μl) and the circle labels represent the total cell density of the microbial community (in cells per μl±s.d.). Horizontal stacked bars highlight different phases of the system during surveillance. Grey zones indicate time periods where the cooling water system was not in operation (control phases), green zones indicate the start-up and blue zones indicate steady-state operation.

When interpreting the temporal trajectories of the relative and absolute abundances, two primary discrepancies could be detected, either of which could have led to misinterpretation had conclusions been based solely on the relative abundances. First, during periods of community growth, such as the start-up and early reactor operation in survey 1, there was a well-defined transition in absolute abundance, showing the systematic outgrowth and decay of the bacI-A clade. In contrast, the relative abundance profiles were more ambiguous. They remained relatively constant during the growth and decay event and only conclusively indicated the beginning and the end. Second, a striking discrepancy could be observed in the second half of reactor operation during survey 2. The relative abundance profiles displayed an increase from approximately 40% to >90%, potentially indicating a selective outgrowth of the betI-A clade. When the absolute abundances were considered instead, there was no clear pattern of active outgrowth visible for this clade; its cell densities never considerably exceeded the maximum density that was observed at the end of the start-up phase (relative abundance=57%). This suggests that the decay of other taxa is responsible for its enrichment within the community structure, or that environmental constraints limit its effective outgrowth. Overall, the dynamics of the betI-A and bacI-A clades could be more precisely specified with the absolute abundance profiles. Bulk cell density measurements were, in themselves, a poor descriptive parameter of the microbial community dynamics (Supplementary Figure S1). Only one OTU’s relative abundance (OTU2) was strongly correlated with the total cell density (Pearson’s correlation: rp=0.60, P<0.01, n=79), while the mean correlation strength for the entire community composition was −0.08±0.15 (n=427). This shows that only for OTU2 did outgrowth (an increase in absolute abundance) frequently coincide with enrichment (an increase in relative abundance).
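The distinction between enrichment and outgrowth can be made concrete with a small constructed example. The sketch below assumes a focal taxon whose cell density stays constant while the remaining community decays, and it computes Pearson's correlation between the focal taxon's relative abundance and the total cell density. All values and names are hypothetical and serve only to illustrate the reasoning; they are not study data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical densities in cells per μl at four time points.
focal = [400.0, 400.0, 400.0, 400.0]    # focal taxon: no outgrowth
others = [600.0, 400.0, 100.0, 40.0]    # remaining community: decay

total = [f + o for f, o in zip(focal, others)]
relative = [f / t for f, t in zip(focal, total)]

r_rel_total = pearson(relative, total)

print([round(r, 2) for r in relative])
print(round(r_rel_total, 2))
```

Here the relative abundance of the focal taxon rises from 40% to more than 90% without any change in its cell density, and its relative abundance is negatively correlated with the total cell density. Enrichment alone therefore cannot distinguish between active outgrowth and the decay of other taxa.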

Our results show that absolute quantification of taxon dynamics is essential and has the potential to shed additional light on many outstanding questions within microbial ecology. Next to flow cytometry, quantitative PCR (qPCR) and fluorescence in situ hybridization (FISH) may represent alternative approaches for estimating absolute cell densities. The tandem of qPCR and sequencing may be appealing because qPCR and amplicon sequencing analyses start from the same DNA extract and thus incorporate similar laboratory-induced bias. However, for environmental samples, qPCR is only sensitive enough to separate twofold changes in gene concentration (proxy for cell abundance; Smith and Osborn, 2009). qPCR also suffers from specific limitations such as amplification efficiency and primer specificity, which make it inadvisable to compare results between studies and even between assays on the same device (Smith and Osborn, 2009; Brankatschk et al., 2012). FISH provides a PCR-independent approach for calculating relative taxon abundances or, in case of a standardized methodological approach, even estimates of absolute abundances (Daims et al., 2001). Unfortunately, FISH analyses enumerate only the active fraction of the community because the analysis is based on the hybridization of fluorescent probes with the 16S rRNA (Amann and Fuchs, 2008). These analyses are also more laborious, and generally provide limited sample sizes (that is, hundreds of cells). In contrast, our flow cytometric enumeration approach of absolute cell densities is robust and high throughput, but it requires supervised denoising strategies to account for instrument and (in)organic noise, as well as cell aggregates (Materials and methods section, Supplementary Information).

Methodological limitations become particularly crucial when the central hypothesis pertains to the ‘rare microbiome’. This fraction is currently defined at arbitrary thresholds of 0.1 to 0.01% (Lynch and Neufeld, 2015). Although the definition of rare taxa will always remain partially based on ad hoc assumptions (Haegeman et al., 2013), several bioinformatics-based tools have shown the substantial impact of varying rarity thresholds on community analyses (Gobet et al., 2010). By also taking into account the absolute taxon densities, which are comparable between studies, the development of a more consistent framework may now prove possible. In this study, we did not take into account the number of 16S rRNA gene copies during the calculation of the absolute abundances because there were insufficient closely related reference genomes available (Supplementary Information and Supplementary Figure S2). For better-characterized environments, this additional normalization may improve the resolution of absolute abundance calculations (Langille et al., 2013; Stoddard et al., 2015). Overall, our results demonstrate that when united, robust cell density measurements and phylogenetic marker gene data are able to project a more comprehensive image of the compositional dynamics occurring in microbial ecosystems.
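The copy-number normalization mentioned above was not applied in this study. As a purely illustrative sketch of what such a correction could look like, the example below assumes that read counts are divided by a per-taxon 16S rRNA gene copy number before being rescaled to the total cell density. The function name, the taxon labels, the copy numbers and the read counts are all hypothetical.

```python
def copy_number_corrected_abundance(reads, copies, total_density):
    """Absolute abundances (cells per μl) after 16S copy-number correction.

    reads:  mapping of taxon label -> read count
    copies: mapping of taxon label -> assumed 16S rRNA gene copies per cell
    """
    # Convert reads to cell equivalents by dividing by the copy number.
    cell_equivalents = {otu: reads[otu] / copies[otu] for otu in reads}
    total_equivalents = sum(cell_equivalents.values())
    # Rescale the corrected proportions to the measured total cell density.
    return {
        otu: total_density * value / total_equivalents
        for otu, value in cell_equivalents.items()
    }


# Hypothetical sample (not study data).
reads = {"taxon_A": 600, "taxon_B": 400}
copies = {"taxon_A": 4, "taxon_B": 1}
total_density = 1100.0  # cells per μl

uncorrected = {
    otu: total_density * n / sum(reads.values()) for otu, n in reads.items()
}
corrected = copy_number_corrected_abundance(reads, copies, total_density)

print(uncorrected)
print(corrected)
```

In this constructed case the taxon with four gene copies per cell appears dominant before correction but is the less abundant of the two afterwards, which illustrates why the normalization could matter when reliable copy-number estimates exist.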

Data availability

Flow cytometry data (.fcs format) are available on the FlowRepository archive under repository ID FR-FCM-ZZNA and the Dryad Digital Repository (http://dx.doi.org/10.5061/dryad.m1c04). Sequences are available on the NCBI Sequence Read Archive (SRA) under accession number SRP066190.