Introduction

Slow sand filtration is one of the oldest and most effective means of treating drinking water for the control of microbiological contamination (for example, achieving >99% removal of enteric bacteria; Hijnen et al., 2007). Such purification is attributed to naturally occurring biochemical processes in the filters (for example, predation and bio-oxidation); however, these have not yet been comprehensively verified, mainly due to methodological limitations (Haig et al., 2011). Previously, Haig et al. (2014a) compared laboratory-scale filters to full-scale sand filters using phyla-specific quantitative PCR primers, and both 454 and Illumina sequencing. This revealed that the microbial communities underpinning slow sand filters (SSFs) are extremely complex, with specific organisms correlating with overall water quality performance (Haig et al., 2014a). Further, it was found that the treatment efficiency and microbial community structure of full-scale units were reproducible in the laboratory (Haig et al., 2014a), a finding which now allows more pertinent questions relating to human health and microbial ecology to be addressed. In particular, understanding how pathogenic microorganisms (for example, E. coli) are removed is a critical question, which addresses one of the primary tasks of modern ecology: linking the biotic interactions of organisms within an ecosystem to their functional performance (Mikola and Setala, 1998).

The need to remove pathogens and understand the mechanisms responsible for pathogen removal in potable water supplies is a well-recognised issue, emphasized by the fact that 3.4 million people each year die from water-related diseases (WHO, 2004). Determining and understanding these mechanisms would be highly advantageous and would vastly improve the implementation of drinking water technologies in developing countries, including household systems. In addition, it could allow water companies in developed countries to control pathogen levels by managing the SSF community. Further, by determining the trophic mechanisms and interactions involved in E. coli removal in a ‘real world’ food-web, great insight and knowledge for general microbial ecology will be obtained. This will provide a paradigm for similar studies and the opportunity to create more realistic trophic interaction models in the future.

Previous SSF studies have examined the ability of specific organisms (for example, Chrysophyte) to remove pathogenic bacteria (Weber-Shirk and Dick, 1999), or the overall pathogen removal efficiency of SSFs (Bomo et al., 2004; Grobe et al., 2006; Hijnen et al., 2007; Elliott et al., 2008). However, these studies are limited by their specificity. Further, on the basis of these studies, and knowledge from marine and terrestrial environments, both top-down (predation by protozoa and viral lysis) and bottom-up (nutrient/resource availability) mechanisms have been suggested as important for the regulation of microbial mortality (Lloyd, 1973; Hunter and Price, 1992; Pace and Cole, 1994; Weber-Shirk and Dick, 1999; Rosemond et al., 2001). In addition, theoretical models and empirical surveys have indicated that the majority of the mortality is due to grazing by protists, and to a lesser extent to viral lysis (Pernthaler, 2005). However, abiotic factors, such as UV radiation and reactive oxygen species (ROS)-associated lysis, have also been hypothesized as potential lysis routes for microbes/pathogens (Curtis et al., 1992; Alonso-Saez et al., 2006; Liu et al., 2007; Kadir and Nelson, 2014).

Although these studies are informative, they are also unrealistic as they have been performed in microcosms, focussing on one or a small group of organisms and hence over-simplify and potentially provide inaccurate or biased conclusions on regulatory mechanisms. Currently, no study, to the author’s knowledge, exists which aims to determine the mechanisms responsible for pathogen removal in a real biological system without prior knowledge as to which removal mechanisms or organisms to target. However, the complexity of real communities requires an untargeted approach capable of quantifying the importance of all trophic groups simultaneously. Here, we develop such an approach by combining stable-isotope probing (SIP) with metagenomics (Sul et al., 2009) and apply it to the tractable, though complex, system in SSFs, allowing all mechanisms and organisms involved in the removal of non-pathogenic E. coli K12 to be determined. We will use this organism, a commonly used faecal indicator, as a proxy for true pathogens, such as other E. coli strains, making the assumption that the removal processes will be the same. The experiment was used to test the hypothesis that the principal modes of removal will be top-down removal mechanisms, such as predation by protozoa and viral lysis, although the extent of these processes is expected to differ throughout time.

Materials and methods

Filter operation and sampling

The same SSF set-up (eight filters of 2.5 m in height and 54 mm in diameter) and operational procedures as employed by Haig et al. (2014a) were used in this study. The only difference was the addition of high-power LED lights fitted with a cool white (240 lumens or 5.4 W) bulb, erected above the SSFs to simulate daylight conditions. These lights functioned on a 12 h light/12 h dark cycle for the duration of the experiment, with times being regulated by a digital electronic timer. The water feeding each of the filters was sourced from the River Kelvin in Glasgow and was supplied at a constant filtration rate of 0.15 m3 m−2 h−1, which is consistent with full-scale SSFs. The sand used in all filters was sourced from a full-scale SSF site (Haig et al., 2014a) and was sterilized by autoclaving at 121 °C for 20 min before being put into the eight laboratory-scale SSFs. As in Haig et al. (2014a), water quality analyses were performed weekly.

Spiking with isotopically labelled E. coli

After 7 weeks of operation, each of the filters was spiked with isotopically labelled E. coli K12 (transformed TOP10 strain) following the protocol outlined in Marley et al. (2001). Briefly, E. coli K12 was grown overnight in M9 minimal medium with 20 ml of filter-sterilized 20% (w/v) 13C-glucose (Sigma, Dorset, UK) as the sole carbon source at 37 °C, with shaking at 200 r.p.m. The overnight culture was then centrifuged at 3000 g for 10 min and washed twice with sterile PBS before resuspension in autoclaved river water at a concentration of 300 c.f.u. ml−1, 5 min before spiking into the SSFs. Spiking entailed feeding the isotopically labelled E. coli to all filters for 1 h at the same filtration rate used previously (0.15 m3 m−2 h−1), after which normal filter operation resumed with non-spiked, non-autoclaved river water. The concentration of E. coli used was approximately 10 times the normal concentration found in the river water and was chosen to mimic levels found during pollution and storm run-off events.
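As a rough check on the size of the spike, the filter geometry and filtration rate given above imply a flow of roughly 340 ml per hour through each column, and hence a dose of about 10^5 c.f.u. per filter over the 1 h spiking period. The following minimal sketch works through that arithmetic; the function and variable names are illustrative only, and the calculation assumes the stated diameter is the internal diameter of the column.

```python
import math

# Values taken from the text above.
FILTER_DIAMETER_M = 0.054        # 54 mm (assumed to be the internal diameter)
FILTRATION_RATE_M_PER_H = 0.15   # 0.15 m3 m-2 h-1 is equivalent to 0.15 m h-1
SPIKE_CONC_CFU_PER_ML = 300.0    # concentration of labelled E. coli
SPIKE_DURATION_H = 1.0           # spiking period


def spike_dose_cfu(diameter_m, rate_m_per_h, conc_cfu_per_ml, duration_h):
    """Approximate number of cells fed to one filter during spiking."""
    area_m2 = math.pi * (diameter_m / 2.0) ** 2
    flow_ml_per_h = area_m2 * rate_m_per_h * 1e6  # 1 m3 = 1e6 ml
    return flow_ml_per_h * duration_h * conc_cfu_per_ml


dose = spike_dose_cfu(
    FILTER_DIAMETER_M,
    FILTRATION_RATE_M_PER_H,
    SPIKE_CONC_CFU_PER_ML,
    SPIKE_DURATION_H,
)
print(f"Approximate dose per filter: {dose:.0f} c.f.u.")
```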

Sampling spiked filters

To determine the mechanisms responsible for E. coli removal, sand was sampled from the filters at depths (1, 5, 10, 15 cm) and times of 0.5, 1, 2, 3 and 4 h after spiking. In addition, all depths (0, 5, 10, 15, 20, 30, 45, 70 cm) were sampled from the filters 24 and 96 h after spiking. Sand samples (0.5 g wet weight) were used for: direct E. coli plate counts on membrane lauryl sulphate agar containing 100 μg ml−1 ampicillin, 50 μg ml−1 kanamycin and 25 μg ml−1 streptomycin (Life Technologies, Glasgow, UK); direct protozoa quantification following the procedure of Dehority (1984); and SIP in conjunction with metagenomic sequencing.

DNA-stable-isotope probing

To separate the labelled (13C) and unlabelled (12C) DNA, the procedure of Neufeld et al. (2007) was used. Separation was achieved by using density gradient fractionation of the total DNA extract (50 μL) on a CsCl gradient with a buoyant density of 1.725 g ml−1 that was subjected to ultracentrifugation in a Sorvall 100SE Ultracentrifuge (Thermo Scientific, Loughborough, UK) at 44 100 r.p.m. for 40 h at 20°C. The density gradient was fractionated into 12 aliquots (400 μL each) by a drop-wise collection method, where fractions were taken from the bottom of the ultracentrifugation tube by pumping water into the top of the tube with a constant-flow (500 μL min−1) syringe pump (Gilson’s Miniplus 2 peristaltic pump). The density of the resulting fractions was measured with an AR200 refractometer (Reichert, Munich, Germany) and ranged from 1.47 to 1.86 g ml−1 with a median density of 1.68 g ml−1. Fractions were precipitated using a polyethylene glycol solution and dissolved in 30 μL of TE buffer, and used for qPCR quantification of 18S rRNA, total 16S and E. coli specific 16S rRNA genes (Supplementary Information 1). On the basis of qPCR and density profiles of the samples compared with 12C and 13C controls, two fractions from each sample, one representing labelled (density: >1.68 g ml−1) DNA and one representing non-labelled (density <1.68 g ml−1) DNA, were chosen for metagenomic library construction and analysis (Supplementary Information 2 and 3).
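The choice of one heavy and one light fraction per sample can be pictured with the short sketch below. It is not the selection procedure used in this study, which also relied on comparison with 12C and 13C controls; it simply assumes, for illustration, that the fraction with the highest qPCR copy number on each side of the 1.68 g ml−1 threshold is taken. The function name and the example values are hypothetical.

```python
HEAVY_THRESHOLD_G_PER_ML = 1.68  # density threshold stated in the text


def pick_fractions(fractions, threshold=HEAVY_THRESHOLD_G_PER_ML):
    """Return (heavy, light) fractions with the highest gene copy number.

    `fractions` is a list of (density_g_per_ml, gene_copies) tuples.
    Taking the maximum copy number on each side of the threshold is an
    assumption made for this sketch only.
    """
    heavy = [f for f in fractions if f[0] > threshold]
    light = [f for f in fractions if f[0] < threshold]

    def best(group):
        return max(group, key=lambda f: f[1]) if group else None

    return best(heavy), best(light)


# Hypothetical gradient: (density, copies) values invented for illustration.
example = [(1.62, 4.0e3), (1.66, 9.0e4), (1.70, 2.0e4), (1.74, 6.5e4), (1.80, 1.0e2)]
heavy_fraction, light_fraction = pick_fractions(example)
print(heavy_fraction, light_fraction)
```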

Illumina metagenomic library preparation

Thirty-six Illumina libraries (18 pairs of labelled 13C and non-labelled 12C fractions from various filters and time points) were prepared using the Nextera XT kits (Illumina, Essex, UK), following the manufacturer's instructions. Briefly, 5 μl (0.2 ng μl−1) of extracted DNA were tagmented and then subjected to PCR using specific index primers and common adaptors (P5 and P7). Amplified libraries were cleaned using the AMPure XP beads (Beckman Coulter, High Wycombe, UK) and eluted in a final volume of 12 μl TE. Libraries were checked for their fragment size distribution and concentration using a Bioanalyzer 2100 (Agilent, Cheshire, UK), and appropriate libraries were size selected (500–800 bp) using a Pippin Prep (Sage Science, Beverly, MA, USA) with a 1.5% cassette. Size-selected libraries were pooled using equimolar quantities to obtain the desired number of reads for each sample. The pool was sequenced on a HiSeq 2000 (Illumina) at the Centre for Genomic Research (Liverpool).

Metagenomic sequence analysis

Sequenced reads for each sample were quality trimmed using sickle [https://github.com/najoshi/sickle]. Quality profiles were constructed with FastQC (Andrews, 2010), which revealed a non-uniform distribution of nucleotides at the start of the reads, indicating the possible partial remainder of adapter or transposon sequences. Therefore, the first 20 bp of the MiSeq and 16 bp of the HiSeq reads were trimmed and reads were filtered based on a minimum read length of 80 bp and 40 bp for MiSeq and HiSeq, respectively (originally MiSeq: 150 bp and HiSeq: 99 bp). Resulting samples contained on average 8,157,287 ± 1,373,145 reads.
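The head-trimming and length-filtering step can be sketched as follows. This is an illustration rather than the pipeline actually run (quality trimming itself was done with sickle); the function name is hypothetical, and the sketch assumes that the minimum-length filter is applied after the leading bases have been removed.

```python
def trim_and_filter(reads, head_trim, min_length):
    """Remove the first `head_trim` bases of each read, then drop short reads.

    `reads` is an iterable of (sequence, quality) string pairs. Applying the
    length filter after trimming is an assumption of this sketch.
    """
    kept = []
    for seq, qual in reads:
        trimmed_seq = seq[head_trim:]
        trimmed_qual = qual[head_trim:]
        if len(trimmed_seq) >= min_length:
            kept.append((trimmed_seq, trimmed_qual))
    return kept


# HiSeq settings from the text: trim 16 bp, keep reads of at least 40 bp.
hiseq_reads = [
    ("A" * 99, "I" * 99),  # full-length read, 83 bp after trimming
    ("C" * 50, "I" * 50),  # 34 bp after trimming, so it is discarded
]
filtered = trim_and_filter(hiseq_reads, head_trim=16, min_length=40)
print(len(filtered), len(filtered[0][0]))
```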

For taxonomic classification, paired-end reads were converted to a format suitable for MEGAN (Huson et al., 2011). LAST (Frith et al., 2010) was used to align the reads (maximum of 20 matches) against a customized subset of the NCBI database containing the microbial, protozoan, viral and fungal databases to achieve a more time-efficient analysis. The output was converted into ‘blast format’ and piped into MEGAN where the lowest common ancestor was assigned to each read (lowest common ancestor parameters: max-matches=100, min-score=35.0, top-percent=10.0, win-score=0.0, min-support=1, min-complexity=0.3). Occurrence tables of the taxonomic assignments were generated using a custom designed script in which the last column of the MEGAN output files was converted into the corresponding ‘taxid’ and the taxonomic path was inferred by utilizing the perl library Bio::LITE::Taxonomy. Directly exporting taxonomic paths with MEGAN caused problems, particularly for eukaryotes and viruses as not all taxonomic levels were defined. In addition, Bio::LITE::Taxonomy was unable to resolve issues due to synonyms present in the database. Therefore, the taxonomic paths of all ‘taxids’, which were unresolved by Bio::LITE::Taxonomy were directly inferred from the NCBI taxonomy database (names.dmp). Furthermore, the full taxonomic paths of several organisms (for example, Monosiga brevicollis and Dictyostelium discoideum) were added to the database.
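The idea behind lowest common ancestor assignment can be illustrated with a minimal sketch: given the taxonomic paths of all reference sequences that a read matches, the read is assigned to the deepest node shared by every path. The sketch below omits the score, top-percent and support filters that MEGAN applies, uses a hypothetical function name, and uses deliberately simplified taxonomic paths.

```python
def lowest_common_ancestor(paths):
    """Return the longest shared prefix of a list of root-to-leaf paths."""
    if not paths:
        return []
    shared = []
    for ranks in zip(*paths):  # walk down all paths one level at a time
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Simplified paths for a read matching two Tetrahymena species.
hits = [
    ["Eukaryota", "Ciliophora", "Tetrahymena", "Tetrahymena thermophila"],
    ["Eukaryota", "Ciliophora", "Tetrahymena", "Tetrahymena paravorax"],
]
print(lowest_common_ancestor(hits))
```

With these two hits the read is placed at the genus level, because the species-level names disagree.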

Statistics

To determine which organisms had a significant role in E. coli removal, pairwise similarities among samples based on the Bray–Curtis similarity index were calculated. The resulting matrices were examined for temporal patterns and differences between 13C (labelled) and 12C (non-labelled) samples through non-metric multidimensional scaling and canonical correspondence analysis as implemented in the Vegan package in R (Oksanen et al., 2012). Significant differences in the metagenomic community composition between different time points (0.5–4 h) and carbon sources (labelled 13C and non-labelled 12C) after spiking with E. coli were determined using the Adonis function in the Vegan package, which performs a nonparametric multivariate analysis of variance (Anderson, 2001). To determine individual contributions from each taxon to the differences between labelled and non-labelled samples, and for the various time points, SIMPER analysis was used (Clarke, 1993). SIMPER analysis is a useful measure of the magnitude of difference; however, to decide whether a taxon differed significantly, pairwise t-tests (Kendall nonparametric) adjusted for multiple comparisons using the Benjamini–Hochberg false-discovery method were performed. Only taxa with a false-discovery rate of <5% were reported.
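Two of the quantities used above are simple enough to write out directly. The sketch below computes the Bray–Curtis dissimilarity between two samples (similarity is one minus this value) and the Benjamini–Hochberg adjusted P-values for a set of tests. The analyses in this study were run with the Vegan package in R; the plain-Python functions here are an illustration only, and the example counts and P-values are invented.

```python
def bray_curtis_dissimilarity(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    if len(a) != len(b):
        raise ValueError("samples must have the same number of taxa")
    total = sum(x + y for x, y in zip(a, b))
    if total == 0:
        raise ValueError("both samples are empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Work from the largest P-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented example data.
sample_13c = [6, 7, 4]
sample_12c = [10, 0, 6]
dissimilarity = bray_curtis_dissimilarity(sample_13c, sample_12c)

raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adjusted_p]  # false-discovery rate below 5%
print(dissimilarity, adjusted_p, significant)
```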

Results

To determine the magnitude with which protozoa and other eukaryotes affect E. coli removal, recombinant E. coli and protozoa counts, and qPCR assays for total and E. coli specific 16S rRNA and for 18S rRNA, were performed on samples taken from SSFs challenged with isotopically labelled (13C) E. coli. To resolve which organisms (bacteria, eukaryotes and viruses) were responsible for E. coli removal, the different carbon densities (12C and 13C) in the samples were separated (DNA-SIP) and used for metagenomic analysis. Increased abundance of any organism in 13C-labelled samples indicated potential involvement in E. coli removal.

Protozoan predator–prey response—direct counts and qPCR

Direct counts of E. coli and total protozoa (Figure 1) revealed a clear predator–prey relationship, with most removal occurring at the top of the filters around 2–3 h after spiking with 13C-labelled E. coli. The gradual decrease over time in the abundance of 13C-labelled E. coli, as well as the peak in the number of labelled (13C) 18S rRNA copies in qPCR assays on samples 3 and 24 h after spiking (Figure 2), suggested that protozoan grazing is a major mode of E. coli removal. However, as the 12C 18S rRNA results showed a similar trend compared with the labelled 18S rRNA, it can be assumed that incomplete 13C incorporation has occurred, that is, samples with a density resembling normal carbon (12C) may have started to incorporate labelled 13C but have not incorporated enough into their biomass to cause a density change.

Figure 1

Scatter plot showing the predator–prey response between protozoa and E. coli.

Figure 2

Absolute numbers of 16S, 18S and E. coli specific 16S rRNA in 12C and 13C fractions, determined by qPCR assays.

All domains of life are important for E. coli removal

To resolve which organisms (bacteria, eukaryotes and viruses) were responsible for E. coli removal, DNA-SIP in conjunction with metagenomic analysis was employed. Metagenomics, unlike qPCR and conventional sequencing approaches, does not rely on prior knowledge of the organisms of interest and, thus, organism-specific primers are not required (Wooley et al., 2010). However, metagenomics still suffers from biases introduced during DNA extraction, enzymatic cutting during library preparation and PCR amplification. Multivariate analysis of variance revealed significant differences in the metagenomic communities at all levels of taxonomic classification between types (13C or 12C) and times after spiking for the 36 samples taken from four SSFs after adding the 13C-labelled E. coli. Time was the most significant variable, explaining 15.4% (P-value: 0.01) of the variance, followed by type (11.3%; P-value: 0.001). Nonparametric t-tests on individual taxa proportions adjusted for multiple comparisons identified 10 orders (two bacterial, six eukaryotic and two viral) as being statistically significant in explaining differences between 12C and 13C metagenomic communities (Figure 3). We then investigated differences at higher levels of taxonomic resolution within the prokaryotes, eukaryotes and viruses separately. The number of reads associated with each time point (Supplementary Information 4) within each group (12C and 13C) was not significantly different (P-value: 0.9356), and both fractions had similar overall community structures (12C: 0.09% viruses, 9.18% eukaryotes and 90.73% prokaryotes; 13C: 0.08% viruses, 8.50% eukaryotes and 91.42% prokaryotes).

Figure 3

(a) Relative abundances of statistically significant orders of organism which explain differences between 12C and 13C samples. Adjusted P-values were calculated using the Benjamini–Hochberg method; average percentages are relative to the kingdom to which the order belongs. (b) Principal component analysis of all orders of organism identified from the metagenomic analysis. Green text represents viral orders, brown text represents bacterial orders and orange text represents eukaryotic orders.

The importance of viral lysis for E. coli removal

Twenty-two viral species (Supplementary Information 5) were identified by pairwise t-tests and Benjamini–Hochberg false-discovery tests as being significantly different between 12C and 13C metagenomic communities. None were prophages found within the E. coli K12 genome. Overall, 15 of these viruses were present at higher abundances in 13C communities compared with 12C samples and hence are involved directly or indirectly in E. coli removal. Collectively these accounted for 22% of the dissimilarity between 13C and 12C samples. Among the 22 significant viral species (Figures 4a and b), the importance of Enterobacteria phages is apparent, in particular Enterobacteria phage lambda, Enterobacteria phage cdtl and Enterobacteria phage N15, which account for over 14% of the difference between labelled and non-labelled samples (Supplementary Information 5). Notably, E. phage lambda was over 117 times more abundant in 13C samples than in 12C communities, implying its importance in E. coli removal.

Figure 4

Stacked barplot of the abundance changes of statistically significant viral (a, b) and protozoan (c, d) species at various time points after spiking with 13C-labelled E. coli. A and C are 13C-labelled samples and B and D are 12C-labelled samples.

The importance of protozoan grazing for E. coli removal

Following the same approach previously used to identify significant viral species involved with E. coli removal and 13C metabolism, 52 eukaryotic species were identified (Supplementary Information 6), of which 20 were protozoa (Figures 4c and d), 15 algae (Figures 5a and b) and 17 fungi (Figures 5c and d). The presence of significant species from all eukaryotic kingdoms underscores the complex mechanisms involved with E. coli removal.

Figure 5

Stacked barplot of the abundance changes of statistically significant algal (a, b) and fungal (c, d) species at various time points after spiking with 13C-labelled E. coli. A and C are 13C-labelled samples and B and D are 12C-labelled samples.

The 20 significant protozoan species represented 15 different genera and members from the flagellate, ciliate and amoeboid groups, all of which are known predators of E. coli (Weekers et al., 1993; Fleck et al., 2000; Fey et al., 2007; Cassidy-Hanley, 2012; Yue et al., 2013). Referring to Figure 4c, it can be seen that the proportion of significant protozoan species in 13C samples increased over time, compared with the relatively stable proportion of 7% in non-labelled (12C) samples (Figure 4d). Further, the biggest differences between 13C and 12C communities are due to fluctuations in the populations of Chromerida RM11, Euglena gracilis, Malawimonas jakobiformis, M. brevicollis, Paulinella chromatophora, Reclinomonas americana, Tetrahymena paravorax and Vermamoeba vermiformis (Figures 4c and d), all of which are highly motile and possess voracious appetites. Specifically, the importance of protozoan grazing on labelled E. coli, and hence E. coli removal, is apparent 2 h post spiking, where a large increase in the protozoan population is observed. In particular, a large increase in the proportion of M. brevicollis and Tetrahymena spp. was observed, with both genera collectively being responsible for 2% of the dissimilarity between 12C and 13C communities (Supplementary Information 6).

The mutualistic relationship of fungi and algae

The abundances of 17 fungal and 15 algal species were identified as being significantly different between 12C and 13C samples, of which only the fungal species Naumovozyma castelli was present in greater proportions in non-labelled samples. Therefore, the remaining species appear to be involved in E. coli removal and/or 13C-labelled carbon metabolism. On initial analysis of the algal and fungal abundances, a staggering similarity in community dynamics can be seen, with both communities increasing and decreasing in abundance at the same time points (Figure 5); Kendall correlation tests of all significant fungal and algal species confirmed this to be a significant relationship (tau 0.81952, P-value: 0.01071). This mirrored behaviour is indicative of a mutualistic relationship. Such symbioses between algae and fungi have been widely documented in various environments (Harte and Kinzig, 1993; Danger et al., 2013).
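For readers unfamiliar with the Kendall rank correlation used here, the statistic can be sketched by counting concordant and discordant pairs of observations. The function below computes the tie-corrected form (tau-b); it is an illustration with a hypothetical function name and invented abundance values, not a reproduction of the test reported above.

```python
import math


def kendall_tau_b(x, y):
    """Kendall rank correlation (tau-b) by direct pair counting."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # pairs tied in both variables are ignored
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denominator = math.sqrt(
        (concordant + discordant + ties_x) * (concordant + discordant + ties_y)
    )
    if denominator == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Invented relative abundances at five time points, for illustration only.
algal_abundance = [1, 2, 3, 4, 5]
fungal_abundance = [1, 3, 2, 4, 5]
tau = kendall_tau_b(algal_abundance, fungal_abundance)
print(tau)
```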

The 15 significant algal species represented nine different genera, with Saccharina spp. accounting for 47% of the significant algal species. Clear shifts in the abundance of significant algae can be seen between 12C and 13C samples at all time points (Figures 5a and b), in particular, at 2, 3, 24 and 96 h after spiking with 13C-labelled E. coli, where the average abundance tripled in 13C samples compared with the non-labelled samples. In addition, fungal species followed the same trend with Sordaria macrospora and Chaetomium globosum showing the largest increase (Figures 5c and d).

Calculating the importance of viruses and protozoa

To approximate the importance of protozoa and viruses in E. coli removal, the following assumptions were made:

  1. The added 13C-labelled E. coli does not undergo replication

  2. Only protozoa and viruses are responsible for E. coli removal

  3. All protozoa and viruses are grazing/infecting E. coli at a constant rate

  4. E. phage lambda is used to represent all viruses (as it is the most abundant and significant virus identified in 13C samples)

  5. M. brevicollis and Tetrahymena thermophila are used to average the abilities of protozoa (as they are the two most significant and abundant protozoa in 13C samples)

  6. For a protozoan to become 13C labelled, 50% of its grazing consumption must be from 13C-labelled E. coli

On the basis of these assumptions and taking genome size and progeny production into consideration it was concluded (Supplementary Information 7) that protozoan grazing appeared to be the major driving force behind E. coli removal (99.86%), with M. brevicollis accounting for the biggest proportion of E. coli removal (24.83%), followed by 4.68% achieved by T. thermophila. Conversely, viral-associated lysis was responsible for 0.14% removal of which E. phage lambda was responsible for 0.076%, which was 326 times smaller than the removal achieved by M. brevicollis.
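The relative contributions quoted above can be cross-checked with simple arithmetic. The short sketch below uses only the percentages reported in the text (the underlying calculation is given in Supplementary Information 7 and is not reproduced here); the variable names are illustrative.

```python
# Percentages of E. coli removal as reported in the text.
protozoan_total = 99.86
viral_total = 0.14
m_brevicollis = 24.83
t_thermophila = 4.68
phage_lambda = 0.076

# Under assumption 2, protozoa and viruses together account for all removal.
combined = protozoan_total + viral_total

# Removal by M. brevicollis relative to removal by E. phage lambda.
fold_difference = m_brevicollis / phage_lambda

print(f"Combined removal: {combined:.2f}%")
print(f"M. brevicollis vs E. phage lambda: {fold_difference:.1f}-fold")
```

The ratio comes to a little under 327, in line with the reported figure of 326 times.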

Discussion

Identifying and unpicking trophic interactions, particularly those involved with pathogen removal, is an extremely complex and important question. Previously, mathematical models and work in simplified microcosms have shown the individual importance of viral lysis, protozoan grazing and endogenous and exogenous ROS in E. coli removal (Curtis et al., 1992; Bomo et al., 2004; Grobe et al., 2006; Liu et al., 2007; Kadir and Nelson, 2014). Within this study, the level of involvement in E. coli removal of each kingdom has been approximated from the metagenomic analysis following several assumptions. Furthermore, to optimize for the success of the DNA-SIP approach, that is, to obtain enough genomic material for metagenomic sequencing, a higher concentration of E. coli than normally found in surface water was spiked into the SSFs. However, this is the first study, to our knowledge, to examine and identify the ecosystem-wide trophic interactions and mechanisms responsible for E. coli removal in a ‘real’ system, without prior bias as to which organisms and mechanisms to target. In addition, this study is the first to show that it is possible to isotopically label E. coli and follow removal through an ecosystem.

Top-down trophic interactions are essential for E. coli removal

On the basis of the direct counts, qPCR (Figures 1 and 2) and DNA-SIP metagenomic sequencing, the importance of top-down regulatory mechanisms for E. coli removal is apparent. Among a consortium of phages, protozoa, fungi and algae, which were 13C labelled (hence involved in E. coli removal/metabolism) and identified as highly significant, E. phage lambda, M. brevicollis and T. spp. were identified as the main organisms responsible for E. coli removal. On the basis of our calculations (Supplementary Information 7) and direct count observations (Figure 1), protozoan grazing was responsible for more than 99% of the E. coli removal within 4 h of spiking, which is consistent with previous investigations into pathogen removal in constructed wetlands (Weber and Legge, 2008) and estuaries (McCambridge and McMeekin, 1980). From the 20 statistically significant protozoan species identified (Supplementary Information 6), M. brevicollis was predicted to be responsible for the largest share (24.83%) of the removal once factors such as replication rate, progeny production, grazing rates and genome size were taken into consideration. This equated to 326 times more E. coli removal than that achieved by E. phage lambda. The lowered involvement of viral-associated lysis for E. coli removal is also consistent with previous work (Withey et al., 2005).

The importance, and dominance, of M. brevicollis from the beginning to the end of the experiment is not surprising as it is known to have a critical role in marine global carbon cycling (Yue et al., 2013). Further, its dominance over other protozoan grazers including T. spp. (responsible for 4.68% of E. coli removal) may be explained by its very short doubling time of 4.6 h (Christaki et al., 2005) and fast grazing rate of 196 bacterial cells per hour (Parry, 2004), which in theory would allow it to outcompete the other significant protozoan species that were identified. Although the feeding rate of M. brevicollis is slower than that of T. spp., such dominance could be explained by M. brevicollis:

  1. Possessing a microvilli collar, which holds bacteria from the water flow and allows them to be engulfed at a later time (Yue et al., 2013), hence providing energy storage for less plentiful times.

  2. Possessing six oxidative stress genes, four of which are algal in origin (two ascorbate peroxidases and two metacaspases), that help protect the protozoa from various algal-mediated ROS (Nedelcu et al., 2008).

Although protozoan grazing appeared to be the major route for E. coli removal, the role of viral-associated lysis cannot be overlooked, especially, as E. phage lambda was more abundant in 13C-labelled samples (Figures 4a and b) and was identified by SIMPER analysis to contribute to more than 10% of the dissimilarity between 12C and 13C communities (Supplementary Information 5).

Overall, with the exception of the Microbacterium, Pseudomonas, Salmonella and Rhodococcus phages, the remaining eight statistically significant phages are all known prophages of E. coli and have been shown to reduce Enterobacteria (including E. coli) numbers. However, unlike the protozoa, viral abundances changed dramatically with time (Figure 4), with increased abundance at 1, 4 and 96 h and lower abundance at the remaining time points. Such behaviour suggests that these viruses are fluctuating between states of pseudolysogeny/lysogeny and lytic pathways (Figure 6), behaviour widely documented for environmental phages (Abedon, 2008). This choice of life-cycle has been shown to aid in the regulation of bacterial biomass (Ripp and Miller, 1997), which is not taken into consideration in our calculations of removal. In addition, the extremely high abundance of phages 1 h after incubation with 13C-labelled E. coli was surprising; however, Zeng and Golding (2011) showed that E. phage lambda can infect, replicate and enter the lysogenic cycle within E. coli after only 80 min. Therefore, although viral lysis has been shown to be responsible for only 0.14% of the E. coli removal (Supplementary Information 7), it is likely that the phages are supporting the regulation of the population (by allowing the E. coli population to recover) to ensure sufficient hosts for subsequent viral infection (Abedon, 2008). In addition, the rapid appearance of E. coli phages in this experiment may reflect the heightened metabolic state of introduced E. coli at the start of the spiking period due to glucose availability.

Figure 6

Trophic interaction diagrams involved with E. coli removal. Food-web showing ecosystem-wide involvement and trophic interactions involved in E. coli removal. Red shaded text shows the progressive relationship between host bacterial cells and different bacteriophage life cycles, adapted from Abedon (2008). Red lines indicate lytic pathways and blue lines indicate lysogenic pathways. Purple shading represents predation by protozoa. Blue and green shading represents the mutualistic relationship between algae and fungi. The involvement of and production of ROS by interactions between algae and fungi are hypothesized based on work from other authors.

Ecosystem-wide associations are needed for successful E. coli removal

Although top-down regulatory trophic interactions, such as protozoan grazing and viral lysis, are the major mechanisms for E. coli removal, the importance of indirect and abiotic mechanisms cannot be overlooked. For example, previous work has shown algae to be actively involved in E. coli removal by the production of ROS (Curtis et al., 1992; Maynard et al., 1999; Feng et al., 2011), which causes lysis of E. coli and other bacterial species. In particular, extensive work has shown that Chattonella marina (one of the algae which significantly increased in labelled samples; Figure 5a) produces several ROS (Liu et al., 2007) known to significantly reduce coliform numbers. Therefore, it is conceivable that algae are actively participating in E. coli removal by indirect mechanisms (Figure 6). Furthermore, as 11 out of the 15 significant algal species are mixotrophs (Nasr et al., 1968; Semple and Cain, 1996), it is likely that they accessed carbon from the released biomass from the 13C-labelled E. coli (via viral lysis and protozoan grazing).

In addition to inducing indirect bacterial autolysis, algal ROS help to explain the dominance of M. brevicollis (which contains ROS protection genes). This autolysis, alongside protozoan grazing and viral lysis of labelled E. coli, will have increased the amount of free 13C-labelled biomass components available for fungal degradation, explaining the dominance of the fast-growing saprotrophs S. macrospora and C. globosum (Kavak, 2012), which dominated the fungal 13C community. This is likely to have amplified changes in the carbonate–bicarbonate equilibrium induced by algal respiration, which has been shown to induce elevated growth rates and fruiting body formation in Sordaria spp. (Elleuche and Pöggeler, 2010). This, in turn, results in further elevated CO2 levels due to fungal respiration, causing an additional imbalance in the carbonate–bicarbonate equilibrium and inducing a knock-on effect to the water pH, which has been shown to induce elevated algal ROS production (Liu et al., 2007). Such an association helps to explain the apparent mutualistic relationship displayed by fungi and algae during E. coli removal (Figure 6). Nonetheless, in addition to biological removal mechanisms, physical removal mechanisms such as straining, sedimentation and absorption have an important part in pathogen removal in SSFs (Haarhoff and Cleasby, 1991).

In summary, it was possible to ascertain that E. coli removal was achieved by both direct (protozoan grazing and viral lysis) and indirect (lysis induced by algal ROS production and fungal degradation of released biomass) mechanisms (Figure 6). These mechanisms appeared to occur simultaneously with the involvement of species from various kingdoms, in particular, fungi and algae, which exhibited mutualistic interactions. The highest removal of E. coli occurred between 1 and 3 h after spiking. This level of removal at these time points is consistent with the following characteristics of the 13C communities:

  1. Phages peaked in abundance at 1 h, with extensive replication as part of their lytic pathway, resulting in reduced E. coli numbers after 2 h.

  2. Protozoa numbers peaked at 2–3 h, allowing extensive grazing on 13C-labelled E. coli before this.

  3. Algal abundance peaked at 2 and 3 h, which was likely due to the increased availability of 13C-labelled CO2 and other inorganics, created during viral lysis and protozoan grazing.

  4. Fungal abundance peaked at 2 and 3 h, when extensive reduction in E. coli numbers occurred, hence releasing biomass for decomposition and resulting in changes in the carbonate–bicarbonate equilibrium inducing algal ROS production and further autolysis of E. coli.

Conclusion

Although various studies have shown the individual importance of viral lysis, protozoan grazing, and endogenous and exogenous ROS in E. coli removal (Curtis et al., 1992; Bomo et al., 2004; Grobe et al., 2006; Liu et al., 2007; Kadir and Nelson, 2014), this is the first study, to our knowledge, to indicate the importance and interactions of all of these mechanisms for E. coli removal. Further, our approach enabled us to identify that the majority of the E. coli removal is due to top-down trophic interactions, such as protozoan grazing by M. brevicollis and T. spp and viral lysis by E. phages. In addition, although E. coli K12 was used in this study, it is highly likely that the removal of pathogenic strains of E. coli would follow similar routes: the protozoan grazers identified are nonspecific grazers, affected only by the size of the prey community, and the phages identified are Enterobacteria-specific rather than species-specific. However, more work is required to determine whether these removal mechanisms are similar for pathogenic and environmentally persistent bacterial species.

This study has shown that SSFs provide an ideal laboratory-scale system to study relevant and functional food webs. By applying cutting-edge molecular methods to these systems, we have furthered our understanding of the processes, mechanisms and organisms responsible for E. coli removal. The work and methodology adopted in this study will provide both a paradigm for similar studies and the opportunity to:

  1. Design and improve pathogen removal and overall performance of new and existing water purification systems by managing the community.

  2. Predict E. coli removal rates in natural treatment systems that have biological components, particularly during pollution and weather events.

  3. Further our understanding of complex food webs and trophic interactions.

  4. Create more complex and realistic trophic interaction models.

Future work should aim to develop more sophisticated trophic interaction models using data generated from DNA-SIP studies. These models should be further integrated into pathogen prediction models to allow better pathogen tracking and removal prediction. Finally, the conclusion that ecosystem-wide associations are essential for complete E. coli removal may help to explain the reduced performance of household purification systems, which support a less diverse ecosystem. Therefore, future work should investigate whether a more complex community can be introduced into such systems and what benefit this may bring.