Introduction

As for all ecological systems, dynamics in bacterial communities are governed by interaction among community members as well as by the environmental conditions (Read and Taylor, 2001; Bell et al., 2005). In general ecological science one often encounters terms such as within and between species competition, cooperation and density dependence (Turchin, 1995; Stenseth et al., 1998; Ciannelli et al., 2005; Moe et al., 2005). Such concepts are much less frequently explored in the context of microbes (Hagen et al., 1982; Bradshaw et al., 1994; Rainey and Rainey, 2003; You et al., 2004; Balagadde et al., 2005). This is partly due to a historical division between the fields of microbiology and general ecology, but the difficulty of quantitative population data collection from complex microbial systems has also been a major factor contributing to the paucity of information on population dynamics in microcosm. Nevertheless, given a convenient sampling scheme, microbial communities as model systems offer certain advantages relative to traditional ecological field systems (Jessup et al., 2004). Large population sizes can provide analytical results of increased statistical robustness, short generation times enable researchers to follow hundreds of generations in a limited amount of time, and controlled laboratory settings ensure experimental tractability. Also, the wealth of knowledge on microbial cell function, along with the fact that microbes are easily amenable to genetic engineering, provides a unique opportunity to investigate the physiological and genetic underpinnings of ecological processes. Indeed, microbial model systems have, in several cases, proved valuable for ecological studies. 
The pioneering works of Gause during the 1930s, using experimental systems of eukaryotic microorganisms, were instrumental to the development of some of the basic tenets of population ecology (Gause, 1932, 1935), and since then, microbial models have been used to study various aspects of ecology and evolution (Gorden et al., 1969; Dykhuizen and Davies, 1980; Cooper and Lenski, 2000; Rainey and Rainey, 2003).

Even though a substantial amount of ecological research has been carried out using microbial model systems, functional descriptions of community dynamic structures are all but absent from the literature. For such a description to be formulated, time-series population data, collected on a scale relevant to the organisms under investigation, are required (Paerl and Steppe, 2003; McArthur, 2006). In the work presented here, we have taken a computational approach to quantification of individual species in mixed populations, based on multivariate analysis of bacterial community 16S rDNA sequence electropherograms (Trosvik et al., 2007). A model community consisting of the common gut bacteria Escherichia coli, Lactobacillus salivarius and Bacteroides uniformis was sampled hourly for a period of 28 h, and community composition was determined from the cell samples for each time point. This approach provided us with close interval time-series population data, and subsequent analysis ultimately produced a functional description of the dynamic structure of this simple model system. Furthermore, using online Fourier transform infrared (FT-IR) spectroscopy we were able to follow the entire growth process in terms of main metabolites. Such an analysis has, to our knowledge, not been carried out previously. The study integrates multivariate analysis of molecular genetic and spectroscopic data with nonlinear statistical modeling of ecological data, highlighting the great potential for a synthesis between traditional ecology, molecular genetics and microbiology in order to advance our knowledge of fundamental population dynamic phenomena in microbial communities.

Materials and methods

Mixed bacterial populations

In order to obtain dynamic community growth data, three species of common gut bacteria were grown in an anaerobic fermentor system. The species used were type strains of B. uniformis (ATCC 8492), E. coli (ATCC 25922) and L. salivarius (DSMZ 20555, ATCC 11741). Prior to inoculation of the fermentor each strain was grown for 24 h in 10 ml tubes of anaerobe basal broth (Oxoid, Basingstoke, UK) under anaerobic conditions. For the initial inoculum, 2 ml of each of the three cultures was transferred to a clean tube, and the fermentor containing 1.5 l of sterile anaerobe basal broth was immediately inoculated.

The fermentor was a 2 l glass chamber equipped with a flange designed to fit an accompanying glass lid with multiple connection ports. The lid was fitted tightly using a steel clamp. One port was connected to a gas tank containing a mixture of N2 (80%) and CO2 (20%). The chamber was continually flushed with this gas during the incubation period in order to obtain anaerobic conditions, as well as to achieve mixing of the culture. A second port was used as a gas outlet to avoid pressure buildup. A third port was opened only during collection of cell samples, which was carried out using 5 ml sterile pipettes. Two ports connected the chamber to the FT-IR measurement cell. The chamber was partially submerged in a stirred water bath at 37 °C in order to maintain the normal ambient temperature for the species involved.

The first cell sample was taken 2 h after inoculation, and subsequent samples were collected every hour, for a total of 27 samples. pH values were also measured at the points of sample collection.

Optical density at 600 nm (OD600) was measured in all the fermentation samples using an Ultrospec 3000 UV/Visible spectrophotometer (Pharmacia Biotech, Chalfont St Giles, UK).

FT-IR spectroscopy

Throughout the process, we measured FT-IR spectra of the culture, the main objective being to acquire information on metabolites. We used a two chamber attenuated total reflectance cell fitted with an IR transparent optical crystal (ZnSe), and an Equinox 55 spectrometer (Bruker Optics, Ettlingen, Germany). The measurement system was connected to the growth chamber by a looped plastic tube, with growth medium being pumped through the cell and back into the chamber by a peristaltic pump. Spectra were collected automatically at approximately 12 min intervals.

We analyzed the spectral region corresponding to wavenumbers 3000–2800 cm−1, in which vibrational modes characteristic of fatty acids can be observed (Socrates, 2001). Prior to analysis, the second derivative was calculated before extended multiplicative signal correction (Kohler et al., 2005) was applied. The spectra were analyzed using the software package The Unscrambler (Camo Process, Oslo, Norway).

DNA extraction, PCR amplification and DNA sequencing

Isolation of total DNA was carried out using a Biomek 2000 Laboratory Automation Workstation (Beckman Coulter Inc., Fullerton, CA, USA) and MagPrep Silica particles (Merck, Whitehouse Station, NJ, USA) according to an optimized automated protocol (Skanseng et al., 2006).

PCR amplification of 16S rDNA was carried out using a universally conserved primer set (Nadkarni et al., 2002). The PCR reaction mixture contained 1.25 U of AmpliTaq Gold DNA polymerase (Applied Biosystems, Foster City, CA, USA), 200 μM dNTP mix, 1 μl of template DNA with 0.2 μM of each primer and 2.5 mM MgCl2 in a total volume of 25 μl. The amplification profile was a 5 min activation step at 95 °C followed by 25 cycles of 95 °C for 15 s, 50 °C for 15 s and 72 °C for 1 min. The reaction was terminated with a 7 min elongation step at 72 °C.

Cyclic labeling and chain termination for sequencing was carried out using the BigDye v1.1 sequencing chemistry (Applied Biosystems) and a nested sequencing primer, U515F (Baker et al., 2003), according to instructions supplied by the manufacturer.

The three species involved in the experiment were identical with respect to both PCR and labeling primer sequences.

Excess dye and labeling primers were removed from the labeled DNA fragments using the Montage SEQ96 Sequencing Reaction Cleanup Kit (Millipore Corp., Billerica, MA, USA), and a Biomek 2000 Laboratory Automation Workstation, according to instructions supplied by the manufacturer. Sequencing was carried out using an ABI PRISM 3100 Genetic Analyzer (Applied Biosystems).

Estimation of species composition in fermentation samples

For estimation of relative species abundances in cell samples from the fermentation, we used an approach of direct DNA sequencing and application of a set of partial least squares regression (PLSR) models to mixed sample sequence electropherograms for a 35 nucleotide region (positions 622–656 in E. coli) of the 16S rRNA gene (Trosvik et al., 2007). In brief, the regression technique relies on a set of training spectra, X, for mixtures of gene fragments of the relevant genetic region, corresponding to a matrix of known mixture ratios, Y. Y is subsequently modeled as a function of X by PLSR, a type of factor analysis that uses the covariance structure between X and Y in order to find latent variables (PLS components) in X. The active use of Y in the analysis ensures that the extracted components are optimal for prediction of mixture ratios in new samples of unknown composition (Martens and Næs, 1989). Design, production and pre-processing of PLSR calibration data used in this work have been described previously (Trosvik et al., 2007).

PLSR models, one for quantification of each species in the experiment, were computed with full cross-validation, producing three linear predictors of the form: $y_i = \beta_{i,1}x_1 + \dots + \beta_{i,1800}x_{1800}$, where $y_i$ is the proportion (approximate percentage) of either B. uniformis, E. coli or L. salivarius. $\beta_{i,1}, \dots, \beta_{i,1800}$ are regression coefficients for prediction of $y_i$, and $x_1, \dots, x_{1800}$ are the entries of the relevant spectral vector for a mixed sample. For all the models, the linear predictors were based on three PLS components (Figure 1). PLSR model statistics can be seen in Table 1.
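As an illustration of how such a linear predictor is applied to one spectral vector, the following is a minimal Python sketch. It is not the authors' R code; the function name, the four-entry toy vectors (in place of 1800 entries) and all numerical values are assumptions made purely for illustration.

```python
def predict_proportion(beta, x):
    """Return y = sum_j beta_j * x_j for one species and one spectral vector."""
    if len(beta) != len(x):
        raise ValueError("coefficient and spectrum lengths differ")
    return sum(b * v for b, v in zip(beta, x))


# Toy example: 4 spectral entries instead of 1800 (illustrative values only).
beta_ecoli = [0.5, -0.25, 0.0, 1.0]   # hypothetical regression coefficients
spectrum = [10.0, 4.0, 7.0, 2.0]      # hypothetical pre-processed spectral vector

y_ecoli = predict_proportion(beta_ecoli, spectrum)
print(y_ecoli)
```

In the setting described above, one such coefficient vector per species would be applied to the same spectral vector, giving three predicted proportions per sample.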

Figure 1

Panels a–c show convergence of partial least squares regression (PLSR) models on three PLS components. The figure shows root mean squared error of prediction (RMSEP, red lines) and squared Pearson's correlations (R2, black lines) calculated from cross-validation of the three PLSR models. Correlations are between actual mixture ratios and proportions predicted in cross-validation. For each model the RMSEP decreases rapidly while R2 increases in a similar fashion until they both stabilize upon the inclusion of the third PLS component.
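The two cross-validation summaries named in this caption can be computed as in the short Python sketch below. The function names and the example mixture ratios are hypothetical; the formulas are the standard root mean squared error and squared Pearson correlation, and the cross-validated predictions are assumed to be already available.

```python
import math


def rmsep(actual, predicted):
    """Root mean squared error of prediction."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def squared_pearson(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_a * var_p)


# Hypothetical known mixture ratios (percent) and cross-validated predictions.
actual = [0.0, 25.0, 50.0, 75.0, 100.0]
predicted = [2.0, 24.0, 51.0, 73.0, 100.0]

print(rmsep(actual, predicted))
print(squared_pearson(actual, predicted))
```

Repeating this calculation for models with one, two, three and more PLS components would give curves of the kind the caption describes.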

Table 1 Summary statistics demonstrating the good predictive ability of PLSR models for quantification of species composition in fermentation samples

Raw spectral data from the fermentation sample sequences, corresponding to the region used in the calibration, were extracted from the raw data files and pre-processed as previously described (Trosvik et al., 2007). The resulting data matrix of dimension (28 × 1800) was used as input for the PLSR models. Relative abundances of the three bacterial species were calculated for all 28 time points (the initial inoculum being time=0), providing us with time-series data for the duration of the process.

The species composition in these samples was also checked by quantitative tRFLP analysis (Supplementary Figure S1 and Table S1).

All computations were carried out using the statistics programming interface R (R Development Core Team (2006). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria). PLSR was carried out with the R package ‘pls’, using the orthogonal scores algorithm for model fitting.

Time-series modeling

The time-series data were analyzed using the R implementation of generalized additive models (GAM) in the mgcv library (Wood, 2000). This is a nonparametric modeling approach which does not presuppose any underlying functional form between response and predictor variables, allowing for effective modeling of nonlinear relationships. The technique fits smooth additive functions to each model covariate where the smooth terms are weighted sums of basis cubic spline functions. The weights are estimated according to the generalized cross-validation (GCV) criterion (Wood, 2000). The response variable is then approximated by the sum of the smooth terms plus an intercept representing the response mean. For a systematic account of the GAM methodology we refer to Wood (2006).

For each smooth term, the dimension of the spline basis used is set prior to model fitting. In practice, this is equivalent to setting an upper limit to the degrees of freedom associated with the smooth term. This, in turn, determines the degree of curvature tolerated for the smooth in question. In order to avoid overfitting, we constrained the maximum degrees of freedom to three in the case of each model. The general model as applied to the case of E. coli growth is as follows:

$$\Delta E_t = b_e + f_e(E_t) + g_e(B_t) + h_e(L_t) + k_e(\mathrm{pH}_t) + l_e(\mathrm{OD600}_t) + \varepsilon_t$$

where ΔEt is the per time unit change in relative bacterial abundance, computed as the difference in logarithmic abundance between two consecutive measurements (that is, log(Et+1)−log(Et)), and Et, Bt and Lt are the relative abundances of E. coli, B. uniformis and L. salivarius at time t. fe, ge, he, ke and le are nonparametric smooth functions specifying the effects of E. coli (that is, density dependence), B. uniformis, L. salivarius, pH and OD600, respectively. By convention the smooth terms are of zero mean over the data. be is an intercept and ɛt is the noise term. For the analysis, all the independent variables, save pH, were log transformed. The same formulation was used to specify the models for relative abundance change rates for the other two bacterial species. Data for the initial inoculum were not included in the time-series analysis. The general model formulation above implies the absence of interactions between the covariates. We used a Lagrange multiplier test (Chan et al., 2003) to validate this assumption in the case of each model equation. The test showed no significant interactions (Supplementary Table S2), suggesting that the general, fully additive model formulation used above is appropriate.
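The response variable defined above, the difference in log abundance between consecutive measurements, can be computed as in the following Python sketch. The function name and the example proportions are hypothetical and serve only to show how each response value is paired with the covariates measured at the start of its interval.

```python
import math


def log_change_rates(abundances):
    """Return log(x[t+1]) - log(x[t]) for each pair of consecutive measurements."""
    return [math.log(nxt) - math.log(cur)
            for cur, nxt in zip(abundances, abundances[1:])]


# Hypothetical relative abundances (percent) of one species at four time points.
props = [10.0, 20.0, 40.0, 20.0]

rates = log_change_rates(props)        # one value fewer than the input series
covariates = props[:-1]                # covariate at time t pairs with rates[t]

for x_t, delta in zip(covariates, rates):
    print(x_t, round(delta, 4))
```

A series of n measurements therefore yields n − 1 response values, each modeled as a function of the covariates at time t.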

To select the optimal model structure of each GAM, we used the GCV criterion (Wood, 2000). This criterion provides a measure of model goodness-of-fit, balancing prediction accuracy with model complexity, and the GAM structure that minimizes the GCV score is generally preferred. The model fitting algorithm provides approximate P-values indicating significance of model smooth terms. Final models were selected by successively removing nonsignificant terms from the full models, thereby minimizing the GCV scores (Supplementary Table S3).
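To make the trade-off between fit and complexity concrete, the sketch below evaluates the usual GCV score for a Gaussian additive model, n·RSS/(n − edf)², where RSS is the residual sum of squares and edf the total effective degrees of freedom. The residual sums of squares, degrees of freedom and number of observations used here are invented for illustration and are not values from this study.

```python
def gcv_score(rss, n_obs, total_edf):
    """GCV score n * RSS / (n - edf)^2 for a Gaussian additive model."""
    if total_edf >= n_obs:
        raise ValueError("effective degrees of freedom must be below n_obs")
    return n_obs * rss / (n_obs - total_edf) ** 2


# Hypothetical comparison: a full model versus one with a weak term removed.
full = gcv_score(rss=0.50, n_obs=26, total_edf=11.0)
reduced = gcv_score(rss=0.52, n_obs=26, total_edf=7.0)

print(full, reduced)
print("prefer reduced model" if reduced < full else "prefer full model")
```

In this made-up case, dropping a term raises the residual sum of squares only slightly while freeing several degrees of freedom, so the reduced model has the lower GCV score.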

Time-series simulation was carried out using the last observed values of the covariates (relative bacterial abundances and pH measured at 27 h) and predicting the next time-step using the fitted GAMs. For each step of the simulation, covariate values representing bacterial abundances were updated using model predictions from the previous time-step. Also, for each simulated step, errors were sampled randomly, with replacement, from the pertinent residual vectors from Equation (1). In order to maintain the contemporaneous correlation structure, sampled error terms for all three bacterial species always corresponded to the same observed time-step. Throughout the simulation, pH was kept relatively constant (around the final measured value of 5.41), but some variation was allowed by adding a random normal error term to the starting value, for each step, with μ=0 and σ=0.1. The same simulation was also carried out using starting values for pH within a 1 U range of the final measured value.
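The simulation scheme described above can be sketched in Python as follows. This is a simplified stand-in rather than the authors' implementation: the fitted GAMs are replaced by hypothetical toy functions, the residual vectors are invented, and the names `simulate`, `toy_models` and `toy_residuals` are assumptions. The point of interest is that a single residual index is drawn per step and shared by all species, which preserves the contemporaneous correlation of the errors.

```python
import random


def simulate(models, residuals, start, ph_start, n_steps, ph_sd=0.1, seed=0):
    """Iterate one-step predictions of log abundance with resampled residuals."""
    rng = random.Random(seed)
    n_resid = len(next(iter(residuals.values())))
    state = dict(start)                      # current log abundances
    path = [dict(state)]
    for _ in range(n_steps):
        # pH: starting value plus an independent normal error at every step.
        ph = ph_start + rng.gauss(0.0, ph_sd)
        # One shared time index, so all species get residuals from the same step.
        t = rng.randrange(n_resid)
        state = {sp: state[sp] + models[sp](state, ph) + residuals[sp][t]
                 for sp in models}
        path.append(dict(state))
    return path


# Hypothetical stand-ins for the fitted models: each returns a predicted change
# in log abundance given the current state and pH.
toy_models = {
    "E": lambda s, ph: -0.2 * (s["E"] - 3.0) + 0.05 * (ph - 5.41),
    "B": lambda s, ph: -0.2 * (s["B"] - 3.5),
    "L": lambda s, ph: -0.2 * (s["L"] - 2.5) - 0.05 * (ph - 5.41),
}
# Invented residual vectors; index t refers to the same time step in each list.
toy_residuals = {
    "E": [0.02, -0.01, 0.00],
    "B": [-0.01, 0.01, 0.00],
    "L": [-0.01, 0.00, 0.00],
}
start = {"E": 3.0, "B": 3.5, "L": 2.5}

path = simulate(toy_models, toy_residuals, start, ph_start=5.41, n_steps=1000)
print(len(path), path[-1])
```

Sampling the residuals independently for each species would be simpler, but it would discard the negative correlation between species errors that the shared index retains.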

Results

During the fermentation experiment, 27 cell samples were collected at hourly intervals (28 samples counting the starter culture). Relative abundances of the three species present in the 28 samples were estimated, providing us with time-series data for the duration of the process. Figure 2a shows the fluctuating population proportions from the starter culture to the final sample collected from the fermentation chamber. Figure 2b shows pH, OD600 and metabolic data measured for the 27 collected samples.

Figure 2

Time-series data collected from the fermentation. (a) Proportions (in percent) of the three bacterial species in the initial inoculum (time=0) and the 27 fermentation samples. (b) pH, optical density at 600 nm (OD600) and metabolic data (volatile fatty acids, VFA) for the 27 sampling points. The curve labeled VFA shows scores along the first component of a partial least squares regression (PLSR) model with process references (response variables) and Fourier transform infrared (FT-IR) absorption measurements (explanatory variables), plotted against time. These scores are indicative of changing ratios of methylene/methyl content of metabolites in the culture (see text below and Supplementary Figures S3 and S4).

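In order to describe the successive dynamic structure of the microbial system under study, we fitted a GAM to each population time series. The following models (Equation (1)) were selected from the general class of full additive models according to the GCV criterion (Supplementary Table S3):

$$\begin{aligned}
\Delta E_t &= b_e + f_e(E_t) + g_e(B_t) + k_e(\mathrm{pH}_t) + \varepsilon_{e,t} \\
\Delta L_t &= b_l + g_l(B_t) + h_l(L_t) + k_l(\mathrm{pH}_t) + \varepsilon_{l,t} \\
\Delta B_t &= b_b + f_b(E_t) + g_b(B_t) + \varepsilon_{b,t}
\end{aligned} \qquad (1)$$

Here Et, Bt and Lt denote the relative abundances of E. coli, B. uniformis and L. salivarius, the subscripts e, l and b index the response species, and the retained smooth terms are those plotted in Figure 3.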

Model diagnostics showed that the residuals were white (Supplementary Figure S2) and approximately normally distributed. We assessed the goodness-of-fit of the models by checking the additivity assumption (Chan et al., 2003) for each equation, and there was no evidence of nonadditivity (Supplementary Table S2). Hence, the model provides a good fit to the data. The residuals were, however, contemporaneously, negatively correlated (Supplementary Table S4), that is, community competition is reflected in both the conditional mean and covariance structure.

Descriptive statistics for the three selected models are presented in Table 2, demonstrating highly significant effects of the smooth terms and major proportions of explained variance in the case of each model. Figure 3 summarizes the forms of the functional relationships between the relative proportion change rates and the independent covariates (log-relative abundances or pH) for the three species. The models indicate some interspecies interactions in the system. In the case of L. salivarius there is an apparent negative linear relation to increasing proportions of B. uniformis (Figure 3d), while E. coli and B. uniformis have a reciprocal and explicitly nonlinear interaction (Figures 3a and h). All three species display a density-dependent effect of some sort. For L. salivarius this effect is near-linear (Figure 3e), whereas for E. coli and B. uniformis the corresponding negative effects level off at high relative abundances (Figures 3b and g).

Table 2 Summary statistics for the three selected GAMs (Equation (1)), stating the degree of linearity (edf, see below) and significance of smooth terms

Figure 3

Panels a–h are plots of smooth terms from the generalized additive models (Equation (1)). These summarize the functional relationships between response variables and covariate terms for each model equation. The abscissae give the values of the covariate terms (log-relative abundances and pH). The ordinate axes indicate the partial additive effects of the covariate terms on the response variables ($\log \mathrm{proportion}_{t+1} - \log \mathrm{proportion}_t$). The predicted values of the responses are given as the sum of the smooth terms plus an intercept term. The broken lines are 95% confidence bands of the fitted curves.

E. coli, not unexpectedly, has a linear positive relationship with pH within the experimental range (Figure 3c), increasing toward neutral, which is optimal for the growth of this species. L. salivarius, being well adapted to an acidic environment, shows the opposite trend with a slightly curved positive relationship corresponding to declining pH, appearing to stabilize around 5.6 (Figure 3f). No significant effect of changing pH was found in the case of B. uniformis, indicating robustness to this environmental perturbation. OD600, representing total bacterial density, was not found to have a significant effect in any of the models.

The main metabolites expected in this fermentation are acetate, lactate, propionate and succinate. We thus focused on the FT-IR spectral region corresponding to fatty acid signatures (3000–2800 cm−1). A PLSR model was computed in order to explore correlations between spectral wavenumbers and the process reference variables (bacterial relative abundances, time, pH and OD600). The first model component was found to contain the bulk of relevant information (Figure 2b, Supplementary Figures S4 and S5), explaining 48% and 49% of X and Y variance, respectively. Along this component, time and OD600 correlate strongly with wavenumbers around 2925 and 2850 cm−1 (Supplementary Figure S4). These wavenumbers correspond to the antisymmetric and symmetric C-H stretch modes of methylene groups (Supplementary Figure S5). On the opposite side of the first component axis, pH correlates with wavenumbers around 2960 and 2870 cm−1 (Supplementary Figure S4). These regions are characteristic of the antisymmetric and symmetric C-H stretch modes of methyl groups (Supplementary Figure S5). The first model component also separates B. uniformis and L. salivarius (correlating with time, OD600 and methylene vibrations) from E. coli (correlating with pH and methyl vibrations). In general, methylene groups are associated with longer fatty acids (the groups being internal in molecules), while methyl groups are associated with shorter chains (being terminal in molecules). Our data clearly indicate a gradual shift toward fatty acids with more methylene groups throughout the fermentation (Figure 2b). This shift is also concurrent with increasing acidity and optical density in the batch culture.

Discussion

This study of a simple model system demonstrates how bacteria living in communities are affected by each other and by the physical environment surrounding them. While the interaction between L. salivarius and B. uniformis indicates a scenario of direct competition in terms of the effect of Bacteroides on Lactobacillus (Figure 3d), the converse effect is not observed. One likely explanation for this is cross-feeding between the two species. While both consume glucose (the primary nutrient of the growth medium), and compete for this nutrient, Bacteroides may also ferment lactate (Macy et al., 1978; Schultz and Breznak, 1979), the main metabolic by-product of L. salivarius. For this nutrient, there would be no competition between the two species, rendering the competitive relationship nonreciprocal. This may also explain, to some extent, the fermentation pattern observed in FT-IR spectroscopy, where both B. uniformis and L. salivarius correlate with progressive formation of methylene-containing fatty acids. In our system these would primarily be propionate and succinate produced by Bacteroides fermenting both glucose and lactate.

The interactions between B. uniformis and E. coli are more complex. In both cases, the smooth functions have a parabolic shape where the covariate species appears to have a positive effect on the response species up to a point where the relationship turns negative (Figures 3a and h). One positive feedback effect E. coli might have on B. uniformis growth is that of removing residual oxygen from the ambient growth medium (Hagen et al., 1982). At low E. coli abundances this might be beneficial to the strictly anaerobic Bacteroides species, but at a certain level of E. coli abundance the relationship may shift to one of direct competition, outweighing the benefits of oxygen removal and thus having a negative effect on B. uniformis growth. The reverse case is similar, but with a more pronounced positive relationship, leveling off and turning negative at some point of Bacteroides abundance. This is likely a result of Bacteroides metabolism feeding back on E. coli growth. Excessive amounts of lactate may be toxic to many bacteria, and lactic acid bacteria are known to inhibit growth of E. coli (Tomas et al., 2003). Bacteroides can, to some extent, remove this deleterious metabolite, fermenting it mainly to propionate, and thus relieve the inhibition. This scenario is also supported by the FT-IR spectroscopic data where Bacteroides shows the highest degree of correlation with the transition from nonmethylene (acetate and lactate) to methylene-containing (propionate and succinate) acids. In our system, however, the commensal relationship shifts at some point when direct competition for nutrients becomes the dominant phenomenon.

Altogether, species interactions appear to be an important factor in our system, and metabolic commensalism of the kind observed is most likely a general characteristic of bacterial communities (Bradshaw et al., 1994). Through variable selection according to the stated criteria, we were able to identify, as well as to quantify, significant interactions taking place between different bacteria, and between bacteria and the environment. The selected model formulation (Equation (1)) benefits from the simplifying assumption of additivity (that is, within each model equation the effect of a given covariate is not contingent on changing values of another covariate). This assumption is often rather strong, but it can greatly simplify both model fitting and interpretation. In this particular case, the additive formulation used in Equation (1) was found to be appropriate (Supplementary Table S2). As such, we were able to characterize the isolated effect of each covariate, providing a clear picture of the system's dynamics.

By applying statistical methods traditionally used by animal ecologists (for example, Ciannelli et al., 2005), we estimated the underlying density-dependent effects of all three bacterial species. In the case of L. salivarius, this effect is indicative of a simple metabolic feedback mechanism (Figure 3e). As for E. coli and B. uniformis, the density-dependent effects are nonlinear, leveling off at high proportions (Figures 3b and g). These nonlinearities may be caused by interaction between the two species. Since direct competition for nutrient substrates does not take place before some critical abundance is reached, negative density dependence is the dominant feedback mechanism up to that point. Once the point is reached where competition has a significant effect on the growth of the two species, the interspecies struggle for nutrients may dissipate the density-dependent effects.

It should be noted that since the population data presented in this study are from a batch culture experiment, our model (Equation (1)) does not necessarily reflect steady-state dynamics. The fact that no new resources were added to the culture during the period of data collection will most likely have had an effect on the level of competition between the bacteria. Thus, ascertaining the degree to which our model is influenced by transitional phase dynamics is difficult. However, when we simulated 1000 h of continued growth with the dynamic model estimated for our system of bacteria (given a growth system with continuous supply of nutrients and dilution), using Equation (1), the process remained quite stable throughout the simulation (Supplementary Figure S3). In the simulation, pH was represented by a constant term plus independent normal errors with zero mean and standard deviation of 0.1. Constant term values for pH within a 1 U range of the final measured value of 5.41 were tried, producing similar results. Thus, for a relatively stable pH, the simulation indicates that the fitted model (Equation (1)) may reflect a steady state.

It is hardly surprising that bacterial community members compete for growth limiting nutrients. The functional forms of such dynamic relationships, however, have not been thoroughly explored. Infections of, for example, the gastrointestinal system involve complex interactions among the pathogenic species, the resident microflora and the environment (the host), about which relatively little is currently known (Hooper and Gordon, 2001; Eckburg et al., 2005; Nicholson et al., 2005). The long-term dynamics (that is, stability, cyclicity or chaos) of pathogens and commensals will largely be determined by the density-dependent structure of the system. As shown here, density dependence can be heavily influenced by interspecies interactions. Such knowledge is important when it comes to treating and preventing the colonization of human and animal pathogens, and further experimentation will be needed in order to learn more about long-term infection dynamics. Even though in vitro systems, in most cases, are far from being representative of natural scenarios, such simplifications of the natural world may serve to increase our understanding of fundamental dynamic properties of microbial consortia, perhaps enabling us to predict the behavior of, and ultimately to manipulate such systems.