Like many other environments, Lake Mendota, WI, USA, is populated by many thousands of microbial species. Only about 1,000 of these constitute between 80 and 99% of the total microbial community, depending on the season, whereas the remaining species are rare. The functioning and resilience of the lake ecosystem depend on these microorganisms, and it is therefore important to understand their dynamics throughout the year. We propose a two-layered set of dynamic mathematical models that capture and interpret the yearly abundance patterns of the species within the metapopulation. The first layer analyzes the interactions between 14 subcommunities (SCs) that peak at different times of the year and together contain all species, whereas the second layer focuses on interactions between individual species and SCs. Each SC contains species from numerous families, genera, and phyla in strikingly different abundances. The dynamic models quantify the importance of environmental factors in shaping the dynamics of the lake’s metapopulation and reveal positive or negative interactions between species and SCs. Three environmental factors, namely temperature, ammonia/phosphorus, and nitrate+nitrite, positively affect almost all SCs, whereas the great majority of interactions between SCs are inhibitory. As far as the interactions can be independently validated, they are supported by literature information. The models are quite robust and permit predictions of species abundances over many years, both under the assumption that conditions do not change drastically and in response to environmental perturbations.
Lake Mendota, WI, is home to over 18,000 microbial Operational Taxonomic Units (OTUs). The OTUs are defined at the 97% 16S rRNA gene identity level and serve as proxies for species.1 Of these, 1,140 OTUs (5%) constitute between 80 and 99% of the total microbial community (Supplementary Figure S2), depending on the time of the year, whereas the remaining OTUs are rare. The OTU composition of the community changes markedly throughout the year, and the dynamics of these changes is an important determinant of the functionality of the lake. In particular, it has been shown that higher microbial species diversity is typically associated with more robust and resilient ecosystems.2 Thus, if the normal, healthy interaction dynamics could be quantified, then one could possibly develop tests, based on sentinels or early biomarkers, predicting ecosystem health or potential problems in the near future. The challenge is that the interaction dynamics of OTUs is difficult to assess owing to their sheer number, and because most of the microbes cannot be cultured in the laboratory.3 Simple arithmetic shows that potentially over 300,000,000 pairwise interactions would have to be considered, because the interactions can easily be ‘asymmetric’ in the sense that the effect of OTU-A on OTU-B is different from the reverse effect.
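The count of potential interactions can be checked with a short calculation. The sketch below assumes that "pairwise" means ordered pairs of distinct OTUs (because A→B and B→A are treated as different effects) and uses the rounded lower bound of 18,000 OTUs; the function name is illustrative only.

```python
def directed_pairs(n_otus: int) -> int:
    """Number of ordered pairs (A, B) with A != B among n_otus OTUs."""
    return n_otus * (n_otus - 1)


# Assumption: 18,000 is the rounded lower bound quoted in the text.
n_otus = 18_000
asymmetric = directed_pairs(n_otus)      # effect of A on B differs from B on A
symmetric = asymmetric // 2              # for comparison: unordered pairs

print(asymmetric)   # 323982000
print(symmetric)    # 161991000
```

Under this reading, the asymmetric case roughly doubles the number of interaction parameters relative to a symmetric model.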
Two approaches are currently used to infer the relationships among microbial species from 16S-rRNA amplicon data.4 The first establishes correlation networks that are based on the presence, absence, or abundance of the species across multiple locations or time points.5,
The second approach utilizes differential equations to reconstruct dynamic networks.11,
Here we propose slightly modified LV models, which become manageable owing to a novel manner of parameter estimation based on linear regression.23 The models capture not only the metapopulation dynamics of the more than 1,000 highly abundant species in Lake Mendota, but also the pairwise interactions between individual OTUs and other SCs. To the best of our knowledge, this is the first time that LV models of the magnitude addressed here are applied to a real-world system.
Yearly abundances of 14 subcommunities
The top 200 parametric instantiations of the SC model (see Materials and Methods) are able to capture the dynamic trends well (Figure 1). They also correlate well with the trends of the observed abundances. Although the figure only shows the abundances during 2 years, the models successfully run for at least 50 consecutive years, if the conditions do not change drastically (not shown).
Twelve of the 14 SCs peak once per year, whereas SC 13 and SC 14 peak twice (Figure 1 and Supplementary Figure S5). Supplementary Figure S5 shows that the fold-change profiles within SCs are very similar. It also reveals that the peaks are highly relevant, with an abundance that is 3- to 10-fold higher at the peak than the minimum abundance. Throughout the year, the total abundances of SC1–SC13 constitute 88.4–94.9% of the entire population (Supplementary Figure S6a).
Pairwise interactions between the 14 subcommunities
Using the best 200 model instantiations, we computed the means and s.d. of the parameter values (Figure 2). Among them, the estimated αij and βik values, when normalized (divided by − αii), are consistent with very small s.d. About two-thirds of all αij’s (62–67% in each model) are negative, which suggests strong competition between SCs (Supplementary Figure S7). Intriguingly, the terms βik×Tk are usually much larger than the corresponding terms αij×Xj: although Xj and Tk change over time, the median values of normalized αij×Xj and βik×Tk are 2.8 and 9.6, respectively. Expressed differently, the environmental conditions appear to have a greater effect (per unit of abundance) on the abundance of an SC than other SCs do, at least qualitatively (Figures 2 and 3).
Interestingly, the means of the αii values for SC 7, 8, and 9 are the smallest in magnitude (Supplementary Figure S7). This result may reflect that these SCs, which peak in July through September when the water temperature is highest and biomass is higher, have the lowest death rates due to ‘crowding.’
The αij matrix is asymmetrical, because interaction effects are not necessarily reciprocal. Among the pairs αij and αji, about 75%, 4%, and 21% are −/−, +/− or +/+, respectively. The number of positive αji values is smaller than in studies of communities growing in human or mouse gut or on spoiling pork,13,14,21 suggesting that the availability of food sources may affect the types of relationships differently within each habitat.
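The tally of reciprocal sign patterns can be illustrated with a small sketch. It assumes the interaction parameters are held in a square nested list `alpha`, where `alpha[i][j]` is the effect of SC j on SC i; the 3×3 values below are invented for illustration and are not estimates from the study.

```python
from collections import Counter


def classify_pairs(alpha):
    """Count sign patterns of (alpha[i][j], alpha[j][i]) over all pairs i < j."""
    counts = Counter()
    n = len(alpha)
    for i in range(n):
        for j in range(i + 1, n):
            a, b = alpha[i][j], alpha[j][i]
            if a < 0 and b < 0:
                counts["-/-"] += 1        # mutual inhibition
            elif a > 0 and b > 0:
                counts["+/+"] += 1        # mutual benefit
            elif a * b < 0:
                counts["+/-"] += 1        # one benefits, the other is inhibited
            else:
                counts["other"] += 1      # at least one effect is exactly zero
    return counts


# Hypothetical 3x3 matrix; diagonal entries are ignored by the tally.
alpha = [
    [-1.0, -0.2, 0.3],
    [-0.5, -1.0, 0.1],
    [0.4, -0.3, -1.0],
]
counts = classify_pairs(alpha)
print(dict(counts))
```

Dividing each count by the number of pairs, n(n−1)/2, would give percentages of the kind reported above.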
Except for SC 4, 11, 13 and 14, all SCs are positively affected by environmental conditions (Figures 3 and 4). Ammonia/phosphorus, which rapidly declines in April, negatively affects SC 4, which peaks in April. Nitrate+nitrite, which is low in November, negatively affects SC 11, which peaks in November. In SC 11, two of the five top OTUs belong to the family Oxalobacteraceae and one to the family ACK-M1. These families are either responsive to ammonia24 or have members that fix nitrogen25 (Supplementary Figure S10, Supplementary Table S5). Interestingly, SC 13 and 14 have very small βik values, indicating relative tolerance to variations in environmental conditions.
To summarize, the pairwise interactions between SCs are mostly negative, whereas the environmental effects on SCs are mostly positive.
Bacterial distribution within subcommunities
Almost all (18,642) of the identified OTUs were classified into 63 phyla; only 12 OTUs do not have a phylum classification. For each OTU, we computed the average abundance over all data points, and for each phylum, we summed the abundances for all OTUs, then ranked the phyla based on the total abundance. The top seven phyla account for 92.6–99.9% of the population. The dominant phyla are Actinobacteria in SC1, 3, 4, and 13; Proteobacteria in SC2, 7–11, and 14; and Bacteroidetes in SC5 and 12 (Table 1). Notwithstanding the dominance of particular phyla, each bacterial SC contains bacterial OTUs from a broad range of taxonomic groups. This result is not surprising, because each SC has to execute a wide array of tasks. It also reveals why clustering by taxonomy is not an effective strategy for characterizing the interaction dynamics in the lake.
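The ranking procedure (average per OTU, sum per phylum, sort) can be sketched as follows. The data structures, OTU labels, and abundance values are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict
from statistics import mean


def rank_phyla(abundance, phylum_of):
    """Sum the mean abundance of each OTU per phylum and rank phyla by total."""
    totals = defaultdict(float)
    for otu, series in abundance.items():
        totals[phylum_of[otu]] += mean(series)   # average over all data points
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical abundances at three time points for three OTUs.
abundance = {
    "otu_a": [10, 20, 30],
    "otu_b": [5, 5, 5],
    "otu_c": [1, 2, 3],
}
phylum_of = {
    "otu_a": "Actinobacteria",
    "otu_b": "Proteobacteria",
    "otu_c": "Actinobacteria",
}
ranking = rank_phyla(abundance, phylum_of)
print(ranking)
```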
Using the software PICRUSt26 and the Greengenes Database,27 we assigned KEGG functions to the OTUs present in the 14 SCs. In total, 42.2% of the total community were mapped to KEGG pathways. We observed specific enrichment of certain pathways in SCs, as shown in Supplementary Figure S13. Data are available at http://www.bst.bme.gatech.edu/research.php.
Abundances of individual OTUs
We assessed the abundances of individual OTUs using three models, as described in the Methods Section. Among the top 1,140 OTUs, 89.3% can be predicted successfully when the individual OTU is implemented as a new group and the parameters are reoptimized (Model #3; Supplementary Figure S14). Interestingly, the αij×Xj and βik×Tk terms of OTUs belonging to the same phylum, class, or order cluster together and are significantly different from random clusters (Supplementary Table S4). From the top 1,140 OTUs, we extracted 922 OTUs whose abundances are predicted best by Model #3. We found that the pairs αi,sc/αi,sc are often positively correlated, whereas the pairs βik/αi,sc are often negatively correlated (data not shown). This result suggests that the change in the abundance of an OTU is driven either by competition with other bacteria in the community or by positive influences from the environment. Examples of the dynamics of individual OTUs are given in Figure 5. The Supplementary Information provides further details.
The individual OTU–SC interaction network adds a second layer to our investigation. The first layer (SC model) captures pairwise interactions between SCs that reflect average effects contributed by all OTUs in each SC. At the second layer, OTU–SC interactions describe the effects of each SC and of the environmental conditions on an individual OTU. As an example, OTU#141903 (a member of the family Nitrosomonadaceae) has a large positive βi,2 value, which indicates that it is strongly and positively affected by ammonia. Although we cannot assign this OTU to a more specific taxonomic group, previous studies suggest that all cultivated representatives of this group are able to oxidize ammonia,28 which is consistent with our result. OTU#517152 (a member of the genus Roseomonas) has a small negative αii value and a large positive βi,2 value, suggesting that this species has a relatively low death rate and is strongly affected by temperature. Various members of this genus are well-studied aquatic organisms. They were described as slow growing29 and as growing better at 25–28 °C29 than in colder water, in some cases thriving at up to 42 °C.30 The Supplementary Information offers further discussions (Supplementary Table S6).
Among these results, we identified 33 OTUs with outliers in βik or αi,sc values and good abundance prediction results and searched the literature for evidence to support or reject our predictions. We found indirect evidence to support the prediction of 15 OTUs and evidence for one, suggesting that further investigation is needed (Supplementary Tables S7a,b). For the remaining OTUs, little is known about their characteristics. These results are summarized in the Supplementary Information. A table with notable interactions among SC-OTUs is available at http://www.bst.bme.gatech.edu/research.php.
Naturally occurring microbial consortia in lakes, and elsewhere, follow annual cycles, where species abundances are correlated with seasonal changes in environmental conditions.31,
Recent metagenomic sequencing technologies have revolutionized this line of investigation. However, while OTU abundances are informative, they do not by themselves convey the dynamics within a metapopulation, but require computational analysis. We perform such an analysis here with LV models (Supplementary Figures S8–S12 and S15, Supplementary Tables S1–S3 and S5). Our models suggest that the dynamics of OTUs can be described in terms of the parameters αij and βik, and that these parameters are biologically relevant, as they signify the strength and nature of interactions between OTU groups as competitive, parasitic, commensal, or neutral (Supplementary Figure S8). The interaction models of individual OTUs furthermore generate hypotheses about the importance of environmental factors and other bacterial groups on the growth of individual OTUs. The models could in principle be used to predict consequences of changes in OTU distribution, but it is unclear how to validate such predictions. For example, we used the SC model to test the effect of environmental conditions on the abundances of SCs (Supplementary Figure S11). Seven SCs (1, 3, 5, 10, 11, 13, and 14) were predicted to return to their normal abundance patterns when the disturbances ended. Other SCs were strongly affected by the environment and their abundance profiles did not recover even several years after the disturbances stopped.
In most other network models, the grouping of OTUs has been based on taxonomy,12,
Although the paper focuses on an aquatic metapopulation, it is easy to imagine that similar types of analyses could be applied to other microbial consortia that display periodic annual or daily patterns.
Materials and methods
The data were collected at 91 time points, from March 2000 to June 2011,31,32,37 and made publicly available at www.lter.limnology.wisc.edu.37,38 The dataset consists of abundance measurements, which were interpreted through 16S-sequences. Using the software Qiime,39 with 97% identity as a cutoff, and the Greengenes database27 (greengenes.lbl.gov), 18,696 OTUs were identified (see Supplementary Information for details).
Also measured were nineteen physical and chemical conditions of the lake, collected from 1995 to 2013;38 see references 6,40 and Supplementary Figure S1. Fourteen of these remain fairly constant, while water temperature, nitrate+nitrite, ammonia, total phosphorus unfiltered, dissolved reactive phosphorus, and dissolved reactive silica vary substantially over time.
In order to manage the large number of OTUs, we first followed conventional wisdom and clustered the OTUs by taxonomy (cf. references 12,
We therefore decided to cluster differently, based on the annual peak time for each OTU. For each OTU, the abundances throughout the years 2000–2011 were superimposed, which resulted in a single, ‘collective 1-year period.’ The results were smoothed by computing the mean value of each 30-day window (Supplementary Figure S4). These smoothed profiles reflect the seasonal changes in abundances well. We omitted from clustering OTUs with only one observed data point and OTUs whose abundances were indicated by the smoothed curves to be zero.
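The folding and smoothing step can be sketched as below. This is a minimal illustration that assumes non-overlapping 30-day bins indexed by day of year; the study may have used a different windowing scheme (for example, a sliding window), and the function name and sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date


def collective_year_profile(samples, window_days=30):
    """Fold (date, abundance) samples onto one year and average per window.

    Returns {window_index: mean abundance}, where window 0 covers days 1-30.
    """
    bins = defaultdict(list)
    for day, abundance in samples:
        day_of_year = day.timetuple().tm_yday            # 1..366, year discarded
        bins[(day_of_year - 1) // window_days].append(abundance)
    return {window: sum(vals) / len(vals) for window, vals in sorted(bins.items())}


# Hypothetical samples from different years; the two March samples share a window.
samples = [
    (date(2000, 3, 15), 4.0),
    (date(2005, 3, 20), 6.0),
    (date(2008, 7, 10), 12.0),
]
profile = collective_year_profile(samples)
print(profile)   # {2: 5.0, 6: 12.0}
```

Windows without any sample are simply absent from the result, which mirrors the need to handle OTUs with very few observations separately.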
For each OTU profile, we identified the positions of the top one or two abundance peaks and then clustered OTUs based on these peak profiles. This analysis resulted in 13 groups plus one additional group for all remaining OTUs. We refer to these 14 groups as subcommunities (SCs).
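One way to picture the peak-based grouping is the sketch below. It assumes each smoothed profile is a list of 12 values treated as circular (December is adjacent to January) and groups OTUs whose top peaks fall in exactly the same windows. The actual clustering criterion used to obtain the 13 groups is not specified here, so this exact-match rule is an assumption made for illustration.

```python
from collections import defaultdict


def top_peaks(profile, max_peaks=2):
    """Indices of up to max_peaks highest strict local maxima of a circular profile."""
    n = len(profile)
    peaks = [
        i for i in range(n)
        if profile[i] > profile[i - 1] and profile[i] > profile[(i + 1) % n]
    ]
    peaks.sort(key=lambda i: profile[i], reverse=True)   # highest peaks first
    return tuple(sorted(peaks[:max_peaks]))              # report in calendar order


def group_by_peaks(profiles):
    """Group OTU names by the tuple of their top peak positions."""
    groups = defaultdict(list)
    for otu, profile in profiles.items():
        groups[top_peaks(profile)].append(otu)
    return dict(groups)


# Hypothetical smoothed profiles (12 windows per year).
profiles = {
    "otu_a": [1, 2, 9, 3, 1, 1, 1, 1, 1, 1, 1, 1],
    "otu_b": [1, 3, 8, 2, 1, 1, 1, 1, 1, 1, 1, 1],
    "otu_c": [1, 1, 1, 1, 5, 1, 1, 1, 1, 7, 1, 1],
}
groups = group_by_peaks(profiles)
print(groups)
```

OTUs with flat profiles have no strict local maximum and end up under the empty tuple, which plays the role of a catch-all group in this sketch.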
We chose water temperature and two chemical conditions (ammonia and total nitrate+nitrite) that follow distinct annual patterns. The patterns of other chemical conditions were omitted because they were highly related to the chosen patterns (Supplementary Figure S1a). The data were processed similarly to the abundance data. Their variability over the years fell within ranges of the mean±s.d. of the observed data, superimposed onto one ‘typical year’ (Supplementary Figure S1b).
In our modeling format, Xi is the abundance of an OTU or SC i. The interactions between Xi and other Xj’s and with environmental conditions Tk are described through product terms, which have their origin in mass action kinetics.41 The model takes the form:
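A form consistent with this description is given below. It is a reconstruction under two assumptions: that every term carries the factor Xi (so that each interaction is a product term), and that no separate growth term is present. The upper limits n (number of SCs or OTU groups) and m (number of environmental conditions) are notation introduced here for illustration.

```latex
\dot{X}_i \;=\; X_i \left( \sum_{j=1}^{n} \alpha_{ij}\, X_j \;+\; \sum_{k=1}^{m} \beta_{ik}\, T_k \right) \qquad (1)
```

In the SC model described in this paper, n would be 14 and m would be 3 (temperature, ammonia, and nitrate+nitrite).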
Ẋi is the rate of change of variable i, and the indexed parameters α and β indicate the type and strength of an interaction between pairs of OTUs or between OTUs and the environment, respectively. We use this structure to represent interactions among the 14 bacterial SCs and among individual OTUs and SCs. The quality of results is assessed with two similarity scores (see Supplementary Information).
Ignoring the less interesting situation that Xi=0, Equation (1) can be rewritten as
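Assuming that Equation (1) consists solely of the product terms αij Xi Xj and βik Xi Tk, as the verbal description suggests, dividing both sides by Xi would give the following form (n and m are illustrative symbols for the numbers of groups and of environmental conditions):

```latex
\frac{\dot{X}_i}{X_i} \;=\; \sum_{j=1}^{n} \alpha_{ij}\, X_j \;+\; \sum_{k=1}^{m} \beta_{ik}\, T_k
```

The right-hand side is linear in the unknown parameters αij and βik once the abundances Xj and the conditions Tk are known at each time point.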
If abundances and slopes can be determined from the time courses of all SCs, this equation becomes an algebraic system of linear equations.23,42,
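The regression idea can be sketched in a few lines. The example assumes the rewritten model has the form slope_i / X_i = Σ αij Xj + Σ βik Tk, that slopes have already been estimated from smoothed time courses, and that ordinary least squares via the normal equations is an acceptable stand-in for the estimation procedure of reference 23. All function names and numbers are hypothetical, and the solver is written with the standard library only.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]       # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):                          # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares(rows, targets):
    """Minimise the squared residual of rows * p = targets (normal equations)."""
    k = len(rows[0])
    ata = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    aty = [sum(r[p] * y for r, y in zip(rows, targets)) for p in range(k)]
    return solve_linear(ata, aty)


def fit_row(X, T, slopes_i, x_i):
    """Estimate [alpha_i1..alpha_in, beta_i1..beta_im] for one group i."""
    rows = [xs + ts for xs, ts in zip(X, T)]              # regressors per time point
    targets = [s / x for s, x in zip(slopes_i, x_i)]      # slope divided by X_i
    return least_squares(rows, targets)


# Synthetic check: two groups, one environmental factor, four time points.
X = [[1.0, 2.0], [2.0, 1.0], [1.5, 3.0], [0.5, 0.8]]
T = [[10.0], [12.0], [14.0], [9.0]]
true_params = [-0.5, -0.1, 0.05]                          # alpha_11, alpha_12, beta_11
x_i = [row[0] for row in X]
slopes_i = [
    xi * sum(p * v for p, v in zip(true_params, xs + ts))
    for xi, xs, ts in zip(x_i, X, T)
]
params = fit_row(X, T, slopes_i, x_i)
print([round(p, 6) for p in params])
```

With real, noisy data a library routine such as `numpy.linalg.lstsq` would usually be preferable to hand-written normal equations, which can be poorly conditioned.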
Predicting the abundances of individual OTUs
We also used the model to predict the abundances of individual OTUs. Formally, the model has exactly the same format as in Equation (1). However, to test whether environmental conditions alone could model the data (Model #1), all αij parameters were set to zero, except for αii, and the βik values were re-estimated. For Model #2, αij and βik values were chosen from the filtered parameter values described in the Supplementary Information. For Model #3, we removed the OTU of interest from its SC and considered it as a new group. The αij and βik values were then re-estimated for individual OTUs, based on 100 values for αii selected from the range [−1, 0]. The goodness of fit was evaluated with similarity scores (see Supplementary Information).
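The scan over αii can be illustrated with a deliberately simplified sketch. It assumes the Model #1 structure (only αii and a single environmental coefficient β), takes 100 evenly spaced αii values in [−1, 0], and ranks candidates by the sum of squared errors. The study selected fits with its own similarity scores and, in Model #3, re-estimated additional αij terms, so the selection rule, the even spacing, and all names and numbers here are assumptions.

```python
def fit_beta(t_values, y_values):
    """One-parameter least squares through the origin: y is approximated by beta * t."""
    numerator = sum(t * y for t, y in zip(t_values, y_values))
    return numerator / sum(t * t for t in t_values)


def scan_alpha_ii(x, slopes, t, n_grid=100):
    """Try n_grid values of alpha_ii in [-1, 0]; return the best (sse, alpha_ii, beta)."""
    best = None
    for g in range(n_grid):
        alpha_ii = -1.0 + g / (n_grid - 1)                # evenly spaced grid
        # Move the fixed crowding term alpha_ii * X_i to the left-hand side.
        y = [s / xi - alpha_ii * xi for s, xi in zip(slopes, x)]
        beta = fit_beta(t, y)                             # re-estimate beta
        sse = sum((yi - beta * ti) ** 2 for yi, ti in zip(y, t))
        if best is None or sse < best[0]:
            best = (sse, alpha_ii, beta)
    return best


# Synthetic data generated with alpha_ii = -0.3 and beta = 0.05.
x = [1.0, 2.0, 3.0, 1.5]                                  # abundance of the OTU
t = [10.0, 12.0, 9.0, 15.0]                               # one environmental factor
slopes = [xi * (-0.3 * xi + 0.05 * ti) for xi, ti in zip(x, t)]
best_sse, best_alpha, best_beta = scan_alpha_ii(x, slopes, t)
print(round(best_alpha, 3), round(best_beta, 3))
```

Because the grid does not contain −0.3 exactly, the recovered αii is the nearest grid value and β shifts slightly to compensate.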
This work was funded by grant DEB-1241046 of the National Science Foundation, USA. The authors are grateful to Luis-Miguel Rodriguez Rojas and Despina Tsementzi for their help with the OTU identification, beneficial discussions, and valuable feedback.