
# Dynamic models of the complex microbial metapopulation of Lake Mendota

• npj Systems Biology and Applications 2, Article number: 16007 (2016)
• doi:10.1038/npjsba.2016.7

## Abstract

Like many other environments, Lake Mendota, WI, USA, is populated by many thousands of microbial species. Only about 1,000 of these constitute between 80 and 99% of the total microbial community, depending on the season, whereas the remaining species are rare. The functioning and resilience of the lake ecosystem depend on these microorganisms, and it is therefore important to understand their dynamics throughout the year. We propose a two-layered set of dynamic mathematical models that capture and interpret the yearly abundance patterns of the species within the metapopulation. The first layer analyzes the interactions between 14 subcommunities (SCs) that peak at different times of the year and together contain all species, whereas the second layer focuses on interactions between individual species and SCs. Each SC contains species from numerous families, genera, and phyla in strikingly different abundances. The dynamic models quantify the importance of environmental factors in shaping the dynamics of the lake’s metapopulation and reveal positive or negative interactions between species and SCs. Three environmental factors, namely temperature, ammonia/phosphorus, and nitrate+nitrite, positively affect almost all SCs, whereas the large majority of interactions between SCs are inhibitory. As far as the interactions can be independently validated, they are supported by literature information. The models are quite robust and permit predictions of species abundances over many years, both under the assumption that conditions do not change drastically and in response to environmental perturbations.

## Introduction

Lake Mendota, WI, is home to over 18,000 microbial Operational Taxonomic Units (OTUs). The OTUs are defined at the 97% 16S rRNA gene identity level and serve as proxies for species.1 Of these, 1,140 OTUs (5%) constitute between 80 and 99% of the total microbial community (Supplementary Figure S2), depending on the time of the year, whereas the remaining OTUs are rare. The OTU composition of the community changes markedly throughout the year, and the dynamics of these changes are an important determinant of the functionality of the lake. In particular, it has been shown that higher microbial species diversity is typically associated with more robust and resilient ecosystems.2 Thus, if the normal, healthy interaction dynamics could be quantified, one could possibly develop tests, based on sentinels or early biomarkers, that predict ecosystem health or potential problems in the near future. The challenge is that the interaction dynamics of OTUs are difficult to assess owing to their sheer number, and because most of the microbes cannot be cultured in the laboratory.3 Simple algebra says that potentially over 300,000,000 pairwise interactions would have to be considered, because the interactions can easily be ‘asymmetric’ in the sense that the effect of OTU-A on OTU-B is different from the reverse effect.
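As a rough check of this figure, assume N ≈ 18,000 OTUs (the rounded count given above) and count every ordered pair (i, j) with i ≠ j as a separate interaction, since the effect of OTU-A on OTU-B may differ from the reverse effect:

$$N(N-1) = 18{,}000 \times 17{,}999 = 323{,}982{,}000 \approx 3.2 \times 10^{8}.$$

Counting unordered pairs instead would halve this number, which is why asymmetry matters for the size of the problem.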

Two approaches are currently used to infer the relationships among microbial species from 16S-rRNA amplicon data.4 The first establishes correlation networks that are based on the presence, absence, or abundance of the species across multiple locations or time points.5,​6,​7,​8,​9 The vertices represent species, whereas the edges represent either pairwise or complex relationships. Pairwise interactions are typically characterized with a similarity index or a modified Pearson Correlation Coefficient (PCC),5,​6,​7,​8,​9 while complex relationships are derived from regression or rule-based networks.4,10 Although static correlation networks can address large and complex communities of thousands of species across multiple environments,5,10 they do not capture potentially important dynamic trends and typically ignore the asymmetry of relationships between species.

The second approach utilizes differential equations to reconstruct dynamic networks.11,​12,​13,​14,​15,​16,​17,​18,​19,​20,​21 These equations often include terms that describe growth and decay, pairwise interactions between species, and the effects of nutrients or environment. Most of these approaches have been linear owing to the ease of parameter estimation. Among the nonlinear approaches, the Lotka–Volterra (LV) model has been used extensively,12,​13,​14,​15,17,19,​20,​21 because it is easily interpreted and allows the incorporation of time-dependent external perturbation.22 The main challenge of this approach is the estimation of parameters.

Here we propose slightly modified LV models, which become manageable owing to a novel manner of parameter estimation based on linear regression.23 The models capture not only the metapopulation dynamics of the more than 1,000 highly abundant species in Lake Mendota, but also the pairwise interactions between individual OTUs and other SCs. To the best of our knowledge, this is the first time that LV models of the magnitude addressed here are applied to a real-world system.

## Results

### Yearly abundances of 14 subcommunities

The top 200 parametric instantiations of the SC model (see Materials and Methods) are able to capture the dynamic trends well (Figure 1). They also correlate well with the trends of the observed abundances. Although the figure only shows the abundances during 2 years, the models successfully run for at least 50 consecutive years, if the conditions do not change drastically (not shown).

Twelve of the 14 SCs peak once per year, whereas SC 13 and SC 14 peak twice (Figure 1 and Supplementary Figure S5). Supplementary Figure S5 shows that the fold-change profiles within SCs are very similar. It also reveals that the peaks are highly relevant, with an abundance that is 3- to 10-fold higher at the peak than the minimum abundance. Throughout the year, the total abundances of SC1–SC13 constitute 88.4–94.9% of the entire population (Supplementary Figure S6a).

### Pairwise interactions between the 14 subcommunities

Using the best 200 model instantiations, we computed the means and s.d. of the parameter values (Figure 2). Among them, the estimated αij and βik values, when normalized (divided by αii), are consistent with very small s.d. About two-thirds of all αij’s (62–67% in each model) are negative, which suggests strong competition between SCs (Supplementary Figure S7). Intriguingly, the terms βik×Xk are usually much higher than the corresponding terms αij×Xj: although Xj and Xk change over time, the median values of normalized αij×Xj and βik×Xk are 2.8 and 9.6, respectively. Expressed differently, the environmental conditions appear to have a greater effect (per unit of abundance) on the abundance of a SC than other SCs, at least qualitatively (Figures 2 and 3).

Interestingly, the means of the αii values for SC 7, 8, and 9 are the smallest in magnitude (Supplementary Figure S7). This result may reflect that these SCs, which peak in July through September when the water temperature is highest and biomass is higher, have the lowest death rates due to ‘crowding.’

The αij matrix is asymmetrical, because interaction effects are not necessarily reciprocal. Among the pairs αij and αji, about 75%, 4%, and 21% are −/−, +/− or +/+, respectively. The number of positive αji values is smaller than in studies of communities growing in human or mouse gut or on spoiling pork,13,14,21 suggesting that the availability of food sources may affect the types of relationships differently within each habitat.
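The sign classification of reciprocal pairs can be illustrated with a minimal Python sketch. The function name `classify_pairs` and the 3×3 matrix are hypothetical and chosen only for illustration; they are not the estimated parameters of the SC model.

```python
from collections import Counter

def classify_pairs(alpha):
    """Count sign patterns of (alpha[i][j], alpha[j][i]) over all pairs i < j."""
    counts = Counter()
    n = len(alpha)
    for i in range(n):
        for j in range(i + 1, n):
            a, b = alpha[i][j], alpha[j][i]
            if a == 0 or b == 0:
                label = "0"      # at least one of the two effects is absent
            elif a < 0 and b < 0:
                label = "-/-"    # mutual inhibition
            elif a > 0 and b > 0:
                label = "+/+"    # mutual benefit
            else:
                label = "+/-"    # one side benefits, the other is inhibited
            counts[label] += 1
    return counts

# Hypothetical interaction matrix; diagonal entries (alpha_ii) are not paired.
alpha = [
    [-0.5, -0.2,  0.1],
    [-0.3, -0.4,  0.2],
    [-0.1,  0.3, -0.6],
]
counts = classify_pairs(alpha)
total = sum(counts.values())
fractions = {label: c / total for label, c in counts.items()}
```

Dividing each count by the number of pairs gives fractions comparable to the percentages quoted above.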

Except for SC 4, 11, 13 and 14, all SCs are positively affected by environmental conditions (Figures 3 and 4). Ammonia/phosphorus, which rapidly declines in April, negatively affects SC 4, which peaks in April. Nitrate+nitrite, which is low in November, negatively affects SC 11, which peaks in November. In SC 11, two of the five top OTUs belong to the family Oxalobacteraceae and one to the family ACK-M1. These families are either responsive to ammonia24 or have members that fix nitrogen25 (Supplementary Figure S10, Supplementary Table S5). Interestingly, SC 13 and 14 have very small βik values, indicating relative tolerance to variations in environmental conditions.

To summarize, the pairwise interactions between SCs are mostly negative, whereas the environmental effects on SCs are mostly positive.

### Bacterial distribution within subcommunities

Almost all (18,642) of the identified OTUs were classified into 63 phyla; only 12 OTUs do not have a phylum classification. For each OTU, we computed the average abundance over all data points, and for each phylum, we summed the abundances for all OTUs, then ranked them based on the total abundance. The top seven phyla, accounting for 92.6–99.9% of the population are: Actinobacteria, dominant in SC1, 3, 4, and 13; Proteobacteria in SC2, 7–11 and 14; and Bacteroidetes in SC5 and 12 (Table 1). Notwithstanding the dominance of particular phyla, each bacterial SC contains bacterial OTUs from a broad range of taxonomic groups. This result is not surprising, because each SC has to execute a wide array of tasks. It also reveals why clustering by taxonomy is not an effective strategy for characterizing the interaction dynamics in the lake.

Using the software PICRUSt26 and the Greengenes Database,27 we assigned KEGG functions to the OTUs present in the 14 SCs. In total, 42.2% of the total community were mapped to KEGG pathways. We observed specific enrichment of certain pathways in SCs, as shown in Supplementary Figure S13. Data are available at http://www.bst.bme.gatech.edu/research.php.

### Abundances of individual OTUs

We assessed the abundances of individual OTUs using three models, as described in the Materials and Methods section. Among the top 1,140 OTUs, 89.3% can be predicted successfully when the individual OTU is implemented as a new group and the parameters are reoptimized (Model #3; Supplementary Figure S14). Interestingly, the αij×Xj and βik×Tk terms of OTUs belonging to the same phylum, class, or order cluster together and are significantly different from random clusters (Supplementary Table S4). For the top 1,140 OTUs, we extracted 922 OTUs whose abundances are predicted best by Model #3. We found that the pairs αi,sc/αi,sc are often positively correlated, whereas the pairs βik/αi,sc are often negatively correlated (data not shown). This result suggests that the change in the abundance of an OTU is driven either by competition with other bacteria in the community or by positive influences from the environment. Examples of the dynamics of individual OTUs are given in Figure 5. The Supplementary Information provides further details.

The individual OTU–SC interaction network adds a second layer to our investigation. The first layer (SC model) captures pairwise interactions between SCs that reflect average effects contributed by all OTUs in each SC. At the second layer, OTU–SC interactions describe the effects of each SC and of the environmental conditions on an individual OTU. As an example, OTU#141903 (a member of the family Nitrosomonadaceae) has a large positive βi,2 value, which indicates that it is strongly and positively affected by ammonia. Although we cannot assign this OTU to a more specific taxonomic group, previous studies suggest that all cultivated representatives of this group are able to oxidize ammonia,28 which is consistent with our result. OTU#517152 (a member of the genus Roseomonas) has a small negative αii value and a large positive βi,2 value, suggesting that this species has a relatively low death rate and is strongly affected by temperature. Various members of this genus are well-studied aquatic organisms. They have been described as slow growing29 and as growing better at 25–28 °C29 than in colder water, in some cases thriving at up to 42 °C.30 The Supplementary Information offers further discussion (Supplementary Table S6).

Among these results, we identified 33 OTUs with outliers in βik or αi,sc values and good abundance prediction results and searched the literature for evidence to support or reject our predictions. We found indirect evidence to support the prediction of 15 OTUs and evidence for one, suggesting that further investigation is needed (Supplementary Tables S7a,b). For the remaining OTUs, little is known about their characteristics. These results are summarized in the Supplementary Information. A table with notable interactions among SC-OTUs is available at http://www.bst.bme.gatech.edu/research.php.

## Discussion

Naturally occurring microbial consortia in lakes, and elsewhere, follow annual cycles, where species abundances are correlated with seasonal changes in environmental conditions.31,32,33,34 It is important to understand these dynamics because they are, without doubt, associated with the health of the ecosystem.

Recent metagenomic sequencing technologies have revolutionized this line of investigation. However, while OTU abundances are informative, they do not by themselves convey the dynamics within a metapopulation, but require computational analysis. We perform such an analysis here with LV models (Supplementary Figures S8–S12 and S15, Supplementary Tables S1–S3 and S5). Our models suggest that the dynamics of OTUs can be described in terms of the parameters αij and βik, and that these parameters are biologically relevant, as they signify the strength and nature of interactions between OTU groups as competitive, parasitic, commensal, or neutral (Supplementary Figure S8). The interaction models of individual OTUs furthermore generate hypotheses about the importance of environmental factors and other bacterial groups on the growth of individual OTUs. The models could in principle be used to predict consequences of changes in OTU distribution, but it is unclear how to validate such predictions. For example, we used the SC model to test the effect of environmental conditions on the abundances of SCs (Supplementary Figure S11). Seven SCs (1, 3, 5, 10, 11, 13, and 14) were predicted to return to their normal abundance patterns when the disturbances ended. Other SCs were strongly affected by the environment and their abundance profiles did not recover even several years after the disturbances stopped.

In most other network models, the grouping of OTUs has been based on taxonomy,12,​13,​14,35 resulting in very large networks with millions of pairwise interactions that are difficult to manage. In contrast, our model captures the bacterial dynamics in the lake at the levels of SCs and individual OTUs. This approach succeeded due to the grouping of OTUs into SCs based on their abundance peak times and to our novel estimation strategy. Notably, the OTUs in each SC are taxonomically very diverse at the species and genus levels, suggesting that taxonomically related OTUs are distributed over SCs throughout the year, such that each SC contains representatives of all functionally important taxonomic genera. Horizontal gene transfer, which is frequent in the microbial world and often accounts for the functional redundancy among phyla,36 is likely to contribute to the widely distributed abundances.

Although the paper focuses on an aquatic metapopulation, it is easy to imagine that similar types of analyses could be applied to other microbial consortia that display periodic annual or daily patterns.

## Materials and methods

### Data

The data were collected at 91 time points, from March 2000 to June 2011,31,32,37 and made publicly available at www.lter.limnology.wisc.edu.37,38 The dataset consists of abundance measurements, which were interpreted through 16S-sequences. Using the software QIIME,39 with 97% identity as a cutoff, and the Greengenes database27 (greengenes.lbl.gov), 18,696 OTUs were identified (see Supplementary Information for details).

Also measured were nineteen physical and chemical conditions of the lake, collected from 1995 to 2013;38 see references 6,40 and Supplementary Figure S1. Fourteen of these remain fairly constant, while water temperature, nitrate+nitrite, ammonia, total phosphorus unfiltered, dissolved reactive phosphorus, and dissolved reactive silica vary substantially over time.

### Data processing

In order to manage the large number of OTUs, we first followed conventional wisdom and clustered the OTUs by taxonomy (cf. references 12,​13,​14,35). Specifically, we identified the top seven phyla, but found that their abundance profiles varied widely among OTUs within the phyla (Supplementary Figures S3 and S6b). Grouping by order, class, or genus yielded similar results. In spite of extensive efforts, none of these taxonomic clustering modalities led to new insights or interesting results.

We therefore decided to cluster differently, based on the annual peak time for each OTU. For each OTU, the abundances throughout the years 2000–2011 were superimposed, which resulted in a single, ‘collective 1-year period.’ The results were smoothed by computing the mean value of each 30-day window (Supplementary Figure S4). These smoothed profiles reflect the seasonal changes in abundances well. We omitted from clustering OTUs with only one observed data point and OTUs whose abundances were indicated by the smoothed curves to be zero.
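The folding and smoothing step can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `collective_year_profile` is hypothetical, the 30-day windows are taken to be non-overlapping bins of the day of the year (the text does not say whether the windows slide), and the sample values are invented.

```python
from collections import defaultdict
from datetime import date

def collective_year_profile(samples, window=30):
    """Fold (date, abundance) samples from many years onto one year and
    return the mean abundance per non-overlapping window of `window` days."""
    bins = defaultdict(list)
    for day, abundance in samples:
        day_of_year = day.timetuple().tm_yday       # 1..366
        bins[(day_of_year - 1) // window].append(abundance)
    # Windows without observations are simply absent from the result.
    return {b: sum(v) / len(v) for b, v in sorted(bins.items())}

# Hypothetical observations of one OTU in different years
samples = [
    (date(2000, 1, 10), 2.0),
    (date(2005, 1, 20), 4.0),
    (date(2003, 7, 4), 10.0),
]
profile = collective_year_profile(samples)
```

Here the two January samples from different years fall into the same window and are averaged, which is the sense in which the years are superimposed.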

For each OTU profile, we identified the positions of the top one or two abundance peaks and then clustered OTUs based on these peak profiles. This analysis resulted in 13 groups plus one additional group for all remaining OTUs. We refer to these 14 groups as subcommunities (SCs).
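One simple way to group OTUs by peak position is sketched below. The clustering procedure actually used is not specified here, so everything in this block is an assumption: the helper names `peak_bins` and `group_by_peaks`, the use of cyclic local maxima, the `min_ratio` threshold for accepting a second peak, and the toy profiles.

```python
from collections import defaultdict

def peak_bins(profile, max_peaks=2, min_ratio=0.5):
    """Return the indices of up to `max_peaks` local maxima of a cyclic
    profile; a secondary peak is kept only if it reaches `min_ratio`
    times the height of the highest peak (hypothetical threshold)."""
    n = len(profile)
    maxima = [
        i for i in range(n)
        if profile[i] > profile[i - 1] and profile[i] > profile[(i + 1) % n]
    ]
    if not maxima:
        return ()
    maxima.sort(key=lambda i: profile[i], reverse=True)
    top = profile[maxima[0]]
    kept = [i for i in maxima[:max_peaks] if profile[i] >= min_ratio * top]
    return tuple(sorted(kept))

def group_by_peaks(profiles):
    """Group OTU identifiers by their tuple of peak positions."""
    groups = defaultdict(list)
    for otu, profile in profiles.items():
        groups[peak_bins(profile)].append(otu)
    return dict(groups)

# Hypothetical smoothed profiles with 12 windows per year
profiles = {
    "otu_a": [1, 2, 8, 3, 1, 1, 1, 1, 1, 1, 1, 1],
    "otu_b": [1, 3, 9, 2, 1, 1, 1, 1, 1, 1, 1, 1],
    "otu_c": [1, 1, 6, 1, 1, 1, 1, 1, 5, 1, 1, 1],
}
groups = group_by_peaks(profiles)
```

In this toy example the two single-peak OTUs share a group, while the OTU with two peaks forms its own group, mirroring the distinction between SCs that peak once and those that peak twice.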

We chose water temperature and two chemical conditions (ammonia and total nitrate+nitrite) that follow distinct annual patterns. The patterns of other chemical conditions were omitted because they were highly related to the chosen patterns (Supplementary Figure S1a). The data were processed similarly to the abundance data. Their variability over the years fell within ranges of the mean±s.d. of the observed data, superimposed onto one ‘typical year’ (Supplementary Figure S1b).

### Model

In our modeling format, Xi is the abundance of an OTU or SC i. The interactions between Xi and other Xj’s and with environmental conditions Tk are described through product terms, which have their origin in mass action kinetics.41 The model takes the form:

$$\dot{X}_i = \sum_{j=1}^{n} \alpha_{ij} X_i X_j + \sum_{k=1}^{m} \beta_{ik} X_i T_k. \tag{1}$$

$\dot{X}_i$ is the rate of change of variable Xi, and the indexed parameters α and β indicate the type and strength of an interaction between pairs of OTUs or between OTUs and the environment, respectively. We use this structure to represent interactions among the 14 bacterial SCs and among individual OTUs and SCs. The quality of results is assessed with two similarity scores (see Supplementary Information).
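To make the model structure concrete, the following Python sketch integrates Equation (1) numerically. It is only an illustration: the function names `lv_rhs` and `simulate`, the choice of a fourth-order Runge-Kutta scheme, and all parameter values are assumptions, not the settings used for the lake data.

```python
def lv_rhs(x, t_env, alpha, beta):
    """Right-hand side of Equation (1) for every variable X_i."""
    return [
        x[i] * (sum(a * xj for a, xj in zip(alpha[i], x))
                + sum(b * tk for b, tk in zip(beta[i], t_env)))
        for i in range(len(x))
    ]

def simulate(x0, env, alpha, beta, dt, steps):
    """Integrate Equation (1) with classical Runge-Kutta (RK4).
    `env(t)` returns the list of environmental conditions T_k at time t."""
    x, t = list(x0), 0.0
    trajectory = [list(x)]
    for _ in range(steps):
        k1 = lv_rhs(x, env(t), alpha, beta)
        x2 = [xi + 0.5 * dt * k for xi, k in zip(x, k1)]
        k2 = lv_rhs(x2, env(t + 0.5 * dt), alpha, beta)
        x3 = [xi + 0.5 * dt * k for xi, k in zip(x, k2)]
        k3 = lv_rhs(x3, env(t + 0.5 * dt), alpha, beta)
        x4 = [xi + dt * k for xi, k in zip(x, k3)]
        k4 = lv_rhs(x4, env(t + dt), alpha, beta)
        x = [xi + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        t += dt
        trajectory.append(list(x))
    return trajectory

# Hypothetical one-variable case: alpha_11 = -1 (crowding), beta_11 = 1,
# constant environment T_1 = 1, i.e. dX/dt = X * (1 - X).
trajectory = simulate(
    x0=[0.1], env=lambda t: [1.0],
    alpha=[[-1.0]], beta=[[1.0]], dt=0.1, steps=200,
)
final_abundance = trajectory[-1][0]
```

With a negative αii and a positive environmental term, this one-variable case reduces to logistic growth and settles at the abundance where the two terms balance; a seasonal `env(t)` would instead produce yearly oscillations.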

Ignoring the less interesting situation that Xi=0, Equation (1) can be rewritten as

$$\frac{\dot{X}_i}{X_i} = \sum_{j=1}^{n} \alpha_{ij} X_j + \sum_{k=1}^{m} \beta_{ik} T_k. \tag{2}$$

If abundances and slopes can be determined from the time courses of all SCs, this equation becomes an algebraic system of linear equations.23,42,​43,​44,​45 Thus, even though the system is highly nonlinear, linear regression can be used to solve for all parameter values (for details, see Supplementary Information).
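The idea can be sketched in Python with ordinary least squares applied to Equation (2). This is a hedged illustration, not the estimation procedure of the Supplementary Information: the helper names `solve`, `least_squares`, and `estimate_row` are hypothetical, the normal equations are used only for brevity, and the "data" are synthetic points with exact slopes generated from made-up parameters. In practice the slopes would be estimated from smoothed time courses.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def least_squares(rows, y):
    """Ordinary least squares via the normal equations (A^T A) p = A^T y."""
    p = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(ata, aty)

def estimate_row(i, x_series, t_series, slopes):
    """Estimate [alpha_i1..alpha_in, beta_i1..beta_im] from Equation (2):
    the regressors are X_j and T_k, the response is slope_i / X_i."""
    rows = [list(x) + list(t) for x, t in zip(x_series, t_series)]
    y = [s[i] / x[i] for s, x in zip(slopes, x_series)]
    return least_squares(rows, y)

# Synthetic check with made-up parameters (two variables, one condition)
alpha_true = [[-1.0, -0.5], [0.3, -0.8]]
beta_true = [[2.0], [1.0]]
x_series = [[1.0, 2.0], [2.0, 1.0], [1.5, 1.5], [0.5, 3.0]]
t_series = [[1.0], [0.5], [2.0], [1.5]]
slopes = [
    [x[i] * (sum(a * xj for a, xj in zip(alpha_true[i], x))
             + sum(b * tk for b, tk in zip(beta_true[i], t)))
     for i in range(2)]
    for x, t in zip(x_series, t_series)
]
estimate = estimate_row(0, x_series, t_series, slopes)
```

Because each variable contributes its own independent regression, the rows of the α and β matrices can be estimated one at a time, which is what keeps a system with many SCs manageable.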

### Predicting the abundances of individual OTUs

We also used the model to predict the abundances of individual OTUs. Formally, the model has exactly the same format as in Equation (1). However, to test whether environmental conditions alone could model the data (Model #1), all αij parameters were set to zero, except for αii, and the βik values were re-estimated. For Model #2, αij and βik values were chosen from the filtered parameter values described in the Supplementary Information. For Model #3, we removed the OTU of interest from its SC and considered it as a new group. The αij and βik values were then re-estimated for individual OTUs, based on 100 values for αii selected from the range [−1, 0]. The goodness of fit was evaluated with similarity scores (see Supplementary Information).

## References

1. Prokaryotic taxonomy and phylogeny in the genomic era: advancements and challenges ahead. Curr. Opin. Microbiol. 10, 504–509 (2007).
2. Fundamentals of microbial community resistance and resilience. Front. Microbiol. 3, 417 (2012).
3. Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol. Rev. 59, 143–169 (1995).
4. Microbial interactions: from networks to models. Nat. Rev. Microbiol. 10, 538–550 (2012).
5. Using network analysis to explore co-occurrence patterns in soil microbial communities. ISME J. 6, 343–351 (2012).
6. Defining seasonal marine microbial community dynamics. ISME J. 6, 298–308 (2012).
7. Microbial co-occurrence relationships in the human microbiome. PLoS Comput. Biol. 8, e1002606 (2012).
8. Inferring correlation networks from genomic survey data. PLoS Comput. Biol. 8, e1002687 (2012).
9. Local similarity analysis reveals unique associations among marine bacterioplankton species and environmental factors. Bioinformatics 22, 2532–2538 (2006).
10. A global network of coexisting microbes from environmental and whole-genome sequence data. Genome Res. 20, 947–959 (2010).
11. The dynamics of Helicobacter pylori infection of the human stomach. J. Theor. Biol. 176, 281–290 (1995).
12. Mathematical modeling of primary succession of murine intestinal microbiota. Proc. Natl Acad. Sci. USA 111, 439–444 (2014).
13. Identifying keystone species in the human gut microbiome from metagenomic timeseries using sparse linear regression. PLoS ONE 9, e102451 (2014).
14. Ecological modeling from time-series inference: insight into dynamics and stability of intestinal microbiota. PLoS Comput. Biol. 9, e1003388 (2013).
15. Microbial interactions within a cheese microbial community. Appl. Environ. Microbiol. 74, 172–181 (2008).
16. Dynamic flux balance modeling of S. cerevisiae and E. coli co-cultures for efficient consumption of glucose/xylose mixtures. Appl. Microbiol. Biotechnol. 93, 2529–2541 (2012).
17. A synthetic Escherichia coli predator-prey ecosystem. Mol. Syst. Biol. 4, 187 (2008).
18. Predicting a human gut microbiota's response to diet in gnotobiotic mice. Science 333, 101–104 (2011).
19. Prediction of competitive microbial growth in mixed culture at dynamic temperature patterns. Biocontrol Sci. 19, 121–127 (2014).
20. Deciphering microbial interactions and detecting keystone species with co-occurrence networks. Front. Microbiol. 5, 219 (2014).
21. Interactions of microorganisms during natural spoilage of pork at 5 degrees C. J. Food Eng. 72, 24–29 (2006).
22. Towards predictive models of the human gut microbiome. J. Mol. Biol. 426, 3907–3916 (2014).
23. Parameter estimation in canonical biological systems models. Int. J. Syst. Synth. Biol. 1, 1–19 (2010).
24. Diverse populations of lake water bacteria exhibit chemotaxis towards inorganic nutrients. ISME J. 7, 1661–1664 (2013).
25. In The Prokaryotes—Alphaproteobacteria and Betaproteobacteria (eds Rosenberg E., DeLong E. F., Lory S., Stackebrandt E. & Thompson F.) 919–974 (Springer-Verlag Berlin Heidelberg, 2014).
26. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nat. Biotechnol. 31, 814–821 (2013).
27. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl. Environ. Microbiol. 72, 5069–5072 (2006).
28. In The Prokaryotes (eds Rosenberg E., DeLong E., Lory S., Stackebrandt E. & Thompson F.) 901–918 (Springer Berlin Heidelberg, 2014).
29. Roseomonas aquatica sp. nov., isolated from drinking water. Int. J. Syst. Evol. Microbiol. 56, 2291–2295 (2006).
30. Roseomonas, a new genus associated with bacteremia and other human infections. J. Clin. Microbiol. 31, 3275–3283 (1993).
31. A decade of seasonal dynamics and co-occurrences within freshwater bacterioplankton communities from eutrophic Lake Mendota, WI, USA. ISME J. 7, 680–684 (2013).
32. Interannual dynamics and phenology of bacterial communities in a eutrophic lake. Limnol. Oceanogr. 52, 487–494 (2007).
33. Community structure of marine bacterioplankton: patterns, networks, and relationships to function. Aquat. Microb. Ecol. 53, 69–81 (2008).
34. A latitudinal diversity gradient in planktonic marine bacteria. Proc. Natl Acad. Sci. USA 105, 7774–7778 (2008).
35. Coherent dynamics and association networks among lake bacterioplankton taxa. ISME J. 6, 330–342 (2012).
36. Inter-phylum HGT has shaped the metabolism of many mesophilic and anaerobic bacteria. ISME J. 9, 958–967 (2015).
37. NTL-LTER. Time series of bacterial community dynamics in Lake Mendota. North Temperate Lakes Long-Term Ecological Research (NTL-LTER) program, NSF, Katherine Trina McMahon, Center for Limnology, University of Wisconsin-Madison. (2014).
38. NTL-LTER. Chemical Limnology of North Temperate Lakes LTER Primary Study Lakes: Nutrients, pH and Carbon. North Temperate Lakes Long-Term Ecological Research (NTL-LTER) program, NSF, Center for Limnology, University of Wisconsin-Madison. (2012).
39. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336 (2010).
40. North Temperate Lakes LTER. (15 October 2014).
41. 150 years of the mass action law. PLoS Comput. Biol. 11, e1004012 (2015).
42. Power-law approach to modeling biological systems; III. Methods of analysis. J. Ferment. Technol. 60, 233–241 (1982).
43. A spline least-squares method for numerical parameter-estimation in differential-equations. SIAM J. Sci. Stat. Comput. 3, 28–46 (1982).
44. Decoupling dynamical systems for pathway identification from metabolic profiles. Bioinformatics 20, 1670–1681 (2004).
45. Recent developments in parameter estimation and structure identification of biochemical and genomic systems. Math. Biosci. 219, 57–83 (2009).

## Acknowledgements

This work was funded by grant DEB-1241046 of the National Science Foundation, USA. The authors are grateful to Luis-Miguel Rodriguez Rojas and Despina Tsementzi for their help with the OTU identification, beneficial discussions, and valuable feedback.

## Affiliations

1. Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA, USA

• Phuongan Dam, Luis L Fonseca & Eberhard O Voit

2. School of Civil and Environmental Engineering and School of Biology, Georgia Institute of Technology, Atlanta, GA, USA

• Konstantinos T Konstantinidis

## Competing interests

The authors declare no conflict of interest.

## Corresponding author

Correspondence to Eberhard O Voit.
