Introduction

The human gut microbiota is the collection of microorganisms located in the stomach, large intestines, and small intestines, and this system plays an important role in sustaining overall human health. Each individual’s gut composition is unique, and, besides their genetic makeup, their long-term dietary patterns, specifically concerning the types and amounts of carbohydrates, proteins, and fats consumed, affect its composition1,2. Specifically, gut microbiota have been shown to be in involved in numerous physiological processes, such as digestion of undigested food (complex starch), development and regulation of immune system, blocking growth of pathogens, and generation of neurotransmitter and vitamins. Thus, any changes in environmental conditions in this gut ecosystem (microbiota) can result in a shift in its composition, which can predispose and/or worsen chronic inflammatory diseases such as inflammatory bowel diseases, multiple sclerosis, rheumatoid arthritis and neurological diseases like Alzheimer’s disease, Parkinson disease and autism3,4,5. Though the changes in the microbial composition of the human gut microbiota have been shown to be associated with rapid changes in metabolism and overall health, the underlying interactions among species within this microbiota are not yet well understood2. Among all the factors linked with regulation of gut microbiota, diet has emerged as the strongest as it can override genetic influences on microbiota6,7. A better understanding of the microbial dynamics, particularly mathematical approaches, may elucidate the role that the gut microbiota plays in human diseases1.

Common approaches to analyzing the human gut microbiota for its diversity and composition include, but are not limited to, next-generation sequencing, metatranscriptomics, culturomics, and mass spectrometry analyses2,8. These approaches give little insight as to how species interact amongst themselves and with the human host2. Mathematical modeling, however, attempts to answer such questions and can supplement the knowledge gained from these common approaches. Mathematical models can provide evidence to strengthen a hypothesis about these interactions and offer guiding principles for further study of a phenomenon9.

We use ordinary differential equation (ODE)-based modeling as the main tool for our analysis, which tracks information about biomass and concentration levels over time rather than genomic information as in the case of genetically engineered mouse (GEM) models. In this so-called dynamical system approach, we can identify the system’s driving parameters and analyze its stability. Despite the advantage of having easily interpretable terms in the model’s equations, this approach still has some key drawbacks, specifically the determination of unknown parameters.

The parameters used in biological models like ours include substrate conversion rates, utilization rates, and cellular death rates, which typically can be determined and validated by laboratory experiments. However, some gut microorganisms are unable to be cultivated in a laboratory setting, leaving specific information about these species’ metabolic activity unknown. Additionally, ODE models are often used to track only a small subset of species within a community. While this is often due to the difficulty of determining a larger number of parameters when the system is expanded8, focusing on a few subspecies at a time may aid in identifying the dominant relationships in that subsystem.

Despite the impracticality of ODE-based modeling for some very large systems, ODE models paired with other types of modeling, such as agent-based modeling, can provide an insightful understanding of the gut microbiota’s dynamics and interactions8. As a first step in our modeling efforts with its limitations in mind, we consider an ODE-based modeling approach for a small-scale system of three abundant microorganisms in the human gut microbiota. Our ODE model is based on those of chemostats in that we include inflow and outflow terms in the equations that are missing in batch models, by their nature. Chemostats are idealized laboratory systems for microbial ecology that have a long history of mathematical modeling behind them10,11.

Similar approaches to ours, such as those for continuously stirred tank reactor (CSTR) models for anaerobic digestion, have been under development for some time. One example of such a model is the IWA Anaerobic Digestion Model No.1 (ADM1), which provides a modeling foundation for generalized biochemical and physio-chemical systems for anaerobic digestion12. Additionally, Godon et al.13 outlines an overview of anaerobic digestion modeling schemes for organ types for a wide variety of animals, humans included. In particular, it is suggested that the stomach can be reasonably represented as a CSTR, and the large intestine can be represented as a series of CSTRs13. More recently, Jegatheesan and Eberl14 created a CSTR model of microbial functional groups and their metabolic products. This model includes a main compartment, referred to as a lumen, in which substrates and microorganisms interact, and a diffusion compartment, referred to a mucus, in which nutrients are uptaken by the host14.

We continue this emphasis currently in the literature on the vital role of the inflows and outflows in the human gut ecosystem in our modeling approach. As a first modeling step with our chosen subsystem of the human gut microbiota, we consider a single chemostat, which is a bioreactor subvariant of generalized CSTRs.

Our representation of the human gut microbiota considers a subset of this system, namely the three microorganisms Bacteroides thetaiotaomicron, Eubacterium rectale, and Methanobrevibacter smithii. These species represent the three main phyla in the human gut: Bacteroidetes, accounting for 17–60% of the total biomass; Firmicutes, 35–80% of the biomass; and Euryarchaeota the bulk of the remainder15. The system’s main product, butyrate, is of specific interest due to its role in sustaining human health. Butyrate provides energy to colonocytes, affects overall energy homeostasis, and inhibits histone deacetylase, which is an enzyme that directly affects colorectal cancer15. Along with butyrate, other notable intermediates and products in this system include acetate, propionate, glutamine, carbon dioxide (CO\(_2\)), hydrogen (H\(_2\)), and methane (CH\(_4\)). Acetate, propionate, and butyrate, which are short chain fatty acids, are absorbed in the gut’s epithelial cells and regulate an individual’s immune system and metabolism15. Individually, acetate acts as a substrate for cholesterol synthesis and lipogenesis15; propionate regulates gluconeogenesis and cholesterol synthesis15; glutamine fuels the metabolism and maintains the intestinal barrier16; and the gases CO\(_2\), H\(_2\), and CH\(_4\) are products of bacterial fermentation that can cause intestinal discomfort in excess17. As for the microorganisms themselves, M. smithii removes hydrogen gas, which affects bacterial fermentation and energy gathering, and produces methane gas15; E. rectale produces butyrate, which is beneficial to the gut’s epithelial cells15; and B. thetaiotaomicron utilizes dietary polysaccharides and indirectly facilitates butyrate production with its outputs2.

In our model, we represent the interactions of these three short-chain fatty acid producing/utilizing gut microorganisms (see Fig. 1) and their metabolites. B. thetaiotaomicron is an abundant bacterial species in the human gut microbiome whose main function is the utilization of polysaccharides2,18. Through polysaccharide degradation, B. thetaiotaomicron contributes to the overall ecosystem diversity in the colon, which is its regular environment19. B. thetaiotaomicron can survive solely on the uptake of carbon-rich polysaccharides20; however, its growth is enhanced in the presence of inorganic ammonia due to inorganic ammonia’s contribution of nitrogen to B. thetaiotaomicron’s metabolism21. Through the utilization of inorganic ammonia, B. thetaiotaomicron can synthesize all amino acids that are essential to human health2, which makes this bacteria a key interest in our study.

When both E. rectale and B. thetaiotaomicron are present in an environment, B. thetaiotaomicron up-regulates gene expression for starch utilization and the degradation of specific glycans that E. rectale is unable to utilize. Simultaneously, E. rectale down-regulates the genes associated with glycan degradation even though it cannot grow efficiently without a carbohydrate source. Previous research on the interactions of these two species suggests that the presence of B. thetaiotaomicron enhances E. rectale’s ability to uptake nutrients22. E. rectale shifts from uptaking polysaccharides to utilizing amino acids, such as glutamine, when B. thetaiotaomicron is present2. This reciprocal response suggests that these two gut microorganisms may adapt their metabolic strategies in the presence of one another to reduce competition for the same nutrients and optimize their use of available resources20.

M. smithii, which is one of the main methanogenic archaeon in the human gut, improves the productivity of carbohydrate metabolism by utilizing H\(_2\) from E. rectale and formate from B. thetaiotaomicron to produce methane gas. This process prevents the environment from becoming too saturated with B. thetaiotaomicon and E. rectale’s by-products, which consequently improves carbohydrate metabolism. Additionally, M. smithii removing H\(_2\) in this environment allows for B. thetaiotaomicron to generate NAD\(^+\), which is used for glycolysis, a fundamental process in producing cellular energy22.

M. smithii is known to use substrates, including hydrogen and carbon dioxide, for methane production20. The availability of amino acids produced by B. thetaiotaomicron can serve as alternative carbon and energy sources, which may compete with these traditional substrates20. If B. thetaiotaomicron-produced amino acids are abundant, M. smithii may favor their utilization over other substrates, potentially reducing methane production20.

Figure 1
figure 1

Graphical representation of the interactions of the three species, B. thetaiotaomicron, E. rectale, and M. smithii, and their metabolites. Each microorganism or substance given a letter is included in the mathematical model equations. This figure was developed from information presented in Ji and Nielsen2, Shoaie et al.15, and Adamberg et al.19. We emphasize that B. thetaioaomicron’s products acetate and the gases CO\(_2\) and H\(_2\) cannot be produced without the presence of polysaccharides.

The information known about these three species’ interactions is translated into a graphical representation in Fig. 1. This schematic highlights the competition among these species, specifically between E. rectale and M. smithii and between B. thetaiotaomicron and E. rectale.

The results of our modeling effort suggest the efficacy of a relatively simple ODE approach in understanding and potentially predicting the dynamics of important subsystems in the larger human gut ecosystem.

Results

Using our mathematical model of B. thetaiotaomicron, E. rectale, and M. smithii and their metabolites in Eqs. (2) and (3), we present numerical results, estimation of key parameters, and sensitivity analysis. The code used to generate our results can be found at our GitHub page: https://github.com/Melissa3248/gut_microbiota.

Numerical results

Using the set of parameter estimates in Table 2, the solutions to our model are given for the dynamics of B. thetaiotaomicron, M. smithii, and E. rectale in Fig. 2, for acetate in Fig. 3, for CO\(_2\) and H\(_2\) in Fig. 4, and for polysaccharides in Fig. 5. These plots show the solutions to our system of ODEs in Eqs. (2) and (3) from 9900 to 10,000 h. This time range was selected in order to allow for a sufficient amount of time to pass in order for the system to converge to a regular oscillation. The center values of the oscillations are represented by dotted lines in each plot. We note in Fig. 2 that the mean of three bacterial species was fitted to match the experimental biomass sum instead of individually. This choice was made given the data availability; the data available reported the sum of the biomass. We note that there can be many different combinations of individual biomass for the three species that lead to this specific sum. We emphasize that the fact that we found a set of parameters that allow for the species to coexist long term provides some validation of our model since we know these species coexist naturally in the gut.

In Figs. 3, 4, and 5, the data values given in Supplementary Table 1 are superimposed on their respective plots as dashed lines in order to provide a visual representation of the error between the mean of the regular oscillation and the observed data. Despite some error in the center value of the ODE solutions compared to the data, the observed data values for acetate, polysaccharides, and the gases CO\(_2\) and H\(_2\) are contained in the oscillation range of the ODEs’ numerical solutions, so we conclude that our parameter estimates sufficiently fit the data. Thus, the relatively simple model we have presented here is consistent with the data in our literature review. Further details on our parameter estimation process can be found in Supplementary Information Appendix B.

Figure 2
figure 2

Plot of the solutions to Eqs. (2a), (2b), and (2c) using the observed data in Supplementary Table 1 over the time interval 9900 to 10,000 hours. Based on Supplementary Table 1, the sum of the three species’ biomass should be 0.001412 gDW. The middle of the solutions to the ODEs using the parameter estimates in Table 2 for all three species sums to 0.001277 gDW, resulting in an error of—enter of acetate’s oscillatio0.000135 gDW, or a \(-9.5\)% error.

Figure 3
figure 3

Plot of the solutions to Eq. (3a) using the observed data in Supplementary Table 1 over the time interval 9,900 to 10,000 hours. Based on Supplementary Table 1, the center of acetate’s oscillations should be 9.71 μM. The middle concentration of the ODE solutions for acetate is 9.66 μM, resulting in a − 0.05 μM error, or a − 0.5% error.

Figure 4
figure 4

Plot of the solutions to Eq. (3b) using the observed data in Supplementary Table 1 over the time interval 9900 to 10,000 hours. Based on Supplementary Table 1, the center of CO\(_2\) and H\(_2\)’s oscillations should be roughly 7.96 μM. The middle concentration of the ODE solutions for CO\(_2\) and H\(_2\) is 8.03 μM, resulting in a 0.07 μM error, or a 0.9% error.

Figure 5
figure 5

Plot of the solutions to Eq. (3c) using the observed data in Supplementary Table 1 over the time interval 9900 to 10,000 hours. Based on Supplementary Table 1, the center of polysaccharides’ oscillations should be roughly 32.06 \(\mu\)M. The middle concentration of the ODE solutions for polysaccharides is 32.1 μM, resulting in a 0.05 μM error, or a 0.2% error.

Sensitivity analysis

Based on the fitted parameter values and the available data from the literature, we analyzed our model’s sensitivity to its parameters. In order to identify the parameters in the model that have the greatest effect on the model output, we conducted sensitivity analysis on our model by computing the first- and total-order effects based on Sobol’ indices.

Sensitivity analysis can be defined as the study of how uncertainty in the model input propagates into uncertainty in the model output23. Once parameters are estimated, sensitivity analysis can be used to identify the driving parameters of the system that contribute to the most variability in the model’s output. Sensitivity analysis tests the robustness of the model, identifies if the model relies on weak assumptions, and allows for model simplification23. The results of our sensitivity analysis using the methods described in the section titled “Sensitivity analysis methods” are given in Figs. 6 and 7.

We obtained first- and total-order Sobol’ indices results, shown in Figs. 6 and 7. Based on Fig. 6 of the first-order Sobol’ indices, \(\beta _B\) is an influential parameter for B. thetaiotaomicron’s output variance; \(\beta _p\) and \(\mu _{pE}\) for E. rectale’s variance, \(\gamma _a\) and q for acetate’s variance, and q and \(\beta _B\) for polysaccharide’s variance.

Based on Fig. 7 of the total-order Sobol’ indices, we can qualitatively see that parameters that we would biologically expect to be relevant to the variance of corresponding output are in fact relevant. Specifically, \(\beta _B\) has a higher order effect on the output variance of polysaccharide’s (p), the consumption term \(\mu _{pE}\) has a higher order effect on the output variance of E. rectale, \(\beta _B\) has a higher order effect on the output variance of B. thetaiotaomicron, etc. Based on the changes in the estimated indices from first-order to total-order indices, there appears to be a higher-order interaction among model parameters for E. rectale’s, M. smithii’s, acetate’s, CO\(_2\) and H\(_2\)’s, and polysaccharides’ output variance. We note that observing Sobol’ indices whose sum is greater than 1, as seen in Fig. 7, is evidence of potentially correlated inputs in the model24.

Discussion

We suggest and illustrate that a mathematical representation using inflow and outflow terms, such as those used in models of chemostats, is a natural way to capture the effects of the inflow and outflow in the gut on its microbial dynamics. The use of inflow and outflow terms in our approach differentiates our model from many other dynamical systems representations, such as batch models. One difference is that in a chemostat model with inflows and outflows, persistence of all species is possible, whereas this would usually not be the case in batch models, even with otherwise identical representations of the cellular and chemical interactions.

The three species in our model, B. thetaiotaomicron, E. rectale, and M. smithii, play an important role in polysaccharide degradation and the production of butyrate, which both aid in the human gut’s ability to absorb nutrients through the epithelial cells15. The system of the three microbial species has been considered in previous works2,15, which largely informed our knowledge of this system. By creating a mathematical model based on the interactions of these species, we have analyzed their interactions and identified aspects of this system that should be further explored through empirical investigation.

When dealing with observed phenomena, it is often the case that several different mathematical representations can mimic those observations. Models, even simplified and abstracted models like ours, contain critical assumptions concerning the relationships between constituent parts. The implications of these assumptions are at least somewhat illuminated by parameter sensitivity analyses, which can identify the most impactful relationships.

Due to the limited availability of data for a more rigorous parameter estimation, our sensitivity analysis took on additional importance. Through the results of the sensitivity analysis using first- and total-order Sobol’ indices, we more narrowly identified specific links in the microbial food web that would be fruitful targets for additional empirical work. Specifically, we identified the growth rate coefficients \(\beta _B\), \(\beta _{E_1}\), \(\beta _{E_2}\), \(\beta _{M_1}\), \(\beta _{M_2}\), \(\beta _p\), \(\mu _{pB}\), \(\mu _{pE}\) and the scaled volumetric flow q as being largely significant in contributing to the variance of the model output, including higher-order interactions among these parameters. With these results, we suggest that estimates of these significant parameters be obtained through laboratory experimentation in order to capture these values to a higher degree of precision and accuracy.

The significant growth rate coefficients (\(\beta\) parameters) correspond to the growth rates of the three microorganisms when supported on a medium containing specific nutrients. Experiments should focus on cultivating these microorganisms in isolation in germ-free mice with only one nutrient15. Despite the fact that these microorganisms are able to be cultured in isolation of other microorganisms, they may not be able to be sustained on one single nutrient, but rather require the presence of additional substrates. In this case, nonlinear effects from these secondary nutrients would factor in to the resulting estimates of the growth parameters. Table 1 provides a general outline of the the specific nutrient and microorganism necessary to estimate each parameter. For example, the growth rate coefficient for B. thetaiotaomicron (\(\beta _B\)) can be experimentally estimated by cultivating B. thetaiotaomicron in a medium containing only polysaccharides. Additionally, experiments approximating the rate of flow of the digestive system’s fluids would largely inform the estimate for the true value of the volumetric flow rate parameter q.

Table 1 Table of microorganism growth rate parameters suggested to be estimated experimentally.

Due to the possibility of correlation among model parameters, variance-based sensitivity analyses specifically for correlated parameters should be explored and applied to this system, such as the methods discussed in Iooss and Prieur24 and Rabitz25. In regards to specifying the prior distributions on parameters, further analyses should include testing differing prior parameter distributions in order to compute the Sobol’ indices, especially to explore the amount of variability in results based on the chosen parameter distributions23.

Our main suggestion for future work is to collect more complete longitudinal data on this biological system, including for all three species B. thetaiotaomicron, E. rectale, and M. smithii and its relevant substrates. With this experimental data, more precise estimates of the model parameters can be achieved. With this approach, particular attention can be paid to estimating the parameters that were identified as sensitive by the Sobol’ indices.

Additionally, further extensions of our model may include relaxing some of the volumetric flow assumptions. Specifically, the rate of volumetric flow through the gut can be generalized to reflect aspects of the natural flow of the gut, such as periodically restricted flow, and the framework of a single vessel representation can be extended to consider additional compartments to mimic the system of digestive organs in the human body. More details about a chemostat framework and existing theory can be found in Smith and Waltman10 and Harmand et al.11. Future implementations of a similar model can account for absorption rates of additional substrates within the gut, particularly amino acids. With these potential improvements of our baseline model, additional aspects about the dynamics of this biological system can be uncovered, and these improvements to our model could fuel further research directions related to this system.

Our model has relevance to human microbiome studies as it can be used to test the significance of clinical findings. For example, microbiome studies for human autoimmune disease, such as multiple sclerosis, have shown modulation of butyrate and methane-producing gut bacteria26. This can be incorporated into our modeling framework in combination with experimental studies to determine the mechanism through which gut bacterial modulate disease.

Methods

The key assumption underlying our mathematical model is that the human gut can be represented as a chemostat. A chemostat is a laboratory device used in the simulation and ecological study of populations, which provides an idealized representation of naturally occurring phenomena and has a rich history of mathematical representation. Though the conditions of a chemostat are simplified and controlled in a laboratory setting, a chemostat can be useful in the study of population dynamics and the underlying mechanisms of interactions among populations. Using aspects of a simple chemostat model is a first step in developing an initial theoretical framework, which can then be refined and extended.

In developing an ODE model to represent our chosen microbiota subsystem, we assume that the human gut has inflows and outflows in a manner similar to chemostat-like models. We utilize ODE-based dynamical systems modeling to track the changes in biomass of the three prevalent microorganisms, B. thetaiotaomicron, E. rectale, and M. smithii, as well as their chemical inputs, intermediates, and byproducts, with the goal of providing a better understanding of their interactions within this subsystem.

In order to supplement the knowledge gained from transcriptomic analysis and GEMs, we incorporate this information into our own mathematical interpretation of this small-scale system using ODEs. Throughout our model explanation, we present information learned through previous investigations of B. thetaiotaomicron, E. rectale, and M. smithii that we utilized in creating our mathematical model.

In constructing a first model, we assume that the contents of the vessel are well-mixed, the rate at which liquid enters the system equals the rate at which the well-stirred contents leave the compartment, and that some significant factors potentially affecting growth, such as temperature, are held constant10.

For specificity we assume that microorganisms grow at a rate following the Monod form

$$\begin{aligned} \beta _X\left( \frac{u}{u+\gamma }\right) X \end{aligned}$$
(1)

where \(\beta _X\) is the maximum birth rate of population X, u is the concentration of the nutrient population on which X’s growth depends, \(\gamma\) is the Michaelis-Menten constant, and X is the concentration of the microorganism27. This form is commonly used, especially for a first effort. Other functional forms that could be used include Hill functions28, which are generalizations of the Monod form, and S-forms29. The accompanying large increase in the number of parameters, and lack of data to capture the detail these parameters provide, means that they are less appropriate for our effort here than the Monod form.

Model variables

The variables in our model for the microbial and chemical species depend only on one independent variable, time, denoted by t. These dependent variables are

  • B for B. thetaiotaomicron,

  • E for E. rectale,

  • M for M. smithii,

  • a for acetate,

  • h for CO\(_2\) and H\(_2\),

  • p for polysaccharides.

Model equations

The equations for the three microbial species are

$$\begin{aligned} \frac{dB}{dt}&= \beta _B\Psi _{\gamma _p}(p)B - qB , \end{aligned}$$
(2a)
$$\begin{aligned} \frac{dE}{dt}&= \left[ \beta _{E_1}\Psi _{\gamma _a}(a) + \beta _{E_2}\left( 1-\Psi _{\gamma _B}(B)\right) \Psi _{\gamma _p}(p)\right] E - qE ,\end{aligned}$$
(2b)
$$\begin{aligned} \frac{dM}{dt}&= \left[ \beta _{M_1}\Psi _{\gamma _a}(a) + \beta _{M_2}\Psi _{\gamma _h}(h)\right] M - qM . \end{aligned}$$
(2c)

The equations for the three chemical species are

$$\begin{aligned} \frac{da}{dt}&= \beta _a\Psi _{\gamma _p}(p)B-qa-\mu _{aE}\Psi _{\gamma _a}(a)E - \mu _{aM}\Psi _{\gamma _a}(a)M , \end{aligned}$$
(3a)
$$\begin{aligned} \frac{dh}{dt}&= \beta _{h_1}\Psi _{\gamma _a}(a)E + \beta _{h_2}\Psi _{\gamma _p}(p)B + \beta _{h_3}(1-\Psi _{\gamma _B}(B) )\Psi _{\gamma _p}(p)E -qh-\mu _{hM}\Psi _{\gamma _h}(h)M , \end{aligned}$$
(3b)
$$\begin{aligned} \frac{dp}{dt}&= \beta _pq(\text {cos}(t)+1)^3 - qp - \mu _{pB}\Psi _{\gamma _p}(p)B - \mu _{pE}(1 -\Psi _{\gamma _B}(B))\Psi _{\gamma _p}(p)E . \end{aligned}$$
(3c)

The \(\Psi\) and q terms in these equations are defined by

$$\begin{aligned} \Psi _{\gamma }(u)&=\frac{u}{u+\gamma }, \end{aligned}$$
(4)
$$\begin{aligned} q&=\frac{V}{Q} . \end{aligned}$$
(5)

In our model we assume that there is a high enough rate of turnover of fluids in the human gut such that these microorganisms are almost always flushed out of the system before their life expectancy, so death terms are neglected in these three equations. All three microorganism equations thus have the general format:

$$\Delta _i = P_{ij}- F_i,$$

where \(\Delta _i\) is the rate of change of biomass for microorganism i, \(P_{ij}\) is the rate at which microorganism i proliferates based on the availability of substance j, and \(F_i\) is the rate at which microorganism i is flushed out of the gut. In all three microorganism equations, the term \(F_i\) is the biomass of the given population multiplied by the constant q. The fixed quantity q is interpreted as the rate at which the contents of the gut leave the system as expressed in Eq. (5), where V is the volume of the gut and Q is the rate of volumetric flow within the gut.

In order for E. rectale to grow in biomass, acetate or polysaccharides need to be present in the ecosystem15. We assume that E. rectale has different maximum growth rates depending on each nutrient, leading us to split \(\beta _E\) into three different, related constants. Because E. rectale shifts to uptaking inorganic ammonia when B. thetaiotaomicron is present, we included the term \(\big (1-\Psi _{\gamma _B}(B)\big )\) to reflect this shift. M. smithii depends on the presence of acetate and the gases CO\(_2\) and H\(_2\), which are both incorporated in the standard Monod form from Eq. (1). Again, we assume that M. smithii grows at different maximum growth rates in the presence of only one of these metabolites, which lead to the separation of \(\beta _M\) into the constants \(\beta _{M_1}\) and \(\beta _{M_2}\).

The metabolite equations detail the rates of change over time for the intermediate substances produced and consumed by these three species, where the concentrations are tracked for acetate in Eq. (3a), CO\(_2\) and H\(_2\) in Eq. (3b), and polysaccharides in Eq. (3c). These concentrations depend on the rate at which these metabolites are produced by the microorganisms or enter into the system, the rate at which they are flushed out of the system, and the rate at which they are consumed by surrounding microorganisms. All three metabolite equations are constructed in the general format:

$$\begin{aligned} \Delta _j = P_{ij} - F_j - C_{ij}, \end{aligned}$$

where \(\Delta _j\) is the rate of change of metabolite j’s concentration, \(P_{ij}\) is the rate at which metabolite j is produced by microorganism i or enters the system, \(F_j\) is the rate at which metabolite j is flushed out of the gut, and \(C_{ij}\) is the rate at which the metabolite j is consumed by microorganism i. In this \(C_{ij}\) term for metabolite j consumed by microorganism i, we have a yield constant \(\mu _{ji}\), and in Eqs. (3a) and (3b), \(\Psi _{\gamma _j}(j)X_i\) describes the contribution of metabolite j to microorganism i’s biomass, where this biomass is denoted by \(X_i\). For polysaccharides in Eq. (3c), \(\left( 1-\Psi _{\gamma _B}(B)\right) \Psi _{\gamma _j}(j)X_i\) describes the contribution of metabolite j to microorganism i’s biomass depending on the biomass of B. thetaiotaomicron. We include the term \(\left( 1-\Psi _{\gamma _B}(B)\right)\) in Eqs. (3b) and (3c) to denote E. rectale’s shift in utilization of polysaccharides and corresponding production of CO\(_2\) and H\(_2\) when B. thetaiotaomicron is present.

Polysaccharides enter the human gut through diet, so we accounted for their addition to the gut through a sinusoidal function, \(\big (\text {cos}(t)+1\big )^3\). This function attempts to account for the duration of time in between meals through the period of the curve. In addition, this function is defined to be a smooth curve to illustrate the gradual breakdown of food and release of nutrients in the gut. We note that the terms \(\left( \text {cos}(t)+1\right) ^3\) is one possible choice of many for representing how polysaccharides enter the system. Future work can explore different reasonable choices of this term and the effects on the system parameters and sensitivities.

The forms for the rate functions in Eqs. (2) and (3) were chosen so that the system is mathematically conservative and thus meet the conservation criterion for a chemostat10,11. Further reading of chemostat conservation principles and the construction of chemostat equations for competing species can be found in Harmand et al.11.

A mathematical model is a simplification and idealization. In our case, the three microbial species and their attendant chemical species are not an isolated system but rather part of a much larger ecosystem. In this larger ecosystem, other unaccounted for microorganisms and substrates interact with and affect the substrates and microorganisms we have considered. We note that it is often easier to investigate the dynamics and interactions of interest by focusing on and isolating the mechanisms within a relevant subsystem.

Parameter estimation techniques

In our model, many of the parameter values are unknown or cannot be determined experimentally. In order to estimate these parameters mathematically, we searched for data tracking the biomass changes of B. thetaiotaomicron, E. rectale, and M smithii in order estimating these parameters. This system as its written in Eqs. (2) and (3), however, does not account for all possible parameters that could potentially affect the fluctuations in biomass or substrate concentration. We provide a description of our data in Supplementary Information Appendix A. We emphasize that this data is limited in that it only contains information about one time point, which makes our parameter estimates highly uncertain.

By fitting the model parameters to the data shown in Supplementary Table 1, we obtained estimates of our model’s parameters, which are shown in Table 2. We obtained this set of parameters by optimizing Eq. (6) using the Nelder-Mead method. Optimization details can be found in Supplementary Information Appendix B.

$$\begin{aligned} \text {MAPE}(\hat{x}_{t_1:t_2},x^\text {obs}) = \frac{1}{4}\sum _{i=1}^4 \left| \frac{\hat{x}_{t_1:t_2}-x^\text {obs}}{x^\text {obs}}\right| , \end{aligned}$$
(6)

where \(\hat{x}_{t_1:t_2} = \left[ \frac{1}{3}\left( \hat{B}_{t_1:t_2} + \hat{E}_{t_1:t_2} + \hat{M}_{t_1:t_2}\right) , \hat{a}_{t_1:t_2}, \hat{h}_{t_1:t_2}, \hat{p}_{t_1:t_2}\right]\), \(x^\text {obs} = \left[ \frac{1}{3} (B^\text {obs} + E^\text {obs} + M^\text {obs}), a^\text {obs}, h^\text {obs}, p^\text {obs}\right]\), and \(\hat{x}_{t_1:t_2}\) can be defined as \(\hat{x}_{t_1:t_2}= \frac{1}{2}\big (\text {max}(x_{t_1:t_2}) + \text {min}(x_{t_1:t_2})\big )\). The matrix \(x_{t_1:t_2} \in \mathbb {R}^{4\times (t_2-t_1)/\Delta t}\) corresponds the the ODE solutions of equations 2 and 3 between times \(t_1\) and \(t_2\) with a \(\Delta t\) time step. This choice of \(\hat{x}_{t_1:t_2}\) allows us to measure the discrepancy between the observed data and the center of the oscillations of the ODE solutions between times \(t_1\) and \(t_2\). In our setting, we choose \(t_1= 1400\) and \(t_2 = 1500\) to allow the ODE solutions to approach a steady state.

Table 2 Table of fitted parameter values for Eqs. (2) and (3) based on the experimental data in Supplementary Table 1. In this table, we report rounded versions of the fitted parameter values.

Sensitivity analysis methods

We utilized global sensitivity analysis in order to identify the driving parameters of our model. In this framework, which utilizes Monte-Carlo (MC) methods, each parameter is assigned a probability density function (pdf) based on a priori information known about the parameter. Samples are drawn from these probability density functions to evaluate the overall model output.

Because we are interested in detecting significant nonlinear interactions among the model parameters, implementations of global sensitivity analysis methods for linear relationships like the partial rank correlation coefficient (PRCC), the Pearson correlation coefficient (CC), and standardized regression coefficients (SRC) would not be useful in our case. Consequently, we employed the Sobol’ method, a variance-based decomposition method30.

Sobol’ indices are a MC variance-based approach to calculating all first-order and total-effects indices in a model with k parameters. These sensitivity indices are computed based on model evaluations for N simulations. In order to improve the sensitivity estimates, the values tested for each parameter in the model are drawn from quasi-random number generators. Though the use of quasi-random numbers from a given distribution is not necessary, we incorporate this approach into our computations.

Figure 6
figure 6

Heatmap of model parameters’ estimated first-order Sobol’ indices for each output, B, E, M, a, h, and p, using \(N=2^{19}\) simulations.

Figure 7
figure 7

Heatmap of model parameters’ estimated total-order Sobol’ indices for each output, B, E, M, a, h, and p, using \(N=2^{19}\) simulations.

To compute the first- and total-order Sobol’ indices for our reduced model parameters, we specified prior distributions for each parameter from which to draw samples. Parameter samples are drawn using lattice rule samples on the interval [\(a_i\),\(b_i\)] for \(i \in 1,\dots ,20\) specified in Supplementary Table 3, using the QuasiMonteCarlo.jl31 package. Additional details of our sensitivity analysis can be found in Supplementary Information Appendix B.