Steering ecological-evolutionary dynamics to improve artificial selection of microbial communities

Xie, Li; Shou, Wenying

doi:10.1038/s41467-021-26647-4

Article
Open access
Published: 23 November 2021

Nature Communications volume 12, Article number: 6799 (2021)

Abstract

Microbial communities often perform important functions that depend on inter-species interactions. To improve community function via artificial selection, one can repeatedly grow many communities to allow mutations to arise, and “reproduce” the highest-functioning communities by partitioning each into multiple offspring communities for the next cycle. Since improvement is often unimpressive in experiments, we study how to design effective selection strategies in silico. Specifically, we simulate community selection to improve a function that requires two species. With a “community function landscape”, we visualize how community function depends on species and genotype compositions. Due to ecological interactions that promote species coexistence, the evolutionary trajectory of communities is restricted to a path on the landscape. This restriction can generate counter-intuitive evolutionary dynamics, prevent the attainment of maximal function, and importantly, hinder selection by trapping communities in locations of low community function heritability. We devise experimentally-implementable manipulations to shift the path to higher heritability, which speeds up community function improvement even when landscapes are high dimensional or unknown. Video walkthroughs: https://go.nature.com/3GWwS6j; https://online.kitp.ucsb.edu/online/ecoevo21/shou2/.

Introduction

Multispecies microbial communities often display community functions—biochemical activities not achievable by any member species alone. For example, a community of Desulfovibrio vulgaris and Methanococcus maripaludis, but not either species alone, converts lactate to methane in the absence of sulfate¹. Community function arises from “interactions” where one community member influences the physiology of other community members. Interactions are typically complex and difficult to characterize, making it challenging to rationally design communities^2,3. In a different approach, one could mutagenize individual community members, assemble them at various ratios, and screen the resultant communities for high community function. However, this requires community members to be culturable, and the number of combinatorial possibilities increases rapidly with the number of species and genotypes. In addition, such assembled communities might be vulnerable to ecological invasion⁴.

Alternatively, community function may be improved by artificial selection (directed evolution; Fig. 1a)^5,6,7. During each selection cycle, newly assembled Newborn communities (“Newborns”) grow into Adult communities (“Adults”) over a period of “maturation” time set by the experimentalist. During community maturation, community members can proliferate and mutate. At the end of community maturation, Adults expressing the highest community function are chosen to “reproduce” where each is randomly partitioned into multiple Newborns for the next selection cycle. Artificial community selection, if successful, can improve useful community functions such as fighting pathogens⁸, producing drugs⁹, or degrading wastes¹⁰ without detailed knowledge of the underlying mechanisms.

**Fig. 1: Artificial community selection to improve a community function.**

Theoretical work predicts that artificial selection of communities can succeed, at least under certain conditions^{4,11,12,13,14,15,16}. Experimental work on community selection often yielded variable outcomes^{17,18,19,20,21,22,23,24,25,26,27,28}, and some studies were not conclusive due to the lack of a “no selection” control. In some cases, communities indeed responded to selection, presumably driven by changes in species composition^22,23,24,25 and/or evolution^17,18. In other cases, selecting for high-function communities yielded similar outcomes as selecting for low-function or random communities^19,26,27,28, and community function could even decline despite selection^20,25. For example, selecting marine microbial communities for enhanced chitin-degradation activity was ineffective, unless community maturation time was progressively adjusted to prevent undesirable species from taking over²⁵.

Successful community selection requires three elements: variation in community function, preferential survival of high-functioning communities, and heritability of community function²⁹. Preferential survival of high-functioning communities is enabled by intercommunity selection. Variation and heritability of community function can be understood in terms of variation and heritability of community function determinants. Community function determinants (“determinants”) are defined as factors that determine community function and that vary among communities (Fig. 1b). Examples of community function determinants include genotype and species composition. Variation in community function is induced by variation in community function determinants. For example, mutations and species migration can introduce variation in community function by modifying genotype and species compositions. Heritability of community function³⁰ is determined by heritability of community function determinants. Both can be estimated from the slope of the parent–offspring linear regression (Fig. 1c). Indeed, artificial community selection could fail, unless one promotes both variation (e.g., choosing a sufficiently large number of Adult communities to reproduce) and heritability of community function (e.g., promoting species coexistence)^{6,15,25,26,28}.
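The parent–offspring regression mentioned above can be illustrated with a minimal sketch. The function name `regression_slope` and all numbers below are assumptions made for illustration; they are not taken from the study. The slope is close to 1 when offspring values track parent values and close to 0 when they do not.

```python
import random


def regression_slope(parent_values, offspring_values):
    """Least-squares slope of offspring values regressed on parent values."""
    n = len(parent_values)
    mean_p = sum(parent_values) / n
    mean_o = sum(offspring_values) / n
    cov = sum((p - mean_p) * (o - mean_o)
              for p, o in zip(parent_values, offspring_values))
    var_p = sum((p - mean_p) ** 2 for p in parent_values)
    return cov / var_p


# Synthetic illustration only (made-up numbers, not data from the paper).
rng = random.Random(0)
parents = [rng.gauss(1.0, 0.2) for _ in range(200)]
# Offspring that inherit most of the parental value, plus a little noise.
heritable_offspring = [0.8 * p + rng.gauss(0.0, 0.05) for p in parents]
# Offspring whose value is unrelated to the parental value.
nonheritable_offspring = [rng.gauss(1.0, 0.2) for _ in parents]

slope_heritable = regression_slope(parents, heritable_offspring)
slope_nonheritable = regression_slope(parents, nonheritable_offspring)
```

In this sketch `slope_heritable` estimates the 0.8 used to generate the data, while `slope_nonheritable` scatters around zero.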

Artificial community selection becomes particularly challenging when community function incurs a fitness cost to one or more community members. For example, fast-growing species must evolve slower growth to coexist with slow-growing partner species. If contributing to community function requires diversion of cellular resources, then within a species, contributors to community function will be outcompeted by cheaters who make little or no contributions. Hence, cheaters are favored by intracommunity selection during community maturation (Fig. 1a, olive bracket). However, cheaters are disfavored by intercommunity selection that only allows high-functioning communities to reproduce (Fig. 1a, red bracket). Thus, to improve a costly community function, intercommunity selection must overcome intracommunity selection (Fig. 1a).

To learn general principles on effective community selection and to gain insights that can guide future experiments, we use individual-based simulations to compare multiple selection strategies. We have conjured a highly simplified microbial community where two species coexist due to a commensal ecological interaction, and both species are required for community function. In our system, species composition is nonheritable: variations in Newborn species compositions are rapidly dampened during community maturation due to an “attractor” (steady-state species composition) induced by the commensal ecological interaction. In contrast, genotype composition is heritable. We visualize a “community function landscape” relating community function to its heritable and nonheritable determinants, similar to a phenotype landscape relating an individual’s phenotype to its genetic and environmental determinants^31,32. We find that the steady-state species composition confines evolving communities to a path in the landscape. This confinement can generate counterintuitive evolutionary dynamics and prevent the attainment of maximal community function. Importantly, the local landscape geometry near the steady-state species composition is indicative of community function heritability, much as the geometry of a phenotype landscape is indicative of the heritability of individual traits. When communities are trapped in low-heritability landscape regions, community function does not improve or improves only slowly under selection. Inspired by these observations, we devise perturbation strategies that improve community function heritability and thus the rate of community function improvement, even when the community function landscape is high-dimensional and cannot be visualized.

Results

A commensal community with a species-composition attractor

In our previous work¹⁵, we simulated artificial selection on a two-species Helper–Manufacturer community (“H–M community”, Fig. 2a). In this community, Helper (H) digests agricultural waste and consumes Resource, grows biomass, and, at no cost to itself, releases a metabolic Byproduct essential for Manufacturer (M). As M consumes Byproduct and Resource, its cellular resource is partitioned, so that a fraction f_P (0 ≤ f_P ≤ 1) is used to synthesize Product P that is of interest to the experimentalist, while the rest (1 − f_P) is used for its own biomass accumulation. Thus, H helps M to grow, and such commensal interaction is commonly found in microbial communities^{33,34,35,36,37,38}.

**Fig. 2: A two-species commensal community, its species-composition attractor and restrictor, and its community function landscape in relation to heritable and nonheritable determinants.**

Since M relies on H’s Byproduct, H can either drive M extinct or coexist with M¹⁵. When M’s cost f_P is large, M always grows slower than H. Thus, ϕ_M, the fraction of M biomass in a community, declines within a cycle and over cycles until M goes extinct (Fig. 2b, arrows on the right approaching the horizontal axis). When M’s cost f_P is moderate or small, and when we choose growth parameters of the two species properly (Table 1, Methods “Parameter choices”), H and M can coexist at a steady-state ratio (Fig. 2b, arrows on the left approaching the positive portion of the blue dashed line). Note that communities with stably coexisting strains have been engineered in the lab^39,40. The steady-state fraction of M biomass at various cost f_P forms a species-composition “attractor” (Fig. 2b, blue dashed line): at a given cost, species composition away from the attractor is pulled toward the attractor as the community matures. Because Adult composition is restricted by the attractor, compositions of offspring Newborns will also be restricted. We define this restriction on Newborn composition as “attractor-induced Newborn restrictor” (“Newborn restrictor” or “restrictor” marked by the orange line in Fig. 2d i). Note that manipulations of the Newborn restrictor will play a key role in this work.

Table 1 Parameters for genotypes (and thus phenotypes) of H and M used in the simulations.

Visualizing community function landscape

H–M community function is defined as the total amount of Product accumulated in an Adult community, denoted by P(T) with T being the community maturation time. Community function requires both species, since H supports M growth while M makes Product. Community function is not costly to H, but costly to M (cost = f_P). Although evaluated at the end of a maturation cycle, community function accumulates throughout the cycle and is thus sensitive to not only species genotypes (and thus phenotypes) but also initial conditions. In our models, biological parameters (i.e. cost f_P, growth rates, etc.) are inherited upon cell division; we thus interchangeably refer to them as genotypes or phenotypes, depending on context.

If a community is well-mixed and if the populations are clonal with deterministic dynamics (i.e., all members of a species share the same genotypes and thus phenotypes), then community function P(T) can be deterministically calculated from Equations (1–7), given model parameters (species genotypes), maturation time T, and initial conditions. Here, we allow only M’s cost f_P and the fraction of M biomass in the Newborn (ϕ_M(0)) to vary. We fix all other parameters and all other initial conditions (i.e., Newborn total biomass, excess agricultural waste that can be regarded as a constant, and initial Resource). Thus, community function has two determinants (f_P and ϕ_M(0)), and the “community function landscape” can be visualized as a function of these two determinants, similar to a topographic map (Fig. 2c). The single peak corresponds to the global maximal community function, achieved at an intermediate species composition and an intermediate cost for M¹⁵. Note that when M pays a low cost, community function is low because little Product is made, but when M pays a very high cost, community function is also low because M cannot proliferate substantially. Similarly, community function peaks at an intermediate value of Newborn species composition (ϕ_M(0)), since community function requires both M and H.

The community function landscape calculated above can be used even when the populations are not clonal, as long as genotype composition does not change drastically within a cycle. Specifically, when maturation time is ~6 population doublings, community functions from individual-based simulations (where individual M cells can have different cost genotypes; Supplementary Fig. 1, i–iii; Methods) are well predicted by deterministic calculations based on Newborn species composition (ϕ_M(0)) and the average cost paid by M in the Newborn (${\overline{f}}_{P}(0)$) (Supplementary Fig. 3A). Overall, community function during evolution has been simplified to rely on only two determinants: Newborn species composition (ϕ_M(0)) and Newborn genotype composition (average cost ${\overline{f}}_{P}(0)$).

Heritability of community function determinants

Of the two community function determinants, M’s average cost in a Newborn ${\overline{f}}_{P}(0)$ is heritable, while Newborn species composition ϕ_M(0) is nonheritable (Fig. 2e). The heritability of genotype composition is intuitive: if a parent Newborn is dominated by cheaters (low ${\overline{f}}_{P}(0)$) or cooperators (high ${\overline{f}}_{P}(0)$), then so will its offspring Newborns be (Fig. 2e top). Note that offspring costs are generally lower than the parent cost, since cooperator frequency declines during community maturation due to cheater takeover (circles are below the dotted line of slope 1 in Fig. 2e top). Newborn species composition ϕ_M(0) is not heritable due to the attractor. Parent Newborns experience stochastic fluctuations in species composition (e.g., a species ratio of 50:50 can become 40:60 by chance due to pipetting a small number of cells). However, due to the attractor, parent Adults end up sharing similar species compositions (Fig. 2d i, left), and so will their offspring Newborns (Fig. 2d i, right). In essence, variations in Newborn species composition are not transmitted across cycles, and are therefore not heritable (Fig. 2e bottom). Since any elevation in community function due to a nonheritable determinant will not be transmitted to the next cycle, we can quantify selection efficacy as the progress in the heritable determinant ${\overline{f}}_{P}(0)$.

Restrictor leads to counterintuitive and suboptimal outcomes

We now consider the case where ancestral M pays a cost smaller than what is optimal for maximal community function. This scenario poses a common problem that is challenging to address: while maximal community function requires a higher cost, intracommunity selection favors a lower or no cost (Fig. 1a).

To obtain selection dynamics, we performed individual-based stochastic simulations (Methods, Supplementary Fig. 1). In each cycle, we select from 100 communities. To discourage cheater takeover, each Newborn has a small total biomass (100 biomass units or 50–100 cells; Supplementary Fig. 2), and Newborns mature into Adults over a relatively short period of time (T = 6–7 doublings¹⁵). We track individual H and M cells as they consume and release metabolites, grow biomass and divide, and stochastically die. As an M cell divides, the f_P of each daughter has a probability (0.002/cell/generation) of mutating: 50% of mutations set f_P to 0, while the rest increase or decrease f_P by 5–6 percent on average. M cells with a higher f_P contribute more toward community function but grow slower. At the end of T, Adults are ranked by their functions. The top 2 or 10 Adults are chosen to reproduce, with their H and M cells randomly distributed into offspring Newborns so that Newborn total biomass fluctuates around the target value (100 biomass units or 50–100 cells) and Newborn species compositions fluctuate around that of the parent Adult. We do not mix different community lineages, so as to preserve variations among communities and to prevent cheaters from spreading across communities (Supplementary Fig. 5). Our choices of model parameters are supported by the microbial experimental literature (Methods, Parameter choices).
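The mutation and ranking steps described above can be sketched as follows. This is an illustration under stated assumptions, not the simulation code of the study: the mutation probability and the 50% null fraction come from the text, whereas the exponential distribution of effect sizes, the equal chance of increase or decrease, the clamping of f_P to [0, 1], and the names `mutate_f_p` and `choose_parents` are assumptions introduced here.

```python
import random

MUTATION_PROB = 0.002   # per cell per generation (from the text)
NULL_FRACTION = 0.5     # fraction of mutations that set f_P to 0 (from the text)
MEAN_EFFECT = 0.055     # assumed mean relative change, within the stated 5-6%


def mutate_f_p(f_p, rng):
    """Return a daughter cell's f_P after a possible mutation."""
    if rng.random() >= MUTATION_PROB:
        return f_p  # no mutation
    if rng.random() < NULL_FRACTION:
        return 0.0  # null mutation: the cell stops making Product
    # Assumption: exponentially distributed effect size, random sign.
    change = rng.expovariate(1.0 / MEAN_EFFECT)
    sign = 1 if rng.random() < 0.5 else -1
    # Keep f_P within its allowed range 0 <= f_P <= 1.
    return min(1.0, max(0.0, f_p * (1.0 + sign * change)))


def choose_parents(adult_functions, n_top):
    """Indices of the n_top Adults with the highest community function."""
    ranked = sorted(range(len(adult_functions)),
                    key=lambda i: adult_functions[i], reverse=True)
    return ranked[:n_top]


# Usage with made-up community functions for four Adults.
rng = random.Random(1)
daughter_cost = mutate_f_p(0.1, rng)
parents = choose_parents([0.1, 0.9, 0.5, 0.7], n_top=2)
```

Here `parents` holds the indices of the two highest-functioning Adults, which would then be partitioned into Newborns for the next cycle.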

In the absence of community selection (e.g., allowing each Adult to reproduce one offspring, or randomly selecting Adults to reproduce), community function unsurprisingly declines rapidly to zero as cheaters take over¹⁵.

To ensure effective community selection, we increase community function heritability by reducing variations in the nonheritable determinant ϕ_M(0). To do so, we reproduce Adults via “cell sorting”, as if by a flow sorter capable of measuring the biomass of individual H and M cells, so that all Newborns from the same parent have nearly identical species composition and total biomass. With this experimentally challenging method, community selection successfully improves community function over ~200 selection cycles (Fig. 3a). However, community function P(T) never reaches the theoretical maximum (Fig. 3a dashed line). The average cost paid by M stays above ${f}_{P}^{* }$ optimal for community function (Fig. 3a, middle panel), which is surprising because high cost would be disfavored by both intracommunity selection (which favors fast growth and low cost) and intercommunity selection (which favors ${f}_{P}^{* }$).

**Fig. 3: Restriction in Newborn species composition can produce counterintuitive evolutionary dynamics and prevent the attainment of maximal community function.**

To understand this suboptimal selection outcome, we overlay community function landscape (Fig. 3b gray contours) with attractor-induced Newborn restrictor (Fig. 3b orange line, which coincides with the attractor—see Fig. 2d i). We note that the Newborn restrictor does not pass through the maximal community function (black star in Fig. 3b). During selection, Newborn species compositions are strictly confined to the restrictor (Fig. 3b, circles on top of the orange line). This explains the suboptimal selection outcome: Like a hiker who is restricted to a trail that does not traverse the mountain top, community function can only climb to the highest value along the restrictor (magenta circle), which is lower than the global maximum (black star). Consequently, the corresponding Newborn average cost ${\overline{f}}_{P}(0)$ and Newborn species composition ϕ_M(0) are higher than those optimal for community function. Not surprisingly, community function can reach the maximum if we push down the Newborn restrictor by an appropriate amount (e.g., by replacing 15% of Newborn total biomass with H during each cycle of selection), as shown in Supplementary Fig. 8. We will not dwell on maximal community function further, since the maximal function is typically unknown in practice, as in many global optimization problems. Instead, we will now focus on the rate of community function improvement.

Landscape determines heritability and selection efficacy

Successful selection requires heritability in community function. Community function heritability depends on (1) variations in the heritable and nonheritable determinants, and (2) how strongly variations in these determinants impact variations in community function. Minimizing variations in the nonheritable determinant ϕ_M(0) (e.g., reproducing the chosen Adults through cell sorting, Fig. 3) results in high heritability in community function and thus rapid improvement under selection¹⁵. However, such a technique is often difficult to apply.

We now explore how to achieve rapid improvement in community function when variations in nonheritable determinants cannot be easily diminished. We will show that the local geometry of the community function landscape predicts the heritability of community function and thus the efficacy of intercommunity selection. To illustrate, we consider a cartoon model in Fig. 4 where community function contours are straight lines. In Fig. 4a, community function contours are perpendicular to the axis of the heritable determinant. Therefore, variation in community function (light to dark gray) is fully attributed to variation in the heritable determinant. Thus, community function is heritable (Fig. 4d), and intercommunity selection can make large progress in the heritable determinant over a selection cycle (Fig. 4a). In contrast, when community function contours are parallel to the axis of the heritable determinant (Fig. 4b), no variation in community function can be attributed to variation in the heritable determinant. Thus, community function is nonheritable (Fig. 4e), and intercommunity selection makes no progress (Fig. 4b). An intermediate case is shown in Fig. 4c and f.
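The straight-contour cartoon can be made concrete with a short worked calculation. This is a sketch under assumptions that are not stated in the text: community function is locally linear in a heritable determinant $h$ and a nonheritable determinant $n$; the two vary independently among parent communities with variances $\sigma_h^2$ and $\sigma_n^2$; offspring inherit $h$ exactly; and offspring draw a fresh, independent $n$. The symbols $a$, $b$, $c$, $h$, $n$ are introduced here for illustration only.

$$
F = a\,h + b\,n + c, \qquad
\text{slope} = \frac{\operatorname{Cov}\left(F_{\text{parent}}, F_{\text{offspring}}\right)}{\operatorname{Var}\left(F_{\text{parent}}\right)}
= \frac{a^{2}\sigma_h^{2}}{a^{2}\sigma_h^{2} + b^{2}\sigma_n^{2}}
$$

When contours are perpendicular to the axis of the heritable determinant, $b = 0$ and the slope is 1. When contours are parallel to that axis, $a = 0$ and the slope is 0. Intermediate contour orientations, or larger variation in the nonheritable determinant, give slopes between these two limits.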

**Fig. 4: The local geometry of community function landscape dictates the heritability of community function and consequently the efficacy of intercommunity selection.**

Boosting heritability hastens community function improvement

We can now examine community selection when the nonheritable determinant (Newborn’s fraction of M biomass ϕ_M(0)) is allowed to fluctuate during community reproduction. Specifically, if we simulate pipetting cells from Adults to seed Newborns (while keeping the total Newborn biomass fixed), ϕ_M(0) will fluctuate stochastically (due to small Newborn size), essentially creating a “cloud” of Newborns around the Newborn restrictor (Fig. 5b ii). Each Newborn’s community function at Adulthood can be read out from the value (gray shade) of the contour it resides on. Although landscape contours near the restrictor are not straight lines, they are largely parallel to the heritable determinant axis (similar to Fig. 4b). Thus, variations in community function are largely attributed to variations in the nonheritable determinant. Indeed, community function has low heritability (~0 slope in Fig. 5c ii), and intercommunity selection makes only small progress in the heritable determinant (short red arrow in Fig. 5b ii; statistics in the red box plot of Fig. 5b iv). This small progress is just enough to counter the decline due to intracommunity selection (olive box in Fig. 5b iv), resulting in a net improvement rate of zero (black box in Fig. 5b iv). Since heritability remains low from cycle to cycle (Supplementary Fig. 23 top panels), community function and the heritable determinant barely improve despite 1000 selection cycles (Fig. 6a).

**Fig. 5: Increasing community function heritability improves selection efficacy.**

**Fig. 6: Increasing community function heritability by shifting Newborn restrictor can improve selection efficacy under various conditions.**

We then seek to improve selection by shifting the Newborn restrictor to a location with higher heritability, i.e., where community function contours are largely perpendicular to the axis of heritable determinant (e.g., south of the orange restrictor). This can be achieved by replacing 30% of total Newborn biomass with nonevolving H (“30%-H spiking”, teal box in Fig. 5a i, Fig. 2d ii). Under 30%-H spiking, community function becomes more heritable (positive slope in Fig. 5c iii) due to reduced dependency on the nonheritable determinant and enhanced dependency on the heritable determinant (Supplementary Fig. 6). Intercommunity selection thus makes a larger improvement in the heritable determinant (compare red arrows in Fig. 5b iii versus ii; compare red boxes in Fig. 5b v versus iv), and the total improvement rate becomes positive (black box in Fig. 5b v). Since heritability fluctuates around a high level from cycle to cycle (Supplementary Fig. 23 bottom panels), community selection improves community function efficiently (Fig. 6b). Spiking must be performed at each cycle since spiked species composition rapidly returns to the attractor (Fig. 2d ii).
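The effect of spiking on Newborn species composition is simple arithmetic, sketched below. The sketch assumes that the retained portion of the Newborn has the same M fraction as the parent Adult and ignores pipetting noise. The function name `spiked_m_fraction` and the example value of 0.5 are assumptions for illustration, not values from the study.

```python
def spiked_m_fraction(phi_m_adult, spike_fraction, spike_species):
    """M biomass fraction in a Newborn after replacing part of it with a spike.

    phi_m_adult    -- fraction of M biomass in the parent Adult
    spike_fraction -- fraction of Newborn biomass replaced by the spike
    spike_species  -- "H" or "M"
    """
    if not 0.0 <= spike_fraction <= 1.0:
        raise ValueError("spike_fraction must lie between 0 and 1")
    # M biomass contributed by the portion taken from the parent Adult.
    retained = (1.0 - spike_fraction) * phi_m_adult
    if spike_species == "H":
        return retained                   # the spike adds no M biomass
    if spike_species == "M":
        return retained + spike_fraction  # the spike is entirely M biomass
    raise ValueError("spike_species must be 'H' or 'M'")


# Example: a parent Adult with half of its biomass in M, 30%-H spiking.
phi_after_h_spike = spiked_m_fraction(0.5, 0.3, "H")
```

With these illustrative inputs, 30%-H spiking lowers the Newborn M fraction from 0.5 to about 0.35, which corresponds to shifting the restrictor "south" on the landscape.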

Under a variety of conditions, community function improves faster when community function heritability is enhanced through species spiking. The 30%-H spiking strategy boosts the already increased selection efficacy when we choose top-10, instead of top-2 Adults, to reproduce (Fig. 6: c more effective than a due to increased variation among communities¹⁵; d more effective than c due to spiking and consequently improved heritability). H spiking promotes selection when measurement noise in community function interferes with selection (Fig. 6: f better than e, h better than g), and when both total biomass and species composition of Newborns are allowed to fluctuate as if the chosen Adults are reproduced and spiked through pipetting without fixing the inoculum biomass (Supplementary Fig. 10). Although quantitative details differ, H spiking at a wide range of percentages speeds up the improvement of community function (Supplementary Fig. 10). Importantly, compared with the ancestral community, evolved communities selected under one maturation time (T = 17) exhibit higher functions across a range of maturation times (from T = 13 to 21) and Newborn species compositions (Supplementary Fig. 7).

Species spiking is but one of the perturbations we can use to alter community function heritability. For example, if we extend maturation time T from 17 to 20 (and assume that any potential resource depletion will not affect cell phenotypes), then community function heritability is improved (larger slope in Supplementary Fig. 9c than in Fig. 5c ii). This leads to a faster improvement of community function (compare Supplementary Fig. 9a with Fig. 6a).

Enhancing selection efficacy without knowing the landscape

So far, we have examined a simple scenario where the community function landscape can be visualized. Since landscapes of most community functions are high-dimensional and unknown, it is infeasible to devise a perturbation strategy based on landscape visualization. However, landscape geometry is reflected in the heritability of community function (Fig. 4; Fig. 5b and c), which can be estimated from experimental measurements (similar to Fig. 5c). Thus, we can try several different perturbation strategies, compare them, and choose the strategy yielding the highest community function heritability. Since communities move on the landscape as they evolve, periodic heritability checks are needed.

As an example, let us consider a more complex scenario with the H–M community. If we allow growth parameters of H and M to also evolve, then community function will have six heritable determinants all defined at the Newborn stage (Supplementary Fig. 3b): M’s average cost, the average maximal growth rates of H and of M, the average affinities of M to Resource and to Byproduct, and the average affinity of H to Resource. If we reproduce the Adult via “pipetting”, then Newborn’s total biomass and species composition fluctuate stochastically, adding two nonheritable determinants. If we additionally consider community function measurement noise (a normal random variable with mean 0 and standard deviation comparable to the ancestral community function), we have yet another nonheritable determinant. Overall, community function now has nine determinants, six heritable and three nonheritable.

We simulate community selection in the above complex scenario (schematic in Supplementary Fig. 11). We start with the no-spiking strategy, and always choose top-10 Adults where each reproduces 10 Newborns. If a fraction of Newborn biomass is to be replaced with H (or M) biomass, the spiking mix consists of equal parts of five evolved H (or M) clones randomly isolated from the previous cycle of the same lineage (Supplementary Fig. 11a). During reproduction, portions of a chosen Adult and the spiking mix are “pipetted” to initiate Newborns, so that both the total biomass and the species composition fluctuate stochastically in Newborns. Every 100 cycles, we update the spiking strategy based on heritability of community function. Specifically, we quantify community function heritability for five candidate spiking strategies (no spiking, 30%-H spiking, 60%-H spiking, 30%-M spiking, and 60%-M spiking) by regressing median offspring function on parent function (similar to Fig. 5c). The current spiking strategy is then updated if an alternative strategy confers significantly higher community function heritability (Supplementary Fig. 11b, Methods).
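The update rule described above can be sketched as a small decision function. The significance criterion used in the study is described in its Methods and is not reproduced here; the fixed margin `min_gain` below is a hypothetical stand-in for that test, and the function name and heritability values are likewise made up for illustration.

```python
def update_strategy(current, heritability, min_gain=0.1):
    """Switch to the candidate with the highest estimated heritability.

    current      -- name of the spiking strategy currently in use
    heritability -- dict mapping strategy name to estimated heritability
    min_gain     -- hypothetical stand-in for a significance test: the best
                    candidate must exceed the current strategy by this margin
    """
    best = max(heritability, key=heritability.get)
    if best != current and heritability[best] - heritability[current] > min_gain:
        return best
    return current


# Made-up heritability estimates for the five candidate strategies.
estimates = {
    "no spiking": 0.05,
    "30%-H spiking": 0.40,
    "60%-H spiking": 0.25,
    "30%-M spiking": 0.02,
    "60%-M spiking": 0.01,
}
next_strategy = update_strategy("no spiking", estimates)
```

Requiring a margin before switching keeps the strategy from flipping on estimation noise alone, which is the role the significance test plays in the procedure described above.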

With the no-spiking strategy, community selection moderately improves community function and heritable determinants (Fig. 7a and b; Supplementary Fig. 12a). The rate of community function improvement is higher when the spiking strategy is periodically adjusted according to community function heritability (Fig. 7c and d; Supplementary Fig. 12b). In contrast, adopting the spiking strategy with the lowest heritability leads to a slower rate of improvement (Supplementary Fig. 12c), while randomly choosing a spiking strategy leads to variable results (Supplementary Fig. 12d). It is also noteworthy that under periodic heritability check, the adopted spiking strategy is not static (Fig. 7e). A static 60%-H or 30%-H spiking strategy offers negligible improvement over no spiking (Supplementary Fig. 13). Communities obtained through selection with periodic heritability check exhibit higher functions than those obtained through no spiking, even if Newborn species compositions are readjusted to values over a wide range (Fig. 7f). Therefore, it is important to evaluate heritability periodically and update the perturbation strategy accordingly.

**Fig. 7: Adjusting perturbation strategies based on periodic heritability check improves selection efficacy even when the landscape is high-dimensional and unknown.**

Increasing heritability as an effective and general approach

Qualitatively similar results are obtained when the number of clones used to make the spiking mix is changed from 5 to 1, 2, or 10 (Supplementary Figs. 14, 15 and 16). Simulations with three candidate spiking strategies (no spiking, 30%-M spiking, and 30%-H spiking) instead of five strategies generate qualitatively similar results (Supplementary Fig. 22). The frequency of heritability check can be adjusted. For example, similar outcomes are obtained if heritability checks are performed “adaptively” (e.g., only when the average rate of community function improvement over the last 50 cycles is less than zero, Supplementary Fig. 17). In this particular case, adaptive check reduces the number of checks by ~50% compared with Fig. 7 (periodic check every 100 cycles).
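The adaptive trigger can be sketched as follows. The text specifies only that a check is run when the average improvement rate over the last 50 cycles is below zero; computing that average as the net change in community function across the window divided by the window length is an assumption made here, as is the function name `needs_heritability_check`.

```python
def needs_heritability_check(function_history, window=50):
    """Return True when average improvement over the last `window` cycles is negative.

    function_history -- community function of the lineage, one value per cycle
    """
    if len(function_history) < window + 1:
        return False  # not enough cycles yet to evaluate the window
    # Assumed definition: net change across the window divided by its length.
    net_change = function_history[-1] - function_history[-1 - window]
    average_rate = net_change / window
    return average_rate < 0


# Made-up histories: one steadily improving, one slowly declining.
improving = [0.01 * cycle for cycle in range(60)]
declining = [1.0 - 0.01 * cycle for cycle in range(60)]
check_improving = needs_heritability_check(improving)
check_declining = needs_heritability_check(declining)
```

In this sketch only the declining lineage triggers a check, so heritability measurements are spent where selection has stalled.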

Heritability checks can also speed up community function improvement for communities engaging in mutualistic and exploitative ecological interactions, even when the community function landscape is high-dimensional and unknown. First, we simulated community selection on a mutualistic H–M community, where M relies on H’s Byproduct, and H’s Byproduct inhibits H’s growth. By removing Byproduct, M promotes H’s growth. Similar to the simulations shown in Fig. 7, genotypes that can be modified by mutations include M and H’s maximal growth rates, affinities to metabolites, and M’s cost f_P. Additionally, H’s sensitivity to its Byproduct can also be modified by mutations. At the end of each cycle, Adult communities with top-10 functions are chosen and each reproduces 10 offspring Newborn communities through pipetting for the next cycle. The community function of this mutualistic H–M community thus has seven heritable determinants and three nonheritable determinants. Improvement in community function is rapid under selection even without spiking, and is sped up slightly but significantly (Mann–Whitney U test, n₁ = n₂ = 6, p = 10⁻³, one-tailed) when we periodically adopt the spiking strategy conferring the highest community function heritability (Supplementary Fig. 20). Next, we simulated community selection on an exploitative H–M community. In this community, M releases a compound that inhibits H. That is, H helps M, but M inhibits H. Since H’s sensitivity to the compound released by M can be modified by mutations, this exploitative community also has seven heritable determinants and three nonheritable determinants. Compared with selection with no spiking, the rate of community function improvement is much faster when we periodically adopt the spiking strategy conferring the highest community function heritability (Supplementary Fig. 21).

Discussion

We start with a highly simplified case where the community function of interest varies due to variations in two determinants—one heritable (genotype composition) and one nonheritable (species composition that can change rapidly due to ecological interactions). These two determinants capture crucial elements common to community function in general. Hence, our work serves as a fundamental building block for conceptualizing community selection, akin to physicists studying the ideal gas, or population geneticists studying a single-locus two-allele trait. Below, we recap our work’s major conclusions, assumptions, limitations, and generality. Then, we reflect on what makes community selection effective, and discuss future directions.

We have demonstrated how ecological interactions, especially those that encourage species coexistence, can result in a species-composition attractor that impacts community selection. The attractor restricts Newborn species compositions to a narrow region, which can (1) generate counterintuitive evolutionary dynamics (Fig. 3a middle panel); (2) constrain communities away from maximal community function (Fig. 3b); and (3) trap communities in a region of low heritability where selection efficacy is low (i.e., community function improves slowly or not at all despite selection; Figs. 5c ii and 6a). This understanding has helped us to devise perturbation strategies that improve community function heritability and in turn selection efficacy (Figs. 5c iii and 6b, Supplementary Fig. 9). When we can visualize the landscape, we can identify high-heritability regions where community function contours are perpendicular to the axis of the heritable determinant (Fig. 4). We can then design perturbation strategies (e.g., species spiking and varying maturation time) to shift the restrictor to these regions (Fig. 5). When we cannot visualize the landscape, we can choose perturbation strategies based on community function heritability (Fig. 7), a quantity that can be estimated from experiments (similar to Fig. 5c). Since evolving communities move in the landscape, the perturbation strategy needs to be adjusted, either periodically (Fig. 7; Supplementary Fig. 11) or whenever selection progress slows down (Supplementary Fig. 17).

Our study assumes no spatial structure within communities, the presence of a species-composition attractor, and time scale separation (ecological dynamics much faster than evolutionary dynamics within a cycle). These assumptions capture realistic situations: spatial structure can be disturbed by convection-induced mixing, and by fast diffusion of metabolites if communities are encapsulated in small droplets. Composition attractor and thus Newborn restrictor can arise from ecological interactions that benefit at least one species, or when different species show distinct environmental preferences^{16,35,37,41,42,43,44,45,46}. Ecological timescale is faster than evolutionary timescale because mutation rate is small (our mutation rate of 0.002/cell/generation is already on the high end among the published values) and because community maturation time needs to be short to prevent cheater takeover¹⁵. Quantification of heritability does require variation in community function, or else parent–offspring regression can become a single dot and heritability becomes ill-defined. One such example is found in the theoretical work by Doulcier et al.¹⁶ where species ratio evolves to a target value and is thus identical among all evolved communities and their offspring communities. However, most published experimental work does show large variations in community functions^{17,18,19,20,21,22,23,24,25,26,27,28}, at least initially.
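To make the "single dot" problem concrete, the following Python sketch estimates heritability as the least-squares slope of offspring function on parent function. The choice of the regression slope as the estimator, and the function name `parent_offspring_slope`, are assumptions made for illustration; this section does not specify the exact statistic used. When all parents have the same function, the denominator vanishes and the sketch returns `None`.

```python
def parent_offspring_slope(parent_functions, offspring_mean_functions):
    """Least-squares slope of mean offspring function on parent function.

    Returns None when parents show no variation, in which case the
    regression collapses to a single dot and heritability is ill-defined.
    """
    n = len(parent_functions)
    mean_p = sum(parent_functions) / n
    mean_o = sum(offspring_mean_functions) / n
    var_p = sum((p - mean_p) ** 2 for p in parent_functions)
    if var_p == 0:
        return None
    cov = sum((p - mean_p) * (o - mean_o)
              for p, o in zip(parent_functions, offspring_mean_functions))
    return cov / var_p


slope = parent_offspring_slope([1.0, 2.0, 3.0, 4.0], [2.0, 2.5, 3.0, 3.5])
no_variation = parent_offspring_slope([5.0, 5.0, 5.0], [4.0, 5.0, 6.0])
```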

Our work has limitations and caveats. First, for species spiking, member species need to be culturable, so that we can isolate clones to form spiking mixes. Flow sorting could potentially bypass this problem if member species can be distinguished by light-scattering or fluorescence patterns. Second, evaluating community function heritability is resource-intensive. Thus, checking heritability only when necessary (Supplementary Fig. 17) may help reduce workload. Third, the effect of perturbation may be erased every cycle (Fig. 2d ii). However, improved selection efficacy yields desired genotypes that can be frozen and repeatedly revived and used. These evolved genotypes lead to higher community functions even when maturation time or Newborn species composition differs from those during the evolution experiment (Fig. 7f and Supplementary Fig. 7). Fourth, extreme perturbations (e.g., extreme spiking percentages) can induce spurious heritability. For example, when we consider only three candidate strategies (no spiking, 30%-H spiking, and 30% M spiking), selection efficacy can be higher than when we also include the 60% H spiking and 60% M spiking strategies. In Supplementary Fig. 22a, the last stretch of community function ascent closely follows the switching of spiking strategy from 60% H to 30% H. This is presumably because 60% H spiking was so extreme that species composition does not return to the attractor within one cycle (Supplementary Fig. 24). Consequently, stochastic fluctuations in species composition become partially heritable (similar to Supplementary Fig. 19c), misleading the choice of spiking strategy.

We have tested the generality of our work in several ways. We varied the nature of perturbation (species spiking in Figs. 5–7, altering maturation time in Supplementary Fig. 9), the number of clones used in the spiking mixture (Supplementary Figs. 14, 15 and 16), the number of spiking strategies under heritability check (Supplementary Fig. 22), the frequency of heritability check (Supplementary Fig. 17), and ecological interactions between species (Supplementary Figs. 20 and 21). In all cases, improvements in community function can be sped up by perturbations that improve community function heritability. Although we mainly studied a costly community function, we obtained similar results when community function is not costly (Supplementary Fig. 18). Selection can be sped up by improving community function heritability even when species composition fails to reach the steady state by the end of maturation, in which case an otherwise nonheritable determinant becomes partially heritable (Supplementary Fig. 19). In sum, for communities with a stable species composition, appropriate perturbation strategies under the guidance of heritability checks can speed up the improvement of community function.

Our most simplified case (Figs. 5 and 6) involves a single-peaked landscape (Fig. 2c). Even if the landscape has multiple peaks, we only need to consider the landscape region near the Newborn restrictor. Should the restrictor traverse multiple peaks, we have a multipeak optimization problem. The broader optimization field deals with this by iteratively invoking an algorithm designed for a single peak, but at different starting points (e.g., MATLAB’s GlobalSearch and MultiStart algorithms). In the case of community selection, this means sampling diverse starting compositions.

What makes community selection effective? Effective community selection relies on optimizing intercommunity variation, selection strength, and community function heritability. Experiments showed that low heritability of community function could limit selection efficacy^{26,27,28}. Since our community function is affected by Newborn species composition, having an attractor does not solve the problem of nonheritable variations in Newborn species composition (contrary to the claim by Doulcier et al.¹⁶). Our work identifies two strategies for improving community function heritability. One strategy is to reduce the variation in nonheritable determinant ϕ_M(0), for example, through cell sorting¹⁵. Although nonheritable variation can also be reduced by increasing Newborn size, large Newborn size leads to cheater takeover in all communities and consequently selection failure (Supplementary Fig. 2)¹⁵. Intriguingly, large Newborn size hinders community selection even for noncostly community function, presumably because large Newborn size reduces inter-community variation (Supplementary Fig. 18b). The other strategy for improving heritability is to reduce the dependence of community function on the nonheritable determinant. This can be achieved through proper perturbations (e.g., manipulating species composition in Figs. 5–7 and Supplementary Fig. 6; altering maturation time in Supplementary Fig. 9). Overall, selection efficacy is improved when we improve community function heritability (compare improvement rates in Fig. 6a with Fig. 3a and Fig. 6b, Figs. 5–7). In contrast, if we reduce community function heritability (and variation) by mixing Adults before reproduction, selection efficacy is poor (Supplementary Fig. 5).

Optimizing community selection is challenging, partly because variation, selection strength, and heritability are interconnected. For example, strong selection can diminish selection efficacy by reducing intercommunity variation (“top 2” working less well than “top 10” in Fig. 6¹⁵). Drastically diluting a parent Adult (strong bottleneck) increases intercommunity variation, but also creates large stochastic variations that reduce heritability. Our current work showcases an additional difficulty in achieving effective community selection. On the one hand, species-composition attractor can enhance heritability of community function by promoting species coexistence. On the other hand, the attractor and its associated Newborn restrictor mean that communities may only sample a small region in the community function landscape. This can constrain both selection dynamics (Fig. 5) and selection outcome (Fig. 3b). These concepts have prompted us to devise perturbation strategies to shift the Newborn restrictor to a region conferring higher heritability.

For future directions, we start from the empirical side. (1) How to balance variation with heritability during community selection? In Chang et al.⁴, selection efficacy was improved by alternating community perturbations (whose stochastic effects boost intercommunity variation but reduce heritability) with community stabilization (so that stabilized communities might display attractors). Our work suggests that following stabilization, applying perturbation strategies to increase heritability could further increase selection efficacy. (2) How to reduce workload of community selection? (3) What general principles might we learn from applying selection to diverse types of communities? (4) How to scale up evolved communities for industrial applications? Unlike community selection, large-scale productions involve large microbial populations and long duration, and these conditions will facilitate cheater takeover¹⁵ (Supplementary Fig. 2). To combat cheaters during large-scale production, several strategies can be deployed. If the ability to make a product is engineered, then the costly synthesis of the product can be induced at the end of growth phase, thus reducing the growth advantage of cheaters. Additionally, biosensors can be engineered to link cellular growth to product level, thus blocking cheater growth⁴⁷. Moreover, community selection can be sped up by using strains with high mutation rates (i.e., mutators), and after the desired genotype has been obtained, the mutator genotype can be repaired, so that fewer cheaters arise during large-scale production. Finally, a spatially structured environment (such as microwells or droplets) can be introduced to reduce Newborn size and restrict cheater takeover^{48,49}.

We also need new theories on community selection. Over the last century, a rich body of theory has been developed to understand evolution of quantitative traits in individual organisms (e.g., the work by Lynch and Walsh⁵⁰). Two of the key concepts we use here are landscape and heritability, which are fundamental for understanding the evolution of individual traits. For example, phenotype landscape as a function of genetic and environmental determinants was used to illustrate the evolution of developmental interactions^{31,32}. The local geometry (gradient) of landscape determines how sensitive the phenotype is to underlying variations, and thus to selection based on the phenotype. Heritability is a pivotal concept in evolutionary biology, particularly in breeding. Multiple statistical methods have been developed to estimate heritability and facilitate designing effective breeding schemes⁵⁰. These methods, although providing inspirations for this work, will need to be expanded to be directly applicable to community selection. Thus, it will be important to develop new theories that incorporate unique features of community selection, such as ecological dynamics resulting from species interactions, interactions between ecological and evolutionary dynamics, and the interplay between intra- and intercommunity selection.

Methods

Calculating landscape, attractor, and restrictor

In this work, we considered communities with commensal, mutualistic, and exploitative interactions. Below, we describe the differential equations for each type of interaction, and how we calculate the corresponding community function landscape, species-composition attractor, and Newborn restrictor.

Commensal H–M community: The model community for most simulations is the same commensal H–M community used in our previous work¹⁵. The community function landscape plots P(T) as a function of ϕ_M(0) and ${\overline{f}}_{P}(0)$. Assume that a Newborn community has 100 biomass units, that all cells have the same genotype (all M cells have the same ${f}_{P}={\overline{f}}_{P}(0)$), that death and birth processes are deterministic, and that there is no mutation. P(T) can then be numerically integrated from the following set of scaled differential equations for any given pair of ϕ_M(0) and ${\overline{f}}_{P}(0)$¹⁵:

$$\frac{dR}{dt}=-{c}_{{RM}}{g}_{M}M-{c}_{{RH}}{g}_{H}H$$

(1)

$$\frac{dB}{dt}={g}_{H}H-{c}_{{BM}}{g}_{M}M$$

(2)

$$\frac{dP}{dt}={f}_{P}{g}_{M}M$$

(3)

$$\frac{dH}{dt}={g}_{H}H-{\delta }_{H}H$$

(4)

$$\frac{dM}{dt}={g}_{M}\left(1-{f}_{P}\right)M-{\delta }_{M}M$$

(5)

where

$${g}_{H}(R)={g}_{{Hmax}}\frac{R}{R+{K}_{{HR}}}$$

(6)

$${g}_{M}(R,\ B)={g}_{{Mmax}}\frac{{R}_{M}{B}_{M}}{{R}_{M}+{B}_{M}}\left(\frac{1}{{R}_{M}+1}+\frac{1}{{B}_{M}+1}\right)$$

(7)

and R_M = R/K_MR and B_M = B/K_MB. Unless otherwise specified, landscapes in this paper are obtained by integrating Equations (1–5) from t = 0 to t = 17.

Equation (1) states that Resource R is depleted by biomass growth of M and H, where c_RM and c_RH represent the amount of R consumed per unit of M and H biomass, respectively. Equation (2) states that Byproduct B is released as H grows, and is decreased by biomass growth of M due to consumption (c_BM amount of B per unit of M biomass). Equation (3) states that Product P is produced as f_P fraction of potential M growth. Equation (4) states that H biomass increases at a rate dependent on Resource R in a Monod fashion (Equation (6)) and decreases at the death rate δ_H. Note that Agricultural waste is not a state variable here as it is present in excess. Equation (5) states that M biomass increases at a rate dependent on Resource R and Byproduct B according to the Mankad and Bungay model (Equation (7)⁵¹) discounted by (1 − f_P) due to the fitness cost of making Product, and decreases at the death rate δ_M. In the Monod growth model (Equation (6)), g_Hmax is the maximal growth rate of H and K_HR is the R at which g_Hmax/2 is achieved. In the Mankad and Bungay model (Equation (7)), K_MR is the R at which g_Mmax/2 is achieved when B is in excess; K_MB is the B at which g_Mmax/2 is achieved when R is in excess.
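For readers who wish to reproduce one point of the landscape, the following Python sketch integrates Equations (1–7) with a fixed-step fourth-order Runge–Kutta scheme. It is a minimal sketch under stated assumptions: the numerical values in `PARAMS` are illustrative placeholders and are not the values of Table 1, the function names are hypothetical, and the integration scheme is our choice for self-containment rather than the method used to generate the figures. Resource and Byproduct are clamped at zero inside the growth functions to guard against small numerical undershoot.

```python
# Illustrative placeholder parameters (assumed; NOT the values of Table 1).
PARAMS = {
    "g_Hmax": 0.3, "g_Mmax": 0.7,
    "K_HR": 0.2, "K_MR": 0.3, "K_MB": 50.0,
    "c_RH": 1e-4, "c_RM": 1e-4, "c_BM": 1.0 / 3.0,
    "delta_H": 1.5e-3, "delta_M": 3.5e-3,
    "R0": 1.0,  # scaled Resource supplied to each Newborn
}


def growth_rates(R, B, p):
    """Monod growth of H (Eq. 6) and Mankad-Bungay growth of M (Eq. 7)."""
    R, B = max(R, 0.0), max(B, 0.0)
    g_H = p["g_Hmax"] * R / (R + p["K_HR"])
    R_M, B_M = R / p["K_MR"], B / p["K_MB"]
    if R_M + B_M == 0.0:
        return g_H, 0.0
    g_M = (p["g_Mmax"] * R_M * B_M / (R_M + B_M)
           * (1.0 / (R_M + 1.0) + 1.0 / (B_M + 1.0)))
    return g_H, g_M


def derivatives(state, f_P, p):
    """Right-hand sides of Equations (1-5) for state (R, B, P, H, M)."""
    R, B, P, H, M = state
    g_H, g_M = growth_rates(R, B, p)
    return (
        -p["c_RM"] * g_M * M - p["c_RH"] * g_H * H,   # dR/dt, Eq. (1)
        g_H * H - p["c_BM"] * g_M * M,                # dB/dt, Eq. (2)
        f_P * g_M * M,                                # dP/dt, Eq. (3)
        g_H * H - p["delta_H"] * H,                   # dH/dt, Eq. (4)
        g_M * (1.0 - f_P) * M - p["delta_M"] * M,     # dM/dt, Eq. (5)
    )


def rk4_step(state, f_P, p, dt):
    def shifted(k, scale):
        return tuple(s + scale * dt * ki for s, ki in zip(state, k))
    k1 = derivatives(state, f_P, p)
    k2 = derivatives(shifted(k1, 0.5), f_P, p)
    k3 = derivatives(shifted(k2, 0.5), f_P, p)
    k4 = derivatives(shifted(k3, 1.0), f_P, p)
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def mature(phi_M0, f_P, p, T=17.0, dt=0.01, newborn_biomass=100.0):
    """Integrate one maturation cycle; returns (R, B, P, H, M) at time T."""
    state = (p["R0"], 0.0, 0.0,
             newborn_biomass * (1.0 - phi_M0), newborn_biomass * phi_M0)
    for _ in range(round(T / dt)):
        state = rk4_step(state, f_P, p, dt)
    return state


R_T, B_T, P_T, H_T, M_T = mature(phi_M0=0.6, f_P=0.1, p=PARAMS)
phi_M_T = M_T / (M_T + H_T)  # species composition of the Adult
```

Evaluating `mature` over a grid of ϕ_M(0) and f_P values and recording `P_T` would yield a landscape of the kind described in this section, subject to the placeholder parameters.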

Mutualistic H–M community: If Byproduct is harmful for H, then the community is mutualistic: H and M promote the growth of each other. Such a mutualistic community can still be described by Equations (1–5) and (7), but Equation (6) is replaced with

$${g}_{H}(R)={g}_{{Hmax}}\frac{R}{R+{K}_{{HR}}}\exp \left(-\frac{B}{{B}_{0}}\right)$$

(8)

where larger B₀ indicates lower sensitivity, or higher resistance of H to its Byproduct B.

Exploitative H–M community: If M releases an antagonistic byproduct A that inhibits the growth of H, then the interaction is exploitative: H promotes the growth of M, but M inhibits the growth of H. Besides Equations (1–5) and (7), we then need to add an equation that describes the dynamics of A:

$$\frac{d\widetilde{A}}{dt}={r}_{A}{g}_{M}\left(1-{f}_{P}\right)M$$

where r_A is the amount of A released when M’s biomass grows by 1 unit. We can then normalize $\widetilde{A}$ with r_A

$$A=\widetilde{A}/{r}_{A}$$

so that

$$\frac{dA}{dt}={g}_{M}\left(1-{f}_{P}\right)M.$$

(9)

We also need to modify the growth rates for H:

$${g}_{H}={g}_{H}(R)={g}_{{Hmax}}\frac{R}{R+{K}_{{HR}}}\frac{{A}_{0}}{A+{A}_{0}}$$

(10)

where larger A₀ indicates lower sensitivity, or higher resistance of H to M’s antagonistic byproduct A.
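The three growth laws for H differ only by a multiplicative inhibition factor, which the following Python sketch makes explicit. The function names and the numerical arguments in the example are hypothetical; the formulas follow Equations (6), (8), and (10).

```python
import math


def g_H_commensal(R, g_Hmax, K_HR):
    """Eq. (6): Monod growth on Resource."""
    return g_Hmax * R / (R + K_HR)


def g_H_mutualistic(R, B, g_Hmax, K_HR, B_0):
    """Eq. (8): Monod growth discounted by H's own Byproduct B."""
    return g_H_commensal(R, g_Hmax, K_HR) * math.exp(-B / B_0)


def g_H_exploitative(R, A, g_Hmax, K_HR, A_0):
    """Eq. (10): Monod growth discounted by M's antagonistic byproduct A."""
    return g_H_commensal(R, g_Hmax, K_HR) * A_0 / (A + A_0)


# Arbitrary example values (assumed for illustration only).
base = g_H_commensal(R=1.0, g_Hmax=0.3, K_HR=0.2)
half_inhibited = g_H_exploitative(R=1.0, A=2.0, g_Hmax=0.3, K_HR=0.2, A_0=2.0)
sensitive = g_H_mutualistic(R=1.0, B=10.0, g_Hmax=0.3, K_HR=0.2, B_0=5.0)
resistant = g_H_mutualistic(R=1.0, B=10.0, g_Hmax=0.3, K_HR=0.2, B_0=50.0)
```

In the exploitative case, A₀ is the level of A at which H's growth rate is halved; in the mutualistic case, a larger B₀ yields a weaker exponential discount.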

To calculate the community function landscape, species attractor, and Newborn restrictor, all phenotype parameters except ${\overline{f}}_{P}(0)$ take the values from the Bounds column in Table 1. To construct a landscape such as that in Fig. 2c, we calculated P(T) for every grid point on a 2D quadrilateral mesh of 10⁻² ≤ ϕ_M(0) ≤ 0.99 and $10^{-2}\le {\overline{f}}_{P}(0)\le 0.99$ with a mesh size of Δϕ_M(0) = 10⁻² and $\Delta {\overline{f}}_{P}(0)=10^{-2}$. To construct the landscapes in Fig. 5b(ii) and b(iii), P(T) was similarly calculated on a 2D grid with a finer mesh of Δϕ_M(0) = 5 × 10⁻³ and $\Delta {\overline{f}}_{P}(0)=10^{-4}$.

To calculate the species composition attractor, we integrated Equations (1–5) to obtain ϕ_M(T) − ϕ_M(0) for each grid point on the 2D mesh of ϕ_M(0) and ${\overline{f}}_{P}(0)$. The contour of ϕ_M(T) − ϕ_M(0) = 0 is then the species attractor (blue dashed curve in Fig. 2b).

The attractor-induced Newborn restrictor at a given ${\overline{f}}_{P}(0)$ is calculated from its definition: if ϕ_M(0) of a parent Newborn is on the restrictor, then so is the average ϕ_M(0) among its offspring Newborns. Under no spiking, since the average ϕ_M(0) among offspring Newborn is the same as ϕ_M(T) of their parent Adult, the Newborn restrictor coincides with the species attractor (Fig. 3b and Fig. 5b ii). Under x% H spiking, x% of the biomass in Newborns is replaced with H cells. Thus if the parent Adult’s fraction of M biomass is ϕ_M(T), the average ϕ_M(0) among its offspring Newborns is (1 − x%)ϕ_M(T) under x% H spiking. The Newborn restrictor therefore is the contour of (1 − x%)ϕ_M(T) − ϕ_M(0) = 0 (teal curve in Fig. 5a ii and b iii, Fig. 2d ii). Compared with the orange restrictor under no spiking, the teal restrictor is shifted down.
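At a fixed value of the heritable determinant, locating the restrictor amounts to finding the root of (1 − x%)ϕ_M(T) − ϕ_M(0) = 0. The Python sketch below does this by bisection. It is illustrative only: `toy_maturation_map` is a hypothetical linear stand-in for the map from ϕ_M(0) to ϕ_M(T), which in practice would come from integrating Equations (1–5), and the function names are our own.

```python
def restrictor_point(phi_T_of_phi0, spike_frac=0.0,
                     lo=1e-6, hi=1.0 - 1e-6, tol=1e-10):
    """Solve (1 - spike_frac) * phi_M(T) - phi_M(0) = 0 by bisection.

    phi_T_of_phi0: callable mapping Newborn phi_M(0) to Adult phi_M(T).
    spike_frac:    fraction x of Newborn biomass replaced with H cells.
    """
    def residual(phi0):
        return (1.0 - spike_frac) * phi_T_of_phi0(phi0) - phi0

    if residual(lo) * residual(hi) > 0:
        raise ValueError("no sign change: root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(lo) * residual(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def toy_maturation_map(phi0):
    # Hypothetical stand-in: composition relaxes halfway toward 0.6 per cycle.
    return 0.5 * phi0 + 0.3


attractor = restrictor_point(toy_maturation_map)               # no spiking
restrictor_30H = restrictor_point(toy_maturation_map, 0.30)    # 30% H spiking
```

For this toy map, the unspiked root sits at ϕ_M(0) = 0.6, and 30% H spiking moves it to 0.21/0.65 ≈ 0.323, a downward shift in the same direction as that of the teal restrictor.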

Parameter choices

Details justifying our parameter choices are given in the Methods section of our previous work¹⁵. Briefly, our parameter choices are based on experimental measurements of microorganisms (e.g., S. cerevisiae and E. coli). To ensure the coexistence of H and M, M must grow faster than H for part of the maturation cycle since M has to wait for H’s Byproduct at the beginning of a cycle. Because we have assumed M and H to have similar affinities for Resource (Table 1), the maximal growth rate of M (g_Mmax) must exceed the maximal growth rate of H (g_Hmax), and M’s affinity for Byproduct (1/K_MB) must be sufficiently large. Moreover, metabolite release and consumption need to be balanced to avoid extreme species ratios. We assume that H and M consume the same amount of Resource per new cell (c_RH = c_RM) since the biomass of various microbes shares similar elemental (e.g., carbon or nitrogen) compositions. We set the consumption values so that the input Resource can support a maximum of 10⁴ total biomass. The evolutionary bounds are set such that evolved H and M can coexist for f_P < 0.5, and such that Resource is, on average, not depleted by T, so that cells do not enter stationary phase.

In our simulations, we define “mutation rate” as the rate of nonneutral mutations that alter a phenotype. For example in yeast, mutations that increase growth rate by ≥2% occur at a rate of ~10⁻⁴ per genome per generation (calculated from Fig. 3 of Levy et al.⁵²), and mutations that reduce growth rate occur at a rate of 10⁻⁴ to 10⁻³ per genome per generation^{53,54}. Moreover, mutation rate can be elevated by as much as 100-fold in hypermutators. In our simulations, we assume a high, but biologically feasible, rate of 2 × 10⁻³ phenotype-altering mutations per cell per generation per phenotype to speed up computation. At this rate, an average community would sample ~20 new mutations per phenotype during maturation. When we simulated with a 100-fold lower mutation rate, evolutionary dynamics slowed down, but all of our conclusions still held¹⁵. Among phenotype-altering mutations, tens of percent create null mutants, as illustrated by experimental studies on proteins, viruses, and yeast^{53,55,56}. Thus, we assumed that 50% of phenotype-altering mutations were null (i.e., resulting in zero maximal growth rate, zero affinity for metabolite, or zero f_P). Among nonnull mutations, the relative abundances of enhancing versus diminishing mutations are highly variable in different experiments. We based our distribution of mutation effects on experimental studies on S. cerevisiae where the fitness effects of thousands of mutations were quantified under various nutrient limitations in an unbiased fashion⁵⁷. The relative fitness changes caused by beneficial (phenotype-enhancing) and deleterious (phenotype-diminishing) mutations can be approximated by a bilateral exponential distribution with means s₊ = 0.050 ± 0.002 and s₋ = 0.067 ± 0.003 for the positive and negative halves, respectively.

Simulating community selection with small population size

Individual-based stochastic simulation codes used in this work are largely similar to those in our previous work, except for the modification to simulate species spiking. Below, we briefly recapitulate the flow of the simulation, details of which can be found in our previous work¹⁵.

Each simulation begins with n_tot = 100 identical Newborns. The total biomass of each Newborn is BM_target. In most simulations, BM_target = 100 consisting of 60 M cells and 40 H cells of biomass 1. Each Newborn is supplied with abundant Agricultural waste and a fixed amount of Resource that supports the growth of 10⁴ total biomass. Unless otherwise specified, maturation time is set to T = 17 (~6 generations) to avoid Resource depletion (i.e., stationary phase in experiments) and cheater takeover.

Each maturation cycle is divided into time steps of length Δτ = 0.05. During each time step, the biomass of each M and H cell grows deterministically according to Equations (4) and (5) (or the corresponding equations for mutualistic and exploitative communities) without the death terms, while the concentration of Resource, Byproduct and Product changes according to Equations (1–3). At the end of each Δτ, each M and H cell dies with a probability of δ_MΔτ and δ_HΔτ, respectively.

Among the surviving cells, if a cell’s biomass exceeds the threshold of 2, the cell divides into two identical daughter cells. Each daughter cell then mutates with a probability of P_mut. For an M cell, its f_P, g_Mmax, K_MR and K_MB can mutate independently. For an H cell, its g_Hmax and K_HR can mutate independently. Additionally, H’s resistance to Byproduct B₀ in a mutualistic H–M community and H’s resistance to M’s antagonistic byproduct A₀ in an exploitative H–M community can mutate independently. In our simulations, these biological parameters are inherited upon cell division, and thus we refer to them as phenotypes or genotypes interchangeably. In most simulations of the simple scenario, only f_P of M mutates, while other phenotypes are held at their bounds whose values are shown in the “Bounds” column of Table 1. In simulations where six phenotypes of the commensal H–M community or seven phenotypes of the mutualistic H–M community could be modified by mutations (e.g., Fig. 7 and Supplementary Fig. 12 for the commensal and Supplementary Fig. 20 for the mutualistic community), these phenotypes start from ancestral values shown in the “Ancestral” column of Table 1. In simulations where seven phenotypes of the exploitative H–M community could be modified by mutations (Supplementary Fig. 21), M’s f_P and H’s sensitivity to M’s antagonistic byproduct A₀ start from ancestral values shown in the “Ancestral” column of Table 1. The other five growth phenotypes (2 maximal growth rates and 3 affinities to B and R) start from the evolutionary bounds shown in the “Bounds” column of Table 1. This choice allows us to speed up the simulation, since if we initiate these five growth phenotypes from ancestral values, there is not enough biomass in Adult communities to perform heritability checks until after more than 1000 cycles.

If a mutation occurs, it is a null mutation with a probability of $\frac{1}{2}$. A null mutation reduces f_P, g_Mmax, g_Hmax, A₀, or B₀ to zero, and increases K_MR, K_MB, or K_HR to infinity (equivalent to reducing affinities to zero). If a mutation is not null, it modifies each phenotype by ~5–6% on average. Specifically, each phenotype is multiplied by (1 + Δs), where Δs is a random variable with a distribution

$${\mu }_{\Delta s}(\Delta s)=\left\{\begin{array}{ll}\frac{1}{{s}_{+}+{s}_{-}(1-\exp (-1/{s}_{-}))}\exp (-\Delta s/{s}_{+}),&{\rm{if}}\ \Delta s\ge 0;\\ \frac{1}{{s}_{+}+{s}_{-}(1-\exp (-1/{s}_{-}))}\exp (\Delta s/{s}_{-}),&{\rm{if}}\ -1 < \Delta s < 0.\end{array}\right.$$

(11)

Here, s₊ = 0.05 and s₋ = 0.067 are the average fractional amounts (i.e., 5% and 6.7%) by which a mutation increases or decreases a phenotype, respectively (for parameter justifications, see our previous work¹⁵).
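One way to draw from Equation (11) is by inverse-transform sampling, as in the Python sketch below. This is an illustrative sketch rather than the simulation code: the function names are hypothetical, the null value passed to `mutate` must be chosen by the caller (zero for f_P, growth rates, A₀ and B₀; infinity for the K parameters), and evolutionary bounds are not enforced here.

```python
import math
import random

S_PLUS, S_MINUS = 0.05, 0.067


def sample_delta_s(rng):
    """Draw a relative phenotype change from the distribution in Eq. (11)."""
    # Probability mass of the negative half, truncated at -1.
    neg_mass = S_MINUS * (1.0 - math.exp(-1.0 / S_MINUS))
    p_positive = S_PLUS / (S_PLUS + neg_mass)
    if rng.random() < p_positive:
        return rng.expovariate(1.0 / S_PLUS)       # mean S_PLUS, unbounded
    u = rng.random()
    # Inverse CDF of an exponential truncated to (-1, 0].
    return S_MINUS * math.log(1.0 - u * (1.0 - math.exp(-1.0 / S_MINUS)))


def mutate(value, rng, null_value=0.0):
    """Apply one mutation: null with probability 1/2, else scale by 1 + ds."""
    if rng.random() < 0.5:
        return null_value
    return value * (1.0 + sample_delta_s(rng))


rng = random.Random(0)
draws = [sample_delta_s(rng) for _ in range(20000)]
positives = [d for d in draws if d >= 0]
mutants = [mutate(0.1, rng) for _ in range(10000)]
```

Because exp(−1/s₋) is of order 10⁻⁷, the truncation at −1 is rarely active, and roughly s₊/(s₊ + s₋) ≈ 43% of nonnull mutations are enhancing.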

At the end of a maturation cycle, the amount of Product P accumulated in the Adult, P(T), is the community function. In some simulations, measurement noise is added to the true P(T) to yield the measured community function. For the simple scenario where only f_P is modified by mutations (e.g., Fig. 6(e–h)), measurement noise is a normal random variable with 0 mean and standard deviation of 100, approximately 10% of the community function of Cycle 1. For the complex scenario where 6 or 7 phenotypes of H and M are modified by mutations (e.g., Fig. 7), measurement noise is a normal random variable with 0 mean and standard deviation of 50. The magnitude of noise is comparable to the ancestral commensal and mutualistic community function, and ~1/4 of the ancestral exploitative community function. Top n_chosen Adult communities with the highest measured function are chosen to be reproduced. Sometimes, more than n_chosen Adults might be needed to obtain n_tot = 100 Newborns for the next cycle if there is not enough biomass in n_chosen Adults.
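The selection step with measurement noise can be sketched as follows. The function name `choose_adults` and the example values are hypothetical; the sketch simply adds zero-mean normal noise to each true P(T) and ranks Adults by the measured value.

```python
import random


def choose_adults(true_functions, n_chosen, noise_sd, rng):
    """Rank Adults by measured function and return the top n_chosen indices.

    measured = true P(T) + normal noise with mean 0 and s.d. noise_sd.
    """
    measured = [p + rng.gauss(0.0, noise_sd) for p in true_functions]
    ranked = sorted(range(len(measured)),
                    key=lambda i: measured[i], reverse=True)
    return ranked[:n_chosen], measured


rng = random.Random(1)
chosen_exact, _ = choose_adults([5.0, 1.0, 9.0, 7.0], 2, 0.0, rng)
chosen_noisy, measured = choose_adults([1000.0 + i for i in range(100)],
                                       10, 100.0, rng)
```

When the noise is large relative to the spread of true functions, as in the second call, the chosen set can differ substantially from the truly highest-functioning Adults, which is one way in which measurement noise lowers the apparent heritability.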

Chosen Adults are reproduced into Newborns with different methods. If “cell sorting” is used, then the deviation of a Newborn’s total biomass from the target BM_target = 100 is within 2, and the deviation of a Newborn’s species ratio from that of the parent Adult is within 2%. If “pipetting an inoculum of a fixed total biomass” is used, then a Newborn’s total biomass is within a deviation of two from the target BM_target, while its species composition fluctuates stochastically. If “pipetting” is used, then Newborn’s total biomass and species composition both fluctuate stochastically. The dilution fold of each Adult is adjusted, so that the average Newborn community’s total biomass is BM_target over all selection cycles. If a fraction φ_S of a Newborn’s biomass is to be replaced by M or H cells, each Newborn gets on average a biomass of $B{M}_{{{{{{{{\rm{target}}}}}}}}}\left(1-{\varphi }_{S}\right)$ from its parent Adult community and on average a biomass of BM_targetφ_S from M or H-spiking mix. Specifically, suppose that the biomass of an Adult is BM(T) = M(T) + H(T) where M(T) and H(T) are the biomass of M and H at time T, respectively. If a fraction φ_S of each Newborn’s biomass is to be replaced by a spiking mix, this Adult is then reproduced into n_D Newborns, where

$${n}_{D}=\left\lfloor \frac{BM(T)}{B{M}_{\rm{target}}\left(1-{\varphi }_{S}\right)}\right\rfloor$$

(12)

and $\left\lfloor x\right\rfloor$ is the floor (round-down) function. If n_D is larger than n_tot/n_chosen, only n_tot/n_chosen Newborns are kept. Otherwise, all n_D Newborns are kept, and as many additional Adults with the next highest functions as needed are reproduced to obtain n_tot Newborns for the next cycle. These Newborns are then topped off with either M or H spiking mixes so that their total biomass is on average BM_target = 100, as described in the next subsection. Note that the dilution fold of an Adult is calculated based on biomass, a continuous variable. However, the biomass is composed of the individual biomass of discrete cells. During reproduction, an integer number of cells is distributed into each Newborn community.
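As a worked example of Equation (12) and the capping rule, consider the following Python sketch. The function names and Adult biomasses are hypothetical. One detail is an assumption on our part: additional Adults beyond the top n_chosen are subject to the same per-Adult cap, and the last Adult contributes only as many Newborns as are still needed.

```python
import math


def dilution_fold(bm_adult, bm_target=100.0, spike_frac=0.0):
    """Eq. (12): number of Newborns one Adult can seed."""
    return math.floor(bm_adult / (bm_target * (1.0 - spike_frac)))


def newborns_kept(bm_adult, bm_target=100.0, spike_frac=0.0,
                  n_tot=100, n_chosen=10):
    """Cap the contribution of one Adult at n_tot / n_chosen Newborns."""
    return min(dilution_fold(bm_adult, bm_target, spike_frac),
               n_tot // n_chosen)


def allocate(ranked_adult_biomasses, n_tot=100, **kwargs):
    """Walk down the ranked Adults until n_tot Newborns are obtained."""
    counts, total = [], 0
    for bm in ranked_adult_biomasses:
        k = min(newborns_kept(bm, n_tot=n_tot, **kwargs), n_tot - total)
        counts.append(k)
        total += k
        if total >= n_tot:
            break
    return counts


large_adults = allocate([2500.0] * 12, spike_frac=0.3)
small_adults = allocate([450.0] * 30)
```

An Adult of biomass 2500 under 30% spiking has n_D = ⌊2500/70⌋ = 35 but contributes only 10 Newborns, so ten Adults suffice. Adults of biomass 450 without spiking each yield only 4 Newborns, so 25 Adults must be reproduced.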

Simulating species spiking when only M’s cost f_P mutates

In the simple scenario where only f_P of M is modified by mutations, phenotypes of all H cells are the same. Within an Adult community, all H cells also have identical individual biomass L_H, because simulations start with H cells of biomass 1 and because growth is synchronous. To mimic reproducing a chosen Adult through pipetting an inoculum of a fixed total biomass into each Newborn with a φ_S-H-spiking strategy, H and M cells from the chosen Adult are randomly assigned to a Newborn community, until its total biomass comes closest to $B{M}_{{{{{{{{\rm{target}}}}}}}}}\left(1-{\varphi }_{S}\right)$. If φ_S > 0, the number of H cells supplemented to the Newborn community is the nearest integer to $B{M}_{{{{{{{{\rm{target}}}}}}}}}{\varphi }_{S}{L}_{H}^{-1}$. Because integer number of cells is assigned to each Newborn, the total biomass might not be exactly BM_target but within a small deviation of ~2 biomass units.

To mimic reproducing through pipetting, each M and H cell in an Adult community is assigned a random integer between 1 and dilution factor n_D (Equation (12)). All cells assigned with the same random integer are then dealt to the same Newborn, generating n_D Newborn communities. If φ_S > 0, the number of H cells supplemented into each Newborn is a random number drawn from a Poisson distribution of a mean of $B{M}_{{{{{{{{\rm{target}}}}}}}}}{\varphi }_{S}{L}_{H}^{-1}$.
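The pipetting procedure can be sketched in Python as below. This is a minimal sketch with hypothetical names: cells are represented as (species, biomass) tuples, and because the Python standard library has no Poisson sampler, we substitute Knuth's multiplication method, which is adequate for the modest means involved here.

```python
import math
import random


def poisson(mean, rng):
    """Knuth's method for a Poisson draw (suitable for modest means)."""
    limit, k, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def pipette(adult_cells, n_D, rng):
    """Deal each cell to the Newborn whose index matches its random integer."""
    newborns = [[] for _ in range(n_D)]
    for cell in adult_cells:
        newborns[rng.randint(1, n_D) - 1].append(cell)
    return newborns


def spike_with_H(newborn, bm_target, spike_frac, L_H, rng):
    """Supplement a Newborn with a Poisson-distributed number of H cells."""
    n_spike = poisson(bm_target * spike_frac / L_H, rng)
    return newborn + [("H", L_H)] * n_spike


rng = random.Random(2)
adult = [("M", 1.5)] * 900 + [("H", 1.5)] * 600     # hypothetical Adult
newborns = pipette(adult, n_D=20, rng=rng)
spiked = [spike_with_H(nb, 100.0, 0.3, 1.5, rng) for nb in newborns]
spike_counts = [poisson(100.0 * 0.3 / 1.5, rng) for _ in range(5000)]
```

Both the cells inherited from the Adult and the supplemented H cells fluctuate in number from Newborn to Newborn, which is the source of the stochastic variation in total biomass and species composition under pipetting.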

To mimic reproducing through cell sorting, each Newborn receives a biomass of $B{M}_{{{{{{{{\rm{target}}}}}}}}}\left(1-{\varphi }_{S}\right)$ from its parent Adult. Suppose that the fraction of M biomass in the parent Adult is ϕ_M(T), then M cells from the parent Adult are randomly assigned to the Newborn, until the total biomass of M comes closest to $B{M}_{{{{{{{{\rm{target}}}}}}}}}{\phi }_{M}(T)\left(1-{\varphi }_{S}\right)$ without exceeding it. H cells with a total biomass of $B{M}_{{{{{{{{\rm{target}}}}}}}}}\left(1-{\phi }_{M}(T)\right)\left(1-{\varphi }_{S}\right)$ are assigned similarly. If φ_S > 0, the number of H cells supplemented to the Newborn community is the nearest integer to $B{M}_{{{{{{{{\rm{target}}}}}}}}}{\varphi }_{S}{L}_{H}^{-1}$ where L_H is the biomass of individual H cell in the parent Adult. Because each of M and H cells had a length between 1 and 2, the actual biomass of M and H assigned to a Newborn could vary from the target by up to 2 biomass units. Consequently, deviations of BM(0) from BM_target and of ϕ_M(0) from parent Adult’s ϕ_M(T) are only a few percent.

Simulating species spiking when both H and M cells evolve

In the more complex scenario, both H and M evolve. We thus need to spike with evolved H and M clones. Additionally, Newborns are spiked with H or M clones from their own lineage as demonstrated in Supplementary Fig. 11a. Below, we describe the simulation code for the experimental procedure (Supplementary Fig. 11a) we simulated.

In all simulations where 6 or 7 phenotypes are modified by mutations, chosen Adults are reproduced through pipetting in a similar fashion as described above. After Newborns are reproduced from a chosen Adult in Cycle C − 1, a preset number of H or M cells are randomly picked from the remainder of this Adult to form the H- or M-spiking mix for Cycle C. At the end of Cycle C, we choose the 10 Adults with the highest functions. Assuming that each chosen Adult is reproduced through pipetting with a φ_S-H-spiking strategy, a Newborn receives on average a biomass of $BM_{\rm target}(1-\varphi_S)$ from its parent Adult community and on average a biomass of BM_target φ_S from the H-spiking mix generated at the end of Cycle C − 1. Since each chosen Adult usually gives rise to 10 Newborns, the number of cells distributed from the chosen Adult to each Newborn is drawn from a multinomial distribution. Specifically, denote the integer random numbers of cells that would be assigned to the 10 Newborns as {x₁, x₂,…, x₁₀}. If the chosen Adult has a total biomass of BM(T) composed of I_M M cells and I_H H cells (both I_M and I_H are integers), the probability that {x₁, x₂,…, x₁₀} cells are assigned to the 10 Newborns, respectively, and x₁₁ cells remain, is

$$\Pr\left(\{x_1, x_2, \ldots, x_{10}, x_{11}\}\right) = \frac{(I_H + I_M)!}{x_1! \cdots x_{10}!\, x_{11}!}\; p_0^{\,x_1 + \cdots + x_{10}}\; p_{11}^{\,x_{11}}.$$

Here, $p_0 = BM_{\rm target}(1-\varphi_S)/BM(T)$ is the probability that a cell is assigned to a given Newborn, and p₁₁ = 1 − 10p₀ is the probability that a cell is not assigned to any Newborn. Thus, $x_{11} = I_H + I_M - \sum_{i=1}^{10} x_i$ is the number of cells remaining after reproduction, from which H and M cells are randomly picked to generate the spiking mix for Cycle C + 1.
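The multinomial assignment above can be illustrated with a short sketch. Labelling every cell independently with probability p₀ for each Newborn and p₁₁ for "left behind" yields exactly a multinomial draw. The function name `split_adult` and the numbers in the example are assumptions chosen for illustration.

```python
import random
from collections import Counter

def split_adult(n_cells, bm_target, phi_s, bm_adult, n_newborns=10, rng=None):
    """Return [x_1, ..., x_10, x_11]: cells per Newborn plus the cells left over."""
    rng = rng or random.Random()
    p0 = bm_target * (1.0 - phi_s) / bm_adult   # chance a cell lands in a given Newborn
    p_rest = 1.0 - n_newborns * p0              # chance a cell stays behind (p_11)
    if p_rest < -1e-12:
        raise ValueError("Adult biomass is too small for this dilution")
    weights = [p0] * n_newborns + [max(p_rest, 0.0)]
    # Labelling every cell independently with these weights is a multinomial draw.
    labels = rng.choices(range(n_newborns + 1), weights=weights, k=n_cells)
    counts = Counter(labels)
    return [counts[i] for i in range(n_newborns + 1)]

# Hypothetical numbers: BM_target = 100, phi_S = 0.3, Adult biomass 7000 in 5000 cells.
x = split_adult(n_cells=5000, bm_target=100, phi_s=0.3, bm_adult=7000,
                rng=random.Random(2))
```

In this example p₀ = 0.01, so each Newborn receives about 50 cells and roughly 4500 cells remain available for the next spiking mix.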

If the current spiking strategy is φ_S-H, these 10 Newborns are spiked with the H-spiking mix generated in Cycle C − 1. An average of BM_target φ_S of H biomass is spiked into each Newborn so that the total biomass of Newborns is on average BM_target. Suppose that five H cells from the parent Adult’s lineage are randomly picked at the end of Cycle C − 1, and that they have biomass {L_H1, L_H2, L_H3, L_H4, L_H5}, respectively. The total number of H cells assigned to each Newborn, x_H, is then randomly drawn from a Poisson distribution with a mean of $BM_{\rm target}\,\varphi_S/\overline{L}_H$, where $\overline{L}_H = \frac{1}{5}\sum_{j=1}^{5} L_{Hj}$ is the average biomass of the five H cells. Each spiked H cell has an equal chance of being one of the five cells.
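A minimal sketch of this spiking step, using only the Python standard library. The Poisson sampler is Knuth's multiplication method, chosen here for self-containment and not taken from the authors' code; the names `poisson` and `spike_h_cells` are hypothetical.

```python
import math
import random

def poisson(mean, rng):
    """Knuth's multiplication sampler; adequate for modest means (tens of cells)."""
    threshold = math.exp(-mean)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def spike_h_cells(spiking_mix, bm_target, phi_s, rng):
    """Draw the H cells spiked into one Newborn from its lineage's spiking mix.

    spiking_mix: biomasses of the H cells picked at the end of the previous cycle.
    """
    mean_biomass = sum(spiking_mix) / len(spiking_mix)     # average biomass of the mix
    x_h = poisson(bm_target * phi_s / mean_biomass, rng)   # number of spiked cells
    # Each spiked cell is equally likely to be any of the cells in the mix.
    return [rng.choice(spiking_mix) for _ in range(x_h)]
```

For example, with BM_target = 100, φ_S = 0.3, and a mix whose mean biomass is 1.5, the number of spiked cells has a Poisson mean of 20.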

Updating spiking percentage based on heritability checks

When the community function landscape is unknown, we can estimate heritability of community function under different spiking percentages through parent–offspring regression. In most simulations (e.g., Fig. 7), heritability evaluation is carried out about every 100 cycles (“periodic heritability check”). In the simulations demonstrated in Supplementary Fig. 17, the average improvement rate in community function is estimated from the chosen Adults over the last 50 cycles. Heritability evaluation is carried out when this average improvement rate becomes negative (“adaptive heritability check”). For both periodic and adaptive checks, heritability evaluation can be postponed until within-community selection has improved cell growth enough to provide sufficient biomass for the heritability check.

During one round of heritability evaluation, heritability of community function is estimated through parent–offspring community function regression under all candidate spiking strategies (Supplementary Fig. 11b). The current spiking strategy is updated if an alternative spiking strategy confers significantly higher community function heritability.

To evaluate heritability under one spiking strategy, up to 100 Newborn communities are generated under this spiking strategy. After these mature into Adults, their functions are the parent functions. Each Adult parent then gives rise to six Newborn offspring under the same spiking strategy. When the six Newborn offspring mature into Adults, the median of their functions is the average offspring function. When offspring functions are plotted against their parent functions, the slope of the least-squares linear regression (green dashed line in Supplementary Fig. 11b) quantifies the heritability of community function. Heritability of a community function is thus similar to heritability of an individual trait, except that we use median instead of mean of offspring functions, because median is less sensitive to outliers. The 95% confidence interval of heritability is then estimated by nonparametric bootstrap^58,59. More specifically, first, 100 pairs of parent–offspring community functions are resampled with replacement. Second, heritability is calculated with the resampled data. Third, 1000 heritabilities are calculated from 1000 independent resamplings, from which the 95% confidence interval is estimated from the 5th and 95th percentile.
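The regression-and-bootstrap procedure above can be sketched as follows. This is an illustrative implementation under stated assumptions: the function names, the nearest-rank percentile rule, and the skipping of degenerate resamples (all parents identical) are choices made for this sketch. The default percentiles (5, 95) follow the text; a conventional two-sided 95% percentile interval would instead use 2.5 and 97.5, which can be passed via `pcts`.

```python
import random
from statistics import mean, median

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def heritability_ci(parents, offspring, n_boot=1000, pcts=(5, 95), rng=None):
    """Return (heritability, ci_low, ci_high) from parent-offspring regression.

    parents:   function of each parent Adult.
    offspring: for each parent, the functions of its offspring Adults.
    pcts:      bootstrap percentiles; (5, 95) follows the text above.
    """
    rng = rng or random.Random()
    if len(set(parents)) < 2:
        raise ValueError("need at least two distinct parent functions")
    # Pair each parent with the median function of its offspring.
    pairs = [(p, median(o)) for p, o in zip(parents, offspring)]
    h = slope(*zip(*pairs))
    boots = []
    while len(boots) < n_boot:
        sample = rng.choices(pairs, k=len(pairs))   # resample pairs with replacement
        xs, ys = zip(*sample)
        if len(set(xs)) > 1:                        # skip degenerate resamples
            boots.append(slope(xs, ys))
    boots.sort()
    lo, hi = (boots[round(q / 100 * (n_boot - 1))] for q in pcts)
    return h, lo, hi
```

Using the median of the six offspring functions, as the text does, keeps a single outlying offspring from shifting the regression slope.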

An alternative spiking strategy is considered significantly more advantageous than the current spiking strategy if the heritability of the alternative spiking strategy is higher than the right endpoint of the 95% confidence interval of the heritability of the current spiking strategy. If more than one alternative spiking strategy is more advantageous, the one with the highest heritability is implemented to replace the current strategy. Similarly, an alternative spiking strategy is considered more disadvantageous if its heritability is lower than the left endpoint of the 95% confidence interval of the heritability of the current spiking strategy. When implementing the random spiking strategy, the current spiking strategy is replaced with a strategy randomly picked from the candidate spiking strategies.
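The update rule can be written compactly. In this sketch, the function name `update_strategy`, the strategy labels, and the heritability values are hypothetical; only the comparison against the upper confidence bound of the current strategy comes from the text.

```python
def update_strategy(current, estimates):
    """Pick the spiking strategy to use after a heritability check.

    estimates: dict mapping strategy name -> (heritability, ci_low, ci_high).
    """
    ci_high_current = estimates[current][2]
    better = {name: h for name, (h, _, _) in estimates.items()
              if name != current and h > ci_high_current}
    if not better:
        return current                      # nothing is significantly better
    return max(better, key=better.get)      # most heritable alternative wins

# Hypothetical heritability estimates for three candidate strategies.
estimates = {
    "0%-H":  (0.20, 0.05, 0.35),
    "30%-H": (0.50, 0.30, 0.70),
    "60%-H": (0.40, 0.20, 0.60),
}
```

Starting from "0%-H", both alternatives exceed the upper bound of 0.35, and the rule switches to "30%-H" because it has the highest heritability; starting from "30%-H", no alternative exceeds 0.70 and the strategy is kept.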

Simulating community selection with large population size

When the population size of each community is scaled up by 10 or 100 times (Supplementary Figs. 2 and 18b), the simulation code described above becomes inefficient. Instead of tracking the biomass and phenotype of each cell in a large population, we divide the cells into categories and track the number of cells in each category, where a category is defined by a unique combination of cell biomass and phenotype ranges. In our simulations, the biomass of each cell ranges between 1 and 2, and f_P of each M cell ranges between 0 and 1. Since H cells do not mutate, H cells are divided into 100 categories. H cells that belong to category i have a biomass between [1 + (i − 1) × ΔL, 1 + i × ΔL] where ΔL = 10⁻². Since only f_P of M cells is modified by mutations, M cells are divided into 100 × 10⁵ categories. M cells that belong to category (i, j) have a biomass between [1 + (i − 1) × ΔL, 1 + i × ΔL] and f_P between [(j − 1) × Δf_P, j × Δf_P] where Δf_P = 10⁻⁵. Every time f_P of an M cell is modified by a mutation, this cell jumps from its current category to a new category determined by its new f_P value.
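A minimal sketch of the category bookkeeping, assuming a sparse dictionary of counts so that empty categories cost nothing. The function names, the handling of values that fall exactly on a bin boundary (assigned to the upper bin, with the top edge clamped into the last bin), and the starting population are assumptions of this sketch.

```python
from collections import Counter

D_L = 1e-2     # biomass bin width, Delta L
D_FP = 1e-5    # f_P bin width, Delta f_P

def h_category(biomass):
    """Biomass bin index i in 1..100 for an H cell with biomass in [1, 2]."""
    return min(int((biomass - 1.0) / D_L) + 1, 100)

def m_category(biomass, f_p):
    """(i, j) bin indices for an M cell: i in 1..100, j in 1..100000."""
    return h_category(biomass), min(int(f_p / D_FP) + 1, 100_000)

# Track only occupied categories out of the 100 x 10**5 possible M categories.
m_counts = Counter()
m_counts[m_category(1.005, 0.123456)] += 250   # hypothetical starting population

def mutate_one(counts, old, new_f_p):
    """Move one cell to the category of its new f_P, keeping its biomass bin."""
    counts[old] -= 1
    counts[(old[0], min(int(new_f_p / D_FP) + 1, 100_000))] += 1
```

Calling `mutate_one(m_counts, (1, 12346), 0.0)` moves one cell from category (1, 12346) to category (1, 1), which is how a null mutation would be recorded.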

Similar to simulations with small population sizes, each selection cycle starts with n_tot = 100 Newborn communities. Maturation time T is divided into time steps of length Δτ = 0.05. Over each time step, the growth in cell biomass and the changes in metabolites are simulated in a similar fashion as described above. At the end of each time step, the number of cells to die or to mutate in each category is drawn from a binomial distribution. If f_P of an M cell is modified by mutation, the mutation effect is drawn from the same distribution as described above: $\frac{1}{2}$ of mutations reduce f_P to 0 and the other $\frac{1}{2}$ is randomly drawn from the distribution in Equation (11).
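The per-category binomial draws can be sketched as below. The parameter names `p_death` and `p_mut` (per-time-step probabilities), the choice to draw mutants among the surviving cells, and the placeholder `draw_non_null` standing in for the distribution in Equation (11) are assumptions of this sketch. The binomial sampler simply sums Bernoulli trials, which is easy to verify but not optimized.

```python
import random

def binomial(n, p, rng):
    """Number of successes in n Bernoulli(p) trials (simple, not optimized)."""
    return sum(rng.random() < p for _ in range(n))

def step_category(count, p_death, p_mut, rng):
    """Split one category's cells into (unchanged, dead, mutated) for one time step.

    Drawing mutants among the survivors is an assumption of this sketch.
    """
    dead = binomial(count, p_death, rng)
    mutated = binomial(count - dead, p_mut, rng)
    return count - dead - mutated, dead, mutated

def mutated_f_p(f_p, rng, draw_non_null):
    """Half of mutations are null (f_P -> 0); the rest use a supplied sampler.

    draw_non_null stands in for the distribution in Equation (11).
    """
    return 0.0 if rng.random() < 0.5 else draw_non_null(f_p, rng)
```

Each mutated cell would then be moved to the category matching its new f_P value, while dead cells are removed from the category's count.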

At the end of a maturation cycle, the top 10 Adults with the highest functions are chosen. Each then reproduces 10 Newborns via pipetting for the next cycle. The fold of dilution is similarly adjusted so that the average Newborn total biomass is BM_target over all selection cycles. From each category of a chosen Adult, the number of cells assigned to a Newborn community is randomly drawn from a multinomial distribution.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability

The data that support the findings of this study are available at https://github.com/shougroup/Xie_Shou_2021_SteeringEcoEvoDynamics/.

Code availability

All code used in this study is available at https://github.com/shougroup/Xie_Shou_2021_SteeringEcoEvoDynamics/.

References

Hillesland, K. L. & Stahl, D. A. Rapid evolution of stability and productivity at the origin of a microbial mutualism. Proc. Natl Acad. Sci. USA 107, 2124–2129 (2010).
Widder, S. et al. Challenges in microbial ecology: building predictive understanding of community function and dynamics. ISME J. 10, 2557–2568 (2016).
Lindemann, S. R. et al. Engineering microbial consortia for controllable outputs. ISME J. 10, 2077–2084 (2016).
Chang, C.-Y. et al. Engineering complex communities by directed evolution. Nat. Ecol. Evolution 5, 1011–1023 (2021).
Wilson, D. S. The natural selection of populations and communities (Benjamin/Cummings Pub. Co., 1980).
Goodnight, C. J. Heritability at the ecosystem level. Proc. Natl Acad. Sci. USA 97, 9365–9366 (2000).
Arias-Sánchez, F. I., Vessman, B. & Mitri, S. Artificially selecting microbial communities: If we can breed dogs, why not microbiomes? PLoS Biol. 17, e3000356 (2019).
Lawley, T. D. et al. Targeted restoration of the intestinal microbiota with a simple, defined bacteriotherapy resolves relapsing Clostridium difficile disease in mice. PLoS Pathog. 8, e1002995 (2012).
Zhou, K., Qiao, K., Edgar, S. & Stephanopoulos, G. Distributing a metabolic pathway among a microbial consortium enhances production of natural products. Nat. Biotechnol. 33, 377–383 (2015).
Kato, S., Haruta, S., Cui, Z. J., Ishii, M. & Igarashi, Y. Effective cellulose degradation by a mixed-culture system composed of a cellulolytic Clostridium and aerobic non-cellulolytic bacteria. FEMS Microbiol. Ecol. 51, 133–142 (2004).
Wilson, D. S. Complex interactions in metacommunities, with implications for biodiversity and higher levels of selection. Ecology 73, 1984–2000 (1992).
Penn, A. Modelling artificial ecosystem selection: A preliminary investigation. In European Conference on Artificial Life, pp. 659–666 (Springer, 2003).
Penn, A. & Harvey, I. The role of non-genetic change in the heritability, variation, and response to selection of artificially selected ecosystems. In Artificial Life IX: Proceedings of the Ninth International Conference on the Simulation and Synthesis of Artificial Life, vol. 9, 352 (MIT Press, 2004).
Williams, H. T. P. & Lenton, T. M. Artificial selection of simulated microbial ecosystems. Proc. Natl Acad. Sci. USA 104, 8918–8923 (2007).
Xie, L., Yuan, A. E. & Shou, W. Simulations reveal challenges to artificial community selection and possible strategies for success. PLoS Biol. 17, e3000295 (2019).
Doulcier, G., Lambert, A., De Monte, S. & Rainey, P. B. Eco-evolutionary dynamics of nested Darwinian populations and the emergence of community-level heredity. eLife 9, e53433 (2020).
Goodnight, C. J. Experimental studies of community evolution I: the response to selection at the community level. Evolution 44, 1614–1624 (1990).
Goodnight, C. J. Experimental studies of community evolution II: the ecological basis of the response to community selection. Evolution 44, 1625–1636 (1990).
Swenson, W., Wilson, D. S. & Elias, R. Artificial ecosystem selection. Proc. Natl Acad. Sci. USA 97, 9110–9114 (2000).
Swenson, W., Arendt, J. & Wilson, D. Artificial selection of microbial ecosystems for 3-chloroaniline biodegradation. Environ. Microbiol. 2, 564–571 (2000).
Blouin, M., Karimi, B., Mathieu, J. & Lerch, T. Z. Levels and limits in artificial selection of communities. Ecol. Lett. 18, 1040–1048 (2015).
Panke-Buisse, K., Poole, A. C., Goodrich, J. K., Ley, R. E. & Kao-Kniffin, J. Selection on soil microbiomes reveals reproducible impacts on plant function. ISME J. 9, 980 (2015).
Mueller, U. G. et al. Artificial microbiome-selection to engineer microbiomes that confer salt-tolerance to plants. bioRxiv 10.1101/081521 (2016).
Jochum, M. D., McWilliams, K. L., Pierson, E. A. & Jo, Y.-K. Host-mediated microbiome engineering (HMME) of drought tolerance in the wheat rhizosphere. PLoS ONE 14, e0225933 (2019).
Wright, R. J., Gibson, M. I. & Christie-Oleza, J. A. Understanding microbial community dynamics to improve optimal microbiome selection. Microbiome 7, 1–14 (2019).
Raynaud, T., Devers, M., Spor, A. & Blouin, M. Effect of the reproduction method in an artificial selection experiment at the community level. Front. Ecol. Evol. 7, 416 (2019).
Arora, J., Brisbin, M. A. M. & Mikheyev, A. S. Effects of microbial evolution dominate those of experimental host-mediated indirect selection. PeerJ 8, e9350 (2020).
Chang, C.-Y., Osborne, M. L., Bajic, D. & Sanchez, A. Artificially selecting microbial communities using propagule strategies. Evolution 74, 2392–2403 (2020).
Lewontin, R. C. The Units of Selection. Annu. Rev. Ecol. Syst. 1, 1–18 (1970).
Okasha, S. Evolution and the Levels of Selection (Oxford University Press, 2006).
Rice, S. H. The evolution of canalization and the breaking of Von Baer’s Laws: modeling the evolution of development with epistasis. Evolution 52, 647–656 (1998).
Rice, S. H. A general population genetic theory for the evolution of developmental interactions. Proc. Natl Acad. Sci. USA 99, 15518–15523 (2002).
Klitgord, N. & Segrè, D. Environments that induce synthetic microbial ecosystems. PLoS Comput. Biol. 6, e1001002 (2010).
Seth, E. C. & Taga, M. E. Nutrient cross-feeding in the microbial world. Front. Microbiol. 5, 350 (2014).
Goldford, J. E. et al. Emergent simplicity in microbial community assembly. Science 361, 469–474 (2018).
Piccardi, P., Vessman, B. & Mitri, S. Toxicity drives facilitation between 4 bacterial species. Proc. Natl Acad. Sci. USA 116, 15979–15984 (2019).
Green, R. et al. Metabolic excretion associated with nutrient-growth dysregulation promotes the rapid evolution of an overt metabolic defect. PLoS Biol. 18, e3000757 (2020).
Kehe, J. et al. Positive interactions are common among culturable bacteria. bioRxiv 10.1101/2020.06.24.169474 (2020).
Shou, W., Ram, S. & Vilar, J. M. G. Synthetic cooperation in engineered yeast populations. Proc. Natl Acad. Sci. USA 104, 1877–1882 (2007).
Zhang, H., Pereira, B., Li, Z. & Stephanopoulos, G. Engineering Escherichia coli coculture systems for the production of biochemical products. Proc. Natl Acad. Sci. USA 112, 8266–8271 (2015).
Stolyar, S. et al. Metabolic modeling of a mutualistic microbial community. Mol. Syst. Biol. 3, 92 (2007).
Momeni, B., Brileya, K. A., Fields, M. W. & Shou, W. Strong inter-population cooperation leads to partner intermixing in microbial communities. eLife 2, e00230 (2013).
Kelsic, E. D., Zhao, J., Vetsigian, K. & Kishony, R. Counteraction of antibiotic production and degradation stabilizes microbial communities. Nature 521, 516–519 (2015).
Friedman, J., Higgins, L. M. & Gore, J. Community structure follows simple assembly rules in microbial microcosms. Nat. Ecol. Evol. 1, 0109 (2017).
Estrela, S. et al. Metabolic rules of microbial community assembly. bioRxiv 10.1101/2020.03.09.984278 (2020).
Niehaus, L. et al. Microbial coexistence through chemical-mediated interactions. Nat. Commun. 10, 1–12 (2019).
Li, Z. et al. Enhancing anthranilic acid biosynthesis using biosensor-assisted cell selection and in situ product removal. Biochem. Eng. J. 162, 107722 (2020).
Harcombe, W. Novel cooperation experimentally evolved between species. Evolution 64, 2166–2172 (2010).
Momeni, B., Waite, A. J. & Shou, W. Spatial self-organization favors heterotypic cooperation over cheating. eLife 2, e00960 (2013).
Walsh, B. & Lynch, M. Evolution and Selection of Quantitative Traits (Oxford University Press, 2018).
Mankad, T. & Bungay, H. Model for microbial growth with more than one limiting nutrient. J. Biotechnol. 7, 161–166 (1988).
Levy, S. F. et al. Quantitative evolutionary dynamics using high-resolution lineage tracking. Nature 519, 181 (2015).
Wloch, D. M., Szafraniec, K., Borts, R. H. & Korona, R. Direct estimate of the mutation rate and the distribution of fitness effects in the yeast Saccharomyces cerevisiae. Genetics 159, 441–452 (2001).
Zeyl, C. & DeVisser, J. A. G. Estimates of the rate and distribution of fitness effects of spontaneous mutation in Saccharomyces cerevisiae. Genetics 157, 53–61 (2001).
Sanjuán, R., Moya, A. & Elena, S. F. The distribution of fitness effects caused by single-nucleotide substitutions in an RNA virus. Proc. Natl Acad. Sci. USA 101, 8396–8401 (2004).
Sarkisyan, K. S. et al. Local fitness landscape of the green fluorescent protein. Nature 533, 397–401 (2016).
Payen, C. et al. High-throughput identification of adaptive mutations in experimentally evolved yeast populations. PLoS Genet. 12, e1006339 (2016).
Carpenter, J. & Bithell, J. Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Stat. Med. 19, 1141–1164 (2000).
James, G., Witten, D., Hastie, T. & Tibshirani, R. An Introduction to Statistical Learning: with Applications in R (Springer-Verlag, 2013).
Hartl, D. L. Principles of population genetics, 4th edn (Sinauer Associates, 2007).

Acknowledgements

We are grateful for the suggestions from the other Shou lab members: Caroline Cannistra, David Skelding, Sonal, and Alex Yuan. This work is supported by US National Institutes of Health (R01GM124128), US National Science Foundation (#1917258), UK Academy of Medical Sciences Professorship, and UK Royal Society Wolfson Fellowship.

Author information

Authors and Affiliations

Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, United States
Li Xie
Centre for Life’s Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
Wenying Shou

Contributions

L.X. designed the study, performed the simulations, analyzed the data, and wrote the paper. W.S. designed the study, analyzed the data, and wrote the paper.

Corresponding authors

Correspondence to Li Xie or Wenying Shou.

Ethics declarations

Competing interests

The authors declare no competing interest.

Additional information

Peer review information Nature Communications thanks Maria Rebolleda-Gomez and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information File

Peer Review File

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Xie, L., Shou, W. Steering ecological-evolutionary dynamics to improve artificial selection of microbial communities. Nat Commun 12, 6799 (2021). https://doi.org/10.1038/s41467-021-26647-4

Received: 31 March 2021
Accepted: 30 September 2021
Published: 23 November 2021
DOI: https://doi.org/10.1038/s41467-021-26647-4
