Introduction

The explanation of organismal development, starting from simple compounds and progressing to complex multicellular forms, presents a significant challenge for the fields of mathematics and physics1,2,3. In this manuscript, we employ an algorithmic approach to elucidate the emergence of cooperation within cell communities. We propose a mathematical model that demonstrates how the emergence of new forms of cooperation can be effectively explained using hard combinatorial models. These models and our approach arise naturally when we assume that gene regulation is governed by networks resembling neural nets with Boolean inputs, a widely accepted assumption supported by notable works (e.g.,4,5,6,7). Consequently, the optimization of fitness becomes a hard combinatorial problem.

Such problems are of great importance in various applications and have been extensively studied8,9,10. It is widely recognized that many of these problems are inherently difficult, often classified as NP-hard8. Because exhaustive search is impractical in such cases, specialized or approximation algorithms must be employed. Notably, Sewall Wright was the first to recognize the immense challenge of finding the optimal fitness value within a multidimensional hypercube (genotype space)11. Recent publications, such as12, consider epistasis effects. In12, it is observed that the mutation graph defines the proximity between genotypes: a genotype A is close to B if B can be transformed into A through a single mutation. The presence of epistasis and gene pleiotropy (gene incompatibility) renders the search for fitness landscape peaks a hard combinatorial problem5,6,12,13.

Our approach can be outlined as follows. We utilize genotype-fitness maps, also known as fitness landscapes (\(s \rightarrow F(s)\)). These maps play a crucial role in the mathematical description of evolution by an algorithmic approach. However, the structure of these maps remained unknown until the last few decades. Nowadays, real fitness landscapes are being analyzed by constructing genotypes with all possible combinations of small sets of mutations observed in evolution experiments14,15,16,17. We employ models that are consistent with these data.

Furthermore, it is worth noting that the human genome comprises approximately 22,000 protein-coding genes, a number comparable to the genomes of fruit flies and nematodes. Surprisingly, more complex organisms do not require a higher number of genes, despite having a greater number of phenotypic traits and the need to adapt to numerous environmental constraints. These facts challenge the classical ideas of the modern evolutionary synthesis. According to the celebrated Fisher geometric model, the likelihood of improving fitness through random mutations diminishes as the organism’s complexity increases18. Building upon Fisher’s approach, Orr estimated the adaptation rate \(R_e = \frac{d\log F}{dt}\) as a function of the number of environmental constraints M, with F representing the average population fitness19. Orr’s findings indicate that the adaptation rate becomes exponentially small as \(M \gg 1\). We refer to this effect as the 'adaptivity (complexity) barrier': for large values of M, the cost of adaptation increases exponentially. Defining the complexity of a real organism or an adaptation in rigorous mathematical terms is challenging. However, in our formal approach, the adaptation complexity is simply represented by the number M.

Using various genotype-fitness map models, including one from20 that generalizes the Fisher model and classical methods21,22,23, we demonstrate that they all exhibit the same effect: the complexity cost increases exponentially with M. This complexity cost arises due to pleiotropy, which becomes inevitable when \(M \gg N\), where N represents the number of genes. When more traits must be regulated by the same set of genes, pleiotropy (the average number of unrelated traits affected by a single gene) increases. Pleiotropy is a fundamental characteristic of gene regulation24 and one of the primary causes of the complexity cost19. The substantial complexity cost results from the pronounced roughness of the fitness landscape, which exhibits numerous peaks, valleys, and ridges connecting those peaks. Notably, fundamental findings on hard combinatorial problems (see10 for an overview) demonstrate that combinatorial models effectively capture real fitness landscapes, characterized by the presence of multiple peaks, valleys, and connecting ridges14,15,16,17. It is important to note that studies of fitness landscapes in natural populations have revealed the prevalence of low fitness for intermediate phenotypes, indicating the existence of valleys in the fitness landscape14,15,16,17. The models considered in this manuscript are capable of describing such landscapes.

First, we consider homeostasis as a set of coordinated biochemical processes that maintain the parameters of an organism within a given domain (see the “Supplementary Material (SM)” and25 for details). Next, we derive a relation for the probability of the parameters being within that viability domain and find that this probability’s dependence on the genotype is described by the fitness function proposed in20. Maximizing this function leads to challenging combinatorial problems, for which we identify adaptivity barriers. However, we demonstrate that these barriers can be overcome through cooperation. Our combinatorial models demonstrate that the division of labor becomes the only viable evolutionary strategy when an organism must adapt to a multitude of environmental variables. We also investigate the adaptivity barriers associated with the following in silico models:

A: Simplest unicellular organisms (prokaryotes);

B: Cooperation among simple organisms, such as symbiosis of prokaryotes;

C: Bacterial colonies with simple genetic regulation based on a greedy principle;

D: Cell colonies with more sophisticated genetic regulation, representing primitive analogs of multicellular organisms.

An example of extreme cooperation is observed in methane-producing colonies, as discussed in26,27. The process of decomposing organic matter into methane and carbon dioxide is widespread. According to26, the limited availability of energy in methanogenic conversion necessitates efficient cooperation among microorganisms. This mutual dependence, driven by energy constraints, reaches a level where neither partner can function independently. However, when working in unison, they sustain the metabolic activity necessary for survival.

The primary consequence of these mathematical findings is that the transition to cooperation becomes essential for survival when an environment rapidly becomes more challenging. In such cases, cooperative organisms have a higher chance of survival. This assertion can be exemplified with a simple example. Consider a colony of non-interacting cells with 500 genes, which must fulfill 500 ecological constraints, and assume the adaptivity barrier \(\alpha =M/N\) is set at \(\alpha =1\). Now, suppose an ecological catastrophe occurs, resulting in the emergence of 100 new additional constraints (forming a constraint set disjoint from the previous constraints). In order to adapt, evolution would need to generate approximately 100 new genes, which is highly improbable within a short evolutionary timeframe. However, if we consider a group of cooperating cells, where each cell can handle 100 constraints, only six distinct cell types would be sufficient to adapt to all the constraints, including both the new and the pre-existing ones. Note that in many cases \(\alpha \) determines the probability of adaptation; this parameter is of order 1 when \(M, N \gg 1\). Basic results on the existence of phase transitions in hard combinatorial problems (see10, Ch. 14) show that there are even more remarkable effects in cooperation. Suppose, for example, that to be adapted, an organism must satisfy, say, 1000 constraints with 950 genes, and the adaptivity barrier is \(\alpha = 1\). Then the chances of survival are vanishingly small because \(950 < 1000\), and these chances decrease exponentially with \(M - \alpha N\). However, if cooperation reduces the number of constraints even slightly, for example to 930, the chances of survival become close to 1. Thus, even weak cooperation may be capable of saving a population in harsh conditions. Certainly, if the organism fails to find suitable neighbors, it may die.
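The arithmetic of this example, and the role of the excess \(M - \alpha N\), can be illustrated by the following Python sketch (a toy calculation; the decay constant used for the survival chance is a placeholder, not a value derived in this manuscript):

```python
import math

# Toy illustration of the cooperation example above (not part of the model itself).
# A colony must satisfy M_total constraints; each specialized cell type is assumed
# to handle per_cell_capacity of them.
N = 500                                  # genes per cell
alpha = 1.0                              # adaptivity barrier alpha = M / N
M_old, M_new = 500, 100                  # pre-existing and new constraints
M_total = M_old + M_new

per_cell_capacity = 100
cell_types_needed = math.ceil(M_total / per_cell_capacity)
print("cell types needed:", cell_types_needed)        # -> 6

# Near the phase transition the survival chance is assumed (illustratively) to decay
# exponentially in the constraint excess M - alpha * N; c is a placeholder constant.
c = 0.5
for M, N_genes in [(1000, 950), (930, 950)]:
    excess = max(0.0, M - alpha * N_genes)
    print(M, "constraints,", N_genes, "genes -> survival chance ~", round(math.exp(-c * excess), 6))
```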

In conclusion, it is worth noting that there are other approaches that explain the origins of multicellularity and cooperation through various factors, including defense against infectious agents and interactions between hosts and parasites28,29; see also30 for a review of different hypotheses and their biological testing.

Results

We investigate the homeostasis problem for a system of chemical kinetics that can describe metabolic regulation in bacteria (see Sect. Methods). We aim to estimate the probability of viability, denoted as \(P_{v, T}\), which represents the likelihood that the system parameters remain within a homeostasis domain over a large time period [0, T]. To estimate \(P_{v, T}\), we employ a novel concept of replicative stability through genetic regulation, as proposed in25. Briefly, the concept is as follows. In31, it is hypothesized that all cells with fixed genomes and genetic regulations will eventually perish due to external and internal fluctuations. However, if these cells replicate and modify their genome, then the resulting chain of replicating cells can potentially survive indefinitely with a non-zero probability. This hypothesis was mathematically analyzed in25, and we have incorporated these results into our study.

By considering all possible situations of stress impact on a bacterium or a cell, we obtain an integral representation for \(P_{v, T}\).

This representation exhibits intriguing properties. The chances of viability depend critically on the efficiency of the gene regulation network (GRN), the rate of innovation within this GRN, and the size of the GRN. The size of a GRN is the number \(N_{reg}\) of nodes it contains. Given that the connectivity of real networks is bounded, the number of network interconnections is of order \(O(N_{reg})\). The efficiency of a GRN is characterized by models described in the SM and in25. We hypothesize that the GRN counteracts stress-induced perturbations through appropriate gene expression, with network efficiency quantitatively measuring such compensation (details can be found in the SM).

The innovation rate refers to the probability that the antistress GRN network will enhance its efficiency during the replication stage. This rate determines the GRN size increase and is substantially lower than the mutation rate because an innovation may result from several mutations.

It is demonstrated that the viability probability, regarded as the fitness function F(s), relies on the genotype \(s = (s_1, s_2, ..., s_N)\), where \(s_i \in \{0,1\}\). Here the value \(s_i=1\) indicates the expression of a gene.

For sufficiently large \(T\), the viability probability can be approximated by the following relations:

$$\begin{aligned} \log P_{v,T} \approx -\sum _{t=1}^{N_{rep}} P_{out} (s(t), {\textbf{w}}(t)) \end{aligned}$$
(2.1)

where \(s(t)\) represents the genotype at the \(t\)-th replication step, \(N_{rep}\) is the number of replication steps within the time interval \([0, T]\), and \(P_{out} (s, {\textbf{w}})\) is the probability of exiting the homeostasis domain between subsequent replication steps (which is assumed to be small). The variable \({\textbf{w}}\) refers to gene network parameters, determining various coefficients within the gene regulation network (GRN), such as interaction matrix entries, kinetic rates, thresholds, etc. One has

$$\begin{aligned} P_{out}(s,{\textbf{w}})=1- F(s,{\textbf{w}}), \end{aligned}$$

where \(F\) can be interpreted as a fitness function:

$$\begin{aligned} F(s,{\textbf{w}})= \sum _{k=1}^M b_k f_k(s, {\textbf{w}}). \end{aligned}$$
(2.2)

Here, \(f_k\) represent Boolean functions, and \(b_k\) are positive weights. In this context, \(M\) represents the number of environmental constraints, corresponding to different stress scenarios, and \(f_k\) is the probability of surviving between replications under the \(k\)-th stress scenario with genotype \(s\) and GRN parameters \({\textbf{w}}\). In simpler cases, such as a chain of reactions of the Michaelis–Menten type, this fitness reduces to the gene-trait maps recently investigated in20,32,33. In classical theory (18,19,34), the fitness \(F\) is fixed, and only the genotype evolves. Our new idea is as follows: we propose that, with replication, the gene regulation network (GRN) expands and enhances stress resistance (see a conceptual scheme in Fig. 1). Determining precise estimates of this resistance is a formidable mathematical problem (see the SM). Nevertheless, we can use estimates from Deep Learning theory, assuming that the efficiency of the GRN depends mainly on its size. Evolution in \(t\) involves not only modification of the genotype but also a step-by-step increase in GRN size, which alters the fitness. This extension of the classical approach explains why evolution is successful and why the GRN expands. Indeed, for \(f_k\) independent of \(t\), we have \(P_{v, T} \rightarrow 0\) as \(T \rightarrow \infty \), but in our model evolution can continue indefinitely (see the SM). More precisely, for large \( T \) we obtain \( P_{v, T} \approx 1 - \exp (- c_0 N_{reg}^{\gamma }) \); the limit \( P_{v, +\infty } \) represents the probability that the system state remains within a small homeostasis domain over the time interval \( (0, +\infty ) \). Here, \( N_{reg} \) denotes the size of the gene network (i.e., the number of units within the network), \( \gamma > 0 \) is a small constant influenced by the number of fluctuating parameters of random stress, and \( c_0 \) is a positive constant depending on details of the system’s chemical kinetics. GRNs compensate for stress impacts (see also25); we show that this compensation leads to an exponential increase in the viability probability. The larger the network, the better the compensation, as follows from the above estimate for \( P_{v, +\infty } \).
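The bookkeeping behind Eq. (2.1) and the role of a growing GRN can be illustrated with a short numerical sketch; the per-step exit probabilities, the constants \(c_0\) and \(\gamma \), and the linear growth of \(N_{reg}\) below are illustrative placeholders rather than quantities derived from our model:

```python
import math

# Sketch of eq. (2.1): log P_{v,T} ~ -sum_t P_out(s(t), w(t)), with P_out = 1 - F.
N_rep = 1000                              # replication steps within [0, T]
c0, gamma = 0.5, 0.5                      # illustrative constants in P_out ~ exp(-c0 * N_reg^gamma)

def exit_probability(t, growing=True):
    # Assume (illustratively) that the GRN grows by one unit per replication step,
    # so stress compensation improves and the exit probability shrinks.
    n_reg = 50 + t if growing else 50
    return math.exp(-c0 * n_reg ** gamma)

log_P_growing = -sum(exit_probability(t, growing=True) for t in range(1, N_rep + 1))
log_P_fixed = -sum(exit_probability(t, growing=False) for t in range(1, N_rep + 1))

# With a fixed GRN the sum grows linearly in N_rep, so P_{v,T} -> 0;
# with a growing GRN the sum stays bounded and P_{v,T} remains positive.
print("growing GRN: log P_{v,T} =", round(log_P_growing, 3))
print("fixed GRN:   log P_{v,T} =", round(log_P_fixed, 3))
```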

One can show that larger GRNs are better equipped to cope with complex combinations of stresses. Regulator genes act as switching devices, facilitating these switches. Stress response networks exhibit a modular structure, where different modules generate responses to various stresses35. Then, the regulators switch between these modules. In our paper, however, we do not delve into these questions regarding the structure of networks. We simply utilize known estimates of the accuracy of approximation of target functions via Deep Networks36,37,38. These estimates describe the dependence of approximation accuracy on the network size \( N_{reg} \), but it is challenging to infer details about the network structure, except for the fact that the network is deep and the vertices have a small degree.

Furthermore, we demonstrate the versatility of this function as it is capable of describing not only a free-living bacterium but also various cooperation phenomena. Our findings can be summarized as follows: a new form of cooperation emerges when the computational cost of cooperation is lower than that of direct adaptation without cooperation. To illustrate this concept, we explore the evolutionary adaptivity barriers in the in silico models labeled A–D.

Figure 1

In contrast to physical systems, the stability of biosystems relies on replicative and regulatory mechanisms driven by intricate genetic regulatory networks (GRNs). In the figure, an organism responds to stress using its GRN. There is a ’stress’ arrow directed towards the cell, countered by an arrow from the GRN, which represents the cell’s resistance to the stress. As replication occurs, the GRN gradually expands, resulting in enhanced stress resistance. Leveraging insights from Deep Network theory, we derive an approximate formula to estimate the survival probability of a replicating chain of organisms based on the stress properties, network size, and mutation rate.

We initially examine the evolutionary adaptivity barriers for various models without cooperation (case A). This entails investigating whether well-adapted phenotypes exist for individual organisms that do not engage in cooperation. In this case, the fitness maximum is achieved when all \(f_i=1\). Under certain assumptions, such as the randomness of genetic regulation (Assumption RG in the SM), we derive the following results.

The adaptivity barrier can be expressed via the quantity \(P_{+, \delta }\), the maximal probability, over constraints i and gene expression strings s, that \(f_i(s)>1-\delta \):

$$\begin{aligned} P_{+, \delta } =\max _{i \in [1, M], \ s \in S^N} \Pr [ f_i (s, \textbf{w}) > 1-\delta ], \end{aligned}$$
(2.3)

where \(\delta \in [0,1)\) is a small parameter, \([1,M]=\{1,2,..., M\}\), and \(S^N=\{0, 1\}^N\) is the set of all genotypes.

If \(P_{+, \delta }\) is not small, there exists a genotype s satisfying the conditions \(f_i(s)>1-\delta \) for all \(i=1,..., M\), i.e., a genotype sufficiently well adapted with respect to all M constraints, provided M obeys the following upper bound:

$$\begin{aligned} M < C_{\textrm{max}} N, \quad C_{\textrm{max}} = \frac{\ln 2}{|\ln P_{+,\delta }|}, \end{aligned}$$
(2.4)

where N is the number of genes, and the constant \(C_{\textrm{max}}\) is uniform in M and N as both \(M, N \gg 1\). If condition (2.4) is not satisfied, a well-adapted phenotype becomes highly improbable, with its probability approaching zero exponentially (as detailed in Proposition I from the SM). Therefore, inequality (2.4) acts as a constraint on the maximum adaptivity achievable by an organism without cellular cooperation. The specific values of \(C_{\textrm{max}}\) for various models are discussed in depth in the SM. We focus on two basic models here: the K-SAT and single-layered perceptrons (SLPs), as used in20. The K-satisfiability problem (K-SAT), a fundamental and challenging combinatorial problem10, has a clear biological interpretation. Biologically, it represents M constraints, each dependent on K Boolean genes, which must be satisfied by the correct expression of the K genes involved. The number of satisfied constraints can be viewed as the fitness of a Boolean string of length \(N \gg K\). For large K, one finds that \(C_{\textrm{max}} \approx 2^K \ln 2\). However, if the number of genes involved in each trait is randomly distributed according to a Poisson distribution with a mean of K, then \(C_{\textrm{max}} \approx \exp (K/2) \ln 2\), indicating that randomization reduces the barrier. It is also important to note that gene redundancy, as defined by the parameter K, significantly increases the adaptivity barrier and, consequently, adaptability.
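For concreteness, the following sketch generates a random K-SAT instance viewed as a genotype-fitness map and evaluates the first-moment estimate of \(C_{\textrm{max}}\); it assumes (our reading of Eq. (2.4) for this model) that for a single random K-clause \(P_{+,\delta } = 1 - 2^{-K}\), which indeed gives \(C_{\textrm{max}} \approx 2^K \ln 2\) for large K:

```python
import math
import random

random.seed(1)

# A random K-SAT instance as a toy genotype-fitness map: M constraints over N Boolean
# genes; the fitness of a genotype is the fraction of satisfied constraints.
def random_ksat(N, M, K):
    instance = []
    for _ in range(M):
        genes = random.sample(range(N), K)                   # K distinct genes per constraint
        wanted = [random.choice([0, 1]) for _ in range(K)]   # required expression values
        instance.append(list(zip(genes, wanted)))
    return instance

def fitness(s, instance):
    satisfied = sum(any(s[i] == want for i, want in clause) for clause in instance)
    return satisfied / len(instance)

N, K, alpha = 60, 3, 3.0                   # alpha = M / N, the constraint-to-gene ratio
instance = random_ksat(N, int(alpha * N), K)
s = [random.randint(0, 1) for _ in range(N)]
print("fitness of a random genotype:", round(fitness(s, instance), 3))

# First-moment estimate: C_max = ln 2 / |ln(1 - 2^{-K})| ~ 2^K ln 2 for large K.
for k in (3, 5, 8):
    c_max = math.log(2) / abs(math.log(1.0 - 2.0 ** (-k)))
    print("K =", k, " C_max =", round(c_max, 2), " (2^K ln 2 =", round(2 ** k * math.log(2), 2), ")")
```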

The SLP model is defined by

$$\begin{aligned} f_j =\sigma \left( \sum _{i=1}^{N} w_{ji}{s_i} - h_j\right) , \end{aligned}$$
(2.5)

where \(j = 1, \ldots , M\). Here \(\sigma (z)\) is a sigmoidal function of z such that \(\sigma \) is monotone increasing, \(\sigma (z) \rightarrow 1\) as \(z \rightarrow +\infty \), and \(\sigma (z) \rightarrow 0\) as \(z \rightarrow -\infty \). The coefficients \(w_{ji}\) can have different signs. If \(w_{ji}>0\), the i-th gene is an activator for the j-th trait; if \(w_{ji}<0\), it is a repressor; and for \(w_{ji}=0\) that gene does not affect the j-th trait. This model can be viewed as a circuit with the ability to regulate phenotype robustness through special parameters \(h_j\), which are thresholds first introduced in39. The model described in equation (2.5) has been considered in40,41,42.

Suppose that all \(w_{ij}\) are independent, normally distributed, random quantities with zero mean and variance \(r^2\). These assumptions, which are standard for such models43, imply that we are considering a random gene regulation. Then

$$\begin{aligned} C_{\textrm{max}} =\ln 2 \left| \ln \left( (2\pi )^{-1/2} \int _{{\bar{h}}}^{\infty } \exp ( - u^2 /2)\, du\right) \right| ^{-1}, \end{aligned}$$
(2.6)

where the quantity \( {\bar{h}}= h (\sqrt{N } r)^{-1}\) can be interpreted as a normalized threshold; it defines the sensitivity of the traits \(f_j\) with respect to mutations.
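Equation (2.6) is straightforward to evaluate numerically, since the Gaussian tail \((2\pi )^{-1/2}\int _{{\bar{h}}}^{\infty } e^{-u^2/2}\,du\) equals \(\tfrac{1}{2}\,\textrm{erfc}({\bar{h}}/\sqrt{2})\); a minimal sketch (with illustrative threshold values only) is:

```python
import math

# C_max for the SLP model, eq. (2.6): the Gaussian tail probability is
# (2*pi)^(-1/2) * int_{hbar}^{inf} exp(-u^2/2) du = 0.5 * erfc(hbar / sqrt(2)).
def c_max_slp(h_bar):
    p_plus = 0.5 * math.erfc(h_bar / math.sqrt(2.0))   # P_{+,delta} for a single trait
    return math.log(2.0) / abs(math.log(p_plus))

# hbar = h / (sqrt(N) * r): larger normalized thresholds make each trait harder to
# satisfy, lowering C_max and hence the admissible number of constraints M < C_max * N.
for h_bar in (0.0, 0.5, 1.0, 2.0):
    print("hbar =", h_bar, " C_max =", round(c_max_slp(h_bar), 3))
```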

It is noteworthy that in multilayered models, the coefficient \(C_{\textrm{max}}\) may substantially exceed that of single-layer models. This suggests that cascade regulation can alleviate the complexity cost, as detailed in the SM.

In case B, we examine a symbiotic relationship involving n organisms using the Set Covering Model (see the SM for more details). We define \(N_{reg}\) as the number of genes in the set \(S^{reg}\) required to facilitate cooperation or symbiosis. For instance, in the context of symbiosis transitioning from prokaryotes to eukaryotes, these genes \(S^{reg}\) may be responsible for the adhesion of different organelles, such as mitochondria. The transition from case A to case B becomes computationally advantageous when the following condition is met: \( N_{reg} < C_{\textrm{max}}^{-1} \frac{(n-1)M}{n}. \) It is reasonable to expect that this inequality holds true in challenging environments characterized by a large number of potential ecological constraints (\(M \gg 1\)).

Consider the Jacob–Monod regulation44. In the Jacob–Monod model, the Escherichia coli lac operon encodes proteins necessary for the transport and breakdown of the sugar lactose (lac). The production of proteins is prevented when a repressor, encoded by a regulatory gene, binds to its operator, a specific site in the DNA sequence that is close to the genes encoding the proteins. If each organism possesses a sensor to detect which products are already produced by other organisms, it eliminates the need to produce those products. Maximizing fitness, therefore, involves producing the maximum number of products that are not already synthesized by other organisms. Furthermore, it is important to note that this regulation minimizes energy consumption, as discussed in45. The Jacob–Monod regulation is limited to systems whose functioning does not rely on feedback loops composed of the gene regulatory system and external input. Mathematically, this regulation can be interpreted as a greedy algorithm, as outlined in the SM. The greedy approach implies that each organism aims to satisfy constraints that remain unsatisfied by others, thus maximizing its individual fitness (Fig. 2). Let us estimate \(N_{reg}\) for this greedy algorithm.

Figure 2

The plot illustrates the resource usage by genetic and greedy algorithms. Mathematically, resources are represented by the number of subsets, denoted as \(N_{sub}\), required to cover a given set. The comparison is made between the greedy algorithm and various simulations of a genetic algorithm, with each simulation defined by two parameters (e.g., (1000, 50) indicating 1000 steps with populations of 50 members). The plot demonstrates that the most sophisticated genetic algorithm achieves the task of covering the set using the least value of \(N_{sub}\), indicating efficient resource utilization.

Note that, in order to perform this regulation, the i-th cell must have sensors capable of recognizing \(M \theta _i\) different reagents, where \(\theta _i\) is on the order of 1/n. Therefore, it can be expected that in this case, \(N_{reg} \) is on the order of M/n. Consequently, for the minimal required gene number N, we obtain \( N > C_{reg} \frac{M}{n}=N_{reg}, \) where \(C_{reg}\) is a constant greater than 1, while for individual cells we have \(N = M \gg N_{reg}\). So, the transition from B to C is computationally profitable if the parameter n satisfies \( n> C_{reg}\). This is attributable to the fact that a single regulatory gene can coordinate the expression of multiple proteins.
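The greedy interpretation of the Jacob–Monod regulation described above can be sketched as a standard greedy covering procedure; the instance data below are illustrative, not drawn from the manuscript:

```python
# Greedy set cover as a cartoon of Jacob-Monod-style regulation: at each step the colony
# activates the genotype that satisfies the largest number of still-unsatisfied constraints.
def greedy_cover(universe, subsets):
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(range(len(subsets)), key=lambda k: len(subsets[k] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("remaining constraints cannot be covered")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen

universe = set(range(12))                  # M = 12 ecological constraints
subsets = [                                # constraints each activatable genotype satisfies
    {0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11},
    {0, 4, 8}, {1, 5, 9, 11}, {2, 3, 6, 7, 10},
]
print("genotypes activated by the greedy regulation:", greedy_cover(universe, subsets))
```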

Furthermore, for the cases A, B, and C, as well as the fitness functions examined in20, the search for a gene string s that maximizes the fitness can be reduced to an NP-hard Boolean linear programming problem. However, in the transition from case C to case D, as demonstrated in the SM, our model indicates that in multicellular organisms, the search for a response to stress leads to a standard real-valued linear programming problem. In algorithm theory, it is well-known that transitioning from discrete to real variables significantly simplifies the problem (a process known as relaxation). In our case, these real variables represent the concentrations of cells of a given type. Unlike the NP-hard problem, this real-valued linear programming problem can be feasibly solved within a reasonable timeframe. Such a property potentially gives multicellular organisms a decisive advantage, as their genetic networks can find a response to stress much faster and have higher chances of success.
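The computational advantage of the relaxation can be made concrete with a small sketch: the Boolean covering problem of Eqs. (4.5)-(4.7) becomes an ordinary linear program once the variables \(u_l\) are allowed to take real values, interpreted as concentrations of cell types. The sketch below assumes SciPy is available and uses an illustrative toy instance:

```python
import numpy as np
from scipy.optimize import linprog

# Covering problem of eqs. (4.5)-(4.7): minimize sum_l r_l * u_l subject to A u >= h.
# With u_l in {0, 1} this is NP-hard Boolean programming; relaxing to real u_l >= 0
# (concentrations of cell types) gives an ordinary LP, solvable in polynomial time.
A = np.array([[1, 0, 1, 0],
              [0, 1, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 1]], dtype=float)   # a_il: amount of product i made by genotype l
h = np.array([1, 1, 1, 1], dtype=float)     # minimal required amounts h_i
r = np.array([1.0, 1.0, 1.2, 0.8])          # resource cost r_l of activating genotype l

# linprog minimizes c @ u subject to A_ub @ u <= b_ub, so A u >= h becomes -A u <= -h.
res = linprog(c=r, A_ub=-A, b_ub=-h, bounds=[(0, None)] * A.shape[1], method="highs")
print("relaxed optimum u:", res.x.round(3), " total resource cost:", round(res.fun, 3))
```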

Transitions from simple symbiotic systems to cell colonies can occur in response to a decrease in available resources. Consequently, colonies of cells with the Jacob–Monod regulation demonstrate greater adaptability to the environment compared to ensembles of non-interacting cells. This increased adaptability arises from the fact that cell colonies consume fewer resources (see Fig. 2).

The scarcity of resources available for consumption may also provide an explanation for the transition from cell colonies to more complex multicellular organisms with gene regulation. We exploit the interpretation of simple gene regulations as greedy algorithms, which, in general, are not optimal and cannot achieve the optimal value of the target function. In our case, the target function is the number of consumed resources, as indicated by relation (4.8). If the critical constant \(C_{res}\), representing the maximum level of available resources, falls below the value attainable by the greedy algorithm, the simple cell colony faces extinction. Therefore, the transition from simple cell colonies to multicellular organisms, where more sophisticated gene regulation algorithms are at play, can be attributed to the need to overcome starvation and enhance resource utilization. The transition to higher forms through cooperative evolution does not necessarily result in the complete extinction of lower forms. In fact, it is known that the current distribution of life’s kingdoms is as follows: plants dominate with approximately 450 gigatons of carbon (Gt C), while animals account for about 2 Gt C, bacteria for approximately 70 Gt C, and archaea for around 7 Gt C46. Microorganisms forming simple colonies can benefit from symbiosis with more complex forms. For example, mammalian intestines provide relatively stable environments for bacteria47.

Discussion

The emergence and evolution of complexity are fundamental challenges in biology1,2,48. It is intriguing to consider why the evolution of life did not cease at the stage of the simplest autotrophic prokaryotes but produced “organisms with huge, regulated genomes, multiple tissue types, and even ability to develop theories of evolution?” (see48, Ch. 8).

This manuscript demonstrates that classical models from algorithm theory and mathematical methods from statistical mechanics, originally developed for neural networks, spin glasses, and Boolean satisfiability problems, can provide insights into the emergence of new forms of cooperation in life. Inspired by5,6, we view evolution as a hard combinatorial problem and draw upon the foundational principles of algorithm theory10,21. The key concept, as highlighted in5,6, is that evolution is achievable if the time required for adaptation is polynomial in the number of genes, but it becomes infeasible if this time is exponentially large. Notably, the existence of a fundamental adaptivity barrier was first identified by R. Fisher in the context of the Fisher geometric model18, and subsequent studies by Orr19,34 further explored this concept. In this manuscript, we also demonstrate that such an adaptivity barrier is present in practically all Boolean fitness models, which encompass gene-trait maps and generalize the Fisher model with mutations that have been extensively studied49.

A second important idea, proposed in31, suggests that all metabolic systems with fixed parameters are inherently unstable under fluctuations and will eventually be destroyed. In order to maintain stability and avoid destruction, evolution needs the process of replication, which involves the invention of new genes and modifications to existing ones. This concept was explored further in50, leading to fundamental implications. The introduction of new genes requires cells to develop more complex metabolic systems, thereby giving rise to a generator of complexity. Consequently, the evolution of individual organisms inevitably encounters a fundamental adaptivity barrier. This barrier hinders gradual evolution, prompting the emergence of new forms of cooperation when the computational (and thus thermodynamic) cost of gradual evolution becomes too high to overcome the adaptivity barrier.

Complex traits can evolve adaptively or non-adaptively. The debate over the nature of evolution has been ongoing for many years, starting with Kimura’s seminal work51 and continuing with others like48. Genomic data, coupled with the relatively small effective population sizes of large multicellular organisms, support the opinion that the evolution of these organisms might have been non-adaptive, aligning with the theory of constructive neutral evolution. Based on our models, drawing a definitive conclusion remains challenging. However, our estimates suggest that the emergence of multicellular organisms was quite plausible.

Undoubtedly, constructing an efficient genetic network is a complex process. Nevertheless, our estimates indicate that even a slight increase in network size can exponentially enhance the probability of maintaining homeostasis, provided the initial GRN was sufficiently large. The likelihood of a series of mutations leading to a more efficient network is small, but not exponentially so, in terms of the network size parameter \(N_{reg}\). Our calculations reveal that even if almost all mutations were non-adaptive, except for the final terminal mutation, the process of successive replications leading to an organism with complex genetic regulation is not implausibly rare.

Using the Boolean models, we can demonstrate that certain forms of cooperation may arise as a result of rapid environmental changes. With bounded mutation rates and a limited supply of genes, individual organisms cannot adapt to new and challenging environmental constraints. In such cases, cooperation between organisms becomes the only viable option for survival.

Furthermore, some forms of cooperation emerge due to resource limitations. For example, under inappropriate environmental conditions but with abundant nutrients, slime mold exists as a colony of single cells. However, when faced with harsher conditions such as starvation, the cells undergo migration, aggregation, and differentiation into spores. This transformation from colonies to a primitive multicellular organism aligns with our model for the transition from case C to case D.

The algorithmic approach highlights that the biosphere is not capable of adapting to rapid environmental changes within short periods. As a result, these rapid changes can lead to the emergence of new cooperative forms or mass extinctions. A notable example of this is the Cambrian explosion, a period associated with global warming caused by the greenhouse effect52. During this period, there was a substantial increase in biodiversity and the emergence of multicellular organisms. The exact reasons for these developments are still subject to debate, but one leading theory suggests that the increase in atmospheric oxygen levels played a crucial role. The higher oxygen levels allowed cells to adhere together and form complex body structures, and they also facilitated metabolic processes involved in the production of collagen, a protein critical for the formation of hard body structures.

On the contrary, there are theoretical arguments and comparative genomics data that support the second scenario, known as the neutral or non-adaptive scenario53. This perspective, discussed in detail in48 (Chapters 8-9), suggests that random evolutionary processes and genetic drift play an important role in the emergence of large, complex genomes, rather than solely adaptive selection.

Our theoretical arguments align, to some extent, with this non-adaptive theory. The process of random evolution leading to the emergence of large, effective regulation networks of size \(N_{reg}\) may require a number of steps, most of which do not contribute to an immediate increase in fitness. However, once this network is established, it provides a substantial fitness advantage (exponential in \(N_{reg}^{\gamma }\)) that facilitates the fixation of complex regulated genomes, even in small populations.

Methods

Our approach is built upon several established components: (i) The concept of replicative stability in systems supporting homeostasis, as proposed by Gromov and Carbone31; (ii) The fitness model developed by Reinitz et al.20,41; (iii) The theory of large deviations and stochastic transitions in random dynamical systems54; (iv) The theory of phase transitions in combinatorial problems21; (v) The novel idea employed in this study: stability via gene regulation, as described in25 and the SM. In this context, we use universal approximations provided by deep networks36,37,38,55.

The application of these methods can be described as follows. We use (i), (ii), (v) to estimate the stochastic stability of evolving biochemical systems under random perturbations. Note that the fundamental results of large deviation theory, as outlined in point (iii), are applied to estimate the probability that the organism will remain within the homeostasis zone and survive.

This analysis leads us to the development of combinatorial fitness models, which have the ability to describe both individual organisms and various cooperative communities. Furthermore, with the help of (iv), we estimate the adaptivity barriers associated with different situations. In the subsequent subsections, we will introduce the specific models that are used.

Metabolic regulation and homeostasis

Here, we briefly describe a GRN model and homeostasis. This model is also described in more detail in25 and the SM.

Let \(v_1,..., v_n\) represent the concentrations of chemical reagents involved in the metabolism of an organism. Biochemical kinetics can be described by the following system of differential equations:

$$\begin{aligned} \frac{dv_i}{dt}= { g_i}( v, \xi , s), \quad t \ge 0, \end{aligned}$$
(4.1)

where \(v=(v_1, \ldots , v_n) \in D\), with D a compact domain in the non-negative cone \({{\mathbb {R}}}_{>}^n=\{v \in {{\mathbb {R}}}^n: \ v_i \ge 0 \}\) with a smooth boundary \({\partial D}\), \(v_i(t)\) are the concentrations of reagents, and the reaction terms \(g_i\) are smooth functions of v (for example, polynomials; see the SM) that depend on \(\xi \), a random stress parameter, and on the gene expression string s. Reaction rates \( g_i \) depend on the state of the random environment, characterized by the coordinate \( \xi \) in \( {\mathcal {E}} \subset {\mathbb {R}}^d \). Here, \( {\mathcal {E}} \) is the set of all possible environments. For instance, \( \xi \) could be the concentrations of nutrients necessary for the organism’s survival, the temperature, or parameters describing a complex temperature regime, including variations during different times of the day.

To incorporate genes into our models, we use Boolean strings \(s = (s_1, ..., s_{N_g})\), where \(N_g\) is the number of genes. Each gene \(s_i\) can take Boolean values \(s_i \in \{0, 1 \}\).

Homeostasis. We assume that, to be viable, an organism should produce sufficiently large amounts of the output concentrations \(v_i\): \( v_i(\xi ,s) > h_i, i \in I, \) where \(I \subset \{1, ..., N_{r} \}\) is a subset of indices. These conditions define the homeostasis (viability) domain, where organisms survive (see the SM). The probability \(P_{v}(s)\) that the organism remains viable is

$$\begin{aligned} P_{v}(s) =Pr [ v_i(\xi ,s) > h_i \quad \forall i \in I ]. \end{aligned}$$
(4.2)

Let the random quantity \(\xi \) be distributed according to the probability measure \(d\rho (\xi )\). Then

$$\begin{aligned} P_{v}(s) =\int _{V(h, N_r,s)} d\rho (\xi ), \end{aligned}$$
(4.3)

where the set V is defined by \( V(h, N_r, s)= \{\xi \in {\mathcal E} : \quad v_i(\xi ,s) > h_i \quad \forall i \in I \}. \) The quantity \(P_v(s)\) can be considered as the fitness of an individual. Computing this fitness for real metabolic networks is a challenging task. However, in many cases, it can be shown that maximizing the probability \(P_v(s)\) reduces to solving different Boolean combinatorial problems, which are described in the following subsections (refer to the SM for more details).
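A direct way to evaluate the integral (4.3) is Monte Carlo sampling of the stress parameter \(\xi \); the sketch below does this for a deliberately simple stand-in map \(\xi \rightarrow v(\xi ,s)\) (not the metabolic model of this manuscript) and an exponentially distributed stress:

```python
import random

random.seed(2)

# Monte Carlo estimate of P_v(s) = Pr[v_i(xi, s) > h_i for all i in I], eqs. (4.2)-(4.3).
def outputs(xi, s):
    # Toy stand-in for the kinetics: each output grows with the number of expressed
    # genes and is suppressed by the stress coordinate xi.
    n_expressed = sum(s)
    return [n_expressed / (1.0 + xi), 0.5 * n_expressed / (1.0 + 2.0 * xi)]

def viability_probability(s, thresholds, n_samples=100_000):
    hits = 0
    for _ in range(n_samples):
        xi = random.expovariate(1.0)       # rho: exponentially distributed stress
        if all(v > h for v, h in zip(outputs(xi, s), thresholds)):
            hits += 1
    return hits / n_samples

s = [1, 1, 0, 1, 1]                         # a genotype with four expressed genes
print("P_v(s) =", viability_probability(s, thresholds=[2.0, 1.0]))
```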


GRN. One can show (see the SM), through the Fourier analysis of pseudo-Boolean functions, that the kinetic rates \( g_i \) can be split into two parts: \( g_i^{pert} \), interpreted as a perturbation induced by the stress \( \xi \), and \( g_i^{reg} \), which depends on gene expression. The second part can be considered as a GRN. We can ensure the stability of the system under perturbations if the perturbation and the response to it cancel (or almost cancel) each other.

As an example, let us consider mechanisms that allow bacteria to resist antibiotics. Bacteria have different methods to withstand the effects of an antibiotic. For instance, some bacterial enzymes can inactivate antibiotics. An example can be given by \(\beta \)-lactamase, which destroys the active component of penicillins. Bacteria can also produce enzymes capable of adding different chemical groups to antibiotics, preventing the binding between the antibiotic and its target in the cell. This resistance mechanism can be described by our model.

Simplest fitness model

This model (as described in20,32) can be obtained by considering a chain of reactions following the Michaelis–Menten law, where each stage is controlled by a gene. In this context, we can examine the gene-trait maps defined by

$$\begin{aligned} f_j =\sigma \left( \sum _{i=1}^{N} w_{ji}{s_i} - h_j\right) , \end{aligned}$$
(4.4)

where \(j = 1, \ldots , M\). Here \(\sigma (z)\) is a sigmoidal function of z such that \(\sigma \) is monotone increasing, \(\sigma (z) \rightarrow 1\) as \(z \rightarrow +\infty \), and \(\sigma (z) \rightarrow 0\) as \(z \rightarrow -\infty \). The coefficients \(w_{ji}\) can have different signs. If \(w_{ji}>0\), the i-th gene is an activator for the j-th trait; if \(w_{ji}<0\), it is a repressor; and for \(w_{ji}=0\) that gene does not affect the j-th trait. This model can be viewed as a circuit that allows for the regulation of phenotype robustness through specific parameters \(h_j\), which are thresholds initially introduced in39. This model coincides with (2.5) and has also been explored in40,41,42.

Suppose that all \(w_{ij}\) are independent and normally distributed random variables with zero mean and variance \(r^2\). These assumptions, which are standard for such models43, imply that we are considering random gene regulation.
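A minimal numerical sketch of this random gene-trait map, with the fitness (2.2) taken with unit weights \(b_k\) and an illustrative choice of thresholds, is as follows:

```python
import numpy as np

rng = np.random.default_rng(3)

# Gene-trait map of eq. (4.4): f_j = sigma(sum_i w_ji s_i - h_j) with random regulation.
N, M = 40, 25                               # genes and traits (environmental constraints)
r = 1.0                                     # standard deviation of the coefficients w_ji
w = rng.normal(0.0, r, size=(M, N))         # w_ji > 0: activator, w_ji < 0: repressor
h = np.full(M, 0.5 * np.sqrt(N) * r)        # thresholds h_j (illustrative choice)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

s = rng.integers(0, 2, size=N)              # a random genotype
f = sigmoid(w @ s - h)                      # trait values in (0, 1)
print("mean trait value:", round(float(f.mean()), 3))
print("fitness F(s, w) with unit weights:", round(float(f.sum()), 3))
```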

Multilayered perceptron (MLP)

To see that an MLP can appear as a model of metabolism, one can consider two metabolic pathways that operate cooperatively (see the SM).

MLPs (Multi-Layer Perceptrons) typically consist of one or several hidden layers arranged in a feed-forward cascade. In each layer, the genes depend recursively on the genes in the previous layer through sigmoidal functions. This architecture resembles the gene regulatory networks observed in developmental biology, such as the cascade of transcription factors during Drosophila melanogaster morphogenesis. Maternal factors activate gap or pair-rule genes, which in turn activate Hox genes, and Hox genes activate realizator genes that drive segment differentiation in the developing embryo. Therefore, MLP architectures can effectively mimic realistic gene regulatory networks encountered in developmental biology. Furthermore, MLP architectures are computationally universal in the sense that they can approximate any Boolean function, including any conceivable gene-trait map56. This versatility allows MLPs to serve as models for various biological processes, including metabolism. For example, by considering two metabolic pathways that operate cooperatively, we can demonstrate how MLPs can effectively represent such interactions (see the SM for more details).
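A small feed-forward sketch of such an MLP gene-trait map, with layer widths and random weights chosen purely for illustration, looks as follows:

```python
import numpy as np

rng = np.random.default_rng(4)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# An MLP gene-trait map: each layer of "genes" is driven by the previous layer,
# loosely mirroring the maternal -> gap/pair-rule -> Hox -> realizator cascade.
layer_sizes = [30, 20, 12, 8]               # illustrative layer widths
weights = [rng.normal(0, 1, size=(m, n))
           for n, m in zip(layer_sizes[:-1], layer_sizes[1:])]
thresholds = [rng.normal(0, 1, size=m) for m in layer_sizes[1:]]

def traits(s):
    x = np.asarray(s, dtype=float)          # input layer: expressed maternal genes
    for w, h in zip(weights, thresholds):
        x = sigmoid(w @ x - h)              # each layer regulated by the previous one
    return x                                # terminal ("realizator") trait values

s = rng.integers(0, 2, size=layer_sizes[0])
print(traits(s).round(3))
```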

It has been demonstrated that both single-layer and multilayer perceptron models are capable of reproducing the topological characteristics of rugged and fragmented fitness landscapes observed in real-world situations. This is achieved through the reduction of single-layer perceptron models to the well-known hard combinatorial model called K-SAT (see the SM for further details).

Models of cooperation

All the cooperative models discussed in this context can be derived from the fitness functions described earlier. The fitness functions capture the essential dynamics and interactions within the systems, allowing us to model and analyze various cooperative phenomena.

Models for symbiosis

We consider a case where organisms \(O_1, O_2,..., O_m\) coexist in an external environment. These organisms have the ability to activate different genotypes \(s^{(1)}, s^{(2)},..., s^{(n)}\) that enable them to produce chemical compounds (gene products) essential for their survival. Let us denote by \(a_{il}\) the quantity of i-th gene product that can be obtained by activating genotype \(s^{(l)}\). The cost of activating genotype \(s^{(l)}\) is denoted by \(r_l\), which represents the amount of resources required. Additionally, \(h_i\) represents the minimum amount of the i-th gene product needed for the survival of the organism. Mathematically, the cooperation problem among organisms can be formulated as an Integer Linear Programming problem, which can be defined as follows: To minimize

$$\begin{aligned} F_{targ}(u)= \sum _{l=1}^n r_{l} u_l \end{aligned}$$
(4.5)

under conditions

$$\begin{aligned} \sum _{l=1}^n a_{il} u_l \ge h_i, \quad \forall i=1,..., M \end{aligned}$$
(4.6)
$$\begin{aligned} u_l \in \{0,1\}. \end{aligned}$$
(4.7)

This means that each organism produces a subset of gene products, and collectively they need to produce all the products required for survival using the minimal amount of resources defined by \(F_{targ}(u)\). The Boolean variable \(u_l\) represents the activation status of the genotype \(s^{(l)}\), where \(u_l=1\) if the genotype is activated and \(u_l=0\) otherwise. It is worth noting that an alternative formulation of the problem can be considered, where the minimization of (4.5) is replaced by the condition:

$$\begin{aligned} F_{targ}(u) < C_{res}. \end{aligned}$$
(4.8)

This implies that the population of organisms must adapt and survive under constrained resource availability. In the specific case where \(a_{il} \in \{0,1\}\), \(r_l=1\), and \(h_i=1\), the Integer Linear Programming problem reduces to the Set Cover Problem57. Let’s delve into the Set Cover Problem and its biological interpretation in more detail.

Let \(\xi \) represent a parameter that characterizes the state of the environment. We assume the existence of a set \({\mathcal U}=\{f_{1},..., { f}_{M}\}\), which encompasses all possible ecological constraints (referred to as the Universe). For each specific \(\xi \), a different subset of constraints \({{\mathcal {E}}} (\xi )\subset {{\mathcal {U}}}\) is present.

Let \({{\mathcal {F}}}\) be a family that comprises n subsets \({\mathcal E}_k\), where \(k=1,...,n\), representing the different constraints from the Universe \({{\mathcal {U}}}\). We assume that the k-th organism has the ability to fulfill the constraints specified by a set \({{\mathcal {E}}}_k \in {{\mathcal {F}}}\) from this family.

We suppose that

$$\begin{aligned} {{\mathcal {U}}}={ \bigcup }_{k=1}^n {{\mathcal {E}}}_{k}, \end{aligned}$$
(4.9)

i.e., all the sets \({{\mathcal {E}}}_{k}\) cover the entire Universe \({{\mathcal {U}}}\).

Consider different subfamilies \(\{{{\mathcal {E}}}_j : j \in J\} \subset {{\mathcal {F}}}\), where J is a subset of \([1,n]=\{1,..., n\}\). Let us denote by |J| the number of elements of the set J, i.e., the size of the subfamily. Then the Set Covering Problem is as follows:

Consider the family \({{\mathcal {F}}}\) of subsets of \({{\mathcal {U}}}\). Suppose (4.9) is satisfied. For a subset \({{\mathcal {E}}} \subset {{\mathcal {U}}}\), find a cover of \({{\mathcal {E}}}\) by subsets \({{\mathcal {E}}}_j \in {{\mathcal {F}}}\), \(j \in J\):

$$\begin{aligned} {{\mathcal {E}}} \subset \bigcup _{j \in J} {{\mathcal {E}}}_j, \end{aligned}$$
(4.10)

that has a minimal size |J|. In other words, we have a set \({{\mathcal {U}}}\) containing elements 1, 2, ..., M, referred to as the Universe, and a collection \({{\mathcal {F}}}\) consisting of n sets. The objective is to find the smallest subfamily of \({{\mathcal {F}}}\) such that the union of its members covers the entire Universe \({{\mathcal {U}}}\) (note that there may be multiple subfamilies that satisfy this condition).
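For very small instances the Set Covering Problem can be stated directly as an exhaustive search over subfamilies J, which makes the exponential cost of the exact problem explicit (the instance below is illustrative):

```python
from itertools import combinations

# Exact (brute-force) set covering: find the smallest subfamily J such that
# E is contained in the union of E_j, j in J.  The search is exponential in n,
# which is why only tiny instances are tractable this way.
def min_cover(target, family):
    target = set(target)
    for size in range(1, len(family) + 1):
        for J in combinations(range(len(family)), size):
            if target <= set().union(*(family[j] for j in J)):
                return list(J)
    return None                             # the family does not cover the target

universe = range(10)
family = [{0, 1, 2}, {3, 4, 5, 6}, {7, 8, 9}, {0, 3, 7}, {1, 4, 8, 9}, {2, 5, 6}]
print("minimal cover of the Universe:", min_cover(universe, family))
```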