A novel procedure for the identification of chaos in complex biological systems

We demonstrate the presence of chaos in stochastic simulations that are widely used to study biodiversity in nature. The investigation deals with a set of three distinct species that evolve according to the standard rules of mobility, reproduction and predation, with predation following the cyclic rules of the popular rock, paper and scissors game. The study uncovers the possibility to distinguish between time evolutions that start from slightly different initial states, guided by the Hamming distance which heuristically unveils the chaotic behavior. The finding opens up a quantitative approach that relates the correlation length to the average density of maxima of a typical species, and an ensemble of stochastic simulations is implemented to support the procedure. The main result of the work shows how a single and simple experimental realization that counts the density of maxima associated with the chaotic evolution of the species serves to infer its correlation length. We use the result to investigate others distinct complex systems, one dealing with a set of differential equations that can be used to model a diversity of natural and artificial chaotic systems, and another one, focusing on the ocean water level.

Scientific RepoRts | 7:44900 | DOI: 10.1038/srep44900 closely related to the current investigation. In particular, in refs 13-21 the authors put a great deal of effort into the identification of chaos, but in the current work we suggest a new route, which is based on the Hamming distance concept 32 , and on the counting of the density of maxima 33 of the abundance of the species, which we explain below. Other issues that are considered in refs 22-31 deal with generalizations of the standard three-species system to consider systems with four or more species, with the addition of other rules that may jeopardize biodiversity. Specific possibilities are studied in ref. 24 with the addition of diffusion, which may block the spiral pattern formation due to mobility, and also in ref. 26, when one adds players that never change their strategy, regardless of the neighborhood. Moreover, in the recent work 28 the authors investigate how to preserve biodiversity despite the presence of a spatially heterogeneous environment.
There are other studies on the cyclic dominance in evolutionary games, as one can find, for instance, in the recent review 27 . Some works investigate mechanisms that can be used to describe distinct kind of invasion rates between competing species, to act to jeopardize biodiversity. Specific examples can be found in the recent investigations [29][30][31] , which can be of direct interest to the current study, as we further comment in the Sec. Results.
We study the system with three distinct species and standard rules of evolution. The main motivation to deal within this well-known environment is to unveil the chaotic behavior incontestably. We move on encouraged by decades of studies in the physics of compound nuclei and nuclear resonances, and more recently by theories developed to investigate mesoscopic systems 33 , in order to offer a positive answer to this issue. Specifically, we identify the presence of chaos in the stochastic evolution and use it to quantify correlation measurements through a single and simple experimental realization of the system. As we show below, instead of following the standard way to describe the chaotic behavior, we pave a new route, with the result relating the correlation length to the average density of maxima of the abundance or density of individuals of a typical species in the system. It is the main result of the work, and is due to the chaotic properties of the stochastic evolution, which we identify and measure in the current study. We also show that it is a general result and can be used in a diversity of situations, to infer the correlation length of systems that evolve in time chaotically. We first identify the finding, and then illustrate it with two distinct applications, one dealing with a set of first-order differential equations, and the other with the ocean water level.

Results
We use stochastic simulations and follow standard procedure to investigate a system with three distinct species 8,10 . The approach is implemented in Sec. Methods, and there one includes detailed investigation that allows to develop all the results that we describe in the current Section. In particular, one describes the abundance or density of individuals of all the three species. A typical result appears in Fig. 1, where one uses the Bézier algorithm to display in blue the abundance of a given species. It shows a pattern that evolves in time oscillating around an average value, developing an apparently unpredictable number of maxima. As it is known, however, the cyclic behavior shown in the figure is in general required to generate stable evolution in a system described by a set of species that fight among themselves under specific rules. In a relatively short time scale, some species may tend to be suppressed, while others increase their abundance, but the dominance of a typical species over another one changes with the time, decreasing and inverting behavior, inducing an average value in a long time evolution. This cyclic equilibrium behavior is the central issue of the current work, and is used to identify the chaotic behavior even in a single and short time evolution of the system, as it is assured by the maximum entropy principle and the ergodic theorem.
The pattern displayed in Fig. 1 presents a number of maxima that can be related to the correlation length, due to the intrinsic chaotic behavior of the density of individuals. The study allows us to express the main result of the current work in the form It relates the correlation length τ to the average density of maxima ρ of the abundance of a typical species which is illustrated in Fig. 1. The counting of maxima is related to the tendency of the system to return to equilibrium, so the higher the density of maxima, the faster the tendency to approach equilibrium, although the system never relax to the state with the abundance fixed at the corresponding average value. The quantification that appears in Figure 1. The blue Bézier curve shows the evolution of the abundance of a typical species in the interval in between 35000 and 40000 generations. The curve is built from the data corresponding to the blue species that appears in Fig. 4, as described in Sec. Methods.
Scientific RepoRts | 7:44900 | DOI: 10.1038/srep44900 Eq. (1) is due to the chaotic pattern of the stochastic evolution, which we identify and measure using standard techniques. Before working this out, however, in this Fig. 1 one counts the number of maxima in the interval in between 35000 and 40000 generations to get the average density ρ ≈ 0.0034, compatible with the values indicated in Fig. 2 (Top). We then use (1) to get τ ≈ 49, consistent with the value suggested in the inset in Fig. 2 (Bottom). The results displayed in Fig. 2 are obtained from an ensemble of stochastic simulations which we explain in Sec. Methods. The above expression (1) is of very practical use, because it allows us to infer the correlation length with a single and simple assessment to the time evolution of the chaotic variable under consideration, being it of natural or artificial origin, so it is of wide applicability. It follows from the identification that the stochastic simulation which is generically used in biodiversity engenders chaotic behavior. To focus on this, in Sec. Methods one describes the numerical procedure to be employed in this work. We then use it to assess the random processes engendered by the dynamical evolution. However, due to the impossibility to directly control the randomness of the simulations, the strategy elaborated follows a procedure which generates two distinct final states starting from initial states that are slightly different from each other.
Owing to quantify the chaotic behavior, we employ the Hamming distance 32 to measure the difference between the two final states, which is a simple and appropriate manner to distinguish quantities such as vectors, matrices, etc, and can be illustrated with binary vectors. Since it counts the number of sites in the second vector that do not match with the corresponding sites of first one, the distance between (0,0,0,1) and (1,0,1,1) is two, for instance. We use the Hamming distance H(t) to measure the difference between the two final states, which are now seen as two distinct N × N matrices. In Fig. 3 one displays the Hamming distance density h(t) = H(t)/N 2 with the solid green curve that represents an average obtained from a set of 100 simulations, each one starting with a distinct initial state. We recall that it is actually nontrivial to distinguish chaos from randomness, but here one has to emphasize that the new algorithm that we have implemented to account for the Hamming distance, which is described in Sec. Methods, provides the possibility to access the chaotic behavior that underlies the stochastic simulations very clearly. The behavior that appears in Fig. 3 is a key result of the work. It shows that the Hamming distance density increases smoothly and then converges to a given value inside an interval with very narrow width. It is asymptotically stable and does not fill the full lattice. For an animated illustration on this, see the video at https://youtu. be/nzLP-XcCvWc. We have implemented other simulations, changing the values of the parameters (m,r,p), the size of the lattice and the number of species in the system, and observed the same qualitative behavior, showing its universality.
As we have further noted, if one modifies the rules that control the time evolution, the profile of the Hamming distance density displayed in Fig. 3 may change drastically. For instance, if the modification of the rules does not jeopardize biodiversity, the Hamming distance density gets the universal behavior shown in Fig. 3. However, if one follows the investigations considered in refs 29-31, for instance, and modifies the rules in order to destroy biodiversity, the Hamming distance density starts increasing, oscillating in the interval [0,1], but it ends up vanishing in the long run, if the two initial states evolve to become the same final state, or become unit, if the two final states are completely different from each other. The result shows that the asymptotic stability of the Hamming distance density is intrinsically connected with biodiversity. We are now investigating how the amplitude and width of the Hamming distance density behave for three, four and five species, as one increases the mobility to higher and higher values, reaching the region that jeopardizes biodiversity, as firstly shown in ref. 10 and further explored in the more recent works 34,35 . We hope to offer a detailed investigation on this issue in the near future, emphasizing that the Hamming distance behavior is directly related with the evolution of the abundance of a typical species that is displayed in Fig. 1, which was built from the data corresponding to the blue species that appears in Fig. 4. The results depicted in Fig. 4 are further described in Sec. Methods, and show that the abundance of each one of the three species fluctuates around the same average value, as shown in red, blue and yellow.
Knowing that the stochastic processes engender chaotic behavior, it is of interest to further explore the above result. Here one should ask for the behavior of the system under perturbations, the presence of attractors, the Lyapunov spectrum etc. Instead of this standard route, however, we move forward inspired by the previous study 33 and focus attention on counting the density of maxima of the time evolution of the abundance of the species. In this sense, one takes l i to describe the abundance of one of the three species i = a (red), b (blue), or c (yellow), and considers the investigation developed at the end of the Sec. Methods to calculate the average density of maxima ρ i . The main results are shown in Eqs (5)(6)(7)(8), and can be used to write ρ i in the form  This is another key result of the work, and offers a new quantity that arises from the chaotic behavior imprinted in the stochastic simulations. To verify its validity, we implemented a procedure to create an ensemble of events to obtain the correlation function, which is used to calculate the correlation length. As it is known, the correlation length τ is extracted from the correlation function as the width at half height, so we use the fitting function C(t) = cos(κt) to write κτ = π/3, and from Eqs (2) and (7) one gets to This allows to relate the density of maxima to the correlation length in the form τ ρ = 〈 〉 1/(6 ), as we have anticipated in Eq. (1).
The result offers a way to measure the correlation length from a single experimental assessment to the system under investigation, calculating the average density of maxima, as it is illustrated in Fig. 1. We note that the finding nicely connects the top and bottom panels in Fig. 2, since the correlation length which one reads from the inset in the bottom panel in Fig. 2 can be obtained from the value of the density of maxima which one reads from the top panel in Fig. 2 They are the Rössler equations 36 , and can be used to model a diversity of natural and artificial chaotic systems. We see from the left panel in Fig. 5, where the behavior of x is shown, that the number of maxima is 17, so the correlation length gives τ = 0.097. This is an excellent result, since using the fitting function for the correlation function displayed in the right panel in Fig. 5, one gets to τ = 0.098. We move on to examine another system of distinct nature that also evolves chaotically. We focus attention on the ocean water level, with data available from the National Oceanic and Atmospheric Administration. This brings about another illustration, which appears in Fig. 6, displaying in blue the observed water level for the station ID 1770000, Pago Pago, American Samoa. See http://tidesandcurrents.noaa.gov/waterlevels.html?id= 1770000. In the right panel in Fig. 6 one shows the correlation function, visually indicating that the correlation  length is around 2 hours. In fact, if one uses the fitting curve it gives τ = 2.0748, and if one counts the maxima that appear in the left panel in Fig. 6 and then uses the result in Eq. (1), one gets to the value τ = 2.0689.

Discussion
In this work we studied the presence of chaos in a system with three distinct species, that evolve under the standard rules of mobility, predation and reproduction. We first used the Hamming distance density to heuristically characterize the chaotic behavior embedded in the stochastic evolution of the system. The typical result is shown in Fig. 3, and is obtained from the algorithm developed in Sec. Methods, which effectively separates the chaotic behavior from the randomness of the stochastic simulations. The qualitative behavior displayed in Fig. 3 suggests that we investigate how the amplitude and width of the Hamming distance density behave for three, four and five species, as one increases the mobility to higher and higher values, reaching the region that jeopardizes biodiversity. Being a quantity that unveil the chaotic behavior, the Hamming distance density may be used as an order parameter to investigate chaos in complex systems. We will report on this issue in the near future.
The finding motivated us to developed a stronger result, unveiling a quantitative approach to the correlation length from a single and simple experimental measurement of the average density of maxima that appear in the abundance of a typical species during the dynamical evolution of the system. The connection between the density of maxima and the correlation length was formulated under the principle of maxima entropy, using an ensemble of events that we developed numerically in Sec. Methods. Since the stochastic simulations here considered are universally used to describe biodiversity in nature, we are then left with the suggestion that chaos is imprinted in biodiversity.
The study unveiled a simple route to obtain the correlation length associated to a given chaotic time evolution, and the simplicity of the procedure makes the result especially relevant to investigate complex chaotic systems. Examples of applications of this result abound, and here we have illustrated its applicability with the deterministic system described by the set of first-order differential equations (4), which are known as the Rössler equations 36 and engender the behavior displayed in Fig. 5. Also, we have considered another well distinct complex system, dealing with the ocean water level data that is displayed in Fig. 6.
The main result of this work indicates that a single and simple experimental investigation of the time evolution of a given chaotic quantity allows estimating the correlation length of the corresponding quantity with a very high confidence level. As we have illustrated with some explicit examples, it can be widely used to study complex processes, as the ones that appear in Geology, Meteorology, Evolutionary Biology, and in many other areas of nonlinear science. We hope that the current study may help us to better understand biodiversity and its associated chaotic behavior, inspiring further research on similar issues. In particular, the simplicity of the result suggests that the ergodic hypothesis and the maximization of the entropy are valid for complex biological systems and may serve for other studies on the thermodynamics and statistical properties of such systems.

Methods
In this work we developed stochastic simulations considering a standard system which is widely used to investigate biodiversity in nature. We review the procedure recalling that the simulations are implemented considering the system with three distinct species a, b, and c, identified with the colors red, blue, and yellow, respectively. They evolve according to the three basic rules: mobility (m), reproduction (r), and predation (p). The evolution is carried out standardly, taking a square lattice of size N × N, in which the three species and the empty sites (which is identified by e, with the color white) are equally but randomly distributed in the lattice, such that the quantity of individuals of each species (including the empty sites) is L = N 2 /4, at the initial state. We deal with local dispersion and appropriate mobility. A given site is considered active if occupied by an individual of one of the three active species, and it may interact with one of its eight nearest neighbors, the Moore neighborhood.
As usual, the numerical simulations are performed with periodic boundary conditions. If i stands for a,b or c, and α for a,b,c, or e, mobility and reproduction are represented by iα → αi and ie → ii, respectively. The other rule, predation, follow the rock-paper-scissors game, that is, ab → ae, bc → be, and ca → ce. In this work we take m = 0.5, r = 0.25, and p = 0.25, for mobility, reproduction and predation, respectively, for all the three species. We have considered other possibilities, with m = 0.25, r = 0.25 and p = 0.5, for instance, and no qualitative modification in the results was found. The dynamical process starts with a random assess to the square lattice, followed by a random selection of one of the three rules and by a random choice of one of the eight neighbors. It is then checked if the assessed site is empty or not: if it is empty, one returns to the lattice to simulate another assess to it; if it contains an individual of one of the three species, one takes the selected rule and uses it with the selected neighbor. To measure the time evolution we use generation, which is the time spent to access the lattice N 2 times.
To prepare the initial state, one randomly chooses one among the three species a, b and c, and the empty site e with the same probability, and distributes it in the square lattice. The procedure is repeated N 2 times, evenly filling all the sites in the square lattice. A typical initial state is illustrated in the left panel in Fig. 7. One uses it to run the stochastic simulations and get to the final configuration which is displayed in the right panel of Fig. 7. There one sees that the system evolves forming a specific pattern, in which the species organize themselves in spatial portions of the lattice. This is known in the literature, and has been explored in a diversity of contexts, to study diversity and other related behaviors.
In order to unveil the chaotic behavior, we use the stochastic simulations to generate a new algorithm. The procedure goes as follows: one first builds a initial state as done in the left panel in Fig. 7, and makes a copy of it. One uses this initial state to run the simulation to get to the final state, which is then saved. The key point here is that during the time evolution a new file is created, in which one saves every single step used to run it. One then takes the copy of the initial state and randomly selects a lattice site and modifies its content. This new state has the tiniest difference, since among the many lattice sites it has a single site which is different from the initial state already used to evolve in time. With this new initial state, one runs the same simulation already considered, Scientific RepoRts | 7:44900 | DOI: 10.1038/srep44900 evolving it according to the very same rules, in the same order and pace, as they appear in the saved file. The procedure leads to another final state, which is also saved.
The two final states are different, but the difference has nothing to do with the randomness of the stochastic simulations, being due to the tiniest modification introduced in the second initial state. To infer the presence of chaos, one has to measure the difference between the two final states, generated with the same stochastic rules. Toward this goal, we have developed an algorithm which displays in light green the lattice sites that differ in the two configurations, leaving the other sites empty. In Fig. 8 two snapshots are displayed, one at the initial time in the left panel, with a single almost invisible light green site (which is depicted at the grid center) marking the tiniest difference between the two initial states. The other snapshot is in the right panel, and it shows all the light green sites that mark the difference between the two final states, after implementing the simulations for a long time. The two snapshots show the impact that the subtle modification introduced in the initial state causes in the evolution of the system. The result suggests that the time evolution that appears in the stochastic simulation engenders chaotic behavior. The procedure allowed to introduce the Hamming distance density which we discussed in Sec. Results and displayed in Fig. 3.
To further explore the stochastic network simulations, one investigates how the quantities of individuals L i (i = a,b,c) vary as the system evolves in time. In fact, one uses the densities l i = L i /N 2 and the results are depicted in Fig. 4, showing that the abundance of each one of the three species fluctuates around the same average value, as shown in red, blue and yellow. They do not add to unit because of the number of empty sites, that fluctuates around a lower value. This happens because the species move, reproduce and predate evenly, while the empty sites enter the game passively. We have run other simulations, starting from an initial state with no empty site, and ended up with final states with the species having abundance similar to the ones displayed in Fig. 4. This shows that the addition or not of empty sites in the initial state does not modify the abundance behavior for long time evolutions. The abundance behavior that appear in Fig. 4 show a substructure which is not present in the  curve depicted in Fig. 1, due to the Bézier algorithmic tool used to smooth its behavior. This is a legitimate procedure, since the substructure that appears in the abundance is due to the intrinsic discreteness of the stochastic approach, having nothing to do with the chaotic behavior present in the process under investigation. The Bézier algorithm that we implement in this work uses as the control points the set of points available from the data, in the time interval used to count the number of maxima.
As can be seen from Fig. 4, the abundance l i evolves in time and fluctuates to produce local maximum in the interval [t, t + δt], for sufficiently small δt, so one has ′ > l t ( ) 0, The joint probability ′ ″ P l l ( , ) i i can be used to calculate the average density of maxima ρ i through the simple route: the probability to find a maximum in the interval [t, t + δt] is proportional to the integral spanning the region defined above, such that The principle of maximum entropy can be used to construct the joint probability distribution for l i and its derivatives from the previous equations. After implementing the algebraic calculations, integration on l i leads to ′ ″ P l l ( , ) i i which gives The above expressions can be used to write the density of maxima in terms of the correlation function, as we showed before in Sec. Results. One then implement a numerical investigation to obtain an ensemble of events, taking an initial state and running the simulation for a very long time. This is a single event, and it is repeated many times, to get to the ensemble of events, each event being built as the first one, running the simulation from a random initial state within the same time interval. The ensemble is then used to count the number of peaks in l i (i = a, b, c) in the simulation. They are depicted in the top panel in Fig. 2, for each one of the three species, and there the values of the average density of maxima are also shown.
The procedure is also used to get the correlation function, which is depicted in the bottom panel in Fig. 2 for all the three species in a long time evolution, showing the expected general behavior, with the periodicity reflecting the periodic boundary conditions in the lattice. In the inset we depict the correlation function for a time evolution enough to display the correlation length. One then uses the short time evolution shown in the inset of the bottom panel in Fig. 2 to search for the best fit curve, getting C(t) = cos(κt) where κ = 0.0212, with error lower then 5%, obtained within the least square approach. We emphasize that the natural decaying process present in the correlation function is not of practical significance to calculate the correlation length, since it can be obtained from the short time evolution displayed in the inset in the bottom panel in Fig. 2.