Introduction

A widely applicable complexity measure is needed in numerous areas1,2,3,4, either for direct applications or for a unified theoretical framework like statistical mechanics5. To date, except for some special cases6,7,8, most calculable complexity measures are superficially classified into three types (I, II and III) depending on three complexity-randomness (C-R) relations: a monotonically ascending curve, a convex curve and a monotonically descending curve, respectively8. The competition is mainly between type I and type II. According to many type-II supporters, highly complex systems like human brains evidently exist at a critical transition point between randomness (or deterministic chaos) and regularity, called the edge of chaos or weak chaos9,10,11,12,13. An ideal type-II measure should therefore regard an object of weak chaos as the most complex, and low-periodic objects and completely chaotic objects as the simplest, although most type-II measures, e.g. the logical depth14, do not suffice.

There is another, deeper but perhaps not strictly accurate, classification. Deterministic measures are usually estimations of the incomputable concept of Kolmogorov complexity3 (KC), the uncompressible information content of an individual object, defined as the length of the minimal computer program that regenerates the object; statistical measures are chiefly derived from the Shannon entropy H for systems describable in probabilistic language9,10. A quantity is called extensive if it scales (asymptotically) with the size rw of the random word (string) that describes the system under consideration15. In an isolated system, H is extensive, whereas some statistical measures appear not to be9. From this, disputes arise.

Amazingly, a recently introduced deterministic measure, the lattice complexity CL, exhibits intrinsic adaptability to various C-R relations6,7, and even a degree of sensitivity to weak chaos, implying an ultimate solution of all the above-mentioned disputes. For a deterministic α-nary symbol string s, intrinsic adaptability with a parameter r requires only treating all (overlapping) length-r words in s as α^r-nary symbols in the calculation of a specific measure; extrinsic adaptability requires extra variables and operations to show the behavior of two existing different-type measures separately or jointly8. Traditionally, r is related to rw, because s is often considered a collection of outcomes of a length-r random word, and a statistical form of intrinsic adaptability may help to find the proper theoretical foundation.

In this article, C2, a statistical measure previously classified as type II16,17, is found to possess somewhat intrinsic C-R adaptability. Further analysis reveals a contradiction between this adaptability and the random universal data-generating model (r-UDGM), i.e. the random process, in which entropies are exclusively rooted. With the nonlinear deterministic iterative system identified as the deterministic UDGM (d-UDGM), which can generate any arbitrary symbol string just as the traditional r-UDGM can, the C-R competition is clarified. A particular universal data-regenerating model (UDRM) containing both the r- and the d-UDGM is shown to unite the major competing ideas of complexity measurement naturally in an estimation of KC.

Results

Deterministic adaptable complexity

With its widely used type-I estimation, the Lempel-Ziv complexity18 (CLZ), KC is traditionally considered a measure of randomness3,10,19. Although this judgment is valid for random objects, the adaptable estimation CL reveals more aspects of KC.

Both CLZ and CL simulate a machine reading the given string s over a finite alphabet continuously into an unlimited memory. Alongside the reading procedure, both algorithms virtually separate s into uncompressible units and count the number of units as the complexity value of s. The present unit is the one containing the present symbol, i.e. the symbol just being read. It may still be compressible and hence extendable.

Compressibility is reduced to duplicability in CLZ. If the present unit can be duplicated from any section of the exhaustive memory, including the already-read part of the unit itself, the duplication operation extends the unit symbol by symbol until no section of the memory equals the unit. At that point the present symbol is regarded as an insertion, making the present unit uncompressible, and the next unit, with its first symbol, becomes the present one (see the example below).

In CL, a present-unit-extending mechanism prior to duplication is the deterministic iterative map, following either a chaotic (no-symbol-repeating) rule or a periodic rule. Iterations of such a map are regarded as being as compressible as duplications, inspired by the fact that iterative systems described by short programs, e.g. logistic maps20, can produce any symbolic sequence out of chaotic or periodic orbits.

For example, let s = 001000110011010111, and let the dot (·) and the bar (|) denote the insertion in CLZ and in CL, respectively. The results are as follows:

CLZ: 0·01·000·11·001101·0111

CL: 001|0001|1001|101011|1

According to CLZ, the first symbol is always an insertion without a prefix. The second has a duplicable prototype in the memory, but the third yields the unit 01, which does not match the exhaustive memory 00. The fourth and fifth symbols 00 together can be duplicated from the previous symbols 0010, while the sixth cannot, because 000 has no prototype in 00100. The three following uncompressible units are 11, 001101 and 0111. Since there are 6 separated units, CLZ(s) = 6.
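The reading-and-separating procedure above can be sketched in code. The function below is a straightforward implementation of the duplication-and-insertion rules (not the authors' original program), and it reproduces the six units of the worked example:

```python
def clz(s):
    """Count the uncompressible units of s under the CLZ rules: the
    present unit is extended while it can be duplicated from the
    exhaustive memory (everything before the present symbol, including
    the already-read part of the unit itself); the symbol defeating
    duplication is the insertion that closes the unit."""
    n, i, c = len(s), 0, 0
    while i < n:
        l = 1
        # s[:i+l-1] is the exhaustive memory excluding the present symbol
        while i + l <= n and s[i:i + l] in s[:i + l - 1]:
            l += 1
        c += 1          # the present unit s[i:i+l] is uncompressible
        i += l
    return c

print(clz("001000110011010111"))  # 6 units: 0.01.000.11.001101.0111
```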

According to CL, the first two symbols 00 are generated by a 1-periodic iteration, but the third symbol 1 interrupts the iteration. Because 001 is not duplicable, it is an uncompressible unit. The next unit 0001 is identified similarly. Concerning the third unit 1001: since the first two symbols 1 and 0 differ from each other, they are assumed to follow the chaotic rule; the third symbol 0 implies that a periodic rule is employed with an initial state 1. After the periodic rule is broken, neither 1001 nor the following unit 101011 is found duplicable. With the last unit 1 being separated, we see CL(s) = 5.
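The CL rules can be formalized in the same style. The sketch below is one plausible reading of the description above (the exact period-detection and memory conventions are our assumptions, not the authors' reference implementation); it does reproduce the five units of the worked example:

```python
def cl(s):
    """Sketch of lattice complexity: extend the present unit by the
    iterative-map rules (a no-symbol-repeating "chaotic" prefix, then a
    locked-in periodic rule), fall back to duplication after the rule
    breaks, and close the unit on an insertion."""
    n, i, c = len(s), 0, 0
    while i < n:
        j, p = i + 1, None
        while j < n:                      # iteration phase
            if p is None:
                for q in range(1, j - i + 1):
                    if s[j] == s[j - q]:  # repetition found: lock period q
                        p = q
                        break
                j += 1                    # chaotic rule, or first periodic step
            elif s[j] == s[j - p]:
                j += 1                    # periodic rule continues
            else:
                break                     # periodic rule broken
        if j < n:
            j += 1                        # the breaking symbol joins the unit
            while j < n and s[i:j] in s[:j - 1]:
                j += 1                    # duplication extends the unit
        c += 1                            # insertion closes the unit
        i = j
    return c

print(cl("001000110011010111"))  # 5 units: 001|0001|1001|101011|1
```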

Let the parameter r = 2, so that every two-symbol word in s composes a refined symbol; then CL(s2) = 3. Let r ≥ 6; then CL(sr) = 1. Indeed, as has been shown6,7, for a finite s there is a critical order r* such that once r ≥ r*, sr can be regarded as a single iteration and then CL = 1. When r reaches the particular r* of a given “completely chaotic” object, with both the “completely chaotic” object and the low-periodic object obtaining the minimal CL, CL achieves the transition from a type-I measure to a type-II one. The objects of highest r*, the most difficult to reach CL = 1, are strings of the period-doubling accumulation points known as weak chaos13.
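Fine-graining itself is a one-line operation: every overlapping r-word of s becomes one refined symbol. For the example string with r = 6, all 13 refined symbols turn out to be distinct, so the whole refined string follows the no-symbol-repeating (chaotic) rule as a single unit, consistent with CL(s6) = 1:

```python
def refine(s, r):
    """Treat every (overlapping) length-r word of s as one refined symbol."""
    return [s[i:i + r] for i in range(len(s) - r + 1)]

words = refine("001000110011010111", 6)
# All refined symbols distinct -> one chaotic unit -> CL = 1.
print(len(words), len(set(words)))  # 13 13
```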

Adaptable entropy

In classical information theory21, when the probability pi of any event xi is obtainable, with a random variable X representing all α possible events, the α-nary Shannon entropy is the mean of the information content −logα pi of X:

S = −Σi pi logα pi.

If α = 2, the unit of S is just the bit; if not, since log2 x = logα x · log2 α, one may multiply S by log2 α to get the entropy H in bits. For a binary independent and identically distributed random word of length rw, whose 2^rw possible outcomes are the elemental events xi with probabilities Pi(rw), S (with α = 2^rw) becomes the entropy rate (entropy per symbol) of order rw:

h(rw) = −(1/rw) Σi Pi(rw) log2 Pi(rw).

It is well known that h(rw) = 1 for the uniformly distributed case Pi(rw) = 2^−rw. Measuring the mean number of bits needed for the shortest description of the random word's experimental outcome, H(rw) = rw h(rw) is extensive because in the limit rw → ∞ the entropy rate21 h is a constant.
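As a numeric sanity check on the definitions above, the order-rw entropy rate of the uniform distribution Pi = 2^−rw evaluates to exactly 1 bit per symbol:

```python
from math import log2

def entropy_rate(probs, rw):
    """h(rw) = -(1/rw) * sum_i P_i * log2(P_i), in bits per symbol."""
    return -sum(p * log2(p) for p in probs if p > 0) / rw

rw = 8
uniform = [2.0 ** -rw] * 2 ** rw    # P_i = 2^-rw for all 2^rw events
print(entropy_rate(uniform, rw))    # 1.0
```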

Accompanied by a type-I measure C1, C216,17 is derived from “a hierarchical approach to complexity of infinite stationary strings22.” By stationary we mean that the statistical properties of the strings under consideration are fixed as time or space changes, a precondition for the application of all statistical measures.

Given a binary string s of length n, there are in total 2^r distinct possible words of length r (r-words for short). Let Fa(r) denote the frequency of a distinct allowed r-word that really emerges in s, and Ff(r) the frequency of a distinct forbidden r-word in s, counted as follows: if in s (except at its end) an (r − 1)-word emerges x times but one of its one-symbol extensions never emerges as an r-word, we account that forbidden r-word as emerging x times. For example, with the basic alphabet {0, 1}, if the 3-word 011 emerges in s but is always followed by 1, then we regard the frequency of the allowed 4-word 0111 as that of the forbidden 4-word 0110.

With probabilities replaced by relative frequencies in s, C1 is the entropy rate of the allowed r-words and C2 that of the forbidden r-words:

C1(r) = −(1/r) Σi fa,i(r) log2 fa,i(r),  C2(r) = −(1/r) Σi ff,i(r) log2 ff,i(r).

Here, i runs over the distinct r-words; fa,i(r) = Fa,i(r)/(n − r + 1) and ff,i(r) = Ff,i(r)/(n − r + 1) are the relative frequencies.
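A direct computation of C1 and C2 from a finite string can be sketched as follows; the normalization by n − r + 1 and the forbidden-word counting follow the definitions above as we read them:

```python
from collections import Counter
from math import log2

def c1_c2(s, r, alphabet="01"):
    """C1: entropy rate of allowed r-words; C2: of forbidden r-words.
    A forbidden r-word is an unobserved extension w+a of an observed
    (r-1)-word w; it inherits the frequency of w, counted everywhere
    except at the very end of s.  Requires r >= 2."""
    total = len(s) - r + 1
    allowed = Counter(s[i:i + r] for i in range(total))
    prefixes = Counter(s[i:i + r - 1] for i in range(total))
    forbidden = Counter()
    for w, count in prefixes.items():
        for a in alphabet:
            if w + a not in allowed:
                forbidden[w + a] = count

    def rate(counts):
        return -sum(c / total * log2(c / total) for c in counts.values()) / r

    return rate(allowed), rate(forbidden)

c1, c2 = c1_c2("01" * 500, 2)       # 2-periodic string, r = 2
print(round(c1, 3), round(c2, 3))   # both approximately 0.5
```

For a long 2-periodic string this gives C1 ≈ C2 ≈ 0.5, matching the prediction for periodic sequences discussed below.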

Given a finite r, if s contains all 2^r possible (overlapping) r-words, no forbidden r-word occurs. For a completely chaotic (random) case, as n → ∞ the number of distinct forbidden r-words tends to 0, and then C2 → 0.

Since s is finite, C1 is a real-world estimate of h(rw), while C2 is roughly adaptable to type I and type II. First, the critical order r* still works. If s is random to some degree, every r*-word in s is distinct, as is every (r* + 1)-word. So every distinct allowed (r* + 1)-word shares its r*-bit prefix with a forbidden (r* + 1)-word of the same frequency, and thus C2 = C1. Second, there may exist another critical word length r0 such that for r ≤ r0 no forbidden r-word occurs, and then C2 = 0. By increasing r from the r0 of a given completely chaotic object to the r* of the same object, C2 roughly achieves a transformation from a type-II measure to a type-I one.

If s is a sequence of minimum period m, then when r ≥ m there are m distinct r-words of equal frequency, and C2(r) can simply be predicted: in the infinite-length limit, C1(r) = C2(r) = (log2 m)/r.

Functionally, C2 is composed of a type-I measure, C1, and a type-II component carried by the forbidden words. Since that component is a low-precision counterpart of C2, it may cause a precision problem for C2 in showing type-II behavior.

What is really involved in the C2 calculation is the frequency of every distinct allowed r-word sharing a length-(r − 1) prefix with a forbidden r-word. This means that C2 is actually applicable to α-nary strings. For convenience and without loss of generality, we assume hereafter that the strings under consideration are binary.

Intrinsic adaptability

Let us use the logistic map xt+1 = μxt(1 − xt) to exhibit the intrinsic adaptability. The case μ = 4 is known as the completely chaotic (pseudo-random) object, and the case μ = 3.57 as a representative sample of weak chaos. After 25000 iterations were deleted as transient, from a trajectory xt we obtained a binary symbolic sequence of length 8204 by the partition at 0.5. With different r, CLZ, CL, C1 and C2 for both μ = 4 and μ = 3.57 were calculated, as shown in Fig. 1.
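The symbolic sequences used here can be regenerated as follows; the initial condition x0 = 0.3 is our assumption, and any typical seed gives a statistically equivalent sequence:

```python
def symbolic(mu, n, transient=25000, x0=0.3):
    """Binary symbolic sequence of the logistic map x -> mu*x*(1-x),
    partitioned at 0.5, after discarding a transient."""
    x = x0
    for _ in range(transient):
        x = mu * x * (1.0 - x)
    out = []
    for _ in range(n):
        out.append("0" if x < 0.5 else "1")
        x = mu * x * (1.0 - x)
    return "".join(out)

s4 = symbolic(4.0, 8204)      # completely chaotic case
s357 = symbolic(3.57, 8204)   # weak chaos near the accumulation point
```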

Figure 1

Complexity of logistic map.

(a) CLZ and CL for μ = 4; (b) C1 and C2 for μ = 4; (c) CLZ and CL for μ = 3.57; (d) C1 and C2 for μ = 3.57.

From Fig. 1, as r increases a remarkable symmetry can be found in the behavior of the two pairs of measures, CLZ and CL, and C1 and C2. When μ = 4, if we scale the ordinate logarithmically we find r* = 26. Although CLZ and CL are almost equal at low r, the difference between them increases with r until r reaches r*. On the other hand, at low r, C1 = 1 and C2 = 0. They rapidly converge as r approaches r* and stay equal when r ≥ r*.

When μ = 3.57, the convergence of C2 to C1 is far slower [Fig. 1(d)]: in a long range of r, between about 26 and 834, C1 ≈ C2. To let C2 act as a type-II measure, r must be within the range from 4 to 10.

With r ≤ 3, the μ = 3.57 and μ = 4 cases are not distinguishable by C2. When 10 < r < 26, CL and C2 are both in a type-transition state. The transition range is near the word length r = 13 [Fig. 1(b)], the solution of the equation 2^r = n − r + 1 (for n = 8204, 2^13 = 8192 = 8204 − 13 + 1). It is easy to see that C1(r) ≤ 1 and C1(r) ≤ log2(n − r + 1)/r are valid for any length-n string, and that C1(r) = log2(n − r + 1)/r is valid only if in s each overlapping distinct possible r-word appears just once6.

Fixing r but letting μ vary with Δμ = 0.0001 from 3.5 to 4, the results are as shown in Fig. 2. When r = 3, almost all cases in the region about μ > 3.555 obtain C2 = 0 (not shown in Fig. 2); when r = 4, the zero-C2 region reduces to about μ > 3.907 [Fig. 2(b)]; and when r = 5, to about μ > 3.978 (not shown in Fig. 2). Since r0, the largest word length with C2 = 0, is either 3 or 4 for most chaotic cases, we see a serious precision problem of C2 at low r [Fig. 2(b)].

Figure 2

The C2 of logistic map.

(a) Bifurcation diagram; (b) r = 4; (c) r = 10; (d) r = 12.

As C2(10) = 0 for μ = 4 [Fig. 2(c)], C2(10) is roughly type-II. For any chaotic s, C2 always converges rapidly to C1 once the r0 of s, the largest word length with C2 = 0, has been exceeded [Fig. 1]. Since 10 is certainly larger than 3 or 4, within most of the chaotic area except a small region very close to the point μ = 4, C1 and C2 act similarly.

The highest C2 [as shown in Fig. 2(c)] is not close to the edge of chaos, as the highest CL is [see Fig. 8 in Ref. 6]. When r is only a little higher than the r0 of the μ = 4 case, e.g. r = 12, C2 definitely becomes a type-I measure [Fig. 2(d)]. Thus, C2 has a significantly smaller range of choices of r for roughly type-II behavior than CL has.

Symbolic dynamical analysis and UDGM

Symbolic dynamics for one-dimensional nonlinear iterative systems, including logistic maps, provides a one-to-one correspondence between any semi-infinite symbol string and the initial point of the trajectory producing the string20. A finite r-word represents a deterministic segment enclosing the initial point. Increasing r rescales the segment into a shorter one. Therefore, the parameter r in CL is also called the fine-graining order, while for entropy this name may not be appropriate, as discussed below.

Table 1 shows the distribution of all possible 4-bit words in the μ = 3.645 case. If two adjacent 3-bit-prefix-sharing words both emerge or both do not emerge, they are ignored. Hence we get only three distinct 4-bit forbidden words, 0110, 0100 and 1100, but ignore 1001, which is also adjacent to an allowed word. Moreover, since one r-word creates two prefix-sharing (r + 1)-words, when r = 5 all (virtual) segments corresponding to 4-bit forbidden words are ignored. This makes C2 fluctuate irrelevantly to the real spatial structure in phase space. For instance, there always exist mid-position adjacent segments that are never visited, but when r = 4, 5 and 6, the number of distinct forbidden words corresponding to such segments equals 2, 0 and 2, respectively. Thus the curves of C2 versus r can hardly be smooth, except in some cases of almost complete chaos [Fig. 1(b) and (d)].

Table 1 Distribution of all possible 4-bit words in the 8204-point μ = 3.645 case
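Table 1's counting rule can be checked numerically. The sketch below regenerates a μ = 3.645 sequence (the initial condition x0 = 0.3 is our assumption) and lists the absent 4-words whose 3-bit-prefix-sharing sibling is allowed; absent words whose sibling is also absent, such as 1001, are ignored:

```python
from itertools import product

def symbolic(mu, n, transient=25000, x0=0.3):
    """Binary symbolic sequence of the logistic map, partitioned at 0.5."""
    x = x0
    for _ in range(transient):
        x = mu * x * (1.0 - x)
    out = []
    for _ in range(n):
        out.append("0" if x < 0.5 else "1")
        x = mu * x * (1.0 - x)
    return "".join(out)

s = symbolic(3.645, 8204)
present = {s[i:i + 4] for i in range(len(s) - 3)}
counted = set()
for w in ("".join(p) for p in product("01", repeat=4)):
    sibling = w[:3] + ("1" if w[3] == "0" else "0")
    if w not in present and sibling in present:
        counted.add(w)
print(sorted(counted))  # ['0100', '0110', '1100'], matching Table 1
```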

Essentially, any entropy is only applicable to a random string emitting r-words with a measurable stationary distribution, i.e. to the r-UDGM of an arbitrary given deterministic string s. To apply the r-UDGM exclusively, one has to ignore all temporal or spatial information of s unrelated to the distribution, let alone nonstationary objects with no stable distribution. For instance, given r = 2, the 2-periodic infinite string has C2 = 0.5 and is regarded as a case of medium complexity despite its simple temporal structure.

In contrast to entropy, the KC estimations CLZ and CL are themselves designed for single deterministic strings. The successive process of searching for duplicable sections ensures low complexity values for simple regular strings, e.g. the 2-periodic string 0101…, for which CLZ = 3 and CL = 1.

Due to the absence of a perspective of deterministic chaos in its treatment of irregular strings, CLZ is still type-I. Using the terminology of C1 and C2, we may say that in s CLZ successively counts up non-overlapping allowed words of assorted lengths, each of which is adaptively increased from 1 to such a value that the word is distinct from any part of the exhaustive memory. This word-length-adjusting mechanism makes CLZ much more fine-grained than any fixed word length allows and makes the parameter r meaningless.

In CL, deterministic iterative systems are considered a sort of simple data-generating model. They can be regarded as logistic maps, in which the fine-graining order r represents the size of the segments, or as equivalent one-bit-output binary recurrence equations, in which r represents the number of input bits. For any given s, when r < r* this sort of model is a non-universal data-generating model (NUDGM); when r ≥ r* it becomes universal, i.e. the d-UDGM, and leaves no space for others.
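The one-bit-output recurrence view can be made concrete with a lookup-table sketch (the function names and the table form are our illustrative choices): for the example string of the previous section, r = 6, its critical order, admits a consistent recurrence st+6 = F(st … st+5) that regenerates the whole string from its first six symbols, while r = 2 does not:

```python
def fit_recurrence(s, r):
    """Fit s[t+r] = F(s[t:t+r]) as a lookup table; return None if some
    r-word is followed by two different symbols (model non-universal)."""
    table = {}
    for i in range(len(s) - r):
        w, nxt = s[i:i + r], s[i + r]
        if table.get(w, nxt) != nxt:
            return None
        table[w] = nxt
    return table

def regenerate(seed, table, n):
    """Run the recurrence forward from an r-symbol seed."""
    out = seed
    while len(out) < n:
        out += table[out[-len(seed):]]
    return out

s = "001000110011010111"
print(fit_recurrence(s, 2) is None)          # True: r < r*, non-universal
F = fit_recurrence(s, 6)                     # r = r* = 6 works
print(regenerate(s[:6], F, len(s)) == s)     # True: lossless regeneration
```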

The d-UDGM can be used alone by regarding s as a single trajectory of a simplest equation, whose operation assignment must be defined to be as regular as possible. Determining the exact optimal set of arithmetic operations and their assignment in the equation may need countless tentative calculations. What absolutely cannot be reduced is the smallest number of input bits, the r* of s, representing the system's uncompressible information.

Henceforth let rw also denote the length of the assumed random word in the r-UDGM of s, to distinguish it from the fine-graining order r. For a length-n s, let CL(r, n) replace CL(sr), and let r = 0 mean that no segment involves iterative mappings; then CLZ can be viewed as a special case of CL, denoted CL(0, n).

Discussion

Logically, a quantity designed for single deterministic strings is unconditionally suitable for stationary random strings, because one can calculate the quantity's probabilistic mean, whereas the mean may not fit any individual string. The quantity's mean for a length-rw random string can be interpreted as the average of k individual results, each computed from a length-rw string emitted by the random string, as k → ∞. To arrange such k emitted strings in a single sample time series, we must assume that the random string, or equivalently the time series, is ergodic, i.e. the relative frequency of any distinct length-rw deterministic string in the time series equals the string's probability: fi(rw) = Pi(rw), i = 1, 2, …, 2^rw.

If every Pi(rw) is previously known, then in an ergodic time series the arrangement of all length-rw emitted strings can be overlapping or non-overlapping, in a certain order or in disorder, which is pointless for calculating a quantity's mean: the mean can be obtained theoretically, without numerical computation. A theorem of Brudno23 states that the KC per symbol of almost all emitted strings of infinite length is equal to the entropy rate h. Likewise, when the fine-graining effect is absent or negligible (i.e. r = 0 or 1), maximum CL relates to emitted strings of maximum randomness, and one can prove that6,18,21

lim n→∞ CL(0, n) log2 n / n = h

with probability 1. Hence, in traditional statistical-mechanics language, CL(0, n) log2 n and CL(1, n) log2 n are not only extensive but also asymptotically equal to H.
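Brudno's relation can be illustrated numerically: for a pseudorandom binary string, CL(0, n) log2 n / n approaches h = 1. This is a finite-n sketch (seed and length are arbitrary choices); convergence is slow, so the finite-size value overshoots 1 somewhat:

```python
import math
import random

def clz(s):
    """CL(0, n): the Lempel-Ziv unit count of s."""
    n, i, c = len(s), 0, 0
    while i < n:
        l = 1
        while i + l <= n and s[i:i + l] in s[:i + l - 1]:
            l += 1
        c += 1
        i += l
    return c

random.seed(0)
n = 10000
s = "".join(random.choice("01") for _ in range(n))
print(clz(s) * math.log2(n) / n)   # close to (slightly above) h = 1
```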

In order to apply entropy to a given length-n time series s, we have to assume that s is ergodic, regardless of its real generating mechanism. Without previously known probabilities, from s only C1(rw) rather than h can be obtained. To make fi(rw) → Pi(rw) and C1(rw) → h(rw), we have to let n ≫ 2^rw. For example, if s is a given completely random object without forbidden words, we must at least let n ≥ 2^rw + rw − 1 to ensure that no distinct rw-word has zero frequency.
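The requirement n ≫ 2^rw can be seen directly: with n = 1000 random symbols (the seed is arbitrary), the estimate C1(rw) is close to h = 1 for rw = 2 but collapses toward log2(n − rw + 1)/rw for rw = 15, where nearly every word is seen only once:

```python
import random
from collections import Counter
from math import log2

def c1(s, r):
    """Entropy rate of the allowed (observed) overlapping r-words."""
    counts = Counter(s[i:i + r] for i in range(len(s) - r + 1))
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values()) / r

random.seed(1)
s = "".join(random.choice("01") for _ in range(1000))
print(round(c1(s, 2), 3), round(c1(s, 15), 3))  # ~1.0 versus ~0.66
```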

Here we encounter an unsolvable paradox: to show C2's type-I behavior, i.e. to let C2 converge to C1, it has to be valid that n − r + 1 ≤ 2^r, or even n ≪ 2^r, for a given completely random object; thus the precondition for any entropy's application cannot be satisfied. Moreover, neither an r-UDGM-rooted measure nor a d-UDGM-rooted one can embody the intrinsic adaptability, since in the r-UDGM the fine-graining process is not only meaningless but also harmful, and in the d-UDGM the r* of s is the ultimate point rather than an example of intrinsic adaptability.

Independent of any specific data-generating model, the KC estimation concerns lossless regeneration of the given data. The insertion operation in CL fills the blanks left by any NUDGM with symbols already known from s and grants the UDRM containing this NUDGM universality. The duplication operation freely generates repeated words as the r-UDGM does, making the UDRM of CL(0, n) a sort of quasi-r-UDGM. Except for low-period objects, CL(0, n) shows no noticeable difference from C1 in its C-R behavior. Without needing to preset an rw for calculating the average over all allowed rw-words in s, the whole s is treated as a single emitted object for CL, and then the per-symbol values of CL(0, n) and CL(1, n) appear to be estimations of h even better than C124,25,26.

With r increasing, CL appears to be a simulator not only of H but also of r*. The intrinsic adaptability of CL indeed embodies a general information/complexity measure presenting a smooth transition from the r-UDGM-rooted (superficially type-I) information concept to the d-UDGM-rooted (ideally type-II) complexity concept, all consistent with the principle of KC.

The r-UDGM and the d-UDGM identified here enable us to succinctly redefine the C-R conflict and to avoid unnecessary confusion caused by misuse of either UDGM, e.g. about randomness and chaos, and not only in complexity measurement. Besides many well-defined deterministic dynamical systems, living organisms12,27,28,29, e.g. human brains, hearts and economic systems, appear nonstationary and edge-of-chaos and are, strictly speaking, beyond the scope of all types of r-UDGM-rooted statistical mechanics, including generalised versions5. For these systems, a d-UDGM-rooted measure or framework is certainly an option and needs further study. However, in living organisms randomness cannot be excluded unless noise or free will30 is. Thus, a UDRM-rooted framework that proceeds from a KC-based general information measure, CL or a possible revised version of it, may have more adaptability to complicated real-world situations than a single-UDGM-rooted one.

In brief, though we can separate complexity from information by using the d-UDGM and the r-UDGM alone, it would be more natural to accept a general measure encompassing H and its d-UDGM counterpart. The behavior of this measure should be as outlined by CL, since the d-UDGM always becomes more and more overwhelming as r increases.