Generalized entropies, density of states, and non-extensivity

The concept of entropy connects the number of possible configurations with the number of variables in large stochastic systems. Independent or weakly interacting variables make the number of configurations scale exponentially with the number of variables, making the Boltzmann–Gibbs–Shannon entropy extensive. In systems with strongly interacting variables, or with variables driven by history-dependent dynamics, this is no longer true. Here we show that, contrary to the generally held belief, not only strong correlations or history-dependence, but also a sufficiently skewed distribution of visiting probabilities, that is, first-order statistics, plays a role in determining the relation between configuration space size and system size, or, equivalently, the extensive form of generalized entropy. We present a macroscopic formalism describing this interplay between first-order statistics, higher-order statistics, and configuration space growth. We demonstrate that knowing any two strongly restricts the possibilities of the third. We believe that this unified macroscopic picture of the mechanisms that constrain emergent degrees of freedom provides a step towards finding order in the zoo of strongly interacting complex systems.


Introduction
Today, with the feedback loop between developing digital technologies and the increasing amount of data collected, there is an ever increasing need and opportunity to understand and control complex biological, social or technological systems [1][2][3][4][5]. The hallmark of such systems is that their global behavior emerges out of a large number of stochastic variables interacting in a non-trivial way [6][7][8][9]. A useful level of description is provided by (generalized) statistical mechanics, an effort to identify relations between relevant observable summary statistics of stochastic dynamics over configuration space. In many cases, the microscopic dynamical rules governing the system are not known; instead, their effect on first-order and higher-order statistics (i.e., visiting probabilities and spatial/temporal correlations) over configuration space forms the basis of understanding. A first classification of all systems is given by the mere size of the configuration space $W$ as a function of the number of microscopic variables $N$. This is also a necessary classification for system-size independent modeling. In the absence of interactions, $W(N)$ grows exponentially, as the joint distribution over microscopic variables factorizes. Non-trivial joint distributions, however, result in non-trivial restrictions on configuration space and possibly non-exponential scaling of $W(N)$. Such systems are labelled non-extensive. In this paper, we factor sources of non-extensivity into first-order and higher-order statistical properties of the joint distribution over microscopic variables. In particular, an effective classification of all higher-order statistics, however complicated they are, has been introduced under the name of generalized entropies. Generalized entropies indirectly model correlations: the specific entropic form that scales with system size (i.e., is extensive, $S \sim N$) tells us which class the system itself belongs to.
The diversity of proposed entropic functionals reflects the conceptual diversity behind the assumptions, all leading, in weakly correlated systems, to the same mathematical form of the Boltzmann-Gibbs-Shannon entropy, $S_{BGS} = \sum_{i=1}^{W} -p_i \ln p_i$. In particular, arguments relying on thermodynamics, statistical mechanics, dynamical systems, information theory, and statistics all provide means to derive $S_{BGS}$ as a useful measure, and they all provide different means to generalize it [2,18,[30][31][32][33][34][35][36][37][38][39]]. Here we do not commit to any of these conceptual frameworks; instead, following the work by Hanel and Thurner in Ref. [30], we rely on an axiomatic characterization of generalized entropies, based on the Shannon-Khinchin axioms SK1-SK4 [40]. Assuming that most of the relevant generalized entropic forms can be written as a sum of a pointwise function $g$ over probabilities, $S_g[p] = \sum_{i=1}^{W} g(p_i)$, with a notable exception being the class of Rényi entropies, $S_{Rényi} = \frac{1}{1-\alpha} \log \sum_{i=1}^{W} p_i^{\alpha}$, axioms SK1-SK4 regarding $S_g[p]$ translate to the language of the entropic kernel $g(p)$. Prescribing all of SK1-SK4 uniquely determines $g$ to be proportional to the Boltzmann-Gibbs-Shannon kernel, $g_{BGS} = -p \ln p$. A surprisingly rich phenomenology of all possible generalizations to non-extensive systems can be achieved by discarding the only SK axiom that prescribes the resulting entropy to be additive, namely, the decomposability axiom SK4, $S[p_{AB}] = \langle S[p_{A|B}] \rangle_B + S[p_B]$, expressing that the entropy of a joint distribution $p_{AB}$ can be decomposed into the expected entropy of the conditional distribution $p_{A|B}$ and the entropy of the marginal $p_B$. Interestingly, assuming equiprobable configurations ($p_i \equiv W^{-1}$), all possible entropies obeying SK1-SK3 follow the asymptotic scaling law

$$\lim_{W \to \infty} \frac{S_g(\lambda W)}{S_g(W)} = \lambda^{1-c} \qquad (2)$$

with Hanel-Thurner (H-T) exponent $0 < c \le 1$ [30]. A particularly simple representative of each possible asymptotic non-extensivity class is given by the one-parameter family of Tsallis entropies [41], $g_q(p) = \frac{p - p^q}{q-1}$, with $g_q$ belonging to the class $c = q$, and approaching $g_{BGS}$ (class $c = 1$) as $q \to 1$.
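The two kernels named above can be compared directly in a few lines. The following is a minimal sketch, assuming only the trace form $S_g[p] = \sum_i g(p_i)$ and the kernels $g_{BGS}$ and $g_q$ as written in the text; the four-state distribution `probs` and all function names are hypothetical and chosen purely for illustration.

```python
import math

def g_bgs(p):
    """BGS kernel g(p) = -p ln p, with the convention g(0) = 0."""
    return -p * math.log(p) if p > 0.0 else 0.0

def g_tsallis(p, q):
    """Tsallis kernel g_q(p) = (p - p**q) / (q - 1), defined for q != 1."""
    return (p - p ** q) / (q - 1.0)

def entropy(probs, kernel):
    """Trace-form entropy S_g[p] = sum_i g(p_i)."""
    return sum(kernel(p) for p in probs)

# Hypothetical four-state distribution, used only for illustration.
probs = [0.5, 0.25, 0.125, 0.125]

s_bgs = entropy(probs, g_bgs)
# Tsallis entropy with q very close to 1 should approach the BGS value.
s_q_near_1 = entropy(probs, lambda p: g_tsallis(p, 1.0 + 1e-6))
print(s_bgs, s_q_near_1)
```

The two printed numbers should agree to several digits, illustrating the limit $g_q \to g_{BGS}$ as $q \to 1$.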
Crucially, for any specific system, the functional form of the generalized entropy that is extensive maps any correlation structure beyond first-order statistics $\{p_i\}$ to its global consequences: the scaling of configuration space $W(N)$ with system size, classified by the H-T exponent $c$. In this paper, we unify this phenomenological description with the effect of first-order statistics to gain a complete picture of sources of non-extensive configuration space growth, factored into first-order and higher-order statistics. In particular, as entropies are invariant to relabeling of states $i \leftrightarrow j$, they only depend on the fraction of states with probability $p$. This probability density over probabilities we call the density of states $\rho(p)$, similarly to statistical physics and condensed matter theory, where densities over log-probabilities play an important role.
The paper is organized as follows. The Results section introduces the general formalism, the subsection Density of states: examples discusses the role of specific densities of states, and the Discussion summarizes the results. Detailed calculations are given in the Supplementary Information S2.

Results
In this section, we develop a simple mathematical framework relating three central concepts of strongly correlated systems: first-order statistics, quantified in terms of the density of states $\rho(p)$; higher-order statistics, modeled by generalized entropic functionals $S_g[p]$; and the scaling of configuration space with system size $W(N)$, classified by the H-T exponent $c$. Figure 1a illustrates the idea.
While doing so, we attempt to provide a step-by-step introduction to the logic of generalized statistical mechanics. Generalized statistical mechanics phenomenologically classifies all conceivable correlated systems by compressing all statistical dependences in the system's stochastic dynamics into one relevant measure: the scaling of configuration space with system size. It uses a reverse logic: starting from the size of the configuration space $W$, through the prescription of extensivity of the system-specific generalized entropic functional, one arrives at the system size $N(W)$ [42,43]. Note that, based on observing (some statistics of) the dynamics over configuration space, this is a meaningful definition of system size, whereas the mere number of variables is not: think of a configuration space defined by many copies of the same variable (i.e., maximal mutual information between them). In order to avoid confusion, from now on we refer to $N$ as effective system size. Inverting (the asymptotics of) $N(W)$ tells us the system's configuration space scaling $W(N)$, classified by the H-T exponent $c$ through Eq. (2). We complement this algorithmic recipe, depicted in Figure 1c, with the missing ingredient, first-order statistics $\rho(p)$, to yield a coherent picture of all possible sources of non-extensivity, factored into first-order and higher-order correlations.
Asymptotically, $W \to \infty$, any generalized entropy $S_g$ can be written as

$$S_g = \sum_{i=1}^{W} g(p_i) = W \int_0^1 \rho(p)\, g(p)\, dp = W \langle g(p) \rangle_\rho \qquad (3)$$

by grouping terms with the same probability in the sum, weighted by the density of states $\rho$, visualized in Figure 1b. Concavity of $g$, along with Jensen's inequality, $\langle g(p) \rangle_\rho \le g(\langle p \rangle_\rho)$, guarantees that $S_g$ is maximal for the uniform distribution over states, $\rho(p) = \delta(p - 1/W)$ (see Supplementary Information S2).
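The grouping argument and the Jensen bound can be checked on a small discrete example. This is a sketch only, assuming the BGS kernel and a hypothetical nine-state distribution; `entropy_grouped` treats the empirical fraction of states at each probability value as a discrete stand-in for the density of states.

```python
import math
from collections import Counter

def g_bgs(p):
    """BGS kernel g(p) = -p ln p, with g(0) = 0."""
    return -p * math.log(p) if p > 0.0 else 0.0

def entropy_direct(probs, g):
    """S_g as a plain sum over states."""
    return sum(g(p) for p in probs)

def entropy_grouped(probs, g):
    """S_g = W * <g(p)>, averaging g over the fraction of states at each p."""
    W = len(probs)
    counts = Counter(probs)  # number of states sharing each probability value
    return W * sum((n / W) * g(p) for p, n in counts.items())

# Hypothetical distribution over W = 9 states (sums to 1).
probs = [0.4] + [0.1] * 4 + [0.05] * 4
W = len(probs)

s_direct = entropy_direct(probs, g_bgs)
s_grouped = entropy_grouped(probs, g_bgs)
s_uniform = W * g_bgs(1.0 / W)  # microcanonical value, equal to ln W

print(s_direct, s_grouped, s_uniform)
```

The direct and grouped sums coincide, and both stay below the uniform value $W g(1/W) = \ln W$, as the Jensen bound requires for a concave kernel.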
The density of states cannot be arbitrary, however: it is constrained by the normalization condition on $p$, that is, the expected value of $p$ under $\rho$ is fixed to be $1/W$, decreasing the dimension of the parameter space of $\rho$ by 1.
When needed, we emphasize this constraint by explicitly writing $\rho(p|W)$. An additional technical difficulty stems from the fact that the support of $\rho$ is bounded. In this paper, we choose densities of states that are bounded on $[0, 1]$ but approach well-known distributions over a semi-infinite support as $W \to \infty$, and consequently, as $\langle p \rangle \to 0$.
The particular forms of $\rho$ we consider, with detailed calculations in the subsection Density of states: examples and in the Supplementary Information S2, are i) a delta function $\rho(p) = \delta(p - 1/W)$, corresponding to a uniform distribution over states (microcanonical, MC), ii) a combination of multiple delta functions, describing multiple uniform domains in configuration space, possibly scaling differently with configuration space size $W$ (multi-delta, MD), iii) a case in which a single state has macroscopic (non-vanishing) probability in the limit $W \to \infty$ and the probabilities of all other states are equal (Bose-Einstein, BE), iv) an exponential density of states (exponential), v) a log-gamma density of states that approaches log-normal (log-gamma), vi) one that is a power law with exponent approaching $-1$ (power law), and vii) a two-parameter family over $[0, 1]$, the beta distribution, where we tune the one remaining free parameter to achieve a power-law tail with a tunable exponent (beta).

Configuration space scaling
A simple classification of all correlated systems can be given by assessing how the associated extensive generalized entropy ($S_g \sim N$) reacts to configuration space rescaling, $W \to \lambda W$. This idea, consistent with the Shannon-Khinchin axiomatic foundations of information, is articulated in terms of the Hanel-Thurner (H-T) exponent $c$, defined in Eq. (2).

Figure 1: a) Generalized entropies $S_g$ (e.g., $g_{BGS} = -p \ln p$, $g_q = (p - p^q)/(q - 1)$), providing a phenomenological classification of higher-order statistics over configuration space, density of states $\rho(p)$, summarizing first-order statistics over configuration space, and configuration space scaling with system size $W(N)$. Knowledge of any two strongly restricts the possibilities for the third. b) Computation of entropy $S_g$, corresponding to the shaded area, based on kernel $g$ and density of states $\rho$. In this example, $g = g_{BGS} = -p \ln p$, $\rho(p)$ is exponential, and the size of the configuration space is $W = 10^6$, corresponding to $\langle p \rangle = W^{-1} = 10^{-6}$. c) Computational steps relating generalized entropy $S_g$, density of states $\rho$, configuration space scaling $W(N)$, and Hanel-Thurner exponent $c$. Density of states and generalized entropies summarize first-order and higher-order statistics over configuration space, respectively, whereas configuration space scaling and the H-T exponent classify complex systems based on how available configuration space scales with effective system size $N$. Note that the starting point is the size of configuration space $W$; effective system size $N$ is determined by leveraging extensivity of the system-specific generalized entropic form.
Here we generalize H-T scaling to systems with arbitrary (not necessarily uniform) visiting probabilities over states as

$$R_\lambda = \lim_{W \to \infty} \frac{S_g[\rho(p|\lambda W)]}{S_g[\rho(p|W)]} \sim \lambda^{1-c}. \qquad (5)$$

H-T scaling in the case of a uniform distribution over configuration space is recovered by setting $\rho(p|W) = \delta(p - 1/W)$, simplifying Eq. (5) to $R_\lambda = \lim_{W \to \infty} \lambda\, g\!\left(\frac{1}{\lambda W}\right) / g\!\left(\frac{1}{W}\right) \sim \lambda^{1-c}$, implying that $g(p)$ scales as $p^c$ as $p \to 0^+$. For example, $g_{BGS} = -p \ln p \sim p$ (up to a logarithmic factor), and $g_q = \frac{p - p^q}{q-1} \sim p^q$ if $0 < q \le 1$, in agreement with $S_{BGS}$ and $S_q$ belonging to H-T class $c = 1$ and $c = q$, respectively. In a general framework, however, non-extensivity, classified by the H-T exponent $c$, depends on both first-order and higher-order statistics, accounted for by $\rho$ and $g$ jointly. In particular, as the density of states $\rho$ broadens (while still obeying the expected value constraint $\langle p \rangle = W^{-1}$), contributions to the total entropy come from configurations with a wider range of probabilities. Figure 2b shows the contribution of configurations with probability $p < r$ to the total entropy, which we call cumulative entropy $\Pi_g(r)$, for specific combinations of density of states $\rho$ and entropy kernels $g$.
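For the uniform case, the exponent can be estimated at finite $W$ from the ratio $\lambda\, g(1/(\lambda W)) / g(1/W)$ quoted in the text. The sketch below assumes that ratio and the two kernels as given; the function name `ht_exponent_uniform` and the choices $W = 10^{12}$, $\lambda = 2$ are illustrative assumptions, not values from the paper.

```python
import math

def g_bgs(p):
    """BGS kernel g(p) = -p ln p."""
    return -p * math.log(p)

def g_tsallis(p, q):
    """Tsallis kernel g_q(p) = (p - p**q) / (q - 1)."""
    return (p - p ** q) / (q - 1.0)

def ht_exponent_uniform(g, W, lam=2.0):
    """Finite-W estimate of c from R = lam * g(1/(lam*W)) / g(1/W) ~ lam**(1 - c)."""
    ratio = lam * g(1.0 / (lam * W)) / g(1.0 / W)
    return 1.0 - math.log(ratio) / math.log(lam)

W = 1e12
c_tsallis = ht_exponent_uniform(lambda p: g_tsallis(p, 0.5), W)
c_bgs = ht_exponent_uniform(g_bgs, W)
print(c_tsallis, c_bgs)
```

The Tsallis estimate sits very close to $c = q$, whereas the BGS estimate approaches $c = 1$ only logarithmically slowly in $W$, which is worth keeping in mind when reading finite-size numerics.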
How does broadening of $\rho$ affect the scaling of available configuration space $W(N)$, and its classification, given by the H-T exponent $c$? In the following, we perform exact calculations following the steps illustrated in Figure 1c, for specific densities of states $\rho$ and entropy kernels $g$. We summarize the results in Table 1 for BGS entropy and in Table 2 for Tsallis entropies.

Density of states: examples
In order to keep track of the consequences of changing the density of states alone, we introduce the following nomenclature. We refer to the scaling of $S_g = W \langle g(p) \rangle_\rho$ with $W$ as regular if it agrees asymptotically with the microcanonical scaling of $S_g[\delta]$ up to a constant $C_1 \neq 0$ (Eq. (7)), where $\delta$ denotes uniform configuration probabilities over the sample space (microcanonical ensemble).
Otherwise, the scaling is referred to as anomalous.

Uniform distribution over configuration space (microcanonical)
The simplest example is associated with the microcanonical ensemble, whose density of states is written as $\rho(p) = \delta(p - 1/W)$. This form enables us to analytically express all forms of generalized entropies as

$$S_g = W g(1/W). \qquad (8)$$

Based on that, e.g., the $S_{BGS} \sim \ln W$ and $S_q \sim \frac{1 - W^{1-q}}{q-1}$ dependence for the Boltzmann-Gibbs-Shannon and Tsallis entropies can be shown in a straightforward manner. Hence, extensivity of $S_{BGS}$ is ensured by imposing exponential configuration space scaling $W(N) \sim e^N$, whereas $S_q$ is extensive under $W(N) \sim N^{\frac{1}{1-q}}$ (for further details see Table 1 and Table 2).
As the upper bound of $S_q$ is always realized by the microcanonical ensemble, this case corresponds to maximal disorder, in line with the SK2 maximality axiom.
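The reverse logic of the microcanonical case, from $W$ to $N(W)$ and back to $W(N)$, can be written out explicitly. This is a minimal sketch assuming $N = S_g$ exactly (proportionality constant set to one, which is an assumption made here for simplicity); the function names are hypothetical.

```python
import math

def n_of_w_bgs(W):
    """Effective system size N = S_BGS = ln W for uniform probabilities."""
    return math.log(W)

def w_of_n_bgs(N):
    """Inverse of n_of_w_bgs: exponential configuration space growth."""
    return math.exp(N)

def n_of_w_tsallis(W, q):
    """N = S_q = (1 - W**(1-q)) / (q - 1) for uniform probabilities, q < 1."""
    return (1.0 - W ** (1.0 - q)) / (q - 1.0)

def w_of_n_tsallis(N, q):
    """Inverse of n_of_w_tsallis: W = (1 + (1-q) N)**(1/(1-q)) ~ N**(1/(1-q))."""
    return (1.0 + (1.0 - q) * N) ** (1.0 / (1.0 - q))

W, q = 1e6, 0.5
N_bgs = n_of_w_bgs(W)
N_q = n_of_w_tsallis(W, q)
print(N_bgs, w_of_n_bgs(N_bgs))
print(N_q, w_of_n_tsallis(N_q, q))
```

For large $N$ the Tsallis inverse is dominated by the power law $N^{1/(1-q)}$, matching the scaling stated in the text.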

Multiple uniform domains over configuration space (multi-delta)
As a straightforward generalization of the classical microcanonical ensemble, let us discuss a system whose phase space is decomposable into $k + 1$ disjoint sub-domains (e.g., $k + 1$ different sets of configurations). The volumes of these sub-domains, denoted by $V_0(N), V_1(N), \dots, V_k(N)$, might scale differently with the size of the system. We additionally assume that $V_0(N)$ describes the scaling of a forbidden region, and that each configuration belonging to the $n$-th set occurs with a probability proportional to the size of this set, $\sim V_n$. The DOS with the correct pre-factors is then given by Eq. (9). In the most trivial case of $k = 1$, Eq. (9) reduces to $\rho(p) = \delta\!\left(p - \frac{1}{W}\right)$ (microcanonical). Under this $\rho$, generalized entropies take the form of Eq. (10), where $V^* = \max_j V_j$ defines the asymptotically leading term. In spite of the obvious analogy between Eq. (8) and the last part of Eq. (10), the two expressions can differ considerably, e.g., in terms of non-extensivity classes. In the next paragraph, we provide a higher-resolution picture of entropies defined under the multi-delta density.
The entropy form in Eq. (10) implies that if a particular region $V_0$ of the configuration space is forbidden, or the regions $V_{i=1,\dots,k} \neq V^*$ are rarely visited, then the generalized entropies are principally determined by the behaviour of the dominant term $V^*$, which has already been discussed in various contexts, such as in Refs. [9,43]. This additionally implies that extensivity is entirely encoded in the specific dependence $V^* = V^*(W)$. For simplicity, hereinafter we assume that the volume of the leading sub-domain scales with the volume of the entire configuration space as $V^*(W) \sim W^{\xi}$, yielding Eq. (11), where $0 < \xi \le 1$ for obvious reasons (see Supplementary Information S2). For the linear case ($\xi = 1$) regular sample space scaling is recovered [30]; sub-linear dependence, however, can have very interesting consequences in terms of the H-T scaling given in Eq. (2). Plugging $V^*(W) \sim W^{\xi}$ into Eq. (5), i.e., keeping track of how entropies change under rescaling of the configuration space volume, we obtain $R_\lambda \sim \lambda^{1-c}$ (Eq. (12)), where $c = 1 - \xi + \xi q$. Compared to the microcanonical ensemble, where the Tsallis entropy $S_q$ is extensive under $W(N) \sim N^{\frac{1}{1-q}}$, extensivity here requires $W(N) \sim N^{\frac{1}{1-c}} = N^{\frac{1}{\xi(1-q)}}$. For a detailed comparison of the scaling exponents under different forms of $\rho$, see Table 1 and Table 2.
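The relation $c = 1 - \xi + \xi q$ can be checked numerically under a simplifying assumption. The sketch below assumes that the Tsallis entropy is given entirely by the dominant sub-domain, $S_q \approx V^* g_q(1/V^*)$ with $V^* = W^{\xi}$ exactly; this dominant-term form is an assumption made here for illustration and is not the full expression of Eq. (10). All names and parameter values are hypothetical.

```python
import math

def s_tsallis_dominant(W, q, xi):
    """Dominant-term Tsallis entropy V* g_q(1/V*) with V* = W**xi (assumed form)."""
    v_star = W ** xi
    return (1.0 - v_star ** (1.0 - q)) / (q - 1.0)

def ht_exponent(entropy_of_w, W, lam=2.0):
    """Finite-W estimate of c from S(lam*W) / S(W) ~ lam**(1 - c)."""
    ratio = entropy_of_w(lam * W) / entropy_of_w(W)
    return 1.0 - math.log(ratio) / math.log(lam)

q, xi = 0.5, 0.5
c_est = ht_exponent(lambda W: s_tsallis_dominant(W, q, xi), W=1e24)
c_formula = 1.0 - xi + xi * q
print(c_est, c_formula)
```

Under this assumption the numerical estimate and the formula agree, and setting `xi = 1.0` recovers the microcanonical class $c = q$.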

A single macroscopic state (Bose-Einstein)
In this example we consider a system comprising a single state with a macroscopically large probability, $p_1 = 1 - \frac{1}{W-1}$, together with $W - 1$ further states of equal probability $p_{i=2,\dots,W}$, fixed by normalization. Based on the analogy with a Bose-Einstein condensate, we refer to this system as Bose-Einstein, and write the corresponding density of states as in Eq. (13). Using that, generalized entropies can be expressed analytically, Eq. (14). Note that Eq. (14) differs greatly from its microcanonical analogue, Eq. (8), revealing that even a few configurations with macroscopic probabilities can significantly alter the behaviour that we would predict based on the microcanonical ensemble.
A system characterized by the density of states given in Eq. (13) displays strong heterogeneity in its configurations, as the variance of configuration probabilities decays linearly with the expected value, $\sigma^2 \sim \langle p \rangle \sim W^{-1}$ as $W \to \infty$, contrary to the microcanonical picture, where $\sigma^2 \sim 0 \sim W^{-\infty}$. Due to the presence of macroscopically large weights, entropic forms have non-zero contributions coming from $p \approx 1$; therefore H-T scaling is given by $R_\lambda \sim \lambda^{1-c}$, where $c = 2$ for BGS and $c = 2q$ for Tsallis entropies (see Supplementary Information S2). Note that in both cases the exponent $c$ is greater than the one corresponding to the microcanonical picture. Hence, this example reveals how H-T scaling changes as $\rho(p)$ broadens. As another consequence, under this density of states, the Tsallis entropy with deformation parameter $q < \frac{1}{2}$ is extensive if $W(N) \sim N^{\frac{1}{1-2q}}$; for $q > \frac{1}{2}$, however, extensivity of $S_q$ surprisingly requires $W(N)$ to be monotonically decreasing (see Table 1 and Table 2).
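The exponents $c = 2$ and $c = 2q$ can be probed numerically by summing the entropy over the two groups of states. This sketch assumes $p_1 = 1 - \frac{1}{W-1}$ and that the remaining $W - 1$ states share the leftover probability equally, so each has probability $\frac{1}{(W-1)^2}$; the second value is an inference from normalization made here, not a formula quoted from the paper. Function names and the chosen $W$ are hypothetical.

```python
import math

def g_bgs(p):
    """BGS kernel g(p) = -p ln p."""
    return -p * math.log(p)

def g_tsallis(p, q):
    """Tsallis kernel g_q(p) = (p - p**q) / (q - 1)."""
    return (p - p ** q) / (q - 1.0)

def be_entropy(W, g):
    """S_g for one macroscopic state plus W - 1 equiprobable states (assumed form)."""
    x = 1.0 / (W - 1.0)
    p_macro = 1.0 - x
    p_rest = x / (W - 1.0)  # leftover probability x shared by W - 1 states
    return g(p_macro) + (W - 1.0) * g(p_rest)

def ht_exponent(entropy_of_w, W, lam=2.0):
    """Finite-W estimate of c from S(lam*W) / S(W) ~ lam**(1 - c)."""
    ratio = entropy_of_w(lam * W) / entropy_of_w(W)
    return 1.0 - math.log(ratio) / math.log(lam)

c_bgs = ht_exponent(lambda W: be_entropy(W, g_bgs), W=1e12)
c_q = ht_exponent(lambda W: be_entropy(W, lambda p: g_tsallis(p, 0.3)), W=1e8)
print(c_bgs, c_q)
```

The Tsallis estimate lands on $2q$ almost exactly, while the BGS estimate approaches 2 from below with logarithmic corrections.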

Exponential density of states (exponential)
Another characteristic type of decay can be formulated through an exponential density of states, which is simply expressed as $\rho(p) = W e^{-Wp}$. Although this density function satisfies the normalization condition only in the asymptotic sense, i.e., $\int_0^1 \rho(p)\, dp = 1 + O(e^{-W})$, the corresponding approximation error decays rapidly with $W$. Hence, the expected value of the above exponential form can safely be approximated by $\langle p \rangle = \frac{1}{W} + O(e^{-W})$, in accordance with the constraints imposed in the Results section. Closed-form expressions for the Boltzmann-Gibbs-Shannon entropy and for the Tsallis entropy with deformation parameter $q < 1$ follow in this case once asymptotically vanishing terms are neglected (see Supplementary Information S2). Although the aforementioned density of states satisfies the necessary conditions only asymptotically, the closely related form $\rho(p) = (W-1)(1-p)^{W-2}$, which behaves as $W e^{-Wp}$ for $p \to 0$, has the nice property of exactly fulfilling both the normalization and the expected value constraints. For example, the BGS entropy in this case can be written in terms of $\psi(x) = \frac{d \ln \Gamma(x)}{dx}$, the digamma function.
The results above show that an exponentially decaying $\rho(p)$ always results in regular scaling. This practically implies that all the results obtained for the microcanonical ensemble, including, e.g., the H-T scaling properties, can automatically be extended to the exponential case without any modifications (see Table 1 and Table 2).
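Regular scaling under the exponential density can be illustrated by direct numerical integration. The sketch below assumes $\rho(p) = W e^{-Wp}$ and the BGS kernel, and uses the substitution $t = Wp$, which turns $W \int_0^1 \rho\, g\, dp$ into $\int_0^W t e^{-t} (\ln W - \ln t)\, dt$. The comparison value $\ln W + \gamma - 1$ follows from the standard integral $\int_0^\infty t \ln t\, e^{-t} dt = 1 - \gamma$ and is stated here as a check, not as a formula taken from the paper; the midpoint rule, the cutoff `t_max`, and all names are illustrative choices.

```python
import math

EULER_GAMMA = 0.5772156649015329

def s_bgs_exponential(W, t_max=60.0, n=200_000):
    """Midpoint-rule estimate of W * int_0^1 W exp(-W p) * (-p ln p) dp.

    With t = W p the integrand becomes t * exp(-t) * (ln W - ln t) on [0, W];
    the range is truncated at t_max, beyond which exp(-t) is negligible.
    """
    upper = min(W, t_max)
    h = upper / n
    log_w = math.log(W)
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += t * math.exp(-t) * (log_w - math.log(t))
    return total * h

s_small = s_bgs_exponential(1e3)
s_large = s_bgs_exponential(1e6)
print(s_small - math.log(1e3), s_large - math.log(1e6), EULER_GAMMA - 1.0)
```

The offset from $\ln W$ is the same constant at both sizes, so the entropy scales as $\ln W$, exactly as in the microcanonical case.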

Log-gamma density of states (log-gamma, limiting log-normal)
A density of states that approaches log-normal when $W \to \infty$, yet is defined on the bounded support $[0, 1]$, can be defined as the distribution of the product of independent uniform variables on $[0, 1]$. We call this distribution log-gamma, since in log-transformed variables it is a sum of exponentially distributed variables, i.e., a gamma distribution. As the number of terms in the sum approaches infinity, gamma approaches normal; consequently, as the number of terms in the product approaches infinity, log-gamma approaches log-normal (for details, see [44]). Its tail on a log-log scale (approaching a quadratic function, interpolating between an exponential and a power-law tail) is shown in Figure 2a. The log-gamma density of states is defined as [44]

$$\rho(p) = \frac{(-\ln p)^{w}}{\Gamma(w+1)},$$

where $w = \log_2(W) - 1$. Under this form of $\rho$, the BGS entropy (see Supplementary Information S2) shows regular (logarithmic) scaling with $W$. In contrast, the Tsallis entropy surprisingly shows anomalous scaling for $q \neq 1$, with $R_\lambda \sim \lambda^{1-c}$, where $c = \log_2(1+q) \neq q$. On the one hand, this means that the microcanonical and log-gamma densities of states are practically indistinguishable from the point of view of BGS entropy; on the other hand, it also implies that the H-T exponent $c = \log_2(1+q)$ associated with Tsallis entropies and their deformation parameter $q$ do not coincide anymore. Consequently, under the log-gamma density of states, extensivity of $S_q$ can be obtained for systems where $W(N) \sim N^{\frac{1}{1-c}}$ with $c = \log_2(1+q)$. In Figure 2c we show how $S_q$ scales with $W$ as a function of the deformation parameter for both microcanonical and log-gamma densities of states.
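The exponent $c = \log_2(1+q)$ can be verified from moments alone. This sketch assumes that $p$ is distributed as the product of $n = \log_2 W$ independent uniform variables on $[0, 1]$, for which $\langle p^s \rangle = (1+s)^{-n}$; that moment formula is a standard property of uniform variables, and the identification $n = w + 1$ is an assumption consistent with $w = \log_2(W) - 1$. Function names are hypothetical.

```python
import math

def s_tsallis_loggamma(n, q):
    """S_q = W (<p> - <p**q>) / (q - 1) for W = 2**n, using <p**s> = (1+s)**(-n)."""
    W = 2.0 ** n
    mean_p = 2.0 ** (-n)        # equals 1/W, so the mean constraint holds exactly
    mean_pq = (1.0 + q) ** (-n)
    return W * (mean_p - mean_pq) / (q - 1.0)

def s_bgs_loggamma(n):
    """S_BGS = W <-p ln p> = n / 2, from d/ds (1+s)**(-n) at s = 1."""
    return n / 2.0

def ht_exponent_doubling(q, n):
    """Estimate c from S_q(2W) / S_q(W) ~ 2**(1 - c), i.e. n -> n + 1."""
    ratio = s_tsallis_loggamma(n + 1, q) / s_tsallis_loggamma(n, q)
    return 1.0 - math.log2(ratio)

q = 0.5
c_est = ht_exponent_doubling(q, n=60)
print(c_est, math.log2(1.0 + q), s_bgs_loggamma(60))
```

The BGS entropy grows as $\frac{1}{2}\log_2 W$, i.e., logarithmically, while the Tsallis exponent deviates from $q$, as the text describes.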

Power law density of states (power law)
Power-law-like decay offers a much larger heterogeneity over the configuration space compared to, e.g., the exponential density of states, and therefore might dramatically alter the dependence of entropies upon $W$. To illustrate this, let us first define the corresponding density of states as

$$\rho(p) = \frac{1}{W-1}\, p^{-1 + \frac{1}{W-1}}.$$

Surprisingly, this slow decay, characterized by the strong inhomogeneity of probabilities over the configuration space, yields an anomalous scaling of both BGS and Tsallis entropy, Eq. (23). Note that BGS and Tsallis entropies remain finite even in the limit $W \to \infty$; consequently, they cannot be made extensive anymore, which precludes a thermodynamic description of the corresponding systems. In general, we suggest that the impossibility of extensivity indicates a lack of disorder and uncertainty.
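Saturation of both entropies can be seen from closed-form moments of a power-law density. The sketch below assumes $\rho(p) = A p^{m}$ on $[0, 1]$ with $A = \frac{1}{W-1}$ and $m = -1 + \frac{1}{W-1}$ (the substitutions named in the supplementary material), and uses the elementary integrals $\int_0^1 p^{k} dp = \frac{1}{k+1}$ and $\int_0^1 p^{k} (-\ln p)\, dp = \frac{1}{(k+1)^2}$. Function names and the tested sizes are hypothetical.

```python
def power_law_summary(W, q):
    """Normalization, mean, S_BGS and S_q for rho(p) = A * p**m (assumed form)."""
    A = 1.0 / (W - 1.0)
    m_plus_1 = 1.0 / (W - 1.0)  # m + 1, kept separate to avoid cancellation

    def moment(s):
        """<p**s> = A / (m + s + 1)."""
        return A / (m_plus_1 + s)

    norm = moment(0.0)
    mean = moment(1.0)
    s_bgs = W * A / (m_plus_1 + 1.0) ** 2           # W * <-p ln p>
    s_q = W * (moment(1.0) - moment(q)) / (q - 1.0)  # W * <(p - p**q)/(q-1)>
    return norm, mean, s_bgs, s_q

for W in (1e3, 1e6, 1e9):
    norm, mean, s_bgs, s_q = power_law_summary(W, q=0.5)
    print(W, norm, mean * W, s_bgs, s_q)
```

Under these assumptions $S_{BGS}$ tends to 1 and $S_q$ tends to $1/q$ as $W$ grows, so neither can be made proportional to an unbounded $N$.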

Beta density of states (beta)
A particularly interesting example of first-order statistics is the normalized Beta distribution, representing a family of continuous probability density functions parametrized by two positive shape parameters $a$ and $b$, written as

$$\rho(p) = \frac{p^{a-1} (1-p)^{b-1}}{B(a, b)}, \qquad (24)$$

where $B(a, b)$ denotes the Beta function. In order to satisfy the previously discussed expected value constraint, the shape parameters should be algebraically related to the sample space volume as $\langle p \rangle = \frac{a}{a+b} = \frac{1}{W}$, therefore $b = a(W - 1)$. According to this relation, either $a$ or $b$ can be chosen arbitrarily, and the other shape parameter then follows from it. Under these settings the BGS entropy can simply be expressed as

$$S_{BGS} = \psi(aW + 1) - \psi(a + 1), \qquad (25)$$

where $\psi(x) = \frac{d \ln \Gamma(x)}{dx}$ denotes the digamma function. When $a$ is in the regime $a \sim \frac{1}{W-1} \ll 1$, the distribution becomes extremely skewed, resulting in anomalous scaling, whereas larger values of $a$ yield regular $S_{BGS} \sim \ln W$ scaling. The Tsallis entropy can likewise be expressed in closed form (see Supplementary Information S2), which again results in anomalous scaling for $a \sim \frac{1}{W-1}$. According to the above, first-order statistics following the Beta distribution can lead to substantially different types of configuration space scaling depending on the specific choice of the shape parameter $a = a(W)$. This can be seen by letting $b \approx aW$ and then using the approximate form of Eq. (24), given in Eq. (28), with the free parameter $a(W)$ controlling the broadness of the distribution. By tuning the shape parameter in Eq. (28), we can smoothly interpolate between an exponential and a power-law-like $\rho(p)$, and in parallel keep track of how our system undergoes a transition from a highly disordered state (characterized by an exponential density) to the ordered regime (characterized by a power-law density).
We proceed with a detailed description of the above transition. For simplicity, here we restrict our analysis solely to the BGS entropy; the results can, however, be extended straightforwardly to the Tsallis entropy. If $a(W) > \frac{1}{W-1}$, the first term in Eq. (25) dominates the second term, and based on the asymptotic expansion of the digamma function we recover the regular $S_{BGS} \approx \psi(aW + 1) \sim \ln W$ scaling. If, however, $a(W) < \frac{1}{W-1}$, the density of states takes a power-law-like form, and the system is driven into a highly ordered configuration. The very existence of this transition between the two regimes is mathematically encoded in the properties of the digamma function. At the transition line $a(W) = \frac{1}{W-1}$ between the two regimes, the BGS entropy converges to a finite value as $W \to \infty$. Below this line the BGS entropy cannot satisfy extensivity. In slightly more intuitive terms, the distribution of the configuration probabilities becomes so extremely skewed in this regime that practically no uncertainty appears in the system description, which therefore cannot be made extensive. In Figure 2d we illustrate the scaling of the BGS entropy in multiple different cases.
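The two regimes of the Beta density can be compared numerically. This sketch assumes the closed form $S_{BGS} = \psi(aW+1) - \psi(a+1)$, which follows from the standard Beta identity $\langle p \ln p \rangle = \frac{a}{a+b}\left[\psi(a+1) - \psi(a+b+1)\right]$ together with $b = a(W-1)$. Since the Python standard library has no digamma function, a small helper based on the recurrence $\psi(x) = \psi(x+1) - 1/x$ and the usual asymptotic series is included; the helper and all names are illustrative, not the authors' implementation.

```python
import math

def digamma(x):
    """Digamma psi(x) for x > 0: shift x upward, then use the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def s_bgs_beta(a, W):
    """BGS entropy under a Beta(a, a(W-1)) density of states (assumed closed form)."""
    return digamma(a * W + 1.0) - digamma(a + 1.0)

for W in (1e3, 1e6, 1e9):
    regular = s_bgs_beta(1.0, W)                  # a fixed: grows like ln W
    transition = s_bgs_beta(1.0 / (W - 1.0), W)   # a = 1/(W-1): stays finite
    print(W, regular - math.log(W), transition)
```

With $a = 1$ the offset from $\ln W$ settles to a constant (regular scaling), whereas on the transition line the entropy stays below 1 for all $W$.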
Table 1: Scaling relations between BGS entropy $S_{BGS} \sim W \langle -p \ln p \rangle$, configuration space size $W$, and effective system size $N$, along with Hanel-Thurner exponent $c$, summarizing the results of the paper. Detailed calculations, following the outline shown in Figure 1c, are given in the Results section and in the Supplementary Information S2.

Discussion
Any dynamics, microscopic and coarse-grained, transient and stationary, takes place in the space of all configurations of a system. The size of the configuration space, $W$, is controlled by the system size $N$. For systems composed of weakly interacting variables, $W$ scales exponentially with $N$; the two provide equivalent descriptions of system size. System-size independent (statistical) models can be formulated in terms of homogeneous (e.g., intensive or extensive) functions of $N$, without paying much attention to the fact that dynamics actually takes place on configuration space.
In complex systems of many strongly interacting variables, this is no longer true. The size of the configuration space might scale non-exponentially with system size, depending on multiple, not yet fully understood facets of correlated, history-dependent dynamics. A macroscopic attempt to extend ideas of statistical mechanics, in particular, the idea of size-invariance, formulated in terms of well-defined thermodynamic limits, to such complex systems can be built on the concept of generalized entropies $S_g$. Generalized entropies connect system size $N$ with configuration space size $W$ by implicitly re-defining system size as proportional to the generalized entropy characteristic of the system. Generalized entropies are, therefore, extensive by definition, i.e., $S_g \sim N$. Note that this inverse logic is necessary, as the space of dynamics is the configuration space; system size, in terms of which homogeneous functions and system-size invariant models can be formulated, is auxiliary.

Figure 2: [...] c) [...] Eq. (5), of Tsallis entropies for the microcanonical ensemble and log-gamma (limiting log-normal) density of states. Although the H-T exponent of BGS entropy ($q = 1$) is invariant to changing the density of states from microcanonical to log-gamma, this is no longer true for Tsallis entropies ($q < 1$), indicating that the non-extensivity class of any system is jointly determined by its extensive generalized entropy and the system's density of states. d) Scaling of BGS entropy $S_{BGS}$ with configuration space size $W$ when the system's density of states follows a beta distribution with a power-law tail, characterized by $a(W)$. Note that in systems where $a$ decays as an inverse power of $W$ with exponent at least 1, BGS entropy converges to a finite value in the thermodynamic limit $W \to \infty$.
Table 2: Scaling relations between Tsallis entropies $S_q \sim W \left\langle \frac{p - p^q}{q-1} \right\rangle$, configuration space size $W$, and effective system size $N$, along with Hanel-Thurner exponent $c$, summarizing the results of the paper. Detailed calculations, following the outline shown in Figure 1c, are given in the Results section and in the Supplementary Information S2.
Generalized entropies account for higher-order statistics of the dynamics over configuration space, resulting in history-dependence (mathematically formulated as, e.g., non-ergodicity), or correlated visiting probabilities. What they do not describe is the visiting probabilities themselves, reflected by the fact that $S_g$ takes these visiting probabilities as arguments.
Conventional statistical mechanics suggests that changing visiting probabilities, i.e., the distribution over configurations, does not alter the (asymptotic) relation between system size and configuration space size: the Boltzmann-Gibbs-Shannon entropy $S_{BGS}$ is extensive in all known ensembles.
In this paper, we show that this is not the case. First-order statistics of visiting probabilities, formulated in terms of the density of states $\rho$, might very well change the relation between $N$ and $W$. We identify three classes of density of states. Class i) does not change the asymptotic scaling of $W(N)$ compared to the uniform (microcanonical) distribution over configuration space, $\rho = \delta(p - W^{-1})$. These densities of states we call regular, as defined in Eq. (7).
Class ii) entails densities of states that change $W(N)$ asymptotically (i.e., they are anomalous, according to Eq. (7)), yet an extensive generalized entropy can still be assigned. This extensive generalized entropy is necessarily different from the one that is extensive for the microcanonical case. The appropriate generalized entropy is thus dependent on the visiting probabilities, e.g., on the thermodynamic ensemble in equilibrium statistical physics. This is observed for a $\rho$ corresponding to a system with a microstate with macroscopic probability (that we call Bose-Einstein, BE) in both weakly and strongly correlated systems, modeled by $S_{BGS}$ and by the family of Tsallis entropies, $S_q$, respectively. First-order statistics also modify the non-extensivity scaling $W(N)$ in strongly correlated systems in the case of a limiting log-normal $\rho$, and of a $\rho$ corresponding to multiple uniform domains in configuration space, whereas they do not modify $W(N)$ in weakly correlated systems.
The third class describes densities of states that are so fat-tailed that $W(N)$ saturates, and therefore no extensive entropy can be assigned to such systems. We find that systems with a power-law density of states, with exponent approaching $-1$ in the limit $W \to \infty$, belong to this class, regardless of their higher-order statistics. Such systems have non-zero visiting probability at all configurations, yet these visiting probabilities are so skewed that both $S_{BGS}$ and $S_q$ saturate at a finite value when $W \to \infty$.
We believe that this unified macroscopic picture of strongly interacting and history-dependent processes, based on first and higher order statistics of visiting probabilities over configuration space, makes statistical mechanics a more flexible tool for modeling complex systems both within and outside the realm of physics.
[...] based on which we obtain the corresponding expression for $0 < q < 1$. Assuming the scaling form $V^*(W) \sim W^{\xi}$, the previous equation can be further simplified to $R_\lambda \sim \lambda^{1-c}$, where $c = 1 - \xi + \xi q$, in agreement with the corresponding row in Table 2.
s3 A single macroscopic state (Bose-Einstein)
Owing to the presence of a macroscopic state, H-T scaling, and thus the conditions of extensivity, change substantially under the Bose-Einstein density of states. This stems from the fact that states with probabilities $p \approx 1$ have a non-negligible contribution to the entropy compared to, e.g., the microcanonical picture, where the major contribution comes from $p = 1/W \approx 0$. The corresponding density of states, written in the form of Eq. (13), fulfills both the normalization and the expected value constraints since the following identities hold Plugging Eq. (13) into Eq. (3) and exploiting the fact that with $\delta(x)$ being the Dirac delta function, we arrive at Specifically, for the BGS entropy we obtain With the notations $x = 1/(W-1)$ and $z = 1/\lambda$, the generalized exponent $c$ for the BGS entropy is given by directly plugging Eq. (S11) into Eq. (5), yielding where $c = 2$. Interestingly, extensivity of the BGS entropy can be maintained by imposing $W(N) \sim e^{-W_{-1}(-N/2)}$, with $W_{-1}$ denoting the lower branch of the Lambert $W$ function. This form of $W(N)$ is a decreasing function of $N$, implying that asymptotically, as $N \to \infty$, the configuration space has to collapse for extensivity to be satisfied. Under the Bose-Einstein density of states, the Tsallis entropy reads Based on Eq. (S14) and Eq. (S6), the generalized H-T scaling takes the form of where $c = 2q$. For $q = 1/2$, $S_q \to 1$; therefore the Tsallis entropy cannot be made extensive. For $q \in (0, 1/2)$, extensivity is obtained by setting $W(N) \sim N^{1/(1-2q)}$. If, however, $q > 1/2$, then $S_q \sim N$ requires $W(N)$ to decrease with $N$.
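The step from the H-T exponent to the required $W(N)$ can be spelled out as follows. This is only a sketch under two assumptions that are not written out explicitly in the text above: that to leading order the Tsallis entropy scales as $S_q \propto W^{1-c}$ with $c = 2q$, and that the BGS entropy behaves as $S_{\mathrm{BGS}} \simeq 2 \ln W / W$, which is the form consistent with $c = 2$ and with the quoted Lambert-function expression.

```latex
% Assumed leading-order forms (illustrative, not taken verbatim from the source):
%   S_q(W) \propto W^{1-2q},   S_BGS(W) \simeq 2 \ln W / W.
\begin{align}
  % Tsallis entropy: impose S_q \sim N and solve for W.
  S_q \propto W^{1-2q} \sim N
    \;&\Longrightarrow\; W(N) \sim N^{\frac{1}{1-2q}}, \\
  0 < q < \tfrac{1}{2}
    \;&\Longrightarrow\; \tfrac{1}{1-2q} > 0
    \quad (W \text{ grows with } N), \\
  q > \tfrac{1}{2}
    \;&\Longrightarrow\; \tfrac{1}{1-2q} < 0
    \quad (W \text{ decreases with } N), \\
  % BGS entropy: write u = -\ln W, so that u e^{u} = -\ln W / W.
  \frac{2 \ln W}{W} = N
    \;&\Longrightarrow\; u\, e^{u} = -\frac{N}{2}, \qquad u = -\ln W, \\
    \;&\Longrightarrow\; u = W_{-1}\!\left(-\tfrac{N}{2}\right)
    \;\Longrightarrow\; W(N) = e^{-W_{-1}(-N/2)} .
\end{align}
```

The lower branch $W_{-1}$ is the relevant one because $u = -\ln W \le -1$ whenever $W \ge e$. At $q = 1/2$ the assumed power law degenerates to a constant, matching the statement that the Tsallis entropy cannot be made extensive there.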
which, together with the identity $1 + q = 2^{\log_2(1+q)}$, yields where $c = \log_2(1+q)$, with the corresponding curve depicted in Figure 2c. As a surprising consequence, extensivity of the Tsallis entropy requires $W(N) \sim N^{1/\log_2(1+q)}$.
s6 Power-law density of states (power-law)
For the sake of technical tractability, let us define the following integral based on which we can precisely evaluate various expressions later on. First, with the substitutions $A = 1/(W-1)$ and $m = -1 + 1/(W-1)$ we obtain the normalization condition, whereas the expected value requirement can be checked through The Tsallis entropy can also be expressed as a function of the integral appearing in Eq. (S26), namely which reduces to the BGS entropy as $q \to 1$, in perfect accordance with Eq. (23).
An alternative formulation of the power-law density of states is provided by the form of which has the nice property that the density of states vanishes at $p = 1$ while offering the same type of scaling with $W$, The fact that, under the power-law form of the density of states, both the BGS and Tsallis entropies asymptotically converge to a finite value implies that these entropic forms cannot display extensivity, not even in the limit $W \to \infty$, making it impossible to provide a thermodynamical description of the corresponding system.
s7 Beta density of states (beta)
The Shannon entropy under a Beta distribution of the configuration probabilities is given by the following integral $S_{\mathrm{BGS}} = W \langle -p \ln p \rangle = W \dots$ In addition to this, assuming the scaling form $a(W) \sim W^{\eta-1}$, we arrive at
$$S_{\mathrm{BGS}} = \psi(W^{\eta} + 1) - \psi(W^{\eta-1} + 1), \tag{S33}$$
where $\psi$ is the digamma function.
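The closed form in Eq. (S33) can be checked numerically. The sketch below is a hedged illustration rather than the authors' procedure. It assumes the Beta distribution is parameterized as Beta$(a, b)$ with $b = a(W-1)$, so that the mean visiting probability is $1/W$. The function names `digamma`, `s_bgs_beta`, and `s_bgs_quadrature` are hypothetical helpers introduced here, and the digamma routine is a standard recurrence plus asymptotic-series approximation written out to keep the example dependency-free.

```python
import math


def digamma(x: float) -> float:
    """Digamma function psi(x) for x > 0 (recurrence + asymptotic series)."""
    result = 0.0
    # Shift x upward using psi(x) = psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 * inv - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result


def s_bgs_beta(W: float, eta: float) -> float:
    """Closed form of Eq. (S33): psi(W^eta + 1) - psi(W^(eta - 1) + 1)."""
    return digamma(W**eta + 1.0) - digamma(W ** (eta - 1.0) + 1.0)


def s_bgs_quadrature(W: float, a: float, n: int = 20000) -> float:
    """Midpoint-rule estimate of W * E[-p ln p] for p ~ Beta(a, a * (W - 1)).

    ASSUMPTION: the second shape parameter b = a * (W - 1) is chosen here so
    that the mean is 1/W; the source does not spell out this parameterization.
    Intended for a >= 1, where the integrand is bounded on (0, 1).
    """
    b = a * (W - 1.0)
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        p = (i + 0.5) * h
        log_density = (a - 1.0) * math.log(p) + (b - 1.0) * math.log1p(-p) - log_norm
        total += -p * math.log(p) * math.exp(log_density)
    return W * total * h


if __name__ == "__main__":
    # Consistency check of the closed form against direct integration.
    W, a = 50.0, 2.0
    eta = 1.0 + math.log(a) / math.log(W)  # chosen so that a = W^(eta - 1)
    print(s_bgs_beta(W, eta), s_bgs_quadrature(W, a))
    # Scaling with W for several values of eta.
    for eta in (1.0, 0.5, 0.0, -1.0):
        print(eta, [round(s_bgs_beta(10.0**k, eta), 4) for k in (3, 6, 12)])
```

Under these assumptions the two estimates in the first printed line should agree closely. In the scaling loop, the entropy keeps growing with $W$ for $\eta > 0$ and levels off for $\eta \le 0$, which is the saturation behaviour described for the beta density of states.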

Figure 2: a) Continuous, parameter-free densities of states we consider in this paper. b) Contribution of configurations with probability $p < r$ to the total entropy of the system, which we call cumulative entropy $\Pi_g(r)$, for different combinations of density of states and entropy kernel $g$. In each case, $\langle p \rangle = W^{-1} = 2^{-20} \approx 10^{-6}$. c) Hanel-Thurner exponent, given by Eq. (5), of Tsallis entropies for the microcanonical ensemble and the log-gamma (limiting log-normal) density of states. Although the H-T exponent of the BGS entropy ($q = 1$) is invariant to changing the density of states from microcanonical to log-gamma, this is no longer true for Tsallis entropies ($q < 1$), indicating that the non-extensivity class of any system is jointly determined by its extensive generalized entropy and the system's density of states. d) Scaling of the BGS entropy $S_{\mathrm{BGS}}$ with configuration space size $W$ when the system's density of states follows a beta distribution with a power-law tail, characterized by $a(W)$. Note that in systems where $a$ decays as $W^{-1}$ or faster, the BGS entropy converges to a finite value in the thermodynamic limit $W \to \infty$.