Universality, criticality and complexity of information propagation in social media

Statistical laws of information avalanches in social media appear, at least according to existing empirical studies, not robust across systems. As a consequence, radically different processes may represent plausible driving mechanisms for information propagation. Here, we analyze almost one billion time-stamped events collected from several online platforms – including Telegram, Twitter and Weibo – over observation windows longer than ten years, and show that the propagation of information in social media is a universal and critical process. Universality arises from the observation of identical macroscopic patterns across platforms, irrespective of the details of the specific system at hand. Critical behavior is deduced from the power-law distributions, and corresponding hyperscaling relations, characterizing size and duration of avalanches of information. Statistical testing on our data indicates that a mixture of simple and complex contagion characterizes the propagation of information in social media. Data suggest that the complexity of the process is correlated with the semantic content of the information that is propagated.

Social media have dramatically changed the way people produce, access and consume information (1), and there is increasing evidence that online discussions have the potential to impact society in unprecedented ways (2)^1. It is therefore not surprising that there is renewed scientific interest in comprehending the mechanisms that drive information propagation.
Analyses of the propagation of information in social media reveal, at least qualitatively, similarities with other natural phenomena such as the firing of neurons (7, 8) and earthquakes (9). These are processes characterized by bursty activity patterns. Activity consists of point-like events in time, and bursts (or avalanches) of activity are defined as sequences of close-by events. Bursts are separated by long periods of low activity. Activity is characterized at the macroscopic level by the distributions P(S) and P(T) of the size S and the duration T of avalanches. Information propagation can be studied considering the same observables (10-15).
In real-world systems, P(S) and P(T) have a power-law decay for large values of their argument, i.e., P(S) ∼ S^−τ and P(T) ∼ T^−α (7-9, 12, 16-18). This property is interpreted as evidence of the system operating at, or in the vicinity of, a critical point. The statement is supported by the theory of absorbing phase transitions, according to which, if the avalanche dynamics is at a critical point, then P(S) and P(T) must decay as power laws. Further, in a process operating at criticality, the average size of avalanches with given duration must obey the hyperscaling relation ⟨S⟩ ∼ T^γ, with γ = (α − 1)/(τ − 1) (16, 19, 20). The specific values of the exponents τ and α typically differ across classes of systems. Their actual values are fundamental for the classification of systems into universality classes, i.e., an ontology of processes with conceptual and practical relevance (21).
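The hyperscaling relation can be checked numerically for the two mean-field universality classes that appear later in the text (BP: τ = 3/2, α = 2; RFIM: τ = 9/4, α = 7/2); this is only an arithmetic illustration of the relation, not part of the analysis:

```python
# Hyperscaling relation at criticality: gamma = (alpha - 1) / (tau - 1).

def gamma_from(tau: float, alpha: float) -> float:
    """Exponent of <S> ~ T^gamma implied by the exponents tau and alpha."""
    return (alpha - 1.0) / (tau - 1.0)

# Mean-field branching process: tau = 3/2, alpha = 2 -> gamma = 2.
bp_gamma = gamma_from(tau=3/2, alpha=2.0)
# Mean-field RFIM: tau = 9/4, alpha = 7/2 -> gamma = 2 as well.
rfim_gamma = gamma_from(tau=9/4, alpha=7/2)

print(bp_gamma, rfim_gamma)  # 2.0 2.0
```

Both classes share γ = 2, so the hyperscaling relation alone cannot discriminate between them; the individual exponents τ and α can.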
There are systems for which strong evidence supporting the existence of universality classes suggests appropriate theoretical models and the microscopic mechanisms driving the dynamics.

^1 Only in the past year, we witnessed two emblematic examples. The public debate around the COVID-19 pandemic has been accompanied by the so-called Infodemic, which is affecting the outcome of the vaccination campaign by increasing hesitancy (3-5). Also, online discussions in the Reddit channel r/wallstreetbets induced many individuals to buy GameStop shares in opposition to the shorting operation carried out by hedge funds and professional investors. As a result, the market capitalization of the company increased by more than $22 billion in just a few days (6).
For example, there is large agreement on the fact that neuronal activity in the brain is universal and critical (7, 8, 22-25). Universality is the notion that nearly identical avalanche statistics are observed for a multitude of organisms. Criticality instead refers to the fact that avalanche statistics are characterized by algebraic distributions. In particular, the critical exponent values are those of the universality class of the mean-field branching process (BP), i.e., τ = 3/2 and α = 2 (26-28). The finding informs us about the mechanism that drives the unfolding of an avalanche of neuronal activity in the brain. Neurons influence each other according to a simple "contagion process," where a single exposure to an active neuron may be sufficient to trigger the activity of another. As a result, activity propagates from neuron to neuron as the avalanche unfolds.
Where information propagation (in general, and in online social media) is concerned, the issue of the existence of well-defined universality classes is far from settled. Existing analyses typically study data collected from a single source and over short observation windows. It is often found that distributions of avalanche size and duration obey power laws, but the estimated values of the exponents are not the same across studies: τ values range between τ ≈ 2 and τ ≈ 4 (13, 14, 29-31), whereas α ≈ 3.6 (32) or α ≈ 2.5 (33, 34). Also, empirical studies reporting on correlations between size and duration of avalanches do not observe a power law at all (35, 36). These different results might be ascribed to the multiple operative definitions of avalanches, which can be given in terms of hashtag time series (TS) (29, 35) as well as reply trees or retweet chains (13, 31, 37). Further, regardless of the definition, the temporal resolution can affect the avalanche distribution (12, 38). As a consequence of the variability in the inferred distributions, uncertainty about representative theoretical models remains. Finally, empirical evidence and theoretical support for microscopic mechanisms that may drive the propagation of information in social media are inconclusive. Stemming from the apparent similarity between the spreading of disease and information, a widely accepted paradigm is that information diffuses according to a simple contagion process (10, 13, 29, 35, 39, 40). Simple contagion is at the core of many theoretical models of information propagation used in the literature, all displaying critical properties of the BP universality class (41). However, there are quite a few studies in favor of the complex contagion paradigm (42-45). As originally introduced by Centola and Macy, in a complex contagion process the involvement of an individual in the propagation of information requires exposure from multiple acquaintances (46). Complex contagion is exemplified by models such as the Linear Threshold Model and the Random Field Ising Model (RFIM) (19, 49). Distinguishing between simple and complex contagion and, possibly, how they coexist within the same population (47), is fundamental to understand the spreading of (mis)information in online social media (42, 48).
In this work, we perform a large-scale study of (hash)tag TS from Twitter, Telegram, Weibo, Parler, StackOverflow and Delicious (see Supplemental Material (SM) A, B for details about the data sets). We consider a total of 206,972,692 TS, cumulatively consisting of 905,377,009 events, collected over periods of up to 10 years. The Twitter data, collected specifically for this work, are fully available together with the code to reproduce the results of this paper (50). To define avalanches in a principled fashion, we adopt the approach inspired by percolation theory proposed in Ref. (38). We provide evidence that social media share universal avalanche statistics that are well described by power-law distributions. At the aggregate level, each social medium displays a critical behavior that is compatible with the RFIM, indicating that, plausibly, information propagates in social media according to a complex contagion process. Second, we develop a novel statistical technique able to determine the level of criticality and complexity of individual TS. We find that nearly 20% of the TS are less than 5% away from criticality. These account for 53% of all events in our data sets. Also, we find that about 50% of the individual TS are better explained in terms of a complex rather than a simple contagion process. A qualitative analysis of the most popular hashtags suggests that information concerning conversational topics, e.g., music or TV shows, spreads according to the rules of simple contagion, whereas information concerning political/societal controversies shows signatures of an underlying complex contagion process.
The operational definition of an avalanche depends on the value of the parameter ∆, the minimal time separation between two consecutive events belonging to distinct avalanches. A proper choice ∆* of the time resolution ∆ for the specific data set at hand is necessary to avoid significant distortion in the resulting avalanche statistics. This statement is true for synthetic TS generated by temporal point processes (38), but also for the empirical TS analyzed in this paper (see SM D, I for details). To determine the value of ∆*, we take advantage of the principled method developed in Ref. (38), which identifies ∆* as the critical point of a one-dimensional percolation model. Results are presented in Fig. 1. Values of ∆* for each data set are reported in the SM D; they vary substantially across data sets, from ∆* ≈ 1,500 s for Twitter to ∆* ≈ 30,000 s for Telegram (Fig. 1B).
Once the time resolution is rescaled according to ∆ → ∆/∆*, the curves of percolation strength relative to different data sets exhibit a nearly identical quantitative behavior. This fact suggests the possibility of seeing the propagation of information in social media as a universal process, with ∆* representing the natural resolution for observing information avalanches. The avalanche statistics of Figs. 2A-C seem well described by power laws, indicating that the underlying process is (nearly) critical, and that its universality class can be identified by estimating the values of the critical exponents τ, α, and γ (21). We rely on maximum likelihood estimation for τ and α (51); linear regression on the logarithm of the relation ⟨S⟩ ∼ T^γ is used to estimate γ. Results are reported in Fig. 2D, see SM G for details. The estimated exponent τ is compatible with that of the mean-field RFIM universality class, i.e., τ = 9/4 (19).
The compatibility of the avalanche statistics with those of a homogeneous mean-field model is not surprising, given that in some social media there is no underlying network among users, and in the others there are mechanisms for the propagation of information that bypass it. There is an apparent mismatch between our estimates of α and γ and the RFIM predictions α = 7/2 and γ = 2. The mismatch can be theoretically explained by the peculiar shape of the scaling function characterizing the distribution of avalanche duration, which affects also the estimate of γ (52). Difficulties in observing the asymptotic exponents of the RFIM due to the effect of the scaling functions emerge also in numerical simulations of the model and are well known (19).
The proximity of the exponents estimated across data sets points to the existence of a genuine and distinctive universality class for information propagation in social media. In particular, this class seems to be different from that of the BP, often invoked as representative in phenomena related to information diffusion. [...] (51), we set the threshold for statistical significance equal to p = 0.1. We verified, however, that the outcome of the analysis is not greatly affected by the choice of the threshold value, see SM O. Third, it establishes whether a TS is better modeled by BP or RFIM by comparing their likelihoods.
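The likelihood-comparison step can be sketched as follows. This is a simplified illustration only: the candidate distributions here are hypothetical pure power laws with the mean-field BP and RFIM exponents, normalized over [S_min, ∞), whereas the actual analysis fits the full avalanche-size distributions of the two models.

```python
import math

def truncated_zeta(eta, s_min, terms=100_000):
    """Approximate normalization sum_{s >= s_min} s^(-eta) for a discrete
    power law (truncated numerically; adequate for this illustration)."""
    return sum(s ** -eta for s in range(s_min, s_min + terms))

def log_likelihood(sizes, eta, s_min):
    """Log-likelihood of avalanche sizes under P(S) ~ S^(-eta), S >= s_min."""
    z = truncated_zeta(eta, s_min)
    return sum(-eta * math.log(s) - math.log(z) for s in sizes)

def select_model(sizes, s_min=10):
    """Log-likelihood ratio between the two candidate exponents.
    Positive values favor the steeper RFIM-like exponent (tau = 9/4),
    negative values the BP-like one (tau = 3/2)."""
    ll_bp = log_likelihood(sizes, eta=1.5, s_min=s_min)
    ll_rfim = log_likelihood(sizes, eta=2.25, s_min=s_min)
    return ll_rfim - ll_bp
```

A sample dominated by small avalanches yields a positive ratio (RFIM-like steep decay preferred), while a sample of very large avalanches yields a negative one.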
Results of our analysis are reported in Figs. 3 and 4. Our method is applied only to TS that contain at least two avalanches larger than S_min = 10. Tests of robustness for different S_min values are reported in the SM O. In all systems under analysis, we find that the best fitting parameter assumes values over a broad range, encompassing a large portion of the subcritical phase, as well as the critical point of the models (Figs. 3A and 3B).
The individual-level analysis confirms the results obtained for the aggregate data. The majority of events belongs to a minority of TS giving rise to the largest avalanches. As a consequence, the large-scale behaviour of each system is mainly determined by those few TS, which are fitted in a narrow region of the parameter space close to the critical point for both BP and RFIM (insets of Figs. 3A and 3B). Also, our tests indicate that the vast majority of TS are well described by at least one of the two models (Fig. 4A). Model selection indicates that individual TS are divided into two nearly equally populated classes, one better described by BP and the other by RFIM (Fig. 4A). Simple and complex contagion thus coexist in social media, with only a mild dominance of complex over simple contagion (Fig. 3C). These results are not incompatible with the aggregate avalanche statistics (Fig. 2). Fig. 3D shows that critical TS belonging to the class of complex contagion display power-law scaling compatible with the RFIM τ exponent. Also, the critical TS that the fitting procedure attributes to the BP class show a neat crossover to RFIM scaling for large avalanches. The mixture produces a universal distribution that is overall more compatible with the RFIM universality class than with the BP class (Fig. 2C).
In summary, we revealed that the temporal patterns characterizing bursts of activity in online social media are universal, and thus should be ascribed to mechanisms so basic that they underlie information diffusion in all social media platforms. Also, in contrast with the vast majority of previous studies, where purely diffusive models have been considered (41), we showed that information propagation in social media is often better described by a complex contagion dynamics. Complex contagion is here exemplified by the RFIM, an agent-based model of activation originally formulated to describe the para-to-ferromagnetic phase transition in metals (19). Recast in language proper to the description of information propagation (53), the RFIM prescribes that each agent (i) has a personal opinion, (ii) is subject to the social influence exerted by the agents she interacts with, and (iii) is also driven by an external force representing the public information about exogenous events. These appear reasonable assumptions for modeling many realistic discussions happening in social media. Fig. 4 shows the 30 most popular Twitter hashtags identified by our method either in the simple or in the complex contagion class. In the category of simple contagion, we find conversational topics, mostly related to music or cinema/TV shows. Hashtags belonging to the class of complex contagion either display periodic patterns or are related to political/controversial themes. This qualitative picture fits with previous studies that have explicitly focused on the semantic meaning of different hashtags in Twitter (48). For both classes of information avalanches, we inferred the dynamics underlying their generation as critical, a fact that provides theoretical ground for the surprising but remarkable robustness of our findings. Our results pave the way for future research about both descriptive theories and data-driven predictive models. The presence of a large portion of social media content that acquires popularity via complex contagion dynamics calls for a reconsideration of predictive algorithms relying on the temporal characteristics of the signal only, because these algorithms often neglect the semantics of hashtags and, even more frequently, the topological features of their propagation (54-58). Both aspects are important for a successful discrimination between information propagating as a simple or complex contagion process (42, 48). We argue that the distinction between these truly different mechanisms is fundamental for the development of novel theoretical and data-driven approaches. We speculate that our results extend beyond the six platforms considered here. If so, there must be a mechanism that explains the universality shown by the data, involving a critical dynamics that is independent of the peculiarities implemented in the individual platforms. Understanding where this mechanism is rooted and how to exploit it for the prediction of the propagation of information in online social media remain open challenges for future research.

SUPPLEMENTAL MATERIAL
A Data sets

We study data sets concerning the activity of users in six different social media, namely Twitter, Telegram, Parler, Weibo, StackOverflow and Delicious. For each system we identify all (hash)tags in the data and build a time series (TS) for each (hash)tag. The TS contains the times, i.e., {t_1, t_2, . . .}, at which the (hash)tag appears in the data.

Table S1: Summary table of the data. From left to right we report: the name of the data set, the acronym we use to refer to the data set, the temporal window of data collection, the total number of time series, the total number of events (times) and a link to the original data. Events correspond to the observation of items in the original data.

B Data cleaning
Fig. S1 shows the daily rate of activity in each data set. While TWT and WEI display a rate of activity that is almost constant over our observation period, the other data sets display significant variations.
We restrict our attention to observation windows where all data are nearly stationary, i.e., the number of events per unit time is roughly constant for time units much larger than the temporal resolution of the data.These shorter observation windows are highlighted in Fig. S1.
Daily rates in the reduced temporal window are shown separately in Fig. S2.
Table S2: Summary table of the data after reduction of the observation windows.From left to right we report: the name of the data set, the acronym we use to refer to the data set, the temporal window of data collection, the total number of time series, the total number of events (times) and a link to the original data.

C Beyond social media: neuronal systems and earthquakes
In addition to the six data sets concerning social media, we further study data sets describing activity in different systems.
We consider a set of 88 TS, collected in Ref. (70), generated by monitoring the spontaneous activity of dissociated cultures of rat hippocampal cells. Specifically, we consider culture number 1 on the 11th day in vitro and refer to it as RHDC (Rat Hippocampal Dissociated Cultures). A set of 166 TS, collected in Refs. (71, 72), generated by monitoring the neural activity in cultured slices of mouse somatosensory cortex is further considered. In this case we consider data set number 1 and refer to it as MSOS (Mouse Somatosensory Organotypic Slice). We also consider a data set generated by monitoring the neural activity in the premotor cortex of a macaque, collected in Ref. (73). We use the MTS2 data set and refer to it as MPC (Macaque Premotor Cortex). In these systems each electrode is associated with a TS, and an event corresponds to the detection of a spike by the electrode.

Figure S1: Daily rate of activity in the original data sets. The rate is computed as the total number of events per day. The dashed vertical lines in the panels for TLG and DEL mark the beginning and the end of the reduced temporal window. The dashed vertical lines in the panels for PARL and STCK mark the beginning of the reduced temporal window, which in this case ends where the original window ends.
We further consider three catalogues of earthquakes reporting seismological activity in Japan (74), in California (75) and in Europe (77). In the case of the California catalogue, we discard all events prior to Jan. 1, 1900. For each of these catalogues, we divide geographical space into bins. For each bin, we construct a TS composed of the times of events whose longitude and latitude fall within the bin, in the same way as done in Ref. (12). The procedure of geographical binning is illustrated in Fig. S3. Table S3 summarizes the properties of these data sets.

Figure S2: Daily rate of activity in the reduced data sets. The rate is computed as the total number of events per day. In each panel we show the same data as in Fig. S1, but restricted to the temporal window delimited by the dashed vertical lines for each data set.

D Defining avalanches from time series
Given a TS {t_1, t_2, . . .}, we define an avalanche starting at t_b as a sequence of events {t_b, t_{b+1}, . . ., t_{b+S−1}} such that t_b − t_{b−1} > ∆, t_{b+S} − t_{b+S−1} > ∆ and t_{b+i} − t_{b+i−1} ≤ ∆ for all i = 1, . . ., S − 1, where ∆ is the resolution parameter. The size S of an avalanche is the number of events within it, and the duration T is the time lag between the first and last event in the avalanche, i.e., T = t_{b+S−1} − t_b.

Table S3: Summary table of data sets describing neuronal and seismological activity. From left to right we report: the name of the data set, the acronym we use to refer to the data set, the temporal window of data collection, the total number of time series, the total number of events (times) and a link to the original data.
Depending on the value of ∆, the same TS may correspond to different avalanches.
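The operational definition above translates directly into code. A minimal sketch (not the authors' implementation):

```python
def avalanches(times, delta):
    """Split a sorted event time series into avalanches: a gap larger than
    `delta` between consecutive events starts a new avalanche. Returns a
    list of (size, duration) pairs, with duration measured between the
    first and last event of each avalanche."""
    if not times:
        return []
    out, start, prev, size = [], times[0], times[0], 1
    for t in times[1:]:
        if t - prev > delta:           # gap exceeds the resolution: close avalanche
            out.append((size, prev - start))
            start, size = t, 1
        else:
            size += 1
        prev = t
    out.append((size, prev - start))   # close the last avalanche
    return out
```

For example, `avalanches([0, 1, 2, 10, 11], 3)` yields two avalanches, `(3, 2)` and `(2, 1)`, because the gap of 8 between 2 and 10 exceeds ∆ = 3.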
We follow the principled approach of Ref. (38), where avalanches are constructed for ∆ = ∆*. ∆* corresponds to the critical point of a one-dimensional percolation model that is used to describe the TS. To this end, we define the order parameter P_∞ of the percolation model and its associated susceptibility χ, respectively, as

    P_∞ = ⟨S_M⟩ / E ,        χ = ( ⟨S_M^2⟩ − ⟨S_M⟩^2 ) / ⟨S_M⟩ .

Here, ⟨S_M⟩ is the average over all the TS of the size of the largest avalanche S_M, and E is the total number of events. The transition point is associated with the value of ∆ where the susceptibility χ reaches its maximum, i.e.,

    ∆* = arg max_∆ χ(∆) .

Note that TS with only one event introduce an offset in the measure of P_∞ and are not informative with respect to the optimal resolution ∆*, i.e., S_M = 1 for any ∆ in these TS. For this reason, we remove these TS from the sample and compute P_∞ and χ considering only TS composed of at least two events.

Table S4: Values of the optimal resolution ∆* on data sets generated on social media. We report the name of the data set (upper row) and the associated value of ∆* (bottom row), expressed in seconds.
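The percolation analysis can be sketched as a scan over ∆ that locates the susceptibility peak. This is a simplified illustration, not the authors' code; the susceptibility is computed here from the fluctuations of the largest-avalanche size, one common choice in percolation analyses:

```python
def largest_avalanche_sizes(all_ts, delta):
    """For each TS (sorted list of event times, at least two events),
    return the size of its largest avalanche at resolution delta."""
    sizes = []
    for times in all_ts:
        best, count, prev = 1, 1, times[0]
        for t in times[1:]:
            count = count + 1 if t - prev <= delta else 1
            best = max(best, count)
            prev = t
        sizes.append(best)
    return sizes

def susceptibility(all_ts, delta):
    """chi = (<S_M^2> - <S_M>^2) / <S_M>, averaged over the TS sample."""
    s = largest_avalanche_sizes(all_ts, delta)
    m1 = sum(s) / len(s)
    m2 = sum(x * x for x in s) / len(s)
    return (m2 - m1 * m1) / m1

def optimal_resolution(all_ts, deltas):
    """Return the candidate delta that maximizes the susceptibility."""
    return max(deltas, key=lambda d: susceptibility(all_ts, d))
```

On a toy sample with intra-burst gaps of order 1 and inter-burst gaps of order 100, the susceptibility vanishes for very small ∆ (every event is its own avalanche) and for very large ∆ (each TS is one avalanche), and peaks at intermediate resolutions.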
Table S4 reports the values of the optimal resolution ∆* obtained by means of the percolation analysis on the social media data sets. The avalanche statistics reported in the main text are obtained for ∆ = ∆*. The statistics refer to all avalanches, excluding the largest one of each TS. This choice is due to the well-known fact that in percolation theory the largest cluster obeys a different statistics than that of finite clusters (81).

Table S5: Values of the optimal resolution ∆* on data sets not representative of social media. We report the name of the data set (upper row) and the associated value of ∆* (bottom row), expressed in seconds.

Table S5 reports the values of the optimal resolution ∆* for the data sets not concerning activity in social media.

E The branching process
We consider a homogeneous mean-field branching process (BP), where an initially active individual spreads activity to a random number of peers, who can in turn spread activity further.
The process continues for a number T of time steps or generations, until there is a generation in which no individual further spreads activity.T is the duration of the avalanche.The size S of the avalanche is given by the total number of individuals activated during the avalanche.
The only tunable parameter of the model is the branching ratio n, representing the average number of individuals who are activated by a single spreader. The BP is subcritical for n < n_c, critical at n = n_c = 1, and supercritical for n > n_c. Finite avalanches of activity in the BP obey the laws

    P(S) ∼ S^−τ D_S(S n̄^{1/σ}) ,    P(T) ∼ T^−α D_T(T n̄^{zν}) ,    ⟨S⟩ ∼ T^γ ,    (S2)

where ⟨•⟩ is the average over different avalanches, and P(S) and P(T) are the probability distributions of S and T, respectively. The functions D_S and D_T are known as scaling functions and introduce a correction at small values of their argument, where we have defined the reduced distance from the critical point n̄ = |n − n_c|/n_c. The above exponents are not independent; rather, they are related by γ = 1/(σzν) = (α − 1)/(τ − 1). For the BP, we have τ = 3/2 and α = 2. σ, z and ν are additional critical exponents that we do not explicitly consider in our analysis.
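A minimal simulation of the BP can be sketched as follows. The Poisson offspring distribution is an assumption made for concreteness (any offspring distribution with mean n yields the same universality class); this is an illustration, not the authors' implementation:

```python
import math
import random

def poisson(lam, rng):
    """Poisson random variate via Knuth's multiplication algorithm."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def bp_avalanche(n, rng, max_size=10**6):
    """Simulate one BP avalanche and return (size S, duration T).
    Each active individual activates a Poisson(n) number of new
    individuals; T counts generations until no one spreads further."""
    size, duration, active = 1, 0, 1
    while active > 0 and size < max_size:   # cap guards supercritical runs
        offspring = sum(poisson(n, rng) for _ in range(active))
        active = offspring
        size += offspring
        duration += 1
    return size, duration
```

In the subcritical regime the mean avalanche size is finite, ⟨S⟩ = 1/(1 − n), which provides a quick sanity check of the simulation.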

F The Random Field Ising Model
We consider the mean-field formulation of the zero-temperature Random Field Ising Model (RFIM). In the RFIM, agent i is characterized by the state variable y_i = ±1, indicating whether the agent is active (y_i = +1) or not (y_i = −1). In the initial configuration all agents are inactive. In the long-term limit, all agents become active. Activation of individual agents may happen at very different stages of the dynamics. However, once in the active state, agents cannot change their state back to inactive. Each agent i has a propensity h_i to become active, with h_i a random variable drawn from a given distribution. A large value of h_i indicates that the agent is particularly prone to become active. Agents interact by means of ferromagnetic interactions that model social pressure, i.e., active neighbors push an inactive agent to become active. The whole system is further affected by public information, which all agents have access to and which pushes users toward becoming active with intensity H ∈ (−∞, +∞).
In the initial configuration, all agents are inactive (y_i = −1 for all agents i). The external pressure H grows until the agent with the largest h_i value becomes active. This change of state can trigger an avalanche of activity in the other nodes. Specifically, agent j becomes active if the following condition is met:

    (J/N) Σ_i y_i + h_j + H > 0 ,

where N is the system size, J > 0 is the strength of the ferromagnetic coupling, and the mean-field formulation is expressed by the all-to-all interaction. When an avalanche ends, the external pressure H grows again until a new user becomes active and triggers a new avalanche. The field is frozen during the unfolding of avalanches, meaning that avalanches are characterized by a time scale much shorter than the one characterizing the external pressure. The size S of an avalanche is given by the number of users that are activated during the avalanche; its duration T is given by the number of activation rounds characterizing the avalanche. The stochasticity of the model comes from the random nature of the propensities h_i.
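The avalanche dynamics described above can be sketched as follows. Gaussian propensities, the value of R, and counting the triggering agent as the first activation round are assumptions made for concreteness; this is an illustration, not the authors' code:

```python
import random

def rfim_avalanches(N=1000, J=1.0, R=1.4, seed=0):
    """Zero-temperature mean-field RFIM sweep. Returns a list of
    (size, duration) avalanches. Propensities h_i are Gaussian with
    standard deviation R; H grows adiabatically and is frozen while
    an avalanche unfolds."""
    rng = random.Random(seed)
    # agents sorted by decreasing propensity: they trigger in this order
    h = sorted((rng.gauss(0.0, R) for _ in range(N)), reverse=True)
    flipped = [False] * N
    m = -N                       # m = sum_i y_i; all agents start at y_i = -1
    events = []
    for j in range(N):
        if flipped[j]:
            continue
        # raise H just past the value activating agent j: J*m/N + h_j + H > 0
        H = -(J * m / N + h[j]) + 1e-12
        flipped[j] = True
        m += 2
        size, duration = 1, 1    # trigger counts as the first round
        while True:              # cascade at frozen H, parallel rounds
            newly = [i for i in range(N)
                     if not flipped[i] and J * m / N + h[i] + H > 0]
            if not newly:
                break
            for i in newly:
                flipped[i] = True
                m += 2
            size += len(newly)
            duration += 1
        events.append((size, duration))
    return events
```

Since every agent eventually becomes active exactly once, the avalanche sizes returned by a full sweep sum to N.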

G Estimation of the exponents
To estimate the exponents τ and α for the empirical avalanche distributions, we use the fact that for a generic power-law probability distribution with exponent η, the maximum likelihood estimator can be written as

    η = 1 + Z [ Σ_i ln(x_i / x_min) ]^{−1} ,    (S4)

where x_i is a data point of the empirical sample, and x_min is the smallest value of the sample that is expected to truly respect the power-law statistics (51). Z is the number of data points x_i ≥ x_min. If the variable under consideration is discrete, the factor x_min in the denominator of the logarithm in Eq. (S4) must be replaced by x_min − 0.5. The error on the maximum likelihood estimator is ∆η = (η − 1)/√Z. We use S_min = 2 to fit the size distribution and T_min = 2∆* to fit the duration distribution. This protocol allows us to measure the exponents τ and α of the distributions P(S) and P(T), respectively, and to further measure the scaling exponent γ as (α − 1)/(τ − 1). Assuming that the two estimators are uncorrelated, the uncertainty on the ratio f(τ, α) = (α − 1)/(τ − 1) can be evaluated as

    ∆f = [ (∂f/∂τ)² (∆τ)² + (∂f/∂α)² (∆α)² ]^{1/2} = [ ((α − 1)/(τ − 1)²)² (∆τ)² + (∆α)²/(τ − 1)² ]^{1/2} .

To independently estimate the exponent γ, we take the logarithm of both sides of the last Eq. in (S2) and perform linear regression. The exponent γ and its uncertainty are then given by

    γ = Σ_i (X_i − X̄)(Y_i − Ȳ) / Σ_i (X_i − X̄)² ,    ∆γ = [ Σ_i ε_i² / ((Z − 2) Σ_i (X_i − X̄)²) ]^{1/2} ,

respectively, where X = log T, Y = log⟨S⟩, X̄ and Ȳ are sample means, and the ε_i are the residuals of the regression.
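The estimator of Eq. (S4) and its uncertainty can be sketched in a few lines (a minimal implementation for illustration, covering both the continuous and the discrete variant):

```python
import math
import random

def power_law_mle(xs, x_min, discrete=False):
    """Maximum likelihood estimate of the exponent eta of P(x) ~ x^(-eta)
    for x >= x_min, following Eq. (S4), together with the uncertainty
    (eta - 1)/sqrt(Z). For discrete variables, x_min is replaced by
    x_min - 0.5 inside the logarithm."""
    denom = x_min - 0.5 if discrete else x_min
    sample = [x for x in xs if x >= x_min]
    Z = len(sample)
    eta = 1.0 + Z / sum(math.log(x / denom) for x in sample)
    return eta, (eta - 1.0) / math.sqrt(Z)
```

As a sanity check, samples drawn from a continuous power law with exponent η = 2.5 via inverse-transform sampling, x = x_min (1 − u)^{−1/(η−1)}, are recovered within the quoted uncertainty.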

H Scaling in neuronal systems and earthquakes
We perform on the supplementary data sets the same analysis performed in the main text for the data sets concerning social media. Results are shown in Fig. S4. The three data sets describing neuronal brain activity in different animals all display the BP statistics for both the size and the duration distributions. The finding is consistent with previous studies (8, 24, 25). The scaling relation between ⟨S⟩ and T does not show the exponent γ = 2 expected from BP theory. However, a slightly superlinear relation between these quantities has been reported for many different neuronal systems (22, 25).

I Temporal resolution and avalanche statistics
In Fig. S5, we display the avalanche statistics of different systems obtained for different values of the temporal resolution ∆. For ∆ ≠ ∆*, the power-law scaling is affected by apparent exponential cutoffs. The finding is in perfect agreement with theoretical arguments (38). As the avalanche statistics obtained with the present approach represents the correlations existing in the system (12), the observation of distorted distributions means that the correlations existing in the data have not been properly identified. The same issue arises when each TS is assumed to be a unique avalanche, see Fig. [...]

[...]

The summation is performed over all avalanches with S ≥ S_min, a parameter we vary in our analysis. The distributions P and Q are normalized over the interval [S_min, ∞) to account for this fact. The best fit of the empirical distribution obtained from a TS against the model [...] To this aim, we fix the system size to N = 10^6 and fit 10^4 realizations of each model. Results are shown in Fig. S9. The fitting procedure is able to identify the ground truth, either RFIM or BP, regardless of the S_min value. In those cases in which model selection requires the log-likelihood ratio test, it still generally holds that the true model is selected with higher chances. In the case of synthetic data, we can also compare the inferred parameter with the ground truth; Fig. S9 C and F show that the probability that these two quantities differ decays quickly as the difference departs from zero.

O Robustness of the fits
In the main text we show results of the fitting protocol using S_min = 10. Further, we estimated statistical significance by setting the threshold value to 0.1. Our conclusions, however, are unaffected by different choices of these parameters. In Fig. S10 and Fig. S11, we vary the threshold over the p-value and S_min, respectively.

Figure S9: Fitting procedure applied to synthetic data sets. We show the results obtained when the RFIM is the ground truth (upper row) and when the BP is the ground truth (bottom row), considering four values of S_min. Left column: we report the overall probability that the RFIM (red) or the BP (blue) is the selected model, the probability that both models are discarded (purple), and the probability that neither model is individually rejected, so that the model selection is performed by means of the log-likelihood ratio test (yellow). Central column: we report the probability that the RFIM (red) or the BP (blue) is the model selected by means of the log-likelihood ratio test. Error bars represent σ/N, where N is the sample size and σ = √(0.25 N) is the standard deviation of a binomial distribution with probability of success equal to 1/2. Asterisks denote significant deviations from the unbiased binomial model; three asterisks indicate p < 0.001. Right column: we report the probability distribution of the distance between the true value of the parameter used to generate the distribution P and the parameter inferred by fitting against the true model.
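The error bars and significance asterisks used throughout these figures follow a simple recipe: under the null hypothesis that the log-likelihood ratio test selects RFIM or BP like an unbiased coin, the number of RFIM selections is Binomial(N, 1/2). A minimal sketch of this computation (function names are ours, not from the paper's code):

```python
from math import comb, sqrt

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all outcomes
    no more likely than the observed count k under Binomial(n, p)."""
    pmf = lambda i: comb(n, i) * p**i * (1 - p)**(n - i)
    pk = pmf(k)
    return sum(pmf(i) for i in range(n + 1) if pmf(i) <= pk + 1e-12)

def selection_summary(n_rfim, n_bp):
    """Fraction of RFIM selections, its error bar sigma/N with
    sigma = sqrt(0.25 N), and the significance asterisks of the figures."""
    n = n_rfim + n_bp
    frac = n_rfim / n
    err = sqrt(0.25 * n) / n
    pval = binom_two_sided_p(n_rfim, n)
    stars = "***" if pval < 0.001 else "**" if pval < 0.01 else "*" if pval < 0.1 else ""
    return frac, err, pval, stars
```

For instance, 80 RFIM selections out of 100 trials deviate strongly from the unbiased coin, whereas 52 out of 100 do not.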

Fig. 2A and 2B show the distributions of avalanche size and duration obtained by setting ∆ = ∆*. Fig. 2C shows the relation between size and duration. The collapse of curves relative to different data sets onto a single curve hints once more at processes belonging to the same universality class.
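The avalanches underlying these distributions are obtained by cutting each time series wherever two consecutive events are separated by more than the resolution ∆. A minimal sketch of this segmentation (the function name is ours):

```python
import numpy as np

def avalanches(timestamps, delta):
    """Split a series of event times into avalanches: consecutive events
    separated by at most delta belong to the same avalanche.
    Returns a list of (size, duration) pairs."""
    ts = np.sort(np.asarray(timestamps, dtype=float))
    breaks = np.where(np.diff(ts) > delta)[0] + 1
    return [(len(c), c[-1] - c[0]) for c in np.split(ts, breaks)]
```

From the resulting (size, duration) pairs one obtains the size and duration distributions and the average size of avalanches with given duration.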

Fig. 1. Universality of information propagation in online social media. A) In the main panel, we show the percolation strength as a function of the temporal resolution ∆. Different colors/symbols refer to different social media: Twitter (TWT), Telegram (TLG), Parler (PARL), Weibo (WEI), Stack Overflow (STCK), and Delicious (DEL). In the inset, we plot the same data as in the main panel, but with the horizontal axis rescaled as ∆ → ∆/∆*. B) In the main panel, we plot the susceptibility as a function of the time resolution for the same data as in A. The optimal resolution ∆* is identified as the location of the peak of the susceptibility. In the inset, …
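The identification of ∆* can be sketched as a scan over a grid of resolutions, picking the peak of the susceptibility. The definitions below assume the standard percolation-style choices (susceptibility as the second moment of cluster sizes excluding the largest cluster); the exact form of Eq. (S1) is not reproduced here, so treat this as an illustrative assumption rather than the paper's implementation:

```python
import numpy as np

def cluster_sizes(timestamps, delta):
    """Sizes of the avalanches obtained at temporal resolution delta."""
    ts = np.sort(np.asarray(timestamps, dtype=float))
    breaks = np.where(np.diff(ts) > delta)[0] + 1
    return np.array([len(c) for c in np.split(ts, breaks)])

def susceptibility(sizes):
    """Percolation-style susceptibility: second moment of cluster sizes,
    excluding the largest cluster (assumed form of Eq. (S1))."""
    rest = np.sort(sizes)[:-1]
    return (rest ** 2).sum() / rest.sum() if rest.sum() > 0 else 0.0

def optimal_resolution(timestamps, grid):
    """Delta* = the resolution at which the susceptibility peaks."""
    return max(grid, key=lambda d: susceptibility(cluster_sizes(timestamps, d)))
```

At very small ∆ every event is its own cluster and at very large ∆ a single cluster absorbs everything; the susceptibility peaks in between, at ∆*.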

Fig. 2. Universality and criticality of information propagation in social media. A) Avalanche size distribution. Different colors/symbols indicate data obtained from different social media. Acronyms are defined as in Fig. 1. In this panel, the full line stands for RFIM critical scaling; the dashed line denotes BP critical scaling. B) Distribution of avalanche duration for the same data as in panel A. To make the distributions collapse onto one another, duration is multiplied by the factor 1/∆* and probabilities are multiplied by the factor ∆*. C) Average size of avalanches with given duration. Data are the same as in A and B. The abscissa of each curve is rescaled by the factor 1/∆*. D) Maximum likelihood estimates of the exponents τ, α and γ; see SM G for details. We also display the ratio (α − 1)/(τ − 1). Error bars are always smaller than the size of the symbols. The dashed lines at τ = 2.25, α = 2.50 and γ = 1.20 correspond to the best fits to the data (full lines) of panels A, B and C, respectively.
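The maximum-likelihood exponents of panel D can be obtained with the standard continuous power-law estimator (in the Clauset–Shalizi–Newman style; for discrete avalanche sizes, x_min is typically replaced by x_min − 1/2). The hyperscaling relation γ = (α − 1)/(τ − 1) then serves as a consistency check. A sketch under these assumptions:

```python
import numpy as np

def powerlaw_exponent(samples, xmin):
    """Continuous power-law MLE: tau_hat = 1 + n / sum(log(x / xmin)),
    using only samples with x >= xmin."""
    x = np.asarray(samples, dtype=float)
    x = x[x >= xmin]
    return 1.0 + len(x) / np.log(x / xmin).sum()

def hyperscaling_gamma(tau, alpha):
    """Crackling-noise hyperscaling relation: gamma = (alpha - 1) / (tau - 1)."""
    return (alpha - 1.0) / (tau - 1.0)
```

With the best-fit values τ = 2.25 and α = 2.50 of panel D, the relation returns γ = 1.20, consistent with the value quoted for panel C.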

Figure S3: Construction of TS from seismological data. From left to right we report: the Japan catalog, the European catalog and the Californian catalog. Top row: spatial distribution of earthquakes in the three catalogs considered. Bottom row: histogram of the spatial distribution. Bins are squares of side 100 km.
extracted from a normal distribution with zero mean and variance R. The choice of the normal distribution is quite standard both for ferromagnets and for social systems (53). R is the control parameter of the model, which is critical for R = R_c = 2/π. Avalanche statistics still obey laws similar to those of Eqs. (S2). The functional form of the scaling functions differs from those of BP; also, their argument is given in terms of the distance from the critical point of the RFIM, i.e., the reduced parameter |n − n_c|/n_c is replaced by |R − R_c|/R_c. The values of the critical exponents are τ = 9/4 and α = 7/2 (19).
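For contrast with the RFIM, the BP benchmark is easy to simulate: at criticality (mean offspring number equal to one) avalanche sizes are power-law distributed with the mean-field exponent τ = 3/2. A minimal sketch, with the offspring distribution chosen for simplicity as 0 or 2 with equal probability (critical and in the same mean-field universality class); the function name and the size cap are our own choices:

```python
import random

def bp_avalanche_size(rng, p=0.5, cap=10**5):
    """Total progeny of one branching-process avalanche started from a
    single seed; each active node spawns 2 offspring with probability p.
    The process is critical for p = 0.5 (mean offspring = 2p = 1).
    Sizes are truncated at cap to keep the heavy-tailed runs finite."""
    active, size = 1, 0
    while active and size < cap:
        size += active
        active = sum(2 for _ in range(active) if rng.random() < p)
    return size

rng = random.Random(7)
sizes = [bp_avalanche_size(rng) for _ in range(2000)]
```

Plotting the complementary cumulative distribution of `sizes` on doubly logarithmic axes yields the slope 1 − τ = −1/2 characteristic of critical BP (up to the truncation at `cap`), to be contrasted with the RFIM exponents quoted above.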

Figure S4: Avalanches in systems other than social media. A) Distribution of avalanche size. Different colors/symbols refer to different systems: rat's hippocampal dissociated cultures (RHDC), mouse somatosensory organotypic slices (MSOS), macaque premotor cortex (MPC), earthquakes in Japan (JAP), California (CAL) and Europe (EUR). The full line stands for RFIM critical scaling; the dashed line denotes BP critical scaling. B) Distribution of avalanche duration for the same data as in panel A. Duration is rescaled by the factor 1/∆* and probabilities are rescaled by the factor ∆*. The dashed line denotes BP critical scaling. C) Average size of avalanches with given duration. Data are the same as in A and B. The abscissa of each curve is rescaled by 1/∆*. The dashed line denotes BP critical scaling. D) Maximum likelihood estimates of the exponents τ, α and γ; see SM G for details.

Figure S10: Robustness against the level of statistical significance. We set statistical significance equal to 0.05 (upper row) and 0.2 (lower row). Acronyms of the data sets are the same used in the main text. A) Probability that the log-likelihood ratio test favors RFIM over BP (blue), or vice versa BP over RFIM (red), using a threshold 0.05 over the p-values. Only TS that are sufficiently well fitted by both models are considered in the analysis, see panel B. Error bars represent σ/N, where N is the sample size and σ = √(0.25 N) is the standard deviation of a binomial distribution with probability of success equal to 1/2. Asterisks denote significant deviations from the unbiased binomial model; two asterisks indicate p < 0.01 and one asterisk stands for p < 0.1. B) We report the fraction of TS classified in the RFIM class (red), the fraction classified as BP (blue), the fraction classified as neither BP nor RFIM (purple), and the fraction that pass both statistical tests (yellow). In the latter case, the log-likelihood ratio test is required for model selection, see panel A. Here we set the threshold over the p-values to 0.05. C) Same as in panel A, but the threshold over the p-values is set to 0.2. D) Same as in panel B, but the threshold over the p-values is set to 0.2.

Figure S11: Robustness against the choice of S_min. Acronyms of the data sets are the same used in the main text. A) Probability that the log-likelihood ratio test favors RFIM over BP (blue), or vice versa BP over RFIM (red). Only TS that are sufficiently well fitted by both models are considered in the analysis, see panel B. Error bars represent σ/N, where N is the sample size and σ = √(0.25 N) is the standard deviation of a binomial distribution with probability of success equal to 1/2. Asterisks denote significant deviations from the unbiased binomial model; two asterisks indicate p < 0.01 and one asterisk stands for p < 0.1. B) We report the fraction of TS classified in the RFIM class (red), the fraction classified as BP (blue), the fraction classified as neither BP nor RFIM (purple), and the fraction that pass both statistical tests (yellow). In the latter case, the log-likelihood ratio test is required for model selection, see panel A. Here we use S_min = 27. C) Same as in panel A, but S_min = 95. D) Same as in panel B, but S_min = 95.

… for RFIM. Second, it evaluates the goodness of the individual fits via their p-values. Similarly to the prescription of Ref. (…), …

Each TS is a sequence of time stamps {t_1, t_2, . . .} marking when the (hash)tag is observed in the data. The Twitter data set is composed of 2,353,192,777 Tweets, corresponding to a 10% random sample of all Tweets posted on Twitter during the observation window Oct. 1 – Nov. 30, 2019. The collection of this data has been performed via the Indiana University OSoMe Decahose stream (59, 60). Telegram TS are extracted from a total of 317,224,715 messages, originally collected in Ref. (61). Parler TS are extracted from a total of 183,062,974 posts, originally collected in Ref. (62). Weibo TS are extracted from 226,841,249 posts, originally collected in Ref. (63). StackOverflow TS are extracted from a total number of 46,947,635 questions and answers. Delicious TS were extracted from 7,034,524 user actions, originally collected in Ref. (64). Timestamps always have the temporal resolution of the second, except for the StackOverflow data set, whose temporal resolution is the millisecond. Table 1 summarizes the properties of these data sets.

Table 2 reports information about the data sets as they result after reducing the temporal windows. The results shown in the main text and in the SM are all obtained from the analysis of data sets over the reduced observation windows.

Table S4: Summary table of the values of ∆* obtained by maximizing the susceptibility (S1).

Table S5: Summary table of the values of ∆* obtained by maximizing the susceptibility (S1).