Emergence of consensus as a modular-to-nested transition in communication dynamics

Online social networks have transformed the way in which humans communicate and interact, leading to a new information ecosystem where people send and receive information through multiple channels, including traditional communication media. Despite many attempts to characterize the structure and dynamics of these techno-social systems, little is known about fundamental aspects such as how collective attention arises and what determines the information life-cycle. Current approaches to these problems focus either on human temporal dynamics or on semiotic dynamics. In addition, as recently shown, information ecosystems are highly competitive, with humans and memes striving for scarce resources (visibility and attention, respectively). Inspired by similar problems in ecology, here we develop a methodology that allows us to cast all the previous aspects into a compact framework and to characterize, using microblogging data, information-driven systems as mutualistic networks. Our results show that collective attention around a topic is reached when the user-meme network self-adapts from a modular to a nested structure, which ultimately allows minimizing competition and attaining consensus. Beyond a sociological interpretation, we explore such resemblance to natural mutualistic communities via well-known dynamics of ecological systems.

width w, and adjacent snapshots have an overlap of φw. Such an overlapping scheme is a rather standard procedure when considering chunked, temporally resolved information, as it provides a smooth account of change in time.
The question remains how these datasets can be suitably represented. The most natural way to map user-hashtag interactions is through a bipartite graph of relations, which in turn corresponds to a rectangular presence-absence matrix $M_t = \{m_{uh}\}$, where $m_{uh} = 1$ if user u has posted a message containing hashtag h, and 0 otherwise (note that matrix $M_t$ corresponds to a block of the block off-diagonal binary matrix A in eq. (1) of the main text). Notably, this implies that only binary values are considered, i.e., the number of interactions between nodes u and h is not recorded. Besides, we note that results in the main text are not affected by the chosen window width (results there correspond to w = 12 hours and 3 days, with overlaps of 6 and 36 hours, respectively). See below for more details.
It is also important to highlight that the $M_t$ matrices may not contain the same nodes across t: as time advances, users join (disappear) as they start (cease) to show activity; the same applies to hashtags, which might or might not be in the focus of attention of users. This volatile situation is quite normal in time-resolved ecology field studies [33,35,53], where the accent is placed on the system's dynamics rather than on individual species. Moreover, the level of turnover in the sequence of data is very informative, as it characterizes how the system renews its structure over time (see the main text).
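The windowing and matrix construction described above can be sketched as follows. This is a minimal illustration, not the code used for the analysis: the function names and the (timestamp, user, hashtag) tuple layout are assumptions made for the example.

```python
def sliding_windows(t_start, t_end, width, overlap_fraction):
    """Return (start, end) windows of size `width`.

    Adjacent windows overlap by overlap_fraction * width (phi * w in the text).
    """
    if not 0 <= overlap_fraction < 1:
        raise ValueError("overlap_fraction must lie in [0, 1)")
    step = width * (1.0 - overlap_fraction)
    windows = []
    start = t_start
    while start < t_end:
        windows.append((start, start + width))
        start += step
    return windows


def presence_absence(posts, window):
    """Build the binary user-hashtag matrix for one window.

    posts: iterable of (timestamp, user, hashtag) tuples (hypothetical layout).
    Returns (users, hashtags, M), with M[u][h] = 1 if user u posted hashtag h
    at least once inside the window; repeated use is not counted.
    """
    lo, hi = window
    pairs = {(u, h) for (t, u, h) in posts if lo <= t < hi}
    users = sorted({u for u, _ in pairs})
    hashtags = sorted({h for _, h in pairs})
    row = {u: i for i, u in enumerate(users)}
    col = {h: j for j, h in enumerate(hashtags)}
    M = [[0] * len(hashtags) for _ in users]
    for u, h in pairs:
        M[row[u]][col[h]] = 1
    return users, hashtags, M
```

Because users and hashtags are collected per window, the node sets of consecutive matrices may differ, which is the turnover discussed above.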

A.2 Pruning the data
The large size of our two datasets (78,081 unique users and 22,376 unique hashtags in the 15M dataset, and 842,745 users plus 4,217,530 hashtags in the UK dataset) hampers the data processing and makes the calculations time-consuming. We must therefore apply some restrictions to the number of users and hashtags considered in the network.
We apply a rather straightforward criterion, by which we prune the least active users in the data. This means that only top contributors (and their associated hashtags) show up in the matrices that we study. In doing so, we guarantee that the whole approach makes sense: only by including the most active users can we make sure that both generalists and specialists will show up, which is required if any nested patterns are to be found. The probability of obtaining a connected matrix is also higher. Again, we acknowledge that ours is an arbitrary decision. To provide solid evidence, we have tried several matrix sizes.
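The pruning criterion can be illustrated with a short Python sketch. It assumes that "activity" is measured as the number of posts per user, which the text does not state explicitly; the function names and the (timestamp, user, hashtag) layout are hypothetical.

```python
from collections import Counter


def top_active_users(posts, n_users):
    """Return the n_users most active users (assumption: activity = post count)."""
    activity = Counter(u for (_, u, _) in posts)
    # Sort by decreasing activity; ties are broken by user name for determinism.
    ranked = sorted(activity.items(), key=lambda kv: (-kv[1], kv[0]))
    return [u for u, _ in ranked[:n_users]]


def prune(posts, n_users):
    """Keep only the posts of the top contributors (and hence their hashtags)."""
    keep = set(top_active_users(posts, n_users))
    return [p for p in posts if p[1] in keep]
```

Repeating the analysis for several values of `n_users` corresponds to the matrix-size robustness check mentioned above.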
Spanish dataset. Whereas results reported in the main text are based on the 1,024 most active users, we have also tested smaller sets with qualitatively the same results (see Figure S1). In this figure, we represent the standardized results for both nestedness (left) and modularity (right). Both magnitudes are described in detail later (sections B and C). Three dates are considered at different moments of the 15M movement: May 12th, three days before the main camps took place, at the onset of the protests; May 15th itself; and May 19th, when the maximum nestedness is achieved and protests are considered to have reached high levels of visibility. Nestedness curves show a tendency to saturate for large numbers of selected users. This flattening is reached at lower values for earlier dates, whereas the curve is still far from saturation on May 19th. In view of these results, we can safely conclude that our pruning procedure, i.e., the restriction to the most active users, does not give rise to a misleading claim about the nested structure organized around the movement formation. To avoid extrapolation, we can state from Fig. S1 that, if we build a nestedness time series and a modularity time series (admittedly both quite poor: only 3 points each), 19M shows a maximum nestedness-wise, regardless of the size one wishes to pick, and a minimum modularity-wise, regardless of the size one wishes to pick (except for very small sizes, n < 150). In summary, for any size reported in Fig. S1, $z_\lambda^{12M} < z_\lambda^{15M} < z_\lambda^{19M}$ and $z_Q^{12M} > z_Q^{15M} > z_Q^{19M}$, which would render perfect anti-correlation (that is, a stronger result than the one reported in the main text).
UK dataset. For this dataset, the filter is applied in a slightly different way: the cutoff is applied to both users and hashtags, by choosing the 512 most active users and the 512 most-used hashtags. The reason for the additional constraint on hashtags, and for the smaller number of nodes considered, is the large number of hashtags used in this dataset: 1,024 users can generate from 2,245 to 13,113 hashtags, depending on the observation time window. Some technical details about the observation period, number of users and hashtags, and time-window widths can be found in Table S1.
As for how we build bipartite networks for the UK dataset, different possibilities arise. We could randomly select a subset of the users and hashtags involved in the network, but in this way we might miss the relevant agents thought to play a major role in the nestedness of the whole system. Besides, a random selection could lead to empty matrices (none of the selected users tweeted any of the selected hashtags). We must, nevertheless, remark that this situation is highly unlikely for the 15M event, as a result of the very nature of the dataset: only people and hashtags related to this particular topic were extracted from Twitter. We will make use of this method as a way to compare the nestedness levels in the 15M with a topic null model, built from the UK dataset.
Figure S1: Robustness against matrix size. For the w = 12h set some days have been selected. We perform the null model analysis for different cutoffs in the number of users (x axis), and show how the standardised leading eigenvalue (left) and the standardised modularity (right) evolve. The end at ∼700 users for the curve corresponding to D = 12M indicates that the largest possible matrix has been reached, i.e., there are no more active users on that particular day. These results not only guarantee that our conclusions about the nested structure around the 15M are robust, but also show that the observed peak would be more prominent if we considered the real matrix including all the users and hashtags.
Table S1: Datasets summary details. The date range and the total number of users and hashtags are displayed. We also indicate the cutoff in the number of users and hashtags (if any) that has been applied. An unspecified hashtag filter indicates that the hashtag set is determined by the set of selected users. Users are filtered by activity and hashtags by usage. We also show the final number of users and hashtags after the selection process. Finally, the window width and overlap between consecutive windows are also displayed.

B. Nestedness in online social networks
Robustness across metrics. Several studies have focused on quantifying nestedness, the first proposals being made by Hultén [49], Darlington [46] and Daubenmire [47] to describe patterns in which the species found at species-poor sites are proper subsets of those present at species-rich sites. Nestedness analysis has become very popular among ecologists and, although the concept is widely accepted, it has not been formally defined, leading to several distinct metrics [29,30,45]. In this work (main text), we adopt a definition numerically confirmed by Staniczenko et al. [29], where nestedness is given by the maximum eigenvalue of the network's adjacency matrix. This metric is based on a theorem regarding chain graphs first provided by Bell et al. [27,28], which shows that, among all the connected bipartite graphs with N nodes and E edges, a perfectly nested graph has the largest spectral radius. The method is advantageous over other possibilities due to the invariance of eigenvalues under matrix permutations, and the remarkably low computation time required to perform eigenvalue calculations, even for large matrices. This is an important detail, given that z-scores for nestedness are obtained against $10^4$ random realizations.
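As an illustration of the spectral measure, the sketch below computes the leading eigenvalue of the bipartite adjacency matrix built from a presence-absence matrix M. It relies on the fact that the spectral radius of the block matrix [[0, M], [M^T, 0]] equals the largest singular value of M, and uses plain power iteration on M M^T; the function name and the choice of power iteration are ours, not the implementation used for the reported results.

```python
import math


def spectral_radius(M, iterations=1000, tol=1e-12):
    """Leading eigenvalue of the bipartite adjacency matrix [[0, M], [M^T, 0]].

    It equals the largest singular value of M, i.e. the square root of the
    leading eigenvalue of M M^T, estimated here by power iteration.
    """
    n, m = len(M), len(M[0])
    x = [1.0 / math.sqrt(n)] * n          # unit-norm, strictly positive start
    lam = 0.0
    for _ in range(iterations):
        # y = M^T x, then z = M y, so that z = (M M^T) x
        y = [sum(M[i][j] * x[i] for i in range(n)) for j in range(m)]
        z = [sum(M[i][j] * y[j] for j in range(m)) for i in range(n)]
        new_lam = math.sqrt(sum(v * v for v in z))   # ||M M^T x|| with ||x|| = 1
        if new_lam == 0.0:
            return 0.0                    # empty matrix
        x = [v / new_lam for v in z]
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    return math.sqrt(lam)
```

For instance, a perfectly nested 3 × 3 matrix with 6 links yields a larger value than a 6-cycle with the same number of links, in line with the theorem quoted above.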
Nevertheless, we have checked the validity of our results against the improved metric NODF, defined by Almeida-Neto et al. [30]. This measure is based on two simple properties: decreasing fill (DF) and paired overlap (PO). Assuming that row (column) i is located at an upper position in the sorted presence-absence matrix with respect to row (column) j, the decreasing fill condition imposes that a pair of rows (columns) can only contribute to the nestedness if the marginal total (the number of interactions a row or column has) of row (column) i is greater than or equal to the marginal total of row (column) j. In this case, the paired nestedness, $N_{ij}$, is equal to the paired overlap $PO_{ij}$, i.e., the number of shared interactions between rows (columns) i and j. The metric can be summarized as:
$NODF = \frac{\sum_{i<j} N_{ij}}{n(n-1)/2 + m(m-1)/2}$,
where the sum runs over all pairs of rows and all pairs of columns of the n × m matrix.
Both metrics are compared in Figure S2. The standardized value of the leading eigenvalue (x-axis) is displayed against the standardized NODF measure. Matrices involved in the plot correspond to graphs at the distinct snapshots with time-window w = 1d. These results are displayed along with the Pearson and Spearman coefficients and their p-values, showing a good linear correlation with p-values $p < 10^{-3}$.
Figure S2: Comparison against nestedness metrics. For every matrix from the set of time windows with w = 12h, the standardised leading eigenvalue, $z_\lambda$, and the standardised NODF metric, $z_{NODF}$, are computed. There is a good agreement between both metrics, as the Pearson coefficient, r, and the Spearman coefficient, $r_s$, show along with their p-values.
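A minimal sketch of a NODF computation is given below for illustration. It follows the convention of Almeida-Neto et al. as we understand it: a pair contributes only when the marginal totals are strictly decreasing, and the paired overlap is expressed as a percentage of the smaller marginal total. This differs slightly from the wording above ("greater than or equal", overlap as a count), so the code should be read as one possible variant; the function names are hypothetical.

```python
def _paired_sum(rows):
    """Sum of paired nestedness values (in percent) over all pairs of rows."""
    sets = [frozenset(j for j, v in enumerate(r) if v) for r in rows]
    total = 0.0
    for a in range(len(sets)):
        for b in range(a + 1, len(sets)):
            big, small = sets[a], sets[b]
            if len(big) < len(small):
                big, small = small, big
            # Decreasing fill: equal marginal totals or an empty row contribute 0.
            if len(big) == len(small) or not small:
                continue
            # Paired overlap: share of the smaller row's links found in the larger.
            total += 100.0 * len(big & small) / len(small)
    return total


def nodf(M):
    """NODF of a binary matrix M (list of equal-length 0/1 rows), in [0, 100]."""
    n, m = len(M), len(M[0])
    columns = [list(c) for c in zip(*M)]
    n_pairs = n * (n - 1) / 2 + m * (m - 1) / 2
    if n_pairs == 0:
        return 0.0
    return (_paired_sum(M) + _paired_sum(columns)) / n_pairs
```

Comparing marginal totals pair by pair makes an explicit sorting of rows and columns unnecessary, since the ordering only determines which member of each pair plays the role of the upper row.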
Robustness across significance tests. The fact that real matrices are usually far from being perfectly nested calls for a test of the significance of nestedness values. Such a test requires the implementation of a null model and the computation of standardized results and, additionally, allows one to compare matrices of distinct sizes, a comparison that would be impossible otherwise. Regarding modularity, the metric already includes a null model in its very definition, in such a way that the modularity obtained is already a comparison with a randomized counterpart of the network.
Different null models may be proposed. For example, one could think of a null model rewiring the set of links present in the network. A strict application of such scheme would not maintain the bipartite structure of the network, and for that reason it should be avoided.
Within this restriction we can still think of some variations. Here we explore two different possibilities, as discussed in [41]. In null model I, the number of links in the network is preserved, but links are placed at random within the matrix, so that each link still connects a user with a hashtag. The degree sequence is therefore not preserved. Null model II is a probabilistic null model where an interaction between hashtag h and user u is established with probability proportional to their connectivity,
$p_{uh} = \frac{1}{2}\left(\frac{k_u}{m} + \frac{k_h}{n}\right)$.
In the above expression, n stands for the total number of users, i.e., the first dimension of $M_{uh}$, and m for the number of hashtags, equal to the second dimension of $M_{uh}$; $k_u$ and $k_h$ correspond to the degree of user u and hashtag h, respectively. This model maintains the number of interactions per class only approximately, i.e., it probabilistically maintains the observed total number of interactions.
We can go further and consider an X-swap scheme, null model III, in which a rewiring of the edges is applied while keeping the degree sequence of the nodes in the system constant. This null model, however, is too restrictive, and gives a small number of possible configurations, especially for those matrices having few non-empty cells. We must consider null models that balance the number of possible configurations against strictness. For this reason we choose to discard null model III, and apply the probabilistic null model II, which is the stricter of models I and II. Figure S3 reports the consistency of the results for the nestedness using either of the chosen null models. Z-scores have been calculated over 10,000 randomizations.
Figure S3: Robustness against statistical null models for nestedness. Unsurprisingly, results for null model I (much less restrictive) yield extremely high z-scores, as opposed to the comparatively moderate results from null model II (note the y = x line as a visual aid). Despite these differences, both models are highly consistent at quantifying the level of significance for the nestedness.
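The probabilistic null model II and the corresponding z-score can be sketched as follows. The sketch assumes the commonly used form of this null model, in which cell (u, h) is filled with probability (k_u/m + k_h/n)/2; function names, the seed handling and the use of the population standard deviation are choices made for the example.

```python
import random
import statistics


def null_model_2_probabilities(M):
    """Cell probabilities p_uh = (k_u / m + k_h / n) / 2 (assumed form)."""
    n, m = len(M), len(M[0])
    k_u = [sum(row) for row in M]
    k_h = [sum(M[u][h] for u in range(n)) for h in range(m)]
    return [[0.5 * (k_u[u] / m + k_h[h] / n) for h in range(m)] for u in range(n)]


def sample_null_model_2(M, rng):
    """Draw one random binary matrix from null model II."""
    P = null_model_2_probabilities(M)
    return [[1 if rng.random() < p else 0 for p in row] for row in P]


def z_score(M, statistic, realizations=100, seed=0):
    """Standardise statistic(M) against an ensemble of null-model matrices."""
    rng = random.Random(seed)
    values = [statistic(sample_null_model_2(M, rng)) for _ in range(realizations)]
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return float("nan")   # degenerate ensemble: z-score undefined
    return (statistic(M) - mu) / sigma
```

Here `statistic` is any function of a binary matrix, for example a nestedness or modularity measure; the number of realizations is a parameter (the text uses 10,000 for nestedness and 100 for modularity).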

C. Modularity
Modularity was originally proposed as a metric for community detection in networks by Newman and Girvan [51] aiming at identifying the mesoscale organization of networks, which reveals many hidden features invisible from a global perspective of the network; informally, modularity (typically labeled Q) relies on the detection of densely connected subgraphs: it quantifies the extent to which nodes in a network tend to cluster together, in comparison to the expected distribution of a random counterpart (null model).
One of the interesting aspects of Q is its reliance on the concept of null model, which can be taken as the baseline against which optimization makes sense. This has allowed the original formulation by Newman to be extended to other scenarios, namely directed, weighted or signed networks (if we pay attention to the features of the links), and bipartite and multiplex networks, beyond the (more common) unipartite networks. The general layout of Q is
$Q = \frac{1}{2L} \sum_{ij} \left(A_{ij} - P_{ij}\right) \delta(g_i, g_j)$,
where L is the total number of links, $g_i$ represents the module node i belongs to, $A_{ij}$ is the real adjacency matrix of the network, and $P_{ij}$ are the probabilities that an edge linking nodes i and j exists in the null model. The key point is to define, in this equation, a suitable, adapted null model to confront the real connectivity patterns (as in the mentioned cases). The issue is controversial because, even within a type of network, different possible null models can be defined. In the bipartite scheme we find two main proposals. We have chosen to work with Barber's definition of modularity for bipartite networks [26] (see also the main text), ruling out the proposal by Guimerà et al. [48]. In Guimerà's proposal, modules are forced to be strictly "pure" in class, that is, a module can only contain nodes of a single class. This method is thus almost equivalent to optimizing modularity on the projected unipartite network, which collapses the information in the original bipartite network onto one of its classes (see [48] for the details). On the contrary, Barber's definition naturally incorporates combined (or mixed) modules, formed by nodes from both classes.
The choice of one or another definition is a matter of the problem one intends to solve. Indeed, it may not make a lot of sense to define movie-actor mixed modules, because the semantics of such a module is not very clear. In other problems, however, it may be more convenient to allow for mixed modules. This is often the case in ecology (as for instance in [41]), because it is more interesting to identify modules that have a precise biological meaning as potential co-evolutionary groups [54] or as cores of mutualistic networks [9]. As we are also, in our user-meme systems, more interested in this co-evolutionary perspective, we have taken Barber's approach.
We have applied this metric using the software provided by Marquitti et al. [50], where the simulated annealing method [27] is used to maximize Barber's modularity. The statistical significance of the results is checked by computing the z-score of the original network modularity against the average and standard deviation of 100 random realizations (null model II, as for nestedness; see above).
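For reference, the value of Barber's bipartite modularity for a given assignment of users and hashtags to modules can be evaluated as in the sketch below. It assumes the standard form Q = (1/L) Σ_uh (M_uh − k_u k_h / L) δ(g_u, g_h), with L the number of links, and only evaluates Q for a fixed partition; the maximization over partitions (simulated annealing in the software cited above) is not reproduced. The function name and argument layout are hypothetical.

```python
def barber_modularity(M, user_module, hashtag_module):
    """Barber's bipartite modularity of a given partition (assumed standard form).

    M: binary user x hashtag matrix.
    user_module[u], hashtag_module[h]: module labels of each node.
    """
    n, m = len(M), len(M[0])
    L = sum(sum(row) for row in M)       # total number of links
    if L == 0:
        return 0.0
    k_u = [sum(row) for row in M]
    k_h = [sum(M[u][h] for u in range(n)) for h in range(m)]
    q = 0.0
    for u in range(n):
        for h in range(m):
            if user_module[u] == hashtag_module[h]:
                # Observed link minus its expectation in the null model
                q += M[u][h] - k_u[u] * k_h[h] / L
    return q / L
```

Modules may mix users and hashtags, which is the feature that motivates the choice of Barber's definition in the text.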

D. Nestedness and Modularity: further considerations
Robustness across window widths. Beyond assessing the robustness of the results for the nestedness values (regarding the used metrics and null models), we also need to test for robustness against the (admittedly arbitrary) choice of a window-width. This applies both to the soundness of the results in nestedness and modularity.
In Figure S4 we report results for modularity and nestedness (both in their standardised version) for every width w we have tested. Upon inspection, it is clear that results are noisier the narrower the window is; the regularity of the peaks suggests that the measures are sensitive to circadian rhythms (periodic temporal patterns). For windows aggregating the activity of one day or more, such periodic variations disappear. Remarkably, nestedness (lower panel) shows the same trend for every window width. In contrast, the trajectory of the modularity z-scores is coherent up to w = 1 day, but it is blurred out for w = 3 days. These results (together with those obtained for the UK dataset) suggest that events have their very own characteristic timescale [55], and observed trends are valid only within a relatively precise range.
Figure S4: Robustness against window width. Standardised nestedness, $z_\lambda$, and modularity, $z_Q$, values are displayed for window sizes 6h, 12h, 24h and 72h.
We observe the same robustness in the UK datasets for a window width of w = 2 hours (as opposed to w = 1 hour reported in the main text). Given the fast time scale of the event (it fully develops in less than two days), wider window schemes blur the results.
Ruling out epiphenomenal conclusions. Both in the main text and throughout this document we have provided solid evidence that, in an information ecosystem such as Twitter, topics arise in a nested scheme out of an initially modular structure. One may argue, however, that this striking outcome may be artificial in different senses. First, it is possible that the modular-to-nested transition occurs out of a "topological artifact", namely, that the network starts as a broken set of small components (thus being trivially modular) and undergoes a percolation process such that nestedness is possible from then on. In Figure S5 the size of the giant component of the system is plotted as it evolves in time. Such component is always above 0.78N, and as such a percolation transition is never observed. Notably, the y-axis is labeled from 0.75 and above, which implies that, for the whole time range (over a month), the system does not undergo any abrupt percolation process. The figure corresponds to a window width w = 6 hours (the noisiest and sparsest one).
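The check on the giant component can be reproduced with a simple union-find pass over the presence-absence matrix, as in the following sketch (the function name is ours; the fraction is taken over all users and hashtags present in the matrix).

```python
def giant_component_fraction(M):
    """Fraction of nodes (users + hashtags) in the largest connected component."""
    n, m = len(M), len(M[0])
    parent = list(range(n + m))      # users are 0..n-1, hashtags are n..n+m-1

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u in range(n):
        for h in range(m):
            if M[u][h]:
                parent[find(u)] = find(n + h)   # merge the two components

    sizes = {}
    for x in range(n + m):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / (n + m)
```

Applied to each snapshot, a value that stays close to 1 over time rules out a percolation-like transition as the origin of the observed pattern.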
A second consideration concerns our disregard of weighted values. Indeed, we have focused on binary presence-absence matrices, in an effort to follow the ecosystems literature. Additionally, NODF does not have, to our knowledge, a weighted equivalent, so comparison is properly done only with a binary representation of the system.
Admittedly, this represents a loss of information, which could potentially affect the results. We are aware that the spectral radius approach to nestedness does allow for weights to be present in the interaction matrix. For the sake of completeness, we have also measured nestedness considering weights, which stand for the frequency with which an individual used a certain hashtag within a given time window. The result can be seen in Figure S6, where the growing pattern follows precisely the trends reported in the main text (Figure 2) and here (Figure S4).
Figure S6: Nestedness evolution as measured from weighted matrices, i.e., matrices which encode the absolute usage counts of hashtags by the corresponding Twitter users. The figure corresponds to a window width w = 3, and delivers the same growing patterns as its binary counterpart (time in the x-axis is expressed here in minutes since the origin of our data; here, t = 30000 is roughly equivalent to May 15th).

E. Mutualistic Dynamical Model
According to the dynamical framework from Bastolla et al. [36], the evolution of a mutualistic ecosystem can be modeled through a set of N differential equations, one per species, in which the Holling term h imposes a limit to the mutualistic growth rate, avoiding divergences in the case of large populations. The equations for the system's dynamics are described in the Materials and Methods section of the main text.

E.1. Synthetic topologies
Regarding topologies, we built ad hoc two ensembles of synthetic networks for different network sizes. For each pair of ensembles (equal size, equal link density), the networks of the first ensemble present an almost perfectly nested architecture, while the networks of the second one exhibit an almost perfectly modular structure. All the networks of a given size N were built with the same number of users and hashtags, n = m. Nested networks were constructed starting from a perfectly nested structure, with connectivity distribution $k_u = u$, u = 1, 2, ...; $k_h = h$, h = 1, 2, ..., and subsequently randomizing each link with probability p = 0.02. This method provides networks with an almost perfectly nested topology. Following the procedure by Newman [51], modular networks were constructed starting from a perfectly modular structure consisting of 5 disconnected blocks (cliques) of equal size $N_i = N/5$, and subsequently randomly connecting pairs to reach the connectivity $\langle k \rangle = N/4$. The number of blocks (5) and the rewiring probability p = 0.02 were arbitrarily chosen. Nevertheless, the results are robust against variations of these values, as shown in Figure S7.
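The construction of the two starting topologies, and of the link randomization, can be sketched as below. This is an illustration under our own assumptions: "randomizing a link" is interpreted as moving it to a uniformly chosen empty cell, and the step that adds random links to the modular ensemble until the target mean degree is reached is not reproduced. All function names are hypothetical.

```python
import random


def perfectly_nested(n):
    """n x n binary matrix whose row degrees are n, n-1, ..., 1 (triangular)."""
    return [[1 if u + h < n else 0 for h in range(n)] for u in range(n)]


def block_modular(n, blocks=5):
    """n x n binary matrix made of `blocks` disconnected, fully connected blocks."""
    if n % blocks:
        raise ValueError("n must be a multiple of the number of blocks")
    size = n // blocks
    return [[1 if u // size == h // size else 0 for h in range(n)] for u in range(n)]


def randomize_links(M, p, rng):
    """Move each link, with probability p, to a uniformly chosen empty cell.

    Assumption: this is one possible reading of 'randomizing each link with
    probability p'; the total number of links is preserved.
    """
    A = [row[:] for row in M]
    n, m = len(A), len(A[0])
    links = [(u, h) for u in range(n) for h in range(m) if A[u][h]]
    empty = [(u, h) for u in range(n) for h in range(m) if not A[u][h]]
    for (u, h) in links:
        if empty and rng.random() < p:
            idx = rng.randrange(len(empty))
            v, g = empty[idx]
            A[u][h], A[v][g] = 0, 1
            empty[idx] = (u, h)      # the vacated cell becomes available
    return A
```

A call such as `randomize_links(perfectly_nested(50), 0.02, random.Random(0))` then gives one member of an almost perfectly nested ensemble.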

E.2. Realizations
In each realization, we used a different network of the corresponding ensemble, and assigned different random initial frequencies to each user and hashtag in the interval $s_{u,h}(t = 0) \in (0, 1)$, and different growth rates in the interval $\alpha_{u,h} \in (0.85, 1.1)$. We ran the dynamics defined by equations (4) and (6) of the main text and, once the stationary state was reached, we computed the survival rate by counting the number of users and hashtags with frequency $s_{u,h} > 0$ and dividing by their initial number N. Accordingly, the survival area stands for the region of the parameter space with a survival rate greater than a given value. We performed 1000 different realizations for each point of the parameter space β × γ and for each size and topology. According to standard procedures in biology (see, e.g., [36,54]), the inter-specific term was fixed to ρ = 0.2 and the Holling term was set to λ = 0.1. The values for the competition and mutualistic terms covered the range $\beta, \gamma \in [0, 1]$ in steps of $\delta_\beta = \delta_\gamma = 0.05$ (from weak to strong mutualism regimes).
Results of these extensive simulations are shown in Figure 3 of the main text, where left panels represent the survival rate (i.e., the diversity in the stationary state) as a function of the competitive and mutualistic terms β and γ, for a system size of N = 1000. Right panels of Figure 3 of the main text represent the normalized area of the parameter space β × γ that exhibits a survival rate equal to or higher than the corresponding value on the x-axis, for different network sizes in different panels: N = 50, 100, 300, 1000. As discussed in the main text, a large area of the parameter space exhibits high persistence for the nested architecture where persistence is low for the modular one, but never the opposite. Moreover, for the modular architecture, the area of the space β × γ with a given persistence decreases sharply with the network size, while for the nested architecture this dependence is weaker. This effect saturates for large values of the network size, that is, once the size N ≈ 500 is reached, the size of the network no longer has a noticeable effect on persistence. Figures S8 and S9 complement, respectively, the left and right panels of Figure 3 of the main text. Figure S8 represents the persistence (i.e., diversity of memes and hashtags in the stationary state) for each value of β and γ, for nested and modular networks of 100, 500, and 1000 nodes. Additionally, Figure S9 represents the normalized area of the surface (β × γ) that exhibits a persistence equal to or higher than a given value as a function of that value, for nested and modular networks. In the above results (Figure 3 of the main text and Figures S6-S9), the interval $\alpha_{u,h} \in (0.85, 1.1)$ has been taken according to the biological literature (see, e.g., [36]). Nevertheless, the main result (nested architectures out-survive modular ones) holds for a wider interval of $\alpha_{u,h}$, as shown in Figure S11 for $\alpha_{u,h} \in (0.5, 1.5)$.
The value ρ = 0.2 in the simulations from Figure 3 is taken from the ecological literature, where the intra-species competitive term (1 − ρ) is usually considered to be greater than the inter-species term (ρ), as members of the same species are competing for the same resources. For the sake of completeness, we have also studied the case N = 1000, ρ = 0.6, which in our study corresponds to an inter-users stress greater than the intra-users stress. Figure S10 shows that, when the inter-users stress exceeds the intra-users stress, the main feature observed in the dynamics remains intact, namely, modular networks exhibit poor survival, while nested networks show levels equal to or higher than those of the modular structure in any given region.

F. Core-periphery structure
Meso-scale structures in networks have received considerable attention in recent years, as the detection of these intermediate-scale patterns can reveal important characteristics that are hidden at both local and global scales. Among the wide diversity of methods aiming at the detection of such structures, community detection methods have become very popular and successful. In this section we focus our attention on a different type of meso-scale structure, known as the core-periphery structure, which helps one to visualize which nodes of the graph belong to a densely connected component or core, and which of them are part of the network's sparsely connected periphery. Nodes belonging to the core should be relatively well connected to other nodes in the network, either central or peripheral, whereas nodes in the periphery should be poorly connected with the core and disconnected from the rest of the periphery. According to this intuitive notion, many methods have been proposed. We follow here a method developed by Della Rossa et al. [38], based on the profile derived from a standard random walk model. It can be obtained in a very general framework and is applied here to undirected unweighted networks.
Let $w_{ij} = w_{ji}$ be the weight (equal to 1) of the link between nodes i and j in our network of size N. At each time step, the probability that the random walker at node i jumps to node j is given by
$m_{ij} = \frac{w_{ij}}{\sum_h w_{ih}} = \frac{w_{ij}}{k_i}$, (5)
where $k_i$ is the degree of i. The asymptotic probability of visiting node i has the closed form
$\pi_i = \frac{k_i}{\sum_j k_j}$. (6)
The method starts by randomly selecting a node i among those with the weakest connectivity, and assigning $\alpha_i = 0$. $P_k$, the set of nodes already assigned at step k, is then initialized with i, $P_1 = \{i\}$. For the following steps, k = 2, 3, ..., N, the node j attaining the minimum in
$\alpha_k = \min_{j \notin P_{k-1}} \frac{\sum_{i,h \in P_{k-1} \cup \{j\}} \pi_i m_{ih}}{\sum_{i \in P_{k-1} \cup \{j\}} \pi_i}$ (7)
is selected. If it is not unique, a randomly chosen node among them, l, is selected, and $P_k = P_{k-1} \cup \{l\}$. Although the algorithm involves some randomness, it has been verified that its effect on the analysis of real-world networks is negligible. The core-periphery profile is then the set $\{\alpha_k\}$, with $0 \le \alpha_k \le 1$, where $\alpha_k = 0$ for nodes belonging to the periphery and $\alpha_k > 0$ for nodes in the core.
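The greedy construction of the core-periphery profile can be sketched as follows for undirected, unweighted graphs. The sketch uses the fact that, in this case, the persistence probability of a node set P reduces to the number of internal link endpoints divided by the sum of the degrees in P (the normalization of the stationary distribution cancels out). The adjacency format (a dictionary mapping each node to the set of its neighbours), the function name and the tie tolerance are assumptions of the example.

```python
import random


def core_periphery_profile(adj, rng=None):
    """Greedy core-periphery profile for an undirected, unweighted graph.

    adj: dict mapping each node to the set of its neighbours (no self-loops).
    Returns a dict node -> alpha value assigned when the node was added.
    """
    rng = rng or random.Random(0)
    degree = {v: len(nb) for v, nb in adj.items()}
    min_deg = min(degree.values())
    start = rng.choice(sorted(v for v in adj if degree[v] == min_deg))
    assigned = {start}
    alpha = {start: 0.0}
    internal = 0               # sum of a_ij over i, j in P (edges counted twice)
    volume = degree[start]     # sum of degrees over P
    while len(assigned) < len(adj):
        best, best_alpha = [], None
        for v in adj:
            if v in assigned:
                continue
            links_in = len(adj[v] & assigned)
            denom = volume + degree[v]
            a = (internal + 2 * links_in) / denom if denom else 0.0
            if best_alpha is None or a < best_alpha - 1e-12:
                best, best_alpha = [v], a
            elif abs(a - best_alpha) <= 1e-12:
                best.append(v)           # tie: resolved at random below
        chosen = rng.choice(sorted(best))
        internal += 2 * len(adj[chosen] & assigned)
        volume += degree[chosen]
        assigned.add(chosen)
        alpha[chosen] = best_alpha
    return alpha
```

For a star graph, for example, the leaves receive α = 0 (periphery) and the hub, added last, receives α = 1 (core). For the bipartite networks studied here, `adj` would be built from the presence-absence matrix, with users and hashtags as nodes.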
As the goal of this section is to identify the possible formation of a core during the days preceding the 15M, a distance metric should be defined. As a first approach we consider the distance between two core-periphery structures as the product between the two $\{\alpha_k\}$ sequences,
$\alpha^{t} \cdot \alpha^{t'} = \sum_k \alpha^{t}_k \, \alpha^{t'}_k$. (8)
Notice that the α "vectors" do not necessarily share the same coordinates, that is, it may happen that a given node (by nodes we now refer to users or hashtags indifferently) present in one snapshot does not appear in the other because it was not part of the network at that time. Whenever this is the case, we consider its contribution to the dot product to be zero (i.e., as if it were at the periphery). On the other hand, we normalize the above expression in order to get a bounded value; the normalized quantity is the expression used in the main text and labeled as $D_{RC}$.
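The comparison between two core-periphery profiles can be illustrated with the sketch below. The dot product treats nodes missing from one profile as peripheral (α = 0), as described above. The normalization shown is a cosine-style one, which is an assumption on our part: it yields a value bounded in [0, 1], but it is not necessarily the exact normalization used for $D_{RC}$ in the main text. Function names are hypothetical.

```python
import math


def core_overlap(alpha_a, alpha_b):
    """Dot product of two profiles; nodes missing from one profile count as 0."""
    return sum(a * alpha_b.get(node, 0.0) for node, a in alpha_a.items())


def normalized_core_overlap(alpha_a, alpha_b):
    """Cosine-style normalisation (an assumption, see the lead-in)."""
    norm = math.sqrt(core_overlap(alpha_a, alpha_a) * core_overlap(alpha_b, alpha_b))
    if norm == 0:
        return 0.0                      # at least one profile has no core
    return core_overlap(alpha_a, alpha_b) / norm
```

Each profile is a dictionary mapping a node (user or hashtag) to its α value in a given snapshot, so that profiles from different time windows can be compared directly even when their node sets differ.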
In the main text we have discussed the conformation of a relatively stable core of hashtags around the 15M day, in contrast to a high turnover of users coming to and leaving the core at different snapshots.
Here, we scrutinize this finding further by ruling out the possibility that it could be due, for example, to the fact that the set of users in the core could be similar over the distinct time-windows and change abruptly at the reference point under consideration. To this aim, we additionally measured the distance of a given core $C_t$ from the previous core $C_{t-1}$, i.e., the core present in the previous time-stamp. Results in Figure S12 reject this conjecture: the turnover of users is still high (the distance is small) when the core is compared with that in the preceding graph, suggesting that users are actually entering and exiting the key positions in the network. In contrast, hashtags remain relatively constant at high distances, indicating that the core near the 15M is formed smoothly; the exception to this is a sharp decrease around the 15M if observed at a 3-day window resolution. The reason for this behavior is the takeover of new hashtags (with respect to the ones that originated the protest), which pushes the original ones away from the core of the structure. The best example of this is the hashtag #democraciarealya ("real democracy now"), which is placed at the core of the bipartite network for a long period of time, but whose leading role is taken over by the more generic (and "cheap" from a microblogging perspective) #15m from the days immediately preceding May 15 onwards.
Figure S12: The figure emulates the results in Figure 3 of the main text; however, distances are computed between each time-stamped core and the former configuration, $C_t$ vs. $C_{t-1}$. As in the main text, the figure illustrates that hashtags build a more stable core in comparison to users.

G. Anti-correlation between nestedness and modularity
Further details on the reported Q-nestedness correlated/anti-correlated patterns can be seen in Table S2 and Figure S13.
Table S2: Pearson correlation coefficient values for nestedness and modularity, along with their p-values, for both datasets and available window widths. In the case of 15M, we report Pearson correlations before and after the climax of the event (though the exact moment at which modularity abruptly collapses is slightly different in each case, see panels in Figure S13). Note that correlations fail to be significant (that is, p > 0.05) for w = 3 days, as the results for modularity are blurred compared to the observed pattern in 6h ≤ w ≤ 24h.
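For completeness, the Pearson coefficient between two time series (for instance, the standardised nestedness and modularity values over consecutive windows) can be computed as in this minimal sketch. The function name is ours, and p-values are not computed here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return s_xy / math.sqrt(s_xx * s_yy)
```

Splitting the series at the climax of the event and applying the function to each part gives the "before" and "after" coefficients of the kind reported in Table S2.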

H. "Topic" null model
At the beginning of this supplementary information we discussed distinct possibilities for constructing the presence-absence matrices that describe the set of interactions in our systems. We also mentioned that different cutoffs can be applied to hashtags and/or users, and discussed the most reasonable way to proceed when studying nestedness and modularity, which consists of considering either the most active users and their related set of hashtags, or the set of most active users plus the most tweeted hashtags at a time. In addition, regarding statistical soundness, we have delved into different null model possibilities.
Now, however, our concern is the singularity of the results themselves. In particular, we want to test whether the modularity-nestedness crossover we have observed for particular topics is universal to any activity on Twitter (and in this sense uninteresting), or whether it is rather a specific mechanism underlying the formation of consensus around related information. We thus explore here three additional possibilities for the w = 12h time window on the UK dataset. In option (a), we randomly and independently select 512 users and 512 hashtags, and build the corresponding presence-absence matrix. Although selecting nodes in this way can produce empty matrices, corresponding to graphs with no links, this never happened in our dataset (all matrices have more than 20 non-empty cells). In option (b), 512 users are randomly selected and they determine the set of hashtags to consider. Option (c) is analogous, but the 512 hashtags are selected at random and included along with the set of users that tweeted them.
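The three selection schemes can be sketched as follows. This is a minimal illustration under stated assumptions: the input is taken to be a list of (user, hashtag) pairs for one time window, and the names `presence_absence` and `sample_set` are hypothetical, not the authors' code. The sample size `n` defaults to 512 as in the text and is capped at the number of available nodes.

```python
import random

def presence_absence(pairs, users, hashtags):
    """Binary matrix (rows = users, columns = hashtags): 1 if user posted hashtag."""
    u_idx = {u: i for i, u in enumerate(users)}
    h_idx = {h: j for j, h in enumerate(hashtags)}
    m = [[0] * len(hashtags) for _ in users]
    for u, h in pairs:
        if u in u_idx and h in h_idx:
            m[u_idx[u]][h_idx[h]] = 1  # binary: repeated posts are not counted
    return m

def sample_set(pairs, option, n=512, seed=None):
    """Select users and hashtags following option 'a', 'b' or 'c'."""
    rng = random.Random(seed)
    all_users = sorted({u for u, _ in pairs})
    all_tags = sorted({h for _, h in pairs})
    if option == "a":
        # Users and hashtags drawn independently: the matrix may be empty.
        users = rng.sample(all_users, min(n, len(all_users)))
        tags = rng.sample(all_tags, min(n, len(all_tags)))
    elif option == "b":
        # Random users; hashtags are all those posted by the chosen users.
        users = rng.sample(all_users, min(n, len(all_users)))
        chosen = set(users)
        tags = sorted({h for u, h in pairs if u in chosen})
    elif option == "c":
        # Random hashtags; users are all those who posted the chosen hashtags.
        tags = rng.sample(all_tags, min(n, len(all_tags)))
        chosen = set(tags)
        users = sorted({u for u, h in pairs if h in chosen})
    else:
        raise ValueError("option must be 'a', 'b' or 'c'")
    return users, tags, presence_absence(pairs, users, tags)

# Toy interaction list, invented for illustration only.
pairs = [("u1", "#a"), ("u1", "#b"), ("u2", "#b"),
         ("u3", "#c"), ("u4", "#c"), ("u4", "#d")]
users_b, tags_b, m_b = sample_set(pairs, "b", n=2, seed=0)
```

By construction, every row of a matrix built with option (b) and every column of a matrix built with option (c) contains at least one non-empty cell, whereas option (a) offers no such guarantee.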
These three sets, (a), (b) and (c), can be considered as an additional category of null models that allow us to discern whether the nested patterns previously observed are significant: for example, if set (a) showed high levels of z_λ, we would not be able to conclude that the coordination phase observed in the 15M is relevant, as we would be finding nested patterns even for randomly filtered structures. A comparison between the three methods is displayed in Figure S14. Results include data from Figure 5 (bottom panel) in the main text. We observe that, when users and hashtags are selected independently at random, as in set (a), nested patterns do not show up and the bipartite network does not present any kind of organized structure. The exception is the region between the afternoon of February 3 and February 4, when Super Bowl XLVII took place, probably due to the high relevance of this event (if it became a global trending topic, even a randomly built network would show, to some extent, a nested structure). When users (hashtags) are randomly selected but the set of hashtags (users) is closely related to them, as in sets (b) and (c), nestedness increases, but this is a systematic shift rather than a differential change.
Figure S14: Some results for unfiltered Twitter traffic (2013). Set (a) corresponds to our UK dataset with 512 hashtags and 512 users randomly and independently selected. Such a random selection implies that the presence-absence matrix might be empty, although this never happened in this case. In set (b), 512 users have been randomly chosen, and they determine the set of hashtags. Conversely, set (c) has been obtained by randomly selecting 512 hashtags together with their related users. Finally, set (4) comprises the 512 most active users and the 512 most used hashtags for comparison.

Appendix. Some selected hashtags
In Tables S3 and S4 we display some of the hashtags used in our dataset, along with the number of times each was registered.