The recent wave of mobilizations in the Arab world and across Western countries has generated much discussion on how digital media is connected to the diffusion of protests. We examine that connection using data from the surge of mobilizations that took place in Spain in May 2011. We study recruitment patterns in the Twitter network and find evidence of social influence and complex contagion. We identify the network position of early participants (i.e. the leaders of the recruitment process) and of the users who acted as seeds of message cascades (i.e. the spreaders of information). We find that early participants cannot be characterized by a typical topological position but spreaders tend to be more central in the network. These findings shed light on the connection between online networks, social contagion, and collective dynamics, and offer an empirical test of the recruitment mechanisms theorized in formal models of collective action.
The last few years have seen an eruption of political protests aided by internet technologies. The phrase Twitter revolution was coined in 2009 to refer to the mass mobilizations that took place in Moldova1 and, a few months later, in Iran2, in both cases to protest against fraudulent elections. Since then, the number of events connecting social media with social unrest has multiplied, not only in the context of authoritarian regimes, exemplified by the recent wave of upsurges across the Arab world, but also in Western liberal democracies, particularly in the aftermath of the financial crisis and changes to welfare policies. These protests respond to very different socio-economic circumstances and are driven by very different political agendas, but they all seem to share the same morphological feature: the use of social networking sites (SNSs) to help protesters self-organize and attain a critical mass of participants. There is, however, not much evidence on how exactly SNSs encourage recruitment. Empirical research on online activity around riots and protests is scarce, and the few studies that exist3,4,5 show no clear patterns of protest growth. Related research has shown that information cascades in online networks occur only rarely6,7,8, with the implication that even online it is difficult to reach and mobilize a high number of people. Revolutions, riots and mass mobilizations are also rare and, as such, difficult to predict; but when they happen, they unleash potentially dramatic consequences. The relevant question, which we set out to answer here, is not when these protests take place but whether and how SNSs contribute to triggering their explosion.
Sociologists have long analyzed networks as the main recruitment channels through which social movements grow9,10. Empirical research has shown that networks were crucial to the organization of collective action long before the internet could act as an organizing tool, with historical examples that include the insurgency in the Paris Commune of 187111, the 1960s civil rights struggles in the U.S.12, and the demonstrations that took place in East Germany prior to the fall of the Berlin Wall13,14. These studies provide evidence that recruits to a movement tend to be connected to others already involved and that networks open channels through which influence on behavior spreads, but they are limited by the quality of the network data analyzed, particularly around time dynamics. Analytical models have tried to overcome these data limitations by recreating the formal features of interpersonal influence, and analyzing how they are related to diffusion15,16,17,18 and to examples of social contagion like collective action or the growth of social movements19,20,21,22,23. Four main findings arise from these models. First, the shape of the threshold distribution, i.e. the variance in the propensity to join intrinsic to people, determines the global reach of cascades. Second, individual thresholds interact with the size of local networks: two actors with the same propensity might be recruited at different times if one is connected to a larger number of people. Third, attaining a critical mass depends on being able to activate a sufficiently large number of low-threshold actors that are also well connected in the overall network structure. And fourth, exposure to multiple sources can be more important than multiple exposures: unlike epidemics, the social contagion of behavior often requires reinforcement from multiple people. Recent experiments have confirmed the relevance of complex contagions to explain behavior in online contexts24, and large-scale analyses have validated their effects on information diffusion on Twitter25.
Models of collective action have identified important network mechanisms behind the decision to join a protest, but they suffer from lack of empirical calibration and external validity. Online networks, and the role that SNSs play in articulating the growth of protests, offer a great opportunity to explore recruitment mechanisms in an empirical setting. We analyze one such setting by studying the protests that took place in Spain in May 2011. The mobilization emerged as a reaction to the political response to the financial crisis and it organized around broad demands for new forms of democratic representation. The main target of the campaign was to organize a protest on May 15, which brought tens of thousands of people to the streets of 59 cities all over the country. After the march, hundreds of participants decided to camp in the city squares until May 22, the date for local and regional elections; crowded demonstrations took place daily during that week. After the elections, the movement remained active but the protests gradually lost strength and its media visibility waned (more background information in SI).
We analyze Twitter activity around those protests for the period April 25 (20 days before the first mass mobilizations) to May 25 (10 days after the first mass mobilizations, and 3 days after the elections). The data set follows the posting behavior of 87,569 users and tracks a total of 581,750 protest messages (see Methods). We know, for each user, who they follow and who is following them. In addition to this asymmetric network, we also consider a version of the network that only retains reciprocated and therefore stronger connections. Previous research has suggested that Twitter is closer to a news media platform than to a social network7; this research suggests that the properties of the online network cannot be directly compared to other social networks because of the prominence of broadcasters. The symmetric (reciprocated) network mitigates the relatively higher influence of these hubs of activity and retains only connections that reflect mutual acknowledgement between users, which is arguably a stronger proxy to offline relationships. Contrasting recruitment patterns in both the asymmetric and symmetric networks allows us to test whether the dynamics of mobilization depend on weak, broadcasting links or on stronger connections, based on mutual recognition. Our analysis of recruitment is based on the assumption that users joined the movement the moment they started sending Tweets about it. We also assume that once they are activated, they remain so for the rest of the period we consider.
By the end of our 30-day window, most users in the network had sent at least one message related to the protest, with only about 2% remaining silent (but still being exposed to movement information, Fig. 1). The most significant increase in activity takes place right after the initial protest (May 15), during the week leading to the elections of May 22. Up to that point, only about 10% of the users had sent at least one message related to the protests.
Activation times tell us the exact moment when users start emitting messages, and allow us to distinguish between activists leading the protests and those who reacted in later stages. We calculated, for each user, the proportion of neighbors being followed that had been active at the time of recruitment ($k_a/k_{in}$). This gives us a measure that approximates the threshold parameter used in formal models of social contagion, particularly those that incorporate networks17,18,22. Activists with an intrinsic willingness to participate have a threshold $k_a/k_{in} \approx 0$, whereas those who need a lot of pressure from their local networks before they decide to join are at the opposite extreme, $k_a/k_{in} \approx 1$. Looking at the empirical distribution, most users in our case exhibit intermediate values (Fig. 2A). Although the distribution is roughly uniform for almost the full threshold interval, there are two local maxima at 0 (users who act as the recruitment seeds) and 0.5 (users who join when half their neighbors already did). The symmetric network has a significantly higher number of users with $k_a/k_{in} = 0$ because it mitigates the influence of hubs or broadcasters (i.e. users who do not reciprocate connections, about 7,000 in the overall network, but who contribute to activating low-threshold participants, the seeds in the symmetric network). The shape of the distribution changes before and after May 15, the first big demonstration day (Fig. 2B). Most early participants (i.e. users who sent a message prior to the first mass mobilizations and to the news media coverage of the events) needed, on average, less local pressure to join, which is consistent with their role as leaders of the movement. Because most activity takes place after 15-M, the threshold distribution for the ten days that followed is not very different from the threshold distribution for the full period.
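The threshold measure described above (the share of followed accounts already active when a user first posts) can be sketched in a few lines. This is a minimal illustration rather than the code used in the study: the function name `activation_thresholds`, the dictionary-based input layout, and the choice to count a followed account as active only if it posted strictly earlier are all assumptions made for the example.

```python
def activation_thresholds(following, activation_time):
    """Return the ratio k_a / k_in for every activated user.

    following       -- dict: user -> set of accounts that user follows
                       (hypothetical layout, assumed for this sketch)
    activation_time -- dict: user -> time of the user's first protest message
    """
    thresholds = {}
    for user, t_user in activation_time.items():
        followed = following.get(user, set())
        if not followed:
            # k_in = 0: the ratio is undefined, so the user is skipped.
            continue
        # k_a: followed accounts that posted strictly before this user
        # (strict inequality is an assumption for handling ties).
        k_a = sum(
            1 for v in followed
            if v in activation_time and activation_time[v] < t_user
        )
        thresholds[user] = k_a / len(followed)
    return thresholds


# Toy example with four activated users and one silent account ("e").
following = {"a": {"b", "c"}, "b": {"c"}, "c": set(), "d": {"a", "b", "c", "e"}}
activation_time = {"c": 1, "b": 2, "a": 3, "d": 4}
result = activation_thresholds(following, activation_time)
```

In the toy example, user "d" follows four accounts, three of which were active before "d" posted, so the estimated threshold for "d" is 0.75.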
The actual chronological time of activation changes across same-threshold actors (see SI, Fig. S2); this variation is predictable given that actors react to different local networks, both in size and composition. The time it takes neighbors to join, however, also influences the activation of users. We measure the pace at which the number of active neighbors grows using the logarithmic derivative of activation times26, $\Delta k_a / k_a = (k_a^{t+1} - k_a^{t}) / k_a^{t+1}$. The rationale behind $\Delta k_a / k_a$ is that some users might be susceptible to recruitment bursts, that is, more likely to join if many of their neighbors do in a short time-span. This emphasis on time dynamics qualifies the idea of complex contagion: receiving stimuli from multiple sources is important because, unlike epidemics, social contagion often requires exposure to a diversity of sources22; evidence of recruitment bursts would suggest that the effects of multiple and diverse exposures are magnified if they take place in a short time window. We find that early participants, i.e. users with low thresholds, are insensitive to recruitment bursts; for the vast majority of users, however, being exposed to sudden rates of activation precedes their decision to join (Fig. 3A). Users with moderate thresholds who are susceptible to bursts act as the critical mass that makes the movement grow from a minority of early participants to the vast majority of users: without them, late participants (the majority of users that made the movement explode) would not have joined in (Fig. 3B).
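The burst measure can be illustrated with a short sketch that computes the relative growth in active neighbors between two consecutive time steps, (k_a at t+1 minus k_a at t) divided by k_a at t+1. The function name `burst_rate`, the step length `dt`, and the convention of returning zero when no neighbor is active yet are assumptions for illustration; the text does not specify them.

```python
from bisect import bisect_right


def burst_rate(neighbor_times, t, dt):
    """Relative growth of active neighbors between t and t + dt.

    neighbor_times -- activation times of the accounts a user follows
    t, dt          -- start of the step and its length (dt is an assumed
                      parameter; the source does not state the step used)
    """
    times = sorted(neighbor_times)
    k_t = bisect_right(times, t)          # neighbors active by time t
    k_next = bisect_right(times, t + dt)  # neighbors active by time t + dt
    if k_next == 0:
        # No active neighbors yet: the ratio is undefined, so we return
        # zero by convention (an assumption of this sketch).
        return 0.0
    return (k_next - k_t) / k_next


steady = burst_rate([1, 2, 3, 10], t=3, dt=1)  # no new neighbors in the step
burst = burst_rate([1, 5, 5, 5], t=4, dt=1)    # three neighbors join at once
```

A value near zero means the set of active neighbors is stable, while a value near one means most active neighbors joined within the last step, which is the situation the text describes as a recruitment burst.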
Information diffusion follows different dynamics. Very few messages generate cascades of a global scale. To reconstruct cascades, we assume that if a user emits a message at time $t$ and one of their followers also emits a message within the interval $(t, t + \Delta t)$, both messages belong to the same chain. A chain is aborted when none of the followers exposed to a message acts as a spreader, and messages can only belong to a single chain, i.e. only the messages that do not belong to a previous chain are considered seeds for a new cascade (see Methods). The vast majority of these chains die soon, with only a very small fraction reaching global dimensions, a result that is robust using different time intervals (Fig. 4A). This supports previous findings6,7,8 and reveals that cascades are rare even in the context of exceptional events. We ran a k-shell decomposition27 to identify the network position of users acting as seeds of the most successful chains. We found a positive association between network centrality, as measured by the classification of nodes in high k-cores, and cascade size (Fig. 4B). This positive association suggests that agents at the core of the network (not necessarily those with a higher number of connections, but those connected to equally well connected users; Fig. 4C, D) are the most effective when it comes to spreading information, again in accordance with what has been found in research on epidemics and contagion28. Spreaders, though, need to be recruited first, and the same decomposition analysis does not find any significant association between thresholds and topology, i.e. early participants do not have a characteristic network position; they are instead scattered all over the network (see SI, Fig. S5).
The role that SNSs play in helping protests grow is uncontested by most media reports of recent events. However, there is not much evidence of how exactly these online platforms can help disseminate calls for action and organize a collective movement. Our findings suggest that there are two parallel processes taking place: the dynamics of recruitment, and the dynamics of information diffusion. While being central in the network is crucial to be influential in the diffusion process, there is no topological position that characterizes the early participants that trigger recruitment. This suggests that whatever exogenous factors motivate early participants to start sending messages, the consequence is that they create random seeding in the online network: they spur foci of early activity that are topologically heterogeneous and that spread through low-threshold individuals. This finding is consistent with previous work using simulations that test (and challenge) the influential hypothesis17,18. However, a small core of central users is still critical to trigger chains of messages of high orders of magnitude. The advantage that this minority has as cascade generators derives from their location in the network; contrary to what has been argued in previous research4, centrality in the network of followers is still a meaningful measure of influence in online networks, at least in the context of mass mobilizations.
The decision to join a protest depends on multiple reasons that we do not capture with online data, for instance the amount of offline news media to which users are exposed. It is not surprising, then, that network position does not account for time of activation as it does for cascading influence (the diffusion of messages is, for the most part, endogenous, depending on the network structure). However, there is one element in the recruitment process that is endogenous as well, and that is the timing of exposures. The existence of recruitment bursts indicates that the effects of complex contagion22 are boosted by accelerated exposure, that is, by multiple stimuli received from different sources that take place within a small time window. These bursts, facilitated by the speed at which information flows online, provide empirical evidence of what scholars of social movements have called, metaphorically, collective effervescence29. We provide an empirical measure for that metaphor and find that most users are susceptible to it. These findings qualify threshold models of collective action that do not take into account the urgency to join that bursts of activity instill in people.
In addition, this study provides evidence of why horizontal organizations (like the platform coordinating this protest, see SI) are so successful at mobilizing people through SNSs: their decentralized structure, based on coalitions of smaller organizations, plants activation seeds randomly at the start of the recruitment process, which maximizes the chances of reaching a percolating core; users at this network core, in turn, contribute to the growth of the movement by generating cascades of messages that trigger new activations, and so forth. These joint dynamics illustrate the trade-off between global bridges (controlled by well connected users) and local networks: the former are efficient at transmitting information, the latter at transmitting behavior22. This is one reason why Twitter has played a prominent role in so many recent protests and mobilizations: it combines the global reach of broadcasters with local, personalized relations (which we approximate in the form of reciprocal connections); in the light of our data, both features are important to articulate the growth of a movement. These features, however, are necessary, not sufficient, conditions. Again, being able to generate recruitment patterns on a scale of this order is still an exceptional event, and this study sheds no light that helps predict future occurrences; but it shows that when exceptional events like mass mobilizations take place, recruitment and information diffusion dynamics reinforce each other along the way.
Our data has two main limitations. First, we might be overestimating social influence because we do not control for demographic information and the effects of homophily in network formation30. Studies that control for demographic attributes, however, still find that networks are significant predictors of recruitment10,14; in the light of those findings, we can only assume that online networks will still be significant channels for the spread of behavior once demographics are taken into account. Second, we also do not control for exposure to offline media, which is likely to have interacted with social influence, or to other sources of information that might have also contributed to recruitment (like, for instance, offline discussion networks). The lack of media coverage before the demonstrations of May 15 allows us to conduct a natural experiment and compare how the network channels recruitment with and without the common knowledge of media exposure. We show that there is no significant shift to the left of the threshold distribution once the media starts reporting on the protests; such a shift would have indicated that exposure to mass media led to a higher proportion of users joining the protests in the absence of local pressure. On the contrary, we find that local pressure is still an important precursor for a large number of users, and that the vast majority are still susceptible to bursts of activity in their local networks.
Our findings, however, are still limited by the fact that we are not capturing the full range of information exposure: users had access to other sources we do not consider that might have also influenced their decision to join the movement; because of this unobserved exposure, we are surely overestimating the influence effects of Twitter activity. The different times of adoption that we analyse suggest that for some users (the early adopters) online activity in Twitter probably had more weight in their decision to join than for others (the late adopters, who needed reinforcement from other sources, probably mass media or offline networks, before displaying their commitment online). Further investigations should consider the relative weight that different sources of information have in shaping individual behaviour.
In addition, future research should consider whether our results are robust using time-dependent networks. One of the main assumptions of our data is that the network of followers does not change during the period considered; in fact, a significant number of connections are likely to have been created as a result of the mobilization itself. Future work should also address whether our findings are platform-dependent or universal to different types of online networks. Recent events, like the riots in London in August 2011, suggest that different online platforms are being used to mobilize different populations31. The question that future research should consider is whether the same recruitment patterns apply regardless of the technology being used, or whether the affordances of the technology (e.g. public/private by default) shape the collective dynamics that they help coordinate. The replication of these analyses with data covering similar events (like the OccupyWallStreet protests initiated in New York, and soon spreading to other U.S. cities) could help determine whether the dynamics we identify here can be generalized to different social contexts.
The data contains time-stamped tweets for the period April 25 to May 25. Messages related to the protests were identified using a list of 70 #hashtags (full list in SI). The collection of messages is restricted to Spanish language and to users connected from Spain, and it was archived by a local start-up company, Cierzo Development Ltd., using the SMMART Platform. We estimate that our sample captures more than a third of the total number of messages exchanged in Twitter related to the protests. The network of followers was reconstructed applying a one-step snowball sampling procedure, using the authors that sent protest messages as the seed nodes. An arc (i,j) in this network means that user i is following the Tweets of user j, and we assume that this network is static for the period we consider. The symmetric network filters out all asymmetric arcs, that is, an arc (i,j) is retained only if there is also an arc (j,i).
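The construction of the symmetric network from the follower arcs can be sketched as a simple filter. This is an illustrative example, not the authors' pipeline: the function name `symmetric_arcs` and the representation of the network as a collection of (i, j) tuples are assumptions, as is the decision to discard self-loops.

```python
def symmetric_arcs(arcs):
    """Keep only arcs (i, j) whose reverse arc (j, i) is also present.

    arcs -- iterable of (follower, followed) pairs (assumed layout)
    """
    arc_set = set(arcs)
    return {
        (i, j)
        for (i, j) in arc_set
        # Self-loops are dropped; they would trivially count as
        # reciprocated (an assumption of this sketch).
        if i != j and (j, i) in arc_set
    }


# Users 1 and 2 follow each other, as do 3 and 4; the rest are one-way.
arcs = [(1, 2), (2, 1), (1, 3), (3, 4), (4, 3), (2, 3)]
reciprocated = symmetric_arcs(arcs)
```

One-way arcs such as (1, 3) disappear in the filtered network, which is how broadcasters who are followed but do not follow back lose their direct influence in the symmetric version.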
We reconstruct message chains assuming that protest activity is contagious if it takes place in short time windows. We do not have access to re-tweet (RT) information, but since all our messages are related to the 15-M movement, chains refer to the same subject matter (although the precise content of the messages in the same chain might differ). This measurement maps the extent to which the stream of content related to the protests diffuses in given time windows.
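One way to operationalize this chain reconstruction is sketched below. It is a simplified illustration under stated assumptions, not the procedure used for the reported results: the function name `build_chains`, the input layout, the rule that a message joins the chain of the most recent qualifying message from a followed account, and the shortcut of tracking only each account's latest message are all choices made for the example.

```python
def build_chains(messages, following, dt):
    """Group time-stamped messages into chains and return the chain sizes.

    messages  -- list of (time, user) pairs
    following -- dict: user -> set of accounts that user follows
    dt        -- length of the time window (assumed parameter)
    """
    last = {}   # user -> (time, chain id) of that user's latest message
    sizes = []  # sizes[chain id] = number of messages in the chain
    for t, user in sorted(messages):
        chain, best_time = None, None
        for v in following.get(user, ()):
            if v not in last:
                continue
            t_v, c_v = last[v]
            # The message continues a chain if a followed account posted
            # within the open interval (t - dt, t) before it.
            if t_v < t < t_v + dt and (best_time is None or t_v > best_time):
                chain, best_time = c_v, t_v
        if chain is None:
            # No followed account posted recently: this message is a seed.
            chain = len(sizes)
            sizes.append(0)
        sizes[chain] += 1
        last[user] = (t, chain)
    return sizes


following = {"b": {"a"}, "c": {"b"}, "d": {"a"}}
messages = [(0, "a"), (1, "b"), (2, "c"), (10, "d")]
chain_sizes = build_chains(messages, following, dt=3)
```

In this toy example the messages from "a", "b" and "c" form one chain of size three, while the late message from "d" falls outside the window and seeds a new chain of size one.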
The k-shell decomposition assigns a shell index $k_s$ to each user by iteratively pruning the network down to users with more than k neighbours. The process starts by removing all nodes with degree k = 1, repeating the removal until no node of degree 1 remains; these nodes are classified (together with their links) in a shell with index $k_s = 1$. Nodes with remaining degree k = 2 are then removed in the same way and assigned to $k_s = 2$, and so forth until all nodes are removed (and all users are classified). Shells are layers of centrality in the network: users classified in shells with higher indexes are located at the core, whereas users with lower indexes define the periphery of the network (see SI for details of node classification in shells).
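The pruning procedure can be sketched as follows for an undirected network. This is a minimal illustration of the standard k-shell (k-core) peeling algorithm, in which nodes whose degree drops to k or below during pruning are removed in the same pass; the function name `k_shell` and the adjacency-dictionary input are assumptions, and the sketch makes no claim about how directed arcs were handled in the study.

```python
def k_shell(adjacency):
    """Return a dict mapping each node to its shell index k_s.

    adjacency -- dict: node -> set of neighbors (undirected; assumed layout)
    """
    degree = {n: len(nb) for n, nb in adjacency.items()}
    remaining = set(adjacency)
    shell = {}
    k = 0
    while remaining:
        # Move to the smallest degree still present in the pruned network.
        k = max(k, min(degree[n] for n in remaining))
        queue = [n for n in remaining if degree[n] <= k]
        while queue:
            n = queue.pop()
            if n not in remaining:
                continue  # already removed via a duplicate queue entry
            remaining.discard(n)
            shell[n] = k
            for m in adjacency[n]:
                if m in remaining:
                    degree[m] -= 1
                    # Removing n may push a neighbor down into shell k.
                    if degree[m] <= k:
                        queue.append(m)
    return shell


# A triangle (1, 2, 3) with a pendant node 4 attached to node 3.
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
shells = k_shell(triangle)
```

In the example, node 3 has the highest degree but ends up in the same shell as nodes 1 and 2, which illustrates why shell index and number of connections are different notions of centrality.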
We thank Mason A. Porter for helpful comments and suggestions. S.G.B. is partially supported by the Spanish MICINN projects CSO2009-09890 and CSD2010-00034. J.B.-H. is partially supported by the Spanish MICINN through project FIS2008-01240. Y.M. is supported by the Spanish MICINN through projects FIS2008-01240 and FIS2009-13364-C02-01 and by the Government of Aragon (DGA) through the grant No. PI038/08.