Introduction

Ideas have formidable potential to impact public opinion, culture, policy and profit1. The advent of social media2 has lowered the cost of information production and broadcasting, boosting the potential reach of each idea or meme3. However, the abundance of information to which we are exposed through online social networks and other socio-technical systems is exceeding our capacity to consume it. Ideas must compete for our scarce individual and collective attention. As a result, the dynamic of information is driven more than ever before by the economy of attention, first theorized by Simon4. Yet the processes that drive popularity in our limited-attention world are still largely unexplored5,6,7,8,9,10,11,12,13,14,15.

The availability of data from online social media has recently created unprecedented opportunities to explore human and social phenomena on a global scale16,17. In this context one of the most challenging problems is the study of the competition dynamics of ideas, information, knowledge and rumors. Understanding this problem is crucial in a broad range of settings, from viral marketing to scientific discovery acceleration. Aspects of competition for limited attention have been studied through news, movies and topics posted on blogs and social media10,11,13. The popularity of news decreases with the number of competing items that are simultaneously available8,18,19.

However, even in the simplified settings of social media platforms, it is hard to disentangle the effects of limited attention from many concurrent factors, such as the structure of the underlying social network7,13, the activity of users and the size of their potential audience19, the different degrees of influence of information spreaders20, the intrinsic quality of the information they spread21, the persistence of topics22,23 and homophily24. To compound these difficulties, social networks that host information diffusion processes are not closed systems; exogenous factors like exposure to traditional media and their reports of world events play important roles in the popularity and lifetime of specific topics10,25. Another example of our limited attention is the cognitive limit on the number of stable social relationships that we can sustain, as postulated by Dunbar26 and recently supported by analysis of Twitter data27.

We propose an agent-based model to study the role of the limited attention of individual users in the diffusion process and in particular whether competition for our finite attention may affect meme popularity, diversity and lifetime. Although competition among ideas has been implicitly assumed as a factor behind, e.g., the decay in interest toward news and movies28,8,10, to the best of our knowledge nobody has attempted to explicitly model the mechanisms of competition and how they shape the spread of information. In particular, we show that a simple model of competition on a social network, without any further assumptions about meme merit, user interests, or explicit exogenous factors, can account for the massive heterogeneity in meme popularity and persistence.

Results

Here we outline a number of empirical findings that motivate both our question and the main assumptions behind our model. We then describe the proposed agent-based toy model of meme diffusion and compare its predictions with the empirical data. Finally we show that the social network structure and our finite attention are both key ingredients of the diffusion model, as their removal leads to results inconsistent with the empirical data.

We validate our model with data from Twitter, a micro-blogging platform that allows many millions of people to broadcast short messages through social connections. Users can “follow” interesting people, by which a directed social network is formed. Posts (“tweets”) appear on the screen of followers. People can forward (“retweet”) selected posts from their screen to their followers. Furthermore, users often mark their posts with topic labels (“hashtags”). Let us use these tags as operational proxies to identify memes. A retweet carries a meme from user to user. As a meme spreads in this way, it forms a cascade or diffusion network such as those illustrated in Fig. 1. We collected a sample of retweets that include one or more hashtags, produced by Twitter users over a specific period of time (see details in Methods section). This provides us with a quantitative framework to study the competition for attention in the wild.

Figure 1

Visualizations of meme diffusion networks for different topics.

Nodes represent Twitter users and directed edges represent retweeted posts that carry the meme. The brightness of a node indicates the activity (number of retweets) of a user and the weight of an edge reflects the number of retweets between two users. (a) The #Japan meme shows how news about the March 2011 earthquake propagated. (b) The #GOP tag stands for the US Republican Party and, like many political memes, displays a strong polarization between people with opposing views. Memes related to the “Arab Spring” and in particular the 2011 uprisings in (c) #Egypt and (d) #Syria display characteristic hub users and strong connections, respectively.

Limited attention

We first explore the competition among memes. In particular, we test the hypothesis that the attention of a user is somewhat independent from the overall diversity of information discussed in a given period. Let us quantify the breadth of attention of a user through Shannon entropy S = −Σi f(i) log f(i) where f(i) is the proportion of tweets generated by the user about meme i. Given a user who has posted n messages, her entropy can be as small as 0, if all of her posts are about the same meme; or as large as log n if she has posted a message about each of n different memes. We can measure the diversity of the information available in the system analogously, defining f(i) as the proportion of tweets about meme i across all users. Note that these entropy-based measures are subject to the limits of our operational definition of a meme; finer or coarser definitions would yield different values.
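The entropy defined above can be sketched in a few lines of Python. This is only an illustration: the function name `attention_entropy` and the toy hashtag lists are hypothetical, and the natural logarithm is assumed since the text does not fix the base.

```python
import math
from collections import Counter

def attention_entropy(memes):
    """Shannon entropy S = -sum_i f(i) log f(i) over a list of meme labels.

    `memes` holds one label per post; f(i) is the share of posts about meme i.
    Natural log is an assumption: the text does not specify the base.
    """
    counts = Counter(memes)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # f * log(1/f) is the same term as -f * log(f), written to avoid -0.0.
    return sum((c / total) * math.log(total / c) for c in counts.values())

# Hypothetical toy data: one hashtag per post.
focused_user = ["#a", "#a", "#a", "#a"]   # all posts about one meme
broad_user = ["#a", "#b", "#c", "#d"]     # n = 4 posts, 4 distinct memes

print(attention_entropy(focused_user), attention_entropy(broad_user))
```

The same function gives the system entropy when it is applied to the memes of all posts by all users in a day instead of those of a single user.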

In Fig. 2 we compare the daily values of the system entropy to the corresponding average user entropy. The key observation here is that a user's breadth of attention remains essentially constant irrespective of system diversity. This is a clear indication that the diversity of memes to which a user can pay attention is bounded. With the continuous injection of new memes, this indirectly suggests that memes survive at the expense of others. We explicitly assume this in the information diffusion model presented later.

Figure 2

Plot of daily system entropy (solid red line) and average user breadth of attention (dashed blue line).

Days in our observation period are ranked from low to high system entropy, therefore the latter is monotonically increasing.

User interests

It has been suggested that topical interests affect user behavior in social media29,30. This is a potentially important ingredient in a model of meme diffusion, as an interesting meme may have a competitive advantage. Therefore we wish to explore whether user interests, as inferred from past behavior, are predictive of future behavior.

Let us consider every user in our dataset and any retweets they produce. When a user u emits a new retweet, we define her interests Iu as the set of all memes about which she has tweeted up to that moment. We also collect the set M0 of memes associated with the new retweet. The n most recent posts across all users prior to the new retweet are considered as a set of potential candidates that might have been retweeted, but were not. The corresponding sets of memes M1, M2, …, Mn are recorded (n = 10). We compute the similarity sim(M0, Iu), sim(M1, Iu), …, sim(Mn, Iu) between the user interests and the actual and candidate posts and recover the conditional probability P(retweet(u, M)|sim(M, Iu)) that u retweets a post with memes M given the similarity between the memes and her user interests. We turn to the Maximum Information Path similarity measure31,32 that considers shared memes but discounts the more common ones:

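$$\mathrm{sim}(M, I_u) = \frac{2 \log\left[\min_{x \in M \cap I_u} f(x)\right]}{\log\left[\min_{x \in M} f(x)\right] + \log\left[\min_{x \in I_u} f(x)\right]}$$

where x is a meme and f(x) the proportion of messages about x.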

Fig. 3 shows that users are more likely to retweet memes about which they posted in the past (Pearson correlation coefficient ρ = 0.98). This suggests that memory is an important ingredient for a model of meme competition and we explicitly take this aspect into account in the model presented below.
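As a rough illustration of how such a similarity could be computed, the sketch below assumes the Maximum Information Path measure takes the form sim(M, I) = 2 log[min over x in M ∩ I of f(x)] / (log[min over x in M of f(x)] + log[min over x in I of f(x)]). The function name, the frequency table and the handling of the degenerate cases are assumptions made for this example.

```python
import math

def mip_similarity(memes, interests, freq):
    """Similarity between a post's meme set and a user's interest set.

    Assumed form: 2*log(min f over shared memes) divided by
    log(min f over memes) + log(min f over interests).
    `freq[x]` is the proportion of messages about meme x, in (0, 1].
    """
    shared = memes & interests
    if not shared:
        return 0.0  # convention assumed here: no shared meme, no similarity
    denom = (math.log(min(freq[x] for x in memes))
             + math.log(min(freq[x] for x in interests)))
    if denom == 0.0:
        return 0.0  # degenerate case: every meme has f = 1 and carries no information
    return 2.0 * math.log(min(freq[x] for x in shared)) / denom

# Hypothetical frequencies: #a is common, #c is rare.
freq = {"#a": 0.5, "#b": 0.1, "#c": 0.01}
common_shared = mip_similarity({"#a", "#c"}, {"#a", "#b"}, freq)
rare_shared = mip_similarity({"#a", "#c"}, {"#b", "#c"}, freq)
print(common_shared, rare_shared)
```

Sharing the rare meme #c yields a higher score than sharing the common meme #a, which is the discounting behavior described in the text.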

Figure 3

Relationship between the probability of retweeting a message and its similarity to the user interests, inferred from prior posting behavior.

Empirical regularities

In Fig. 4 we observe several regularities in the empirical data. We first consider meme lifetime, defined as the maximum number of consecutive time units in which posts about the meme are observed; meme popularity, defined as the number of users per day who tweet about a meme, measured over a given time period; and user activity, defined as the number of messages per day posted by a user, measured over a time period. These three quantities all display long-tailed distributions (Fig. 4(a,b,c)). The excellent collapse of the curves demonstrates that the distributions are robust even if measured over different time units or observed over different periods of time. We further measure the breadth of user attention, defined earlier through the meme entropy. Although the entropy distribution is peaked, some users have broad attention while others are very focused (Fig. 4(d)). This distribution is also robust with respect to different periods of time.
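The lifetime definition above (longest run of consecutive time units in which a meme is observed) can be made concrete with a short sketch. The function name and the integer indexing of time units are assumptions for illustration.

```python
def meme_lifetime(active_units):
    """Longest run of consecutive time units in which a meme was observed.

    `active_units` is an iterable of integer time-unit indices (hours, days
    or weeks, depending on the chosen resolution) with at least one post.
    """
    units = set(active_units)
    best = 0
    for u in units:
        if u - 1 in units:
            continue  # not the start of a run
        length = 1
        while u + length in units:
            length += 1
        best = max(best, length)
    return best

# Hypothetical example: a meme seen on days 1-3, silent, then seen on days 7-8.
print(meme_lifetime([1, 2, 3, 7, 8]))
```

Changing the time unit only changes how posts are binned into indices before the function is called.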

Figure 4

Empirical regularities in Twitter data.

(a) Probability distribution of the lifetime of a meme using hours (red circles), days (blue squares) and weeks (green triangles) as time units. In the plot, units are converted into hours. Since the distributions are well approximated by a power law, we can align the curves by rescaling the y-axis by λ^(−α), where λ is the ratio of the time units (e.g., λ = 24 for rescaling days into hours) and α ≈ 2.5 is the exponent of the power law (via maximum likelihood estimation33). This demonstrates that the shape of the lifetime distribution is not an artifact of the time unit chosen to define the lifetime. (b) Complementary cumulative probability distribution of the popularity of a meme, measured by the total number of users per day who have used that meme. This and the following measures were performed daily (filled red circles), weekly (filled blue squares) and monthly (filled green triangles). (c) Complementary cumulative probability distribution of user activity, measured by the number of messages per day posted by a user. (d) Probability distribution of breadth of user attention (entropy), based on the memes tweeted by a user. Note that the larger the number of posts produced, the smaller the non-zero entropy values recorded for users who focus on a small set of memes. This explains why the distributions for longer periods of time extend further to the left.
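The rescaling used in panel (a) can be motivated by a short worked step. The sketch below assumes that, in each time unit, the lifetime distribution follows the same power law with the same prefactor C; the symbols ℓ, t, C and p_λ are introduced here only for illustration.

```latex
% Assumption: with lifetimes \ell counted in units of \lambda hours,
% the distribution is p_{\lambda}(\ell) \simeq C\,\ell^{-\alpha}.
% Converting the x-axis to hours, t = \lambda\,\ell, gives
p_{\lambda}(t/\lambda) \simeq C\,\lambda^{\alpha}\,t^{-\alpha},
\qquad\text{so}\qquad
\lambda^{-\alpha}\,p_{\lambda}(t/\lambda) \simeq C\,t^{-\alpha}.
% Example for days into hours: \lambda = 24,\ \alpha \approx 2.5
% \lambda^{-\alpha} = 24^{-2.5} \approx 3.5 \times 10^{-4}.
```

Under this assumption the rescaled daily and weekly curves fall onto the hourly one, which is the collapse reported in the panel.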

All of these empirical findings point to extremely heterogeneous behaviors; some memes are extremely successful (popular and persistent), while the great majority die quickly. A small fraction of memes therefore account for the great majority of all posts. Likewise, a small fraction of users account for most of the traffic. These heterogeneities can in principle be attributed to a variety of causes. The broad distributions of meme popularity could result from a diversity in some intrinsic meme value, with “important” memes attracting more attention. Long-lived memes might be sustained exogenously by traditional media and real-world events. User activity and breadth of attention distributions could be a reflection of innate behavioral differences. What is, then, a minimal set of assumptions necessary to interpret this empirical data? One way to tackle this question is to start from a minimalist model of information spreading that assumes none of the above externalities. In particular we will explore to what extent the statistical features of memes and users can be accounted for by the limited attention capacity of the users coupled with the heterogeneity of their social connections.

Model description

Our basic model assumes a frozen network of agents. An agent maintains a time-ordered list of posts, each about a specific meme. Multiple posts may be about the same meme. Users pay attention to these memes only. Asynchronously and with uniform probability, each agent can generate a post about a new meme or forward some of the posts from the list, transmitting the corresponding memes to neighboring agents. Neighbors in turn pay attention to a newly received meme by placing it at the top of their lists. To account for the empirical observation that past behavior affects what memes the user will spread in the future, we include a memory mechanism that allows agents to develop endogenous interests and focus. Finally, we model limited attention by allowing posts to survive in an agent's list or memory only for a finite amount of time. When a post is forgotten, its associated meme becomes less represented. A meme is forgotten when the last post carrying that meme disappears from the user's list or memory. Note that list and memory work like first-in-first-out rather than priority queues, as proposed in models of bursty human activity34. In the context of single-agent behavior, our memory mechanism is reminiscent of the classic Yule-Simon model43,44.

The retweet model we propose is illustrated in Fig. 5. Agents interact on a directed social network of friends/followers. Each user node is equipped with a screen where received memes are recorded and a memory with records of posted memes. An edge from a friend to a follower indicates that the friend's memes can be read on the follower's screen (#x and #y in Fig. 5(a) appear on the screen in Fig. 5(b)). At each step, an agent is selected randomly to post memes to neighbors. The agent may post about a new meme with probability pn (#z in Fig. 5(b)). The posted meme immediately appears at the top of the memory. Otherwise, the agent reads posts about existing memes from the screen. Each post may attract the user's attention with probability pr (the user pays attention to #x, #y in Fig. 5(c)). Then the agent either retweets the post (#x in Fig. 5(c)) with probability 1 − pm, or tweets about a meme chosen from memory (#v triggered by #y in Fig. 5(c)) with probability pm. Any post in memory has equal opportunities to be selected, therefore memes that appear more frequently in memory are more likely to be propagated (the memory has two posts about #v in Fig. 5(d)). To model limited user attention, both screen and memory have a finite capacity, which is the time in which a post remains in an agent's screen or memory. For all agents, posts are removed after one time unit, which simulates a unit of real time, corresponding to Nu steps where Nu is the number of agents. If people use the system once weekly on average, the time unit corresponds to a week.
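The update rule described above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `simulate`, the `followers` adjacency dictionary, the default value of pr, the window parameter `tw` (in time units of Nu steps) and the lazy expiry of old posts when an agent is selected are all choices made for this example.

```python
import random
from collections import deque

def simulate(followers, steps, pn=0.45, pr=0.02, pm=0.4, tw=1.0, seed=0):
    """Sketch of the retweet model; returns a list of (step, agent, meme) posts.

    followers[a] lists the agents who see a's posts on their screens.
    pr = 0.02 is a placeholder, not a value taken from the paper.
    """
    rng = random.Random(seed)
    agents = list(followers)
    window = tw * len(agents)            # one time unit = Nu steps
    screen = {a: deque() for a in agents}
    memory = {a: deque() for a in agents}
    posts, next_meme = [], 0

    def expire(queue, now):
        # First-in-first-out forgetting: drop posts older than the window.
        while queue and now - queue[0][0] >= window:
            queue.popleft()

    def post(agent, meme, now):
        posts.append((now, agent, meme))
        memory[agent].append((now, meme))        # posted memes go to memory
        for f in followers[agent]:
            screen[f].append((now, meme))        # and to followers' screens

    for now in range(steps):
        agent = rng.choice(agents)               # uniform, asynchronous selection
        expire(screen[agent], now)
        expire(memory[agent], now)
        if rng.random() < pn:                    # post a brand-new meme
            post(agent, next_meme, now)
            next_meme += 1
            continue
        for _, meme in list(screen[agent]):      # otherwise scan the screen
            if rng.random() >= pr:
                continue                         # post did not attract attention
            if memory[agent] and rng.random() < pm:
                meme = rng.choice(memory[agent])[1]  # meme triggered from memory
            post(agent, meme, now)               # else plain retweet
    return posts

# Hypothetical toy network: a directed ring of five agents.
ring = {i: [(i + 1) % 5] for i in range(5)}
print(len(simulate(ring, steps=500)))
```

Because a meme is drawn from memory by picking a post uniformly at random, memes with more posts in memory are proportionally more likely to be selected, as in the description above.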

Figure 5

Illustration of the meme diffusion model.

Each user has a memory and a screen, both with limited size. (a) Memes are propagated along follower links. (b) The memes received by a user appear on the screen. With probability pn, the user posts a new meme, which is stored in memory. (c) Otherwise, with probability 1 – pn, the user scans the screen. Each meme x in the screen catches the user's attention with probability pr. Then with probability pm a random meme from memory is triggered, or x is retweeted with probability 1 – pm. (d) All memes posted by the user are also stored in memory.

Simulation results

The model has three parameters: pn regulates the amount of novelty that enters the system (number of cascades), pr determines the overall retweet activity (size of cascades) and pm accounts for individual focus (diversity of user interests). We estimated all three directly from the empirical data (see Methods).

The social network underlying the meme diffusion process is a critical component of the model. To obtain a network of manageable size while preserving the structure of the actual social network, we sampled a directed graph with 10^5 nodes from the Twitter follower network (details in Methods). The nodes correspond to a subset of the users who generated the posts in our empirical data. To evaluate the predictions of our model, we compare them with empirical data that includes only the retweets of the same subset of users. To study the role played by the network structure in the meme diffusion process, we also simulated the model on a random Erdős-Rényi (ER) network with the same number of nodes and edges. As shown in Fig. 6, the model captures the main features of the empirical distributions of meme lifetime and popularity, user activity and breadth of user attention. The comparison with the corresponding distributions generated using the ER network shows that in general, the heterogeneity of the observed quantities is greatly reduced when memes spread on a random network. This is not unexpected. Consider for example meme popularity (Fig. 6(b)); the real social network has a broad (scale free, not shown) distribution of degree, with a considerable number of hub users who have a large number of followers. Memes spread by these users are likely to achieve greater popularity. This does not happen in the ER network where the degree distribution is narrow (Poissonian). The difference observed in the distribution of breadth of user attention, for both low and high entropy values (Fig. 6(d)), may be explained by the heterogeneity in the number of friends. Users with few friends may have low breadth of attention while those with many friends are exposed to many memes and thus may exhibit greater entropy.
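A random network with a prescribed number of nodes and edges can be generated along the following lines. This sketch assumes a directed G(n, m) construction without self-loops or duplicate edges; the text does not say which ER variant was used, and the function name is hypothetical.

```python
import random

def random_directed_graph(n, m, seed=0):
    """Directed random graph with n nodes and exactly m distinct edges.

    Returns {node: [followers]}. Self-loops and repeated edges are excluded
    (an assumption; the paper does not specify these details).
    """
    if m > n * (n - 1):
        raise ValueError("too many edges for a simple directed graph")
    rng = random.Random(seed)
    edges = set()
    while len(edges) < m:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v:
            edges.add((u, v))
    followers = {u: [] for u in range(n)}
    for u, v in sorted(edges):
        followers[u].append(v)
    return followers

# Small hypothetical instance; the paper's network is far larger.
g = random_directed_graph(n=100, m=500)
print(max(len(f) for f in g.values()))
```

In such a graph every node has roughly the same number of followers, so no agent plays the role of the hubs found in the sampled follower network.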

Figure 6

Evaluation of model by comparison of simulations with empirical data (same panels and symbols as in Fig. 4).

To study the role played by the network structure in the meme diffusion process, we simulate the model on the sampled follower network (solid black line) and a random network (dashed red line). Both networks have 10^5 nodes and about 3 × 10^6 edges. (a) The definition of lifetime uses the week as time unit. (b,c,d) Meme popularity, user activity and user entropy data are based on weekly measures.

The second key ingredient of our model is the competition among memes for limited user attention. To evaluate the role of such a competition on the meme diffusion process, we simulated variations of the model with stronger or weaker competition. This was accomplished by tuning the length tw of the time window in which posts are retained in an agent's screen or memory. A shorter time window (tw < 1) leads to less attention and thus increased competition, while a longer time window (tw > 1) allows for attention to more memes and thus less competition. As we can observe in Fig. 7, stronger competition (tw = 0.1) fails to reproduce the large observed number of long-lived memes (Fig. 7(a)). Weaker competition (tw = 5), on the other hand, cannot generate extremely popular memes (Fig. 7(b)) nor extremely active users (Fig. 7(c)).

Figure 7

Evaluation of model by comparison of simulations with empirical data (same panels and symbols as in Fig. 4).

To study the role of meme competition, we simulate the model on the sampled follower network with different levels of competition; posts are removed from screen and memory after tw time units. We compare the standard model (tw = 1, solid black line) against versions with less competition (tw = 5, dot-dashed magenta line) and more competition (tw = 0.1, dashed red line). (a) The definition of lifetime uses the week as time unit. (b,c,d) Meme popularity, user activity and user entropy data are based on weekly measures.

We also simulated our model without user interests, by setting pm = 0. The most noticeable difference in this case is the lack of highly focused individuals. Users have no memory of their past behavior and can only pay attention to memes from their friends. As a result, the model fails to account for low entropy individuals (not shown but similar to the random network case in Fig. 6(d)).

Discussion

The present findings demonstrate that the combination of social network structure and competition for finite user attention is a sufficient condition for the emergence of broad diversity in meme popularity, lifetime and user activity. This is a remarkable result: one can account for the often-reported long-tailed distributions of topic popularity and lifetime7,12,14,29 without having to assume exogenous factors such as intrinsic meme appeal, user influence, or external events. The only source of heterogeneity in our model is the social network; users differ in their audience size but not in the quality of their messages.

Our model is inspired by the long tradition that represents information spreading as an epidemic process, where infection is passed along the edges of the underlying social network35,36,37,7,28,12.

In the context of social media, several authors explored the temporal evolution of popularity. Wu and Huberman8 studied the decay in news popularity. They showed that temporal patterns of collective attention are well described by a multiplicative process with a single novelty factor. While the decay in popularity is attributed to competition for attention, the underlying mechanism is not modeled explicitly. Crane and Sornette10 introduced a model to describe the exogenous and endogenous bursts of attention toward a video, by combining an epidemic spreading process with a forgetting mechanism. Hogg and Lerman38 proposed a stochastic model to predict the popularity of a news story via the intrinsic interest of the story and the rates at which users find it directly and through friends. These models describe the popularity of a single piece of information and are therefore unsuitable to capture the competition for our collective attention among multiple simultaneous information epidemics. Although recent epidemiological models have started considering the simultaneous spread of competing strains39,40, our framework is the first attempt to deal with a virtually unbounded number of new “epidemics” that are continuously injected into the system. A closer analogy to our approach is perhaps provided by neutral models of ecosystems, where individuals (posts) belonging to different species (memes) produce offspring in an environment (our collective attention) that can sustain only a limited number of individuals. At every generation, individuals belonging to new species enter the ecosystem while as many individuals die as needed to maintain the sustainability threshold41.

Since Simon’s seminal paper4, the economy of attention has been an enormously popular notion, yet it has always been assumed implicitly and never put to the test. Our model provides a first attempt to focus explicitly on mechanisms of competition and to evaluate the quantitative effects of making attention more scarce or abundant.

Our results do not constitute a proof that exogenous features, like intrinsic values of memes, play no role in determining their popularity. However we have shown that at the statistical level it is not necessary to invoke external explanations for the observed global dynamics of memes. This appears as an arresting conclusion that makes information epidemics quite different from the basic modeling and conceptual framework of biological epidemics. While the intrinsic features of viruses and their adaptation to hosts are extremely relevant in determining the winning strains, in the information world the limited time and attention of human behavior are sufficient to generate a complex information landscape and define a wide range of different meme spreading patterns. This calls for a major revision of many concepts commonly used in the modeling and characterization of meme diffusion and opens the path to different frameworks for the analysis of competition among ideas and strategies for the optimization/suppression of their spread.

Methods

The data analyzed in this paper was obtained through Twitter's public APIs. We collected more than 120 million retweets from October 2010 to January 2011, involving 12.5 million distinct users and 1.3 million hashtags. Each post contains information about who generated and who retweeted it. As expected in a social network, the follower graph has scale-free degree distributions.

Due to the size of the empirical follower network, we sampled a manageable subset for our simulations. The sampling procedure was a random walk with occasional restarts from random locations (teleportation factor 0.15). Though no sampling method is perfect, the modified random walk is efficient in terms of API queries and reproduces the salient topological features of the sampled network42. The sampled network has 10^5 nodes and about 3 × 10^6 edges. The empirical retweets generated by the users in the sample display trends similar to those from the entire dataset, therefore we expect the model predictions to be consistent not only with the sample but also with the full dataset.
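A random walk with restarts of this kind might look as follows. The sketch assumes the whole graph is available in memory as an adjacency dictionary and that restarts pick a node uniformly at random; in a crawl through an API neither assumption need hold, and the function name is hypothetical.

```python
import random

def random_walk_sample(neighbors, target_size, restart=0.15, seed=0):
    """Collect nodes by a random walk that teleports with probability `restart`.

    neighbors[u] lists the nodes reachable from u. The walk also restarts
    whenever it reaches a node with no outgoing links.
    """
    rng = random.Random(seed)
    nodes = list(neighbors)
    current = rng.choice(nodes)
    sampled = {current}
    goal = min(target_size, len(nodes))
    while len(sampled) < goal:
        options = neighbors.get(current, [])
        if not options or rng.random() < restart:
            current = rng.choice(nodes)       # teleport to a random location
        else:
            current = rng.choice(options)     # follow one outgoing link
        sampled.add(current)
    return sampled

# Hypothetical toy graph: a directed ring of 50 nodes.
ring = {i: [(i + 1) % 50] for i in range(50)}
print(len(random_walk_sample(ring, target_size=20)))
```

The sampled network would then consist of the collected nodes together with the follower links among them.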

The parameter pn characterizes the probability of tweeting about a new meme. To estimate this parameter from the empirical data, we examine whether each hashtag has been observed in previous time units (weeks). The proportion of posts with new hashtags is approximately 0.45 ± 0.05. We thus set pn = 0.45 for all the simulations. For each simulation — standard model, model with underlying random network and models with strong and weak competition — the parameter pr is tuned to capture the average number of posted memes per user per unit time (Table 1). Finally, the parameter pm represents the proportion of all memes tweeted by an individual that match the content of the memory. To estimate it from the empirical data, we compare each hashtag with those produced by a user in the previous time unit (week). Using the average value across all users (0.4 ± 0.01) we set pm = 0.4.
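One possible reading of the pn estimate is sketched below: a post counts as new if its hashtag was not observed in any earlier week. The function name, the (week, hashtag) input format and this interpretation are assumptions; posts in the first observed week are all counted as new, which would bias the estimate upward on a short dataset.

```python
def estimate_pn(posts):
    """Fraction of posts whose hashtag was not seen in any previous week.

    `posts` is an iterable of (week_index, hashtag) pairs.
    """
    posts = list(posts)
    if not posts:
        return 0.0
    first_week = {}
    for week, tag in posts:
        if tag not in first_week or week < first_week[tag]:
            first_week[tag] = week
    # A post is "new" when it falls in the first week its hashtag appears.
    new_posts = sum(1 for week, tag in posts if week == first_week[tag])
    return new_posts / len(posts)

# Hypothetical toy log of five posts over three weeks.
log = [(1, "#a"), (1, "#a"), (2, "#a"), (2, "#b"), (3, "#a")]
print(estimate_pn(log))
```

An analogous comparison, restricted to each user's own hashtags from the previous week, would correspond to the estimate of pm described above.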

Table 1 Parameter settings for different simulations. “Avg user activity” is the average number of posts per user per time unit