Introduction

Spreading of opinions, frauds, behavioural patterns, and product adoptions are all examples of social contagion phenomena where collective patterns emerge due to correlated decisions of a large number of individuals. Although these choices are personal, they are not independent but potentially driven by several processes such as social influence1, homophily2, and information arriving from external sources like news or mass media3. Social contagion evolves over networks of interconnected individuals, where links associated with social ties transfer influence between peers4. Several earlier studies aimed to identify the dominant mechanisms at play in social contagion processes5,6,7,8. One key element, termed behavioural threshold by Granovetter6, is defined as “the number or proportion of others who must make one decision before a given actor does so”. Following this idea various network models have been introduced9,10,11,12,13,14,15 to understand threshold-driven spreading, commonly known as complex contagion16. Although these models are related to a larger set of collective dynamics, they differ markedly from simple contagion, where the exposure of nodes is driven by independent contagion stimuli17,18. In addition, collective adoption patterns may appear as a consequence of homophilic structural correlations, where connected individuals adopt due to their similar interests and not due to direct social influence. Distinguishing between the effects of social influence and homophily at the individual level remains a challenge19,20. Furthermore, in real social spreading phenomena all these mechanisms are arguably present.
However, while in the case of homophily the adoption behaviour is only seemingly correlated, and for simple contagion only the number of exposures matters, in complex contagion the fraction of adopting neighbours relative to the total number of partners determines whether a node adopts or not, capturing the natural mechanisms involved in individuals’ decision making21,22,23. Due to this additional complexity, threshold models are able to emulate system-wide adoption patterns known as global cascades. Regarding the effects of social influence, several assumptions have been proposed about its functional dependency on the number of influencers. While Granovetter and others6,9 suggest a simple linear correspondence, e.g. observed in large techno-social systems21, from the theory of Latané24 one could derive a non-linear dependency as recently demonstrated by small-scale online experiments1.

Behavioural cascades are rare but potentially stupendous social spreading phenomena, where collective patterns of exposure emerge as a consequence of small initial perturbations. Some examples are the rapid emergence of political and grassroots movements25,26,27, and the fast spreading of information12,28,29,30,31,32,33,34 or behavioural patterns35. The characterisation34,36,37,38,39,40 and modelling9,41,42,43 of such processes have received plenty of attention and provide some basic understanding of the conditions and structure of empirical and synthetic cascades on various types of networks44,45,46,47. However, these studies commonly fail to address the temporal dynamics of the emerging cascades, which may vary considerably between different cases of social contagion. Moreover, they have not answered why real-world cascades can evolve through various dynamic pathways ranging from slow to rapid patterning, especially in systems where threshold mechanisms play a role and social phenomena spread globally. Besides the case of rapid cascading mentioned above, an example of the other extreme is the propagation of products in social networks18, where adoption evolves gradually even if it is driven by threshold mechanisms and may cover a large fraction of the total population21. This behaviour characterises the adoption of online services such as Facebook, Twitter, LinkedIn and Skype (Fig. 1a), since their yearly maximum relative growth of cumulative adoption48 (for definition see Appendix) is lower than in the case of rapid cascades as suggested e.g. by the Watts threshold (WT) model.

Figure 1: Structure and dynamics of online service adoption.

(a) Yearly maximum relative growth rate (RGR) of cumulative adoption (see Appendix) for several online social-communication services48, including three Skype paid services (s1 - “subscription”, s2 - “voicemail”, and s3 - “buy credit”). The red bar corresponds to a rapid cascade of adoption suggested by the Watts threshold (WT) model, while the green bar is the model prediction for Skype s3. (b,c) Snowball sample of the Skype social network (gray links) with nodes and links coloured according to their adoption state: multiple innovators (green nodes), induced small vulnerable trees (red nodes and links), and the triggered connected stable cluster (blue nodes and links). Note that some vulnerable and stable clusters seemingly appear without an innovator seed due to the finite distance used in the snowball sampling method. (d) Degree distribution P(k) of the Skype network (gray/blue circles for raw/binned data) on double log-scale with arbitrary base n. P(k) is fitted by a lognormal distribution (see Appendix and SI) with parameters μD = 1.2 and σD = 1.39, and average z = 8.56 (red line). (e) Distribution P(Φk) of integer thresholds Φk for several degree groups in Skype s3 (inset). By using P(Φk, k) = kP(Φk/k), these curves collapse to a master curve approximated by a lognormal function (dashed line in main panel) with parameters μT = −2 and σT = 1, as constrained by the average threshold w = 0.19 (see Appendix and SI). (f) Adoption rate of innovators [Ri(t)], vulnerable nodes [Rv(t)], and stable nodes [Rs(t)], as well as net service adoption rate [R(t)]. Rates are measured with a 1-month time window, while q and τ are arbitrary constants. The shaded area indicates the regime where innovators adopt approximately with constant rate.

To fill this gap in the modelling of social diffusion, here we analyse and model real-world examples of social contagion phenomena. Our aim is to identify the crucial mechanisms that models of complex contagion need to consider to match reality better, and to define a model that incorporates these mechanisms and captures the possible dynamics leading to the emergence of real-world global cascades. We follow the adoption dynamics of the Skype paid service “buy credit” for 89 months since 2004, which evolves over the social network of one of the largest voice over internet providers in the world. Data includes the time of first payment of each user, an individual and conscious action that tracks adoption behaviour. In addition we follow the “subscription” service over 42 months since 2008 (for results see Supplementary Information [SI]). In contrast to other empirical studies where incomplete knowledge about the underlying social network leads to unavoidable bias21, we use here the largest connected component of the aggregated free Skype service as the underlying structure, where nodes are Skype users and links are confirmed contacts between them. This is a good approximation since it maps all connections in the Skype social network without sampling, and the paid service is only available for individuals already enrolled in the Skype network. Also note that the service adoption process evolves on a considerably faster time scale than the underpinning social network. Thus, applying a time-scale separation and considering the network to be static may provide a good first approximation here. The underlying structure is an aggregate from September 2003 to November 2011 (i.e. over 99 months) and contains roughly 4.4 billion links and 510 million registered users worldwide49. The data is fully anonymised and considers only confirmed connections between users (for more data details see SI).

In what follows we first provide empirical evidence of the distribution of individual adoption thresholds and other structural and dynamical features of a worldwide adoption cluster. We incorporate the observed structural and threshold heterogeneities into a dynamical threshold model where multiple nodes are allowed to adopt spontaneously (i.e. as the first among their neighbours)50. We find that if the fraction of users who refuse to adopt the product is large, the system enters a quenched state where the evolution and structure of the global adoption cluster is very similar to our empirical observations. Model calculations and the analysis of the real social contagion process suggest that the evolving structure of an adoption cluster differs radically from what has been proposed earlier9, since it is triggered by several spontaneous adoptions arriving at a constant rate, while stable adopters, who initially resist exposure, are actually responsible for the emergence of global social adoption (Fig. 1b,c).

Results

Social contagion phenomena can be modelled as binary-state processes evolving on networks and driven by threshold mechanisms. In these systems individuals are represented by nodes, each being either in a susceptible (0) or adopter (1) state and influencing each other by transferring information via social ties6. Nodes are connected in a network with degree distribution P(k) and average degree z = 〈k〉. In addition, each node has an individual threshold ϕ ∈ [0, 1] drawn from a distribution P(ϕ) with average w = 〈ϕ〉. This threshold determines the minimum fraction of exposed neighbours that triggers adoption and captures the resistance of an individual against engaging in spreading behaviour. Once a node reaches its threshold, it switches state from 0 to 1 and keeps it until the end of the dynamics. In his seminal paper about threshold dynamics, Watts9 classified nodes into three categories based on their threshold and degree. He identified innovator nodes that spontaneously change state to 1, thus starting the process. Such nodes have a trivial threshold ϕ = 0. Then there are nodes with threshold 0 < ϕ ≤ 1/k, called vulnerable, which need one adopting neighbour before their own adoption. Finally, there are more resilient nodes with threshold ϕ > 1/k, denoted as stable, referring to individuals in need of strong social influence to follow the actions of their acquaintances.
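The classification above can be illustrated with a minimal sketch. The function names `watts_category` and `adopts` are hypothetical and chosen only for this illustration; the rule itself (innovator for ϕ = 0, vulnerable for 0 < ϕ ≤ 1/k, stable for ϕ > 1/k, and adoption once the fraction of adopting neighbours reaches ϕ) follows the definitions in the text.

```python
def watts_category(threshold: float, degree: int) -> str:
    """Return the node category for a fractional threshold and a degree.

    Hypothetical helper: innovator (phi = 0), vulnerable (0 < phi <= 1/k),
    stable (phi > 1/k), as described in the text.
    """
    if degree <= 0:
        raise ValueError("degree must be positive")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    if threshold == 0.0:
        return "innovator"
    if threshold <= 1.0 / degree:
        return "vulnerable"
    return "stable"


def adopts(threshold: float, degree: int, adopting_neighbours: int) -> bool:
    """Threshold rule: adopt once the adopting fraction reaches the threshold."""
    if degree <= 0:
        raise ValueError("degree must be positive")
    return adopting_neighbours / degree >= threshold
```

For example, a node with k = 5 and ϕ = 0.5 is stable under this rule and would need at least three adopting neighbours before adopting.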

In the WT model9, small perturbations (like the spontaneous adoption of a single seed node) can trigger global cascading patterns. However, their emergence is subject to the so-called cascade condition: the innovator seed has to be linked to a percolating vulnerable cluster, which adopts immediately afterwards and further triggers a global cascade (i.e. a set of adopters larger than a fixed fraction of the finite network). The cascade condition is satisfied if the network is inside a bounded regime in (w, z)-space9. This regime depends on degree and threshold heterogeneities9 and may change its shape if several innovators start the process42. In addition, while models with more sophisticated social influence function can be introduced24,51 the original linear-threshold assumption proposed by Watts and Granovetter seems to be sufficient to interpret our observations.

Empirical observations

Degree and threshold heterogeneities are indeed present in the social network of Skype. The degree distribution P(k) is well approximated by a lognormal function (k ≥ kmin) with parameters μD = 1.2, σD = 1.39 and kmin = 1 (Fig. 1d), giving an average degree z = 8.56 (for goodness of fit see SI). Moreover, at the time of adoption we can measure the threshold ϕ = Φk/k of a user by counting the number Φk of its neighbours who have adopted the service earlier. We then group users by degree and calculate the distribution P(Φk) of the integer threshold Φk38 (Fig. 1e). By using the scaling relation P(Φk, k) = kP(Φk/k) all distributions collapse to a master curve well approximated by a lognormal function, with parameters μT = −2 and σT = 1 as constrained by the average threshold w = 0.19 (see Appendix and SI). Note that we observe qualitatively the same scaling and lognormal shape of the threshold distribution for another service (see SI). These empirical observations, in addition to the broad degree distribution, provide quantitative evidence about the heterogeneous nature of adoption thresholds.

Since we know the complete structure of the online social network, as well as the first time of service usage for all adopters, we can follow the temporal evolution of the adoption dynamics. By counting the number of adopting neighbours of an ego, we identify innovators (Φk = 0), and vulnerable (Φk = 1) or stable (Φk > 1) nodes. The adoption rates for these categories behave rather differently from previous suggestions9 (Fig. 1f). First, there is not only one seed but an increasing fraction of innovators in the system who, after an initial period, adopt approximately at a constant rate. Second, vulnerable nodes adopt approximately with the same rate as innovators, suggesting a strong correlation between these types of adoption. This stationary behaviour is rather surprising as environmental effects, like competition or marketing campaigns, potentially influence the adoption dynamics. Nevertheless, this pattern appears to be consistent among the two investigated services (also see Fig. S5 in SI), where innovator and vulnerable adoptions evolve with approximately constant rates beyond statistical and periodic (circadian, monthly, yearly) fluctuations. On the other hand, the overall adoption process accelerates due to the increasing rate of stable adoptions induced by social influence. At the same time a giant adoption cluster grows and percolates through the whole network (Fig. 3a, main panel). Despite this expansion dynamics and the connected structure of the service adoption cluster, the service reaches less than 6% of the total number of active Skype users over a period of 7 years49. Therefore we ask whether one can refer to these adoption clusters as cascades. They are not triggered by a small perturbation but induced by several innovators; their evolution is not instantaneous but extends over several years; and although they involve millions of individuals, they reach only a small fraction of the whole network.
To answer we incorporate the above mentioned features into a dynamical threshold model50 with a growing group of innovators and investigate their effect on the evolution of global social adoption. Note that although we cannot follow the direct pathways of social influence, we perform a null model study to demonstrate at the system level that social influence is present and dominates the contagion process, as compared to effects of homophily (see section S3 of the SI, together with another empirical spreading scenario in S7.1).

Figure 2: Threshold model for the adoption of online services.

(a,b) Surface plot of the normalised fraction of adopters ρ/(1 − r) in (w, z)-space, for r = 0.73 and t = 89. Contour lines signal parameter values for which 20% of non-immune nodes have adopted, for fixed r and varying time (a), and for fixed time and varying r (b). The continuous contour line and dot indicate parameter values in the last observation of Skype s3. A regime of maximal adoption (ρ ≈ 1 − r) grows as time goes by, and shrinks for larger r. (c) Time series of the fraction of adopters ρ for fixed pn = 0.00019 and varying r (main), and for fixed r = 0 and varying pn (inset). These curves are well approximated by the solution of Eq. (1) for k0 = 3, kM−1 = 150 and M = 25 (dashed lines). The dynamics is clearly faster for larger pn. As r increases, the system enters a regime where the dynamics is slowed down and adopters are mostly innovators. (d) Final fraction of innovators ρ0(∞) and time tc when 50% of non-immune nodes have adopted as a function of r, both simulated and theoretical. The crossover to a regime of slow adoption is characterised by a maximal fraction of innovators and time tc. Unless otherwise stated, pn = 0.00019 and we use N = 10⁴, μD = 1.09, σD = 1.39, kmin = 1, μT = −2, and σT = 1 to obtain z = 8.56 and w = 0.19 as in Skype s3. The difference in μD between data and model is due to finite-size effects (see Appendix). Numerical results are averages over 10² (a-b) and 10³ (c-d) realisations.

Figure 3: Empirical cluster statistics and simulation results.

(a) Empirical connected-component size distribution at different times for the adoption [P(sa), main panel] and stable adoption [P(ss), inset] networks, with sa and ss relative to system size. (b) Empirical connected-component size distribution P(sv) for the relative size of innovator-induced vulnerable trees at different times. (c) Average size of the largest (LC) and 2nd largest (LC2nd) components of the model network (‘Net’), adoption network (‘Casc’), stable network (‘Stab’), and induced vulnerable trees (‘Vuln’) as a function of r. Dashed lines show the observed relative size of the real LC of the adopter network in 2011 [see main panel in (a)] and the predicted r value. (d) Distribution P(d) of depths of induced vulnerable trees in both data and model for several r values, showing a good fit with the data for r = 0.73. The difference in the tail is due to finite-size effects. (e) Correlation 〈sv〉(k) between innovator degree and average size of vulnerable trees in both data and model with the same r values as in (d). Model calculations for (d,e) correspond to networks of size N = 10⁶ and are averaged over 10² realisations.

Model

Our modelling framework is an extension to conventional threshold dynamics on networks studied by Watts, Gleeson, Singh, and others, where all nodes are initially susceptible and innovators are only introduced as an initial seed of arbitrary size9,15,42,43. Apart from the threshold rule discussed above, our model considers two additional features: (i) a fraction r of ‘immune’ nodes that never adopt, indicating lack of interest in the service; (ii) due to external influence, susceptible nodes adopt the innovation spontaneously (i.e. become innovators) throughout time with constant rate pn, rather than only at the beginning of the dynamics. In this way, the dynamical evolution of the system is completely defined by the online social network, the distribution P(ϕ) and the parameters r, pn. For the sake of simplicity we consider a configuration-model network, i.e., we ignore correlations in the social network and characterise it solely by its degree distribution P(k). Furthermore, node degrees and thresholds are considered to be independent38,52,53. We remark that somewhat similar concepts called “stubborn nodes” mimicking individuals’ resistance against adoption54,55, and “global nodes” capturing adoptions driven by external effects56 have been considered in threshold models with a rich variety of effects on cascading behaviour.

Our threshold model, which has also been introduced in ref. 50, can be studied analytically by extending the framework of approximate master equations (AMEs) for monotone binary-state dynamics recently developed by Gleeson38,52,53, where the transition rate between susceptible and adoption states only depends on the number m of network neighbours that have already adopted. We describe a node by the property vector k = (k, c), where k = k0, k1, …, kM−1 is its degree and c = 0, 1, …, M its type, i.e. c = 0 is the type of the fraction r of immune nodes, while c ≠ 0 is the type of all non-immune nodes that have threshold ϕc. In this way P(ϕ) is substituted by the discrete distribution of types P(c) (for c > 0). The integer M is the maximum number of degrees (or non-zero types) considered in the AME framework, which can be increased to improve the accuracy of the analytical approximation at the expense of speed in its numerical computation (see S4.2). Under these conditions, the AME system describing the dynamics of the threshold model is reduced to the pair of ordinary differential equations (see SI),

where ρ(t) is the fraction of adopters in the network, ν(t) is the probability that a randomly chosen neighbour of a susceptible node is an adopter, and the initial conditions are ρ(0) = ν(0) = 0. Here,

and,

where , pr = pn/(1 − r), and is the binomial distribution. The fraction of adopters ρ is then obtained by solving Eq. (1) numerically. Since susceptible nodes adopt spontaneously with rate pn, the fraction of innovators ρ0(t) in the network is given by (see S4.3),

We also implement the threshold model numerically via a Monte Carlo simulation in a network of size N, with a lognormal degree distribution and a lognormal threshold distribution as observed empirically. Thus, we can explore the behaviour of ρ and ρ0 as a function of z, w, pn and r, both in the numerical simulation and in the theoretical approximation given by Eqs (1) and (4). For pn > 0 some nodes adopt spontaneously as time passes by, leading to a frozen state characterised by a final fraction ρ(∞) = 1 − r of adopters. However, the time needed to reach such state depends heavily on the distribution of degrees and thresholds, as signalled by a region of large adoption (ρ ≈ 1 − r) that grows in (w, z)-space with time (contour lines in Fig. 2a). If we fix a time in the dynamics and vary the fraction of immune nodes instead, this region shrinks as r increases (contour lines in Fig. 2b). In other words, the set of networks (defined by their average degree and threshold) that allow the spread of adoption is larger at later times in the dynamics, or when the fraction of immune nodes is small. When both t and r are fixed, the normalised fraction of adopters ρ/(1 − r) gradually decreases for less connected networks with larger thresholds (surface plot in Fig. 2a,b).
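The Monte Carlo procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the reported results: degrees are obtained by rounding lognormal samples, the configuration model discards self-loops and repeated links, thresholds above 1 are clipped to 1, updates are synchronous, and the per-step spontaneous adoption probability `p_spont` merely stands in for the constant innovator rate pn. All function and parameter names are hypothetical.

```python
import random


def sample_degrees(n, mu, sigma, kmin, rng):
    """Rounded lognormal degrees (assumption), with an even total for stub matching."""
    degrees = [min(n - 1, max(kmin, round(rng.lognormvariate(mu, sigma))))
               for _ in range(n)]
    if sum(degrees) % 2:
        degrees[0] += 1
    return degrees


def configuration_model(degrees, rng):
    """Random stub matching; self-loops and repeated links are dropped (assumption)."""
    stubs = [node for node, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    neighbours = [set() for _ in degrees]
    for u, v in zip(stubs[::2], stubs[1::2]):
        if u != v:
            neighbours[u].add(v)
            neighbours[v].add(u)
    return neighbours


def simulate(n, steps, r, p_spont, mu_d=1.09, sigma_d=1.39, kmin=1,
             mu_t=-2.0, sigma_t=1.0, seed=0):
    """Return the fraction of adopters after each synchronous iteration."""
    rng = random.Random(seed)
    neighbours = configuration_model(
        sample_degrees(n, mu_d, sigma_d, kmin, rng), rng)
    # Lognormal thresholds, clipped to 1 (assumption).
    thresholds = [min(1.0, rng.lognormvariate(mu_t, sigma_t)) for _ in range(n)]
    # Each node is immune with probability r and then never adopts.
    immune = [rng.random() < r for _ in range(n)]
    adopted = [False] * n
    rho = []
    for _ in range(steps):
        newly_adopted = []
        for i in range(n):
            if adopted[i] or immune[i]:
                continue
            k = len(neighbours[i])
            m = sum(adopted[j] for j in neighbours[i])
            spontaneous = rng.random() < p_spont
            influenced = k > 0 and m / k >= thresholds[i]
            if spontaneous or influenced:
                newly_adopted.append(i)
        for i in newly_adopted:
            adopted[i] = True
        rho.append(sum(adopted) / n)
    return rho
```

A call such as `simulate(n=10_000, steps=89, r=0.73, p_spont=0.00019)` mirrors the parameter values quoted in the text, although with the simplifications above it should not be expected to reproduce the reported curves quantitatively.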

For r ≈ 0 the critical fraction of innovators necessary to trigger a cascade of fast adoption throughout all susceptible nodes may be identified as the inflection point in the time series of ρ (Fig. 2c, inset). The adoption cascade appears sooner for larger pn, since this parameter regulates how quickly the critical fraction of innovators is reached. Yet as we increase r above a threshold rc, the system enters a regime where rapid cascades disappear and adoption is slowed down. The crossover between these regimes is gradual, as seen in the shape of ρ for increasing r (Fig. 2c, main panel). We may identify rc in various ways: by the maximum in both the final fraction of innovators ρ0(∞) and the critical time tc when ρ = (1 − r)/2 (Fig. 2d), or as the r value where the inflection point in ρ disappears. These measures indicate rc ≈ 0.8 for the chosen parameters. All global properties of the dynamics (like the functional dependence of ρ and ρ0) are very well approximated by the solution of Eqs (1) and (4) (dashed lines in Fig. 2c,d). Indeed, the AME framework is able to capture the shape of the ρ time series, the crossover between regimes of fast and slow adoption, as well as the maximum in ρ0(∞) and tc.
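The critical time tc used above, defined by ρ = (1 − r)/2, can be read off a simulated time series with a short helper. This is a sketch only: the name `critical_time` is hypothetical, time is counted in iterations starting from 1, and the first iteration at which ρ reaches the target is taken as tc (an assumption about how the crossing is discretised).

```python
def critical_time(rho, r):
    """First iteration (1-based) at which rho reaches half of the non-immune fraction.

    Returns None if the target (1 - r) / 2 is never reached in the series.
    """
    target = (1.0 - r) / 2.0
    for t, value in enumerate(rho, start=1):
        if value >= target:
            return t
    return None
```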

Validation

To better understand how innovation spreads throughout real social networks, we take a closer look at the internal structure of the service adoption process. By taking into account individual adoption times we construct an evolving adoption network with links between users who have adopted the service before time t and are connected in the social structure. In order to avoid the effect of instantaneous group adoptions (evidently not driven by social influence), we only consider links between nodes who are neighbours in the underlying social network and whose adoption did not happen at the same time. This way links in the adoption graph indicate ties where social influence among individuals could have existed. The size distribution P(sa) of connected components in the adoption network shows the emergence of a giant percolating component over time (Fig. 3a), along with several other small clusters. Moreover, after decomposition we observe that the giant cluster does not consist of a single innovator seed and percolating vulnerable tree9, but builds up from several innovator seeds that induce small vulnerable trees locally (Fig. 3b), each with small depth (Fig. 3d)34,57. At the same time the stable adoption network (considering connections between all stable adopters at the time) has a giant connected component, indicating the emergence of a percolating stable cluster with size comparable to the largest adoption cluster (Fig. 3a, inset). These observations suggest a scenario for the evolution of the global adoption component different from earlier threshold models9. It appears that here multiple innovators adopt at different times and trigger local vulnerable trees (Fig. 1b), which in turn induce a percolating component of connected stable nodes that holds the global adoption cluster together (Fig. 1c). 
Consequently, in the structure of the adoption network primary triggering effects are important only locally, while external and secondary triggering mechanisms seem to be responsible for the emergence of global-scale adoption.

To model the observed dynamics and explore the effect of immune nodes, we perform extensive numerical simulations of the threshold model with parameters determined directly from the data (see Appendix and SI). We use a network structure with empirical degree and threshold distributions and fix pn = 0.00019 as the constant rate of innovators, implying that the time scale of a Monte Carlo iteration in the model is 1 month. We measure the average size of the largest (LC) and second largest (LC2nd) connected components of the background social network, and of the stable, vulnerable and global adoption networks, as a function of the fraction of immune nodes r. After T = 89 iterations (matching the length of the real observation period) we identify three regimes of the dynamics (Fig. 3c): if 0 < r < 0.6 (dark-shaded area) the spreading process is very rapid and evolves in a global cascade, which reaches most of the nodes of the shrinking susceptible network in a few iteration steps. About 10% of adopters are connected in a percolating stable cluster, while vulnerable components remain very small in accordance with empirical observations. In the crossover regime 0.6 < r < 0.8 (light-shaded area), the adoption process slows down considerably (Fig. 2d, lower panel), as stable adoptions become less likely due to the quenching effect of immune nodes. The adoption process becomes the slowest at rc = 0.8 (Fig. 2d, lower panel) when the percolating stable cluster falls apart, as demonstrated by a peak in the corresponding LC2nd curve (Fig. 3c, lower panel). Finally, around r = 0.9 the adoption network becomes fragmented and no global diffusion takes place. We repeat the same calculations for another service and find qualitatively the same picture, but with the crossover regime shifted towards larger r values due to the different parametrisation of the model process. 
Note that another possible reason for the slow adoption could be the time users wait between reaching their threshold and actually adopting. We test for the effect of this potential scenario on the empirical curves but find no qualitative change in the dynamics (see SI).

We can use these calculations to estimate the only unknown parameter (the fraction r of immune nodes in Skype) by matching the size of the largest component (LCNet) between real and model adoption networks at time T. Empirically, this value is the relative size corresponding to the last point on the right-hand side of the distribution for 2011 (Fig. 3a, main panel). The corresponding value in the model is r = 0.73 (dashed lines in Fig. 3c; also Fig. 2a,b), suggesting that the real adoption process lies in the crossover regime. The other analysed service turns out to lie right of the crossover regime, which explains its large innovator adoption rate and reduced size of stable and vulnerable adoption clusters (see SI).

To test the validity of the prediction of r we perform three different calculations. First we measure the maximum relative growth rate of cumulative adoption and find a good match between model and data (Skype s3 and Model Skype s3 in Fig. 1a). In other words, the model correctly estimates the speed of the adoption process. Second, we measure the distribution P(d) of depths of induced vulnerable trees (Fig. 3d). Finally, in order to verify earlier theoretical suggestions42, we look at the correlation 〈sv〉(k) between the degree of innovators and the average size of vulnerable trees induced by them (Fig. 3e). We perform the last two measurements on the real data and in the model process for r = 0.6 and 0.9, as well as for the predicted value r = 0.73. In the case of 〈sv〉(k), we find a strong positive correlation in the data, explained partially by degree heterogeneities in the underlying social network, but surprisingly well emulated by the model. More importantly, although both quantities appear to scale with r, measures for the estimated r value fit the empirical data remarkably well, confirming our earlier validation based on the matching of relative component sizes (for further discussion see SI).

Discussion

Although some products and innovations diffusing in society may cover a large fraction of the population, their spreading tends to follow slow cascading patterns, the dynamics of which have been modelled before by simple diffusion models like that of Bass18. However, this approach neglects threshold mechanisms that arguably drive the decision making of single individuals. On the other hand, threshold models study the conditions for cascades in global diffusion but do not address their temporal evolution, which is clearly a relevant factor in real-world adoption processes. These models are commonly used to predict rapid cascading patterns of adoption, which is a more realistic scenario for the spreading of information, opinions, or behavioural patterns but are not observed in the case of product or innovation diffusion where adoption requires additional efforts, e.g., free or paid registration. Here we provide a solution for this conundrum by analysing and modelling the worldwide spread of an online service in the techno-social communication network of Skype. Beyond the novel empirical evidence about heterogeneous adoption thresholds and non-linear dynamics of the adoption process, we identify two additional components necessary to introduce in the modelling of product adoption, namely: (a) a constant flow of innovators, which may induce rapid adoption cascades even if the system is initially out of the cascading regime; and (b) a fraction of immune nodes that forces the system into a quenched state where adoption slows down. These features are responsible for a critical structure of empirical adoption components that radically differs from previous theoretical expectations. We incorporate these mechanisms into a threshold model controlled by the rate of innovators and the fraction of immune nodes. 
The model contains several simplifying assumptions, e.g., about the functional form of social influence, the uniformity of nodes with the same degree, or about ignoring homophily in the adoption process. Nevertheless, it is able to reproduce several pathways ranging from cascading behaviour to more realistic dynamics of innovation adoption. By constraining the model with empirically determined parameters, we provide an estimate for the real fraction of susceptible agents in the social network of Skype, and validate this prediction through correlated structural features matching empirical observations.

Our aim in this study was to provide empirical observations as well as methods and tools to model the dynamics of social contagion phenomena, in the hope that they will foster ideas for future research. One possible direction would be the observation of the reported structure and evolution of the global adoption cluster in other systems similar to the ones studied in refs 26,28,29,34,36,57. Other promising directions could be the consideration of homophilic or assortative structural correlations, or the evolving nature of the underpinning social network with social ties created and dissolved over time (as studied in ref. 21), or the effects of interpersonal influence or leader-follower mechanisms on the social contagion process. Finally, we hope that the reported results may improve efficiency in the strategies of enhancing the diffusion of products and innovations, by shifting attention from the creation of short-lived perturbations to the sustenance of external input.

Material and Methods

Data description

We use a static representation of the Skype social network aggregated over 99 months between September 2003 and November 2011. We follow the adoption of the “buy credit” paid service for 89 months starting from 2004, and the paid service “subscription” for 42 months starting from 2008 (for further details about the network and service see SI). By considering the online social structure and adoption times, we identify users as innovator, vulnerable, or stable nodes based on the number Φk of adopting neighbours at the time of exposure. Thresholds are calculated as ϕ = Φk/k for users with k contacts. The adoption network at time t is constructed by considering confirmed social links between users who adopted the service earlier than t. In order to avoid the effect of instantaneous group adoptions (evidently not driven by social influence), we only consider links between nodes that are neighbours in the underlying social network and whose adoptions did not happen at the same time. Note that for the categorisation of nodes we use only the adoption time and the state of their peers, and thus real categories may differ slightly. For example, an innovator whose decision was not driven by social influence may still appear as a vulnerable or stable node if some of its peers adopted earlier. To account for this bias, we measure effective rates of adoption for the model process in the same way as for the empirical case (see Fig. 1 and section S3).
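The categorisation described above can be sketched in a few lines. This is only an illustration under stated assumptions: it takes the common convention that innovators have Φk = 0, vulnerable nodes Φk = 1, and stable nodes Φk > 1 (these cut-offs are not restated in this section), and the function name and input layout are hypothetical rather than taken from the original analysis.

```python
def classify_adopters(neighbours, adoption_time):
    """Label each adopter and compute its threshold phi = Phi_k / k.

    neighbours:    dict node -> list of neighbouring nodes in the static network
    adoption_time: dict node -> adoption time (only adopters appear as keys)
    Returns dict node -> (label, Phi_k, phi).
    """
    result = {}
    for node, t in adoption_time.items():
        nbrs = neighbours.get(node, ())
        k = len(nbrs)
        # Count neighbours that adopted strictly earlier; simultaneous
        # adoptions are ignored, mirroring the rule for adoption links.
        phi_k = sum(
            1 for n in nbrs if n in adoption_time and adoption_time[n] < t
        )
        # Assumed cut-offs: 0 -> innovator, 1 -> vulnerable, >1 -> stable.
        if phi_k == 0:
            label = "innovator"
        elif phi_k == 1:
            label = "vulnerable"
        else:
            label = "stable"
        threshold = phi_k / k if k > 0 else 0.0
        result[node] = (label, phi_k, threshold)
    return result
```

As noted in the text, such labels depend only on adoption times and peer states, so a node labelled vulnerable or stable here may in reality have adopted spontaneously.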

Maximum relative growth rate

This measure is obtained by taking the maximum of the yearly adoption rate (yearly count of adoptions) normalised by the final observed adoption number of a given service. It characterises the maximum speed of adoption a service experienced during its history and takes values between 0 (no cascade) and 1 (instantaneous cascade). We repeat this measurement for the estimated number of registered users of Facebook, Twitter, and LinkedIn48, as well as for the number of active users of Skype and three paid Skype services. Adoption rates for Facebook, Twitter, and LinkedIn correspond to the period between 2006 and 2012, and for Skype and its services to the interval from release date until 2011.
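As a minimal sketch of this measure, assume the yearly adoption counts are available as a plain list and that the final observed adoption number equals their sum (i.e., nobody un-adopts); the function name is hypothetical.

```python
def max_relative_growth_rate(yearly_adoptions):
    """Maximum yearly adoption count divided by the final number of adopters.

    yearly_adoptions: list of non-negative adoption counts, one per year.
    The final adoption number is assumed to be the sum of the yearly counts.
    """
    total = sum(yearly_adoptions)
    if total == 0:
        return 0.0  # no adoptions at all
    return max(yearly_adoptions) / total
```

For example, a service with yearly counts `[10, 30, 40, 20]` has a maximum relative growth rate of 40/100 = 0.4, whereas a service adopted entirely within a single year reaches the upper limit of 1.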

Empirical parameter estimation

We use the Skype data to directly determine all model parameters, apart from the fraction r of immune nodes. To best approximate the degree distribution of the real network, after testing different candidate functions (see SI) we select a lognormal function with parameters μD = 1.2 and σD = 1.39 and minimum degree kmin = 1, leading to the average degree z = 8.56. To account for finite-size effects in the model results for low N (Fig. 2), we decrease μD slightly to obtain the same value of z as in the real network.

The threshold distribution of each degree group collapses onto a master curve after normalisation using the scaling relation $P(\Phi_k, k) = k\,P(\Phi_k/k)$. This master curve is well approximated by the lognormal distribution $P(\phi) = \frac{1}{\phi\,\sigma_T\sqrt{2\pi}} \exp\left[-\frac{(\ln\phi - \mu_T)^2}{2\sigma_T^2}\right]$, with parameters μT = −2 and σT = 1 as determined by the empirical average threshold w = 0.19 and standard deviation 0.233 (for further details see SI). We estimate a rate of innovators pn = 0.00019 by fitting a constant function to Ri(t) for t > 2τ (Fig. 1f). This fit also sets the time scale of one Monte Carlo iteration in the model to 1 month. Model results (Fig. 3d,e) are calculated with r = 0.73 and pn = 0.00019. Simulation results in Fig. 3c–e are averaged over 100 configuration-model networks of size $N = 10^5$ ($10^6$) after T = 89 iterations, matching the length of the observation period in Skype.

Model description

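We characterise the static social network by the extended distribution $P(\mathbf{k})$, where the property vector $\mathbf{k} = (k, c)$ combines the degree k and the type c of a node, with $P(\mathbf{k}) = rP(k)$ for c = 0 and $P(\mathbf{k}) = (1 - r)P(k)P(c)$ for c > 0. Non-immune, susceptible nodes with property vector $\mathbf{k}$ adopt spontaneously at a constant rate pn; otherwise they adopt only if a fraction ϕc of their k neighbours have adopted before. These rules are condensed in the probability $F_{\mathbf{k},m}\,dt$ that a node will adopt in a small time interval dt, given that m of its neighbours are already adopters,

$$F_{\mathbf{k},m} = \begin{cases} p_n & \text{if } m < k\phi_c \\ 1 & \text{if } m \geq k\phi_c \end{cases} \qquad \forall m \text{ and } k, c \neq 0,$$

with $F_{(k,0),m} = 0$ $\forall k, m$ and $F_{(0,c),0} = p_n$ $\forall c \neq 0$ (for immune and isolated nodes, respectively). The dynamics of adoption is well described by an approximate master equation (AME) for the fraction $s_{\mathbf{k},m}(t)$ of $\mathbf{k}$-nodes that are susceptible at time t and have m = 0, …, k adopting neighbours52,53,58,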

where . To reduce the dimensionality of Eq. (6) we consider the ansatz for m < c, leading to the condition . With and some algebra, this condition is reduced to Eq. (1) (see SI).
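The adoption rules above can also be illustrated with a direct simulation. The following is a rough sketch, not the authors' implementation: it assumes a synchronous update in which every susceptible node is tested once per step, draws immune nodes independently with probability r, and treats the spontaneous adoption rate as a per-step probability. The function name, argument layout, and the ring network used in the example are all hypothetical.

```python
import random


def simulate_adoption(neighbours, thresholds, r, p_n, steps, seed=0):
    """Synchronous Monte Carlo sketch of threshold adoption.

    neighbours: dict node -> list of neighbours (static, non-empty network)
    thresholds: dict node -> threshold phi, assumed to lie in (0, 1]
    r:          probability that a node is immune (never adopts)
    p_n:        per-step probability of spontaneous adoption (innovators)
    Returns (history, immune); history[t] is the adopted fraction after
    step t + 1.
    """
    rng = random.Random(seed)
    nodes = list(neighbours)
    immune = {v for v in nodes if rng.random() < r}
    adopted = set()
    history = []
    for _ in range(steps):
        newly_adopted = set()
        for v in nodes:
            if v in adopted or v in immune:
                continue
            k = len(neighbours[v])
            m = sum(1 for u in neighbours[v] if u in adopted)
            if k > 0 and m >= thresholds[v] * k:
                newly_adopted.add(v)  # threshold rule: m >= k * phi
            elif rng.random() < p_n:
                newly_adopted.add(v)  # spontaneous (innovator) adoption
        # Apply all adoptions at once so that the outcome of a step depends
        # only on the state at the beginning of that step.
        adopted |= newly_adopted
        history.append(len(adopted) / len(nodes))
    return history, immune


# Toy example on a ring of 200 nodes with a uniform threshold of 0.5.
n = 200
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
phi = {i: 0.5 for i in range(n)}
history, immune = simulate_adoption(ring, phi, r=0.3, p_n=0.01, steps=89)
print(f"final adopted fraction: {history[-1]:.3f}")
```

In this sketch the adopted fraction can never exceed the fraction of non-immune nodes, which is the quenching effect attributed to immune nodes in the text, while a non-zero spontaneous rate keeps seeding new local cascades throughout the run.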

Additional Information

How to cite this article: Karsai, M. et al. Local cascades induced global contagion: How heterogeneous thresholds, exogenous effects, and unconcerned behaviour govern online adoption spreading. Sci. Rep. 6, 27178; doi: 10.1038/srep27178 (2016).