Cascading collapse of online social networks

Online social networks have an increasing influence on our society: they may play decisive roles in politics and can be crucial for the fate of companies. Such services compete with each other, and some may even break down rapidly. Using social network datasets, we show the main factors leading to such a dramatic collapse. At an early stage mostly the loosely bound users disappear; later, collective effects play the main role, leading to cascading failures. We present a theory based on a generalised threshold model to explain the findings and show how the collapse time can be estimated in advance using the dynamics of the churning users. Our results shed light on possible mechanisms of instabilities in other competing social processes.


Further details of the model.
Even though the network of the OSN iWiW had an average degree of k ≈ 220, it is clear that users do not take all their acquaintances into account when deciding whether to continue using or to abandon a given service. They are generally interested in the intimate friends with whom they regularly communicate. We have therefore checked for which average degree our model best fits the empirical data. After fitting the exogenous timescale µ to the early years (2007−2009), the waiting time is fitted to the ratio of active users. Table 1 summarises the results, while Fig. 2 (b) shows the best fits with k = 10 and k = 200. Remarkably, the best fit was found for k = 10. We note that the corresponding waiting time, slightly more than two weeks, is also realistic.
We can thus assume that, when considering abandoning an OSN, users do not care about all their acquaintances but only about their intimate relationships, which amount to about 10 persons. Considering that on average 1/3 of the population was not on iWiW, this indicates that people are attached to 12−15 friends. This number is similar to the size of Dunbar's circle of intimate relationships [5].
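The rescaling behind this estimate can be written out as a short worked step. This is only a sketch under an assumption that is not stated explicitly in the text: the fitted degree counts only those friends who were present on iWiW, and absent friends are spread uniformly over the population. The symbols k_fit and k_real are introduced here for illustration only.

```latex
k_{\text{real}} \approx \frac{k_{\text{fit}}}{1 - 1/3}
               = \frac{10}{2/3} = 15 ,
\qquad
\frac{8}{2/3} = 12 .
```

Under this reading, the upper end of the range follows from the fitted degree of 10, and the lower end would correspond to a fitted degree of 8.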
We have chosen a network in which around 50% of the links are part of a triangle. This was motivated by the high clustering of the original iWiW data. The model gives similar results for a network without triangles, but with an average degree of 8 and an average waiting time of 12 days.
In order to compare our model to the empirical results, we extended the sparse network with k = 10 to have an average degree of 220, as in the original iWiW network. This was done in a probabilistic, proportional way (assuming that those who have more intimate friends also have more acquaintances): new edges were added by connecting nodes with probability proportional to their original degree, while keeping the ratio of triangles the same as in the sparse network. The extended network was used only to calculate r_end and to select users for removal by the exogenous effect; it played no role in the cascade process.
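The degree-proportional extension described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `extend_network`, the edge-list input and the rejection-sampling loop are assumptions made for illustration, and the sketch deliberately omits the step that keeps the ratio of triangles fixed.

```python
import random

def extend_network(edges, target_mean_degree, rng=None):
    """Add edges until the mean degree reaches the target.

    Both endpoints of each new edge are drawn with probability
    proportional to the node's degree in the ORIGINAL sparse network.
    Assumes an undirected edge list without self-loops.
    NOTE: triangle-ratio preservation is not implemented here.
    """
    rng = rng or random.Random()
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    nodes = list(neighbours)
    # Weights are frozen at the original degrees and never updated.
    base_degree = [len(neighbours[u]) for u in nodes]
    extended = {u: set(nbrs) for u, nbrs in neighbours.items()}

    n = len(nodes)
    n_edges = sum(base_degree) // 2
    # Cap the target at the complete graph so the loop always terminates.
    target_edges = min(int(target_mean_degree * n / 2), n * (n - 1) // 2)

    while n_edges < target_edges:
        u, v = rng.choices(nodes, weights=base_degree, k=2)
        if u != v and v not in extended[u]:  # reject self-loops and duplicates
            extended[u].add(v)
            extended[v].add(u)
            n_edges += 1
    return extended
```

For example, `extend_network(sparse_edges, 220)` would return an adjacency dictionary for the dense network, where `sparse_edges` stands for a hypothetical edge list of the k = 10 network.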
For the exogenous effect we defined a selection weight for nodes of different degree: f(k) ∼ k^{-2}. This favours the selection of low-degree users for spontaneous churning, as motivated by the observation that low-degree nodes leave first. Further motivation for f(k) was taken from the average last login time as a function of the node degree. We have tested the sensitivity of the model to the actual form of f(k) and found all results to be qualitatively similar.
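A selection weight of this form can be turned into a sampling rule as in the sketch below. The function names, the dictionary input and the treatment of degree-0 users are assumptions of this illustration, not details taken from the model description.

```python
import random

def churn_weights(degrees, exponent=-2.0):
    """Unnormalised selection weights f(k) ~ k**exponent for each user.

    A pure power law is undefined at k = 0, so degree-0 users are
    treated like degree-1 users (an assumption of this sketch).
    """
    return {u: float(max(k, 1)) ** exponent for u, k in degrees.items()}

def pick_exogenous_churners(degrees, n, rng=None):
    """Draw up to n distinct users, favouring those with low degree."""
    rng = rng or random.Random()
    remaining = churn_weights(degrees)
    chosen = []
    for _ in range(min(n, len(remaining))):
        users = list(remaining)
        weights = [remaining[u] for u in users]
        u = rng.choices(users, weights=weights, k=1)[0]
        chosen.append(u)
        del remaining[u]  # sample without replacement
    return chosen
```

With the exponent −2, a user of degree 10 is one hundred times less likely to be drawn in a given step than a user of degree 1. Changing `exponent` is one simple way to probe the sensitivity to the form of f(k).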

Further empirical network.
In this section we study another empirical system, Gowalla, for which the collapse can be analysed. Gowalla is a location-based online social network service; the dataset was obtained from the Stanford Large Network Dataset Collection [6]. Figure 3 (a) shows the number of registered and active users as a function of time, measured in days after the foundation. The number of users in Gowalla increased almost linearly, while the number of users who left the service began to increase rapidly at around t = 400 days, even though new users were still arriving at the same pace. Since friendship network data were available, we could calculate the distribution of the ratio of active friends at the last login of a user. The result is plotted in Fig. 3 (b). For users with intermediate degrees (8−150) a peak can be seen at around 0.65, which indicates that in the case of Gowalla social pressure was important.
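The ratio of active friends at a user's last login can be computed along the following lines. This is a minimal sketch under a simplifying assumption that is not taken from the text: a friend counts as active at time t if the friend's own last login is later than t. The function name and the input dictionaries are hypothetical, and registration times are ignored.

```python
def active_friend_ratio(friends, last_login):
    """Fraction of each user's friends still active at the user's last login.

    friends:    dict mapping user -> iterable of friend ids
    last_login: dict mapping user -> time of last login (any comparable unit)
    Users without friends are skipped because the ratio is undefined.
    """
    ratios = {}
    for user, nbrs in friends.items():
        nbrs = list(nbrs)
        if not nbrs:
            continue
        t = last_login[user]
        # Assumption: "active at t" means the friend logged in after t.
        still_active = sum(1 for f in nbrs if last_login[f] > t)
        ratios[user] = still_active / len(nbrs)
    return ratios
```

A histogram of the returned values, restricted to users within a chosen degree range, would give a distribution of the kind discussed above.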
Gowalla differs from iWiW in that posting locations can be done alone, whereas activity in an OSN whose main aim is interaction with acquaintances requires more participation from friends. Consequently, for the ratio of active friends at the time of last login we see another maximum at r_end = 0 for Gowalla, indicating that many users who had a high affinity for sharing their geolocations kept this habit even after their friends had stopped doing the same.
We have tried to fit the cumulative number of active users of Gowalla with our model and achieved a reasonable result, as shown in Fig. 4 (a), using a network of similar degree, k = 14, with a wider threshold distribution, λ = 0.65 ± 0.25, and a waiting time of τ = 100 days. Note that in this case only 14% of all users leaving the service left due to social pressure. This small number of cascade users makes the estimate of the cascade peak rather uncertain. We found April 2011 ± 6 months, which coincides well with the history of Gowalla, especially in the countries with the highest usage (Sweden and Saudi Arabia).