# Detecting and modelling real percolation and phase transitions of information on social media

## Abstract

It is widely believed that information spread on social media is a percolation process, with parallels to phase transitions in theoretical physics. However, evidence for this hypothesis is limited, as phase transitions have not been directly observed in any social media. Here, through an analysis of 100 million Weibo and 40 million Twitter users, we identify percolation-like spread and find that it happens more readily than current theoretical models would predict. The lower percolation threshold can be explained by the existence of positive feedback in the coevolution between network structure and user activity level, such that more-active users gain more followers. Moreover, this coevolution induces an extreme imbalance in users’ influence. Our findings indicate that the ability of information to spread across social networks is higher than expected, with implications for many information-spread problems.

## Data availability

The data for this study are available at https://github.com/Jia-Rong-Xie/data-DMRP. Because the network data set is very large, detailed instructions for downloading it are also provided at www.huyanqing.com. Source data are provided with this paper.

## Code availability

The code for this study is available at https://github.com/Jia-Rong-Xie/code-DMRP.

## Acknowledgements

We thank L. Feng and W. Liu for their very helpful discussions. This work was supported by the Natural Science Foundation of Guangdong for Distinguished Youth Scholar, Guangdong Provincial Department of Science and Technology (grant no. 2020B1515020052), Guangdong High-Level Personnel of Special Support Program, Young TopNotch Talents in Technological Innovation (grant no. 2019TQ05X138), the National Natural Science Foundation of China (grant nos 61903385, 61773412, U1911201, U1711265 and 61971454) and the National Key R&D Program of China (grant no. 2018AAA0101203). The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.

## Author information

### Contributions

Y.H. conceived the project. Y.H., J.X. and G.Y. designed the experiments. J.X. performed the experiments and numerical modelling. J.X. and Y.H. solved the model. J.X., F.M., J.S., X.M., G.Y. and Y.H. discussed and analysed the results. J.X., G.Y. and Y.H. wrote the manuscript.

### Corresponding author

Correspondence to Yanqing Hu.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

Peer review information Nature Human Behaviour thanks Michael Danziger, Maksim Kitsak and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Extended data

### Extended Data Fig. 1 Relative error of estimated retweet probability $$\frac{{\hat \beta - \beta }}{\beta }$$ as a function of cascade size P∞.

Each dot represents one simulation of the data-driven model M2 on the 2014 Weibo network. $$\hat \beta$$ is the retweet probability estimated by $$\hat \beta = \frac{{P_\infty }}{{P_e}}$$, and the true retweet probability is β=0.001 for all dots. Both axes are plotted on a log scale.
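
The estimator and relative error named in this caption can be written out directly. The following is a minimal sketch, assuming that P∞ and P_e are available as plain numbers (their definitions are given in the main text, not in this caption). The function names and the input values are hypothetical and are not taken from the authors' code.

```python
def estimate_beta(p_inf: float, p_e: float) -> float:
    """Estimated retweet probability: beta_hat = P_inf / P_e."""
    if p_e <= 0:
        raise ValueError("p_e must be positive")
    return p_inf / p_e


def relative_error(beta_hat: float, beta: float) -> float:
    """Relative error of the estimate: (beta_hat - beta) / beta."""
    if beta == 0:
        raise ValueError("beta must be non-zero")
    return (beta_hat - beta) / beta


# Hypothetical inputs, chosen only to illustrate the arithmetic.
beta_true = 0.001  # the true retweet probability quoted in the caption
beta_hat = estimate_beta(p_inf=0.0012, p_e=1.0)
print(relative_error(beta_hat, beta_true))
```

A positive value means the estimator overshoots the true retweet probability; a negative value means it undershoots.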

### Extended Data Fig. 2 Global state and localized state of information spreading.

a, Schematic of an information cascade in a directed network. Spreading in an undirected network is a special case in which PGIN(β)=PGOUT(β)=PGSCC(β). b, Cascade sizes obtained from simulations of the data-driven percolation model on the Weibo network. The x-axis is plotted on a log scale. Each blue point represents one realization. The boxed region outlined by the dashed purple line is the most useful for observing phase transitions.
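
The statement that the three giant components coincide in an undirected network can be checked on a toy graph. The sketch below assumes the common convention that the giant in-component (GIN) and giant out-component (GOUT) both contain the giant strongly connected component (GSCC). It works on a fixed graph rather than on the β-dependent quantities in the caption, uses a simple quadratic-time search suitable only for small examples, and all names and edge lists are hypothetical.

```python
from collections import deque


def reachable(adj, sources):
    """Return the set of nodes reachable from `sources` by BFS over `adj`."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def giant_components(edges):
    """Return (GSCC, GIN, GOUT) for a directed edge list.

    Assumed convention: GIN and GOUT both include the GSCC.
    The SCC search costs O(n * m), so it is meant for toy graphs only.
    """
    nodes = {u for edge in edges for u in edge}
    fwd, bwd = {}, {}
    for u, v in edges:
        fwd.setdefault(u, []).append(v)
        bwd.setdefault(v, []).append(u)

    gscc, assigned = set(), set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        # The SCC of n is everything that n reaches and that reaches n.
        scc = reachable(fwd, [n]) & reachable(bwd, [n])
        assigned |= scc
        if len(scc) > len(gscc):
            gscc = scc

    if not gscc:
        return set(), set(), set()
    seed = [next(iter(gscc))]
    g_out = reachable(fwd, seed)  # nodes the GSCC can reach
    g_in = reachable(bwd, seed)   # nodes that can reach the GSCC
    return gscc, g_in, g_out


# Hypothetical directed toy graph: 0 -> 1 <-> 2 -> 3
directed = [(0, 1), (1, 2), (2, 1), (2, 3)]
print(giant_components(directed))

# Making every edge reciprocal mimics an undirected network.
undirected = directed + [(v, u) for u, v in directed]
print(giant_components(undirected))
```

In the directed case the three sets differ; once every edge is reciprocated, they collapse to the same connected component, which is the special case the caption describes.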

### Extended Data Fig. 3 Cascade size distribution ps of uniform models in Weibo.

a–c, The cascade size distribution ps for different retweet probabilities β or λ. The red crosses and yellow circles represent simulation results. The theoretical results are numerical solutions computed in quadruple-precision floating-point format. d, The probability of large cascade-size fluctuations below the percolation threshold. The probabilities that P falls within the boxes are theoretical results. Each blue dot represents a real cascade. $$\beta_c = 2.64 \times 10^{-3}$$ is the percolation threshold of uniform site percolation, which equals the threshold of the bond model. Both axes use a log scale in a–c; only the x-axis uses a log scale in d.
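
To make the idea of a simulated cascade size distribution concrete, the sketch below runs uniform bond percolation on a small directed follower graph: every follower who is exposed to a post retweets it independently with the same probability β. This is an illustration under stated assumptions, not the authors' data-driven model or their code. The graph, the function names and the parameter values are all hypothetical.

```python
import random
from collections import Counter, deque


def simulate_cascade(followers, seed_user, beta, rng):
    """Simulate one cascade under uniform bond percolation.

    `followers[u]` lists the users who see u's posts. Each exposure leads
    to a retweet independently with probability `beta`, and a user retweets
    at most once. Returns the number of users who posted or retweeted.
    """
    active = {seed_user}
    queue = deque([seed_user])
    while queue:
        u = queue.popleft()
        for v in followers.get(u, ()):
            if v not in active and rng.random() < beta:
                active.add(v)
                queue.append(v)
    return len(active)


def cascade_size_distribution(followers, beta, runs, seed=0):
    """Empirical distribution of cascade sizes over `runs` random seeds.

    Every user must appear as a key of `followers`, even with no followers.
    """
    rng = random.Random(seed)
    users = sorted(followers)
    counts = Counter(
        simulate_cascade(followers, rng.choice(users), beta, rng)
        for _ in range(runs)
    )
    return {size: n / runs for size, n in sorted(counts.items())}


# Hypothetical toy follower graph: user 0 is followed by users 1, 2 and 3.
toy = {0: [1, 2, 3], 1: [4], 2: [4, 5], 3: [], 4: [0], 5: []}
for beta in (0.1, 0.5, 0.9):
    print(beta, cascade_size_distribution(toy, beta, runs=1000))
```

On a graph this small the distribution only shifts towards larger sizes as β grows; a sharp threshold such as the one quoted in the caption only becomes visible on very large networks.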

### Extended Data Fig. 4 Distributions of relative error between empirical cascades and models.

a, Distribution of the relative error for the data-driven percolation model. b, Distribution of the relative error for the uniform percolation model.

## Supplementary information

### Supplementary Information

Supplementary Methods 1–5, Supplementary Results 1–6, Supplementary Discussion 1–8, Supplementary Figs. 1–32, Supplementary Tables 1–11 and Supplementary References.

## Source data

### Source Data Fig. 1

Statistical source data and simulation results.

### Source Data Fig. 2

Statistical source data, simulation results and theoretical results.

### Source Data Fig. 3

Statistical source data, simulation results and theoretical results.

### Source Data Extended Data Fig. 1

Simulation results.

### Source Data Extended Data Fig. 2

Simulation results.

### Source Data Extended Data Fig. 3

Statistical source data, simulation results and theoretical results.

### Source Data Extended Data Fig. 4

Statistical source data and simulation results.

## Rights and permissions

Xie, J., Meng, F., Sun, J. et al. Detecting and modelling real percolation and phase transitions of information on social media. Nat Hum Behav (2021). https://doi.org/10.1038/s41562-021-01090-z