Main

Social media companies are struggling to control online health dis- and misinformation, for example, during the COVID-19 pandemic in 2020 (ref. 8). Online narratives tend to be nurtured in in-built community spaces that are a specific feature of platforms such as Facebook (for example, fan pages) but not Twitter (refs. 3, 16–18). Previous studies have pointed out that what is missing is a system-level understanding at the level of millions of people (ref. 13), whereas another study (ref. 14) has highlighted the need to understand the role of algorithms and bots in the amplification of risk among unwitting crowds.

Here we provide a system-level analysis of the multi-sided ecology of nearly 100 million individuals expressing views regarding vaccination, which are emerging from the approximately 3 billion users of Facebook from across countries, continents and languages (Figs. 1, 2). The segregation in Fig. 1a arises spontaneously. Individuals come together into interlinked clusters. Each cluster is a Facebook page and its members (that is, fans) who subscribe to, share and interact with the content and narratives of that Facebook page. A link from cluster A to B exists when A recommends B to all its members at the page level, as opposed to a page member simply mentioning a cluster. Each red node is a cluster of fans of a page with anti-vaccination content. Cluster size is given by the number of fans, for example, the page ‘RAGE Against the Vaccines’ has a size of approximately 40,000 members. Blue nodes are clusters that support vaccinations, for example, the page ‘The Gates Foundation’ has a size (that is, number of fans) of more than 1 million. Each green node is a page focused around vaccines or another topic—for example, a school parent association—that has become linked to the vaccine debate but for which the stance is still undecided. Support and potential recruitment of these green clusters (crowds) is akin to a battle for the ‘hearts and minds’ of individuals in insurgent warfare.
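The cluster-and-link definitions above can be made concrete with a small sketch. The following Python is illustrative only and is not the authors' analysis code: the class `Cluster`, the helper functions, the cluster names and the toy links are all hypothetical, and the sizes merely echo the orders of magnitude quoted above. Each node is a page with a stance and a fan count, and a directed link is stored only for a page-level recommendation.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """One Facebook page plus its fans (hypothetical representation)."""
    name: str
    stance: str   # "anti", "pro" or "undecided"
    size: int     # number of fans
    recommends: set = field(default_factory=set)  # page-level out-links


def out_links_by_stance(clusters):
    """Count page-level recommendations (out-links) per stance."""
    counts = Counter()
    for cluster in clusters.values():
        counts[cluster.stance] += len(cluster.recommends)
    return counts


def population_by_stance(clusters):
    """Sum the number of fans per stance."""
    totals = Counter()
    for cluster in clusters.values():
        totals[cluster.stance] += cluster.size
    return totals


# Toy network: names and links are invented for illustration.
clusters = {
    "anti_local": Cluster("anti_local", "anti", 40_000, {"parents_assoc"}),
    "pro_big": Cluster("pro_big", "pro", 1_000_000),
    "parents_assoc": Cluster("parents_assoc", "undecided", 5_000,
                             {"anti_local", "pro_big"}),
}

out_links = out_links_by_stance(clusters)
population = population_by_stance(clusters)
```

In this toy network the pro-vaccination population is the largest, yet the undecided cluster holds the most out-links, which is the kind of contrast between size and network activity that the analysis below draws out.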

Seven unexpected features of this cluster network (Fig. 1) and its evolution (Fig. 2) together explain why negative views have become so robust and resilient, despite a considerable number of news stories that supported vaccination and were against anti-vaccination views during the measles outbreak of 2019 and recent efforts against anti-vaccination views from pro-vaccination clusters and Facebook.

First, although anti-vaccination clusters are smaller numerically (that is, have a minority total size, Fig. 1d) and have ideologically fringe opinions, anti-vaccination clusters have become central in terms of the positioning within the network (Fig. 1a). Specifically, whereas pro-vaccination clusters are confined to the smallest two of the three network patches (Fig. 2a), anti-vaccination clusters dominate the main network patch in which they are heavily entangled with a very large presence of undecided clusters (more than 50 million undecided individuals). This means that the pro-vaccination clusters in the smaller network patches may remain ignorant of the main conflict and have the wrong impression that they are winning.

Second, instead of the undecided population being passively persuaded by the anti- or pro-vaccination populations, undecided individuals are highly active: the undecided clusters have the highest growth of new out-links (Fig. 1a), followed by anti-vaccination clusters. Moreover, it is the undecided clusters who are entangled with the anti-vaccination clusters in the main network patch that tend to show this high out-link growth. These findings challenge our current thinking that undecided individuals are a passive background population in the battle for ‘hearts and minds’.

Third, anti-vaccination individuals form more than twice as many clusters compared with pro-vaccination individuals by having a much smaller average cluster size. This means that the anti-vaccination population provides a larger number of sites for engagement than the pro-vaccination population. This enables anti-vaccination clusters to entangle themselves in the network in a way that pro-vaccination clusters cannot. As a result, many anti-vaccination clusters manage to increase their network centrality (Fig. 2b) more than pro-vaccination clusters despite the media ambience that was against anti-vaccination views during 2019, and manage to reach better across the entire network (Fig. 2a).

Fourth, our qualitative analysis of cluster content shows that anti-vaccination clusters offer a wide range of potentially attractive narratives that blend topics such as safety concerns, conspiracy theories and alternative health and medicine, and also now the cause and cure of the COVID-19 virus. This diversity in the anti-vaccination narratives is consistent with other reports in the literature (ref. 4). By contrast, pro-vaccination views are far more monothematic. Using aggregation mathematics and a multi-agent model, we have reproduced the ability of anti-vaccination support to form into an array of many smaller-sized clusters, each with its own nuanced opinion, from a population of individuals with diverse characteristics (Fig. 3b and Supplementary Information).

Fifth, anti-vaccination clusters show the highest growth during the measles outbreak of 2019, whereas pro-vaccination clusters show the lowest growth (Fig. 1c). Some anti-vaccination clusters grow by more than 300%, whereas no pro-vaccination cluster grows by more than 100% and most clusters grow by less than 50%. This is again consistent with the anti-vaccination population being able to attract more undecided individuals by offering many different types of cluster, each with its own type of negative narrative regarding vaccines.

Sixth, medium-sized anti-vaccination clusters grow most. Whereas larger anti-vaccination clusters take up the attention of the pro-vaccination population, these smaller clusters can expand without being noticed. This finding challenges a broader theoretical notion of population dynamics that claims that groups grow through preferential attachment (that is, a larger size attracts more recruits). Therefore, a different theory is needed that generalizes the notion of size-dependent growth to include heterogeneity (Fig. 3b).

Seventh, geography (Fig. 1b) is a favourable factor for the anti-vaccination population. Anti-vaccination clusters either self-locate within cities, states or countries, or remain global. Figure 1b shows a small sample of the connectivity between localized and global clusters. Any two local clusters (for example, two US states) are typically interconnected through an ether of global clusters and so feel part of both a local and global campaign.

The complex cluster dynamics between undecided, anti-vaccination and pro-vaccination individuals (Figs. 1, 2) mean that traditional mass-action modelling (ref. 19) cannot be used reliably for predictions or policies. Mass-action models suggest that given the large pro-vaccination majority (Fig. 1d), the anti-vaccination clusters should shrink relative to pro-vaccination clusters under attrition, which is the opposite of what happened in 2019. Figure 3a shows the importance of these missing cluster dynamics using a simple computer simulation with mass-action interactions only between clusters, not populations. The simulation reproduces the increase in anti-vaccination support in 2019, and predicts that anti-vaccination views will dominate in approximately 10 years (Fig. 3a). These findings suggest a new theoretical framework to describe this ecology, and inform new policies that allow pro-vaccination entities, or the platform itself, to choose their preferred scale at which to intervene.

If the preferred intervention scale is at the scale of individual clusters (Fig. 3b), then Fig. 1a can identify and target the most central and potentially influential anti-vaccination clusters. Our clustering theory (see Supplementary Information) predicts that the growth rate of an influential anti-vaccination cluster can be reduced, and the onset time for future anti-vaccination (or connected undecided) clusters delayed, by increasing the heterogeneity within the cluster. This reduces parameter F of our theory, which captures the similarity of pairs among the N individuals engaged in a particular narrative. The anti-vaccination (or connected undecided) cluster size C(t) is reduced to C(t) = N(1 − W([−2Ft/N] exp[−2Ft/N])/[−2Ft/N]), where W is the Lambert function (ref. 20), and the delayed onset time for a future nascent anti-vaccination (or connected undecided) cluster is t_onset = N/(2F). If instead the preferred intervention scale is at the scale of network patches (single or connected; Fig. 3b), our theoretical framework predicts that the pro-vaccination population (B) can beat the anti-vaccination population or persuade the undecided population (X) within a given network patch S over time T by using Fig. 1a to identify and then proactively engage with the other clusters in S, irrespective of whether they are linked or not:

$$T=\frac{1}{{x}_{{\rm{c}}}(2-{d}_{{\rm{B}}}-{d}_{{\rm{X}}})}\left[4X+(B-X)\mathrm{ln}\,\frac{X(B-X+{x}_{{\rm{c}}})}{{x}_{{\rm{c}}}B}\right]$$
(1)
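The closed-form expressions above, C(t), the onset time N/(2F) and equation (1), can be evaluated numerically. The sketch below is illustrative only and is not the authors' code. It assumes that W is taken on its principal branch, which gives C(t) = 0 before the onset time and a growing cluster afterwards. The function names and all numerical values are hypothetical, and the symbols x_c, d_B and d_X are used exactly as they appear in equation (1) without further interpretation.

```python
import math


def lambert_w0(x, iterations=200):
    """Principal-branch Lambert W for -1/e <= x <= 0, by bisection.

    Solves w * exp(w) = x for w in [-1, 0], where w * exp(w) is increasing.
    """
    if x > 0 or x < -math.exp(-1) - 1e-12:
        raise ValueError("this sketch only handles -1/e <= x <= 0")
    lo, hi = -1.0, 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def onset_time(N, F):
    """Onset time N / (2F) of a nascent cluster."""
    return N / (2.0 * F)


def cluster_size(t, N, F):
    """C(t) = N * (1 - W(z * exp(z)) / z) with z = -2 F t / N."""
    if t <= 0:
        return 0.0
    z = -2.0 * F * t / N
    return N * (1.0 - lambert_w0(z * math.exp(z)) / z)


def patch_time(B, X, x_c, d_B, d_X):
    """Evaluate T from equation (1) as written."""
    rate = x_c * (2.0 - d_B - d_X)
    log_arg = X * (B - X + x_c) / (x_c * B)
    if rate == 0 or log_arg <= 0:
        raise ValueError("parameters outside the range where T is defined")
    return (4.0 * X + (B - X) * math.log(log_arg)) / rate


# Hypothetical values chosen only to exercise the formulas.
N = 1000
before_onset = cluster_size(250, N, F=1.0)    # t < N/(2F): no cluster yet
after_onset = cluster_size(1000, N, F=1.0)    # t > N/(2F): cluster has formed
delayed = onset_time(N, F=0.5)                # halving F doubles the onset time
T_example = patch_time(B=100, X=50, x_c=10, d_B=0.5, d_X=0.5)
```

With these illustrative numbers, halving F from 1.0 to 0.5 moves the onset time from 500 to 1,000 time units, which is the delaying effect that increased heterogeneity is predicted to have.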