Asymmetric participation of defenders and critics of vaccines to debates on French-speaking Twitter

For more than a decade, doubt about vaccines has become an increasingly important global issue. Polarization of opinions on this matter, especially through social media, has been repeatedly observed, but details about the balance of forces are left unclear. In this paper, we analyse the flow of information on vaccines on the French-speaking realm of Twitter between 2016 and 2017. Two major asymmetries appear. Rather than opposing themselves on each vaccine, defenders and critics focus on different vaccines and vaccine-related topics. Pro-vaccine accounts focus on hopes for new groundbreaking vaccines and on ongoing outbreaks of vaccine-preventable illnesses. Vaccine critics concentrate their posts on a limited number of “controversial” vaccines and adjuvants. Furthermore, vaccine-critical accounts display greater craft and energy, using a wider variety of sources, and a more coordinated set of hashtags. This double asymmetry can have serious consequences. Despite the presence of a large number of pro-vaccine accounts, some arguments raised by efficiently organized and very active vaccine-critical activists are left unanswered.


Data
Using a combination of the Twitter streaming and search APIs, we collected all tweets pertaining to vaccination published in French between March 28th, 2016 and May 23rd, 2017 (258,166 tweets posted by 107,923 unique users; see Supplementary Material for the full list of keywords used in the data collection). In this paper, we focus on the 58,559 tweets from 31,088 unique users dealing with specific vaccines (existing or hotly awaited), i.e. containing the name of a commercial vaccine or the name of a vaccine-preventable disease and variants of the term "vaccine". We therefore excluded tweets about vaccination in general. We listed 37 vaccines or substances contained in vaccines based on the commercial vaccines available in France, Belgium, Canada and Switzerland, and on our knowledge of existing controversies in these countries (see Table 1 for a full description of the 111 keywords used). These 37 topics fall into 5 categories: seasonal flu, mandatory and recommended for the general population in developed countries, adjuvants and additives, hotly awaited, and other vaccines (for rarer diseases, developing countries or specific subgroups). In addition to being necessary to answer our research questions, focusing on this more precise set of keywords limited the risk of including irrelevant tweets. More details on the data collection and cleaning are reported in the Supplementary Material File. The keywords are presented in Fig. 1.
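The selection rule described above can be sketched as follows. This is only an illustration under stated assumptions: the keyword sets are hypothetical placeholders (the actual 111 keywords are those of Table 1), and we read the rule as "a commercial vaccine name, or a disease name together with a variant of the term vaccine".

```python
import re
import unicodedata

# Hypothetical keyword lists, for illustration only (the real list is in Table 1).
VACCINE_NAMES = {"gardasil", "cervarix"}
DISEASES = {"rougeole", "grippe", "hepatite b"}
# Matches vaccin, vaccins, vaccination, vacciner, ...
VACCINE_STEM = re.compile(r"\bvaccin")


def normalize(text):
    """Lowercase and strip accents so that 'Hépatite' matches 'hepatite'."""
    decomposed = unicodedata.normalize("NFD", text.lower())
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def is_specific_vaccine_tweet(text):
    """Keep a tweet naming a commercial vaccine, or a disease plus a 'vaccin*' term."""
    t = normalize(text)
    if any(name in t for name in VACCINE_NAMES):
        return True
    return bool(VACCINE_STEM.search(t)) and any(d in t for d in DISEASES)
```

Under this reading, a tweet about vaccination in general, or about a disease without any mention of vaccines, is discarded.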

Results
Between March 2016 and May 2017, the most discussed vaccine on French-speaking Twitter was the seasonal flu vaccine, as can be seen in Fig. 1. Considerable attention was also devoted to "awaited" vaccines, such as those against AIDS and Ebola. Among our five groups of topics, "Recommended and Mandatory vaccines" had the most content. Adjuvants and additives were less debated than our other groups of topics even though they have been at the center of heated debate in France for the past ten years.
To test our hypothesis, we will first present our analysis of our whole sample of tweets and then focus on each vaccine-related topic taken separately. Finally, we will focus on the respective reaches and patterns of activity of vaccine critics and defenders.
www.nature.com/scientificreports

Vaccine critics and defenders tend to focus on different vaccines.
The circulation of information on all vaccines. We partitioned the retweet network using the Louvain algorithm, finding that a structure with 3 major communities is the partition maximizing modularity. As we can see from Fig. 2, the retrieved communities show some similarity with the users' categories. One community (light blue) contains the most important "ANTI" users; another one (rose) several important "PRO" users. The third one (grey) contains users mostly related to media accounts and to the NGO world. Notice however that while the community classification almost perfectly maps the "ANTI" users, several "PRO" accounts, related to institutions or health-related media, are classified in the last two groups.
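To make the quantity maximized by the Louvain algorithm concrete, the sketch below computes the modularity of a given partition. It is a simplified illustration, not the pipeline used here: it assumes an undirected weighted graph without self-loops (the retweet network itself is directed), and the function name and input format are our own.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity of a partition of an undirected weighted graph.

    edges: iterable of (u, v, weight) with u != v; community: dict node -> label.
    """
    degree = defaultdict(float)    # weighted degree of each node
    internal = defaultdict(float)  # total edge weight inside each community
    total = 0.0                    # total edge weight m
    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        total += w
        if community[u] == community[v]:
            internal[community[u]] += w
    comm_degree = defaultdict(float)
    for node, k in degree.items():
        comm_degree[community[node]] += k
    # Q = sum over communities of L_c / m - (d_c / 2m)^2
    return sum(
        internal[c] / total - (comm_degree[c] / (2 * total)) ** 2
        for c in comm_degree
    )
```

For two triangles joined by a single edge, splitting the graph into the two triangles gives a higher modularity than keeping all nodes in one community, which scores zero.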
The role of defenders and critics in the flow of information on each vaccine. To get a more detailed view of the retweet network, we constructed a multiplex network, associating a layer to each topic. In some cases, we further decomposed the layer structure at the topic level. Some examples of the layers are shown in Fig. 3.
The vast majority of topics are broached mostly by one side only. The flow of information on adjuvants and additives is for instance dominated by vaccine critics, while the flows on flu and measles are dominated by pro-vaccine users. In rare cases, critics and defenders seem to discuss exactly the same topic, like the human papillomavirus (HPV). This opposition is however not exactly symmetrical: HPV as a vaccine-preventable disease is mainly discussed by pro-vaccine users, but the actual HPV vaccines (Gardasil™ and Cervarix™) are a topic almost exclusively broached by vaccine-critical users.
A final step to define the topic preference in the "PRO/ANTI" classes is to go back from the topic retweet graphs. For each topic $\alpha$, we calculated the number of tweets citing the topic for each user in the extended activist lists, $A_{ext}$ and $P_{ext}$. Based on this value, we reconstructed the ranking of the topics in each class ($r_{PRO}(\alpha)$, $r_{ANTI}(\alpha)$), and its relative ranking: a strongly positive (resp. negative) relative rank indicates a mostly "PRO" (resp. "ANTI") topic, while a small value of this measure indicates a neutral topic, broached almost equally by the two camps. The relative ranking of the most cited topics is reported in the lower plot of Fig. 4. The favourite topics of the "PRO" users are yellow fever and other tropical diseases, the diseases associated with mandatory vaccines, and the seasonal flu. The "ANTI"-vaccine class is strongly focused on the limited number of vaccines and substances that have become very controversial, at least in France, such as adjuvants and the hepatitis B vaccine (additional measures of polarization for each topic are reported in the Supplementary Information File).
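The ranking step can be illustrated with a short sketch. The per-class ranks follow the description above (rank 1 is the most cited topic). The expression used for the relative rank, however, is an assumption made for illustration only: we take a normalized rank difference that is positive when a topic ranks higher among "PRO" users, and we give a topic never cited by a class a rank just below that class's last ranked topic. The function names are hypothetical.

```python
from collections import Counter


def topic_ranks(tweet_topics):
    """Rank topics by number of citing tweets (rank 1 = most cited).

    tweet_topics: list of topic labels, one entry per citing tweet.
    """
    counts = Counter(tweet_topics)
    # Ties are broken alphabetically to keep the ranking deterministic.
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return {topic: i + 1 for i, topic in enumerate(ordered)}


def relative_rank(r_pro, r_anti, topic):
    """ASSUMED form: normalized rank difference, positive for a mostly PRO topic."""
    unranked = max(len(r_pro), len(r_anti)) + 1  # rank given to uncited topics
    p = r_pro.get(topic, unranked)
    a = r_anti.get(topic, unranked)
    return (a - p) / (a + p)
```

With this convention, a topic ranked first by defenders and third by critics gets a positive value, and a topic cited only by critics gets a negative one.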
Vaccine defenders reach a wider audience but vaccine critics use Twitter more effectively. We will now analyze the potential of pro-vaccine and vaccine-critical accounts to reach a large public and to influence opinion.
The reach of vaccine defenders and critics. In Fig. 5, we represent the size of the k-shell audience and of the k-shell sources for the two groups of activists. Since the original groups were not of equal size, we divided the size of each k-shell audience (and of each set of sources) by its initial value in order to compare the growth mechanisms. The k-shell audience represents all the users that exclusively shared the information from one of the two groups. Similarly, the k-shell sources are all the users who were cited by only one of the two groups. Once normalized by the initial size, these measures represent respectively the average capacity of each user in a group to be retweeted (for the audience) and its average retweeting activity (for the sources). We can observe that, for the "PRO" community, the audience size stabilizes at a higher relative value. Ceteris paribus, the "PRO" users' posts are retweeted by a larger number of accounts. Iterating the shell definition procedure until all users are included, we observe that the audiences of the "PRO" and "ANTI" groups together cover 32% of the retweet network nodes. However, pro-vaccine activists are capable of reaching 24% of the users, while vaccine critics only reach 8%. We can observe that in the remaining pool of users (68% who do not exclusively retweet one or the other), only 0.6% retweet both "pro" and "anti" users. The largest part of these users therefore only retweet contents produced by news media accounts.
In order to better understand this result, and to assess the importance of each type of actor in the spread of information, we used some centrality indicators. First we defined the users posting a large number of retweets (users with a large out-degree, $k_{out}$) as opinion amplifiers. As we can see in the left plot of Fig. 6, a small number of vaccine critics present an unusually high retweeting activity and can be considered as super-amplifiers. But, at lower activity levels, the strongest amplifiers are vaccine defenders. As explained in the methods section, we assessed the level of influence of a user using her h-index. The biggest influencers in the vaccine debate are clearly the newsmedia, as we can observe in the right plot of Fig. 6. However, while most influential media are only authoritative for "PRO" users, we observe that a minority are retweeted by defenders and critics alike.
Critics are more active and use hashtags in a potentially more efficient way. Vaccine critics have a more limited reach than vaccine defenders, but it is important to note that the difference between the two (24% vs 8%) is not that substantial given the large scientific consensus around vaccination and the resources at the disposal of some pro-vaccine actors (Ministries of Health, public agencies, scientific societies, pharmaceutical companies…). To understand this result, let us go back to our measures of the activity of these users. "Anti" users have on average a significantly higher tweeting activity, and make reference to a greater number of sources. "Anti" users tend to make the effort of spreading the messages they agree with, but are nevertheless less influential than the "Pros".
The hashtags most frequently used by pro-vaccine and vaccine-critical accounts, presented in the right plots of Fig. 4, suggest another difference in these two camps' practices. Defenders tend to disperse their use of hashtags over a greater number of them, to refer to themes that are too general to make their tweets stand out in the mass of contents posted on the subject (such as #cancer, #pharmaciens, #Angola), to refer to institutions and media (for instance: #afp, #CNRS) and to focus on hashtags produced by institutions to mobilise around an official campaign (#lagrippejedisnon, #pourchaqueenfant, for example). The latter use of hashtags is efficient for mobilizing a community of sympathisers but does not correspond to the keywords parents are likely to put in the search bar when they look for discussions on the actual vaccines they must make a decision on (see methods for a discussion of hashtag use). On the contrary, vaccine critics use both hashtags designed to mobilize their sympathizers (#vaxxed, #lettreprjoyeux…) and, more importantly, hashtags relating directly to specific vaccines or vaccine-specific substances (#aluminium, #hpv, #gardasil), making it likely that their tweets will be found when someone searches for information on a given vaccine.

Discussion
In this paper, we partially corroborated the hypothesis that vaccine critics and defenders tend to focus on different vaccines. We found that most childhood vaccines attracted a similar amount of attention from critics and defenders but that their respective productions tended to focus on different keywords. We also found that some issues are almost completely abandoned by vaccine defenders even though they can be at the core of contemporary vaccine hesitancy (adjuvants and additives for instance). We also showed that pro users are retweeted by a larger number of accounts but that critics tend to make better use of the specific functionalities of Twitter, even though this does not go so far as to give them a wider reach than vaccine defenders.
Our results contribute to current reflections on processes of polarization in the age of digital social media. The specific way social media are designed favours this process of polarization. By suggesting new contents based on the users' previous behaviour, they tend to create echo chambers where the information circulating is culturally homogeneous, thereby favouring partisan bias in the rare instances where users are put in contact with dissonant contents. This process is reinforced by producers of fake news who exploit social media's orientation towards virality to make money from the circulation of radically partisan contents 37. This polarization not only affects perceptions of politicians' actions. Scientific subjects are also caught up in these processes 19. Analysts have suggested that the current rise of vaccine hesitancy is at least partly due to a process of polarization 15,19,22,38. Few empirical studies have tested the level of polarization of discussions on social media. Schmidt et al. studied Facebook users who posted at least 10 posts on vaccines between 1st January 2010 and 31st May 2017 15. They found that a majority of pro- and anti-vaccine users only consume and produce information either in favour of or against vaccines, not both, indicating a high degree of polarization. Menczer and Hui also found a very high degree of segregation between pro- and anti-vaccine Twitter users 27. However, Lutkenhaus et al. did not find evidence of polarization in discussions of vaccination that took place during the second half of the year 2017 between Dutch-speaking Twitter users 26. The anti-vaccine community was largely connected to several pro-vaccine communities, and pro- and anti-vaccine users interacted regularly. Nevertheless, they also found that most of these interactions were conflictual (insults, mockery, criticism…). Our results extend those of Lutkenhaus et al. and suggest that contents produced by Europeans tend to be less polarized than those produced in North America. We found that vaccine critics and defenders composed two fairly cohesive communities on Twitter. But we also found that these two communities were significantly connected, at least indirectly, via their tendency to retweet the same mainstream newsmedia. The role of traditional newsmedia in polarization on social media is often overlooked even though they play a central role in political polarization 17.
According to Benkler, Faris and Roberts, the growing divide between right-wing and left-wing Americans is largely due to choices made by a number of media outlets (in connection with evolutions within the two main political parties) to use partisanship as a market strategy 39. This logic presides over the constitution of digital echo chambers, which are therefore only one of the many mechanisms through which these transformations affect the American public rather than the cause of polarization. France's political landscape is much more multipolar and the French media landscape has not followed the same transformations. A recent study found that there remained a strong core set of agenda-setting elite media who adhere to a philosophy of journalistic objectivity and act as gatekeepers against fake news and radical views 40. The fact that both defenders and critics can turn to these mainstream media can be explained by the diversity of contents they have produced on the subject of vaccination in recent years. French journalists covering health can be divided when it comes to the legitimacy of concerns regarding the safety of vaccines on some specific issues (aluminium-based adjuvants, the safety of the HPV vaccine). But this division is often within each media's newsroom rather than between media outlets 41. In addition to this, the outcomes of high-profile vaccine-related lawsuits, results of surveys showing the high levels of vaccine hesitancy in France, government decisions or public health officials' statements regarding mandatory vaccinations regularly feature in all media. Consequently, this diversity of contents means that defenders of vaccines can retweet a piece published in Le Monde debunking common "antivaccine myths" while vaccine critics can retweet an interview with the head of a collective of "victims" of aluminium-based adjuvants conducted a month earlier by another of Le Monde's journalists. Our results therefore highlight the need to integrate knowledge of the diversity in newsmedia coverage of vaccination to interpret the structure of discussions on social media.
Our main contribution is our finding that defenders and critics of vaccines on Twitter focus on different topics, and especially on different vaccines. Vaccine critics mainly focus on the alleged dangers of specific vaccines. In the French-speaking world, these are the vaccines against HPV or hepatitis B, or adjuvants such as aluminium. But the list would likely be different in other cultural areas and countries as vaccine-related controversies tend to be grounded in local contexts 42. Pro-vaccine accounts mostly focus on the dangers of low vaccination coverage and on hopes raised by future vaccines. This asymmetry raises serious concerns. Pro-vaccine accounts are numerous, and seem to attract a wider audience than vaccine-critical accounts, which could be reassuring. Yet, vaccine critics are very active and well coordinated, and some of their arguments are most of the time left unanswered by their opponents.
Our findings have implications for public health policy. Researchers working on vaccine hesitancy have argued that it is crucial to challenge vaccine critics' arguments on social media and not let the Internet be the realm of antivaccinationism 4. How this should be done is currently the object of a heated debate. Some have warned against adversarial approaches and public stigmatization of vaccine critics, which might make them appear as victims of persecution, suggest vaccination is a scientifically contested topic and increase polarization of attitudes 5,34,43,44. Recent studies also suggest that debunking strategies can have counter-productive effects 4,5,45. In our study, we found that vaccine defenders talk less than critics about the more controversial vaccines or aspects of vaccination. Some of the critics' main arguments remain consistently unaddressed. It could mean that the strategy chosen by medical experts and other pro-vaccine actors consists in emphasizing the importance of the principle of vaccination for public health. This strategy has the advantage of not directly mentioning the objects of concern, a mention which has been found to decrease vaccination intentions, and of emphasizing the importance of herd immunity, which tends to alleviate doubts 4,5,46. Nevertheless, in a context where an increasing number of vaccine critics present themselves as "not antivaxxers" and manage to convince both the public and journalists that they are different from traditional antivaccinationists 28,41, it is doubtful that this approach will prompt a dismissal of their claims by the public. We believe that, even though more research is necessary to discover the best ways to debunk unfounded claims, defenders of vaccines should not wait for the discovery of a magic bullet before addressing these claims on social media, provided they follow simple ethical rules such as treating vaccine critics and hesitant parents with respect 34,35,43.
That said, one of our results raises a new type of dilemma for vaccine communication. We found a strong presence of defenders in contents relative to vaccination against papillomaviruses. However, we also found that much of this content centred on the brand names of these vaccines (Gardasil™ and, more marginally, Cervarix™) and that vaccine critics completely dominated these contents. Should public doctors, experts and public authorities publicly defend a commercial product? In most developed countries, including France, public authorities are seen as being too close to pharmaceutical companies, which contributes to a lack of trust in vaccines 1,3,47-49. On the one hand, defending a specific commercial vaccine can reinforce the impression that financial interests bear on vaccination policies and market authorizations. But on the other hand, it is necessary to give reassurances that market authorization processes are effective in assessing and monitoring the safety of vaccines.
Finally, our study focused on an important period for debates over vaccines in France: the year preceding the announcement, in June 2017, that the number of mandatory vaccines would be extended from 3 to 11. Since then, the new mandate framework has been put in place (in January 2018), the French government has launched an important communication campaign targeting the public and healthcare professionals, and media coverage of vaccine safety seems to have abated. Further investigation should focus on whether and how these changes have affected the structure of the flow of information on Twitter described in this paper. Because, in some cases, resorting to constraint can backfire by stimulating the constitution of organised anti-vaccine movements or a more general lack of trust in authorities 7,10,32-35, it is crucial to better understand the effect of these measures on public discussions and attitudes to vaccination.

Limitations
The main limitation of our analysis relates to the generalisability of our results. We focused on tweets written in French. Our results likely reflect the specificity of vaccine debates and vaccine hesitancy in France. This can be seen in the volume of discussions on aluminium-based adjuvants. The use of aluminium in vaccines has been at the core of most debates around vaccination in France since 2010 while it has not emerged as an object of major concern outside the French-speaking world 29. Conversely, the dominance of vaccine defenders in discussions around the MMR vaccine could reflect the fact that this vaccine has not been the object of strong critical mobilisations in France, contrary to countries such as the United States of America and Great Britain 9,48. The idiosyncratic nature of vaccine hesitancy and of activists' mobilisations on the subject of vaccination is likely to affect two parameters: a) which vaccines will attract most debate, and b) the overall balance of power between positive and negative discourses. Another limitation comes from our focus on Twitter. Because each social medium has its specificities (practices and publics), it is possible that the types of contents and the flow of information differ radically on Facebook or Instagram for instance. However, we judge this to be unlikely as our results are consistent with the data available on vaccine hesitancy and vaccine-related controversies in France 29 and with data pertaining to discussions of vaccines on Twitter in other European countries 26. Further research comparing the structure of the flow of information on vaccines on each social medium and in different countries would contribute to current reflections on the best ways to curtail the spread of misinformation on the Internet.

Methods
Retweet networks. We mapped the information flows between users by following their retweeting activity: namely, we constructed a directed graph where the nodes are the users and a directed edge $(u_i, u_j)$ is created between two nodes if user $i$ retweets user $j$. The retweet graph represents the circulation of information among users. A retweet most often means that a user endorses the idea expressed by the user she retweets. In this sense the retweet network can also be interpreted as an opinion similarity space. We chose not to study "mentions". It has indeed been shown that, in the case of very polarized debates, like the US elections, polarization in terms of community structure is not observed in the mention graph 50. This comes from the fact that, in such contexts, mentions are often used to cite one's opponent. The giant component of the retweet network consists of 16,302 nodes, connected by 20,648 weighted directed edges, the weight of an edge being the number of retweets between the two users. The full graph, including isolated nodes, is composed of 20,121 nodes and 23,348 edges.
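As an illustration of this construction, the sketch below builds the weighted directed edge list from (retweeter, author) pairs and extracts the largest connected component. Two points are assumptions rather than details given in the text: self-retweets are dropped, and the giant component is taken to be the largest weakly connected component (edge direction ignored). Function names are our own.

```python
from collections import Counter, defaultdict, deque


def build_retweet_graph(retweets):
    """retweets: iterable of (retweeter, original_author) pairs.

    Returns a Counter mapping each directed edge (i, j) to its weight,
    i.e. the number of times user i retweeted user j.
    """
    return Counter((i, j) for i, j in retweets if i != j)  # assumption: no self-loops


def giant_component(edge_weights):
    """Largest weakly connected component, found by breadth-first search."""
    neighbours = defaultdict(set)
    for i, j in edge_weights:
        neighbours[i].add(j)
        neighbours[j].add(i)
    seen, best = set(), set()
    for start in neighbours:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbours[node] - component:
                component.add(nxt)
                queue.append(nxt)
        seen |= component
        if len(component) > len(best):
            best = component
    return best
```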
Using the topical tags listed in Table 1, we associated to each link of the retweet network the list of its topics (extracted from the text of the retweet). In order to analyze how the information on each topic circulates, we associated a layer of a multiplex network to each topic.
Identifying vaccine critics and defenders. We manually annotated 360 of the most active and/or prominent users on vaccination, noting their position toward vaccines ("pro"/"anti"/"neutral") and whether they were accounts by newsmedia who covered the subject without expressing a personal point of view (coded as "media"). The manual annotation procedure allowed us to identify a list of 92 vaccine critics ($A_{ini}$), 146 vaccine defenders ($P_{ini}$), 86 media and 36 neutral users. From the initial sets of "Pro" and "Anti" users, we built the audience and the sources for the two groups, considering respectively the retweets pointing to each of the two groups and those originating from them, in the following way.
We first defined the 0-shell audience, for the two groups, as the initial sets: $A_{ANTI}^{(0)} = A_{ini}$ and $A_{PRO}^{(0)} = P_{ini}$. At each iteration we calculated the k-shell audience by adding to the $(k-1)$-shell audience the exclusive incoming neighbourhood of the previous shell, namely the users exclusively retweeting one group and not the other:

$$A_{ANTI}^{(k)} = A_{ANTI}^{(k-1)} \cup \Big( \bigcup_{i \in A_{ANTI}^{(k-1)}} V_i^{IN} \setminus \bigcup_{j \in A_{PRO}^{(k-1)}} V_j^{IN} \Big) \tag{1}$$

and symmetrically for $A_{PRO}^{(k)}$, where $V_i^{IN}$ represents the set of users retweeting user $i$. The sources of the groups, $S_{ANTI}^{(k)}$ and $S_{PRO}^{(k)}$, were identified in the same way using the outgoing links, i.e., users retweeted by the set. The algorithm converged (quickly here, given the limited diameter of the network) when no new nodes were present in the exclusive neighbourhood. We defined the global audience (and sources) as the k-shell audiences (and sources) at equilibrium.
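The shell construction described above can be sketched as an iterative set expansion. This is a minimal illustration, not the code used for the analysis: the input format and function name are hypothetical, and we additionally assume that a user already assigned to one group's audience is never added to the other's, so that the two audiences stay disjoint.

```python
def exclusive_audiences(retweeters_of, anti_seed, pro_seed):
    """Grow the ANTI and PRO audiences shell by shell until convergence.

    retweeters_of: dict user -> set of users who retweeted that user.
    Returns (anti_shells, pro_shells): lists of sets, index k = k-shell audience.
    """
    def incoming(group):
        found = set()
        for user in group:
            found |= retweeters_of.get(user, set())
        return found

    anti, pro = [set(anti_seed)], [set(pro_seed)]
    while True:
        in_anti, in_pro = incoming(anti[-1]), incoming(pro[-1])
        # Exclusive neighbourhoods: users retweeting one group and not the other.
        # Assumption: users already in either audience are not reassigned.
        new_anti = (in_anti - in_pro) - anti[-1] - pro[-1]
        new_pro = (in_pro - in_anti) - pro[-1] - anti[-1]
        if not new_anti and not new_pro:
            return anti, pro
        anti.append(anti[-1] | new_anti)
        pro.append(pro[-1] | new_pro)
```

The same function applied to a dictionary of outgoing links (user -> set of users she retweeted) would give the sources instead of the audiences.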
Since the number of manually coded activists is too low to perform statistical studies on their behavior (1.4% of the users in the retweet network), we used the k-shell audiences to extend the initial sets. We required the exclusive neighbourhood to be at least 90% of the new total neighbourhood, which bounds the overlap between the neighbourhoods of the two groups. For this reason, we stopped the procedure at the first iteration ($k = 1$) and we defined the new extended activist sets as $A_{ext} = A_{ANTI}^{(1)}$ and $P_{ext} = A_{PRO}^{(1)}$. With this procedure we reached an extended number of classified users covering 23.6% of the retweet network nodes: 1,224 in the $A_{ext}$ set and 2,699 in the $P_{ext}$ set.
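The stopping criterion can be illustrated as follows. The sketch assumes that the 90% threshold is evaluated jointly on the two groups, as the share of the combined incoming neighbourhood made up of users who retweet only one group; this reading and the function names are our own.

```python
def exclusive_fraction(in_anti, in_pro):
    """Share of the combined neighbourhood that retweets only one of the two groups."""
    total = in_anti | in_pro
    if not total:
        return 1.0  # nothing to add, hence no overlap
    return len(in_anti ^ in_pro) / len(total)


def can_extend(in_anti, in_pro, threshold=0.9):
    """True if the shell is exclusive enough to be used to extend the activist sets."""
    return exclusive_fraction(in_anti, in_pro) >= threshold
```

For example, two neighbourhoods of 10 and 11 users sharing a single user have an exclusive fraction of 19/20 and pass the threshold.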
Opinion amplifiers and influencers. In network science, several methods exist to identify the most central users, where the meaning of centrality is determined by the research question. We are interested in analyzing two main features: who are the most influential users, namely those who can be considered an authority in the debate; and who contributes the most to amplifying information, namely those who participate in producing a flow of information. For the second part, the simplest suitable centrality measure is the out-degree, $k_{out}$, of the nodes: this measure indicates, for a certain user, how many different users she retweeted.
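A minimal sketch of the amplifier measure, assuming the retweets are available as (retweeter, author) pairs (a hypothetical input format): repeated retweets of the same author count once, since $k_{out}$ is the number of different users retweeted.

```python
from collections import defaultdict


def out_degree(retweets):
    """retweets: iterable of (retweeter, original_author) pairs.

    Returns {user: k_out}, the number of distinct users each user retweeted.
    """
    retweeted = defaultdict(set)
    for retweeter, author in retweets:
        retweeted[retweeter].add(author)
    return {user: len(authors) for user, authors in retweeted.items()}
```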
Concerning the influencers, the situation is a bit more complex because we must take into account both the number of tweets the user produced and the quantity of retweets she received. We therefore decided to apply an adaptation of the h-index 53 to activity on Twitter: a user has an h-index of $k$ if she has $k$ tweets with at least $k$ retweets each.
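This definition translates directly into a few lines of code. The sketch below assumes that the retweet count of each of a user's tweets is available as a list; as with the bibliometric h-index, the largest $k$ satisfying the condition is returned.

```python
def twitter_h_index(retweet_counts):
    """Largest k such that the user has k tweets with at least k retweets each.

    retweet_counts: list with the number of retweets received by each tweet.
    """
    ranked = sorted(retweet_counts, reverse=True)
    h = 0
    for position, count in enumerate(ranked, start=1):
        if count >= position:
            h = position
        else:
            break
    return h
```

A user whose tweets received 10, 5, 3 and 1 retweets has an h-index of 3: three tweets have at least three retweets, but there are not four tweets with at least four.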
We also identified the main hashtags used by both communities. Previous studies 54 have emphasized the "central role of the hashtag in coordinating publics" 55. Choosing a hashtag both clear enough to be understood, and specific enough to avoid confusion, helps attract a relevant audience, by increasing the "searchability" of one's posts 56. Understanding the differentiation between the various actors' tagging strategies is thus key to identifying the audience they target, and to assessing the potential success they can expect.
Patient and public involvement. There was no involvement of patients or the public in this research.