Assortative mixing of opinions about COVID-19 vaccination in personal networks

Many countries worldwide had difficulties reaching a sufficiently high vaccination uptake during the COVID-19 pandemic. Given this context, we collected data from a panel of 30,000 individuals, who were representative of the population of Romania (a country in Eastern Europe with a low 42.6% vaccination rate), to determine whether people are more likely to be connected to peers displaying similar opinions about COVID-19 vaccination. We extracted 443 personal networks, amounting to 4,430 alters. We estimated multilevel logistic regression models with random ego-level intercepts to predict individual opinions about COVID-19 vaccination. Our evidence indicates that positive opinions about COVID-19 vaccination cluster. Namely, the likelihood of having a positive opinion about COVID-19 vaccination increases when peers have, on average, a more positive attitude than the rest of the nodes in the network (OR 1.31, p < 0.001). We also found that older individuals and individuals with higher education are more likely to hold a positive opinion about COVID-19 vaccination. With the given empirical data, our study cannot reveal whether this assortative mixing of opinions is due to social influence or social selection. It may nevertheless have implications for public health interventions, especially in countries that strive to reach higher uptake rates. Understanding opinions about vaccination can act as an early warning system for potential outbreaks, inform predictions about vaccination uptake, or help supply chain management for vaccine distribution.


Introduction
Vaccination has been the paramount pharmaceutical intervention to halt the coronavirus disease (COVID-19) pandemic 1. However, despite the actions taken by the European Commission to ensure timely access to vaccines for the Member States, various mass vaccination campaigns have not realized their potential, especially in Eastern European countries. Therefore, understanding the social mechanisms underpinning vaccination willingness is pivotal to fighting both the still ongoing COVID-19 pandemic and other future pandemics.
Vaccination acceptance and its associated determinants are multiplex 2. In the case of COVID-19, most of the literature has focused on individual-level predictors such as demographic characteristics (gender/sex, age, ethnicity/race, education, income, occupation), personal health history (medical conditions, personal experience with COVID-19), and beliefs (perceptions about the harms or efficiency of the vaccine). Significantly fewer studies have addressed supra-individual level factors such as healthcare and societal determinants 3. Scientists agree that vaccine acceptance is a complex decision-making process influenced by "experience, risk perception, culture, confidence in authorities and medicine" 4. However, as we move from one study to another, many of the reported empirical findings are mixed or unclear (especially concerning race, age group, gender, employment status, and education) 4. This inconsistency in the results may suggest that researchers have overlooked some predictors. Notably, the potential role of human networks in forming, reinforcing, or spreading opinions about COVID-19 vaccination has not been fully considered.
Disregarding network data is striking. Evidence shows that social networks affect health outcomes 5 (e.g., the spread of obesity, smoking, health screening, HPV vaccination uptake, happiness, depression, sleep, or loneliness). People do not live in isolation, and their behavior is not detached from the behavior of others. Health is a social network outcome. Individuals are interconnected, so their health is interconnected (health preferences, decisions, or habits). Research has already illustrated assortativity (connected individuals tend to share traits and behavior 6) as an essential property of human networks 7. For example, previous work has revealed the association between node characteristics (behavior) and network structure in the case of influenza vaccination 8, local and global COVID-19 spreading 9,10, sexually transmitted infections 11, alcohol consumption 12, and, generally, in epidemiologic studies 6. Additionally, theoretical demonstrations 13 claim that, generally, opinions about vaccination are not randomly distributed in human networks but clustered. Surprisingly, to our knowledge, no study in the literature has analyzed opinions about COVID-19 vaccination from a social network perspective. Therefore, our paper examines the role of human networks in understanding and predicting opinions about COVID-19 vaccination. Specifically, our main research objective is to assess whether assortativity positively contributes to predicting opinions about COVID-19 vaccination. In this direction, we regard personal networks (individuals, their direct social contacts, and the interconnections among them) as the immediate social contexts embedding the individuals 14.
We analyze the personal networks of 443 individuals (egos), their social contacts (alters), and the tie configurations embedding the alters and surrounding the egos (the information was collected between March 16 and March 30, 2022, in Romania). The networks all have the same number of alters (ten per network), amounting to 4,430 alters. We examine whether the opinions of the social contacts are clustered (by assortativity) or randomly distributed. The dataset comprises socio-demographic variables for egos and alters (individual attributes: sex, education, income, and age) and network variables (characteristics of the ego-alter ties and the alter-alter tie configurations). This unique collection of personal networks is also significant because it comes from a population (Romania) that has exhibited the second-lowest COVID-19 vaccination uptake among European Union (EU) countries. As of March 13, 2023, according to the European Centre for Disease Prevention and Control, Romania (an Eastern European EU country) reported 42.6% of the population with at least one dose and only 9.2% with the first booster. Our findings suggest that opinions about COVID-19 vaccination are clustered in personal networks. Specifically, we illustrate that accounting for information about social contacts (ego's alters) brings new insights and allows for predicting COVID-19 vaccination opinions.

Assortativity by COVID-19 vaccination opinions
We claim that current mainstream research can benefit from linking people to their surrounding social context 15. In this fashion, we aim to detect assortativity in the social organization of opinions about COVID-19 vaccination.
On the one hand, existing evidence 16 advocates the role of social contagion (influence) in the adoption of innovation. Adopting an idea, a vaccine, or a technology depends on the proportion of surrounding people who have already adopted it 17. Further, actors mutually influence and inform each other, increasing homogeneity within structural subgroups 18. Human networks can be viewed as conduits for the circulation of either intangible or tangible resources: from COVID-19 infections 9,19, opinions 20 and ideas 21,22 to goods and other objects 23,24. Thus, the pattern of networks is essential for understanding the flow of information 25. Acquiring a trait may be the result of interacting with only one source (e.g., SARS-CoV-2 spreading) or with multiple active sources (e.g., opinion formation) 17. Scholars argue that close friends and relatives are critical for complex contagion, whereas acquaintances are instrumental for the circulation of information over long social distances 26.
On the other hand, human networks are not fixed but are continually reshaped by their embedded members. Social interactions are governed by social selection. Individuals tend to prefer to interact with others who are similar along multiple socio-demographic dimensions 27,28 (homophily). Additionally, contextual influence (e.g., sharing the same environment: country, community, etc.) can be pivotal in acquiring specific traits 29.
According to the literature 30, social influence (contagion), social selection (homophily), and contextual influence (confounding) can lead to assortative mixing. Namely, people live most of their time in clusters of similar peers wherein opinions are formed and reinforced.
Assortativity is the key variable in our models. However, we limit ourselves to detecting the positive contribution of assortativity to predicting opinions about COVID-19 vaccination (outcome variable). (Disentangling the factors responsible for assortativity is beyond the scope of our study.) We observe the behavior of our key variable by controlling for some other potentially relevant aspects of our real-world networks. We account for network composition (actors' features such as age, education, sex, and income). Also, we employ betweenness centrality to measure the importance of a given actor for the flow of information between pairs of nodes 31. This property identifies the nodes that control the circulation of opinions about vaccination in a network.
Next, the network density (the share of possible ties that are observed) gives information about the speed of opinion circulation. We expect high-density networks to exert higher social control over their members and, consequently, to reinforce a specific opinion. Conversely, low density scores lead to the existence of brokerage positions (people connecting social circles (peers) that otherwise remain disconnected). Then, the number of components (parts of the network that are completely disconnected from one another) indicates divisions in the structure and potential lines of cleavage (e.g., pro- and anti-vaccination sub-groups). Last, network centralization (the tendency of a single node to be more central than all the other nodes) shows whether positional advantages are unequally distributed in the personal networks. This structural analysis of the local neighborhoods shows how strategic network positions can be related to vaccination opinions (how the social texture affects individuals' opinions).
When predicting alters' opinions on vaccination, we employ multilevel analysis with random ego-level ("group-level") intercepts to account for two properties of our data. First, alters connected to the same ego are likely to share unobserved variables (e.g., political attitudes, trust in institutions) which may influence their vaccination opinions. Second, all alters' variables are reported by their respective egos, and we could suspect that different egos have a different understanding of what constitutes a positive or negative opinion. By allowing the expected ratio of pro-vaccination alters to vary randomly by ego, our analysis essentially seeks to predict the difference of opinion among alters in the same personal network.
Looking at the 4,430 alters in our dataset, their median age is similar to that of the egos (Mdn = 40.0; R = 72.0), which may express an age selection effect. Given the high number of pro-vaccine alters in each network (M = 6.5, SD = 3.0, Mdn = 7, R = 10), the number of ties to peers who are in favor of COVID-19 vaccination (M = 2.9, SD = 2.2, Mdn = 3.0, R = 9.0) is higher than the number of ties to peers who are against it (M = 1.0, SD = 1.5, Mdn = 0.0, R = 9.0). Further, alters have, on average, 4.2 ties (SD = 2.5, Mdn = 4.0, R = 9.0) and display low betweenness scores (M = 2.1, SD = 4.8, Mdn = 0.0, R = 34.0). We emphasize that in each personal network, the maximum number of alter-alter ties per alter is nine (all personal networks have ten alters). Detailed descriptive statistics about the variables of interest are available in Tables 1-2. We also compute the distribution of our variables of interest by the egos as a grouping variable; see Fig. 1 (descriptive statistics on each personal network are available in the Supplementary Material).
We report in Table 3 the results of the multilevel logistic regression models fitted to predict alters' opinion about COVID-19 vaccination and, specifically, to detect evidence of possible assortativity effects in personal networks. We assess the robustness of our results (Table 3) by fitting standard logistic regression models, without any multilevel structure (Table 4). In these models, the additional variable prop vacc ex alter, i.e., the proportion of alters that are pro-vaccination (excluding the alter of reference), controls for the average opinion of all other alters in the network. Interestingly, the two families of models (the multilevel logistic and standard logistic regression models) qualitatively yield the same results for almost all effects. The only qualitative difference in the standard logistic regression models, compared to the multilevel models, is that in the "joint" model (M3, in Table 4), alters whose ego has a higher level of education are less likely to have a positive opinion (M1, OR: 0.79, 95% CI: 0.65-0.96, p = .021; M3, OR: 0.80, 95% CI: 0.65-0.98, p = .030). While this might seem strange at first glance, we have to take into account that all alter data is reported by ego. The negative effect of ego's education could reveal a social prejudice: higher-educated people more often think that their alters have a negative attitude toward vaccination. Thus, it might be a sign that egos with different levels of education have a different understanding of what constitutes a positive opinion. Similarly, in the multilevel logistic regression models (Table 3), ego's education also has a negative effect. However, this is not statistically significant (M1, OR: 0.75, 95% CI: 0.54, 1.05, p = .094; M3, OR: 0.75, 95% CI: 0.54, 1.06, p = .102).
In Table 4, the additional control variable prop vacc ex alter (proportion of alters that are pro-vaccination, excluding the alter of reference), which gives the average opinion among the other alters in the same network, has a strongly positive effect on the attitude of the alter of reference. While this was expected (in fact, anything else would be a surprise), it underlines the need to control for the average opinion in the network. Note that the multilevel models control for varying average opinion via the random ego-level intercepts.

Discussion
Our study suggests that people with similar opinions about COVID-19 vaccination tend to cluster together (or be partitioned) in personal networks. We find assortativity to have a positive, statistically significant effect (which directly addresses our research objective). The likelihood of having a positive opinion increases when peers (neighbors) have, on average, a more positive attitude than the rest of the nodes in the network. Further, we discover a positive association between the opinions held by egos (respondents) and alters (social contacts). Unfortunately, our data do not permit a detailed examination of the causes of this assortative mixing. Future research and longitudinal data are needed to distinguish between social selection, contagion, and confounding.
Our models also control for the attributes of the actors and their structural positions in the networks. We discover that alters who have higher education 32 and who are older 33 are more likely to hold a positive opinion. Additionally, educated respondents (egos) think more often than less educated ones that their social contacts (alters) have a negative opinion about vaccination. This may be indicative of social prejudices (perceptions about the virtues of vaccination associated with education). At the same time, the structural positions (node-level betweenness) and the organization of the relationships in personal networks (components, density, centralization) do not make significant contributions to predicting COVID-19 vaccination opinions.
There are several things our readers should note regarding the interpretation of our results. An inherent feature of any personal network research design is that egos (respondents) report information about alters (their social contacts). Potentially, this can create biases percolating through the models and interpretations: the false consensus effect 34 (the tendency to see one's own choices as relatively common and appropriate) and inaccurate reports 35. We included the duration of ego-alter ties (in years) in our models (75% of all ego-alter ties have a duration of at least eight years). Also, we asked our respondents to provide information about those people with whom they communicate most often. At the same time, we employed two different statistical procedures (multilevel and standard logistic regression models) that eventually yielded similar findings. These remedies should counteract the effects of these biases and improve the quality of the collected data. That is, we expect people who frequently interact over longer time intervals to also have more accurate data on their peers.
Longitudinal cohort (balanced panel) data can help control the magnitude of these biases in the future. Yet, it cannot completely filter them out.
In sum, our study claims that assortativity impacts COVID-19 vaccination opinions. Thus, we align with the stream of work that has already shown the role of assortativity in vaccination dynamics 6,36 and status 37 or disease spread 38. In the special case of COVID-19 vaccination, in our sample, people with positive opinions report having peers in their social proximity who hold similarly positive opinions. We suspect these results can be generalized to the whole population. In this direction, we build on the growing evidence affirming not only opinion clustering 39,40 or social selection in the adoption of health behavior 41, but also clustered vaccination adoption (childhood vaccination refusals 42,43, seasonal influenza vaccine uptake 44, the imitation of vaccination behavior 45). We hope public health experts and authorities will find our study a useful insight in their efforts to prepare mass vaccination campaigns, especially in countries with low vaccination rates.

Study design, size, and selection of participants
We performed a real-world cross-sectional study and employed a personal network research design 46 (penet). Using computer-assisted web interviewing, we collected questionnaire data from a random panel of 30,000 individuals (the panel was deemed representative of the Romanian population). Individuals were at least 18 years old when filling out the questionnaire and could speak the Romanian language (the questions were formulated in Romanian). We initiated the data collection process on March 16, 2022, and halted it on March 30, 2022 (we stopped due to the lack of new respondents). We sent invitations to participate in the study to all panel members (an invitation and two additional reminders). The actual data collection process was outsourced (yet, the research team outlined and created the content of the research design). Before enrolling in the study, each respondent was informed about the research objectives and was guaranteed anonymity and the possibility of opting out at any moment (even after having submitted the filled-out questionnaire). The resulting availability sample comprised valid answers (questionnaires) from 896 respondents (dubbed egos according to the penet terminology).
We followed the conventional penet practice and organized the questionnaire into five components. [1] First, we addressed questions referring to the egos (socio-demographic items and opinions about COVID-19 vaccination). Then, [2] we included a generator of alters (persons connected to the ego). Each ego was asked to name ten social contacts or alters (at most five relatives and five friends). Namely, the egos were supposed to nominate the people with whom they communicate most frequently. We restricted the number of alters to ten to limit respondent burden 47. Afterward, [3] we applied a name interpreter. We asked egos to report socio-demographic information about each of their nominated alters. Further, participants were asked about alters' opinions on COVID-19 vaccination. [4] We measured alter-alter ties: each ego was asked whether the alters knew each other and communicated in her/his absence. Lastly, [5] we measured ego-alter ties in terms of duration (For how many years have you known this alter?). Notably, we instructed the egos to use acronyms or nicknames when providing information about alters (social contacts). In this way, we avoided disclosing the identity of the alters and harming them in any way. Additionally, we coached the respondents to devise the acronyms in such a way that they could still respond to the alter-alter-tie questions.
Out of the 896 egos with valid responses, only 443 named the theoretical maximum number of ten alters. We kept this sub-sample of 443 egos for the statistical analysis and modeling. We decided to work with complete personal networks (consisting of ten alters) for at least two reasons. Methodologically, we wanted to ensure we had sufficient information to feed the statistical estimations. Substantively, we wanted to avoid any bias introduced by the number of alters. As shown elsewhere 17, complex contagion is a function of the number of individuals that act as active social reinforcement sources (in our case, either for or against COVID-19 vaccination). Figure 2 provides an example of the personal networks that resulted after the administration of the questionnaire. First, we underline the multilevel (hierarchical) organization of the data: alters (the first level) are grouped by respondents (the second level). Next, we stress the existence of within-network dependencies: various patterns display how alters are interconnected. Also, we highlight the attributes describing the nodes (both egos and alters) and the ties (both ego-alter and alter-alter relationships). The variables included in Fig. 2 are only a selection for expository purposes.
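The two-level organization of the data (alters nested in egos, plus alter-alter ties) and the restriction to complete networks can be sketched as a small data structure. This is a minimal Python illustration under our own assumptions, not the code used in the study; the names Alter, PersonalNetwork, and keep_complete are hypothetical, and only a few of the measured attributes are included.

```python
from dataclasses import dataclass, field


@dataclass
class Alter:
    nickname: str      # acronym or nickname supplied by the ego
    opinion: int       # 1 = positive, 0 = negative (as reported by the ego)
    tie_years: float   # duration of the ego-alter tie, in years


@dataclass
class PersonalNetwork:
    ego_id: str
    ego_opinion: int                                 # 1 = positive, 0 = negative
    alters: list = field(default_factory=list)       # first-level units
    alter_ties: set = field(default_factory=set)     # frozenset pairs of nicknames


def keep_complete(networks, size=10):
    """Keep only the networks in which the ego named exactly `size` alters."""
    return [net for net in networks if len(net.alters) == size]


# Toy data (hypothetical): one complete and one incomplete personal network.
full = PersonalNetwork("ego-1", 1, [Alter(f"a{k}", k % 2, 5.0) for k in range(10)])
full.alter_ties.add(frozenset({"a0", "a1"}))
partial = PersonalNetwork("ego-2", 0, [Alter("b0", 0, 2.0)])

complete = keep_complete([full, partial])
```

In this toy case only the first network survives the filter, mirroring the reduction from 896 valid responses to the 443 complete networks.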

Variables
We start the presentation of the variables used in our study with the actor-level variables, and then we continue with the network-level variables. We collected socio-demographic data referring to sex (0: males, 1: females), education (0: no higher education, 1: higher education), age (numerical, ≥ 18 years old), and income (0: less than minimum wage, 1: between minimum & median wage, 2: one minimum wage over the median wage, 3: more than one minimum wage over the median wage; the thresholds reflect Romanian national wages at the moment of data collection). We also measured actors' opinions about COVID-19 vaccination. Each study participant was required to answer the following question: What opinion do you have about COVID-19 vaccination? As stated in the presentation of the study design, we did not interview the alters (this is a common practice in personal network research studies). For this reason, we used a proxy to capture information about them. Namely, we asked egos to inform us about the socio-demographics (sex, age, education) and the opinions of each of their nominated alters about vaccination (What opinion does alter X have about COVID-19 vaccination?). The questions concerning COVID-19 vaccination had the following pre-defined answers: very bad, bad, good, and very good. Later, we re-coded the responses into binary variables: an ego (alter) has either a positive (good or very good) or a negative (bad or very bad) opinion vis-à-vis COVID-19 vaccination.
In terms of network-level measurements, first, we accounted for the composition of the personal networks. We computed the proportion of female alters, of alters with higher education, and of alters that have a positive opinion about COVID-19 vaccination, as well as the average age of the alters. These variables were derived from the actor-level variables (see the paragraph above). Then we summarized the properties of the ego-alter ties in each personal network: the mean duration (in years). Further, we computed node-level properties for all the nodes (both egos and alters), such as Freeman's betweenness centrality. We also examined the maximum scores of Freeman's betweenness centrality among the alters with a positive opinion about vaccination (that is, people in favor of vaccination who connect social circles that otherwise would remain disconnected).
Lastly, we looked at the alter-alter ties and described the overall structure of each personal network. We calculated the density (the ratio between the number of observed ties and the number of theoretically possible ties), the centralization (the 'all roads lead to Rome' effect), and the number of components (subparts of the personal network that remain disconnected in the absence of the ego).
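The re-coding of the four answer options into a binary opinion can be illustrated as follows. This is a minimal Python sketch; the function name recode_opinion and the handling of capitalization and whitespace are our assumptions, while the four answer labels come from the questionnaire description.

```python
# Answer options as listed in the questionnaire description.
POSITIVE_ANSWERS = {"good", "very good"}
NEGATIVE_ANSWERS = {"bad", "very bad"}


def recode_opinion(answer: str) -> int:
    """Map a four-point answer to 1 (positive) or 0 (negative)."""
    cleaned = answer.strip().lower()
    if cleaned in POSITIVE_ANSWERS:
        return 1
    if cleaned in NEGATIVE_ANSWERS:
        return 0
    # Anything else is treated as invalid rather than silently guessed.
    raise ValueError(f"unexpected answer: {answer!r}")


recoded = [recode_opinion(a) for a in ["very bad", "bad", "good", "very good"]]
```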
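To make these structural measures concrete, the sketch below computes density, the number of components, and a centralization score for one alter-alter network (the ego is excluded). It is a minimal Python illustration, not the code used in the study. The text does not state which centrality underlies the centralization score, so the sketch assumes Freeman's degree centralization; the function names and the toy star network are hypothetical.

```python
def density(n_alters, ties):
    """Observed ties divided by the number of possible undirected ties."""
    possible = n_alters * (n_alters - 1) // 2
    return len(ties) / possible


def count_components(n_alters, ties):
    """Number of connected components among the alters (ego excluded)."""
    neighbors = {k: set() for k in range(n_alters)}
    for a, b in ties:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = set()
    components = 0
    for start in range(n_alters):
        if start in seen:
            continue
        components += 1          # a new, not yet visited component
        seen.add(start)
        stack = [start]
        while stack:             # depth-first traversal of that component
            node = stack.pop()
            for nb in neighbors[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
    return components


def degree_centralization(n_alters, ties):
    """Freeman degree centralization (assumed variant): 1.0 for a star."""
    degree = [0] * n_alters
    for a, b in ties:
        degree[a] += 1
        degree[b] += 1
    top = max(degree)
    return sum(top - d for d in degree) / ((n_alters - 1) * (n_alters - 2))


# Toy network (hypothetical): alter 0 is tied to all nine other alters.
star = {(0, k) for k in range(1, 10)}
star_density = density(10, star)
star_components = count_components(10, star)
star_centralization = degree_centralization(10, star)
```

The star network has low density (9 of 45 possible ties), a single component, and maximal centralization, which is the 'all roads lead to Rome' configuration.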

The 'assortativity' variable
Overall, for analysis, we used data on $N = 443$ egos and $M = 4{,}430$ alters. For alter $j$, let $e(j)$ denote the ego of $j$. Note that there is no overlap among the alters of different egos, so that each alter is assigned to exactly one ego. For each ego $i$, the binary variable $y^{\mathrm{ego}}_i$ indicates vaccination opinion (1 for positive opinion; 0 for negative opinion). Likewise, for each alter $j$, the binary variable $y_j$ indicates vaccination opinion (1 for positive opinion; 0 for negative opinion). In our analysis, we estimated models explaining alters' opinions and egos' opinions (dependent variables). Given the objectives of our study, we report here only the models predicting alters' opinions $y_j$. However, the models predicting egos' opinions $y^{\mathrm{ego}}_i$ are available in the Supplementary Material. Interested readers may consult these ego models there for further insights to contextualize the results reported in the body of the paper.
For ego $i$, $i = 1, \dots, N$, we have a vector of covariates $x_i$, comprising the following covariates: Attribute-level: … network centralization ('centraliz'). Of particular relevance (and requiring additional explanation) is the alter covariate 'assortativity', which tests whether an alter's opinion is likely to be influenced by the opinions of those alters to which they are connected. Quantitatively, the assortativity variable indicates whether those alters connected to alter $j$ have, on average, a higher or a lower vaccination opinion than all of $e(j)$'s alters different from $j$. More precisely, for alter $j$, let $A_j$ be the set of alters of ego $e(j)$ different from $j$. (Note that in our data, these are always exactly nine alters since the size of all personal networks is ten.) Let $B_j \subseteq A_j$ be the neighbors of alter $j$, that is, those other alters of ego $e(j)$ who are connected to $j$ by an alter-alter tie. Then, for an alter $j$ with $B_j \neq \emptyset$ (that is, excluding the isolated alters), we define the assortativity variable to be the difference between the average opinion of $j$'s neighbors and the average opinion among all of $e(j)$'s alters different from $j$. In formulas,

$$\mathrm{assortativity}_j = \frac{1}{|B_j|} \sum_{k \in B_j} y_k - \frac{1}{|A_j|} \sum_{k \in A_j} y_k .$$

For alters with no neighbors (that is, isolated alters with $B_j = \emptyset$) we consider this variable as undefined and drop the respective alter from the analysis.
To provide an example, assume that among the nine alters of ego $e(j)$ that are different from $j$, six have a positive opinion. Further, assume that alter $j$ has alter-alter ties to four alters, among which two have a positive opinion. Then $j$'s assortativity variable would equal $2/4 - 6/9 = -1/6 \approx -0.17$. Indeed, $j$'s neighbors have a below-average opinion for this network, justifying the negative value.
We emphasize that the definition of the assortativity variable for alter $j$ does not take into account $j$'s own vaccination opinion, to avoid circular dependency in the data.
The assortativity variable tells us whether $j$'s neighbors are more or less positive about vaccination than all the other alters of ego $e(j)$. The normalization obtained by subtracting the average opinion over the other alters is necessary since, without this normalization, the effect of this variable would be confounded by the overall ratio of positive opinion in $j$'s network.
In a network with a high ratio of positive opinion, we would expect by chance alone that the average opinion among the neighbors of every alter leans toward the positive side, and in addition, most alters in this network are themselves expected to have a positive opinion. Thus, without normalization, we would expect a positive correlation (taken over all the alters in our data) between $j$'s opinion and the average opinion over $j$'s neighbors, even if there is no assortativity of opinion present. (As a simple exercise, we estimated models with the average opinion of $j$'s neighbors as a covariate and found a strong positive effect, which we consider uninformative since we cannot tell whether that effect indicates assortativity or just a correlation of opinion caused by varying positive ratios over the egos.) Through the normalization, the assortativity variable captures whether $j$'s neighbors are more or less positive than the rest of the alters in the same network.
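The definition of the assortativity variable can be checked against the worked example given earlier in this section (six of the nine other alters are positive; two of the focal alter's four neighbors are positive). The following is a minimal Python sketch, not the code used in the study; the function name, the dictionary-based input format, and the toy network are our assumptions.

```python
def assortativity(focal, opinions, ties):
    """Mean opinion of the focal alter's neighbors minus the mean opinion
    of all other alters of the same ego. Returns None for isolated alters.

    opinions: dict mapping alter id -> 0/1 for all alters of one ego
    ties:     iterable of (a, b) pairs of alter ids (undirected)
    """
    others = [k for k in opinions if k != focal]
    neighbors = {b if a == focal else a for a, b in ties if focal in (a, b)}
    neighbors.discard(focal)  # ignore accidental self-ties
    if not neighbors:
        return None           # undefined: the alter is dropped from the analysis
    mean_neighbors = sum(opinions[k] for k in neighbors) / len(neighbors)
    mean_others = sum(opinions[k] for k in others) / len(others)
    return mean_neighbors - mean_others


# Toy network (hypothetical): alters 1-6 are positive, alters 7-9 negative.
opinions = {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 0, 8: 0, 9: 0}
# Alter 0 is tied to two positive (1, 2) and two negative (7, 8) alters.
ties = [(0, 1), (0, 2), (0, 7), (0, 8), (3, 4)]

value = assortativity(0, opinions, ties)          # 2/4 - 6/9
isolated = assortativity(9, opinions, ties)       # alter 9 has no ties

# The focal alter's own opinion does not enter the computation.
flipped = dict(opinions)
flipped[0] = 0
value_flipped = assortativity(0, flipped, ties)
```

Because the focal alter is excluded from both averages, flipping its own opinion leaves the value unchanged, which is the circularity safeguard described in the text.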

Statistical models
We estimated models explaining alters' opinion about COVID-19 vaccination, $y_j$, via multilevel logistic regression with random ego-level intercepts 48. The multilevel approach is necessary to account for two characteristics of our data. First, alters are clustered within egos (see the study design previously presented). Therefore, alters of the same ego might have similar values in unobserved covariates, e.g., political attitude, trust in institutions, etc. Second, alters' opinions are always reported by ego, and it is questionable whether all egos have the same understanding of "positive" or "negative" opinion. In fact, with the multilevel approach, we do not attempt to explain the absolute level of opinion of the alters but rather whether alters have a more or less positive opinion compared to the other alters in the same network. An alternative approach using fixed (rather than random) ego-level intercepts is not feasible since there are egos in whose networks no alter or all alters have a positive opinion, which would theoretically lead to ego-level intercepts that are minus infinity or infinity, respectively. Formally, the multilevel logistic regression models specify, for each alter $j = 1, \dots, M$ (with $e(j)$ denoting the ego of alter $j$), the probability that $j$ has a positive opinion via

$$P(y_j = 1) = \mathrm{logit}^{-1}\left(\alpha_{e(j)} + \sum_{k} \beta_k x_{jk}\right),$$
$$\alpha_i \sim \mathcal{N}(\mu, \sigma^2).$$

In the first equation, the probability that alter $j$ has a positive opinion is the logistic transformation of the ego-level intercept $\alpha_{e(j)}$ plus the sum over the values in the covariate vector $x_j$, multiplied with the values in the parameter vector $\beta$. In the second equation, the ego-level intercepts $\alpha_i$, for $i = 1, \dots, N$, are assumed to be drawn from a normal distribution with mean $\mu$ and variance $\sigma^2$. Given the data, we estimate $\mu$, $\sigma^2$, $\alpha_i$, for $i = 1, \dots, N$, and the parameter vector $\beta$ with the function glmer from the R package lme4 49.
As an alternative to the multilevel models described, we fit standard logistic regression models that have no ego-level intercept but instead include the average opinion over all other alters (compare with the discussion of the assortativity variable) as an additional predictor for explaining the alter's opinion $y_j$. This additional variable, prop vacc ex alter, controls for the average opinion of all other alters in the network (see Results Section). By doing so, we explain the difference between the opinion of alter $j$ and that of all other alters. In other words, we explain why some alters have a higher or lower opinion than the other alters in the same network. We believe that the multilevel models are a better and more principled approach for our data. On the other hand, the pure logistic regression models (without any multilevel structure) are easier to understand and probably more common. In any case, it is interesting to compare the results yielded by the two classes of models.
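The link between the random-intercept specification and the odds ratios reported in the tables can be illustrated numerically. The sketch below is a minimal Python illustration and not the estimation procedure itself (which relies on glmer in R); the values of the mean and standard deviation of the intercepts are invented for illustration, and the coefficient ln(1.31) is chosen only so that the implied odds ratio matches the one reported for the assortativity variable.

```python
import math
import random


def inv_logit(z):
    """Logistic transformation: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def p_positive(alpha_ego, beta, x):
    """Probability of a positive opinion given the ego-level intercept,
    the parameter vector beta, and the alter's covariate vector x."""
    return inv_logit(alpha_ego + sum(b * v for b, v in zip(beta, x)))


def odds(p):
    return p / (1.0 - p)


# Hypothetical values, for illustration only.
rng = random.Random(0)
mu, sigma = 0.5, 1.0               # assumed mean and SD of the ego intercepts
alpha_ego = rng.gauss(mu, sigma)   # one ego's random intercept
beta = [math.log(1.31)]            # coefficient whose odds ratio is 1.31

p_low = p_positive(alpha_ego, beta, [0.0])
p_high = p_positive(alpha_ego, beta, [1.0])
odds_ratio = odds(p_high) / odds(p_low)   # equals exp(beta[0])
```

Whatever intercept is drawn for the ego, a one-unit increase in the covariate multiplies the odds of a positive opinion by the same factor, which is why the odds ratio can be interpreted within networks.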
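The control variable prop vacc ex alter can be computed per network as a leave-one-out proportion. The following is a minimal Python sketch, not the code used in the study; the function name and the dictionary-based input are our assumptions.

```python
def prop_vacc_ex_alter(opinions):
    """For each alter, the proportion of pro-vaccination alters among all
    OTHER alters of the same ego (the alter of reference is excluded).

    opinions: dict mapping alter id -> 0/1 for all alters of one ego
    """
    n = len(opinions)
    if n < 2:
        raise ValueError("need at least two alters per network")
    total_positive = sum(opinions.values())
    return {j: (total_positive - y) / (n - 1) for j, y in opinions.items()}


# Toy network (hypothetical): seven positive and three negative alters.
opinions = {k: (1 if k < 7 else 0) for k in range(10)}
leave_one_out = prop_vacc_ex_alter(opinions)
```

In this toy network a positive alter sees 6 of 9 other alters as positive, whereas a negative alter sees 7 of 9, so the variable differs slightly between alters of the same ego.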

Supplementary Files
This is a list of supplementary files associated with this preprint.
supplementarymaterial.pdf

Figures

Figure 2
Networks

Table 1
Descriptive statistics. Numeric variables of interest

Table 3
Multilevel logistic regression models explaining alters' opinion about vaccination

… 1.34, 95% CI: 1.24, 1.46, p < .001; M3: OR: 1.31, 95% CI: 1.21, 1.43, p < .001). This indicates that alters with similar opinions cluster together. Namely, people are more likely to have a positive attitude if their neighbors, on average, have a more positive attitude than the average attitude in the network. We also find that ego's opinion about vaccination makes a statistically significant contribution to predicting alters' opinions (M1, OR: 8.48, 95% CI: 6.03, 11.94, p < .001; M3: OR: 8.30, 95% CI: 5.87, 11.73, p < .001). This positive association suggests a possible ego-alter contagion effect or, alternatively, a social selection effect in the sense that egos tend to select alters with the same opinion.

Table 4
(Standard) logistic regression models explaining alters' opinions about vaccination.