Homophily impacts the success of vaccine roll-outs

Physical contacts do not occur randomly, rather, individuals with similar socio-demographic and behavioral characteristics are more likely to interact among them, a phenomenon known as homophily. Concurrently, the same characteristics correlate with the adoption of prophylactic tools. As a result, the latter do not unfold homogeneously in a population, affecting their ability to control the spread of infectious diseases. Focusing on the case of vaccines, we reveal that, provided an imperfect vaccine efficacy, three different dynamical regimes exist as a function of the mixing rate between vaccinated and not vaccinated individuals. Specifically, depending on the epidemic pressure, vaccine coverage and efficacy, we find the final attack rate to decrease, increase or vary non monotonously with respect to the mixing rate. We corroborate the phenomenology through Monte Carlo simulations on a temporal real-world contact network. Besides vaccines, our findings hold for any prophylactic tool that reduces but not suppress the probability of transmission, indicating a universal mechanism in spreading dynamics. While it is known that vaccine adoption is clustered, most modelling studies are based on vaccines which efficacy is almost 100%. Here, motivated by the relatively low efficacy of COVID-19 vaccines, the authors explore the interplay of vaccination clustering and imperfect immunization and its effect on the size and probability of outbreaks, revealing a phenomenology which is absent for perfect immunization.

V accines have been crucial in humanity's struggle to protect itself from infectious diseases 1 . In the 20th century, vaccines enabled to control a series of diseases 2 as well as the eradication of smallpox 3 . Nevertheless, vaccines are not uniformly adopted in the population. While on a world-wide scale a lack of access impedes equitable adoption of vaccines 4,5 , among high income countries vaccine hesitancy is the primary barrier [6][7][8] .
Vaccine hesitancy widely correlates with age, socio-economic status, education level, or ethnicity [9][10][11][12] . Concurrently, these same factors shape the interaction patterns in society, leading to what is commonly referred to as homophily 13 , i.e., similarity of social contacts. As a consequence, social interactions are relatively homogeneous with regard to many socio-demographic or behavioral characteristics 13 , and vaccination status is not an exception 6,7,[14][15][16][17][18][19] . This non-uniform, clustered vaccine adoption strongly determines how, and whether, the virus spreads in the population. A great example of this effect is provided by the recurrent measles outbreaks in high-income countries caused by clusters of vaccine hesitant individuals 6,7,[14][15][16]18 .
These recurring outbreaks sparked modeling studies that analyzed the impact of homophilic vaccine adoption on the disease dynamics [20][21][22][23][24] . Due to the high quality of vaccines against measles, these models assumed vaccine efficacy of almost 100%, and showed that clustered adoption always increases the final attack rate. In contrast, vaccines as the ones against influenza or variants of concern of SARS-CoV-2 have relatively low efficacy, between 20% and 80% 25,26 .
In this work, through the joint exploration of imperfect immunization and vaccine uptake, we offer a wider picture in which vaccination clustering is not always detrimental. Adjustable interaction rates allow us to model different contact structures with respect to the vaccination status of the interacting individuals. In such mean-field approximation, we reveal the existence of three dynamical regimes in relation to how the vaccination assortativity affects the epidemic size. This phenomenology is then corroborated by simulations on a real-world temporal contact network. Overall, integrating recent findings on the effect of homophilic adoption of digital proximity tracing apps 27 , we point out a general mechanism in spreading dynamics, implied by any prophylactic measure that reduces but not suppress the probability of transmission.

Results and discussion
Modeling the contact structure and the spreading dynamics. We consider the standard susceptible-infected-recovered (SIR) model, with transmission probability β, recovery rate μ, and contact rate k. Accordingly, in the absence of protected individuals, and assuming homogeneous mixing, the basic reproduction number of the disease is R 0 = βk/μ 28 . The fraction of people who received a vaccine is fixed as V 2 0; 1 ½ . Upon encounter, an infected individual transmits the infection to a vaccinee at a reduced probability β 1 À ε ð Þ, where ε ∈ [0, 1] represents the vaccine efficacy.
Under random mixing, the effective reproduction number would be given by R ¼ R 0 1 À εV ð Þ . In order to include homophilic interactions, we now parametrize the mixing relation between vaccinated and not vaccinated people, i.e., the contact matrix K, with a parameter α ∈ [0, 1]. We denote the entries of K as k ij with i, j ∈ {V, N}, where 'V' and 'N' stand for 'vaccinated' and 'not vaccinated', respectively. We introduce α through the relation k NV = αVk, hence interpolating from complete homophily (α = 0) to random mixing (α = 1). The remaining contact rates follow from the balance equation (1−V)k NV = Vk VN and the constraint k = k NN + k NV = k VV + k VN . Accordingly, K has the following entries: The degree of homophily regarding vaccine uptake, h, i.e., the probability that during a contact both individuals are either vaccinated or not, thus reads Note that α may take values larger than one (with an upper bound depending on V), indicating disassortative mixing. Broad sociological evidence 13,29,30 , however, largely excludes this possibility. Recently, the adoption of contact tracing apps has been shown to strongly correlate with age, income and nationality [31][32][33] , effectively leading to homophilic (assortative) patterns.
Similarly, we tried to estimate assortativity in vaccine uptake indirectly through age-stratification. We leverage the correlations in the contact patterns among age groups and the levels of vaccine coverage within each age group (see the "Methods" section for details). Specifically, for each region/country considered, we combined the age-stratified contact matrices 34 with the data on vaccine adoption against SARS-CoV-2 (counting full vaccinations only), from January 2021 -when first full vaccinations appearedto September 2021 [35][36][37][38] . As shown in Fig. 1, the estimated mixing rate stays well below 1 for almost all the time, sometimes fluctuating around it only for very low levels of vaccine coverage (V ≲ 0.02 for Catalonia).
It must be said, however, that it would be naïve to take the temporal trends in Fig. 1 as ready-to-use data. Indeed, we neglect other important features beyond age such as socio-economic classes or spatial patterns 9,12 . Including these additional features would reinforce the homophilic structure and thus decrease the mixing rate. Therefore, the results deriving from our simple analysis must be solely understood as a qualitative indication in line with existing literature reporting homophily in health behavior 17,30,33 , hence with the choice α ∈ [0, 1]. The latter is always kept fixed here, leaving the study of its implicit timedependence for future work.  Fig. 1 Mixing parameter from age-stratified data on vaccine uptake. Weekly moving average of the mixing parameter, α, as inferred for different countries/regions starting from the contact matrix among age groups and the SARS-CoV-2 vaccines uptake of each group (whose aggregate, V, is shown in the inset plot) from January to September 2021.
We denote with X Y ≡ X Y (t) the fraction of vaccinated (Y = V) and not vaccinated (Y = N) people in compartment X 2 S; I f g at time t. Then, the differential equations governing the dynamics read as _ I N ðtÞ ¼ β k NN I N ðtÞ þ k NV I V ðtÞ Â Ã S N ðtÞ À μI N ðtÞ ð6Þ Calculation of the reproduction number. Linearizing Eqs. (6)-(9) around the disease-free equilibrium (I N , I V , S N , S V ) ≈ (0, 0, 1, 1), we note the Jacobian matrix J to act as the null matrix in the (S N , S V ) two-dimensional subspace, having only zero entries in the respective columns. This subspace thus provides two zero eigenvalues. The other two are provided by the 2 × 2 block J I related to the I N and I V compartments, taking the form The effective reproduction number, R, is found from the spectral radius , we can just look at ρ(J I ). If ρ(J I ) > 0 (R > 1) the disease-free equilibrium is unstable and a finite fraction of the population gets infected. By computing the spectral radius and inserting the explicit expressions of the K matrix entries, one gets One can easily check that R/R 0 decreases monotonously with respect to all the figuring parameters. The monotonous dependence on α was expected, as the higher is the latter, the lower is the number of effectively susceptible people an infected individual can come in contact with. The monotonic shape of R distinguishes the adoption of vaccines from contact tracing apps, for which R can show instead a non monotonous dependence on the mixing rate 27 .
As anticipated, Eq. (11) reduces to R ¼ R 0 1 À εV ð Þfor α = 1, i.e., random mixing. Interestingly, the symmetry between vaccine efficacy, ε, and coverage, V, holding for random mixing, breaks for homophilic adoption. As proved in the "Methods" section and illustrated in Fig. 2, when keeping the product εV fixed, a higher coverage lowers the reproduction number more than a higher efficacy.
By imposing R = 1 and solving for α, we find the critical value of mixing, α c , above which the disease cannot thrive, to be given that α c ≥ 2ð1 À 1=R 0 Þ À ε Â Ã = 1 À εð1 À VÞ ½ , with R 0 > 1, is satisfied. In particular, the condition is always met for ε = 1 and never met for ε = 0 (meaning eradication is not possible in this case, i.e., R > 1). The critical values of vaccine uptake, V c , and efficacy, ε c , are found from Eq. (12) by solving for the respective variable.
Three dynamical regimes. The final attack rate, understood as the final fraction of individuals that got infected (and eventually transitioned to the recovered/removed compartment) throughout all the duration of an outbreak, exhibits three different dynamical regimes with respect to its dependence on the mixing rate, α. This is shown in Fig. 3a, where, by increasing the basic reproduction number, R 0 , the dependence of the final attack rate on α is first monotonously decreasing, then concave and finally monotonously increasing. Following previous work 27 , we refer to these three regimes as critical, intermediate, and saturated, respectively.
These overall regimes result from the competition between two nearly complementary regimes, as illustrated in Fig. 3b, c. With no mixing at all (α = 0), vaccinated and not vaccinated form two disconnected components. Accordingly, the disease is free to spread in the not vaccinated cluster whenever R = R 0 > 1; on the other hand, R = R 0 (1−ε) < R 0 in the vaccinated one, so the spread does not occur if ε is high enough to ensure R < 1. Increasing the mixing (i.e., α), each cluster is 'diluted' with nodes from the other. As a consequence, the reciprocal protection that vaccinated people provide among them to keep infection chains localized is partially lost. Instead, not vaccinated individuals profit from vaccinated individuals in their vicinity and are thus subject to a lower infection probability. Mixing is therefore beneficial for not vaccinated people and detrimental for vaccinated ones. Then, depending on the basic reproduction number, R 0 , the vaccine coverage, V, and the vaccine efficacy, ε, the overall system falls in one of the three dynamical regimes. .
The same qualitative picture holds for the peak of prevalence (see Fig. 3d-f), i.e., the maximum number of simultaneously infectious people during the epidemic. Note that, for each set of fixed parameters, the respective regimes of the peak of prevalence and of the final attack rate may not coincide. For example, at R 0 = 3, the final attack rate is in the intermediate regime while the peak of prevalence is in the critical one (Fig. 3a). In Fig. 4, we show how the three regimes can be also accessed by varying the vaccine coverage, V, or the vaccine efficacy, ε, or both. Specifically, increasing V and/or ε makes the dynamics pass from the saturated to the critical regime, going through the intermediate one. Accordingly, a sufficiently high efficacy makes random mixing always beneficial.
In the case of a perfect vaccine, i.e., ε = 1, the system is always in the critical regime. Indeed, in this case, we see from Eqs. solely driven by the contact rate k NN , which is a linearly decreasing function of the mixing rate, α. As mentioned in the section "Introduction", it has been recently 27 revealed that the same three dynamical regimes found here for vaccination, are also induced by the homophilic adoption of digital proximity tracing apps for epidemic containment. Indeed, the existence of these regimes can be understood within a common underlying mechanism, enclosing a broad class of prophylactic measures. The essential ingredient is the lack of sufficient individual, independent protection offered by the prophylactic tool to the adopter. This is evident for contact tracing apps, as adoption does not directly protect the adopter; instead, individuals only indirectly benefit if adoption is sufficiently widespread such that transmission chains between adopters can be stopped. Similarly, if the vaccine is imperfect, isolated adoption in a high prevalence environment may not substantially reduce the individual infection probability.
Widespread adoption among an individual's contacts is thus necessary to provide mutual protection.
Accordingly, for both vaccines and contact tracing apps, if coverage is low in comparison to the epidemic pressure, only an assortative/clustered adoption can provide mutual protection and thus non adopters cannot be protected (saturated regime). On the contrary, if coverage is high, protection is sufficiently strong such that nonadopters can be protected and thus mixing between adopters and nonadopters is beneficial (critical regime). In between we then find the intermediate regime. In all three cases, this phenomenology holds for any prophylactic measure that reduces the transmission probability since it is mathematically equivalent to the model presented here. In this sense, equivalent results could be found for the use of face masks or the adoption of social distancing.
Monte Carlo simulations on a real-world contact network. To corroborate our theory, we performed numerical simulations upon a temporal contact network estimated via Bluetooth signal exchanges in the Copenhagen Networks Study 40 . The data was collected during a month with almost 700 participants. In order to study different levels of assortativity in vaccine adoption, we distribute vaccines algorithmically until a preset value of α is reached (see the "Methods" section for details) 41 .
The results for the final attack rate, and the peak of prevalence, respectively, reported in Fig. 5a, b for ϵ = 1.0, in Fig. 5c, d for ϵ = 0.8, and in Fig. 5e, f for ϵ = 0.6, confirm the existence of the three dynamical regimes identified by our model. Accordingly, the beneficial effect of random mixing vanishes when the vaccine does not provide sufficient protection. In the saturated regime, mixing is slightly detrimental. Furthermore, as in the mean-field case, we observe that final attack rate and peak of prevalence can be in different dynamical regimes . Particular structural constraints of this aggregated network do not allow for arbitrarily small values of α, making the saturated regime less evident. Nonetheless, higher homophilic levels (even α ≈ 0) can be reached by artificially controlling the mixing, in which case the decrease in the final attack rate would be better appreciated.

Conclusions
Depending on the need and the available information, the model can be applied to different specific populations and at different scales. Currently, many epidemic models only implicitly consider the mixing between not vaccinated and vaccinated individuals through stratification according to age (Fig. 1) 42,43 or socioeconomic status 44 . Our findings urge also to account for the impact that subgroup-specific levels of mixing, can have on the system as a whole. Such approaches may provide a further tool to interpret epidemiological data. Additionally, more realistic models can act as useful guides for policy makers to design nonpharmaceutical interventions. For instance, with respect to the current COVID-19 pandemic, various countries require the green pass to enter restaurants or the workplace. The intention behind the green pass is to nudge individuals to get vaccinated or that non vaccinated individuals reduce their number of physical contacts 45 . However, at the same time, the green pass reshapes the social fabric 46 effectively reducing the mixing rate between vaccinated and non vaccinated individuals. In light of our results, due to the associated reduction in the mixing rate, models that do not include homophily in vaccine adoption may actually overestimate the impact of the green pass. Overall, our results show that more detailed information on the correlations between social interactions and health behavior (vaccination, face mask, social distancing, contact tracing apps) would lead to a more comprehensive analysis.
Shortly after having finished this analysis, we became aware that Hiraoka et al. 47 and Watanabe et al. 48 made an analysis very similar to ours. The phenomenology they uncover is equivalent to the one we found here. In this sense, the findings are very robust regarding model assumptions, as Hiraoka et al. 47 consider random networks and an all-or-nothing vaccine 49 , while we focus on the mean-field case and a temporal real-world network assuming a leaky vaccine 49 . Additionally, Fefferman et al. 50,51 showed that the presence of homophily adds interesting phenomenology regarding the temporal trajectory of the epidemics beyond the three dynamical regimes in the final attack rate.
From a more theoretical standpoint, we showed that the presence of homophily in vaccine adoption leads to three different dynamical regimes (critical, intermediate, saturated). Furthermore, the phenomenology presented here also extend to any prophylactic measure that reduces the transmission probability such as the use face masks 48 , social distancing 52 , as well as digital contact tracing 27,53 . Accordingly, the phenomenology induced by the presence of homophily is robust with respect to the adoption of different health behaviors. This robustness of our findings across prophylactic tools hints to a general feature of spreading dynamics. Eventually, the results highlight how correlated metadata such as vaccination status can add rich phenomenology beyond the network structure itself [54][55][56][57] .

Methods
Homophily level of SARS-Cov-2 vaccine adoption. We infer the mixing parameter α of SARS-CoV-2 vaccine adoption through an age-stratified approach. We exclusively focus on the heterogeneous adoption and contact structure with respect to age strata. Due to limited data availability, we neglect further correlations such as socio-economic factors. In this sense, the inferred value of α likely represents an upper bound.
To calculate α we make use of the empirical contact matrix C 34 and the vaccination coverage v i (t) in age-strata i = 1, 2, …, M at time t [35][36][37][38] . The entry C ij of the contact matrix represents the average number of daily contacts of an individual in age-strata i with an individual in age-strata j. Vaccination data is stratified into intervals of 10 years (0−9,10−19,…,70−79,80+), while the contact matrices use 5 year bins. Therefore, as a first step, we average the empirical contact matrices to transform them into 10 year bins according to the census data.
Generally, the empirical contact matrices do not respect the balance equation C ij N i = C ji N j , where N i represents the number of individuals in age-strata i. There are various approaches to fix this issue 58 . Here, we defined the matrix C as such that the balance equation is met, being e C the empirical matrix. To infer α(t) we leverage its relation with h(t), i.e., Eq. (5). Namely, we first calculate h as Asymmetry between vaccine efficacy and coverage in lowering the reproduction number. Here we prove that, when keeping fixed the product εV between vaccine efficacy, ε, and coverage, V, a higher coverage is more effective than a higher efficacy in lowering the reproduction number, R. To this end, it is convenient to rewrite Eq. (11) in the form Differentiating Eq. (15) with respect to ε while keeping the product εV fixed, one finds ∂R ∂ε εV¼const: ð16Þ This is zero for α = 1. Supposing α ≠ 1 and imposing ∂R/∂ε| εV=const. ≥ 0, after some algebra, one gets the condition 4αεV ≥ 0, which is always satisfied. Therefore, it holds (∂R/∂ε)| εV=const. ≥ 0 and, in a complementary manner, (∂R/∂V)| εV=const. = − (ε/V)(∂R/ ∂ε)| εV=const. ≤ 0, which completes the proof.
Assortative seeding of vaccines on a network. We consider the temporal contact network from the Copenhagen Networks Study 40 that was inferred through Bluetooth signals. We retained those interactions with an associated received signal strength indication (RSSI) not lower than −74 dBm, corresponding to physical distances approximately up to 2 m 59 . The resulting temporal network involves N = 672 individuals and 374,884 pairwise interactions spread over 8064 timestamps, binning 4 weeks of recording time into 5-min intervals.
To distribute the vaccine in the population, we provisionally aggregate the temporal network to get a weighted, static one, where the weight of an edge, w ij , is the duration of time its end nodes interacted, understood as the number of 5-min bins this occurred. Accordingly, the homophily, h, is computed as the sum of the weights (duration of contacts) over the homophilic edges, normalized by the sum over all the edges. In line with Eq. (5), given the vaccination status, v i (v i = 1 if vaccinated and 0 otherwise), of an individual i, we can express h as Then α is obtained from Eq. (5) as α = (1−h)/[2V(1−V)]. To reach a desired mixing level α, we initially distribute the vaccine at random (α ≈ 1) and then iteratively swap the vaccination status of two randomly selected neighboring nodes whenever this leads to a decrease in α, until the latter attains the preset value (±0.01). Additionally, if the average strength (number of contacts) of vaccinated individuals is higher (lower) than that of not vaccinated ones, a swap is only allowed if it increases (decreases) the strength of not vaccinated ones. Without this condition, the algorithm induces a spurious correlation between vaccination status and the strength of a node. Given the role that frequently interacting nodes play in driving the spreading dynamics, such correlation can importantly affect the results. Therefore, as done in Burgio et al. 27 -but not in a previous, related work 17 -only innocuous, not correlating swaps are carried out. The code for the seeding of the vaccine is publicly available 41 .

Data availability
All the data used in this study is publicly available and accordingly referenced.

Code availability
The code to reproduce all the results presented here is available at github and archived at zenodo.org 41 .