## Introduction

Vaccines are likely the most effective public health intervention ever devised, preventing an estimated 2–3 million deaths every year1. Nevertheless, vaccination coverage is far from complete: In the USA, only 35% of relevant adults were vaccinated against Herpes Zoster (shingles), 52% of females (21% of males) against human papillomavirus (HPV), and 45% of adults against influenza (flu)2.

One way to increase vaccination coverage is to advertise these vaccines, raising awareness of them and informing people of their necessity. Specifically, online advertising offers a highly efficient way to tune adverts to inferred user needs and demographics. For example, in search advertising, ads are served based on query keywords which indicate intent3. For this reason, spending on search advertising (also known as sponsored ads) in 2019 in the U.S. was valued at US$36.5 billion4 and US$109.9 billion worldwide5.

Ads have been used to steer people towards healthier behaviors, including smoking cessation6, weight loss7, and reduction in harm from eating disorders8. Directly measuring the effect of an online advertising campaign is difficult (e.g.,9). Thus, more often a proxy for the intended outcome is used. Web searches have been shown to be a powerful predictor of a wide variety of future behaviors. This includes a number of health-related behaviors and metrics, such as prescription drug utilization10, usage of electronic cigarettes11, mask-wearing12, as well as rates of both infectious13,14 and non-infectious diseases15,16. However, the utility of search as a predictive tool extends far beyond health behaviors. Web search data has also been used to predict home and auto purchasing behavior17, airline travel18, unemployment19, voter registration20, and suicide21. Given this plethora of predictions, it is reasonable to infer that searches for vaccines are meaningfully related to actual vaccination behavior, in addition to indicating raised awareness of (in our case) a vaccine. The only stronger proxy we are aware of, for vaccine advertising, is the work by Mohanty et al.22, which used the appearance of a person’s phone within a vaccination clinic as an indication that the person was vaccinated.

Research into online advertising campaigns promoting vaccines has focused on messaging which stressed the benefits of immunization or on providing information on opportunities for vaccination (see, for example22,23). Little attention has been given to the ways in which users can be reached and to ways of identifying user intent. As we show below, most commercial vaccine advertising is aimed at people already expressing awareness of the vaccine, if not outright intention to vaccinate. We hypothesize that the reason for this focus is to nudge people to use specific vaccine types or vaccine providers. In marketing terms, these people are already in the conversion funnel, on their way to buying a vaccine, just debating over which one and where24.

This paper thus speaks to a robust literature on where search ads are in the conversion funnel, especially in relation to their brand specificity. The bottom of the funnel normally has very specific, rarer keywords, as people search for specific products they are going to buy, making them ripe to click on ads versus organic (non-paid) links24. This also speaks to the research on category searches25, which tend to diminish when the brands are well known. With vaccines the brands are not known (with the exception of locations for flu shots, but those searches are relatively rare), so the searches are almost always categorical. Thus, in some ways, vaccines may feel more like indirect sales than direct sales, benefiting from varied, less direct keywords26.

We focus on an extreme example of indirect sales, people at the very top of the funnel who are unaware of a relevant vaccine (or even the disease): this is especially important for a stakeholder that may not care about which vaccine they receive or where, but simply that they obtain it. This is complicated by the fact that a large percentage of the relevant population for vaccination is unaware of the need to vaccinate with many unaware of the medical condition itself.

Therefore, we ask, is it possible to drive people who are unaware of a vaccine towards inoculation, and if so, what should the balance between advertising to people who are unaware, compared to those who are aware but have not yet vaccinated, be so as to maximally raise the number of vaccinated people?

## Results

### Campaign statistics

We ran advertising campaigns for vaccinations against three diseases: influenza, HPV, and herpes zoster.

Figure 1 exemplifies an advertising campaign and its execution. As described in the Methods, once campaign keywords are selected (a) the ads are shown to users who query using these keywords. Such campaign showings are called impressions (b). The impressions are divided between those where the campaign ads are shown (d) and where the ads of other advertisers are displayed. Each ad can be clicked by a user (f).

Table 1 shows the statistics of the three campaigns, including the dates of the campaigns, the number of distinct ad texts we created and the number of search terms used to trigger ads, number of times that the ads were shown and clicked, and the number of times that other ads were shown in response to the same queries as those for which the campaign ads were shown, denoted by “all ads”. The statistics in Table 1 refer to the steps in Fig. 1.

Among all vaccine-related ads, 98.1% of influenza ads were shown in response to vaccine- or disease-related keywords. Over 99.99% of HPV and Herpes Zoster ads were shown in response to such keywords.

Crowdsourcing coding of terms for their relatedness to the disease or vaccine produces very clear results for many terms being unrelated. On a scale of 1 (for “Definitely NOT searching for [vaccine]”) to 5 (for “Definitely searching for [vaccine]”), 40 of 68 Flu, 14 of 27 HPV, and 19 of 26 Herpes Zoster had means of 1 (i.e., every coder gave them a 1). On the top end there is less precision as coders gave terms for the actual vaccines means between 2.5 and 5.0 with an average of 4.2, and terms for the underlying illnesses between 1.7 and 4.2 with an average of 3.2. Sample terms and their scores are shown in Supplementary Data 1.

### Priming effect

We calculated the term similarity, CTR and the percentage of future searches for the keywords of each campaign. Table 2 shows the correlation between term similarities and the two latter variables. CTR is uncorrelated with term similarity. However, term similarity is strongly correlated with the percentage of future searches. Note that the distribution of term similarities is not uniform (see Supplementary Data 1), which may lead to an underestimation of the actual associations.

Using a Cox proportional hazard model, we evaluated the likelihood that a searcher who queried for one of the target terms (e.g., the condition searches) would later search for vaccines. Searches for terms which were not in the list of condition searches were marked as non-condition searches. The dependent variable is the time to future search from the original ad impression. Impressions that were not followed by a vaccine search from the same user were counted as censored. Table 3 shows the results of this model. We find that in all three cases, treated searchers were more likely to search for vaccines in the future than untreated searchers, and, most important: treated non-condition searchers were much more likely to search than untreated non-condition searchers.
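For orientation, a Cox model of this kind has the general form below. The covariates shown (a treatment indicator $$T$$, a condition-search indicator $$C$$, and their interaction) are an illustrative assumption, not necessarily the exact specification behind Table 3:

$$h(t\mid T,C)={h}_{0}(t)\,\exp \left({\beta }_{T}T+{\beta }_{C}C+{\beta }_{TC}\,TC\right)$$

Here $${h}_{0}(t)$$ is the unspecified baseline hazard of a first vaccine search at time $$t$$ after the impression, and $$\exp (\beta )$$ is the hazard ratio associated with a covariate. Censored impressions contribute to the risk set up to the end of observation but never as events.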

Treatment increased not only post-ad vaccine searches but also post-ad vaccine clicks. Table 4 shows the results of a logistic regression model for post-ad vaccine clicks. Searchers were coded as 1 if they made a post-ad vaccine search and clicked an ad or organic (unpaid search) link in the search results, and 0 if they either did not make a post-ad vaccine search or did not click on any of the results. The model shows that our ads generated a significant increase in post-ad clicks on vaccine-related links.

Figure 2 shows the cumulative hazard curves for post-advertisement vaccine searches, with the horizontal axis representing hours elapsed since the search for a relevant term, stratified by whether or not the user queried for the disease or vaccine, and by whether the user saw our campaign ads. As the figures show, treatment always increases the likelihood of future vaccine searches, especially in the near future (see also ref. 27). Curves for the control population which queried for the vaccine are significantly higher than for those who didn’t search for the vaccine. However, treatment of the latter population increases their likelihood of future vaccine searches to almost the same level as that of the control population which did search for the vaccine, clearly demonstrating increased awareness in this population.

Note that, in the case of non-condition original query, we see a large difference at time zero between the treated and control units. This is because for the non-condition searches, people are making a search unrelated to the vaccine or disease, and in the control, they are also not shown a vaccine ad. As such, if any of these people do eventually search for vaccines, it is likely to be long after their original search. On the other hand, a significant proportion of the non-condition users who were in the treatment group and received a vaccine ad on their search term did indeed search for vaccines shortly after seeing the ad, providing evidence of a priming effect of the vaccine ad on vaccine searches.
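As an illustration of how cumulative hazard curves like those in Figure 2 can be computed, the sketch below implements the Nelson–Aalen estimator for one stratum. The choice of estimator is an assumption made for illustration (the text does not state how the curves were produced), and the function name and input layout are hypothetical.

```python
from collections import Counter

def nelson_aalen(durations, observed):
    """Nelson-Aalen cumulative hazard for one stratum.

    durations: hours from the ad impression to the first vaccine search,
               or to the end of observation for censored impressions.
    observed:  True if a vaccine search occurred, False if censored.
    Returns a list of (time, cumulative_hazard) at each distinct event time.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)   # everyone leaves the risk set at their own time
    at_risk = len(durations)
    hazard = 0.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            hazard += d / at_risk   # increment: events / number still at risk
            curve.append((t, hazard))
        at_risk -= exits[t]
    return curve
```

Running this separately for each of the four strata (condition or non-condition query, treated or control) would give one curve per stratum.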

### Congruence effect of ads

We refer to the congruence effect to describe the change in the likelihood of clicks and future searches caused by the display of relevant ads to people who express an interest in the disease or the vaccine. We evaluated this effect by modeling the likelihood that links to ads or, separately, organic (unpaid search) links will be clicked, as a function of their parameters as well as those of the remaining page, taking into account whether additional ads and/or organic links pointed to the same website as that of the considered link (ad or organic).

Specifically, the model parameters account for the placement of the page or ad, as calculated by the rank of the ad or organic link in the page, where rank 1 means that they are placed highest. Model parameters of the page include the total number of ads and organic links on the page and the probability of the query among all queries. Model parameters of congruence include the fraction of ads or links on the search page which refer to the same website as that of the examined link.

The attributes we evaluated were:

• The number of ads displayed on the page

• The number of organic links displayed on the page

• The rank of the ad or the organic link

• Query popularity: Logarithm of the number of times this query text appears in our data

• Fraction of additional ads pointing to the same website

• Fraction of additional organic links pointing to the same website
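To make the attribute list above concrete, the following sketch derives the six attributes for a single link on a results page. It is an illustration under stated assumptions: the function and field names are hypothetical, a "website" is taken to be the URL host name, the logarithm is taken to be natural, and "additional" links are taken to exclude the examined link itself.

```python
import math
from urllib.parse import urlparse

def site(url):
    """Host name of a URL, used here as the 'website' of a link (assumption)."""
    return urlparse(url).netloc.lower()

def link_features(ads, organic, kind, rank, query_count):
    """Attributes of one link on a results page.

    ads, organic: URLs in display order; kind: "ad" or "organic";
    rank: 1-based position of the examined link within its own list;
    query_count: number of times the query text appears in the data.
    """
    own = ads if kind == "ad" else organic
    target = site(own[rank - 1])
    # "Additional" links exclude the examined link itself.
    other_ads = [u for i, u in enumerate(ads, start=1)
                 if not (kind == "ad" and i == rank)]
    other_organic = [u for i, u in enumerate(organic, start=1)
                     if not (kind == "organic" and i == rank)]

    def same_site_fraction(urls):
        return sum(site(u) == target for u in urls) / len(urls) if urls else 0.0

    return {
        "n_ads": len(ads),
        "n_organic": len(organic),
        "rank": rank,
        "query_popularity": math.log(query_count),
        "frac_ads_same_site": same_site_fraction(other_ads),
        "frac_organic_same_site": same_site_fraction(other_organic),
    }
```

Each resulting row, paired with a 0/1 click indicator, would form one observation for a logistic regression of the kind reported separately for ads and organic links.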

Table 5 shows logistic regression models for the likelihood of clicks, separately for ads and for organic links. As the table shows, organic links and ads which are placed higher on the page (lower rank) receive more clicks (as also shown in ref. 28). Additional ads are correlated with more clicks on ads but fewer clicks on organic links, whereas fewer organic links are correlated with a higher likelihood of clicks on both ads and organic links.

Interestingly, more ads pointing to the same website have a mixed effect on clicks, in some cases (Herpes Zoster) associated with more clicks on ads and in some fewer (HPV), whereas their appearance is correlated with an increasing likelihood of clicks on organic links. Additional organic links from the same website are associated with more clicks on ads in the case of HPV, and less on the organic links in the case of influenza and HPV, but not Herpes Zoster. Thus, it would appear that for vaccine-related queries, it is useful to place ads to websites which appear in the organic links, as the latter drive even more people to click on the ads.

### Optimizing the distribution of campaign funding

In the section “Priming effect” we demonstrated the priming effect of advertising on terms perceived as less-relevant to vaccines, eliciting future searches for the vaccine. However, our results in the section “Congruence effect of ads” showed that placing an advertisement when a link to the advertiser exists in the organic link dramatically increases the likelihood of a click on the ads.

Some people who already know about the vaccine or even plan to vaccinate, that is, people already deep into the conversion funnel, will be more likely to actually do so if their searches are enhanced through ads placed as a response to their searches on these topics. But our results suggest that if the goal is to recruit more people to vaccinate, advertisers could prime people by advertising on the less-relevant terms, reaching people who are in the target population but unaware of the vaccine and not yet in the conversion funnel.

To find the division of advertising budgets between searches for the vaccine and less-relevant searches so as to maximize the number of people who vaccinate we consider these two factors as follows: First, the increase in click probability due to priming and second, the increase in click probability due to ads placed for vaccine searches. Specifically, let $${P}_{c}^{v}$$ be the probability of click on an ad or organic search result related to vaccination when the user queried for the vaccine, given that a vaccine ad was shown. Similarly, $${P}_{c}^{nv}$$ denotes the same, given that the vaccine ad was not shown. For the priming effect, let $${P}_{p}^{v}$$ be the probability of click on our ad or a future search for a vaccine, where the search term was not vaccine-related and our ad was shown. $${P}_{p}^{nv}$$ denotes the same, but when our ad was not shown.

Increased click probability for vaccine searches is calculated as $${E}_{C}={P}_{c}^{v}-{P}_{c}^{nv}$$ and the priming effect is calculated as $${E}_{P}={P}_{p}^{v}-{P}_{p}^{nv}$$.

The probabilities are shown in Table 6. Note that for influenza, showing the vaccine ads seems to reduce the likelihood of clicks on the ads, but this is likely due to the fact that many people search for the condition without explicitly mentioning the vaccine.

Our goal is to maximize $${E}_{C}\alpha +{E}_{P}(1-\alpha )$$, where $$\alpha$$ is the fraction of campaign spending on vaccine searches ($$\alpha \in [0,1]$$). This is complicated by the varying prices of keywords. If all keywords were the same price and $${E}_{C} > {E}_{P}$$, the maximum of this function is reached when relevant ads are shown for all searches for the vaccine, and the remaining budget is given to priming searches. Conversely, when $${E}_{C} < {E}_{P}$$, the function is maximized by showing relevant ads to all priming searches, with the remaining budget spent on vaccine searches. In reality, however, prices do differ. For priming to be the more cost-effective option, the cheapest set of priming keywords (drawn from an expansive set of possible keywords with diverse pricing) just needs to have a higher expected return per dollar than the cheapest set of illness- or vaccine-related keywords (a set which is much more limited in scope). Moreover, keywords later in a conversion funnel, branded keywords in particular, are generally more expensive29.
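The allocation rule described above can be sketched as a greedy procedure that funds keyword pools in decreasing order of expected lift per dollar. This is a minimal illustration, not the procedure used in the study: the class and function names, the per-impression price, the pool capacities, and all numbers in the usage note are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class KeywordPool:
    name: str
    effect: float    # E_C or E_P: lift in probability per ad shown
    price: float     # assumed cost per ad shown, in dollars (hypothetical)
    capacity: int    # assumed number of searches available to advertise on

def allocate(budget, pools):
    """Fund pools in decreasing order of expected lift per dollar."""
    plan = {}
    remaining = budget
    for pool in sorted(pools, key=lambda p: p.effect / p.price, reverse=True):
        affordable = int(remaining // pool.price)
        shown = min(pool.capacity, affordable)   # cannot exceed available searches
        plan[pool.name] = shown
        remaining -= shown * pool.price
    return plan

def alpha(plan, pools, vaccine_pool="vaccine"):
    """Fraction of actual spending that went to vaccine searches."""
    spend = {p.name: plan[p.name] * p.price for p in pools}
    total = sum(spend.values())
    return spend[vaccine_pool] / total if total else 0.0
```

For example, with a budget of 100, a vaccine pool with effect 0.04 and a priming pool with effect 0.03, equal prices of 0.5 exhaust the vaccine pool first; if the priming price drops to 0.25, priming yields more lift per dollar and is funded first, even though its effect per ad is smaller.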

## Discussion

Nudging people towards vaccination is an important public health goal which has received significant attention over the years. Here we focused on methods for finding people to whom vaccine ads should be shown, identifying two main effects of such ads. First, we demonstrated a priming effect, whereby people who might not have otherwise considered vaccination could be nudged towards searching for information on the vaccine by showing vaccination-promoting ads to seemingly irrelevant terms, albeit ones used by the target population. Second, we found a congruence effect, whereby people who had shown an interest in the disease or the vaccine against it were more likely to click on relevant ads or increase the number of future searches following an ad displayed to them. We also found that the vast majority of ads promoting vaccines were targeted at this population.

Finally, we found that there is a complex interplay between organic (unpaid) links and ads. More ads pointing to the same website are correlated with an increasing likelihood of clicks on organic links. Additional organic links from the same website are associated with more clicks on ads in the case of HPV, and less on the organic links in the case of influenza and HPV, but not Herpes Zoster. Similar mixed effects were found in previous work30,31,32, but ours is the first to show that even among different vaccines this effect exists. Thus, advertisers should be careful to take this effect into consideration when designing campaigns.

Our work has several limitations: First, as noted in the Methods, here we chose to focus on two measures of outcome: clicks on relevant ads and future searches for the diseases or the vaccines against them. Though actual vaccination rates would have been a superior outcome, these are difficult (if not impossible) to obtain in our anonymous cohort. However, as shown in Supplementary Data 3, there is a good correlation between vaccination rates and the percentage of people searching for vaccines, when measured at the state level.

Second, our campaigns were run to all users who queried with a relevant keyword. In many cases, it may be possible to focus an advertising campaign by showing it to only the relevant demographics. However, in this paper, we did not test this option as it is unclear to which extent such demographic targeting is accurate and whether it would skew our results.

Third, when asked to quantify term similarity, most people gave scores that were either high (4 and higher) or very low (1–1.5). This was likely due to the difficulty that people have in assigning the middle values to the similarity ranges. Thus, the reported correlations could be skewed because of this difficulty.

Another limitation of our work is that we did not examine the specific attributes of the ads themselves. As noted in the Methods, the ads were chosen from popular ads shown on Microsoft Advertising. However, past work (e.g.,33,34) has shown that attributes of ads, including, for example, whether they contain informational content, appeal to emotion, and include a call to action, all affect how people respond to ads. Another aspect which was not examined in this work was the effect of images, which can be included in some forms of advertising and are known to be useful for eliciting action35. These aspects are left for future work.

Research has found that people do not receive recommended vaccinations for a variety of reasons. Mulet Pons et al.36 found that people did not vaccinate because they didn’t think they needed the vaccine, were unaware of it, were worried by adverse reactions or had just forgotten. Similar results were found by Bricout et al.37 for the case of the herpes zoster vaccine. More recently, vaccine hesitancy38 was shown to be an important factor. As we show in Supplementary Data 2, a significant percentage of the population are unaware of either the diseases or the vaccines that we studied, even in the target populations. This is in agreement with prior work39,40, which found a similar lack of knowledge on these issues. However, our experiments only attempt to address people being unaware of the vaccine (see also Supplementary Data 2). Future research will endeavor to address the other factors cited above through appropriate wording of the ads.

Taken together, our findings have clear implications for the way in which the importance of vaccinations is communicated: If the goal of a campaign is to generate revenue for a particular vaccine manufacturer or provider, it may make sense to focus on the congruence effect and maximize spending on searches near the end of the conversion funnel, that is, on those already searching for the illness or vaccine itself. But if the goal of a campaign is to maximize the number of people in society who receive the vaccine, regardless of type or provider, there is a potentially higher return in focusing on the priming effect for people in the target population for the vaccine who are searching for something unrelated to the vaccine and are not already in the conversion funnel. However, as we show (“Optimizing the distribution of campaign funding”), care must be taken to balance the advertising budget across the different parts of the conversion funnel: While there are more people in earlier stages of the funnel, they are also less likely to change their behavior41.

This result has implications for digital campaign targeting (especially if future research shows the same with direct measures of vaccine uptake): Digital advertising campaigns should consider not just pushing direct keywords or deriving sets of individuals that look exactly like existing customers, but also tangential keywords that draw in the wider target population, who are likely earlier in their conversion funnel or maybe not even in it. Moreover, we posit that part of the reason for advertisers’ focus on the population which has already expressed an intention to vaccinate is that the most common measure of campaign success is click-through rate, which is naturally higher in this population. Therefore, platforms that run digital advertisements should consider return-on-investment measures such as the expected lift per dollar noted here, which is easy enough to automate even for smaller campaigns.

## Methods

In this work, we advertised vaccines against three diseases: Influenza, HPV, and Herpes Zoster. All three vaccines are regularly advertised by pharmaceutical companies and health authorities in the USA. The budget for all campaigns was set at a maximum of US\$15 per day. Ads were shown to users in the USA.

### Advertising campaign development

Search advertising campaigns comprise three main elements: Keywords (the searches that trigger an ad), ad text (the words of the ad itself), and landing page (where the ad sends someone who clicks on it).

Ads were shown in response to user queries on Bing which included a campaign keyword, if the bid engine of Microsoft Advertising decided to show said ads. The ads comprise a short (one sentence) title and a 1–3 sentence body. The ad also includes a link to a website landing page which is displayed to users who click on the advertisement. Ads are shown in one of two locations on the search results page: Top or bottom. They are marked as ads to indicate to users that these are paid search results. We note that search pages also display unpaid links to websites. These are known as organic links.

Each display of an advertisement to a user is referred to as an impression. Advertisers pay whenever an ad is clicked. A common measure of ad performance is the percentage of ads that are clicked by users, also known as the Clickthrough Rate (CTR).
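As a small worked illustration of the CTR definition above, the function below computes it as a percentage of impressions. The function name and the guard for zero impressions are illustrative choices, and the numbers in the usage note are made up.

```python
def click_through_rate(impressions, clicks):
    """CTR as the percentage of ad impressions that were clicked."""
    if impressions == 0:
        return 0.0   # no impressions: report zero rather than divide by zero
    return 100.0 * clicks / impressions
```

With hypothetical counts of 2000 impressions and 50 clicks, `click_through_rate(2000, 50)` gives a CTR of 2.5%.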

Treatment users in this study were people who queried using the campaign ad keywords and were shown the campaign advertisements. Control users in this study were people who queried using the campaign ad keywords but were not shown the campaign advertisements. Control users saw either no ads or ads from other advertisers, as served by the advertising system. The bids were set by the advertisement system to be the smallest values that would still make the ads appear on the first page of the search results, making the assignment to the treatment and control semi-random. As no demographic information is available on the populations, we cannot verify that the randomization was completely effective.

Below we provide details on how each element of the campaigns was selected for each of the vaccines.

Keywords for the influenza vaccine were of three types: keywords mentioning the flu vaccine, keywords mentioning just the flu, and seasonal keywords. All three campaigns have a similar pattern of vaccine- and illness-related terms, targeting people who are inside the conversion funnel, and some set of tangentially related terms (which differ slightly in structure to match the illness), targeting people who are in the target population but may be outside the conversion funnel. In this case, the latter were found by selecting searches with the following keywords that had at least 1000 daily searches in October/November 2018 and had similar levels of search across both 2017 and 2018: “weather”, “fall”, “autumn”, “pumpkin”, “halloween”, “thanksgiving”, “turkey”, “gravy”, “daylight saving”.

The text of the ads was developed by the authors, after examining other influenza vaccine ads shown on Microsoft Advertising. All ads led to the HealthMap Vaccine Finder website https://vaccinefinder.org/.

Keywords for the HPV vaccine were derived from the following classes of terms:

• Vaccine-related: Commercial names of the HPV vaccine or the term “hpv vaccine”

• Disease-related: HPV, “human papilloma virus”

• Keywords of other HPV vaccine ads: Other keywords used by advertisers of the HPV vaccine on Microsoft Advertising, e.g., “cervical cancer vaccine” and “signs of cervical cancer”

• High-school related: Keywords related to college applications (e.g., “common app”, “gpa”, “scholarship”), standardized testing (“sat”,“ap”), parent-school district connection software (“parent portal”, “infinite campus”)

• Parent queries: We identified people who were likely parents by finding those people who queried on Bing between 1 October 2018 and 31 March 2019, and mentioned the terms “my teen”, “teenage son”, or “teenage daughter”. We then scored queries during that time range as the fraction of users who made each query and were in the parent population. The top 50 queries were included as keywords.

• Weather- and news-related terms, as in the influenza vaccine adverts.
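The parent-query scoring described in the list above can be sketched as follows. This is an illustration only: the log layout (user identifier and query text pairs), the function name, and the alphabetical tie-breaking are assumptions, and a real pipeline would likely also need a minimum query-volume filter, which is not described in the text.

```python
from collections import defaultdict

PARENT_MARKERS = ("my teen", "teenage son", "teenage daughter")

def score_queries(log, top_n=50):
    """Score each query by the fraction of its users who are likely parents.

    log: iterable of (user_id, query_text) pairs (hypothetical layout).
    Returns the set of likely parents and the top_n (query, score) pairs.
    """
    users_by_query = defaultdict(set)
    parents = set()
    for user, query in log:
        q = query.lower()
        users_by_query[q].add(user)
        if any(marker in q for marker in PARENT_MARKERS):
            parents.add(user)   # user mentioned one of the marker phrases
    scores = {q: len(users & parents) / len(users)
              for q, users in users_by_query.items()}
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda q: (-scores[q], q))
    return parents, [(q, scores[q]) for q in ranked[:top_n]]
```

Note that queries containing the marker phrases themselves trivially score 1.0 under this sketch; whether such queries were excluded from the keyword list is not stated in the text.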

Advertisements comprised the text of the most popular HPV vaccine ads shown on Microsoft Advertising in September 2018. The landing page for the ads was the Centers for Disease Control and Prevention (CDC) page on the HPV vaccine, https://www.cdc.gov/HPV/parents/vaccine.html.

Keywords for the Herpes Zoster vaccine were derived from the following classes of terms:

• Vaccine-related: Commercial names of the Herpes Zoster vaccine or the term “zoster vaccine”.

• Disease-related: “shingles”, “herpes zoster”.

• Older people’s queries: We found the 50 queries made by the highest percentage of people aged 50 or older during October 2019. Age was provided by users at the time of registration to Bing.

• Weather- and news-related terms, as in the influenza vaccine adverts.

Advertisements comprised the text of the most popular Herpes Zoster vaccine ads shown on Microsoft Advertising in September 2018. The landing page for the ads was the CDC page on the Herpes Zoster vaccine, https://www.cdc.gov/shingles/vaccination.html.

### Assessing campaign effectiveness

As noted in the Introduction, directly measuring vaccination rates as a result of advertising is difficult. Therefore, here we use two other proxies for behavior change: Clicks on the ads (as measured by the CTR) and future searches by the same user on a search engine. The latter measure was the percentage of users who queried for the vaccine on Bing, either by name or by general terms, e.g., “flu vaccine”, following the display of a relevant ad. The latter is a common proxy for medically related behavior change6,7,8. As shown in Supplementary Data 3, at a state-level, this latter measure is a reasonable proxy for vaccination rates.

Searches of users in this experiment were anonymized before the investigators had access to them. Each search comprised the time and date of the search, an anonymous user identifier, and the query text. We define condition searches as those searches that contained keywords related to either the vaccine or the disease. Data were extracted for the duration of the advertising campaigns.

### Classification of search term similarity

We used Amazon Mechanical Turk to code the search terms on a 5-point Likert scale, classifying whether or not the intent of the search was aimed towards a vaccine. The scale ran from “Definitely NOT searching for [vaccine]” to “Definitely searching for [vaccine]”. For background, we provided the first paragraph of the Wikipedia article on the vaccine and encouraged the workers to search if they needed more information. For each vaccine we coded the terms with over 100 ad impressions or (in the case of HPV) the top 25 terms by impression. We asked six workers to classify each term and took the average score given by them.
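The averaging step can be sketched as below. The input layout, function name, and example ratings are hypothetical; only the 1 to 5 scale and the use of the mean over the workers' ratings come from the description above.

```python
from statistics import mean

def term_scores(ratings):
    """Average the workers' Likert ratings (1-5) for each search term.

    ratings: dict mapping a search term to its list of worker ratings.
    """
    scores = {}
    for term, values in ratings.items():
        if any(v < 1 or v > 5 for v in values):
            raise ValueError(f"rating outside the 1-5 scale for {term!r}")
        scores[term] = mean(values)
    return scores
```

A term rated 1 by all six workers averages exactly 1, which corresponds to the "every coder gave them a 1" case reported in the Results.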

### Trial registration and IRB approval

This study was approved by the Microsoft Institutional Review Board. Informed consent could not be obtained from the anonymous participants. This trial was registered on AsPredicted (registration number 34050).

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.