The landscape of childhood vaccine exemptions in the United States

Once-eliminated vaccine-preventable childhood diseases, such as measles, are resurging across the United States. Understanding the spatio-temporal trends in vaccine exemptions is crucial to targeting public health intervention to increase vaccine uptake and anticipating vulnerable populations as cases surge. However, prior available data on childhood disease vaccination is either at too rough a spatial scale for this spatially-heterogeneous issue, or is only available for small geographic regions, making general conclusions infeasible. Here, we have collated school vaccine exemption data across the United States and provide it at the county-level for all years included. We demonstrate the fine-scale spatial heterogeneity in vaccine exemption levels, and show that many counties may fall below the herd immunity threshold. We also show that vaccine exemptions increase over time in most states, and non-medical exemptions are highly prevalent where allowed. Our dataset also highlights the need for greater data sharing and standardized reporting across the United States.


Background & Summary
Sufficient vaccine coverage to achieve herd immunity is the key to eliminating vaccine-preventable childhood diseases 1 . In the United States, the resurgence of once-eliminated diseases such as measles has reignited discussions over the role of vaccine hesitancy in diminishing herd immunity 2 . State-mandated school entry immunization requirements in the United States play an important role in achieving high vaccine coverage, but variations in vaccine exemption policies result in a patchwork of vaccine coverage across the country 3 . In all states, medical exemptions from immunizations can be obtained for conditions such as those resulting in immunosuppression or in cases of known adverse reaction to past immunizations. These exemptions have traditionally remained at levels low enough to not represent a threat to herd immunity, although recent dynamics such as that observed in California may change this assessment in the future [4][5][6] . Most other states offer religious exemptions, in which individuals cite personal religious beliefs that preclude vaccination, while others allow philosophical exemptions, which encompasses any personal belief against receiving a vaccination. For example, Georgia allows exemptions where "the immunization conflicts with the religious beliefs of the parent or guardian", whereas Minnesota allows exemptions based on "conscientiously held beliefs of the parent or guardian" 7,8 . The difficulty to obtain these exemptions for the parents varies widely from state to state 9 , helping to create a heterogeneous landscape of immunization policies and of vaccination coverage.
It is essential that we understand the variations in vaccine coverage at spatial scales that are relevant for the circulation of vaccine preventable diseases and to the implementation of public health policies. Such an understanding requires fine-scale geographic data on vaccination coverage and hesitancy. In the United States, information on a number of medical and non-medical exemptions is accessible for states through the Centers for Disease Control and Prevention (CDC) 10 . However, this spatial scale is largely inadequate as both exemptions and vaccine preventable childhood disease cases tend to cluster at smaller spatial scales 11 . Many prior studies have been performed at smaller scales, from the county to the school levels 3,[12][13][14][15] . However, the datasets used in these studies tend to be limited both in time and in terms of the spatial area they cover. Larger scale studies remain a challenge because there is no unified resource for data on childhood vaccination at more local spatial scales 16 . Each state is generally responsible for sharing exemptions data, and, when the data is shared at all, the formats vary widely between states as well as within states for different years. We aim to bridge that gap and facilitate further spatial studies of medical and non-medical exemptions by providing a standardized dataset of medical and non-medical exemptions at the county-scale encompassing multiple states and years in the continental United States.

Methods
We collected data from all US states where school vaccine exemption information was freely available from the Department of Health website in any format. We were able to locate that data in 24 states (see Table 1 for a list of states included). Within these states, the number of years available varied relatively widely, between 19 years in California and a single year in 6 states. The most represented year in our dataset was 2017 (corresponding to school year 2017-2018). Because the dataset was compiled in June-July 2019, we note that it is likely that additional data for more recent years may be available, or that data may have become available in additional states not included in our dataset.
The data format varied widely between states, and exemptions were reported either as a number of exemptions or as a percentage of the enrolled students. We have elected to use number of students rather than percentages, and have transformed data as needed. For most states included in our dataset, the data are provided at the county level. In several states (Arizona, Colorado, Illinois, Maine, Michigan, South Dakota, Tennessee, Vermont, Oregon, and Washington), the data was provided at the school level, which we aggregated to the county.
Additional data processing was necessary in some cases. In Virginia, data was provided by school name, but county or city information was not included. We used a list of public and private schools to match school names with their respective county using fuzzy matching (with the 'fuzzywuzzy' Python package) with an 80% matching requirement. Our algorithm was unable to find a suitable match for between 3.8% and 6.8% of schools (depending on year), and these schools were not included in the final counts at the county level. Similarly, in Idaho, data at the school level included city information but county was not provided. We first matched city and county names, before aggregating the exemption data at the county level. Finally in New York state, exemptions were provided as percentages at the school level but enrollment information was not included. We obtained enrollment for public and private schools separately from the New York State Education Department, and used the school unique code to calculate exemption number from enrollment and exemption percentages. We then aggregated these numbers at the county level.
States reported data for exemptions based on varying definitions, so we selected data records based on data availability to make the data comparable across states. We aimed to achieve parsimonious definitions of total medical exemptions (Fig. 1a), total non-medical exemptions (Fig. 1b), and total exemptions (Fig. 1c), which includes both types of exemptions. We define medical exemptions as reported total medical exemptions. In Florida, permanent medical exemptions were reported separately from temporary medical exemptions, so permanent medical exemptions was chosen to represent total medical exemptions. To define total non-medical exemptions, we considered the state law regarding non-medical exemptions and the data availability. If the state reported total aggregated non-medical exemptions, that was selected as total non-medical exemptions. If the state reported only religious exemptions and only allows religious exemptions, that was selected as total non-medical exemptions. If the state reported only religious exemptions, but also allows philosophical exemptions, that was www.nature.com/scientificdata www.nature.com/scientificdata/ considered missing data. If the state allows philosophical exemptions and only reports philosophical exemptions, that was selected as total non-medical exemptions, as the state may not differentiate religious from philosophical. If the state allows philosophical exemptions and reports both religious and philosophical exemptions separately, these values were summed for total non-medical exemptions. To define total exemptions, if the state reported a total exemptions value, this value was used. If the state did not report a total exemptions value, but reported values for total medical exemptions and total non-medical exemptions, as defined above, these were summed for total exemptions. If the state was missing either medical or non-medical exemptions, but reported the total number of students with completed vaccinations, the total exemptions was the difference between the number of students enrolled and the number of students completed. This classification process is visualized in Fig. 1.
We also considered disease-specific exemptions reports. If a state reported the number of exemptions for a vaccine specific to a given infection, that value was used. If the state did not report exemptions, but did provide the total number complete for that disease, the difference between the enrolled students and the completed students was used. For pertussis-specific vaccination, we used DTaP exemptions where available, and TDaP exemptions where DTaP was not available. For measles-specific vaccination, if separate reports were available for measles, mumps, and rubella, the value for measles was used. If measles was not available, then the mumps or rubella exemptions were used, if available.
The data in the figures is only data reported for kindergartens in states where kindergarten-specific data was available, or K-12 data in states where kindergarten-specific data was not reported. States reported age groups heterogeneously, and data by other age groups is available in the data file. We note that Oregon reports kindergarten-specific data in 2014-2015, then K-12 data in 2016-2018.

Data Records
A master file containing cleaned and consistent data is available on Dryad 17 . This dataset contains the year, state, county FIPS code, and pertinent age group for each record. Each county-year combination that is available has entries for total enrolled, total medical exemptions, total non-medical exemption, religious exemptions, philosophical exemptions, pertussis exemptions, measles exemptions, varicella exemptions, and flu exemptions.
State-year available is detailed in Table 1. Unformatted data by state and year is also available on Dryad. Figure 2 represents the number of total exemptions at the county level reported in 2017. States in blue are those that report kindergarten-specific data. States in green report data for kindergarten through 12th grade. The values reported are the total number of exemptions divided by the total number of enrolled students. The highest values, represented in the lightest green or blue colors, represent the counties in which herd immunity appears to  www.nature.com/scientificdata www.nature.com/scientificdata/ be below the 95% threshold for measles. Some states exhibit high levels of heterogeneity in exemption values, like Florida and Pennsylvania. Several states have a number of counties with low exemption rates, like New York and California, as this data was reported after California disallowed non-medical exemptions. More concerning is the fact that several states appear to have many counties with high levels of exemptions, like Washington and Idaho. We also highlight the vast number of states in white which do not report fine-scale exemption data. Increased reporting will be crucial to further understanding the landscape of immunity in the United States.

technical Validation
After demonstrating the spatial distribution of exemptions, we considered the temporal variation in exemptions. Figure 3 shows the time series of total exemptions by state with standard error. States are visible on the plot only in the years in which data was available for more than one year. States are split into different plots only to aid with visual differentiation. This figure highlights the heterogeneity in temporal reporting across states which adds to the difficulty in understanding trends in exemptions and how they might relate to trends in policy changes, vaccination dynamics, or disease dynamics. We also generally see an increase in exemptions over time in the  www.nature.com/scientificdata www.nature.com/scientificdata/ majority of states. Increasing levels of exemptions may reduce herd immunity levels, and this is a problem across the United States. The case of California highlights the importance of studying exemption trends across time. We see that there is a plunge in exemptions following the removal of non-medical exemptions followed by an increase in the exemption level, perhaps indicating that vaccine hesitant individuals found other means to obtain exemptions 3,4,18 . Figure 4 demonstrates total exemptions across data reported years by state, broken down into medical exemptions, religious exemptions, philosophical exemptions, or unknown. In states that allow philosophical exemptions, these types of exemptions make up the majority of exemptions for most states (Maine, Michigan, and Washington), except for Pennsylvania. Overall, medical exemptions compose a small proportion of total exemptions. In some states, disease-specific exemptions were also reported, and the total exemptions in reported years are shown by state (Fig. 5). In several states, like Arizona, Connecticut, Idaho, Minnesota, Vermont, and Washington, it appears that exemptions are consistent for disease types, possibly indicating that children are exempt from all vaccines, instead of obtaining exemptions for specific vaccines. In other states, exemptions for a specific vaccine or vaccines appear to be more common. Maine and Pennsylvania have more exemptions for varicella. California, Oregon, and Massachusetts have more exemptions for pertussis.  www.nature.com/scientificdata www.nature.com/scientificdata/

Usage Notes
This dataset will have potential uses for guiding public health policy, parameterizing epidemiological models, and validating other sources of vaccination data.
In public health policy, it is important for states to understand the fine-scale landscape of immunity for public health planning. This is necessary for proactive steps, like spatially targeted vaccination campaigns, and for more reactive steps, like anticipating upcoming outbreaks in areas where herd immunity has been waning. Our dataset also makes it feasible to compare across geography and time. A comparative approach is crucial to a better understanding of the underlying drivers of vaccine refusal and the effectiveness of existing public health policies. The larger geographic scope is also crucial to understand the effects of vaccine exemptions at a regional scale; disease susceptibility and transmission does not stop at state borders, thus it's important for individual states to understand the impact of declining vaccination coverage and herd immunity in neighboring states.
This data is also important for the parameterization of epidemiological models. Incorporating fine-scale population-level immunity into models to further understand and predict the reemergence of childhood diseases in the United States will vastly improve model results. Many prior epidemiological studies have only been able to focus on small spatial scales due to available data 19 , but we have seen prior outbreaks of childhood diseases that are not spatially continuous, like the Disneyland measles outbreak 20 . Thus, the use of this dataset will greatly enhance the prediction and mechanistic understanding of childhood disease resurgence.
Our dataset provides the first unified source of information for the landscape of vaccine refusal for childhood diseases in the United States. School records of immunization provide finer spatial resolution, high response rate, and vaccine-specific information. However, these data are limited in accuracy 21 , are only accessible for about half of US states, and remain challenging to compare given differences in data collection methods and data quality 16,[22][23][24] . Novel data sources must be explored to provide a truly comparable fine-scale understanding of the heterogeneity in vaccine refusal across all states of the US. With such alternative datasets, validation will be important against our school vaccine exemption dataset.

Code availability
The code used to produce the figures included in the manuscript as well as the full cleaned and raw datasets are available on Github at https://github.com/bansallab/exemptions-landscape. The code runs in Python 3.6.