Abstract
Although human mobility is considered critical for the spread of the new coronavirus disease (COVID-19) both locally and globally, the extent to which such an association is impacted by social vulnerability remains unclear. Here, using multisource epidemiological and socioeconomic data of US counties, we develop a COVID-19 pandemic vulnerability index (CPVI) to quantify their levels of social vulnerability and examine how social vulnerability moderated the influence of mobility on disease transmissibility (represented by the effective reproduction number, Rt) during the US summer epidemic wave of 2020. We find that counties in the top CPVI quintile experienced roughly twice as much COVID-19 transmission associated with mobility, particularly intracounty mobility (45.02% of days with an Rt higher than 1), as counties in the lowest quintile (21.90%). In contrast, counties in the bottom CPVI quintile were only slightly affected by the level of mobility. Specifically, a 25% intracounty mobility change was associated with a 15.28% Rt change for counties in the top CPVI quintile, which is more than eight times the 1.81% Rt change for those in the lowest quintile. These findings suggest the need to account for the vulnerability of communities when designing social distancing measures that restrict mobility in the future.
Introduction
The new coronavirus disease (COVID-19) pandemic has caused profound disruptions to the lives of people worldwide (Bonaccorsi et al., 2020; Buckee et al., 2021) and disproportionately affected disadvantaged and underprivileged subpopulations (UN, 2020; Buckee et al., 2021). The devastating economic and social effects caused by the pandemic necessitate an investigation into the drivers of disease transmission in the past to formulate appropriate and effective preventive strategies in the future.
As the major routes of transmission for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are direct physical contact, droplets, and aerosols, human movement has been considered critical for the spatial and temporal spread of the disease (Kraemer et al., 2020; Huang et al., 2021). That is, mobility directly contributes to the dispersal of infections through social contact. However, social groups that differ in income, employment status, and/or age may be vulnerable to the disease to varying degrees because of their mobility capacities and patterns, behaviours and lifestyles, and socioeconomic resources. Thus, the interplay between social vulnerability, mobility, and transmission is complex, and there is an urgent need to understand their interrelationships to design more targeted public health and social measures against future waves of COVID-19 and other public health crises.
Since the initial outbreak of COVID-19 in Wuhan, China, a plethora of research has assessed the influence of mobility on COVID-19 transmission using anonymized aggregate mobile phone data (Buckee et al., 2020; Kraemer et al., 2020; Tian et al., 2020). The impact of mobility on COVID-19 transmission rates in 25 counties in the US has also been evaluated (Badr et al., 2020), and a strong correlation between mobility patterns and COVID-19 growth rates in the most affected United States (US) counties was found. Similarly, a strong correlation was found at the state level in the US (Gao et al., 2020). These results explain why mobility reduction, such as stay-at-home orders, the closure of schools and workplaces, and restrictions on long-distance travel, was adopted as a primary nonpharmaceutical intervention (NPI) to contain the transmission of the disease. As of July 2021, 186 countries had announced at least one domestic mobility control policy, while 186 countries had also employed an international policy (Hale et al., 2021). These mobility control measures achieved considerable effects, as transmission decreased in 73% of the countries following their first mobility interventions (Nouvellet et al., 2021). Based on cell phone location data from Shenzhen, China, changes in COVID-19 transmission during the course of reopening were simulated by varying the type of mobility restrictions under different transmission scenarios; it was found that a 20–60% reduction in mobility within the city had a significant effect on controlling the spread of COVID-19 (Zhou et al., 2020).
Although an extensive body of literature has investigated the impact of mobility on transmission dynamics, the extent to which such an association is moderated by social vulnerability and varies across geographical areas and different social groups remains largely unexplored. The majority of studies focus on the relationships between vulnerability and mobility and between vulnerability and disparate pandemic outcomes, e.g., mobility reduction as related to income during the lockdown period (Bennett, 2021; Hou et al., 2021), infection risk as related to income (Rufat et al., 2015; Eligon et al., 2020; Snyder and Parks, 2020; Cahill et al., 2021), and the age structure as related to infection or mortality rates (Gu et al., 2020; Snyder and Parks, 2020). However, these studies do not address the varying impacts of mobility on transmission dynamics due to social vulnerability, i.e., how social vulnerability disparately affects the mobility–transmission association. It has been found to be inappropriate and insufficient to apply a uniform relationship between mobility and transmission across counties and social groups with disparate socioeconomic statuses when formulating preventive measures (Gozzi et al., 2021). Understanding such heterogeneous effects can help policymakers target certain social groups and thus make more effective interventions to mitigate COVID-19 transmission and ameliorate social inequity. Moreover, social vulnerability is a multidimensional concept that is rooted in the interactions among social, natural, and engineered systems (Cutter, 2003). Given this complexity, social vulnerability is typically conceptualized as consisting of different dimensions; in practice, these dimensions are collapsed into composite indicators such as the Social Vulnerability Index (SoVI) (Spielman et al., 2020).
However, most of the existing studies select a single or a small subset of sociodemographic variables to explore COVID-19 transmission among different vulnerable communities, thereby neglecting the combined effect of multiple variables on COVID-19 transmissibility (Fauver et al., 2020; Tian et al., 2020; Xu and Li, 2020; Cahill et al., 2021). Thus, a comprehensive measure of COVID-19 vulnerability is needed to better understand the socially heterogeneous mobility–transmission relationship.
Efforts have been made to construct a multidimensional COVID-19 vulnerability index (see a more detailed review in the Supplementary Materials). For example, one group of studies follows the traditional framework of social vulnerability to select relevant variables (Acharya and Porwal, 2020; Kim and Bostwick, 2020; Macharia et al., 2020; Snyder and Parks, 2020; Daras et al., 2021; Sarkar and Chouhan, 2021; Qiao and Huang, 2022; Welsh et al., 2022), with the aim of examining the explanatory power of the framework or certain variables. In contrast, another group of studies attempts machine learning methods to assess the vulnerability of administrative units. For instance, Tiwari and colleagues (2021) created the COVID-19 pandemic vulnerability index using random forests and then classified counties in the United States into varying levels of vulnerability. Compared to the traditional methods, the indices constructed via machine learning are less restricted by the conventional theories present in statistics and may more accurately predict the dynamics of the COVID-19 pandemic in communities. However, the advantages of such data-driven methods could become constraints when testing theoretical hypotheses and the internal consistency of the index.
This study investigates COVID-19 transmissibility across counties in the US affected by mobility changes following successive reopening policies implemented by state and local governments. We first develop a COVID-19 pandemic vulnerability index (CPVI) for US counties using their census data via traditional principal component analysis (PCA). Our CPVI is built on the well-established framework of the SoVI from the Centers for Disease Control and Prevention (CDC) (CDC, 2020). In addition to the four dimensions of the SoVI, other factors relevant to the pandemic, i.e., health and the environment, are also included because several studies have found various influences of these factors on COVID-19 spread (Chin et al., 2020; Klompmaker et al., 2021; Marvel et al., 2021). Based on the resulting CPVI, the counties are then divided into five quintiles, with the top 20% representing the most vulnerable group of counties. By including the interaction of mobility and vulnerability in a fixed effect model, the heterogeneous effect of mobility on COVID-19 transmission under varying vulnerability levels (corresponding to the five quintiles) is then observed. In addition, the differences in the effect of vulnerability levels on transmissibility, as represented by the effective reproduction number (Rt), are estimated and tested. The empirical results show that mobility reduction measures can be implemented in a manner that is more pertinent with respect to a county’s vulnerability level, thereby facilitating a more effective containment of COVID-19 infections.
Methods
COVID-19 cases
Statistical data on COVID-19 and the population were collected from USAFacts, which is used by the US CDC. The dataset contains the cumulative number of daily confirmed cases and the cumulative number of COVID-19 deaths in every county in the US. We downloaded the case data from January 22, 2020, to September 1, 2020, and then we derived the number of new cases each day within the period.
County attribute data
The county attribute data were obtained from multiple publicly available data sources. First, data on demographics (e.g., the percentages of the elderly population and ethnic minority population), socioeconomics (e.g., income and poverty), diseases (e.g., diabetes and hypertension data), and health insurance were retrieved from the CDC’s 2015–2018 statistical dataset. Second, the air pollution data were collected from the County Health Rankings & Roadmaps, which reports the yearly average PM2.5 concentration for each county. Third, we collected 2017 data on airborne isolation rooms and hospital bed capacity from the Area Health Resources Files. Finally, the normalized difference vegetation index (NDVI) was retrieved from the Moderate Resolution Imaging Spectroradiometer (MODIS) global vegetation index data product (MOD13A2), which provides global gridded NDVI products. The average NDVI values of each county in 2019 were then calculated using the Google Earth Engine (GEE). These metrics were selected because extensive studies have found that they have differential effects on the spread of epidemics (Baron, 2020; Onder et al., 2020; Tahmasebi et al., 2020; Yan et al., 2020; Zoran et al., 2020; Klompmaker et al., 2021). Table S1 lists all the indicators and relevant information used.
Population mobility data
Data on population mobility were originally collected by SafeGraph, which tracks the trajectory of millions of anonymous mobile phone users and generates a daily human movement origin-to-destination (O–D) flow matrix at the county scale in the US. We obtained the O–D data for the period of January 1, 2020–September 1, 2020, and these data were further processed and made available by Kang et al. (2020).
With the O–D dataset, we obtained the inflow population (Inflowt), outflow population (Outflowt), and internal mobile population (Intraflowt) for each county on day t. We took the average population mobility from January 1, 2020, to January 21, 2020, prior to the outbreak in the US, as the baseline. Thus, the change in intracounty mobility (IntraM) for each county during the outbreak can be expressed as follows:
\[
{\text{IntraM}}_i^t = \frac{{\text{Intraflow}}_i^t}{{\text{Intraflow}}_i^{t_0}}
\]
where \({\text {IntraM}}_i^t\) denotes the daily change rate of IntraM on day t, \({\text {Intraflow}}_i^t\) denotes the internal mobile population on day t, and \({\text {Intraflow}}_i^{t_0}\) denotes the daily average of the internal mobile population during the baseline period, all for county i. IntraMt > 1 indicates an increase of IntraM relative to the baseline period, whereas IntraMt < 1 indicates a decrease.
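Similarly, we also calculated the daily change rate of intercounty mobility (InterM) for each county, which can be expressed as follows:
\[
{\text{InterM}}_i^t = \frac{{\text{Inflow}}_i^t + {\text{Outflow}}_i^t}{{\text{Inflow}}_i^{t_0} + {\text{Outflow}}_i^{t_0}}
\]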
where \({\text {InterM}}_i^t\) denotes the change rate of InterM on day t for county i. The numerator denotes the sum of the inflow and outflow of county i on day t, and the denominator denotes the daily average of the sum of the inflow and outflow of county i during the baseline period.
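The two change rates can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the flow values, the day indices, and the helper name `change_rate` are hypothetical, and the baseline is taken to be the plain mean of the baseline-day flows.

```python
from statistics import mean

def change_rate(daily, baseline_days):
    """Ratio of each day's flow to the mean flow over the baseline days."""
    baseline = mean(daily[d] for d in baseline_days)
    if baseline == 0:
        raise ValueError("baseline flow is zero; the ratio is undefined")
    return {d: v / baseline for d, v in daily.items()}

# Hypothetical flows for one county, keyed by day index (days 0-2 = baseline).
intraflow = {0: 1000, 1: 1200, 2: 800, 3: 500, 4: 1500}
inflow = {0: 300, 1: 300, 2: 300, 3: 100, 4: 200}
outflow = {0: 200, 1: 300, 2: 100, 3: 100, 4: 100}
baseline_days = [0, 1, 2]

# IntraM: internal flow relative to its baseline average.
intra_m = change_rate(intraflow, baseline_days)

# InterM: inflow + outflow relative to the baseline average of that same sum.
cross_flow = {d: inflow[d] + outflow[d] for d in inflow}
inter_m = change_rate(cross_flow, baseline_days)
```

A value above 1 on a given day indicates more movement than during the baseline period, and a value below 1 indicates less.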
Setting and selection of study participants
The US directives to shelter in place and temporarily close nonessential businesses and schools were made at the state and local levels throughout March and April 2020 (Badr et al., 2020). The national average mobility then decreased rapidly, followed by a slow decrease in daily new cases. However, in late April and May, the states enacted additional reopening policies, leading to the gradual recovery of mobility (Smith et al., 2020). At the beginning of the outbreak, many studies demonstrated that there was a strong correlation between the spread of COVID-19 and population movement (Badr et al., 2020; Coelho et al., 2020; Rubin et al., 2020; Chang et al., 2021; Nouvellet et al., 2021). However, with the reopening of the states, mobility started to recover, and the number of infections also rebounded rapidly, ushering in a second wave of the pandemic (Fig. S1). While many studies have examined mobility in relation to COVID-19 transmission during the prelockdown and lockdown periods, few studies have explored this relationship in the postlockdown period before vaccination. Therefore, we chose the period of June 1, 2020–August 31, 2020, for our analysis, centring on the mobility–transmissibility relationship.
We set the following county-filtering criteria: counties in the contiguous US (i.e., excluding Alaska and Hawaii); counties that had at least one case of COVID-19 as of June 1, 2020; and counties that averaged at least three cases per day from June 1, 2020, to August 31, 2020. Counties with a 3-day average of less than one case were also excluded. Ultimately, 1118 (out of 3143) counties were selected, covering 257,867,883 people or 78.56% of the total US population, and the cumulative number of confirmed cases in these counties as of September 1, 2020, was 4,980,400 or 82.75% of the total. These counties are distributed across 46 US states and Washington, DC.
Construction of the CPVI
There have been several attempts to construct a COVID-19 vulnerability index (Acharya and Porwal, 2020; Marvel et al., 2021; Tiwari et al., 2021). The basic strategy is a combination of the SoVI used by the US CDC and other factors closely related to the COVID-19 pandemic, such as epidemiological factors and healthcare system factors.
Our CPVI is built on previous social vulnerability indices by selecting the four dimensions used by the SoVI (Spielman et al., 2020) and incorporating epidemiological and healthcare system factors. Eight out of 15 SoVI variables (being an elderly individual, ethnicity, groups, language, lower education attainment, income, poverty, and unemployment) and 9 epidemiological and healthcare system variables (smoking, diabetes, coronary heart diseases, hypertension, air pollution, green exposure, isolation rooms, hospital beds, and insurance) were included in the calculation after selection based on the previous literature (Supplemental Materials). This process allows us to estimate the contribution of each indicator and component and thus test the internal consistency and reliability of the CPVI (Spielman et al., 2020).
The widely used PCA with a varimax rotation matrix is adopted to eliminate redundant information and construct a set of PCs (Halko et al., 2011). The PCs with eigenvalues greater than one are retained, as each of them explains more variance than a single variable in the original data (Bro and Smilde, 2014). The loadings of the rotation factors are used to calculate the scores of each PC on the samples. The score vector of the jth PC on the samples is denoted as Sj; then, it can be expressed as follows:
\[
S_j = X a_j
\]
where X denotes the matrix formed by the values of each indicator in each sample and aj is the factor loading vector of the jth principal component.
The contribution weights of each subcomponent are calculated by their contribution proportion as follows:
\[
W_j = \frac{p_j}{\sum_{j=1}^{m} p_j}
\]
where Wj denotes the weight of the jth subcomponent, m denotes the number of subcomponents, and pj denotes the proportion of the contribution of the jth subcomponent.
Finally, the CPVI is calculated as follows:
\[
\text{CPVI} = \sum_{j=1}^{m} W_j S_j
\]
The CPVI is divided into five quintiles, with the first quintile (level 1) indicating the lowest vulnerability level and the last quintile (level 5) indicating the highest vulnerability level. Each county is assigned a specific vulnerability level based on its CPVI value.
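The weighting and quintile steps can be illustrated with a short Python sketch. It assumes that the weights are the variance proportions of the retained PCs normalized to sum to one (the proportions below are those reported in the Results), that the PC scores have already been given their directionality, and that quintiles are assigned by rank. The PC scores and the function names `cpvi` and `quintile_levels` are hypothetical.

```python
# Variance proportions (%) of the four retained PCs, as reported in the Results.
explained = [38.58, 16.63, 11.79, 8.30]
total = sum(explained)
weights = [p / total for p in explained]  # W_j = p_j / sum of all p_j

def cpvi(scores):
    """Weighted sum of one county's PC scores (signs assumed already oriented)."""
    return sum(w * s for w, s in zip(weights, scores))

def quintile_levels(values):
    """Assign levels 1 (lowest) to 5 (highest) by rank; ties follow input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    levels = [0] * len(values)
    for rank, i in enumerate(order):
        levels[i] = rank * 5 // len(values) + 1
    return levels

# Hypothetical PC scores for ten counties (only PC1 varies, for readability).
pc1_values = (-2, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, 2, 2.5)
county_scores = [[x, 0.0, 0.0, 0.0] for x in pc1_values]
index = [cpvi(s) for s in county_scores]
levels = quintile_levels(index)
```

With these proportions, PC1 carries slightly more than half of the total weight, so the socioeconomic component dominates the index without fully determining it.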
Calculation of Rt
Similar to other infectious diseases (e.g., severe acute respiratory syndrome (SARS) and influenza), the real-time transmissibility of COVID-19 can be estimated using the effective reproduction number (Rt) (Cauchemez et al., 2006; Cowling et al., 2020). Rt is the average number of secondary cases per case at any given time, and it provides a useful measure of how quickly the virus is spreading. As the pandemic progresses, implementing various restrictions and interventions will change the Rt value. If Rt is >1, each infected case will go on to infect an average of more than one person, and the number of infected cases may increase exponentially. If Rt is <1, each infected case will go on to infect an average of less than one person, possibly suggesting a slowdown of the outbreak over time.
We use the approach developed by Bettencourt and Ribeiro (2008) to calculate the Rt of US counties. The daily numbers of newly confirmed cases provide information on the current value of Rt; hence, Rt can be estimated from these numbers. In addition, the value of Rt for the current day is related to the values of Rt one day before and every previous day since the outbreak. The approach uses a Bayesian framework to estimate the value of the daily Rt from the new cases reported daily:
\[
P(R_t \mid k) = \frac{P(k \mid R_t)\, P(R_t)}{P(k)}
\]
where P(Rt|k) is the posterior probability of Rt given k new cases, P(k|Rt) is the likelihood of observing k new cases given Rt on day t, P(Rt) is the prior belief about the value of Rt at the beginning of the study period, and P(k) is the probability of observing k new cases for the given day t.
Given an average arrival rate of λ new cases per day, the probability of observing k new cases follows the Poisson distribution:
\[
P(k \mid \lambda) = \frac{\lambda^{k} e^{-\lambda}}{k!}
\]
Then, the relationship between Rt and λ is as follows:
\[
\lambda = k_{t-1}\, e^{L (R_t - 1)}
\]
where \(k_{t-1}\) is the number of new cases on the previous day and L is the reciprocal of the serial interval. The serial interval is assigned a mean (standard deviation) of 7.5 (3.4) days according to a previous epidemiological survey (Li et al., 2020; Rubin et al., 2020). Furthermore, since the number of daily new cases is known, the Poisson parameterized likelihood function can be re-expressed with fixed k and varying Rt:
\[
P(k \mid R_t) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad \lambda = k_{t-1}\, e^{L (R_t - 1)}
\]
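A minimal Python sketch of this kind of estimate is given below. It is not the implementation used in the study. It assumes the Bettencourt and Ribeiro relation λ = k(t−1)·exp(L(Rt − 1)), a flat prior over a grid of candidate Rt values from 0 to 6, and a constant Rt over the short window of days supplied; how the study weights earlier days is not stated in the text. The names `R_GRID`, `posterior`, and `most_likely_rt` are hypothetical.

```python
import math

SERIAL_INTERVAL = 7.5                      # days, the mean quoted in the text
L = 1.0 / SERIAL_INTERVAL                  # reciprocal of the serial interval
R_GRID = [i / 100 for i in range(0, 601)]  # candidate Rt values (assumed range)

def log_likelihood(k, k_prev, r):
    """Poisson log-probability of k cases when lambda = k_prev * exp(L * (r - 1))."""
    lam = k_prev * math.exp(L * (r - 1.0))
    return k * math.log(lam) - lam - math.lgamma(k + 1)

def posterior(cases, grid=R_GRID):
    """Update a flat prior over the grid with each day's Poisson likelihood."""
    post = [1.0 / len(grid)] * len(grid)
    for k_prev, k in zip(cases, cases[1:]):
        if k_prev == 0:
            continue  # lambda would be zero; this sketch simply skips the day
        like = [math.exp(log_likelihood(k, k_prev, r)) for r in grid]
        post = [p * l for p, l in zip(post, like)]
        norm = sum(post)  # normalizing constant, playing the role of P(k)
        post = [p / norm for p in post]
    return post

def most_likely_rt(cases, grid=R_GRID):
    """Grid value with the highest posterior probability."""
    post = posterior(cases, grid)
    return grid[max(range(len(post)), key=post.__getitem__)]

# Hypothetical daily new-case counts for one county.
growing = most_likely_rt([100, 120, 144, 173])
flat = most_likely_rt([100, 100, 100])
```

Rising case counts push the estimate above 1 and stable counts place it at 1, which matches the interpretation of Rt given above. Very large case counts would need the update to be carried out in log space to avoid numerical underflow.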
Fixed effect model
We formulate a fixed effect model to estimate the heterogeneous association between mobility and transmission based on the county-level CPVI value divided into five vulnerability levels.
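The regression of mobility and transmission at time t is written as follows:
\[
y_{it} = \beta_0 + \beta_1 \text{IntraM}_{it} + \beta_2 \text{InterM}_{it} + \beta_3 \text{CPVI}_i + \beta_4 \text{IntraM}_{it} \times \text{CPVI}_i + \beta_5 \text{InterM}_{it} \times \text{CPVI}_i + \beta_6 Z_i + \mu_{it} \tag{11}
\]
where yit is the dependent variable denoting COVID-19 transmissibility (Rt). IntraMit and InterMit denote the two mobility measurements, namely, IntraM and InterM, respectively; CPVIi denotes the COVID-19 pandemic vulnerability index value; the interactions between the mobility measurements and vulnerability (IntraMit*CPVIi and InterMit*CPVIi) denote the moderating effects of vulnerability; Zi denotes other observable and unobservable time-invariant factors (except vulnerability) affecting COVID-19 transmissibility; and μit is the time-specific and individual-specific residual. βs are the regression coefficients, β1 and β2 are the coefficients of the main effect of mobility on transmission, and β4 and β5 denote the magnitude of the moderating effect of vulnerability.
Take the average of both sides over time for each county as follows:
\[
\bar{y}_{i} = \beta_0 + \beta_1 \overline{\text{IntraM}}_{i} + \beta_2 \overline{\text{InterM}}_{i} + \beta_3 \text{CPVI}_i + \beta_4 \overline{\text{IntraM}}_{i} \times \text{CPVI}_i + \beta_5 \overline{\text{InterM}}_{i} \times \text{CPVI}_i + \beta_6 Z_i + \bar{\mu}_{i} \tag{12}
\]
The difference between Eqs. (11) and (12) is as follows:
\[
y_{it} - \bar{y}_{i} = \beta_1 \left(\text{IntraM}_{it} - \overline{\text{IntraM}}_{i}\right) + \beta_2 \left(\text{InterM}_{it} - \overline{\text{InterM}}_{i}\right) + \beta_4 \left(\text{IntraM}_{it} - \overline{\text{IntraM}}_{i}\right) \times \text{CPVI}_i + \beta_5 \left(\text{InterM}_{it} - \overline{\text{InterM}}_{i}\right) \times \text{CPVI}_i + \left(\mu_{it} - \bar{\mu}_{i}\right) \tag{13}
\]
Alternatively, writing each time-demeaned variable with a double dot (e.g., \(\ddot{y}_{it} = y_{it} - \bar{y}_{i}\)):
\[
\ddot{y}_{it} = \beta_1 \ddot{\text{IntraM}}_{it} + \beta_2 \ddot{\text{InterM}}_{it} + \beta_4 \ddot{\text{IntraM}}_{it} \times \text{CPVI}_i + \beta_5 \ddot{\text{InterM}}_{it} \times \text{CPVI}_i + \ddot{\mu}_{it} \tag{14}
\]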
Therefore, all the time-invariant factors, including the COVID-19 vulnerability index CPVIi and other observed and unobserved factors Zi, are differenced out. Only the difference in transmission, the difference in mobility measurements, and the interactions are left.
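The effect of the demeaning step can be checked with a small Python sketch. For brevity it uses a single mobility measure instead of two, a hypothetical noise-free panel of three counties, and hypothetical names (`demean`, `within_estimate`). It illustrates the within transformation rather than reproducing the estimation used in the study.

```python
def demean(values, groups):
    """Subtract each group's mean from its values (the within transformation)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

def within_estimate(county, x, cpvi, y):
    """Estimate b1 (mobility) and b4 (mobility x CPVI) after county demeaning."""
    x1 = demean(x, county)
    x2 = demean([a * c for a, c in zip(x, cpvi)], county)
    yd = demean(y, county)
    # Solve the 2x2 least-squares normal equations with Cramer's rule.
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, yd))
    s2y = sum(b * c for b, c in zip(x2, yd))
    det = s11 * s22 - s12 * s12
    return (s1y * s22 - s2y * s12) / det, (s2y * s11 - s1y * s12) / det

# Hypothetical panel: 3 counties x 4 days, each county with its own fixed effect.
county = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
cpvi = [1.0] * 4 + [2.0] * 4 + [3.0] * 4
alpha = [5.0] * 4 + [-3.0] * 4 + [10.0] * 4   # time-invariant county effects
x = [0.6, 0.8, 1.0, 0.9, 0.5, 0.7, 0.6, 1.1, 0.4, 0.9, 0.7, 0.8]
y = [0.1 * a + 0.05 * a * c + e for a, c, e in zip(x, cpvi, alpha)]

b1, b4 = within_estimate(county, x, cpvi, y)
```

Although the county effects in `alpha` differ widely, the estimates recover the coefficients used to generate the data (0.1 and 0.05), because anything constant within a county disappears once the county mean is subtracted.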
Results
COVID-19 pandemic vulnerability index
Tables 1 and S1 list the data collected to build the CPVI. The data cover the demographics, social economy, disease, natural environment, and medical security of the counties in the US. After conducting PCA (see the details in the “Methods” section), four meaningful components explaining over 70% of the total variance in the dataset were derived (Table 1). Given the large deviations in the vulnerability variables of some counties, the variables were log-transformed and then standardized by subtracting the mean and dividing by the standard deviation. Positive directionalities were assigned to the loadings of principal components (PCs) that were observed to aggravate the COVID-19 pandemic, whereas negative directionalities were assigned to the loadings of PCs that were observed to mitigate it.
The PCA uncovered four components with eigenvalues >1.0. PC1 explains 38.58% of the total variance and consists of unemployment, poverty, ethnicity, being an elderly individual, lower education attainment, language, and groups. As a result, PC1 is positively correlated with all seven variables. PC2, which explains 16.63% of the total variance in the dataset, is a component that is positively correlated with air pollution, hypertension, coronary heart disease (CHD), diabetes, and smoking but negatively correlated with green exposure. Accounting for 11.79% of the total variance in the dataset, PC3 is negatively correlated with income and positively correlated with insurance. Finally, PC4, which explains 8.30% of the total variance in the dataset, is negatively correlated with isolation rooms and hospital beds.
The CPVI of each county was calculated by weighting based on the variance ratio of each PC. Then, the CPVI values were divided into five vulnerability levels by quintile. The selected set of counties and their corresponding vulnerability levels are displayed in Fig. 1. This map captures the counties with the highest vulnerability levels for a variety of reasons: high unemployment rates and poverty rates (e.g., in Mississippi and South Carolina), the counties with the most minorities and the worst air quality (e.g., in California), the counties with more elderly people (e.g., in Florida, North Carolina and Alabama), the counties with more chronic diseases (e.g., in New York), and the counties with poor air quality and medical resource shortages (e.g., in the Great Lakes coastal region). In contrast, in the counties with lower air pollution and a younger age structure, such as those in the Central Plain, the CPVI is lower. This pattern is consistent with the COVID-19 Pandemic Vulnerability Index Dashboard Map (date: 01/07/2020) released by the National Institute of Environmental Health Sciences.
The distribution of the CPVI is not identical to any of the characteristics included. For example, counties in California have a higher personal income and level of education than the US average; thus, they should be considered to have the lowest vulnerability if sorted solely by socioeconomic characteristics. However, they are identified as having a vulnerability level of 5, which is the highest vulnerability level. Similarly, Florida is equipped with good medical resources but still has many counties that are identified as having a vulnerability level of 4 or 5. This finding shows that the CPVI does not sort counties based solely on one dimension, such as socioeconomic or pandemic factors, and is thus able to capture the combination of dimensions.
Variation in COVID-19 transmissibility across different vulnerability levels
In this study, the effective reproduction number (Rt) represents the daily transmissibility of COVID-19 (see more details about how to derive it in the “Methods” section). Figure 2 shows the Rt of each county on 4 selected days (June 1, July 1, August 1, and August 31) between June 1 and August 31, 2020. On June 1, 2020, 310 counties had an Rt ≥ 1 (highly transmissible), representing 36.85% of the counties under study (i.e., those for which the Rt can be derived). Of these 4 days, July 1, 2020, had the largest number of counties with an Rt ≥ 1 (i.e., 559 counties); this total is more than 50% of the counties under study. The distribution of the hardest-hit counties on June 1 and July 1, 2020, coincided with that of the counties with high CPVI values (Fig. 2A and B) and was concentrated in the coastal areas of California, Florida, and some east-central states. The numbers of counties with high transmissibility on August 1 and August 31 were similar (227 and 271, respectively), as were their distributions. These counties were mainly located in the central-east states, with fewer in the west (Fig. 2C and D).
We calculated the Rt for 102,856 county days (1118 counties × 92 days) from June 1 to August 31, 2020. A county day is defined as a highly transmissible day if its Rt is ≥1 on that day. To explore transmissibility in relation to vulnerability during the study period, we counted the total number of highly transmissible days at different vulnerability levels. Table 2 shows the percentage of highly transmissible days among all county days in the counties under study. When aggregated at the monthly level, the percentage of highly transmissible days increased from June to July and then decreased from July to August. This sequence of changes is relatively close to the overall US pandemic trend (Fig. S1). In each month and overall, the percentage of highly transmissible days increased as the vulnerability level rose. In particular, the percentage of highly transmissible days at level 5 remained above 50% in both June and July (53.66% in June and 58.72% in July).
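The tabulation behind these percentages can be sketched in Python as follows. The Rt values and the function name `share_high_days` are hypothetical; only the threshold (Rt ≥ 1) and the county-day count come from the text.

```python
def share_high_days(rt_by_level, threshold=1.0):
    """Percentage of county days with Rt >= threshold, per vulnerability level."""
    return {
        level: 100.0 * sum(rt >= threshold for rt in values) / len(values)
        for level, values in rt_by_level.items()
    }

# Total county days in the study: 1118 counties over the 92 days of June-August.
county_days = 1118 * 92

# Hypothetical daily Rt values pooled by vulnerability level.
rt_by_level = {1: [0.8, 0.9, 1.1, 0.7], 5: [1.2, 1.0, 0.9, 1.3]}
shares = share_high_days(rt_by_level)
```

In this toy input, one of four county days at level 1 and three of four at level 5 count as highly transmissible, which mirrors the direction of the pattern reported in Table 2.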
Figure 3 shows a comparison of the number of highly transmissible days at each vulnerability level from June 1 to August 31. COVID-19 transmissibility was positively correlated with the vulnerability level. In addition, the day-to-day fluctuation in the number of highly transmissible county days grew as the vulnerability level increased.
Variations in mobility recovery across different vulnerability levels
We differentiated intracounty mobility (IntraM) from intercounty mobility (InterM) and plotted the average values of these two indices for each vulnerability level (Fig. 4). The indices displayed a very consistent pattern in which the values decreased and then recovered. IntraM fell from its prelockdown level to its lowest point on March 21. During the lockdown period and after reopening, there was a consistent inverse relationship between IntraM and the vulnerability level; i.e., the higher the vulnerability level was, the lower the IntraM was (Fig. 4A). The size of the drop in IntraM differed across vulnerability levels. The group of counties with the lowest level of vulnerability showed a decrease of 24.94% relative to their baselines, whereas the decrease was 32.98% for the group with the highest level of vulnerability; thus, the latter decreased by 8.04 percentage points more than the former.
In comparison, InterM fell from its prelockdown level to its lowest point on April 12. The group of counties with the lowest vulnerability showed a decrease of 62.72%, whereas the group with the highest vulnerability showed a decrease of 66.36%. The latter thus decreased by 3.64 percentage points more than the former. Another interesting aspect is that before the lockdown period, InterM was higher in more vulnerable groups of counties than in less vulnerable groups; however, during and after the lockdown period, the opposite trend was observed (Fig. 4B).
In this paper, mobility (including both IntraM and InterM) is defined as a ratio relative to the average mobility during the normal period in the US, January 1 to 21, 2020 (see more details in the “Methods” section). A higher (lower) mobility during and after the lockdown period indicates a higher (lower) recovery of mobility. With this notion in mind, the empirical interpretation of the relationship between vulnerability and mobility is as follows: even though the average IntraM (and InterM) of all 5 vulnerability levels fluctuate in a similar pattern, the higher the vulnerability level is, the lower the recovery level of mobility is.
The effect of mobility on Rt under varying vulnerability levels
To examine the relationship between mobility and COVID-19 transmissibility in counties with different levels of vulnerability, we calculated the correlations of IntraM and InterM with high transmissibility days for all counties together and for counties in separate vulnerability groups. For all counties, the correlations of IntraM and InterM with high transmissibility days were both moderate (correlation coefficients: 0.53 and 0.55, respectively; both p < 0.05). However, as shown in Fig. 5, these associations were dramatically weakened (0.17 and 0.37; both p < 0.05) when only counties with a low level of vulnerability were considered. Furthermore, the correlation coefficient between IntraM and high transmissibility days gradually increased with increasing vulnerability level (0.17–0.68, p < 0.05). In contrast, the correlation coefficient between InterM and high transmissibility days increased from vulnerability level 1 to vulnerability level 2 and then fluctuated around 0.6 (0.54–0.63, p < 0.05) from vulnerability level 3 to vulnerability level 5.
After exploring the associations above, a fixed effect model (Eq. (14)) was employed to estimate the relationship between mobility and Rt for each county. The coefficients of IntraM and InterM were significantly positive for all vulnerability levels (p < 0.05) (Table 3), implying that an increase in either InterM or IntraM would increase the Rt. In other words, both IntraM and InterM accelerated the transmission of COVID-19 in the US during the study period. In addition, in the counties with vulnerability levels between 1 and 4, the coefficients of IntraM and InterM were closer to each other than in the counties at vulnerability level 5, where the effect of IntraM was almost 2 times larger than that of InterM.
Furthermore, we examined whether the effects of mobility on Rt are significantly heterogeneous across different vulnerability levels. When designating the effect of IntraM in counties at vulnerability level 1 as a reference, the IntraM coefficients in counties at the other 4 levels were significantly larger (p < 0.05) (Table 3). Figure 6 shows the effects of IntraM and InterM on Rt over different vulnerability levels (i.e., 1 to 5 from left to right). Using the group of counties at the highest vulnerability level as a reference, we observed that the effects of IntraM in counties at the other four levels were smaller. In other words, a 25% change in IntraM was associated with a 1.81% (interquartile range (IQR): 0.11–3.51%) change in Rt for the lowest vulnerability level (level 1) but a 15.28% (IQR: 13.35–17.21%) change in Rt for the highest vulnerability level (level 5), which is eight times larger than the former (Table 4). However, for the counties at vulnerability levels 2–4, the changes in the effect of IntraM were marginal. A similar pattern was observed for InterM. The magnitude of the coefficients of InterM increased from 0.09 to 0.28 for counties with a vulnerability level from 1 to 4, with a slight decrease from 0.28 for level 4 to 0.22 for level 5 (Table 3). Additionally, a 25% change in InterM resulted in a 2.31% (IQR: 1.83–2.79%) change in the Rt for level 1 but a 6.98% (IQR: 6.23–7.72%) change in the Rt for level 4.
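As a rough consistency check on these figures, the worked arithmetic below assumes that the reported percentage change in Rt is obtained by scaling the corresponding Table 3 coefficient \(\beta\) linearly with the mobility change \(\Delta M\). This conversion is an assumption, as the text does not state it explicitly, and the rounded coefficients reproduce the reported values only approximately.

```latex
\[
\Delta R_t \approx \beta \times \Delta M: \qquad
0.09 \times 25\% = 2.25\% \;(\text{reported: } 2.31\%), \qquad
0.28 \times 25\% = 7.00\% \;(\text{reported: } 6.98\%)
\]
\[
\frac{15.28\%}{1.81\%} \approx 8.44
\]
```

The second line shows that the level 5 change in Rt for IntraM is a little more than eight times the level 1 change.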
However, the F-test results revealed distinct patterns for IntraM and InterM (Table 5). For IntraM, the differences between the coefficients of levels 2–4 were not statistically significant, but the coefficients of level 1 and level 5 were each significantly different from those of the other three levels (p < 0.05). This result indicates that the IntraM coefficients across the five vulnerability levels fall into three groups: level 1, levels 2–4, and level 5. The differences between the three groups were significant, whereas the coefficients within the only multi-level group (levels 2–4) did not differ significantly from one another. For InterM, the difference between the coefficients of level 1 and level 2 was marginal, but both were significantly different from those of the other three levels (p < 0.05). That is, the InterM coefficients fall into two groups, levels 1–2 and levels 3–5. Unlike for IntraM, the coefficients within the second group followed an inverted U shape: a slight but significant increase in the InterM coefficient from level 3 to level 4, followed by a significant decrease from level 4 to level 5.
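A pairwise comparison of this kind can be sketched as a restricted versus unrestricted regression F-test. The example below is a simplified illustration on synthetic data, not the procedure behind Table 5: it uses level-specific intercepts in place of county fixed effects, a single mobility variable, and made-up slopes chosen to mimic a three-group pattern. The helper names `rss` and `f_stat_equal_slopes` are hypothetical.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def f_stat_equal_slopes(y, m, level, a, b):
    """F statistic for H0: the slope on m is the same in levels a and b."""
    levels = np.unique(level)
    dummies = (level[:, None] == levels[None, :]).astype(float)
    slopes = dummies * m[:, None]               # one slope column per level
    X_u = np.hstack([dummies, slopes])          # unrestricted model
    ia = int(np.flatnonzero(levels == a)[0])
    ib = int(np.flatnonzero(levels == b)[0])
    keep = [j for j in range(len(levels)) if j not in (ia, ib)]
    merged = slopes[:, ia] + slopes[:, ib]      # levels a and b share one slope
    X_r = np.hstack([dummies, slopes[:, keep], merged[:, None]])
    rss_u, rss_r = rss(y, X_u), rss(y, X_r)
    df_resid = len(y) - X_u.shape[1]
    return (rss_r - rss_u) / (rss_u / df_resid)  # one restriction

# Synthetic data with made-up slopes per vulnerability level.
rng = np.random.default_rng(2)
level = np.repeat(np.arange(1, 6), 400)
slope_by_level = np.array([0.1, 0.4, 0.4, 0.4, 0.8])
m = rng.normal(0.0, 0.2, level.size)
y = 1.0 + slope_by_level[level - 1] * m + rng.normal(0.0, 0.05, level.size)

f_1_vs_5 = f_stat_equal_slopes(y, m, level, 1, 5)
f_2_vs_3 = f_stat_equal_slopes(y, m, level, 2, 3)
print(f"F(level 1 vs 5) = {f_1_vs_5:.1f}, F(level 2 vs 3) = {f_2_vs_3:.2f}")
```

With one restriction and a large residual degree of freedom, the 5% critical value is about 3.84, so a statistic far above that value indicates slopes that differ, while a small statistic gives no evidence against equal slopes.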
Discussion and conclusion
The COVID-19 global pandemic has severely affected economies and societies worldwide; in this regard, population mobility has been identified as a key factor in shaping transmission dynamics. However, understanding how mobility affects COVID-19 transmissibility in different regions tends to be complex. For this reason, we examined the heterogeneity of the effects of mobility on COVID-19 transmissibility.
We developed a comprehensive CPVI for US counties that includes socioeconomic, natural environment, and health care coverage dimensions. Through our modelling, we found a positive association between the CPVI and the high transmissibility days in US counties. This result suggests that our CPVI captures the intrinsic vulnerability of people who tend to be hit hardest by COVID-19 and therefore bear the greatest burden of the pandemic.
We also assessed the joint effects of the CPVI and population mobility (both IntraM and InterM) on COVID-19 transmissibility (Rt). Mobility was positively related to real-time changes in Rt at all vulnerability levels but to varying degrees. Overall, the more vulnerable a county is, the stronger the influence of mobility; this heterogeneity applies to both IntraM and InterM. The results suggest that uniform mobility-restriction policies may be insufficiently effective, thus calling for “customized” policies based on the vulnerability and mobility characteristics of counties.
We noted that the effect of IntraM on the transmissibility of COVID-19 is greater than that of InterM in the group of most vulnerable counties, which points to an urgent need to implement mobility reduction policies within these counties. The difference in the effects of IntraM versus InterM may arise because within-county contact, which is mostly at the household level, is much more frequent than intercounty contact. We also observed different patterns in how the mobility effect on transmissibility changes across levels, which could point to different policies. For example, priority can be given to reopening areas with a lower vulnerability level (such as level 1 and level 2) and subsequently reopening other areas in succession based on the level of vulnerability, depending on the development of the pandemic. Social distancing policies that target reducing the level of social contact within a county could be stricter in areas that have a vulnerability level of 2–4, and more demanding social distancing policies and better support could be advised for areas with a vulnerability level of 5.
Additionally, we found a change in the relationship between the vulnerability level and mobility before and after control measures were launched in the US. Specifically, before the lockdown period, the mobilities of counties in all vulnerability level groups were quite similar; however, after lockdown policies were launched, the groups with higher vulnerability levels tended to have lower mobility. This finding is consistent with current policies, suggesting that the policies efficiently reduced the mobility of the most vulnerable groups of people. However, it seems to contradict a previous finding that state lockdown policies had a larger effect on reducing the mobility of income-advantaged groups than that of other groups (Jay et al., 2020). That study used the proportion of days staying at home or at workplaces all day as the measurement of absolute mobility, and we successfully reproduced patterns consistent with it using our mobility dataset in terms of household income (Fig. S2). Thus, the disparity arises because our study measures relative mobility (compared to a baseline time period) rather than absolute mobility and uses the CPVI rather than household income. Combining the work of Jay et al. with our own gives a more complete picture: even though the lockdown policy reduced the absolute mobility of the income-advantaged group (who should be regarded as the least vulnerable group) the most, it disproportionately affected the most vulnerable group by restricting their nonwork mobility when the reduction is measured in relative terms. This result also calls attention to the importance of rigorously selecting and precisely measuring the dependent variables in policy evaluation.
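The distinction between absolute and relative mobility can be made concrete with a small numerical sketch. The trip counts below are entirely hypothetical and are not taken from this study or from Jay et al.; they only show how one group can have the larger absolute reduction while another has the larger reduction relative to its own baseline.

```python
def reductions(baseline, lockdown):
    """Return (absolute, relative) mobility reduction against a baseline."""
    absolute = baseline - lockdown
    relative = absolute / baseline
    return absolute, relative

# Hypothetical trips per person per day (illustrative values only).
advantaged = reductions(baseline=10.0, lockdown=6.0)
vulnerable = reductions(baseline=5.0, lockdown=2.5)

print(f"advantaged: absolute {advantaged[0]:.1f}, relative {advantaged[1]:.0%}")
print(f"vulnerable: absolute {vulnerable[0]:.1f}, relative {vulnerable[1]:.0%}")
```

In this made-up case the advantaged group loses more trips in absolute terms, but the vulnerable group loses a larger share of its baseline mobility, so the ranking of the two groups depends on which measure is used.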
Compared with studies that focus solely on one dimension (e.g., income or unemployment only), our study established a comprehensive measurement of vulnerability to the COVID-19 pandemic. We proposed a new composite indicator, namely, CPVI, which emphasizes the COVID-19 risk of socially vulnerable groups by considering the socioeconomic, natural environment, and epidemiological factors. The empirical results suggest that the CPVI captures extremely vulnerable counties in all dimensions included, which provides unique advantages and insights that previous similar studies are unable to capture.
Our study was conducted under several data limitations and assumptions. First, 1118 counties were included in this study. Although they account for only 35% of all counties in the United States, these counties include 78% of the US population and more than 82% of the cumulative confirmed cases, and thus broadly reflect the status of the COVID-19 pandemic in the United States. Second, although fixed effect models were used to minimize the error caused by time-invariant confounders, time-variant confounders such as temperature and people’s knowledge of COVID-19 were not included in the models. We therefore also tested another fixed effect model that incorporated time fixed effects in addition to individual fixed effects, on the supposition that the effects of factors that vary over time (but are consistent across counties) would be absorbed by the time factor. The results of the time and individual fixed effect models (Tables S2, S3, and Fig. S3) are consistent with the results from our original fixed effect model, even though the coefficient magnitudes of IntraM and InterM at different vulnerability levels change. Third, the incubation period was set as 0 in this study. Given that the mean incubation period of COVID-19 is estimated to be approximately 5 days (Lauer et al., 2020), we tested whether our model is robust to the incubation period by running it again with a 5-day incubation length. The results (Tables S4, S5, and Fig. S4) show that the interaction effect of the CPVI and mobility is consistent with and robust to different lengths of the incubation period. Fourth, there might be other factors affecting the relationship between mobility and COVID-19 transmissibility, such as the disease-preventive behaviour adopted by people in social contact. The population’s disease-preventive behaviour, such as avoiding visiting crowded places, directly reduces mobility, and wearing a face mask has also been observed to be effective in reducing transmissibility (Cowling et al., 2020). Disease-preventive behaviour was therefore treated as time-invariant and removed from the calculation process, under the assumption that such behaviour was more constant during the study period than during the outbreak and lockdown period (March–June 2020). Fifth, composing our CPVI with traditional PCA assumes that the index is a linear combination of features, whereas the relations among those features could be more complex; kernel PCA may be explored in this regard. Finally, this study was conducted in the absence of vaccination, so the results may change when more effective assays are subsequently found and when vaccines are administered and distributed widely. Nevertheless, the method developed in this study can still be applied, although vaccination may change the mobility and transmissibility numbers. Our study also suggests that the distribution of vaccines should assign priority based not only on the age structure but also on other vulnerability factors that tend to be overlooked, such as counties with poor air quality, little greenery, and/or inadequate medical facilities.
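The two robustness checks described above (adding time fixed effects and shifting mobility by an assumed incubation lag) can be sketched together for a balanced county-by-day panel. This is an illustrative sketch on synthetic data with a single mobility variable, not the supplementary analysis itself; the function names, the 5-day lag used to generate the data, and the slope of 0.5 are assumptions.

```python
import numpy as np

def two_way_within(a):
    """Remove county (row) and day (column) means from a balanced panel."""
    return (a - a.mean(axis=1, keepdims=True)
              - a.mean(axis=0, keepdims=True) + a.mean())

def two_way_fe_slope(rt, mobility, lag=0):
    """Slope of Rt on mobility `lag` days earlier, with county and day fixed effects."""
    n_days = rt.shape[1]
    y = two_way_within(rt[:, lag:])                  # Rt on days lag..end
    x = two_way_within(mobility[:, : n_days - lag])  # mobility `lag` days before
    return float((x * y).sum() / (x ** 2).sum())

# Synthetic balanced panel (hypothetical values, not data from this study).
rng = np.random.default_rng(1)
n_counties, n_days, true_lag, true_beta = 40, 60, 5, 0.5
county_fx = rng.normal(0.0, 0.3, (n_counties, 1))    # time-invariant county effects
day_fx = rng.normal(0.0, 0.3, (1, n_days))           # shocks common to all counties
mobility = rng.normal(0.0, 0.2, (n_counties, n_days))
rt = 1.0 + county_fx + day_fx + rng.normal(0.0, 0.05, (n_counties, n_days))
rt[:, true_lag:] += true_beta * mobility[:, : n_days - true_lag]

slope_lag5 = two_way_fe_slope(rt, mobility, lag=5)
slope_lag0 = two_way_fe_slope(rt, mobility, lag=0)
print(f"lag 5: {slope_lag5:.3f}, lag 0: {slope_lag0:.3f}")
```

In this synthetic setup the effect only appears at the lag used to generate the data, because the simulated mobility is independent from day to day. Real mobility series are strongly autocorrelated, which is one reason estimates at different lags can remain similar.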
These limitations notwithstanding, our study provides policymakers with a more effective way to explain the transmission of COVID-19, design prevention policies, and carry out additional community-supporting practices. In a paradoxical situation where state governments implement embargo policies that harm regional socioeconomic development and where the blind choice to reopen may exacerbate the spread of COVID-19, our findings should help policymakers develop a policy of batched or gradual reopening.
Data availability
Datasets of confirmed cases of COVID-19 are available at https://usafacts.org/issues/coronavirus/. Population mobility data were obtained from https://github.com/GeoDS/COVID19USFlows, and county attribute data from several organizations are specified in the Methods “County attribute data”. The code for this study is available at https://github.com/huangzhihui-421/Social-vulnerability-amplifies-the-disparate-impact-of-mobility-on-COVID-19-transmissibility-across-.
References
Acharya R, Porwal A (2020) A vulnerability index for the management of and response to the COVID-19 epidemic in India: an ecological study. Lancet Global Health 8(9):e1142–e1151
Badr HS, Du H, Marshall M, Dong E, Squire MM, Gardner LM (2020) Association between mobility patterns and COVID-19 transmission in the USA: a mathematical modelling study. Lancet Infect Dis 20(11):1247–1254
Baron YM (2020) Incidence and case-fatality ratio of COVID-19 infection in relation to tobacco smoking, population density and age demographics in the USA: could particulate matter derived from tobacco smoking act as a vector for COVID-19 transmission? Preprint at medRxiv https://doi.org/10.1101/2020.10.04.20206383
Bennett M (2021) All things equal? Heterogeneity in policy effectiveness against COVID-19 spread in Chile. World Dev 137:105208
Bettencourt LM, Ribeiro RM (2008) Real time Bayesian estimation of the epidemic potential of emerging infectious diseases. PLoS ONE 3(5):e2185
Bonaccorsi G, Pierri F, Cinelli M, Flori A, Galeazzi A, Porcelli F, Schmidt AL, Valensise CM, Scala A, Quattrociocchi W (2020) Economic and social consequences of human mobility restrictions under COVID-19. Proc Natl Acad Sci USA 117(27):15530–15535
Bro R, Smilde AK (2014) Principal component analysis. Anal Methods 6(9):2812–2831
Buckee C, Noor A, Sattenspiel L (2021) Thinking clearly about social aspects of infectious disease transmission. Nature 595(7866):205–213
Buckee CO, Balsari S, Chan J, Crosas M, Dominici F, Gasser U, Grad YH, Grenfell B, Halloran ME, Kraemer MU (2020) Aggregated mobility data could help fight COVID-19. Science 368(6487):145–146
Cahill G, Kutac C, Rider NL (2021) Visualizing and assessing US county-level COVID19 vulnerability. Am J Infect Control 49(6):678–684
Cauchemez S, Boëlle P-Y, Donnelly CA, Ferguson NM, Thomas G, Leung GM, Hedley AJ, Anderson RM, Valleron A-J (2006) Real-time estimates in early detection of SARS. Emerg Infect Dis 12(1):110
CDC (2020, 2018) CDC SVI 2018 documentation. https://svi.cdc.gov/Documents/Data/2018_SVI_Data/SVI2018Documentation-508.pdf
Chang S, Pierson E, Koh PW, Gerardin J, Redbird B, Grusky D, Leskovec J (2021) Mobility network models of COVID-19 explain inequities and inform reopening. Nature 589(7840):82–87
Chin T, Kahn R, Li R, Chen JT, Krieger N, Buckee CO, Balsari S, Kiang MV (2020) US county-level characteristics to inform equitable COVID-19 response. Preprint at medRxiv https://doi.org/10.1101/2020.04.08.20058248
Coelho FC, Lana RM, Cruz OG, Villela DA, Bastos LS, Pastore y Piontti A, Davis JT, Vespignani A, Codeço CT, Gomes MF (2020) Assessing the spread of COVID-19 in Brazil: Mobility, morbidity and social vulnerability. PLoS ONE 15(9):e0238214
Cowling BJ, Ali ST, Ng TW, Tsang TK, Li JC, Fong MW, Liao Q, Kwan MY, Lee SL, Chiu SS (2020) Impact assessment of non-pharmaceutical interventions against coronavirus disease 2019 and influenza in Hong Kong: an observational study. The Lancet Public Health 5(5):e279–e288
Cutter SL (2003) The vulnerability of science and the science of vulnerability. Ann Assoc Am Geogr 93(1):1–12
Daras K, Alexiou A, Rose TC, Buchan I, Taylor-Robinson D, Barr B (2021) How does vulnerability to COVID-19 vary between communities in England? Developing a small area vulnerability index (SAVI). J Epidemiol Community Health 75(8):729–734
Eligon J, Burch AD, Searcey D, Oppel RA Jr (2020) Black Americans face alarming rates of coronavirus infection in some states. The New York Times 8 April: A1. https://www.nytimes.com/2020/04/07/us/coronavirus-race.html
Fauver JR, Petrone ME, Hodcroft EB, Shioda K, Ehrlich HY, Watts AG, Vogels CB, Brito AF, Alpert T, Muyombwe A (2020) Coast-to-coast spread of SARS-CoV-2 during the early epidemic in the United States. Cell 181(5):990–996. e995
Gao S, Rao J, Kang Y, Liang Y, Kruse J, Dopfer D, Sethi AK, Reyes JFM, Yandell BS, Patz JA (2020) Association of mobile phone location data indications of travel and stay-at-home mandates with COVID-19 infection rates in the US. JAMA Netw Open 3(9):e2020485
Gozzi N, Tizzoni M, Chinazzi M, Ferres L, Vespignani A, Perra N (2021) Estimating the effect of social inequalities on the mitigation of COVID-19 across communities in Santiago de Chile. Nat Commun 12(1):1–9
Gu T, Chu Q, Yu Z, Fa B, Li A, Xu L, Wu R, He Y (2020) History of coronary heart disease increases the mortality rate of coronavirus disease 2019 (COVID-19) patients: a nested case-control study based on publicly reported confirmed cases in Mainland China. BMJ Open 10(9):e038976. https://doi.org/10.1136/bmjopen-2020-038976
Hale T, Angrist N, Goldszmidt R, Kira B, Petherick A, Phillips T, Webster S, Cameron-Blake E, Hallas L, Majumdar S (2021) A global panel database of pandemic policies (Oxford COVID-19 Government Response Tracker). Nat Hum Behav 5(4):529–538
Halko N, Martinsson P-G, Shkolnisky Y, Tygert M (2011) An algorithm for the principal component analysis of large data sets. SIAM J Sci Comput 33(5):2580–2594
Hou X, Gao S, Li Q, Kang Y, Chen N, Chen K, Rao J, Ellenberg JS, Patz JA (2021) Intracounty modeling of COVID-19 infection with human mobility: assessing spatial heterogeneity with business traffic, age, and race. Proc Natl Acad Sci USA 118(24)
Huang B, Wang J, Cai J, Yao S, Chan PKS, Tam TH-W, Hong Y-Y, Ruktanonchai CW, Carioli A, Floyd JR (2021) Integrated vaccination and physical distancing interventions to prevent future COVID-19 waves in Chinese cities. Nat Hum Behav 5(6):695–705
Jay J, Bor J, Nsoesie EO, Lipson SK, Jones DK, Galea S, Raifman J (2020) Neighbourhood income and physical distancing during the COVID-19 pandemic in the United States. Nat Hum Behav 4(12):1294–1302
Kang Y, Gao S, Liang Y, Li M, Rao J, Kruse J (2020) Multiscale dynamic human mobility flow dataset in the US during the COVID-19 epidemic. Sci Data 7(1):1–13
Kim SJ, Bostwick W (2020) Social vulnerability and racial inequality in COVID-19 deaths in Chicago. Health Educ Behav 47(4):509–513
Klompmaker JO, Hart JE, Holland I, Sabath MB, Wu X, Laden F, Dominici F, James P (2021) County-level exposures to greenness and associations with COVID-19 incidence and mortality in the United States. Environ Res 199:111331
Kraemer MU, Yang C-H, Gutierrez B, Wu C-H, Klein B, Pigott DM, Open COVID-19 Data Working Group, Du Plessis L, Faria NR, Li R (2020) The effect of human mobility and control measures on the COVID-19 epidemic in China. Science 368(6490):493–497
Lauer SA, Grantz KH, Bi Q, Jones FK, Zheng Q, Meredith HR, Azman AS, Reich NG, Lessler J (2020) The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application. Ann Intern Med 172(9):577–582
Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, Ren R, Leung KS, Lau EH, Wong JY (2020) Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia. New Engl J Med 382:1199–1207
Macharia PM, Joseph NK, Okiro EA (2020) A vulnerability index for COVID-19: spatial analysis at the subnational level in Kenya. BMJ Global Health 5(8):e003014
Marvel SW, House JS, Wheeler M, Song K, Zhou Y-H, Wright FA, Chiu WA, Rusyn I, Motsinger-Reif A, Reif DM (2021) The COVID-19 Pandemic Vulnerability Index (PVI) Dashboard: monitoring county-level vulnerability using visualization, statistical modeling, and machine learning. Environ Health Perspect 129(1):017701
Nouvellet P, Bhatia S, Cori A, Ainslie KE, Baguelin M, Bhatt S, Boonyasiri A, Brazeau NF, Cattarino L, Cooper LV (2021) Reduction in mobility and COVID-19 transmission. Nat Commun 12(1):1–9
Onder G, Rezza G, Brusaferro S (2020) Case-fatality rate and characteristics of patients dying in relation to COVID-19 in Italy. JAMA 323(18):1775–1776
Qiao ML, Huang B (2022) Assessment of community vulnerability during the COVID-19 pandemic: Hong Kong as a case study. Int J Appl Earth Obs Geoinf 113:103007
Rubin D, Huang J, Fisher BT, Gasparrini A, Tam V, Song L, Wang X, Kaufman J, Fitzpatrick K, Jain A (2020) Association of social distancing, population density, and temperature with the instantaneous reproduction number of SARS-CoV-2 in counties across the United States. JAMA Netw Open 3(7):e2016099
Rufat S, Tate E, Burton CG, Maroof AS (2015) Social vulnerability to floods: Review of case studies and implications for measurement. Int J Disaster Risk Reduct 14:470–486
Sarkar A, Chouhan P (2021) COVID-19: district level vulnerability assessment in India. Clin Epidemiol Global Health 9:204–215
Smith C, Miller B, Taylor P (2020) Reopening the economy under Covid-19: States plot a way back. Published May 22, 2020. Available at: https://www.governing.com/now/Reopening-the-Economy-Under-COVID-19-States-Plot-a-Way-Back.html
Snyder BF, Parks V (2020) Spatial variation in socio-ecological vulnerability to Covid-19 in the contiguous United States. Health Place 66:102471
Spielman SE, Tuccillo J, Folch DC, Schweikert A, Davies R, Wood N, Tate E (2020) Evaluating social vulnerability indicators: criteria and their application to the Social Vulnerability Index. Nat Hazards 100(1):417–436
Tahmasebi P, Shokri-Kuehni SM, Sahimi M, Shokri N (2020) How do environmental, economic and health factors influence regional vulnerability to COVID-19? Preprint at medRxiv https://doi.org/10.1101/2020.04.09.20059659
Tian H, Liu Y, Li Y, Wu C-H, Chen B, Kraemer MU, Li B, Cai J, Xu B, Yang Q (2020) An investigation of transmission control measures during the first 50 days of the COVID-19 epidemic in China. Science 368(6491):638–642
Tiwari A, Dadhania AV, Ragunathrao VAB, Oliveira ER (2021) Using machine learning to develop a novel COVID-19 Vulnerability Index (C19VI). Sci Total Environ 773:145650
United Nations (UN) (2020) UN Secretary-General’s policy brief: The impact of COVID-19 on women. Available at https://www.unwomen.org/en/digital-library/publications/2020/04/policy-brief-the-impact-of-covid-19-on-women
Welsh CE, Sinclair DR, Matthews FE (2022) Static socio-ecological COVID-19 vulnerability index and vaccine hesitancy index for England. Lancet Reg Health-Eur 14:100296
Xu S, Li Y (2020) Beware of the second wave of COVID-19. Lancet 395(10233):1321–1322
Yan H, Valdes AM, Vijay A, Wang S, Liang L, Yang S, Wang H, Tan X, Du J, Jin S (2020) Role of drugs used for chronic disease management on susceptibility and severity of COVID‐19: a large case‐control study. Clin Pharmacol Ther 108(6):1185–1194
Zhou Y, Xu R, Hu D, Yue Y, Li Q, Xia J (2020) Effects of human mobility restrictions on the spread of COVID-19 in Shenzhen, China: a modelling study using mobile phone data. Lancet Digit Health 2(8):e417–e424
Zoran MA, Savastru RS, Savastru DM, Tautan MN (2020) Assessing the relationship between surface levels of PM2.5 and PM10 particulate matter impact on COVID-19 in Milan, Italy. Sci Total Environ 738:139825
Acknowledgements
This work was supported by the Collaborative Research Fund of the Hong Kong Research Grants Council (C4139-20GF).
Author information
Contributions
BH and ZH designed the research. ZH, BH, and CC built and ran the models and carried out analyses. ZH collected and compiled the data. BH, ZH, and CC wrote the manuscript. JL, TT, YH, and SP interpreted the findings and commented on and revised drafts of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Ethical approval
Ethical clearance for collecting and using secondary data in this study was granted by the Institutional Review Board of the Chinese University of Hong Kong (No. C4139-20G). All data included in models were supplied and analysed in an anonymous and aggregated format, without access to personal identifying information.
Informed consent
This article does not contain any studies with human participants performed by any of the authors.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Huang, B., Huang, Z., Chen, C. et al. Social vulnerability amplifies the disparate impact of mobility on COVID-19 transmissibility across the United States. Humanit Soc Sci Commun 9, 415 (2022). https://doi.org/10.1057/s41599-022-01437-5
DOI: https://doi.org/10.1057/s41599-022-01437-5