Abstract
Background
Environmental health disparity research involves the use of metrics to assess exposure to community-level vulnerabilities or inequities. While numerous vulnerability indices have been developed, there is no agreement on standardization or appropriate use, they have largely been applied in urban areas, and their interpretation and utility likely vary across different geographies.
Objective
We evaluated the spatial distribution, variability, and relationships among different metrics of social vulnerability and isolation across urban and rural settings to inform interpretation and selection of metrics for environmental disparity research.
Methods
For all census tracts in North Carolina, we conducted a principal components analysis using 23 socioeconomic/demographic variables from the 2010 United States Census and American Community Survey. We calculated or obtained the neighborhood deprivation index (NDI), residential racial isolation index (RI), educational isolation index (EI), Gini coefficient, and social vulnerability index (SVI). Statistical analyses included Moran’s I for spatial clustering, t-tests for urban-rural differences, Pearson correlation coefficients, and changes in ranking of tracts across metrics.
Results
Social vulnerability metrics exhibited clear spatial patterning (Moran’s I ≥ 0.30, p < 0.01). Greater educational isolation and more intense neighborhood deprivation were observed in rural areas, and greater racial isolation was observed in urban areas. Single-domain metrics were not highly correlated with each other (rho ≤ 0.36), while composite metrics (i.e., NDI, SVI, principal components analysis) were highly correlated (rho > 0.80). Composite metrics were more highly correlated with the racial isolation metric in urban (rho: 0.54–0.64) versus rural tracts (rho: 0.36–0.48). Census tract rankings changed considerably based on which metric was being applied.
Significance
High correlations between composite metrics within urban and rural tracts suggest they could be used interchangeably; single-domain metrics cannot. Composite metrics capture different facets of vulnerabilities in urban and rural settings, and these complexities should be examined by researchers applying metrics to areas of diverse urban and rural forms.
Introduction
The disproportionate distribution of environmental exposures in communities of lower income or marginalized racial and ethnic groups is well-documented in a voluminous literature [1,2,3,4]. Most of the literature is focused on the urban context, where low-income communities in urban areas are likely exposed to higher ambient concentrations of air pollutants such as ozone and particulate matter [2, 5] or drinking water contaminants, such as lead [6, 7]. Exposure disparities also exist in the rural context, where lower-income communities are more likely to live near environmental hazards like waste facilities and factory farms [8,9,10]. Research has also demonstrated that some populations experience more pronounced health responses in relation to environmental exposures due to interactions between environmental factors and social and structural determinants of health [11, 12].
Environmental health researchers are being called upon to more carefully examine the complex relationships between social vulnerability, race and racism, environmental exposures, and health [13, 14]. Across the environmental justice literature, researchers have quantified exposure to social vulnerability in numerous, inconsistent ways. Approaches have varied over time, and no standard approach or consensus exists. Moreover, comparing and synthesizing results from studies applying social disadvantage metrics across distinct geographies may be implicitly assuming that these indicators capture similar features across distinct types of communities. However, the physical, social, and environmental characteristics of rural communities are different than those in urban communities, where much of the epidemiological and environmental health research is focused [15]. Additionally, socioeconomic position is typically assessed as a “snapshot” measure (i.e., at one time point) and for one aspect of socioeconomic status (SES), such as income or education, whereas current SES is a reflection of multi-generational and/or lifetime SES including that of childhood, financial resources other than traditional income such as family funds, or unusual expenses (e.g., health care expenses) [16]. As socio-cultural systems differ by urbanicity (e.g., greater number of job opportunities in urban areas, higher social cohesion typically in rural areas), the snapshot and single domain proxies of SES may have different implications across types of communities, including those of varying urbanicity. Exposure to community-level socioeconomic factors might be best characterized using different approaches in rural versus urban areas, and across different rural economies (e.g., agricultural vs. non-agricultural) or geographies (e.g., Coastal Plains vs. Mountain). 
Links between socioeconomic and environmental exposures and health outcomes may be obscured or misinterpreted if appropriate measures are not used.
Socioeconomic factors at the community level, such as poverty, education, and wealth, are often correlated and multidimensional. This correlation between factors makes it appealing to use a composite measure to combine multiple socioeconomic variables into one metric of vulnerability [17]. In addition, a composite metric can serve as a proxy for the multi-faceted concept of SES. While many studies take this approach, researchers use different metrics composed of different variables, combined in different ways, and often at different geographic scales. These disparate techniques make it challenging to compare results across studies and geographic areas and may yield inconsistent findings. In addition, composite metrics may not easily translate to specific policy interventions, which could benefit from the identification of priority factors. Furthermore, the availability of multiple, similar metrics makes it challenging to select a metric and to judge whether that choice will affect the research findings.
Additionally, terminology and meaning vary across metrics and over time. For example, the terms “disadvantage,” “deprivation,” and “vulnerability” are sometimes used interchangeably to refer to aspects of social inequity. Concentrated disadvantage [18] and neighborhood deprivation [17] measures are meant to assess a lack of resources in a community, and are typically used as a contextual predictor of poorer health outcomes. Social disadvantage has been defined as “a web of health risks including poverty, substandard housing and building infrastructure, unemployment, the erosion of social capital, and exposure to high levels of violence and crime” [19]. Social vulnerability, on the other hand, characterizes not just a community’s likelihood of harm but also its potential ability to recover from hazards [20, 21]. Other metrics are used to evaluate aspects of systemic racism and discrimination. For example, scholars assess multiple dimensions of racial residential segregation, the separation of different racial and ethnic groups into different neighborhoods with different resources to support health and well-being across the life course [22, 23]. These include the five dimensions originally described by Massey and Denton (1988) (evenness/dissimilarity, exposure/isolation, concentration, centralization, and clustering), and the more recent simplified conceptualization of evenness (similarity in distributions of racial and ethnic groups across areal units) and isolation (the probability of contact with members of a different racial or ethnic group) [24, 25]. The concept of isolation can be extended beyond race and ethnicity to other social domains, such as educational attainment [26]. Furthermore, where socioeconomic disadvantage and residential racial isolation overlap, residents may be especially at risk of both environmental exposures and health risks.
Although epidemiological research is increasingly considering social and structural determinants of health at the community level [14, 27], most existing studies of the influence of community-level factors on health take place in the urban context [15, 28, 29]. While composite metrics often show deep deprivation in urban areas, there is also consistent evidence that living in a rural community can have negative impacts on economic well-being [30]. Rurality is sometimes treated as the absence of urbanity, and as unidimensional, even though there are many regional differences in rural areas [15, 27]. Also, many studies of differences in urbanicity compare urban to semi-urban forms (e.g., more urban to less urban, still all in metropolitan areas); relatively few capture a full range of urban, ex-urban, and rural forms [31]. Rural areas differ dramatically not just in physical characteristics like land use and tree cover, but also in social characteristics like population density, age structure, educational attainment, employment status and type, and racial segregation [32]. Amenities such as grocery stores, libraries, and schools may be harder to access. Moreover, the cost of living is often significantly lower in rural communities, which generates important differences in how poverty is experienced. Environmental hazards tend to be different in rural areas as compared to urban areas. Residential isolation may hamper collective bargaining power and sharing of information about hazards [33, 34]. Therefore, there is unlikely to be a one-size-fits-all solution to assessing the link between socioeconomic conditions, community-level factors, and environmental pollutants.
Variations in how metrics classify areas in terms of the intensity of their vulnerability or isolation could yield different results in terms of differential exposure or health risks, potentially influencing reproducibility of research and policy recommendations. To best characterize the relationship between socioeconomic disadvantage and environmental burden, scholars and stakeholders must understand what their metric is capturing as well as whether it is appropriate for the particular area and research questions they are studying. The lack of a standard approach and understanding of metric performance across geographic settings can pose a design challenge for researchers beginning new studies, contribute to inconsistencies of results in existing studies, and obscure findings, potentially leading to misinterpretation of results. To investigate these issues, our objectives were to apply a selection of social vulnerability and isolation metrics to all census tracts within the state of North Carolina and evaluate the spatial distributions, variability, and relationships of different metrics in urban versus rural settings and across three geographic regions: Coastal Plain, Piedmont, and Mountain. We elected to conduct a single-state analysis rather than a national one, because state-specific differences and within-state heterogeneity could be masked by aggregate statistics [26, 35, 36].
Methods
Study setting
We conducted our study in North Carolina, a southern state in the United States (US) with 33.9% of the population classified as rural, compared to 19.3% of the US, according to the 2010 US Census [37]. The state has a range of urban and rural forms, with corresponding variations in the types of environmental exposures. The state includes three major geographic regions: the Appalachian Mountain region, which is located in the western part of the state; the Piedmont (“foothills”), or middle section of the state, which contains the state’s largest cities; and the Coastal Plain, which is the eastern section of the state, covering about 45% of the state area, and home to farmland (e.g., cotton, soybeans, tobacco) [38]. Areas bordering and within the Appalachian Mountains have historically relied on fossil fuel extraction and mining [39]. Agricultural regions present exposures from pesticide use [40, 41] and intense livestock operations [8, 9]. The state is also home to growing urban areas experiencing rapid land development [42]. Our analysis was conducted at the census tract level (n = 2195 tracts) in North Carolina. We obtained data on the percentage of the population living in areas designated as urban from the 2010 Census and classified each census tract as “urban” if ≥50% of the population was living in an urbanized area or urban cluster, and as “rural” if <50% was. We also classified each census tract as belonging to one of the three major geographical regions of North Carolina.
Social vulnerability and isolation metric selection
For our analysis, we selected the following four metrics that are commonly used in public and environmental health literature: the neighborhood deprivation index (NDI), a residential racial isolation index (RI), the Gini coefficient, and the social vulnerability index (SVI) (Table 1). We also conducted an original principal component analysis (PCA) using 23 census variables and calculated a newly developed educational isolation index (EI) [26] (Table 1). We selected these metrics to cover a range of characteristics, including both single-domain metrics (e.g., income inequality, race) and composite metrics (more than one domain incorporated), those which are publicly accessible and those that require additional programming and processing, and those which incorporate characteristics of adjacent areas (“spatial”, e.g., racial isolation) and those which do not (“aspatial”). We obtained or constructed these six metrics for the census tracts in North Carolina; metrics requiring calculations used variables from the 2010 Census and/or American Community Survey.
Neighborhood deprivation index (NDI)
Census tract-level data from the 2010 US Census were used to calculate the composite NDI [17]. The NDI has been used in other studies examining neighborhood conditions and health outcomes, including preterm birth [43, 44], small for gestational age birth [45], and hypertension during pregnancy [46]. The NDI is created using the first component from a PCA of eight census variables: percent of households in poverty, percent of female-headed households with dependents, percent with annual household income < $30,000, percent of households on public assistance, percent of males in management/professional occupation, percent in crowded housing, percent unemployed, and percent without a high school education.
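As a rough illustration of how an index based on the first principal component can be derived, the sketch below standardizes eight synthetic tract-level variables and scores each tract on the first component. The function name `first_component_scores`, the `orient_col` argument, and the simulated data are assumptions made for illustration only; this is not the published NDI procedure, which may involve additional steps.

```python
import numpy as np

def first_component_scores(X: np.ndarray, orient_col: int = 0) -> np.ndarray:
    """Score each row (tract) on the first principal component of standardized X."""
    # Standardize each variable to mean 0 and unit sample variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[0]
    # The sign of a component is arbitrary; flip it so that the chosen
    # variable (e.g., percent in poverty) loads positively.
    if loadings[orient_col] < 0:
        loadings = -loadings
    return Z @ loadings

# Synthetic data (hypothetical): one latent "deprivation" driver plus noise.
rng = np.random.default_rng(0)
n_tracts = 200
latent = rng.normal(size=n_tracts)
noise = rng.normal(scale=0.5, size=(n_tracts, 8))
# Column 4 mimics a variable that decreases with deprivation
# (e.g., percent of males in management/professional occupations).
signs = np.array([1, 1, 1, 1, -1, 1, 1, 1])
X = latent[:, None] * signs + noise

ndi_like = first_component_scores(X, orient_col=0)
```

Orienting the component on a variable known to indicate disadvantage keeps the convention that a higher score means greater deprivation.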
Residential racial isolation of non-Hispanic Blacks (RINHB)
We calculated a previously developed [47] local, spatial measure of racial isolation (RI) of census tracts with respect to self-identifying non-Hispanic Black (NHB) individuals (RINHB), compared with all other racial/ethnic groups, including Hispanics. In contrast to racial composition within a tract, which can be used to assess evenness as a proxy measure of segregation, RINHB measures isolation considering racial composition of an index area and all its adjacent areas and ranges from 0 to 1, with more weight given to the index area. A neighborhood (i.e., the index tract and all its adjacent tracts) whose population is entirely composed of racial and ethnic groups other than NHB will have an RINHB value that is 0. In contrast, a neighborhood environment that is all NHB will have an RINHB value that is 1. These reflect extremes of isolation, whereas a somewhat “integrated” community will have a value somewhere between 0 and 1. A higher value of this metric can generally be interpreted as having greater inequities because communities with an RI value closer to 1 are more likely to have experienced racist policies related to housing, education, and disinvestment, leading to concentrated disadvantage and adverse health outcomes [22, 23, 47, 48].
The metric is calculated as
\[ RI_{im} = \frac{\sum_{j \in \partial_i} w_{ij} T_{jm}}{\sum_{j \in \partial_i} w_{ij} T_{j}} \qquad (1) \]
where ∂i denotes the set of index unit (i) (in our case, census tract) and its neighbors (i.e., tracts that are adjacent to the index tract). Given M mutually exclusive racial subgroups, m indexes the subgroups of M (e.g., NHB). The variable Ti denotes the total population in region i (i.e., the index tract and tracts adjacent to the index tract) and Tim denotes the population of subgroup m in region i. The variable wij denotes an n × n first order adjacency matrix, where n is the number of census tracts in the study area. First order adjacency means that the entries in the matrix, wij, are set to 1 if a boundary is shared by region i and region j, and 0 otherwise. Entries of the main diagonal (since i ∈ ∂i, wij = wii when j = i) of wij are set to 1.5, such that the weight of the index unit, i, is larger than the weights assigned to adjacent tracts. This measure of RINHB has been used in other studies examining neighborhoods and health, including preterm birth [47], low birthweight [47], type 2 diabetes [49], and chronic hypertension [50].
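The weighting scheme described above can be sketched numerically. The code below treats the index as a ratio of adjacency-weighted subgroup population to adjacency-weighted total population, with a diagonal weight of 1.5; that reading, the function name `isolation_index`, and the three-tract toy data are assumptions for illustration and not the authors' code.

```python
import numpy as np

def isolation_index(adjacency: np.ndarray, subgroup: np.ndarray,
                    total: np.ndarray, self_weight: float = 1.5) -> np.ndarray:
    """Weighted proportion of a subgroup in each tract's neighborhood.

    adjacency: n x n matrix with 1 where two tracts share a boundary, else 0.
    subgroup:  subgroup population per tract (e.g., NHB residents).
    total:     total population per tract.
    """
    W = adjacency.astype(float).copy()
    # The index tract receives a larger weight than its neighbors.
    np.fill_diagonal(W, self_weight)
    numerator = W @ subgroup      # weighted subgroup population
    denominator = W @ total       # weighted total population
    # Note: a neighborhood with zero total population would divide by zero.
    return numerator / denominator

# Hypothetical example: three tracts in a row (0-1 and 1-2 share boundaries).
adjacency = np.array([[0, 1, 0],
                      [1, 0, 1],
                      [0, 1, 0]])
subgroup = np.array([100.0, 0.0, 50.0])
total = np.array([100.0, 100.0, 100.0])

ri = isolation_index(adjacency, subgroup, total)
print(ri)
```

The same function could be applied to an educational subgroup (e.g., residents without a four-year degree) by passing a different `subgroup` vector.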
Residential educational isolation without college degree (EIWCD)
We applied a newly developed metric of educational isolation, as educational attainment has consistently been linked to health outcomes [26]. This metric is calculated with a formula analogous to RI, but for educational attainment. It captures the weighted average proportion of the non-college educated population residing in a neighborhood environment (i.e., index census tract and its adjacent tracts, with greater weight given to the index tract). Specifically, in each tract, we calculate a local, spatial measure of educational isolation of non-college educated individuals (EIWCD) (i.e., those without a four-year college degree). The metric is calculated using the formula in (1) but defining two mutually exclusive educational subgroups instead of mutually exclusive racial subgroups. EIWCD ranges from 0 to 1, with values close to 0 indicating that the neighborhood environment is nearly exclusively college educated (near complete isolation of individuals with a four-year degree from those without a college degree), and values close to 1 indicating that the neighborhood is almost all non-college educated (i.e., near complete isolation of individuals without a four-year degree from those with a college degree).
Social vulnerability index (SVI)
The SVI is an index developed by the Centers for Disease Control and Prevention that uses a combination of 15 US Census variables focused on four domains: SES, household composition, race/ethnicity/language, and housing/transportation [51]. It was designed to assist emergency personnel in targeting disaster response efforts, but its use has been extended beyond these contexts [52, 53]. Census tracts are assigned rankings with 0 being the least vulnerable and 1 being the most vulnerable. The ranking number for a given tract is interpreted as the proportion of tracts that have the same or lower social vulnerability than the given tract. Therefore, a ranking of 0.75 indicates that 75% of tracts in the study area (in our case, the state) are equally or less vulnerable than the tract of interest.
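A percentile ranking of this kind can be sketched as follows. The scaling (rank − 1)/(N − 1), the tie handling, and the function name `percentile_rank` are assumptions for illustration; the sketch ranks a single score and does not reproduce the full SVI construction across its 15 variables.

```python
import numpy as np

def percentile_rank(values) -> np.ndarray:
    """Scale ranks to [0, 1]: 0 for the lowest value, 1 for the highest.

    Tied values share the lowest rank among them (an assumed convention).
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    if n < 2:
        raise ValueError("need at least two tracts to rank")
    sorted_vals = np.sort(values)
    # Number of tracts strictly below each value, i.e., (rank - 1).
    below = np.searchsorted(sorted_vals, values, side="left")
    return below / (n - 1)

# Hypothetical vulnerability scores for five tracts.
scores = [10, 20, 30, 40, 50]
ranks = percentile_rank(scores)
print(ranks)
```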
Gini coefficient
The Gini coefficient was originally presented in a 1912 book published in Italian by sociologist and statistician Corrado Gini; it has subsequently been presented in other formats and applied for decades to estimate income inequality [54,55,56]. It ranges from 0 (complete equality: all individuals have the same income) to 1 (complete inequality: all income is earned by one individual while all others have none). It is based on the Lorenz curve, which is an observed cumulative income distribution. The coefficient represents the deviation from perfect equality.
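For intuition, the coefficient can be computed directly from a list of incomes. The sketch below uses a standard discrete formula based on sorted incomes; the function name `gini` and the toy incomes are assumptions for illustration, and tract-level values published with census products are derived from different underlying data than a short list like this.

```python
import numpy as np

def gini(incomes) -> float:
    """Gini coefficient of a finite sample of non-negative incomes.

    Uses G = 2 * sum(i * x_(i)) / (n * sum(x)) - (n + 1) / n,
    where x_(i) are incomes sorted in ascending order and i = 1..n.
    """
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    if n == 0 or x.sum() == 0:
        raise ValueError("incomes must be non-empty with a positive total")
    i = np.arange(1, n + 1)
    return float(2.0 * np.sum(i * x) / (n * x.sum()) - (n + 1) / n)

equal = gini([5, 5, 5, 5])        # everyone has the same income
unequal = gini([0, 0, 0, 10])     # one person holds all income
print(equal, unequal)
```

In a finite sample of n individuals the maximum attainable value is (n − 1)/n, which approaches 1 as n grows.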
Principal components analysis (PCA) on individual census variables
From the decennial 2010 Census and the American Community Survey, we obtained 23 individual variables across multiple domains: educational attainment, income, employment, race/ethnicity/immigration, housing, age, and population density. We conducted a new PCA and used the first principal component, whose loadings for the 23 individual variables are given in Supplementary Table S1. Several variables across multiple domains had loadings with absolute values greater than 0.2, indicating a substantive contribution to the metric, such as percent population below the federal poverty line, percent of population with health insurance, and percent population identifying as non-Hispanic Black. Variables capturing disadvantage have positive loadings (e.g., % population unemployed), while variables with negative loadings indicate advantage (e.g., % employed in managerial/professional jobs), meaning that a large factor score can be interpreted as having greater inequity, consistent with the other metrics.
Statistical analysis
The goals of our statistical analysis were to evaluate the numerical values, spatial distributions, and variability of different metrics in urban versus rural settings and across three geographic categories: Coastal Plain, Piedmont, and Mountain, and to evaluate the associations between metrics. We accomplished this with five types of analyses, described below. All metrics can generally be interpreted as a higher numerical value indicating greater neighborhood inequity or disadvantage.
First, we mapped the metrics across all 2195 census tracts to visually evaluate spatial distributions of each metric and differences between the different metrics. We used Moran’s I [57] to quantify spatial clustering of each of the metrics. Second, we compared the distribution of the inequity metrics between urban and rural census tracts and by geographic region using t-tests and using a one-way analysis of variance (ANOVA), respectively. Third, we calculated Pearson correlation coefficients to assess how the different inequity metrics correlate with each other in urban versus rural areas and among the three geographic regions. Fourth, we standardized all metrics by subtracting the mean and dividing by the standard deviation (across all tracts) and averaged these scaled metrics within each census tract. The purpose of this analysis was not to numerically combine separate metrics into an aggregate metric but rather to illuminate whether certain tracts consistently ranked high or low across the metrics. Because a higher metric value is interpreted as having greater inequity, if the average z-score is high in an area, that means it was consistently high (greater inequity) across metrics. A low z-score would indicate that the census tract consistently ranked low (less inequity) across multiple metrics. If the average z-score was moderate, it could be either consistently moderate across metrics or low in some metrics, high in other metrics, yielding a moderate average. We conducted t-tests to compare differences in z-scores between urban/rural areas and ANOVA for the three geographic areas. We calculated the standard deviation (SD) of the standardized metrics within a census tract, which captures whether there was consistency across metrics (low SD in z-scores) or inconsistency (high SD) within a tract. This allows distinguishing whether tracts with moderate z-scores were consistently moderate or had a combination of low and high z-scores. 
Both the mean and standard deviations of the z-score measures were mapped across all census tracts for visual display. Fifth, we ranked the census tracts by each of the metrics and classified them into metric quintiles. We then determined whether the quintiles changed across metrics for the tracts falling in the top 5% or bottom 5% for any metric, separately by urban and rural areas. This complements the z-score approach by examining how rankings change among a subset of census tracts.
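The fourth and fifth analyses can be illustrated with a small sketch that standardizes each metric, summarizes the z-scores within each tract, and checks whether a tract's quintile changes across metrics. The function names, the rank-based quintile rule, and the toy data are assumptions for illustration; the sketch flags quintile changes for all tracts rather than only those in the top or bottom 5%.

```python
import numpy as np

def zscore_summary(metrics: np.ndarray):
    """metrics has shape (n_tracts, n_metrics).

    Returns the within-tract mean and SD of the standardized metrics.
    """
    Z = (metrics - metrics.mean(axis=0)) / metrics.std(axis=0, ddof=1)
    return Z.mean(axis=1), Z.std(axis=1, ddof=1)

def quintiles(values: np.ndarray) -> np.ndarray:
    """Assign quintile 1 (lowest) to 5 (highest) by rank; ties broken by order."""
    ranks = np.argsort(np.argsort(values))   # 0 .. n-1
    return ranks * 5 // len(values) + 1

# Hypothetical data: two metrics that agree and one that is reversed.
a = np.arange(10, dtype=float)
metrics = np.column_stack([a, 2 * a + 3, a[::-1]])

mean_z, sd_z = zscore_summary(metrics)
q = np.column_stack([quintiles(metrics[:, k]) for k in range(metrics.shape[1])])
# A tract "moves" if its quintile is not the same under every metric.
moved = (q.max(axis=1) - q.min(axis=1)) > 0
print(q)
print(moved)
```

A tract with a moderate mean z-score and a high within-tract SD is one that ranks low on some metrics and high on others, which is the case the SD is meant to separate from consistently moderate tracts.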
Recognizing that there are different methods for classifying urban and rural areas, and that these classifications can have impacts on research and policy [58], we conducted a sensitivity analysis by rerunning all analyses using two additional rural-urban classification schemes. We obtained the 2010 Rural Urban Commuting Area (RUCA) Codes for all North Carolina census tracts from the US Department of Agriculture [59]. This system has ten categories ranging from rural to metropolitan areas. We applied two additional binary classifications for compatibility with our modeling approach by assigning urban as levels 1–6 (metropolitan and micropolitan areas) versus rural as 7–10 (small town and rural) and urban as levels 1–9 (metropolitan, micropolitan, small town) versus 10 (rural).
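The two alternative binary classifications amount to choosing a cut point on the ten-level code. A minimal sketch, assuming integer primary codes from 1 to 10 and a hypothetical helper named `ruca_to_binary`:

```python
def ruca_to_binary(code: int, max_urban_code: int) -> str:
    """Collapse a ten-level primary code into 'urban' or 'rural'.

    max_urban_code=6 treats levels 1-6 as urban and 7-10 as rural;
    max_urban_code=9 treats levels 1-9 as urban and 10 as rural.
    """
    if not 1 <= code <= 10:
        raise ValueError("expected a primary code between 1 and 10")
    return "urban" if code <= max_urban_code else "rural"

# Hypothetical tract codes classified under both schemes.
codes = [1, 6, 7, 9, 10]
scheme_a = [ruca_to_binary(c, max_urban_code=6) for c in codes]
scheme_b = [ruca_to_binary(c, max_urban_code=9) for c in codes]
print(scheme_a)
print(scheme_b)
```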
Results
We had full data on 2149 of 2195 census tracts (97.9%), with some missingness occurring for individual raw variables, metrics, or the urban/rural designation (2.1% of census tracts). Figure 1 illustrates the urban and rural census tracts, and the three distinct geographic regions of North Carolina. Figure 2 presents the spatial distribution of the six metrics across census tracts. Based on visual inspection, considerable differences in the spatial patterning are evident. For example, residential RINHB is high in some census tracts in more urban areas as well as in the more rural northeastern part of the state, which is also the location of Warren County, widely considered the birthplace of the United States’ environmental justice movement [60]. In contrast, high EIWCD is pervasive throughout most of the rural parts of the state and particularly low in the north/central urban area corresponding to the Research Triangle Park. The isolation indices, which are spatial, exhibit a smoother pattern with distinct clusters. The SVI and NDI have similar overall patterns across the state, with the SVI demonstrating a greater range/variation, potentially attributable to its inclusion of 15 versus 8 components. Our PCA results exhibit more spatial variability, potentially due to its 23 component variables. The Gini coefficient demonstrates less contrast across the tracts. The Moran’s I statistics were statistically significant for all metrics, supporting the visibly observable spatial patterning. Spatial clustering was highest for the spatial metrics EIWCD (Moran’s I = 0.88) and RINHB (Moran’s I = 0.90) and lowest for the aspatial, single-domain Gini coefficient (Moran’s I = 0.30).
Figure 1. The red shading indicates rural-designated census areas (defined as <50% population in a census tract living in an urbanized area) and the blue areas indicate urban-designated areas (50% or more of the population living in an urbanized area). The black borders delineate the three major geographical regions of the state.
Figure 2. Spatial distribution of the values of the six social vulnerability metrics across North Carolina. New PCA: original principal components analysis; NDI: neighborhood deprivation index; SVI: social vulnerability index; RINHB: Residential Racial Isolation of Non-Hispanic Blacks; EIWCD: Residential Educational Isolation without College Degree; Gini: Gini coefficient.
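To make the spatial clustering statistic concrete, the sketch below computes Moran's I for a variable observed on a small set of areas with a binary adjacency matrix. The function name `morans_i` and the four-tract examples are assumptions for illustration; significance testing (e.g., by permutation) is not shown.

```python
import numpy as np

def morans_i(x, W) -> float:
    """Global Moran's I for values x and spatial weights matrix W.

    I = (n / S0) * (z' W z) / (z' z), where z = x - mean(x)
    and S0 is the sum of all weights.
    """
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()
    s0 = W.sum()
    return float((x.size / s0) * (z @ W @ z) / (z @ z))

# Hypothetical example: four tracts in a row with first-order adjacency.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])

smooth = morans_i([1, 2, 3, 4], W)        # neighbors have similar values
alternating = morans_i([1, 4, 1, 4], W)   # neighbors have dissimilar values
print(smooth, alternating)
```

Positive values indicate that neighboring tracts tend to have similar metric values, and negative values indicate that neighbors tend to differ.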
Comparisons of the magnitude of the deprivation metrics by urban and rural areas and by the three geographic regions are presented in Table 2. Higher values for NDI and EIWCD in rural compared to urban tracts indicate more intense neighborhood deprivation and more intense educational isolation of non-college educated individuals in rural areas compared to urban areas. RINHB is higher in urban compared to rural areas. There are also statistically significant differences in the metrics between the three geographic regions (Table 2). RINHB is highest in the Coastal Plain and Piedmont regions and lowest in the Mountain region. In the Coastal Plain, RINHB is highest in urban areas and a large cluster of more rural census tracts in the northeastern part of the state; in the Piedmont, RINHB is also highest in urban areas, including most of the major cities in North Carolina (e.g., Charlotte, Greensboro, Raleigh, Durham). In sensitivity analyses using different definitions of urban and rural, the magnitude and direction of results were generally consistent, with some metrics, such as our new PCA, being more sensitive to the choice of urban-rural classification (Supplementary Table S2).
Table 3 presents the Pearson correlation coefficients between the metrics in urban and rural tracts. The composite metrics (SVI, NDI, and our new PCA) were highly correlated with each other (ρ > 0.80), though the correlations were stronger in urban (ρ: 0.86–0.92) compared to rural settings (ρ: 0.80–0.85). In contrast, the single-domain metrics related to income, education, and race were not strongly correlated in either urban or rural settings (ρ ≤ 0.36). The Gini coefficient (specific to income inequality) was not correlated with either of the other single-domain metrics (RINHB or EIWCD) in urban or rural tracts (all |ρ| < 0.1). The isolation metrics, EIWCD and RINHB, were weakly or moderately correlated in urban (ρ = 0.36) and rural (ρ = 0.32) settings. The correlations between composite metrics and single-domain metrics exhibited some differences between urban and rural tracts and tended to be higher in urban versus rural tracts. The largest differences were between composite metrics and RINHB. RINHB was more strongly correlated with the composite metrics in urban (ρ: 0.54–0.64) versus rural areas (ρ: 0.36–0.48). The correlations between the composite metrics NDI and SVI and the single-domain EIWCD were slightly higher in urban tracts (ρ: 0.62–0.63) than in rural tracts (ρ: 0.52–0.58); however, the correlation coefficients between our PCA and EIWCD were nearly equivalent in urban and rural tracts (ρ: 0.52–0.53). The strongest correlation observed between the Gini coefficient and any of the composite metrics was with the NDI in both urban (ρ = 0.26) and rural areas (ρ = 0.22). The correlation coefficients between metrics varied by region. The Pearson correlation coefficients between metrics were lowest in the Mountain region (median ρ = 0.26, range: −0.2–0.85) with relatively low correlation of the RINHB and EIWCD with other metrics (ρ ≤ 0.26).
The correlation coefficients in the Coastal Plain and Piedmont regions had similar medians (Coastal Plain: ρ = 0.50, Piedmont ρ = 0.56) and ranges (Coastal Plain: 0–0.92, Piedmont: −0.02–0.91). In sensitivity analyses, we found that while there were some variations in correlation coefficients between metrics under the different urban-rural classifications, no clear pattern emerged.
Figure 3 indicates that the northeastern and southern parts of the state are consistently characterized as having greater social inequity or isolation, because their z-scores are consistently high. There are also small clusters of small-area census tracts with consistently high inequity (high z-scores) in central areas of major cities, including Raleigh, Durham, Charlotte, and Winston-Salem. The difference between mean z-scores across regions was also statistically significant, further illustrating differences in inequities across tracts (p-value < 0.001). The standard deviation of the z-scores appears to be relatively low (<1 standard deviation) across most of the state, with some small areas having high standard deviations, suggesting consistency across the metrics despite the weak to moderate correlations between them.
Supplementary Figs. S1–S4 illustrate how the rankings of different census tracts vary based on which metric is being applied, complementing the z-score analysis with another illustration of how consistently tracts are classified across metrics. Supplementary Fig. S1 shows the quintile of each metric (with each quintile represented by a different color) for urban tracts that ranked in the top 5% of numerical values for any of the six metrics. While there are some tracts that rank consistently high across most metrics, as indicated by the similar colors when reading across a tract from left to right, most tracts move quintiles based on the metric being applied (i.e., the colors change when reading left to right across metrics for a single tract). Supplementary Fig. S2 shows the quintiles for rural tracts in the top 5% of numerical values for any of the six metrics. The movement of quintiles within the same tract across metrics is also evident based on the color changes. Supplementary Figs. S3 and S4 illustrate these rankings for those tracts in the bottom 5% of any of the six metrics for rural and urban tracts, respectively. The lack of any obvious visual pattern illustrates that these metrics are capturing different features of communities, and tracts ranking high in one component of inequity may rank high for some other, but not all, aspects of inequity or isolation.
Discussion
This paper sought to evaluate the spatial distributions, variability, and relationships of different metrics in North Carolina for urban versus rural settings, and across three geographic categories: Coastal Plain, Piedmont, and Mountain. We found substantial variation in metrics across the state of North Carolina. Rural areas overall had similar or more intense vulnerability compared to urban areas based on the NDI, SVI, and EIWCD, yet rural areas are often understudied, under-resourced, and overlooked [15], highlighting the need for additional environmental health and health disparity research that includes rural areas. The single-domain metrics representing income inequality, racial segregation, and educational attainment segregation were not highly correlated with each other, further emphasizing that they reflect distinct aspects of the community and are not suitable proxies for each other. For example, RI is an indicator of segregation, and greater income equality did not correlate with greater integration in our analysis. The composite metrics were weakly to moderately correlated with single-domain metrics, and, particularly with regard to RI, these correlations varied by urban/rural designations and geographic regions, indicating that composite metrics are capturing different features in these distinct areas. The composite metrics were highly correlated with each other and exhibited increasing range and variability with the increasing number of factors incorporated in their calculations.
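To make the stratified comparison concrete, the sketch below computes a Pearson correlation between a composite metric and RI separately for urban and rural tracts. The helper names and the toy values are hypothetical and are not drawn from the study data; the example only shows how a pooled correlation can hide differences between strata.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def stratified_corr(x, y, strata):
    """Pearson correlation of x and y computed within each stratum label."""
    out = {}
    for label in sorted(set(strata)):
        idx = [i for i, s in enumerate(strata) if s == label]
        out[label] = pearson([x[i] for i in idx], [y[i] for i in idx])
    return out


# Made-up values for eight hypothetical tracts
composite = [0.1, 0.5, 0.9, 1.3, 0.2, 0.6, 1.0, 1.4]
ri = [0.10, 0.20, 0.30, 0.40, 0.30, 0.10, 0.40, 0.20]
setting = ["urban"] * 4 + ["rural"] * 4

by_setting = stratified_corr(composite, ri, setting)
pooled = pearson(composite, ri)
print(by_setting, round(pooled, 2))
```

Reporting the stratum-specific coefficients alongside the pooled one is one simple way to check whether a composite metric tracks a single-domain metric equally well in urban and rural tracts.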
Given the lack of consensus, deciding which metric to use can be vexing for researchers and communities. Composite metrics are useful for capturing the aggregate neighborhood context. From a statistical standpoint, if the goal is to identify root causes or explain the most variability in an exposure or health outcome, composite metrics with an increasing number of factors can offer benefits. There are many composite metrics available in the literature, and the strong correlations we observed between these aggregate indices suggest that the choice of composite metric would not be expected to substantially alter findings in health studies.
Composite metrics capture the broader spectrum of features within neighborhoods, are not interchangeable with single indicators, and may reflect different underlying factors in different areas. If the goal is to understand the association between a particular neighborhood characteristic (e.g., racial segregation) and an outcome of interest, domain-specific metrics are critical. Domain-specific metrics may also be more helpful for informing specific policy interventions, as they are more easily interpretable than an aggregate index and lend themselves to policy goals (e.g., proportion of population above federal poverty limit). At the same time, if researchers each use a different metric or run their own PCA, any inconsistencies across studies could potentially be attributed to the metric differences and not to other distinguishing features of the populations or settings. There are trade-offs between simplicity and interpretability on the one hand and comprehensiveness on the other [61].
As social vulnerability metrics are being more broadly applied in health and social science literature, other studies have sought to compare these metrics. Lian et al. (2016) also found that certain composite SES indices were not well aligned with individual SES indicators and recommended composite metrics be used when attempting to capture neighborhood SES more broadly [62]. Bravo et al. (2021) examined the inter-relationship between EI and RI across the continental US and found that a US-wide measure of correlation calculated between RI and EI masked considerable small-area variability in spatial patterns and correlations [26]. Rufat et al. examined the validity of various metrics as indicators of seeking federal disaster-related assistance for property or housing damage; they observed differences across composite models in their ability to predict these disaster-related endpoints [63].
Researchers should carefully consider whether they want to capture whether a community is socially disadvantaged based on one factor, such as poverty, or whether they want to characterize a broader spectrum of social disadvantage and use a composite metric. In addition, they may want to assess whether the relationships between social disadvantage and their outcome of interest vary by urbanicity or by geographic region, rather than assuming the relationship will not vary across geographic context. Recognition of the complexity and differences between metrics can help inform design of future analyses and interpretation of the current literature. Inconsistency across this literature results in studies capturing different elements of injustice. The differences in rankings by metrics indicate that efforts to accurately identify populations at risk could differ in results based on which metrics are applied and may create opportunity for faulty comparisons between studies.
Further research about which metrics may be best-suited to a given research or policy question, and the interpretation and appropriate use of various metrics, could include work involving stakeholders, such as qualitative research with community members or research at a smaller geographic scale. Residents could report on which dimensions of deprivation or inequity are most important to them. Further, work by social scientists is needed to characterize the meaning of different metrics and explore where they diverge. Future research could also extend the study area to other states or the entire US. A national-level analysis would provide for exploration of a wider range of variation, including evaluation of different rural forms.
In terms of our study limitations, there are additional metrics used in the literature, such as the area deprivation index [64] or the Social Vulnerability Index (SoVI) [21], and we were not able to include all possible metrics in this analysis. In addition, we focused on residential racial isolation and did not explore other metrics of racial segregation or structural racism. We also acknowledge that the census tract is a somewhat arbitrary administrative boundary and that our binary classification of urban-rural tracts based on Census data is a relatively simplistic delineation [27]. Our finding that some of the patterns were sensitive to alternative urban-rural designations suggests that researchers should carefully consider how urbanicity is defined and measured, and if or how their analysis or results may be sensitive to alternative urban/rural definitions or classification schemes.
Conclusions
Composite social vulnerability metrics were highly correlated with each other, suggesting that the choice of composite index may not substantively influence research findings. In contrast, single-domain metrics representing inequities or isolation related to income, education, and race were not well-correlated and cannot be used interchangeably. Correlations between composite and single-domain metrics exhibited some urban-rural differences, suggesting these may be capturing different features across geographies, and these complexities should be examined by researchers applying metrics to areas of diverse urban and rural forms, including different ways of characterizing urbanicity. More environmental disparity research is needed in rural settings, which experienced similar or more intense inequity with regard to neighborhood disadvantage, social vulnerability, and educational isolation. Most disparity research has been conducted in the urban context, and additional work is needed to evaluate the intersectionality of rural populations with racial/ethnic minorities, environmental exposures, and issues of social inequity.
Data availability
All data analyzed in this paper are publicly available with data sources provided in the reference list.
References
Gray SC, Edwards SE, Miranda ML. Race, socioeconomic status, and air pollution exposure in North Carolina. Environ Res. 2013;126:152–8.
Bell ML, Dominici F. Effect modification by community characteristics on the short-term effects of ozone exposure and mortality in 98 US communities. Am J Epidemiol. 2008;167:986–97.
Mennis JL, Jordan L. The distribution of environmental equity: exploring spatial nonstationarity in multivariate models of air toxic releases. Ann Am Assoc Geogr. 2005;95:249–68.
Morello-Frosch R, Pastor M, Sadd J. Environmental justice and Southern California’s “Riskscape”: the distribution of air toxics exposures and health risks among diverse communities. Urban Aff Rev. 2001;36:551–78.
Miranda ML, Edwards SE, Keating MH, Paul CJ. Making the environmental justice grade: the relative burden of air pollution exposure in the United States. Int J Environ Res Public Health. 2011;8:1755–71.
McDonald YJ, Jones NE. Drinking water violations and environmental justice in the United States, 2011–2015. Am J Public Health. 2018;108:1401–7.
Whitehead LS, Buchanan SD. Childhood lead poisoning: a perpetual environmental justice issue? J Public Health Manag Pract. 2019;25:S115–S120.
Son J-Y, Muenich RL, Schaffer-Smith D, Miranda ML, Bell ML. Distribution of environmental justice metrics for exposure to CAFOs in North Carolina, USA. Environ Res. 2021;195:110862.
Wing S, Cole D, Grant G. Environmental injustice in North Carolina’s hog industry. Environ Health Perspect. 2000;108:225–31.
Silva GS, Warren JL, Deziel NC. Spatial modeling to identify sociodemographic predictors of hydraulic fracturing wastewater injection wells in Ohio census block groups. Environ Health Perspect. 2018;126:067008.
Woo SHL, Liu JC, Yue X, Mickley LJ, Bell ML. Air pollution from wildfires and human health vulnerability in Alaskan communities under climate change. Environ Res Lett. 2020;15:094019.
Mehra R, Keene DE, Kershaw TS, Ickovics JR, Warren JL. Racial and ethnic disparities in adverse birth outcomes: Differences by racial residential segregation. SSM Popul Health. 2019;8:100417.
Payne-Sturges DC, Gee GC, Cory-Slechta DA. Confronting racism in environmental health sciences: moving the science forward for eliminating racial inequities. Environ Health Perspect. 2021;129:055002.
Zota AR, Shamasunder B. Environmental health equity: moving toward a solution-oriented research agenda. J Expo Sci Environ Epidemiol. 2021;31:399–400.
Hartley D. Rural health disparities, population health, and rural culture. Am J Public Health. 2004;94:1675–8.
Jæger MM, Blaabæk EH. Local historical context and multigenerational socioeconomic attainment. Res Soc Stratification Mobil. 2021;73:100606.
Messer LC, Laraia BA, Kaufman JS, Eyster J, Holzman C, Culhane J, et al. The development of a standardized neighborhood deprivation index. J Urban Health. 2006;83:1041–62.
Sampson RJ, Raudenbush SW, Earls F. Neighborhoods and violent crime: a multilevel study of collective efficacy. Science. 1997;277:918–24.
Brewer M, Kimbro RT, Denney JT, Osiecki KM, Moffett B, Lopez K. Does neighborhood social and environmental context impact race/ethnic disparities in childhood asthma? Health Place. 2017;44:86–93.
Cutter SL, Finch C. Temporal and spatial changes in social vulnerability to natural hazards. Proc Natl Acad Sci USA. 2008;105:2301–6.
Cutter SL, Boruff BJ, Shirley WL. Social vulnerability to environmental hazards. Soc Sci Q. 2003;84:242–61.
Massey DS, Denton NA. The dimensions of residential segregation. Soc Forces. 1988;67:281–315.
Acevedo-Garcia D, Lochner KA, Osypuk TL, Subramanian SV. Future directions in residential segregation and health research: a multilevel approach. Am J Public Health. 2003;93:215–21.
Reardon SF, O’Sullivan D. Measures of spatial segregation. Sociological Methodol. 2004;34:121–62.
Johnston R, Poulsen M, Forrest J. Ethnic and racial segregation in U.S. metropolitan areas, 1980–2000: the dimensions of segregation revisited. Urban Aff Rev. 2007;42:479–504.
Bravo MA, Leong MC, Gelfand AE, Miranda ML. Assessing disparity using measures of racial and educational isolation. Int J Environ Res Public Health. 2021;18:9384.
Hall SA, Kaufman JS, Ricketts TC. Defining urban and rural areas in US epidemiologic studies. J Urban Health. 2006;83:162–75.
Probst JC, Moore CG, Glover SH, Samuels ME. Person and place: the compounding effects of race/ethnicity and rurality on health. Am J Public Health. 2004;94:1695–703.
Pellow DN. Environmental justice and rural studies: a critical conversation and invitation to collaboration. J Rural Stud. 2016;47:381–6.
Ricketts TC. Arguing for rural health in Medicare: a progressive rhetoric for rural America. J Rural Health. 2004;20:43–51.
Sharma-Wallace L. Toward an environmental justice of the rural-urban interface. Geoforum. 2016;77:174–7.
Lichter DT. Immigration and the new racial diversity in rural America. Rural Socio. 2012;77:3–35.
Butterfield P, Postma J, ERRNIE Research Team. The TERRA framework: conceptualizing rural environmental health inequities through an environmental justice lens. Adv Nurs Sci. 2009;32:107–17.
Larsson LS, Butterfield P, Christopher S, Hill W. Rural community leaders’ perceptions of environmental health risks: improving community health. AAOHN J. 2006;54:105–12.
Thiede B, Kim H, Valasik M. The spatial concentration of America’s rural poor population: a postrecession update. Rural Socio. 2018;83:109–44.
Lichter DT, Johnson KM. The changing spatial concentration of America’s rural poor population. Rural Socio. 2007;72:331–58.
US Census. 2010 Census Urban and Rural Classification and Urban Area Criteria. 2010. https://www.census.gov/programs-surveys/geography/guidance/geo-areas/urban-rural/2010-urban-rural.html.
NC Department of Environment and Natural Resources. Division of Parks and Recreation. North Carolina Outdoor Recreation Plan. 2015; https://files.nc.gov/ncparks/north-carolina-state-parks-statewide-comprehensive-outdoor-recreation-plan.pdf. Accessed 20 March 2022.
Hendryx M. Mortality from heart, respiratory, and kidney disease in coal mining areas of Appalachia. Int Arch Occ Env Health. 2009;82:243–9.
Arcury TA, Quandt SA, Russell GB. Pesticide safety among farmworkers: perceived risk and perceived control as factors reflecting environmental justice. Environ Health Perspect. 2002;110:233–40.
Alavanja MC, Sandler DP, McDonnell CJ, Lynch CF, Pennybacker M, Zahm SH, et al. Characteristics of pesticide use in a pesticide applicator cohort: the Agricultural Health Study. Environ Res. 1999;80:172–9.
Delmelle EC, Zhou Y, Thill J-C. Densification without growth management? Evidence from local land development and housing trends in Charlotte, North Carolina, USA. Sustainability. 2014;6:3975–90.
O’Campo P, Burke JG, Culhane J, Elo IT, Eyster J, Holzman C, et al. Neighborhood deprivation and preterm birth among non-Hispanic Black and White women in eight geographic areas in the United States. Am J Epidemiol. 2007;167:155–63.
Holzman C, Eyster J, Kleyn M, Messer LC, Kaufman JS, Laraia BA, et al. Maternal weathering and risk of preterm delivery. Am J Public Health. 2009;99:1864–71.
Elo IT, Culhane JF, Kohler IV, O’Campo P, Burke JG, Messer LC, et al. Neighbourhood deprivation and small‐for‐gestational‐age term births in the United States. Paediatr Perinat Epidemiol. 2009;23:87–96.
Vinikoor‐Imler LC, Gray SC, Edwards SE, Miranda ML. The effects of exposure to particulate matter and neighbourhood deprivation on gestational hypertension. Paediatr Perinat Epidemiol. 2012;26:91–100.
Anthopolos R, James SA, Gelfand AE, Miranda ML. A spatial measure of neighborhood level racial isolation applied to low birthweight, preterm birth, and birthweight in North Carolina. Spat Spatio-temporal Epidemiol. 2011;2:235–46.
Rothstein R. The color of law: a forgotten history of how our government segregated America. Liveright Publishing, New York, 2017.
Bravo MA, Anthopolos R, Kimbro RT, Miranda ML. Residential racial isolation and spatial patterning of type 2 diabetes mellitus in Durham, North Carolina. Am J Epidemiol. 2018;187:1467–76.
Bravo MA, Batch BC, Miranda ML. Residential racial isolation and spatial patterning of hypertension in Durham, North Carolina. Prev Chronic Dis. 2019;16:E36.
Centers for Disease Control and Prevention/Agency for Toxic Substances and Disease Registry/Geospatial Research, Analysis, and Services Program. CDC social vulnerability index 2010 database for North Carolina. 2010. Accessed 20 January 2021. https://www.atsdr.cdc.gov/placeandhealth/svi/data_documentation_download.html.
Flanagan BE, Hallisey EJ, Adams E, Lavery A. Measuring community vulnerability to natural and anthropogenic hazards: the centers for disease control and prevention’s social vulnerability index. J Environ Health. 2018;80:34–36.
Flanagan BE, Gregory EW, Hallisey EJ, Heitgerd JL, Lewis B. A social vulnerability index for disaster management. J Homel Secur Emerg Manag. 2011;8. https://doi.org/10.2202/1547-7355.1792.
Dorfman RA. Formula for the Gini coefficient. Rev Econ Stat. 1979;61:146–9.
De Maio FG. Income inequality measures. J Epidemiol Community Health. 2007;61:849–52.
Gini C. Measurement of inequality of incomes. Economic J. 1921;31:124–6.
Moran PA. Notes on continuous stochastic phenomena. Biometrika. 1950;37:17–23.
Bennett KJ, Borders TF, Holmes GM, Kozhimannil KB, Ziller E. What is rural? Challenges and implications of definitions that inadequately encompass rural people and places. Health Aff. 2019;38:1985–92.
USDA Economic Research Service. Rural-Urban Commuting Area Codes. 2019. https://www.ers.usda.gov/data-products/rural-urban-commuting-area-codes.aspx.
Mohai P, Pellow D, Roberts JT. Environmental justice. Annu Rev Environ Resour. 2009;34:405–30.
Spielman SE, Tuccillo J, Folch DC, Schweikert A, Davies R, Wood N, et al. Evaluating social vulnerability indicators: criteria and their application to the Social Vulnerability Index. Nat Hazards. 2020;100:417–36.
Lian M, Struthers J, Liu Y. Statistical assessment of neighborhood socioeconomic deprivation environment in spatial epidemiologic studies. Open J Stat. 2016;6:436.
Rufat S, Tate E, Emrich CT, Antolini F. How valid are social vulnerability models? Ann Am Assoc Geogr. 2019;109:1131–53.
Kind AJ, Buckingham WR. Making neighborhood-disadvantage metrics accessible—the neighborhood atlas. N Engl J Med. 2018;378:2456.
Acknowledgements
The authors thank Claire Osgood for data management expertise.
Funding
Research reported in this publication was supported by the National Institute on Minority Health and Health Disparities of the National Institutes of Health under Award Number R01MD012769. F. Macalintal contributed to this work as part of a summer internship supported by NIEHS training grant R25ES029052. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. In addition, this publication was developed under Assistance Agreement No. RD835871 awarded by the U.S. Environmental Protection Agency to Yale University. It has not been formally reviewed by EPA. The views expressed in this document are solely those of the authors and do not necessarily reflect those of the Agency. EPA does not endorse any products or commercial services mentioned in this publication. This work was supported in part by a Pilot Grant from Yale Cancer Center.
Author information
Contributions
NCD: Conceptualization, methodology, acquired data, led writing of original draft and revisions. JW: Methodology, Formal analysis, Visualization, review & editing. MAB: Methodology, data interpretation, visualization, review & editing. FM: Contributed to original draft, review of final version. RTK: Conceptualization, writing original draft, review & editing. MB: Project conception, data interpretation, review & editing, funding acquisition.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Deziel, N.C., Warren, J.L., Bravo, M.A. et al. Assessing community-level exposure to social vulnerability and isolation: spatial patterning and urban-rural differences. J Expo Sci Environ Epidemiol 33, 198–206 (2023). https://doi.org/10.1038/s41370-022-00435-8
DOI: https://doi.org/10.1038/s41370-022-00435-8
Keywords
- Geospatial analyses
- Population-based studies
- Vulnerable populations
- Environmental justice