Main

In the United States, economic segregation is very high, with income affecting where one lives9, who one marries10, and who one meets and befriends11. This extreme segregation is costly. It reduces economic mobility12,13,14,15, fosters a wide range of health problems16,17,18 and increases political polarization19,20,21,22. Although there are all manner of reforms designed to reduce economic segregation (such as subsidized housing), it has long been argued that one of the most powerful segregation-reducing dynamics is rising urbanization23 and the resulting happenstance mixing that it induces1,2,3,4,5,6. This ‘cosmopolitan mixing hypothesis’ anticipates that, in large cities, the combination of increased population diversity, constrained space and accessible public transportation will bring diverse individuals into close physical proximity with one another2, reducing everyday socioeconomic segregation. The New York City Subway has been lauded, for example, as a mixing bowl in which a diverse set of people cross paths each day24.

As plausible as the cosmopolitan mixing hypothesis might seem, big cities also provide new opportunities for self-segregation, because they are large enough to enable people to seek out and find others who are similar to themselves25. These contrasting hypotheses about the relationship between urbanization and socioeconomic mixing remain untested because it has been difficult to measure real-world exposures that take the form of path crossings and encounters among individuals7,26,27. It becomes possible to measure such exposures when mobile phone geolocation data are analysed at the device level. Although mobile phone data have been used for many research purposes28,29,30,31,32,33,34,35,36,37,38, a nationwide study of socioeconomic mixing and urbanization has not been undertaken because of difficulties in ascertaining individual-level socioeconomic status (SES), determining when dyadic exposures occur, and amassing the data needed to compare across cities or counties28,29,30,32,33,34,35,36.

Here we carefully test the cosmopolitan mixing hypothesis and the dynamics underlying it. To understand the relationship between urbanization and segregation, we use mobile phone mobility data in the form of de-identified GPS location pings (see the ‘SafeGraph’ section of the Methods). From these data, we capture geolocated individual-level exposures between individuals of similar or different SES. This enables us to develop city-level and county-level measures of segregation that capture where people go, when they go there and whom they encounter on the way.

We first determine the SES of a person by identifying their home location and its monthly rent value. We next construct a dynamic network that captures each individual’s exposures to other individuals in their everyday life. Our network contains 1,570,782,460 edges (representing exposures in physical space) among 9,567,559 nodes (representing individuals, that is, mobile phones) across 382 MSAs and 2,829 counties in the United States. Every timestamped edge between a pair of nodes signifies that the two individuals crossed paths with and encountered each other (that is, they were at the same location at the same time). We analysed these data to estimate the amount of exposure segregation, defined as the extent to which individuals of different economic statuses are exposed to one another within each geographical area (MSAs and counties) in the United States. Our measure of exposure segregation extends a traditional static segregation measure by capturing the diversity of person-to-person exposures localized in space and time.

A more realistic measure of segregation

To estimate each person’s SES, we first infer their home location from night-time mobile phone location pings (Fig. 1a; see the ‘Inferring home location’ section of the Methods), and we then recover the estimated monthly rent value of the home at this location (Fig. 1a; see the ‘Inferring SES’ section of the Methods). This method is more accurate in estimating individual SES than the conventional approach of using neighbourhood-level census averages30,31. We next identify each instance when a pair of individuals crossed paths and were thus exposed to each other, defined as their two devices being within D metres of each other within T minutes (see the ‘Constructing exposure network’ section of the Methods). Although our key findings are robust to the precise choice of D and T (Supplementary Figs. 5–8), our primary analyses use D = 50 metres and T = 5 minutes because the cosmopolitan mixing hypothesis pertains to visual exposure1,2. This approach, to our knowledge, provides the highest-resolution measure of exposure to date, compared with previous GPS-based studies30,31,39.
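The path-crossing rule described above can be illustrated with a minimal sketch. It assumes a hypothetical ping layout of (user ID, latitude, longitude, Unix time in seconds) and uses a brute-force comparison of all ping pairs with the haversine distance; it is an illustration of the D and T thresholds only, not the procedure given in the ‘Constructing exposure network’ section of the Methods.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def find_exposures(pings, d_metres=50.0, t_minutes=5.0):
    """Return the set of user pairs with two pings within D metres and T minutes.

    pings: list of (user_id, lat, lon, unix_seconds) tuples (hypothetical layout).
    Brute force over all ping pairs: fine for a toy example, not for billions of pings.
    """
    exposures = set()
    for (u1, lat1, lon1, t1), (u2, lat2, lon2, t2) in combinations(pings, 2):
        if u1 == u2:
            continue  # a device cannot be exposed to itself
        if abs(t1 - t2) > t_minutes * 60:
            continue  # too far apart in time
        if haversine_m(lat1, lon1, lat2, lon2) <= d_metres:
            exposures.add(tuple(sorted((u1, u2))))
    return exposures

# Toy pings (made-up coordinates and times).
pings = [
    ("a", 37.7749, -122.4194, 0),
    ("b", 37.7750, -122.4194, 120),   # roughly 11 m from a, 2 minutes later
    ("c", 37.7749, -122.4194, 3600),  # same place as a, but 1 hour later
    ("d", 37.7849, -122.4194, 60),    # roughly 1.1 km north of a
]
print(find_exposures(pings))
```

In this toy input only users a and b satisfy both thresholds; c fails the time threshold and d fails the distance threshold.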

Fig. 1: Exposure segregation captures the likelihood of exposure between people of different socioeconomic backgrounds and reveals increased segregation in highly populated metropolitan areas.

a, For 9.6 million individuals (mobile phones), we infer their SES (rent or rent equivalent) from their home address on the basis of their location at night (see the ‘Inferring home location’ section of the Methods). We then capture path-crossing events (that is, being at the same location at the same time) to identify pairs of individuals who were exposed to each other (see the ‘Constructing exposure network’ section of the Methods). b, The nationwide network of 1.6 billion exposures spans 2,829 counties and 382 MSAs. Our exposure network contrasts with a conventional measure of economic segregation, the neighbourhood sorting index, which assumes that individuals are exposed to other residents only within their home census tract. Graphs pertain to a sample community of 50 individuals residing in ten census tracts in San Francisco, CA. Nodes represent individuals; edges represent exposures. This sample illustrates the importance of capturing cross-tract exposures, which are undetected by conventional segregation measures. c, For each geographical region (either MSA or county), we estimate exposure segregation, defined as the correlation between an individual’s SES and the mean SES of those with whom they cross paths; 1 signifies perfect segregation and 0 signifies no segregation. This definition is equivalent to the conventional neighbourhood sorting index, but with the key difference that it leverages real-life exposure from mobility data instead of synthetic exposures from individuals grouped by census tracts. For two MSAs, we show the raw data; each point represents one individual. San Francisco–Oakland–Hayward, CA, is 2.2× more segregated (P < 10−4, 95% CI = 1.6–2.8×; two-sided bootstrap; see the ‘Hypothesis testing’ section of the Methods) than Napa, CA. d,e, Contrary to the hypothesis that highly populated metropolitan areas support diverse exposures and socioeconomic mixing, we find that larger MSAs are more segregated (d). 
Exposure segregation presented as a function of population size; each dot represents one MSA; the purple line indicates the LOWESS fit. An upward slope reveals that urbanization is associated with higher exposure segregation (Spearman correlation = 0.62, n = 382, P < 10−4; two-sided Student’s t-test; see the ‘Hypothesis testing’ section of the Methods). The top ten largest MSAs by population size are 67% more segregated (P < 10−4, 95% CI = 49–87%; two-sided bootstrap; see the ‘Hypothesis testing’ section of the Methods) than small MSAs with fewer than 100,000 residents. Associations are robust to controlling for potential confounding factors and are similar for population density and exposure segregation (Extended Data Table 1 and Supplementary Table 7). e, Exposure segregation across the 2,829 US counties. The analysis was limited to counties with at least 50 individuals present in the dataset. Exposure segregation varies substantially across counties in the United States. Moreover, as with MSA-level segregation, county-level exposure segregation is also positively associated with both population size and population density (Extended Data Fig. 4).

The economic segregation of each geographical region is measured by the correlation between a person’s SES and the mean SES of everyone to whom they are exposed through a path crossing (see the ‘Exposure segregation’ section of the Methods). This correlation is estimated by fitting a linear mixed-effects model that eliminates attenuation bias and secures unbiased estimates of exposure segregation even when observed exposures are sparse (Extended Data Fig. 1; see the ‘Estimating exposure segregation’ section of the Methods). The resulting measure of exposure segregation (Fig. 1b,c), which ranges from 0 (perfect integration) to 1 (complete segregation), is a generalization of a widely used measure of socioeconomic segregation: the neighbourhood sorting index7. The neighbourhood sorting index is equivalent to the correlation between each person’s SES and the mean SES of all of the people in their home census tract, whereas our measure of exposure segregation is equivalent to the correlation between each person’s SES and the mean SES of all of the people whom they encounter (either inside or outside their home census tract). Thus, the key difference between these two measures is that the neighbourhood sorting index assumes that exposures occur uniformly and only among co-residents of the same home tract, whereas exposure segregation captures real-world exposures among people as they navigate their daily lives.
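As a rough illustration of the quantity being estimated, the sketch below computes a naive version of exposure segregation: the plain Pearson correlation between each person’s SES and the mean SES of the people they encountered. This is only an assumption-laden toy; it does not implement the linear mixed-effects estimator described above, it does not correct for attenuation bias, and unlike the reported measure it can fall below 0. The SES values and edge lists are made up.

```python
from collections import defaultdict
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def naive_exposure_segregation(ses, edges):
    """Correlate own SES with the mean SES of encountered individuals.

    ses: {person: SES value}; edges: list of (person, person) exposures.
    People with no exposures are ignored.
    """
    neighbours = defaultdict(list)
    for u, v in edges:
        neighbours[u].append(ses[v])
        neighbours[v].append(ses[u])
    own = [ses[u] for u in neighbours]
    exposed = [mean(vals) for vals in neighbours.values()]
    return pearson(own, exposed)

# Made-up monthly rents for four people.
ses = {"a": 1000, "b": 1100, "c": 3000, "d": 3200}

# Everyone meets only people of similar SES.
segregated = naive_exposure_segregation(ses, [("a", "b"), ("c", "d")])

# Everyone meets one similar and one dissimilar person.
mixed = naive_exposure_segregation(ses, [("a", "c"), ("b", "d"), ("a", "b"), ("c", "d")])

print(round(segregated, 3), round(mixed, 3))
```

The first network yields a correlation close to 1 and the second a correlation close to 0, which mirrors the interpretation of the two ends of the scale.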

Extreme segregation in large cities

We find that, contrary to the cosmopolitan mixing hypothesis, exposure segregation is higher in large MSAs (Fig. 1d). The Spearman correlation between MSA population and MSA segregation is 0.62 (P < 10−4), and the ten largest MSAs by population size are 67% more segregated (P < 10−4, 95% confidence interval (CI) = 49–87%) than small MSAs with fewer than 100,000 residents. This result is robust. We validated it by recalculating the correlation with a measure of density rather than population size (Spearman correlation = 0.45, P < 10−4; Supplementary Table 7), by controlling for potential confounding factors (Extended Data Table 1 and Supplementary Table 7), by varying the granularity of the analysis (Fig. 1e and Extended Data Fig. 4) and by testing a variety of specifications of exposure segregation (Supplementary Table 6 and Supplementary Figs. 2–10). The consistent result that larger, denser cities are more segregated runs counter to the hypothesis that such cities promote socioeconomic mixing by attracting diverse individuals and constraining space in ways that oblige them to encounter one another1,2,3,4,5,6. Our results support the opposite hypothesis: big cities allow their inhabitants to seek out people who are more like themselves. The key advance that enables this finding is our fine-grained measure of proximity with respect to both time and space (Supplementary Fig. 66).
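Comparisons of the form ‘X% more segregated, with a 95% CI’ can be sketched with a percentile bootstrap. The example below is a generic illustration under stated assumptions: it compares group means, resamples each group independently with replacement, and uses made-up segregation values. The actual procedure is the one given in the ‘Hypothesis testing’ section of the Methods, which may differ in the statistic and resampling scheme.

```python
import random
from statistics import mean

def bootstrap_pct_difference(large, small, n_boot=2000, seed=0):
    """Percent by which mean(large) exceeds mean(small), with a 95% percentile CI.

    Both groups are resampled independently with replacement (an assumption).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    point = 100 * (mean(large) / mean(small) - 1)
    draws = []
    for _ in range(n_boot):
        resampled_large = rng.choices(large, k=len(large))
        resampled_small = rng.choices(small, k=len(small))
        draws.append(100 * (mean(resampled_large) / mean(resampled_small) - 1))
    draws.sort()
    lower = draws[int(0.025 * n_boot)]
    upper = draws[int(0.975 * n_boot) - 1]
    return point, (lower, upper)

# Made-up exposure segregation values for two groups of MSAs.
large_msas = [0.50, 0.55, 0.48, 0.52, 0.60]
small_msas = [0.30, 0.33, 0.28, 0.35, 0.31]

point, (lower, upper) = bootstrap_pct_difference(large_msas, small_msas)
print(f"{point:.1f}% more segregated (95% CI = {lower:.1f} to {upper:.1f}%)")
```

With so few toy observations the interval is wide; the point of the sketch is only the mechanics of resampling and reading off percentiles.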

Exploring exposure segregation

Our methodology further allows for comparisons between a conventional static measure of segregation (neighbourhood sorting index) and our dynamic measure. The median level of exposure segregation across all MSAs is 38% lower (P < 10−4, 95% CI = 37–41%) than the corresponding value for a conventional static estimate31 (neighbourhood sorting index; Fig. 2a (top)). We explain this result by disaggregating our measure into components pertaining to exposures in which both, one or neither individual was within their home census tract (Fig. 2a (bottom)). Exposure segregation is lower because, when people venture outside their home tracts, they experience more diversity. For example, exposures are 50% less segregated (P < 10−4, 95% CI = 48–53%) when both people are outside the home census tract than when both people are within their home tract. Within their own neighbourhood, people cross paths with neighbours who are socioeconomically most similar to them, but this has little effect on overall exposure segregation because only 2.4% of exposures (95% CI = 2.4–2.4%) occur when both individuals are within their home tract. Finally, we observe that not only is overall exposure segregation elevated in large cities, but also each of its components is elevated in large cities (Supplementary Fig. 10).

Fig. 2: Exploring the dynamics of exposure segregation reveals that socioeconomic differentiation of spaces accounts for increased segregation in large cities.

a, Each point represents the segregation estimate in one of the n = 382 MSAs; the vertical coloured lines represent the median across MSAs. Top, exposure segregation is 38% lower (P < 10−4, 95% CI = 37–41%; two-sided bootstrap; see the ‘Hypothesis testing’ section of the Methods) than the conventional segregation measure, the neighbourhood sorting index. Bottom, a breakdown of exposure segregation into its component parts. Exposures in which both people are within their home census tract (green) are most segregated, reflecting the homophily effect in which people preferentially encounter those of a similar SES in their home tracts. Out-of-tract exposures (orange and red) are less segregated, reflecting the visitor effect in which entering other tracts exposes individuals to economically diverse individuals. As a small minority (2.4%, 95% CI = 2.4–2.4%; two-sided bootstrap; see the ‘Hypothesis testing’ section of the Methods) of exposures happen within the home tract, the visitor effect dominates the homophily effect and exposure segregation is therefore lower than the conventional neighbourhood sorting index. b,c, Exposure segregation varies by tie strength and location type. Each point represents segregation in one of n = 382 MSAs using only exposure pairs occurring with a specific tie strength (b) or in a given location type (c). The boxes indicate the interquartile range across MSAs. Segregation increases with tie strength and is especially high for the strongest ties (5+ exposures; median exposure segregation, 0.57). Segregation is highest at golf courses and country clubs (median exposure segregation, 0.42) and lowest at performing arts centres (median exposure segregation, 0.16) and stadiums (median exposure segregation, 0.17). d–f, A case study of full-service restaurants illustrates the relationship between urbanization and exposure segregation.
Highly populated metropolitan areas are more segregated not only because they offer a wider choice of venues but also because these venues are more socioeconomically differentiated. d, Larger MSAs have more restaurants within 10 km of the average resident, giving residents more options to self-segregate. e, Moreover, restaurants in larger MSAs vary more in the median SES of their visitors, meaning that a greater choice of socioeconomically differentiated restaurants is offered. The coefficient of variation across restaurant SES (that is, the median SES of a restaurant’s visitors) in the ten largest MSAs is 63% more (P < 10−4, 95% CI = 37–100%; two-sided bootstrap; see the ‘Hypothesis testing’ section of the Methods) than the coefficient of variation in small MSAs (with fewer than 100,000 residents). f, Consequently, exposure segregation within restaurants is higher in larger MSAs. These relationships are also detectable at the scale of city hubs (defined as higher-level clusters of POIs such as plazas and shopping centres) as well as at the neighbourhood level (Extended Data Figs. 5 and 6).

We quantify variability in exposure segregation both by tie strength (Fig. 2b,c) and across different points of interest (POIs). Stronger ties are more segregated40,41 (Fig. 2b). We also find much variability in POI-level segregation11,30 (Fig. 2c; see the ‘Decomposing segregation by activity’ section of the Methods). We explain this variability in POI-level segregation (Fig. 2c) by the extent to which a POI category (such as restaurants) contains differentiated POIs that service small and thereby socioeconomically homogeneous communities (for example, Michelin star restaurants). We operationalize the extent of a POI category’s differentiation using the average travel distance to the nearest POI30 and the total number of POIs (Spearman correlation = −0.75, P < 0.001 (travel distance); Spearman correlation = 0.69, P < 0.01 (number of POIs); Extended Data Fig. 3a,b). For example, in the median MSA, religious organizations require 92% less travel distance (P < 10−4, 95% CI = 92–93%) and are 16× more numerous (P < 10−4, 95% CI = 8–18×) than stadiums. Because religious organizations can therefore target more narrowly defined socioeconomic communities, they are 75% more segregated (P < 10−4, 95% CI = 58–87%) than stadiums. In rare cases, a POI category with only a small number of POIs may still exhibit substantial segregation (such as golf courses) owing to economic differentiation among its POIs caused by other factors (such as a public–private distinction; Extended Data Fig. 3c). Below, we show that this link between the socioeconomic differentiation of spaces and segregation is also critical to explaining why large cities are more segregated.

Differentiation of space in large cities

To understand why large metropolitan areas support segregation, we present an example of segregation within leisure POIs. Full-service restaurants provide an illustrative example (Fig. 2d–f) of a segregation-inducing dynamic that holds widely across other leisure sites (Supplementary Fig. 22) and other scales of analysis (Extended Data Figs. 5 and 6). We find that larger MSAs offer their residents a greater number of leisure choices: the average resident of one of the ten largest MSAs has 22× more restaurants (P < 10−4, 95% CI = 11–39×) within 10 km of their home compared with an average resident of a small MSA (where a ‘small MSA’ is defined as one with fewer than 100,000 residents; Fig. 2d). These choices are also more socioeconomically differentiated. When a restaurant’s SES is defined as the median SES of all people who visited it and encountered another person, the coefficient of variation of ‘restaurant SES’ in the ten largest MSAs is 63% greater (P < 10−4, 95% CI = 37–100%) than that in small ones (Fig. 2e). Thus, not only do large MSAs offer their residents a larger choice of restaurants, but these restaurants are also more socioeconomically differentiated. For example, in large cities such as New York, one can spend US$10, US$100 or US$1,000 on a meal, depending on the choice of restaurant42,43. These processes mean that exposure segregation in restaurants is 29% higher (P < 10−3, 95% CI = 8–49%) in the ten largest MSAs than in small MSAs (Fig. 2f). We find analogous results across many POI types (Supplementary Fig. 22) and at higher levels of scale pertaining to city hubs (for example, plazas, shopping centres, boardwalks) as well as neighbourhoods (Extended Data Figs. 5 and 6).
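The ‘restaurant SES’ statistic used above can be sketched as follows. The example assumes a hypothetical visit log of (user, venue) pairs and made-up SES values, and it uses the population standard deviation for the coefficient of variation, which is an assumption. For brevity it also omits the restriction to visitors who encountered another person.

```python
from collections import defaultdict
from statistics import mean, median, pstdev

def venue_ses(visits, ses):
    """Median SES of the visitors to each venue.

    visits: list of (user, venue) pairs; ses: {user: SES value}.
    """
    by_venue = defaultdict(list)
    for user, venue in visits:
        by_venue[venue].append(ses[user])
    return {venue: median(values) for venue, values in by_venue.items()}

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population form, an assumption)."""
    return pstdev(values) / mean(values)

# Made-up monthly rents for four visitors.
ses = {"u1": 1000, "u2": 1500, "u3": 2000, "u4": 4000}

# Two venues that each draw a mixed clientele.
undifferentiated = [("u1", "diner"), ("u4", "diner"), ("u2", "grill"), ("u3", "grill")]
# Two venues that each draw a narrow slice of the SES distribution.
differentiated = [("u1", "taqueria"), ("u2", "taqueria"), ("u3", "bistro"), ("u4", "bistro")]

cv_low = coefficient_of_variation(list(venue_ses(undifferentiated, ses).values()))
cv_high = coefficient_of_variation(list(venue_ses(differentiated, ses).values()))
print(round(cv_low, 3), round(cv_high, 3))
```

The second set of venues has a higher coefficient of variation because its venues sort visitors by SES, which is the pattern the text attributes to restaurants in the largest MSAs.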

Mitigating segregation through urban design

Our results suggest that segregation could be mitigated when frequently visited POIs, which we refer to as ‘hubs’, are positioned in close proximity to diverse neighbourhoods. These hubs would serve as bridges between residents of nearby high-SES and low-SES neighbourhoods, enabling them to easily visit the hubs44,45,46 and encounter one another (Fig. 3c). We developed the bridging index (see the ‘Bridging index’ section of the Methods) to measure whether hubs are located in such bridging positions. Our index measures the economic diversity of the groups that would encounter each other if everybody visited only their nearest hub. It is computed by clustering individuals by the nearest hub to their home and then measuring the economic diversity within these clusters (Extended Data Fig. 7). The resulting index ranges from 0 to 1, where 0 means that individuals near each hub have a uniform SES, and 1 means that individuals near each hub are as diverse as the overall area (Extended Data Fig. 8). We compute our bridging index for commercial centres (such as plazas, shopping centres, boardwalks) because we find that they are common hubs of exposure: the majority (56.9%, 95% CI = 56.9–56.9%) of exposures across all 382 MSAs occur in close proximity (within 1 km) to a commercial centre, even though only 2.5% of land area is within 1 km of a commercial centre (Fig. 3c). The results show that our bridging index is strongly associated with exposure segregation (Spearman correlation = −0.78, P < 10−4; Fig. 3b). The top ten MSAs with the highest bridging index are 53.1% less segregated (P < 10−4, 95% CI = 44–60%) than the ten MSAs with the lowest bridging index. This finding is again robust: the hub-bridging effect is strong and significant (P < 10−4) even after including controls for race, population size, economic inequality and many other variables (Extended Data Tables 2 and 3, Supplementary Table 6 and Supplementary Figs. 2, 8 and 13).
It follows that zoning laws and related policies that encourage developers to locate hubs, such as shopping centres, between diverse residential neighbourhoods may reduce exposure segregation. We have identified several large cities that increase integration in this manner (Supplementary Table 21) and present an illustrative example (Fig. 3c,d) in which well-placed hubs bridge diverse individuals in Fayetteville, North Carolina.
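The construction of the bridging index described above (cluster homes by nearest hub, then compare within-cluster diversity with the diversity of the whole area) can be sketched as follows. The diversity measure used here, the Gini index, the size-weighted averaging and the planar coordinates are all assumptions made for illustration; the authoritative definition is in the ‘Bridging index’ section of the Methods. The ratio in this sketch is not clamped, so it is not guaranteed to stay within 0 and 1 for arbitrary inputs.

```python
import math

def gini(values):
    """Gini index of non-negative values (0 means all values are equal)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n

def bridging_index_sketch(homes, hubs):
    """Within-cluster SES diversity relative to area-wide SES diversity.

    homes: list of (x, y, ses); hubs: list of (x, y). Planar coordinates are
    a simplification; real data would need geographic distances.
    """
    clusters = {}
    for x, y, ses in homes:
        nearest = min(range(len(hubs)), key=lambda h: math.dist((x, y), hubs[h]))
        clusters.setdefault(nearest, []).append(ses)
    overall = gini([ses for _, _, ses in homes])
    if overall == 0:
        return math.nan  # no diversity in the area, so the ratio is undefined
    within = sum(len(c) * gini(c) for c in clusters.values()) / len(homes)
    return within / overall

# Low-SES homes in the west (x = 0), high-SES homes in the east (x = 10).
homes = [(0, 1, 1000), (0, -1, 1000), (10, 1, 3000), (10, -1, 3000)]

# One hub inside each neighbourhood: every cluster is homogeneous.
separating_hubs = [(0, 0), (10, 0)]
# Hubs midway between the neighbourhoods: every cluster is mixed.
bridging_hubs = [(5, 1), (5, -1)]

print(bridging_index_sketch(homes, separating_hubs))
print(bridging_index_sketch(homes, bridging_hubs))
```

In the toy layout, residential segregation is identical in both cases; only the hub placement changes, which is the distinction the index is meant to capture.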

Fig. 3: Exposure segregation is lower when frequently visited hubs bridge socioeconomically diverse neighbourhoods.

a, We developed an index (see the ‘Bridging index’ section of the Methods) to quantify the extent to which highly visited hubs bridge socioeconomically diverse neighbourhoods. The metric was constructed by clustering homes by the nearest hub, then measuring the within-cluster diversity of SES. Two plots illustrate that the bridging index is distinct from conventional measures of residential segregation such as the neighbourhood sorting index. The bridging index ranges from 0 (no bridging; top) to 1 (perfect bridging; bottom), while residential segregation is constant (high-SES and low-SES individuals are highly segregated by census tract, denoted by purple and yellow bounding boxes). We compute our bridging index with hubs defined as commercial centres (such as shopping centres and plazas) because the majority (56.9%, 95% CI = 56.9–56.9%; bootstrapping; see the ‘Hypothesis testing’ section of the Methods) of exposures across all 382 MSAs occur in close proximity (within 1 km) to a commercial centre, even though only 2.5% of land area is within 1 km of a commercial centre. b, Our bridging index strongly predicts exposure segregation (Spearman correlation = −0.78, n = 382, P < 10−4; two-sided Student’s t-test; see the ‘Hypothesis testing’ section of the Methods). The top ten MSAs with the highest bridging index are 53.1% less segregated (P < 10−4, 95% CI = 44–60%; two-sided bootstrap; see the ‘Hypothesis testing’ section of the Methods) than the ten MSAs with the lowest bridging index. The bridging index predicts segregation more accurately (P < 10−4; two-sided Steiger’s Z-test; see the ‘Hypothesis testing’ section of the Methods) than population size, SES inequality, neighbourhood sorting index and race, and is significantly associated (P < 10−4; two-sided Student’s t-test; see the ‘Hypothesis testing’ section of the Methods) with exposure segregation after controlling for these variables and other potential confounding factors (Extended Data Tables 2 and 3). 
c,d, A case study of Fayetteville, North Carolina, an MSA with low exposure segregation (21st percentile) despite having an above-median population size (64th percentile) and income inequality (60th percentile). c, Exposure heat map of Fayetteville; all visually discernible hubs are associated with one or more commercial centres. d, Hubs are located in accessible proximity to both high-SES and low-SES census tracts (bridging index = 0.90, 62nd percentile), leading to diverse exposures. An illustrative example of one hub (Highland Center) in Fayetteville and a random sample of ten exposures occurring inside of it. The home icons demarcate home locations of individuals (up to 100 m of random noise was added for anonymity); the colours denote individual and mean tract SES. The maps in c and d were generated using OpenStreetMap data.

Discussion

As big cities continue to grow and spread, it is important to examine whether they encourage socioeconomic mixing. Although it is often argued that big cities promote mixing by increasing density, we find that exposure diversity and city size are negatively related. This result means that scale matters. We have shown that, because large cities can sustain venues that are targeted to thin socioeconomic slices of the population, they have become homophily-generating machines that are far more segregated than small cities. We also find that some cities are able to mitigate this segregative effect because their hubs are located in bridging zones that can draw in people from diverse neighbourhoods. We were able to detect these pockets of homophily (and the counteracting effects of bridging hubs) because we have developed a dynamic measure of economic segregation that captures everyday socioeconomic mixing at home, work and leisure.

This new methodology for measuring exposure segregation, while an improvement over static approaches, has limitations. For example, it is difficult to ascertain how weak or strong the ties are, as we are obliged to use physical proximity as a proxy for exposure47. It is reassuring in this regard that our core results persist under stricter time, distance and tie-strength thresholds (Supplementary Table 6 and Supplementary Figs. 5–8), and are associated with key downstream outcomes (Extended Data Fig. 2 and Supplementary Fig. 24). It is likewise important to locate and analyse supplementary datasets that cover subpopulations (for example, subpopulations of homeless individuals) that are not as well represented in our dataset48. The available evidence indicates that our sample is well balanced on many key racial, economic and demographic variables49, but mobile phone market penetration is still not complete, and GPS ping data are unevenly distributed by time. Finally, our measure of SES relies on housing consumption, an indicator that does not exhaust the concept of SES. It is again reassuring that our analytical approach, which improves on conventional neighbourhood-level imputations, is robust under a range of alternative measures of SES (Supplementary Fig. 3).

This is all to suggest that dynamic segregation data are rich enough to overcome many seeming limitations. The dynamic approach that we have taken here could further be extended to examine cross-population differences in the sources of segregation and to develop a more complete toolkit of approaches for reducing segregation and improving urban design.

Methods

The Methods section is structured as follows. In the ‘Datasets’ section, we explain the datasets used in our analysis; in the ‘Data processing’ section, we explain the data processing procedures that we use to infer SES and exposures; and in the ‘Analysis’ section, we explain the analyses underlying our main results.

Datasets

SafeGraph

Our primary mobility and location data comprise GPS locations from a sample of adult smartphone users in the United States, provided by the company SafeGraph. The data are de-identified GPS location pings from smartphone applications that are collected and transmitted to SafeGraph by participating users50. As described by SafeGraph in the public documentation, SafeGraph data are collected by “partner[ing] with mobile applications that obtain opt-in consent from users to collect anonymous location data. This data is not associated with any name or email address”. SafeGraph ensures that its mobile application partners obtain consent for data to be used for commercial and research purposes, including academic publication. SafeGraph users are able to opt out of data collection at any time.

Although the sample is not random, previous work has demonstrated that SafeGraph data are geographically well balanced (that is, an approximately unbiased sample of different census tracts within each state) and well balanced along the dimensions of race, income and education49,51. Furthermore, SafeGraph data are a widely used standard in large-scale studies of human mobility across many different areas including COVID-19 modelling51, political polarization39 and consumer preference tracking52. All data provided by SafeGraph were stored on a secure server behind a firewall. Data handling and analysis was conducted in accordance with SafeGraph policies and in accordance with the guidelines of the Stanford University Institutional Review Board.

The raw data consist of 91,755,502 users and 61,730,645,084 pings from three evenly spaced months in 2017: March, July and November. Each ping consists of a latitude, longitude, timestamp, and de-identified user ID. The mean number of raw pings associated with a user is 667 and the median number of pings is 12. We applied several filters to improve the reliability of the SafeGraph data, and subsequently linked each user to an estimated rent (that is, Zillow Zestimate) using their inferred home location (that is, CoreLogic address), as described in the ‘Inferring home location’ and ‘Inferring SES’ sections.

To ensure that the locations are reliable, we excluded pings with location estimates less accurate than 100 m, as recommended by SafeGraph53. We filtered out users with fewer than 500 pings, as these are largely noise. We also filtered out users for whom we were unable to infer a home, because we rely on home rent values to measure SES. Finally, to avoid duplicate users, we removed users if more than 80% of their pings had identical latitudes, longitudes and timestamps to those of another user; this could potentially occur if, for example, a single person in the real world carries multiple mobile devices. After these initial filters, we were able to infer home locations for 12,183,523 users in the United States (50 states and Washington DC), leveraging the CoreLogic database. Of users for whom we could infer a home location, we were able to successfully link 9,576,650 to an estimated rent value through the Zillow API. The ‘Inferring home location’ and ‘Inferring SES’ sections provide full details on the use of the CoreLogic database to infer home locations and the use of the Zillow API to link these home locations to estimated rent values. The de-duplication step then reduced the number of users from 9,576,650 to 9,567,559 (that is, it removed less than 0.1% of users).
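The ping-level and user-level filters described above can be sketched on toy data. The record schema (`user`, `lat`, `lon`, `ts`, `accuracy_m`), the order in which the filters are applied, and the rule of keeping the first user of a near-identical pair are assumptions made for illustration; the home-inference and rent-linking steps are omitted. The minimum-ping threshold is lowered from 500 to 3 so that the example stays small.

```python
from collections import defaultdict

MAX_ERROR_M = 100      # drop pings whose location estimate is less accurate than this
MIN_PINGS = 500        # drop users with fewer pings than this
DUPLICATE_SHARE = 0.8  # drop a user if more than this share of pings matches another user

def filter_pings(pings, min_pings=MIN_PINGS):
    """Apply the accuracy and minimum-ping filters.

    pings: iterable of dicts with keys user, lat, lon, ts, accuracy_m
    (a hypothetical schema). Returns {user: [(lat, lon, ts), ...]}.
    """
    by_user = defaultdict(list)
    for p in pings:
        if p["accuracy_m"] <= MAX_ERROR_M:
            by_user[p["user"]].append((p["lat"], p["lon"], p["ts"]))
    return {u: ps for u, ps in by_user.items() if len(ps) >= min_pings}

def drop_duplicate_users(by_user, share=DUPLICATE_SHARE):
    """Keep the first user of each near-identical group (an assumed tie-break).

    Quadratic in the number of users, so suitable only for small examples.
    """
    kept = {}
    for user in sorted(by_user):
        own = set(by_user[user])
        is_duplicate = any(
            len(own & set(other)) > share * len(own) for other in kept.values()
        )
        if not is_duplicate:
            kept[user] = by_user[user]
    return kept

def make(user, offset, n, accuracy_m=10):
    """Build n toy pings for one user, one minute apart."""
    return [
        {"user": user, "lat": 40.0 + offset, "lon": -75.0, "ts": 60 * i, "accuracy_m": accuracy_m}
        for i in range(n)
    ]

toy = make("a", 0.0, 3) + make("b", 0.0, 3) + make("c", 0.5, 3) + make("d", 0.9, 2)
toy += make("c", 0.7, 1, accuracy_m=250)  # inaccurate ping, removed by the first filter
users = drop_duplicate_users(filter_pings(toy, min_pings=3))
print(sorted(users))
```

Here user d is dropped for having too few pings and user b is dropped because all of its pings coincide with those of user a.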

CoreLogic

We use the CoreLogic real estate database to link users to home locations54. The database provides information covering over 99% of US residential properties (145 million properties), over 99% of commercial real estate properties (26 million properties) and 100% of US county, municipal and special tax districts (3,141 counties). The CoreLogic real estate database includes the latitude and longitude of each home, in addition to its full address: street name, number, county, state and zip code.

Zillow

We used the Zillow property database to query for rent estimates55 (our primary measure of SES). The Zillow database contains rent data (rent Zestimate) for 119 million US residential properties. We were able to determine a rent Zestimate, the primary measure of SES used in our analysis, for 9,576,650 out of 12,183,523 inferred SafeGraph user homes (a 79% hit rate).

SafeGraph Places

Our database of US business establishment boundaries and annotations comes from the SafeGraph Places database50, which indexes the names, addresses, categories, latitudes, longitudes and geographical boundary polygons of 5.5 million POIs in the United States. SafeGraph includes the North American Industry Classification System (NAICS) category of each POI, which is the standard taxonomy used by the Federal government to classify business establishments56. For example, the NAICS code 722511 indicates full-service restaurants. We identified relevant leisure sites using the prefix 7, which includes arts, entertainment, recreation, accommodation and food services, and supplemented these POIs with the prefix 8131 to include religious organizations such as churches. We restricted our analysis of leisure sites to the most frequently visited POI categories within these NAICS code prefixes (Fig. 2c): full-service restaurants, snack bars, limited-service restaurants, stadiums and so on. SafeGraph Places also includes higher-level ‘parent’ POI polygons that encapsulate smaller POIs. Specifically, we identified hubs with the NAICS code 531120 (lessors of non-residential real estate), which we find in practice corresponds to commercial centres such as shopping centres, plazas, boardwalks and other clusters of businesses. We provide illustrative examples of such hubs in Supplementary Figs. 16–18.
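The NAICS-based selection of leisure sites and hubs described above amounts to prefix and exact-code matching. The sketch below assumes that POIs are available as (name, NAICS code) pairs with codes stored as strings; the POI names are made up, and the further restriction to the most frequently visited categories is not shown.

```python
LEISURE_PREFIXES = ("7", "8131")  # arts, recreation, accommodation, food; religious organizations
HUB_CODE = "531120"               # lessors of non-residential real estate

def is_leisure(naics_code):
    """True if the NAICS code starts with one of the leisure prefixes."""
    return str(naics_code).startswith(LEISURE_PREFIXES)

def is_hub(naics_code):
    """True if the NAICS code is the commercial-centre (hub) code."""
    return str(naics_code) == HUB_CODE

# Hypothetical POIs; names are illustrative only.
pois = [
    ("Example Trattoria", "722511"),      # full-service restaurant
    ("Example Chapel", "813110"),         # religious organization
    ("Example Plaza", "531120"),          # commercial centre
    ("Example Hardware Store", "444130"), # neither leisure nor hub
]

leisure = [name for name, code in pois if is_leisure(code)]
hubs = [name for name, code in pois if is_hub(code)]
print(leisure, hubs)
```

Storing codes as strings avoids accidental numeric comparisons and makes prefix matching straightforward.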

US census

We extracted demographic and geographical features from the five-year 2013–2017 American Community Survey57. This enables us, as described below, to link mobile phone locations to geographical areas including census block group (CBG), census tract and MSA, as well as to infer demographic features corresponding to those geographical areas, including median household income.

A CBG is a statistical division of a census tract. CBGs are generally defined to contain between 600 and 3,000 people. A CBG can be identified at the national level by the unique combination of state, county, tract and block group codes.

A census tract is a statistical subdivision of a county containing an average of around 4,000 inhabitants. Census tracts range in population from 1,200 to 8,000 inhabitants. Each tract is identified by a unique numeric code within a county. A tract can be identified at the national level by the unique combination of state, county and tract codes.

Census tracts and block groups typically cover a contiguous geographical area, although this is not a constraint on the shape of the tract or block group. Census tract and block group boundaries generally persist over time so that temporal and geographical analysis is possible across multiple censuses.

Most census tracts and CBGs are delineated by inhabitants who participate in the Census Bureau’s Participant Statistical Areas Program. The Census Bureau determines the boundaries of the remaining tracts and block groups when delineation by inhabitants, local governments or regional organizations is not possible58.

An MSA is a US geographical area defined by the Office of Management and Budget (OMB) and is one of two types of Core Based Statistical Area (CBSA). A CBSA comprises a county or counties associated with a core urbanized area with a population of at least 10,000 inhabitants and adjacent counties with a high degree of social and economic integration with the core area. Social and economic integration is measured through commuting ties between the adjacent counties and the core. A micropolitan statistical area is a CBSA of which the core has a population of between 10,000 and 50,000; an MSA is a CBSA of which the core has a population of over 50,000. In our primary analysis, we follow a previous study31 and focus on MSAs, excluding micropolitan statistical areas owing to data sparsity concerns.

TIGER

Road and transportation feature annotations come from the census-curated Topologically Integrated Geographic Encoding and Referencing system (TIGER) database59. The TIGER databases are an extract of selected geographical and cartographic information from the US Census Bureau’s Master Address File/Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). We used the MAF/TIGER Feature Class Code (MTFCC) from the TIGER Roads and TIGER Rails databases to identify roads and railways. TIGER data are in the format of Shapefiles, which provide the exact boundaries of roads and railways as latitude/longitude coordinates.

Data processing

For each individual, we first infer their home location and then estimate their SES on the basis of their home rent value (see the ‘Inferring home location’ and ‘Inferring SES’ sections). We then calculate all exposures between individuals (see the ‘Constructing the exposure network’ section) and annotate each exposure according to the location in which it occurred. Specifically, we annotate whether the exposure took place in both individuals’ home tract, in one individual’s home tract, or in neither home tract. We also determine whether it occurred inside a fine-grained POI, such as a specific restaurant, as well as whether it took place within a parent POI, such as a hub (see the ‘Annotating exposures’ section). Details on all inferences and exposure calculations are provided below.

Inferring home location

We infer a user’s home latitude and longitude from the coordinates of their pings during local night-time (and early-morning) hours, based on best practices established by SafeGraph60. We first remove users with fewer than 500 pings to ensure that we have enough data to reliably infer home locations. We then interpolate each person’s location for each 1 h window (for example, 18:00–19:00, 19:00–20:00 and 20:00–21:00) using linear interpolation of latitudes and longitudes, which yields a time series at a constant time resolution. We perform interpolation using the interpolate module of the SciPy library. We retain hours between 18:00 and 09:00 during which the person moves less than 50 m until the next hour; these stationary night-time (and early-morning) pings represent cases in which the person is more likely to be at home. We retain users who have such stationary pings on at least three dates and with at least 60% of pings within a 50 m radius. Finally, we infer home latitude and longitude as the median latitude and longitude of these stationary pings (after removing outliers outside the 50 m radius). We choose the thresholds above because they yield a good compromise between inferring the home location of most users and inferring home locations with high confidence. Overall, we are able to infer home locations for 70% of users with more than 500 pings, and these locations are inferred with high confidence; 89% of stationary night-time observations are within 50 m of the inferred home latitude and longitude. Our key findings are robust to the exact choice of threshold for home identification (Supplementary Fig. 62).
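The filtering steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used in the study: the function names, the input layout (a time-sorted list of hourly interpolated positions) and the use of the haversine great-circle distance are assumptions made for the example, and the 500-ping filter and the interpolation step are taken as already done.

```python
import math
from datetime import timedelta
from statistics import median

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (assumed distance function)."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def infer_home(hourly, radius_m=50.0, min_dates=3, min_frac=0.6):
    """hourly: time-sorted list of (datetime, lat, lon), one entry per hour.

    Returns (lat, lon) of the inferred home, or None if the filters fail.
    """
    # Keep night-time hours (18:00-09:00) with < radius_m movement until the next hour.
    stationary = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(hourly, hourly[1:]):
        is_night = t0.hour >= 18 or t0.hour < 9
        if (is_night and t1 - t0 == timedelta(hours=1)
                and haversine_m(la0, lo0, la1, lo1) < radius_m):
            stationary.append((t0, la0, lo0))
    # Require stationary pings on at least min_dates distinct dates.
    if len({t.date() for t, _, _ in stationary}) < min_dates:
        return None
    c_lat = median(la for _, la, _ in stationary)
    c_lon = median(lo for _, _, lo in stationary)
    # Require that most stationary pings fall within radius_m of the median point.
    near = [(la, lo) for _, la, lo in stationary
            if haversine_m(la, lo, c_lat, c_lon) <= radius_m]
    if len(near) < min_frac * len(stationary):
        return None
    # Final estimate: median after dropping outliers beyond radius_m.
    return median(la for la, _ in near), median(lo for _, lo in near)
```

Returning None for users who fail either filter mirrors the trade-off described above between coverage and confidence.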

Inferring SES

Having inferred each user’s home location, we link their latitude and longitude to a large-scale housing database (Zillow) to infer the estimated rent of each individual’s home, which we use as a measure of SES. We do this in two steps. First, we link the inferred user’s home latitude and longitude to the CoreLogic property database (see the ‘CoreLogic’ section), a comprehensive database of properties in the United States, by taking the closest CoreLogic residential property (single family residence, condominium, duplex or apartment) to the user’s inferred home latitude and longitude. Second, we use the CoreLogic address to query the Zillow database (see the ‘Zillow’ section), which provides an estimated home rent and price for each individual (the Zillow database does not allow for queries using raw latitude and longitude, which is why it is necessary to leverage CoreLogic to obtain an address for each user). We use Zillow’s estimated rent for the user’s home as our main measure of SES.

We apply several quality-control filters to ensure that the final set of individuals that we use in our main analyses have reliably inferred home locations and SES: (1) we remove a small number of users whose median latitude and longitude at home are identical to another user’s, as we empirically observe that these people have unusual ping patterns; (2) we remove users for whom we are lacking a Zillow rent estimate, as this constitutes our primary SES measure; (3) we winsorize Zillow rent estimates that are greater than US$20,000 to avoid spurious results from a small number of outliers; (4) we remove a small number of users who are missing census demographic information for their inferred home location; (5) we remove users whose Zillow home location is further than 100 m from their CoreLogic home location, or whose CoreLogic home location is further than 100 m from their median latitude and longitude at home; (6) we remove a small number of users in single family residences who are mapped to the exact same single family residence as more than 10 other people, as this may indicate a data error in the Zillow database.

The set of users who pass these filters constitutes our final analysis set of 9,567,559 users. We confirm that the census demographic statistics of these users’ inferred home locations are similar to those of the US population in terms of income, age, sex and race.

Any individual quantitative measure provides only a partial picture of a person’s SES. Recognizing this, we conduct robustness checks in which, rather than using the Zillow estimated rent of the user’s home as a proxy for SES, we use (1) the median CBG household income in that area; and (2) the percentile-scored rent of the home, to account for long-tailed rent distributions. Our main results are robust to using these alternative measures of SES (Supplementary Fig. 3).

Constructing the exposure network

We constructed a fine-grained, dynamic exposure network \({\mathcal{G}}\) among all 9,567,559 individuals across 382 MSAs and 2,829 counties, which is represented as an undirected graph \({\mathcal{G}}=({\mathcal{V}},{\mathcal{E}})\) with time-varying edges. Each node \({v}_{i}\in {\mathcal{V}}\) in the graph represents one of the n = 9,567,559 individuals in our study, such that the set of nodes is \({\mathcal{V}}=\{{v}_{1},{v}_{2},\ldots ,{v}_{n}\}\). Each node \({v}_{i}\) has a single attribute \({x}_{i}\), representing the inferred SES (estimated rent) of the individual.

Individuals \({v}_{i}\) and \({v}_{j}\) are connected by one edge \({e}_{i,j,k}\in {\mathcal{E}}\) per exposure, with k indicating the kth exposure between them. Each edge \({e}_{i,j,k}\) has three attributes, \({t}_{i,j,k}\), \({{\rm{lat}}}_{i,j,k}\) and \({{\rm{lon}}}_{i,j,k}\), indicating the timestamp, latitude and longitude of the exposure, respectively. We now explain how each of the exposure edges of the network is calculated.

We define an exposure to occur when two users have GPS pings that are close (according to fixed thresholds) in both space and time. Specifically, if user \({v}_{i}\) has a GPS ping with \(({t}_{i},{{\rm{lat}}}_{i},{{\rm{lon}}}_{i})\) (indicating the timestamp, latitude and longitude of the ping, respectively), and user \({v}_{j}\) has a GPS ping with \(({t}_{j},{{\rm{lat}}}_{j},{{\rm{lon}}}_{j})\), then the users are said to have crossed paths if \(|{t}_{i}-{t}_{j}| < T\) and \({\rm{distance}}(({{\rm{lat}}}_{i},{{\rm{lon}}}_{i}),({{\rm{lat}}}_{j},{{\rm{lon}}}_{j})) < D\), where T represents the time threshold (that is, the maximum time that the two pings can be apart to count as an exposure) and D represents the distance threshold (that is, the maximum physical distance that the two pings can be apart to count as an exposure). We filter for both distance and time simultaneously to ensure that our exposure network includes only pairs of users who are likely to have crossed paths with each other. This high-resolution definition of exposure contrasts with other methods that consider all individuals that visit the same location, irrespective of time30, to have an equal likelihood of exposure, an unrealistic assumption because the SES of visitors to a given location can vary significantly by time (Supplementary Fig. 63)61. This fine-grained measure of proximity with respect to both time and space is the key advance that enables our findings (Supplementary Fig. 66). We use a threshold T of 5 min, which is a stringent threshold given that each person has, on average, approximately one ping per hour during the daytime. We use a distance threshold D of 50 m, because the cosmopolitan mixing hypothesis pertains to visual exposure1,2,3 and because previous work shows that even exposure to individuals from afar is linked to long-term outcomes19. Our network is validated by correlation to external, gold-standard datasets (Extended Data Fig. 2). Furthermore, we show through a series of robustness checks that our key results in Figs. 1–3 are highly robust to varying thresholds (that is, 1 min or 2 min time thresholds, as well as 10 m or 25 m distance thresholds), as well as to additional criteria that increase the tie strength (that is, requiring prolonged exposures, or multiple exposures on unique days). Under all of these different definitions of exposure, our main findings remain consistent (Supplementary Table 6 and Supplementary Figs. 2–8).
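As a minimal sketch of the exposure criterion, the following Python function tests a single pair of pings against both thresholds. The ping layout (timestamp in seconds, latitude, longitude), the function names and the choice of the haversine great-circle distance are assumptions made for illustration; the text does not state which distance function was used.

```python
import math

T_SECONDS = 5 * 60   # time threshold T (5 min)
D_METRES = 50.0      # distance threshold D (50 m)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (assumed distance function)."""
    r = 6_371_000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_exposure(ping_a, ping_b, t_max=T_SECONDS, d_max=D_METRES):
    """Each ping is (t_seconds, lat, lon). True if close in BOTH time and space."""
    (ta, lat_a, lon_a), (tb, lat_b, lon_b) = ping_a, ping_b
    close_in_time = abs(ta - tb) < t_max
    close_in_space = haversine_m(lat_a, lon_a, lat_b, lon_b) < d_max
    return close_in_time and close_in_space
```

Passing smaller values for `t_max` and `d_max` (for example, 60 s and 10 m) corresponds to the stricter thresholds used in the robustness checks.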

To efficiently calculate exposures among all users, we implement our exposure threshold using a k-dimensional (k-d) tree62, a data structure that enables one to efficiently identify all pairs of points within a given distance of each other in a k-dimensional space. In total, we identify 1,570,782,460 exposures. The timestamp \({t}_{i,j,k}\) of the exposure is the earlier of the two ping timestamps \(({t}_{i},{t}_{j})\). The location \(({{\rm{lat}}}_{i,j,k},{{\rm{lon}}}_{i,j,k})\) of the exposure is the average latitude and longitude of the pair of pings belonging to the two individuals, \(({{\rm{lat}}}_{i},{{\rm{lat}}}_{j})\) and \(({{\rm{lon}}}_{i},{{\rm{lon}}}_{j})\). We parallelize our exposure detection system across multiple cores, enabling us to construct the network on a single supercomputer (with 12 TB RAM and 288 cores) in under a week. By contrast, a naive implementation (without k-d trees or parallelization) would require on the order of 10 years of computing time. The key challenge is accounting for proximity in time and space simultaneously, which results in \(O({n}^{2})\) time complexity for a naive implementation (where n is the number of pings in the dataset), in contrast to previous work that is time agnostic and can therefore compute exposures using geohashes in \(O(n)\) time31,38.
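One way to see how a k-d tree can handle time and space together is the toy sketch below. It is not the implementation used in the study, whose details are not given here. It assumes that ping coordinates have already been projected to metres, rescales time so that T maps onto D, runs a box query on a three-dimensional tree, and then applies the exact thresholds. All names are hypothetical.

```python
import math

def build_kdtree(items, depth=0):
    """items: list of (coords, payload). Returns (item, axis, left, right) or None."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return (items[mid], axis,
            build_kdtree(items[:mid], depth + 1),
            build_kdtree(items[mid + 1:], depth + 1))

def query_box(node, centre, r, found):
    """Append every item whose coordinates all lie within r of centre."""
    if node is None:
        return
    item, axis, left, right = node
    coords = item[0]
    if all(abs(c - q) <= r for c, q in zip(coords, centre)):
        found.append(item)
    # Descend only into subtrees that can intersect the query box.
    if centre[axis] - r <= coords[axis]:
        query_box(left, centre, r, found)
    if centre[axis] + r >= coords[axis]:
        query_box(right, centre, r, found)

def find_exposures(pings, t_max=300.0, d_max=50.0):
    """pings: list of (user, t_seconds, x_m, y_m). Returns index pairs (a, b), a < b."""
    scale = d_max / t_max  # rescale time so that t_max maps onto d_max
    items = [((x, y, t * scale), idx) for idx, (_, t, x, y) in enumerate(pings)]
    tree = build_kdtree(items)
    pairs = set()
    for coords, a in items:
        found = []
        query_box(tree, coords, d_max, found)
        for _, b in found:
            if a < b and pings[a][0] != pings[b][0]:
                dt = abs(pings[a][1] - pings[b][1])
                dist = math.hypot(pings[a][2] - pings[b][2], pings[a][3] - pings[b][3])
                if dt < t_max and dist < d_max:  # exact thresholds after the box query
                    pairs.add((a, b))
    return pairs
```

The box query only narrows the candidate set; the final check applies the exact time and distance thresholds, so the result matches a brute-force comparison of all pairs.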

Annotating exposures

Exposures are annotated to indicate whether they occurred at or near POIs, for example, at a user’s home, or within a restaurant. Annotations are not mutually exclusive in that an exposure may be simultaneously tagged as having occurred near multiple POIs from multiple data sources. We describe the specific annotations below.

We annotate a user’s exposure as having occurred at their home if it occurs within 50 m of the user’s home location. An exposure is annotated with a TIGER road/railway if it occurs within 20 m of that feature. An exposure is annotated as having occurred within a SafeGraph Places POI if the exposure occurs within the polygon defined for the POI. Polygons are provided by the SafeGraph Places database for both fine-grained POIs (for example, individual restaurants) and parent POIs (such as hubs). We focus our analysis of fine-grained POIs (Fig. 2c and Extended Data Fig. 3) on the most visited fine-grained POIs, such as full-service restaurants, snack bars, limited-service restaurants (such as fast food) and stadiums (a full list is shown in Fig. 2c). These categories approximately align with those used in previous work31.
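The polygon-based annotation can be illustrated with a standard even-odd ray-casting test. The sketch below is an assumption about how such a check might look, not a description of the study's code; it treats longitude and latitude as planar coordinates, which is a reasonable approximation only for small polygons, and the POI names are made up.

```python
def point_in_polygon(lon, lat, polygon):
    """Even-odd ray casting. polygon: list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does the edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def annotate(exposure, pois):
    """exposure: (lat, lon); pois: dict of name -> polygon.

    Returns every POI containing the exposure; annotations are not
    mutually exclusive, so a fine-grained POI and its parent can both match.
    """
    lat, lon = exposure
    return sorted(name for name, poly in pois.items()
                  if point_in_polygon(lon, lat, poly))
```

For example, an exposure inside a restaurant that sits within a shopping centre would be tagged with both polygons.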

Analysis

Exposure segregation

We define the exposure segregation of a specified geographical area (that is, MSA or county) as the Pearson correlation between the SES of an individual residing in that geographical area and the mean SES of those who they encounter.

$$\text{Exposure segregation}={\rm{Corr}}({\rm{SES}},{\overline{{\rm{SES}}}}_{{\rm{exposures}}})=\frac{{\rm{cov}}({\rm{SES}},{\overline{{\rm{SES}}}}_{{\rm{exposures}}})}{{\sigma }_{{\rm{SES}}}\,{\sigma }_{{\overline{{\rm{SES}}}}_{{\rm{exposures}}}}}$$

Our metric captures the extent to which an individual’s SES predicts the SES of their immediate exposure network. Thus, in a perfectly integrated area in which individuals encounter others randomly regardless of SES, exposure segregation would equal 0.0. In a perfectly segregated area in which individuals encounter only those of the exact same SES, exposure segregation would equal 1.0. Our primary metric does not upweight repeated exposures to the same person (to avoid overly weighting strong ties such as housemates), although our key findings are robust to doing so (Supplementary Fig. 2).

Exposure segregation nests a classic definition of residential segregation, the neighbourhood sorting index7, which is equivalent to the Pearson correlation between each person’s SES and the mean SES in their census tract. The neighbourhood sorting index is widely used because it can be calculated directly from census data on the SES of people living in each tract. However, a fundamental limitation of the neighbourhood sorting index as a measure of segregation is that the census tract in which people live is a weak proxy for who they encounter. Census tracts are static and artificial boundaries that do not capture socioeconomic mixing as individuals move throughout the cityscape during work, leisure time and schooling.

We design our exposure segregation metric such that it accommodates any exposure network, and the neighbourhood sorting index is therefore a special case of our metric. Specifically, if exposure segregation is computed for a synthetic exposure network under the unrealistic assumptions that (1) people are exposed only to those in their home census tract; and (2) exposures occur uniformly at random, then it is equivalent to the neighbourhood sorting index (Supplementary Fig. 19). However, constructing such a synthetic exposure network from census tracts has limited applicability to measuring segregation in the real world, because people may also be exposed to more heterogeneous populations as they visit other census tracts for work, leisure or other activities, a phenomenon that we refer to as the visitor effect. Furthermore, even within the home tract, individuals may seek out people of similar SES; we refer to this as the homophily effect. We therefore instead leverage dynamic mobility data from mobile phones to capture the extent of contact between diverse individuals throughout the day, and apply our metric, exposure segregation, to this real-world exposure network. Our analyses reveal that our measure of exposure successfully captures both the visitor effect and the homophily effect (Fig. 2a). An advantage of our definition of exposure segregation is that it allows for direct comparability to the neighbourhood sorting index because both measures are of the same underlying statistical quantity, but differ in their definition of the exposure network. Our results indicate that this choice of exposure network matters; exposure segregation is a stronger predictor of upward economic mobility than the neighbourhood sorting index (Extended Data Fig. 2), and the two metrics are shown to be distinct (Supplementary Fig. 20).

To calculate the exposure segregation of a specified geographical area (that is, MSA or county), we first select the set of all individuals who reside in the area, \({{\mathcal{V}}}_{A}\subset {\mathcal{V}}\). For example, to calculate exposure segregation for Napa, California (Fig. 1c (top)), \({{\mathcal{V}}}_{A}\) is the set of 3,707 users with home locations inside the geographical boundary of Napa, CA. Subsequently, for each resident of the area \({v}_{i}\in {{\mathcal{V}}}_{A}\), we query the population exposure network (\({\mathcal{G}}=({\mathcal{V}},{\mathcal{E}})\)) for the SES of the set of individuals with whom they cross paths, \({{\mathcal{Y}}}_{i}=\{{x}_{j}:{v}_{j}\in {\mathcal{V}},\,{e}_{i,j,k}\in {\mathcal{E}}\}\). We then aim to estimate the Pearson correlation between the SES of each individual, \({x}_{i}\), and the (unweighted) mean SES of those to whom they are exposed across all path crossings, \({y}_{i}={\rm{mean}}({{\mathcal{Y}}}_{i})\).
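The quantity being estimated can be sketched directly from an edge list. The Python below computes the plain sample correlation between each resident's SES and the unweighted mean SES of their distinct contacts; it is the simple sample version of the metric, not the corrected estimator used in the study, and the data layout and function names are assumptions for illustration.

```python
import math
from collections import defaultdict

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def exposure_segregation(ses, edges, residents):
    """ses: dict user -> SES; edges: iterable of (i, j) exposures, repeats allowed;
    residents: users whose home is in the area of interest."""
    contacts = defaultdict(set)  # sets, so repeated exposures are not upweighted
    for i, j in edges:
        contacts[i].add(j)
        contacts[j].add(i)
    xs, ys = [], []
    for v in residents:
        if contacts[v]:
            xs.append(ses[v])
            ys.append(sum(ses[u] for u in contacts[v]) / len(contacts[v]))
    return pearson(xs, ys)
```

Contacts are drawn from the whole network, whereas the correlation is taken only over residents of the area, matching the distinction between \({\mathcal{V}}\) and \({{\mathcal{V}}}_{A}\) above.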

Estimating exposure segregation

Here we first motivate why a ‘naive’ approach to estimating exposure segregation through a sample Pearson correlation on the observed exposure network is problematic (resulting in downwardly biased estimates of exposure segregation). We then elaborate on how we leverage a linear mixed effects model to compute a corrected Pearson correlation, enabling us to obtain unbiased estimates of exposure segregation even in areas where data are sparse.

A naive approach to estimate exposure segregation would be to first compute the observed sample mean SES of individuals who each person is exposed to. Exposure segregation could then be estimated using a sample Pearson correlation:

$${r}_{xy}=\frac{{\sum }_{i=1}^{n}({x}_{i}-\overline{x})({y}_{i}-\overline{y})}{\sqrt{{\sum }_{i=1}^{n}{({x}_{i}-\overline{x})}^{2}}\sqrt{{\sum }_{i=1}^{n}{({y}_{i}-\overline{y})}^{2}}}$$

between an individual’s SES (\({x}_{i}\)) and the sample mean SES of those they are exposed to (\({y}_{i}\)). This approach is problematic because naively computing such a correlation from limited data (in counties or MSAs with small populations) will result in estimates that are biased downward. To illustrate why, imagine that we compute the correlation between a person’s SES and the ‘true’ mean SES of the people who they are exposed to. Now, we add noise to the mean SES values, representing the noisy mean estimates obtained from limited data. As the noise increases, the correlation decreases. Thus, because estimates of each person’s mean exposure SES will be noisier in geographical areas with less data, naive estimates of the Pearson correlation will be biased downward in these areas.
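This attenuation can be made explicit under simplifying assumptions that are introduced here purely for illustration. Let \({\mu }_{i}\) denote the true mean SES of person i’s exposures, and suppose that the observed mean is \({y}_{i}={\mu }_{i}+{\eta }_{i}\), where the sampling noise \({\eta }_{i}\) has variance \({\sigma }_{\eta }^{2}\) and is uncorrelated with both \({x}_{i}\) and \({\mu }_{i}\). Then \({\rm{cov}}({x}_{i},{y}_{i})={\rm{cov}}({x}_{i},{\mu }_{i})\) and \({\rm{Var}}({y}_{i})={\rm{Var}}({\mu }_{i})+{\sigma }_{\eta }^{2}\), so that

$${\rm{corr}}({x}_{i},{y}_{i})=\frac{{\rm{cov}}({x}_{i},{\mu }_{i})}{\sqrt{{\rm{Var}}({x}_{i})\,({\rm{Var}}({\mu }_{i})+{\sigma }_{\eta }^{2})}}={\rm{corr}}({x}_{i},{\mu }_{i})\,\sqrt{\frac{{\rm{Var}}({\mu }_{i})}{{\rm{Var}}({\mu }_{i})+{\sigma }_{\eta }^{2}}}$$

The final factor is smaller than 1 whenever \({\sigma }_{\eta }^{2} > 0\) and shrinks as the noise grows, so the observed correlation is pulled towards zero in areas where each person’s mean is estimated from few exposures.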

We instead compute a corrected Pearson correlation, using a linear mixed effects model to accurately estimate exposure segregation: the correlation between a person’s SES and the mean SES of the people they are exposed to. Our linear mixed effects model is an unbiased estimator of the Pearson correlation. We compare the unbiased estimates from our linear mixed-effects model to the downwardly biased sample Pearson correlation estimates in Extended Data Fig. 1.

Our mixed model represents the distribution of datapoints \(({x}_{i},{y}_{ij})\) through the following equation:

$${y}_{ij}=a{x}_{i}+b+{{\epsilon }}_{i}^{(1)}+{{\epsilon }}_{ij}^{(2)},$$

where \({x}_{i}\) is the SES of person i, \({y}_{ij}\) is the SES of person j who was exposed to person i, a and b are model parameters, \({{\epsilon }}_{i}^{(1)}\) is a person-specific noise term and \({{\epsilon }}_{ij}^{(2)}\) is the noise for each datapoint. Above, the true mean SES of the exposure set for each person is modelled as \(a{x}_{i}+b+{{\epsilon }}_{i}^{(1)}\). Individual exposures \({y}_{ij}\) are then modelled as noisy draws from a distribution centred at this true mean. We assume that \({x}_{i}\) has a variance of 1 (through data preprocessing) and that \({x}_{i}\) is uncorrelated with \({{\epsilon }}_{i}^{(1)}\). The Pearson correlation coefficient between person i’s SES and the mean SES of the people they were exposed to is then computed as follows:

$$\begin{array}{l}{\rm{corr}}({x}_{i},a{x}_{i}+b+{{\epsilon }}_{i}^{(1)})\,=\,{\rm{corr}}({x}_{i},a{x}_{i}+{{\epsilon }}_{i}^{(1)})\\ \,\,\,\,\,=\,\frac{{\rm{cov}}({x}_{i},a{x}_{i}+{{\epsilon }}_{i}^{(1)})}{\sqrt{{\rm{Var}}({x}_{i}){\rm{Var}}(a{x}_{i}+{{\epsilon }}_{i}^{(1)})}}\\ \,\,\,\,\,=\,\frac{{\rm{cov}}({x}_{i},a{x}_{i})}{\sqrt{{\rm{Var}}(a{x}_{i}+{{\epsilon }}_{i}^{(1)})}}\\ \,\,\,\,\,=\,\frac{a}{\sqrt{{a}^{2}+{\rm{Var}}({{\epsilon }}_{i}^{(1)})}}\end{array}$$

We estimate a and \({\rm{Var}}\left({{\epsilon }}_{i}^{(1)}\right)\) by fitting the mixed model using the R lme4 package, optimizing the restricted maximum-likelihood (REML) objective.
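To illustrate why the correction matters, the following simulation draws data from the mixed model with known parameters and compares the naive sample correlation with the corrected value \(a/\sqrt{{a}^{2}+{\rm{Var}}({{\epsilon }}_{i}^{(1)})}\). Note that this sketch does not fit the model by REML with lme4: as a stand-in, it uses a simple method-of-moments (ANOVA-type) estimate of the variance components, and it assumes a balanced design with the same number of exposures m for every person. All function names and parameter values are illustrative assumptions.

```python
import math
import random

def simulate(n_people, m, a, var_person, var_obs, rng):
    """Draw (x_i, [y_i1, ..., y_im]) from y_ij = a*x_i + eps1_i + eps2_ij (b = 0)."""
    data = []
    for _ in range(n_people):
        x = rng.gauss(0.0, 1.0)
        mu = a * x + rng.gauss(0.0, math.sqrt(var_person))
        data.append((x, [mu + rng.gauss(0.0, math.sqrt(var_obs)) for _ in range(m)]))
    return data

def naive_and_corrected(data):
    n, m = len(data), len(data[0][1])
    raw = [x for x, _ in data]
    mean_x = sum(raw) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in raw) / (n - 1))
    xs = [(x - mean_x) / sd_x for x in raw]      # standardize so Var(x) = 1
    means = [sum(ys) / m for _, ys in data]      # observed per-person means
    my = sum(means) / n
    sxx = sum(x * x for x in xs)
    sxy = sum(x * (y - my) for x, y in zip(xs, means))
    syy = sum((y - my) ** 2 for y in means)
    naive = sxy / math.sqrt(sxx * syy)           # attenuated sample correlation
    a_hat = sxy / sxx                            # slope of means on standardized x
    resid_var = sum((y - my - a_hat * x) ** 2 for x, y in zip(xs, means)) / (n - 2)
    within_var = sum((y - mu) ** 2
                     for (_, ys), mu in zip(data, means) for y in ys) / (n * (m - 1))
    # Residual variance of the means = Var(eps1) + Var(eps2) / m; remove the noise part.
    var_person = max(0.0, resid_var - within_var / m)
    corrected = a_hat / math.sqrt(a_hat ** 2 + var_person)
    return naive, corrected
```

With a = 0.5, Var(ε⁽¹⁾) = 0.25, Var(ε⁽²⁾) = 4 and m = 3, the target correlation is \(0.5/\sqrt{0.5}\approx 0.71\), whereas the attenuation argument above implies a naive value of about \(0.5/\sqrt{0.5+4/3}\approx 0.37\); the corrected estimate should lie much closer to the target.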

Decomposing segregation by time

Each exposure edge (ei,j,k) in our exposure network is timestamped with a time of exposure ti,j,k. This enables us to decompose our overall exposure segregation into fine-grained estimates of segregation during different hours of the day by filtering for exposures that occurred within a specific hour. In Supplementary Fig. 21, we partition estimates of segregation by 3 h windows to illustrate how segregation varies throughout the day (Supplementary Information).
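A minimal sketch of this temporal partitioning is shown below. It assumes that each edge carries a Python datetime already expressed in local time and groups edges into fixed-width windows by the hour of day; the function name and edge layout are hypothetical.

```python
from collections import defaultdict

def split_by_window(edges, width_hours=3):
    """edges: iterable of (i, j, timestamp), timestamp a local-time datetime.

    Returns a dict mapping the window's starting hour (0, 3, 6, ...) to the
    list of (i, j) exposures that occurred in that window.
    """
    buckets = defaultdict(list)
    for i, j, t in edges:
        start = (t.hour // width_hours) * width_hours
        buckets[start].append((i, j))
    return dict(buckets)
```

Exposure segregation can then be recomputed separately on the edges in each window.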

Decomposing segregation by activity

Each exposure edge \({e}_{i,j,k}\) in our exposure network occurs at a specific location \(({{\rm{lat}}}_{i,j,k},{{\rm{lon}}}_{i,j,k})\). It is therefore possible to annotate exposures by the fine-grained POI (for example, specific restaurant) in which they occurred, as well as by the higher-level parent POI (for example, shopping centre) in which that POI was located (see the ‘Annotating exposures’ section). This enables us to decompose our overall exposure segregation into fine-grained estimates of segregation by specific leisure activity. We do so by filtering the network for all exposures that occurred in a specific POI category, and recalculating exposure segregation for the MSA or county using only those exposures. In Fig. 2c, we show the variation in exposure segregation by leisure site, and further explain these variations in Extended Data Fig. 3.

Bridging index

We seek to identify a modifiable, extrinsic aspect of a city’s built environment that may reduce exposure segregation. One promising candidate is the location of a city’s highly visited POIs (that is, hubs). We define a new measure, the bridging index, which measures the extent to which a particular set of hubs (\({\mathcal{P}}\)) may facilitate the integration of individuals of diverse SES within a geographical area (that is, MSA or county). The bridging index measures the economic diversity of the groups that would encounter one another if everybody visited only their nearest hub from \({\mathcal{P}}\), based on the observation that physical proximity significantly influences which hubs individuals visit44,45,46.

The bridging index is computed through two steps (Extended Data Fig. 7):

(1) Cluster all individuals who live in an area (that is, MSA or county residents, \({{\mathcal{V}}}_{A}\)) into K clusters (\({{\mathcal{H}}}_{1},{{\mathcal{H}}}_{2},...,{{\mathcal{H}}}_{K}\)) according to the hub from \({\mathcal{P}}\) closest to their home location. K is the number of hubs in \({\mathcal{P}}\).

(2) The bridging index is computed as the weighted average of the economic diversity (that is, Gini index) of these clusters of people, relative to the area’s overall economic diversity.

$$\text{Bridging index}=\frac{\text{Within-hub economic diversity}}{\text{Overall economic diversity}}=\frac{{\sum }_{i=1}^{K}|{{\mathcal{H}}}_{i}|\times \text{Gini index}({{\mathcal{H}}}_{i})}{|{{\mathcal{V}}}_{A}|\times \text{Gini index}({{\mathcal{V}}}_{A})}$$

We illustrate the intuition for our bridging index and how it captures the relationship between home and hub locations in Extended Data Fig. 8. A bridging index of 1.0 indicates that, if everybody visits their nearest hub, each person will encounter a set of people as economically diverse as the overall city they reside in. Thus, a bridging index of 1.0 signifies perfect bridging, that is, even if individuals live in segregated neighbourhoods, hubs are located such that individuals must leave their neighbourhoods and encounter diverse others. On the other hand, a bridging index of 0.0 signifies the opposite extreme; a city with a bridging index of 0.0 is one in which, if everybody visits the nearest hub, each person will encounter only people of the exact same SES.

The economic diversity of each hub \({{\mathcal{H}}}_{i}\) is quantified using the Gini index: \({\rm{G}}{\rm{i}}{\rm{n}}{\rm{i}}\,{\rm{i}}{\rm{n}}{\rm{d}}{\rm{e}}{\rm{x}}({{\mathcal{H}}}_{i})\), a well-established measure of economic statistical dispersion63 (Extended Data Fig. 7c), although results are robust to choice of economic diversity measure such as using variance instead of Gini index (Supplementary Fig. 14). The bridging index normalizes to the baseline economic diversity observed in the city, enabling direct comparisons between cities.
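The two steps of the bridging index can be sketched as follows. This is an illustrative implementation under stated assumptions, not the study's code: home and hub locations are taken to be planar coordinates in metres, the nearest hub is chosen by straight-line distance, and the Gini index is computed with the standard sorted-rank formula. Function names are hypothetical.

```python
import math

def gini(values):
    """Gini index via the sorted-rank formula; 0.0 when all values are equal."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def bridging_index(residents, hubs):
    """residents: list of (x_m, y_m, ses); hubs: list of (x_m, y_m).

    Assumes the area's overall Gini index is non-zero.
    """
    # Step 1: assign each resident to the hub closest to their home.
    clusters = [[] for _ in hubs]
    for x, y, ses in residents:
        nearest = min(range(len(hubs)),
                      key=lambda k: math.hypot(x - hubs[k][0], y - hubs[k][1]))
        clusters[nearest].append(ses)
    # Step 2: size-weighted within-cluster Gini, relative to the overall Gini.
    all_ses = [ses for _, _, ses in residents]
    within = sum(len(c) * gini(c) for c in clusters)
    return within / (len(all_ses) * gini(all_ses))
```

In a toy city with low-SES homes in the west and high-SES homes in the east, hubs placed inside each neighbourhood give an index of 0, whereas hubs placed between the neighbourhoods, so that each draws from both, give an index of 1.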

In our primary analysis, we identify hubs through commercial centres (such as shopping centres and plazas, which are higher-level clusters of individual POIs) because they are associated with a high density of exposures. Specifically, the majority (56.9%) of exposures happen inside of or within 1 km of a commercial centre even though only 2.5% of the land area of MSAs is within 1 km of a commercial centre. We therefore compute our bridging index using the set \({\mathcal{P}}\) of all commercial centres within each MSA. We find that our bridging index strongly predicts exposure segregation (Spearman correlation = −0.78; Fig. 3d). The 10 MSAs with the highest bridging index are 53.1% less segregated than the 10 MSAs with the lowest bridging index. The bridging index predicts segregation more accurately than population size, SES inequality, the neighbourhood sorting index and racial demographics, and is significantly associated with segregation (\(P < {10}^{-4}\)) after controlling for all aforementioned variables (Extended Data Tables 2 and 3).

Hypothesis testing

Unless otherwise noted, hypothesis tests were conducted and CIs were computed using a bootstrap with 10,000 replications64. Steiger’s Z-test was used to compare different predictors of segregation indices, and hypothesis tests for Spearman correlation coefficients were computed using two-sided Student’s t-tests65,66,67. P values were not adjusted for multiple comparisons.
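For readers unfamiliar with the bootstrap, a generic percentile-interval sketch is given below. The text does not specify which bootstrap variant or resampling unit was used, so the percentile method, the resampling of individual observations and the function name are all assumptions made for illustration.

```python
import random

def bootstrap_ci(data, statistic, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for statistic(data) (assumed variant).

    data: list of observations; statistic: function mapping a list to a number.
    """
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement n_boot times and recompute the statistic.
    stats = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

Here `statistic` could be any summary of the resampled units, such as a mean or a correlation coefficient.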

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.