Behavioral changes during the COVID-19 pandemic decreased income diversity of urban encounters

Diversity of physical encounters in urban environments is known to spur economic productivity while also fostering social capital. However, mobility restrictions during the pandemic have forced people to reduce urban encounters, raising questions about the social implications of behavioral changes. In this paper, we study how individual income diversity of urban encounters changed during the pandemic, using a large-scale, privacy-enhanced mobility dataset of more than one million anonymized mobile phone users in Boston, Dallas, Los Angeles, and Seattle, across three years spanning before and during the pandemic. We find that the diversity of urban encounters decreased substantially (by 15% to 30%) during the pandemic and that this decrease persisted through late 2021, even though aggregated mobility metrics have recovered to pre-pandemic levels. Counterfactual analyses show that behavioral changes, including lower willingness to explore new places, further decreased the diversity of encounters in the long term. Our findings provide implications for managing the trade-off between the stringency of COVID-19 policies and the diversity of urban encounters as we move beyond the pandemic.


Introduction
Cities are the central drivers of economic productivity and innovation owing to their capacity to foster dense social connections through physical encounters [1,2,3]. Among the various characteristics of social connections and network structures, empirical studies have shown that the diversity of networks is a significant predictor of economic growth and recovery [4,5]. Moreover, integrated community networks and the inherent social capital have been shown to be crucial for resilience to shocks such as natural hazards [6,7]. The lack of community support could lead to inequitable access to urban amenities and services, ultimately affecting social, economic, and health outcomes of people living

Results
Using a large, longitudinal dataset of individual GPS location records in four major metropolitan areas in the US across more than three years, we analyze how the experienced income diversity of urban encounters has changed during different periods of the COVID-19 pandemic. Specifically, we analyze the dynamics of income diversity of encounters at the level of individual places (points-of-interest; POIs) and individual users in cities. We seek to identify the behavioral changes behind these long-term changes, and we further unravel the sociodemographic, economic, and behavioral characteristics that explain the spatial heterogeneity in decreased diversity. Mobility data was provided by Spectus, who supplied anonymized, privacy-enhanced, and high-resolution mobile location pings for more than 1 million devices across four U.S. census core-based statistical areas (CBSAs) (Supplementary Table S2). All devices within the study opted in to anonymized data collection for research purposes under a GDPR- and CCPA-compliant framework. Post-stratification techniques were implemented to ensure the representativeness of the data across regions and income levels (Supplementary Note 2 and Supplementary Figure S6). Our second data source is a collection of 433K verified places across the four CBSAs, obtained via the Foursquare API (Supplementary Table S1). Robustness of the results on income diversity against the choice of places dataset was checked using the ReferenceUSA Business Historical Data [29] (Supplementary Note 1.3 and Supplementary Figure S2).
To analyze the income diversity of urban encounters, each anonymized individual user in the dataset was assigned a socio-economic status (SES) proxy, estimated from their home census block group (CBG) (Supplementary Note 1.4 and Supplementary Figure S3). The approximate home area of each individual user was estimated by Spectus at the granularity of CBGs using their most common location during the nighttime, between 10 p.m. and 6 a.m. every week. Individuals were then categorized into four equally sized SES quantiles according to the median household income of their home CBG. The results on decreased income diversity were robust against the number of quantile categories used (Supplementary Note 1.4 and Supplementary Figure S4). Only users who were observed more than 300 minutes each day were used for the analysis, to remove users with substantial missing data. Stays (stops) longer than 10 minutes and shorter than 4 hours were then extracted from the dataset, and each stay was spatially matched with the closest place location within 100 meters to infer stays at specific POIs. The results on income diversity were robust against the choice of data filtering parameters (Supplementary Note 1.5 and Supplementary Figure S5).

Figure 1:
A) Income diversity of encounters in places in the Boston and Cambridge area decreased during the pandemic. Diversity gradually recovers with reopening, albeit not fully compared to pre-pandemic levels. B) Aggregate mobility metrics, such as the daily number of visits per individual, daily amount of time spent at POIs, and number of visited unique POIs, had all returned to pre-pandemic levels by late 2021. C) Despite the recovery in mobility statistics, the diversity of encounters experienced at places and by individuals has decreased and has not recovered to pre-pandemic levels. D) Income diversity decreased in all major place categories both in the short term (e.g., April 2020) and in the long term (e.g., October 2021) in all cities. Grocery stores consistently experienced the least effects of the pandemic, while museums, leisure, transport, and coffee places saw the largest decrease.

[...] from different income quantiles for each individual (see Methods and Supplementary Notes 3.1 and 3.2). For places, $D_\alpha = 1$ when the place is fully diverse, with 25% of time spent by people from each of the four income quantiles, and $D_\alpha = 0$ when the place is visited by members of only a single income quantile. Similarly, to calculate the diversity of individuals $D_i$, we measure the exposure of individual $i$ to each income quantile $q$ across all the places $\alpha$ the individual has visited. The robustness of the results to the choice of diversity metric was tested (Supplementary Note 3.3 and Figure S12). The diversity measures were computed for each 2-month moving window to ensure a sufficient number of visits to POIs, and were de-seasonalized using monthly trends observed in 2019. The panels in Figure 1A show how income diversity at places around the Boston and Cambridge area substantially decreased during the first wave of the pandemic. The diversity of encounters gradually recovers, but not fully, even in October 2021, more than one and a half years after the lockdown. Similar patterns can be observed in the three other cities in the study (Supplementary Figure S7). The maps highlight the significant spatial heterogeneity of income diversity (e.g., the Back Bay area is more diverse than the suburban areas), which is further investigated in the later sections.
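The comparison against 2019 levels can be sketched as follows. This is a minimal illustration that assumes each month is simply compared with the same calendar month of 2019; the actual de-seasonalization procedure may differ. The function name `change_vs_2019` and all input values are hypothetical and are not results from the study.

```python
def change_vs_2019(diversity, relative=True):
    """Compare each (year, month) diversity value with the same month of 2019.

    diversity: dict mapping (year, month) -> diversity value in [0, 1].
    Returns a dict mapping (year, month) -> change for all non-2019 months
    that have a 2019 baseline. With relative=True the change is expressed
    as a fraction of the 2019 value; otherwise it is an absolute difference.
    """
    changes = {}
    for (year, month), value in diversity.items():
        if year == 2019:
            continue  # 2019 is the baseline, not a comparison period
        baseline = diversity.get((2019, month))
        if baseline is None:
            continue  # no same-month baseline available
        diff = value - baseline
        changes[(year, month)] = diff / baseline if relative else diff
    return changes


# Hypothetical values, for illustration only.
example = {
    (2019, 4): 0.60, (2020, 4): 0.45,
    (2019, 10): 0.50, (2021, 10): 0.40,
}
example_changes = change_vs_2019(example)
```

A negative value indicates that encounters were less diverse than in the same month of 2019.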

Diversity of urban encounters has decreased during the pandemic
The pandemic substantially changed people's mobility patterns in the early waves, as documented in previous studies using mobility data (e.g., [30]). However, several metrics indicate that individual mobility patterns had returned to pre-pandemic levels by late 2021. Figure 1B shows monthly average values of several individual mobility metrics across 2020 and 2021. Specifically, the daily number of visits per individual, daily amount of time spent at POIs per individual, average dwell time per visit, and number of unique POIs visited per individual all returned to pre-pandemic levels (annotated by horizontal dashed lines) by late 2021 in all four CBSAs. The drop in the rate and duration of visits to POIs during the earlier stages of the pandemic agrees with the findings of previous studies [31]; our analysis extends this to two years into the pandemic and confirms that activity patterns had recovered to pre-pandemic levels by October 2021. The mobility data confirm that people have resumed spending time outside their homes and visiting different POIs, as they did before the pandemic.
Given the recovery of aggregate mobility metrics, one could expect the income diversity of encounters to also return to pre-pandemic levels by late 2021. However, as shown in Figure 1C, income diversity experienced at places and by individuals is consistently lower than pre-pandemic levels in all four cities, even two years into the pandemic. Absolute values of $D_\alpha$ and $D_i$ are shown in Supplementary Figure S10. Cities experienced the largest decrease in diversity in April 2020, during the lockdown, when diversity was 30% lower than pre-pandemic levels. A second peak in the loss of diversity is observed in late 2020, which corresponds to the increase in cases due to the first SARS-CoV-2 variant. Despite the recovery of individual mobility metrics, income diversity of encounters was still around 10% lower than pre-pandemic levels in late 2021. The decrease in income diversity was robust to the choice of diversity metric, such as the entropy of income quantiles for encounters at places and for individuals (Supplementary Note 3.3 and Supplementary Figures S11 and S12).
Dissecting the place-based diversity results into POI categories, we further observe that diversity in places in Boston decreased in all POI categories both in the short term (e.g., April 2020) and in the long term (e.g., October 2021) (Figure 1D and Supplementary Figure S9). In particular, 'Museums', 'Leisure', 'Transportation', and 'Coffee' places had the largest decrease in diversity, while 'Grocery' places consistently experienced the least effects of the pandemic. This agrees with the number of visits, which follows similar patterns in Supplementary Figure S8: we see a decrease during the early stages of the pandemic and gradual recovery by late 2021 for all POI categories, with the exception of grocery stores, which experienced no reduction in the number of visits even during the first waves. This suggests that the reduction in the number of visits is indeed one of the factors that cause the decrease in diversity of encounters. In the following section, we employ a counterfactual analysis approach to further understand why the diversity of encounters has consistently decreased during the pandemic.

Behavioral changes worsened income diversity in cities
To investigate the behavioral factors that led to the consistent decrease in income diversity experienced at places and by individuals, we consider three possible hierarchical levels of changes in the behavior of individuals due to the pandemic. As illustrated in Figure 2A, the pandemic led, especially at its beginning, to (i) a reduction in the total amount of time spent at places outside homes and workplaces. Moreover, due to stay-at-home orders and greater reluctance towards long-distance trips than before, we also consider (ii) changes in travel distances for each income quantile. In addition, since some activity categories were particularly affected by social-distancing policies, we also consider changes in visits to major activity categories and traveled distances for each income quantile, shown in Supplementary Note 4.2 and Supplementary Figure S16. Finally, we also consider the possibility of (iii) microscopic changes in place preferences, including changes in exploration behavior and visitation patterns across place sub-categories.
To disentangle the relative weights of these behavioral changes, we created different counterfactual mobility datasets. For example, to estimate the effects of the reduction in total activity time on the loss of diversity, we created a counterfactual mobility dataset that contains the same total visit duration at places as during the pandemic (e.g., April 2020), by randomly down-sampling visits from pre-pandemic data observed in the same month (e.g., April 2019) (see Methods and Supplementary Note 4.1). The resulting counterfactual data retains the behavioral mobility patterns observed in 2019, but includes the effects of reduced activity during the pandemic. By comparing the place-based and individual-based diversity measures computed from the actual and the counterfactual mobility datasets, we are able to delineate the effects of activity reduction on the decrease in diversity. Similarly, to measure the effects of (ii) changes in traveled distances by income quantile, we extended the previous counterfactual to have the same total visit duration by distance range for each income quantile (see Methods and Supplementary Note 4).

Figure 2:
Behavioral changes worsened income diversity in cities. A) Three hierarchical levels of behavioral changes were simulated to understand why experienced income diversity decreased: (i) reduction in total outside activity by income groups, (ii) changes in traveled distances by income groups, and (iii) microscopic changes in place preferences, including exploration behavior and place sub-categories. B) Decrease in diversity of encounters for places and individuals decomposed into the three behavioral factors for Boston. Counterfactual simulations show that reduction in total activities (i) in the short term, and mostly changes in exploration and place preferences (iii) in the long term, were the major factors that decreased diversity. C) Social exploration decreased during the pandemic compared to 2019 trends in all four cities. D) POI subcategories which were more (and less) visited in different periods during the pandemic.

Figure 2B shows the decreased diversity experienced at places and by individuals decomposed into the three behavioral factors (full results shown in Supplementary Figure S17). The counterfactual simulations show that (i) reduction in total activities caused around 50% of the decrease in diversity during the first pandemic wave, but this share decreases to almost 2% by late 2021, when mobility metrics had recovered to normal, as shown in Figure 1B. Although we observe different rates of dwell time decrease and recovery across income quantiles, where richer populations reduce dwell times at places disproportionately more than poorer populations (Supplementary Figure S14b), [...] Heterogeneity in activity reduction rates across income quantiles and changes in traveled distances explain around 55% of the decreased diversity during the first wave of the pandemic; the remaining 45% is due to more microscopic, place-based preference changes. These effects become the single dominant factor in the later stages of the pandemic.
To identify the changes in mobility behavior during the pandemic, we fit the social exploration and preferential return (Social-EPR) model [16,32] to the data for each time period and assess the model parameters (see Supplementary Note 4.3). Among the parameters of the Social-EPR model, the one that changed the most between before and during the pandemic was the social exploration parameter $\sigma_s$, as shown in Figure 2C and Supplementary Figure S18. Social exploration $\sigma_s$ measures the probability that an individual, when they decide to explore a new place, visits a place where their own income quantile is not the majority group. During the pandemic, people's willingness to socially explore substantially decreased compared to 2019 levels (horizontal dashed line) in all four cities, leading to less experienced diversity.
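To make the definition of social exploration concrete, the sketch below computes a simple empirical share rather than fitting the Social-EPR model itself. It assumes that the first observed visit of a user to a place within the analysis window counts as an exploration, which is a simplification, and all names (`social_exploration_share`, `tau_place`, `user_quantile`) are hypothetical.

```python
def majority_quantile(shares):
    """Index of the income quantile with the largest share of time at a place."""
    return max(range(len(shares)), key=lambda q: shares[q])


def social_exploration_share(visits, user_quantile, tau_place):
    """Share of exploration visits made to places where the visitor's own
    income quantile is not the majority group.

    visits: chronologically ordered iterable of (user, place) pairs.
    user_quantile: dict user -> income quantile index (0..3).
    tau_place: dict place -> list of time shares by income quantile.
    Assumption: the first observed visit of a user to a place is treated
    as an exploration; later visits to the same place are returns.
    """
    seen = set()
    explorations = 0
    social = 0
    for user, place in visits:
        if (user, place) in seen:
            continue  # return visit, not an exploration
        seen.add((user, place))
        explorations += 1
        if majority_quantile(tau_place[place]) != user_quantile[user]:
            social += 1
    return social / explorations if explorations else float("nan")


# Hypothetical toy data, for illustration only.
toy_tau = {"a": [0.7, 0.1, 0.1, 0.1], "b": [0.1, 0.7, 0.1, 0.1]}
toy_quantile = {"u1": 0, "u2": 1}
toy_visits = [("u1", "a"), ("u1", "a"), ("u1", "b"), ("u2", "b")]
toy_share = social_exploration_share(toy_visits, toy_quantile, toy_tau)
```

In this toy example only one of the three exploration visits (user `u1` visiting place `b`) is to a place dominated by another income group.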
Furthermore, we observe changes in place-level preferences across POI sub-categories. Sub-category popularity $f_k$ is measured by computing the probability that a POI sub-category is included in an individual's top $k$ most frequently visited places. Figure 2D and Supplementary Figure S19 show the POI sub-categories which were more (and less) visited in different periods during the pandemic compared to 2019 levels. Hardware stores, big box stores, and grocery stores were POI sub-categories which gained popularity during the pandemic, while gyms, movie theaters, and American food places were sub-categories which were visited less frequently. Taken together with the result that controlling for major activity categories did not explain additional decreased diversity beyond scenario (ii), as shown in Supplementary Note 4.2 and Supplementary Figure S13, this shows that people have not changed the proportion of time they spend on major activity categories, but have changed which specific types of places they visit within each major activity (e.g., less time at American restaurants, but more time at fast food and donut stores). To summarize, not only the reduction in activity, but also microscopic behavioral changes, especially during the later stages of the pandemic, including less exploration and a shift in preferences, led to decreased diversity in urban encounters.
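The sub-category popularity measure described above can be sketched as follows. This is a minimal illustration that interprets $f_k$ as the fraction of individuals whose $k$ most frequently visited places include at least one place of the given sub-category; the function name, the input layout, and the toy data are assumptions.

```python
from collections import Counter


def subcategory_popularity(visits, place_subcategory, k=3):
    """Fraction of users whose top-k most visited places include each sub-category.

    visits: iterable of (user, place) pairs, one per visit.
    place_subcategory: dict place -> sub-category label.
    Returns a dict sub-category -> f_k in (0, 1].
    """
    counts = {}
    for user, place in visits:
        counts.setdefault(user, Counter())[place] += 1
    hits = Counter()
    for visit_counts in counts.values():
        top_places = [place for place, _ in visit_counts.most_common(k)]
        # Count each sub-category at most once per user.
        for sub in {place_subcategory[p] for p in top_places}:
            hits[sub] += 1
    n_users = len(counts)
    return {sub: n / n_users for sub, n in hits.items()}


# Hypothetical toy data, for illustration only.
toy_subcats = {"grocery_a": "Grocery", "grocery_b": "Grocery", "gym_a": "Gym"}
toy_visits = (
    [("u1", "grocery_a")] * 3 + [("u1", "gym_a")]
    + [("u2", "grocery_b")] * 2 + [("u2", "gym_a")]
    + [("u3", "gym_a")] * 2 + [("u3", "grocery_a")]
)
toy_f1 = subcategory_popularity(toy_visits, toy_subcats, k=1)
```

Comparing such values between a pandemic month and the same month of 2019 would indicate which sub-categories gained or lost popularity.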

Spatial and socioeconomic heterogeneity in decreased diversity
Which sociodemographic groups and areas were more affected by the decrease in income diversity? To understand the heterogeneity in decreased diversity, the mean income diversity of all individuals living in each CBG, $D_{CBG}$, was computed for each CBG in the four CBSAs. In Figure 3A (and for the other CBSAs in Supplementary Figure S20), we observe spatial heterogeneity in the changes in diversity in the early stages of the pandemic, but more homogeneity in the long term. The insets also show the magnitude of $\Delta D_{CBG}$ decreasing as cities recover from the pandemic. The correlation between $D_{CBG}$ in April 2020 and $D_{CBG}$ in April 2019 is much smaller ($R^2 = 0.37$) than for October 2021 and October 2019 ($R^2 = 0.71$), indicating the larger heterogeneity in $\Delta D_{CBG}$ during the earlier stages of the pandemic (Supplementary Figure S21).
To understand the spatial and sociodemographic heterogeneity in the decreased diversity of encounters during the pandemic compared to 2019, we model $D_{CBG}$ and its difference $\Delta D_{CBG}$ using a simple regression model (see Methods and Supplementary Note 5). We include variables describing the places visited by the residents of the CBG (in 2019), mobility metrics including the average total traveled distance and radius of gyration (in 2019), and sociodemographic and economic characteristics of the CBG, including its population density, median income, age and race composition, and transportation behavior (e.g., public transportation usage), all of which were standardized (Supplementary Table S3). Regression analysis was conducted for each month, including all four cities. To control for differences between areas across and within the metropolitan areas, we include geographical fixed effects at the level of Public Use Microdata Areas (PUMAs), which typically span around 20 km and contain a residential population of 150 thousand people. Detailed summary statistics (Supplementary Table S3), collinearity and correlations between variables (Supplementary Figure S22), variance inflation factor analysis, and full regression results can be found in Supplementary Note 5. Figure 3B shows the adjusted $R^2$ of regression models for $D_{CBG}$ and $\Delta D_{CBG}$, respectively, across different time periods. The three groups of variables (places visited, geographical mobility, residence and demographics) explain around 60% to 70% of the variance of income diversity ($D_{CBG}$), which agrees with previous findings [16] (Supplementary Tables S4-S6). However, the difference in diversity with respect to 2019 levels ($\Delta D_{CBG}$) has lower explained variance (at most $R^2 = 0.31$), and the explained variance decreases further when there is no pandemic outbreak.
In the long term (October 2021), the regression model has low explained variance ($R^2 = 0.11$), indicating that regions homogeneously became less diverse, irrespective of the sociodemographic or behavioral characteristics of the areas. Figure 3C shows the factors that were most important in explaining the variance of $\Delta D_{CBG}$ in the months where $R^2$ was relatively high (April, May, December 2020 and January 2021) (Supplementary Tables S7-S10). The highlighted regression coefficients suggest that whenever there is an outbreak, areas with higher population density, a higher proportion of working-age population (age 25-64), higher reliance on public transport, and larger movement range (radius of gyration) experience the largest decrease in income diversity of encounters.

Figure 3:
[...] The three groups of variables (places visited, geographical mobility, residence and demographics) explain around 55% to 70% of the variance in income diversity. However, the same variables explain much lower variance of $\Delta D_{CBG}$, indicating that regions became less diverse homogeneously. C) Regression coefficients that explain the heterogeneity in $\Delta D_{CBG}$ for the four different time periods where the $R^2$ was relatively higher. Filled variables are statistically significant at the P < 0.05 threshold.

Trade-off between income diversity of encounters and stringency of policy measures
From a public policy perspective, an important question is how COVID-19 containment measures, including lockdowns, school and workplace closures, and restrictions on public gatherings, have resulted in the loss of diversity in urban encounters. To measure the relationship between the stringency of COVID-19 measures and experienced income diversity, we utilize the COVID-19 Stringency Index [33] (Supplementary Figure S23), which is a composite measure of nine response metrics, including school and workplace closures, restrictions and cancellation of public events and gatherings, and restrictions on movement and travel (see Supplementary Note 6). Figure 4 shows the relationship between the stringency of COVID-19 policies and the decrease in diversity of urban encounters. In all four cities we observe a statistically significant (p < 0.01) and strong negative correlation ($\rho(SI_{CBSA}, \Delta D_{CBSA}) \in [-0.9, -0.73]$). The robust negative correlations suggest a strong trade-off relationship between income diversity and COVID-19 policy and outbreak intensity in all cities. The decrease in diversity becomes pronounced during COVID-19 outbreaks, especially during the first pandemic wave (red plots) in Boston and Seattle, and during the second pandemic wave (orange plots) in Los Angeles, where the number of cases and deaths were substantial in the respective cities.

Figure 4:
Trade-off between decreased income diversity of encounters and stringency of COVID-19 policies. Decrease in income diversity of encounters $\Delta D_{CBSA}$ has strong and significant correlation with the stringency of COVID-19 measures in all four CBSAs, with outliers during the pandemic waves especially in Boston, Seattle (first wave; in red) and Los Angeles (second wave; in orange).
Moreover, for Boston, Seattle, and Los Angeles, even though the Stringency Index had decreased to around 20 in late 2021 (which indicates that less strict policies were already in place), the decrease in income diversity remains positive, suggesting that the COVID-19 pandemic may have had a long-lasting decreasing effect on the income diversity of urban encounters. Regression results using additional exogenous variables, such as the number of COVID-19 cases and deaths at the federal and local (CBSA) levels, are shown in Supplementary Note 6.1 and Supplementary Figure S24. Since $\Delta D_{CBSA}(t)$ is a time series with autocorrelation, we tested ARIMA-type models as well; the regression results, and especially the estimated coefficients, were found to be robust (see Supplementary Note 6.2, Supplementary Tables S13 and S14, and Supplementary Figure S25).
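The correlation between policy stringency and the change in diversity can be illustrated with a short sketch. It assumes that $\rho$ denotes the Pearson correlation coefficient, which the text does not state explicitly, and the monthly values below are hypothetical placeholders rather than data from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical monthly values for one CBSA, for illustration only:
# stringency index (0 to 100) and change in diversity relative to 2019.
stringency = [20.0, 40.0, 60.0, 80.0]
delta_diversity = [-0.08, -0.12, -0.20, -0.30]
rho = pearson(stringency, delta_diversity)
```

With $\Delta D$ defined as diversity minus its 2019 level, stricter policies paired with larger losses of diversity yield a negative $\rho$, matching the sign convention used above.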

Discussion
Cities around the world currently face a wide array of challenges, ranging from combating inequality in wealth and economic opportunities [34] to avoiding catastrophic outcomes caused by climate-change-induced disasters [35]. Improving the inherent social capital of local communities and neighborhood networks, which are the fundamental units of collective decision making and support, is crucial for tackling these complex and global-scale societal challenges. With many cities expanding and urban inhabitants increasing at an unprecedented pace, the importance of promoting diverse encounters has never been higher [36]. Previous literature shows that physical co-location and encounters are significant factors [37] and predictors [38] for real-world friendship formation, accounting for around 30% of new friendship additions [39]. Therefore, a decrease in income diversity over the long term could have substantial cumulative effects on the number and diversity of friendship ties, leading to more income segregation and polarization.
In this study, we make three important contributions towards understanding the dynamics of urban income diversity during and beyond the COVID-19 pandemic. First, we empirically revealed that physical encounters in US cities have indeed become less diverse than pre-pandemic levels even two years after the first case in the US, despite almost full recovery in aggregate mobility statistics (e.g., number of visits per day). Second, we identified key behavioral changes that resulted in lower income diversity of encounters during the pandemic, including the consistent decrease in the exploration of socially diverse places and shifts in visitation preferences. Third, comparative analysis with COVID-19 policies suggested a strong trade-off relationship between COVID-19 policy stringency and income diversity. Thus, although social-distancing policies helped to mitigate the propagation of the epidemic, they also had negative effects on the social fabric of our cities. These insights, which are extremely difficult to quantify using traditional residence-based measures, collectively allow us to understand how and why urban encounters have become less diverse due to the pandemic.
Studies have suggested that while the development of effective vaccines has successfully suppressed the mortality rates of COVID-19, the new behavioral habits and social norms that we have acquired during the pandemic, such as higher rates of work from home and dramatic changes in physical activity, sleep, time use, and mental health [40], could have a long-lasting impact on society [25]. Behavioral changes that were observed in this study, such as less social exploration when visiting new places and changes in place preferences, may also remain for a long period due to persistent fear of infection. Our results suggest that policy interventions on urban mobility, such as the introduction of fare-free transit systems and the development of public spaces, should target and evaluate the recovery of social exploration to potentially improve income diversity after the pandemic. Increasing the quantity and diversity of our social encounters [37] could help communities to acquire social capital, which could improve resilience to natural hazards [6] and foster economic growth [41].
The results of our study should be interpreted in light of its limitations. Regarding the limitations of the mobility data, we are not able to identify the purpose of visits or the types of the encounters, for example, whether it is a co-visitation at a cafe where no conversations take place, or a cocktail party where strangers meet and have a conversation over a common topic. Therefore, the metrics computed in our study should be interpreted as a proxy for all meaningful encounters, and as a bound for urban income diversity. Regarding the study design, we focus on income diversity and not other socioeconomic and demographic dimensions, including racial diversity [42,13]. The methods and approaches may be applied to other sociodemographic data available in the American Community Survey to understand the dynamics of these other types of social diversity experienced in cities.

Mobility data
We utilize an anonymized location dataset of mobile phones and smartphone devices provided by Spectus Inc., a location data intelligence company which collects anonymous, privacy-compliant location data of mobile devices using their software development kit (SDK) technology in mobile applications and ironclad privacy framework. Spectus processes data collected from mobile devices whose owners have actively opted in to share their location, and requires all application partners to disclose their relationship with Spectus, directly or by category, in the privacy policy. With this commitment to privacy, the dataset contains location data for roughly 15 million daily active users in the United States. Through its Social Impact program, Spectus provides mobility insights for academic research and humanitarian initiatives. All data analyzed in this study are aggregated to preserve privacy. The home locations of individual users are estimated at the CBG level using different variables, including the number of days spent in a given location in the last month, the daily average number of hours spent in that location, and the time of the day spent in the location during nighttime. See Supplementary Note 1.1 for more details. The representativeness of this data has been tested and corrected in Supplementary Note 2 using post-stratification techniques.
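As a rough illustration of post-stratification, the sketch below computes per-stratum weights as the ratio of the population share to the sample share. This is the textbook form of the technique, not necessarily the procedure of Supplementary Note 2; the strata (which in the study would be defined by region and income level), the function name, and the counts are hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_counts):
    """Weight for each stratum = population share / sample share.

    sample_strata: list with the stratum label of each sampled user.
    population_counts: dict stratum -> population size (e.g., from the census).
    Returns a dict stratum -> weight for strata present in the sample.
    """
    sample_counts = Counter(sample_strata)
    n_sample = len(sample_strata)
    n_population = sum(population_counts.values())
    weights = {}
    for stratum, count in sample_counts.items():
        population_share = population_counts[stratum] / n_population
        sample_share = count / n_sample
        weights[stratum] = population_share / sample_share
    return weights


# Hypothetical example: low-income users are under-represented in the sample.
toy_sample = ["low"] + ["high"] * 3
toy_population = {"low": 50, "high": 50}
toy_weights = poststratification_weights(toy_sample, toy_population)
```

Under-represented strata receive weights above 1, so that weighted statistics better reflect the population composition.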

Estimation of stays at places
Stops, which are location clusters where individual users stay for a given duration, are estimated using the Sequence Oriented Clustering approach [43]. The stops are attributed to places (or points-of-interest; POIs) by searching for the closest place to each stop within a 100 meter radius. The robustness of the estimated income diversity to this spatial parameter was tested in Supplementary Note 1.2. Stays between 10 minutes and four hours by individuals who were observed more than 300 minutes each day were used for the analysis. The results were shown to be robust against the choice of these temporal parameters in Supplementary Note 1.5. Moreover, the robustness of the results on income diversity against the choice of place dataset was tested using the ReferenceUSA dataset [29] in Supplementary Note 1.3.
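The attribution of stops to places can be sketched as a nearest-neighbor search with a distance cutoff. The example below is a minimal brute-force illustration, not the pipeline used in the study: it assumes great-circle (haversine) distance, treats the 10 minute and 4 hour bounds as inclusive, and uses hypothetical field names. A spatial index would be needed at the scale of the actual dataset.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def attribute_stop(stop, places, max_dist_m=100.0, min_minutes=10, max_minutes=240):
    """Return the id of the closest place within max_dist_m, or None.

    stop: dict with keys "lat", "lon", "minutes".
    places: iterable of dicts with keys "id", "lat", "lon".
    Stops outside the [min_minutes, max_minutes] duration range are discarded.
    """
    if not (min_minutes <= stop["minutes"] <= max_minutes):
        return None
    best_id, best_dist = None, max_dist_m
    for place in places:
        dist = haversine_m(stop["lat"], stop["lon"], place["lat"], place["lon"])
        if dist <= best_dist:
            best_id, best_dist = place["id"], dist
    return best_id


# Hypothetical coordinates, for illustration only.
toy_places = [
    {"id": "a", "lat": 42.3600, "lon": -71.0600},
    {"id": "b", "lat": 42.3605, "lon": -71.0600},
]
near = attribute_stop({"lat": 42.3601, "lon": -71.0600, "minutes": 30}, toy_places)
far = attribute_stop({"lat": 42.3700, "lon": -71.0600, "minutes": 30}, toy_places)
short = attribute_stop({"lat": 42.3601, "lon": -71.0600, "minutes": 5}, toy_places)
```

Here the first stop is attributed to place `a`, while the distant stop and the stop that is too short are left unattributed.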

Income diversity of encounters
To measure the income diversity of encounters experienced at each place $\alpha$ in each city, we compute the proportion of total time spent at place $\alpha$ by each income quantile $q$, $\tau_{q\alpha}$. Income thresholds for the quantiles are chosen based on the income distributions in each city. We checked that the results on income diversity are independent of the choice of the number of income quantiles in Supplementary Note 1.4. We define full diversity of encounters at a place as the case when people from all income quantiles spend the same amount of time, $\tau_{q\alpha} = \frac{1}{4}$ for all $q$. Using the metric used to compute income segregation in urban encounters in previous studies [16], we define the income diversity experienced at each place $\alpha$, $D_\alpha$, as a measure of evenness of time spent by different income quantiles, $D_\alpha = 1 - \frac{2}{3} \sum_q \left| \tau_{q\alpha} - \frac{1}{4} \right|$. The diversity measure is bounded between 0 and 1, where $D_\alpha = 0$ means there is no diversity (the place is visited by people from only one income quantile), and $D_\alpha = 1$ indicates that all income quantiles spent an equal amount of time at the place. Similarly for individuals, given the proportion of time individual $i$ spent at place $\alpha$, $\tau_{i\alpha}$, the individual's relative exposure to income quantile $q$, $\tau_{iq}$, can be computed by $\tau_{iq} = \sum_\alpha \tau_{i\alpha} \tau_{q\alpha}$. Then, the income diversity experienced by individual $i$ can be measured using the same equation used for places, $D_i = 1 - \frac{2}{3} \sum_q \left| \tau_{iq} - \frac{1}{4} \right|$.
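The two diversity measures can be sketched in a few lines of code. This is a minimal illustration of the formulas above with four income quantiles; the stay records and all function names are hypothetical, and the sketch ignores the moving windows, weighting, and filtering used in the study.

```python
from collections import defaultdict

N_QUANTILES = 4  # four income quantiles, as in the text


def evenness(shares):
    """D = 1 - (2/3) * sum_q |share_q - 1/4| for four quantile shares."""
    return 1.0 - (2.0 / 3.0) * sum(abs(s - 1.0 / N_QUANTILES) for s in shares)


def place_shares(stays):
    """Share of total time at each place spent by each income quantile.

    stays: iterable of (user, place, quantile, minutes), quantile in 0..3.
    Returns a dict place -> [tau_q for q in 0..3].
    """
    totals = defaultdict(lambda: [0.0] * N_QUANTILES)
    for _user, place, quantile, minutes in stays:
        totals[place][quantile] += minutes
    return {p: [t / sum(ts) for t in ts] for p, ts in totals.items()}


def individual_diversity(stays, tau_place):
    """D_i based on tau_iq = sum over places of tau_i_alpha * tau_q_alpha."""
    time_at = defaultdict(lambda: defaultdict(float))
    for user, place, _quantile, minutes in stays:
        time_at[user][place] += minutes
    result = {}
    for user, places in time_at.items():
        total = sum(places.values())
        exposure = [0.0] * N_QUANTILES
        for place, minutes in places.items():
            for q in range(N_QUANTILES):
                exposure[q] += (minutes / total) * tau_place[place][q]
        result[user] = evenness(exposure)
    return result


# Hypothetical stays: a cafe shared equally by all quantiles and a club
# visited by a single quantile.
toy_stays = [
    ("u1", "cafe", 0, 30), ("u2", "cafe", 1, 30),
    ("u3", "cafe", 2, 30), ("u4", "cafe", 3, 30),
    ("u1", "club", 0, 60),
]
toy_tau = place_shares(toy_stays)
D_place = {place: evenness(shares) for place, shares in toy_tau.items()}
D_user = individual_diversity(toy_stays, toy_tau)
```

In this toy example the cafe is fully diverse and the club has no diversity, while user `u1`, who splits time between the two, experiences an intermediate level of diversity.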

Counterfactual simulation of mobility
To understand the underlying behavioral changes that contributed to the decrease of income diversity in urban encounters, we design a simulation framework that leverages the pre-pandemic data to create synthetic, counterfactual mobility patterns. The synthetic, counterfactual mobility dataset is designed so that while the fundamental behavioral patterns observed in 2019 are kept consistent, the number of users and stays at different place categories by different income quantiles are reduced to post-pandemic levels. This way, we are able to delineate the effects of different levels of behavioral changes on the total decrease in income diversity.
The following steps are performed to simulate the synthetic mobility datasets. To create the synthetic counterfactual data for year $y$ and month $m$, denoted as $S_{y,m}$, we use the mobility data observed in the same month $m$ of the year 2019 as input data $D_{2019,m}$. For example, to create a synthetic mobility dataset for April 2020, we use the mobility data observed in April 2019. Several different synthetic datasets $S_{y,m}$ are created. Scenario (ii) employs a more granular removal process, where we randomly remove visits to places from $D_{2019,m}$ by income quantile $q$ and traveled distance $d$ (binned into 7 distance ranges) to adjust the amount of dwell time spent at visits to places. We also tested removing visits by income quantile $q$, traveled distance $d$, and place taxonomy $c$; however, the results were similar to scenario (ii), as shown in Supplementary Note 4.2. More details on creating the counterfactual synthetic datasets can be found in Supplementary Note 4. After creating the synthetic counterfactual datasets, we compute the income diversity of encounters and compare it with the income diversity measured using the actual observed data $D_{y,m}$ to delineate the effects of the reduction in active users and visits to place categories on the decrease in income diversity.

Modeling the heterogeneity in income diversity
To further understand how the income diversity of encounters decreased heterogeneously across sociodemographic groups throughout the pandemic, we build simple linear regression models of the form: (1) where $D_{CBG}(t)$ and $\Delta D_{CBG}(t)$ denote the differences in diversity at time $t$ compared to the same month in the year 2019. $\{R_{CBG}\}$ is the set of all residential variables from the census that describe the demographics, transportation, education, race, employment, wealth, etc. of the Census Block Group (CBG). $\{P_{CBG}\}$ is a vector of variables that indicate the places where individuals living in the CBG spent most of their time in 2019, out of the place subcategories which have at least 100 venues. For each individual, we identify the subcategories where the individual stays more than 0.3% of their time and obtain a binary vector of length 564, which is the number of place subcategories. $\{M_{CBG}\}$ is a set of variables that describe the geographical mobility behavior of people living in the corresponding CBG. We use two variables: (i) the radius of gyration of all the places visited by each user, and (ii) the average distance traveled to all places from each individual's home. Details of the regression covariates, including their summary statistics and correlations, are studied in Supplementary Note 5.
To further understand the differences in decreased income diversity across CBSAs, we analyzed the correlation between the stringency of COVID-19 policies and the decrease in diversity. The stringency index $SI_{CBSA}(t)$ is a composite metric that measures the strictness of COVID-19 policies, calculated using data collected in OxCGRT [33], and is provided at the state level for the United States. The stringency index takes into account policies including the closing of schools and universities, closing of workplaces, cancelling of public events and gatherings, closing of public transport, orders to shelter in place, restrictions on internal movement between cities/regions and international travel, and the presence of public information campaigns. More details are provided in the codebook on the GitHub webpage.

Author contributions statement
T.Y. designed the algorithms, performed the analysis, and developed models and simulations. B.G.B.B. and E.M. performed part of the analysis and partially developed models and simulations. A.P., X.D. and E.M. supervised the research. All authors wrote the paper. Company data were processed by T.Y. and partially by B.G.B.B. and E.M. All authors had access to aggregated (non-individual) processed data. All authors reviewed the manuscript.

Data Availability
The data that support the findings of this study are available from Spectus through their Social Impact program, but restrictions apply to the availability of these data, which were used under the licence for the current study and are therefore not publicly available. Information about how to request access to the data and its conditions and limitations can be found in https://spectus.ai/social-impact/.

Code Availability
The analysis was conducted using Python. Code to reproduce the main results in the figures from the aggregated data is publicly available on GitHub at https://github.com/takayabe0505/IncomeDiversity.

Mobility data

Home estimation and stop detection
In this study we utilize an anonymized location dataset of mobile phones and smartphone devices provided by Spectus Inc., a location data intelligence company which collects anonymous, privacy-compliant location data of mobile devices using their software development kit (SDK) technology in mobile applications and its privacy framework. Spectus processes data collected from mobile devices whose owners have actively opted in to share their location, and requires all application partners to disclose their relationship with Spectus, directly or by category, in the privacy policy. With this commitment to privacy, the data set contains location data for roughly 15 million daily active users in the United States. Through Spectus' Data for Good program, Spectus provides mobility insights for academic research and humanitarian initiatives. All data analyzed in this study are aggregated to preserve privacy. Each entry in the data table comprises the anonymized device ID, location coordinates, start time, and dwell time of the stop for the device.
To define the type of location (Home or Work), different variables are used, including the number of days spent in a given location in the last month, the daily average number of hours spent in that location, and the time of day spent in the location (nighttime/daytime). To estimate the home position of a user, the algorithm combines the three variables and creates a score that represents the probability that the position points to the home. The more days and the higher the average number of hours spent in the position, the higher the score. Higher scores are also assigned to the most common places during the night. The location that maximizes this score is defined as the home of the device.
Once the home location is identified, the algorithm looks for the work position. Note that the algorithm requires the work location to be located at least 100 meters from the home location. The same variables used for the detection of the home location are used, but a higher score is given to daytime locations for the work location rather than nighttime locations. Spectus runs the algorithm every week in order to confirm or update the inferred home and work locations as new data are observed. We only consider devices that have been present in Spectus' dataset for at least 15 days. Spectus tightly restricts access to the inferred precise home and work locations of devices. Furthermore, these locations are used as input into various downstream processes to create more privacy-protected versions of Spectus datasets. For example, we only expose home and work datasets in Spectus Workbench associated with standard Census Block Groups, created by the U.S. Census Bureau, rather than the precise locations. This offers a good balance between utility and privacy: according to the U.S. Census Bureau, there are between 600 and 3000 people living in each block group. Each block group is an aggregate of contiguous U.S. blocks sharing similar socio-demographic characteristics. The representativeness of this data has been tested and corrected in Section 2 in the Supplementary Material. The stops, which are location clusters where individual users stay for a given duration, are estimated using the Sequence Oriented Clustering approach [18].

Robustness to threshold distance for attribution of stays to places
To measure the diversity of physical encounters in urban environments, we attribute the stops of individual users to specific places in the city. To study the stops at different places, we use stops that are longer than 10 minutes but shorter than four hours. In our study, we use location data of places (see Table S1). To attribute a stop to a place, we simply attribute each stop to the closest place in our dataset. To avoid attributing a stop to a place far away, we attribute the stop to a place within $d_{max} = 100$ meters from the observed location of the stop. If the stop is further away than 100 meters from any place in the dataset, the stop is discarded from our dataset and not used for computing the diversity of encounters. The robustness of our results on the diversity of encounters has been tested using different spatial

Figure S1: Sensitivity of place-based diversity of encounters with respect to spatial parameters for the visit attribution algorithm.
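The attribution rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dictionary field names (`lat`, `lon`, `dwell_minutes`, `id`), the haversine distance, and the brute-force nearest-neighbor search are all assumptions made for the example. Only the 10-minute and four-hour dwell bounds and the 100-meter cutoff come from the text.

```python
import math

EARTH_RADIUS_M = 6_371_000.0

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def attribute_stop(stop, places, d_max=100.0, min_minutes=10.0, max_minutes=240.0):
    """Return the id of the closest place within d_max meters, or None if the stop is discarded."""
    # Keep only stops longer than 10 minutes and shorter than four hours.
    if not (min_minutes < stop["dwell_minutes"] < max_minutes):
        return None
    best_id, best_dist = None, float("inf")
    for place in places:
        dist = haversine_m(stop["lat"], stop["lon"], place["lat"], place["lon"])
        if dist < best_dist:
            best_id, best_dist = place["id"], dist
    # Discard the stop if even the closest place is further away than d_max.
    return best_id if best_dist <= d_max else None

# Hypothetical places and stops, for illustration only.
places = [
    {"id": "cafe", "lat": 42.3601, "lon": -71.0589},
    {"id": "museum", "lat": 42.3650, "lon": -71.0589},
]
near_stop = {"lat": 42.36015, "lon": -71.0589, "dwell_minutes": 30}
far_stop = {"lat": 42.3700, "lon": -71.0589, "dwell_minutes": 30}
short_stop = {"lat": 42.3601, "lon": -71.0589, "dwell_minutes": 5}
```

A spatial index would replace the linear scan at the scale of the real dataset; the scan is used here only to keep the rule easy to read.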

Robustness against choice of POI dataset
Although we may assume that our dataset of places (name, location coordinates, business category) collected via the Foursquare API is relatively comprehensive, there could be places that are missing from the dataset, which could affect our results on income diversity. To check whether the findings of our study are independent of the selection of the dataset of places, we used the "ReferenceUSA Business Historical Data", which is a record of companies across the US. The dataset is created annually from Infogroup's U.S. Business Database, and a snapshot of the data is saved each December (we used the 2020 version). The data, similar to the Foursquare data, contains the company name, mailing address, SIC and NAICS codes, employee size, sales volume, latitude/longitude, and other variables about each company [6]. In Boston, there were 12,641 food and restaurant places (NAICS code starts with 722) and 3,886 grocery stores (NAICS code starts with 445) in the ReferenceUSA dataset, compared to the 14,791 and 2,017 places in the Foursquare data, respectively. The income diversity experienced at food, restaurant, and grocery places was calculated using the two different datasets for several time periods (April and October in 2019, 2020, 2021). Figure S2 shows the mean ± standard errors of income diversity of encounters at places. Despite the differences in the number of places and the minor differences in category labels between the Foursquare and ReferenceUSA datasets, similar levels of decrease in income diversity are observed between the two datasets, suggesting that the results we obtain are robust against the choice of place datasets.

Robustness against definition of income quantiles
To estimate the socioeconomic status of each individual, we use the median income of the census block group (CBG) in which their estimated home is located as a proxy for their income. Individuals in our dataset are then grouped into four equal-size quantiles of economic status within each city. The diversity of encounters at places and for individuals is hereon calculated using these assigned quantile values. For Boston, the median income thresholds for the quantile classification are: [$0, $59K] for quantile 1 (low income), [$59K, $84K] for quantile 2 (medium-low income), [$84K, $108K] for quantile 3 (medium-high income), and [$108K, $250K] for quantile 4 (high income). The four income quantile ranges for the four cities are shown in Figure S3a. While Boston has the highest quantile thresholds, Los Angeles and Dallas have slightly lower income quantile ranges.
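As an illustration of the grouping step, the sketch below assigns users to four equal-size quantiles by ranking them on the median income of their home CBG. The function name, the rank-based split, and the tie-breaking by user id are assumptions made for the example; in particular, users from the same CBG share one income value, and a threshold-based split (as implied by the income ranges quoted above) would keep them in the same quantile, whereas this rank-based sketch may not.

```python
def assign_income_quantiles(cbg_income_by_user, n=4):
    """Map each user id to a quantile label 1..n using equal-size groups by income rank."""
    # Sort by income, breaking ties by user id so the result is deterministic.
    ranked = sorted(cbg_income_by_user, key=lambda u: (cbg_income_by_user[u], u))
    total = len(ranked)
    return {user: (rank * n) // total + 1 for rank, user in enumerate(ranked)}

# Hypothetical users with the median income of their home CBG (USD).
incomes = {
    "u1": 42_000, "u2": 61_000, "u3": 75_000, "u4": 90_000,
    "u5": 101_000, "u6": 120_000, "u7": 55_000, "u8": 180_000,
}
quantiles = assign_income_quantiles(incomes)
```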
Since our estimation of income diversity is conducted by grouping the encountered individuals into income quantile groups and measuring the unevenness of the group sizes, it is important to check whether our income diversity estimates are affected by the number of income quantile groups we use. To check the robustness of our income diversity measures against the selection of the number of income quantiles $n$, we compute the place-based and individual-based diversity measures when using different numbers of income quantiles ($n = 2, 3, 4, 5, 6$). The income diversity metric under a given $n$ is computed as follows: $D = 1 - \frac{n}{2(n-1)} \sum_{q=1}^{n} |\tau_q - \frac{1}{n}|$, where $n$ is the number of quantiles used for income quantile classification. For Boston's case, the income ranges of quantiles under different numbers of quantiles are shown in Figure S3b. Figure S4 shows the estimated decrease in diversity in Boston when using different numbers of income quantiles. We observe that the dynamics of the diversity of encounters experienced both at places and by individuals are consistent across time, showing high agreement with the result obtained using $n = 4$ (green color). Therefore, we conclude that our findings related to the loss of diversity in the short and long term during the pandemic are independent of the choice of the number of quantiles.

Figure S4: Sensitivity of income diversity of encounters with respect to number of income quantiles used. The results on income diversity are robust and independent of the choice of $n$.
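The normalization for a general number of quantiles can be checked numerically. The sketch below assumes the prefactor $n / (2(n-1))$, which is the value that keeps the measure between 0 and 1 and reduces to $2/3$ for $n = 4$; the function name is hypothetical.

```python
def income_diversity(shares):
    """Evenness of time shares across n income quantiles, scaled to lie in [0, 1]."""
    n = len(shares)
    if n < 2:
        raise ValueError("need at least two income quantiles")
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive value")
    tau = [s / total for s in shares]  # normalize to proportions of time
    deviation = sum(abs(t - 1.0 / n) for t in tau)
    # The largest possible deviation is 2(n-1)/n, reached when one quantile holds all the time.
    return 1.0 - n / (2.0 * (n - 1)) * deviation

# Extremes for each n considered in the robustness check.
for n in (2, 3, 4, 5, 6):
    even = income_diversity([1.0] * n)                     # all quantiles equal
    single = income_diversity([1.0] + [0.0] * (n - 1))     # one quantile only
    print(n, round(even, 6), round(single, 6))
```

For every $n$ the even case evaluates to 1 and the single-quantile case to 0 (up to floating-point rounding), which is what makes values comparable across different choices of $n$.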

Robustness against choice of data filtering parameters
Since mobile phone location pings are collected via various smartphone apps at asynchronous timings and frequencies, some users are observed for a long duration during the day while others could be observed for just a very short period of time. Using a group of individuals with very short observation times could skew the results of the income diversity of encounters. Therefore, we limit the group of individual users analyzed in this study to those who are observed for a substantial amount of time each day. In this study, we select users who are observed for more than $t_{min} = 300$ minutes across all visited places (including their homes). Since 300 minutes is an arbitrary temporal threshold, we tested whether the income diversity experienced at places and by individuals is affected by the selection of the $t_{min}$ parameter. There is an obvious trade-off between the number of available users in the dataset and the temporal coverage of the users' mobility patterns, as shown in Figure S5a for the Boston CBSA. Out of all the 175K users in the dataset, 140K users were observed for more than 300 minutes. Figure S5b shows how the income diversity dynamics experienced at places (left panel) and by individuals (right panel) vary when using different $t_{min}$ parameters. The losses in diversity of encounters are amplified for both places and individuals when we employ a stricter threshold for selecting the users, mainly due to the lack of individuals visiting each place, which increases the likelihood of lower diversity. However, the main takeaways of the dynamics in income diversity are consistent: the income diversity in urban encounters has decreased in both the long and short term, both from the places' and individuals' perspectives.
To summarize the mobility data filtering process, we 1) estimate home and stop locations for each individual, 2) attribute the stays to specific places, 3) estimate each individual user's socioeconomic status using census block group-level data, and 4) select users who are observed for more than 300 minutes per day. After pre-processing the mobility datasets for each of the four urban areas, the entire dataset contains a total of 1.16 million unique users and 23.4 million stays across a total of 97K places. Table S2 shows the summary statistics for the four CBSAs.

Data representativeness
The location data used in our study is collected from smartphones via various apps and services. Although a significant portion (85% according to 2021 data) of the US population owns a smartphone, one could question the representativeness of the 1.16 million user samples across geographical regions and income quantiles. Studies have reported a digital divide and smartphone usage gaps across sociodemographic groups in the US [16]. In this section, we test whether the users in our mobility data are representative of the total population, and further employ post-stratification techniques to correct for any potential biases in the sampling rates across places and socioeconomic status and to test whether the results on income diversity dynamics are robust to such uncertainties concerning data representativeness.

Population and income representativeness
The sampling percentage of the mobility data (100% × number of observed mobile phone users divided by the total population from the census data) is around 3% across all metropolitan regions. To test whether the users in the location data are representative of the entire population, we first compare the population detected in our mobility data with the 2019 ACS data for each of the CBGs in the cities. The left panel in Figure S6a shows the comparison between the census population (x-axis) and the number of observed smartphone users (y-axis) on the CBG scale in the month of January 2020 in the Boston CBSA. The correlation is moderately high (ρ = 0.767), showing that despite the use of such small census areas and potential bias in the smartphone usage patterns, we are able to obtain a good representation of the population. This correlation is relatively stable before and during the pandemic at around ρ = 0.75. In Section 2.2 we use post-stratification techniques to correct for any differences in the sample percentages across CBGs and assess whether our estimates of the income diversity of urban encounters are affected by the representativeness of the data.
In addition to the differences in sampling rates across CBGs, differences in representativeness across income quantiles are important for our study. To study the representativeness across income quantiles, we compute the proportion of users in the four income quantiles across time, which is shown in the right panel of Figure S6a. In a completely balanced dataset, each income quantile would represent 25% of the users. However, we can observe that the highest income quantile (Q4) is over-represented in the dataset throughout the three-year period, while the lowest income quantile (Q1) is under-represented. In Section 2.2, we investigate whether this bias in income representativeness affects our estimates of income diversity using post-stratification techniques.

Robustness check via post-stratification
To understand the effects of the varying sampling rates across CBGs and income groups on our estimation of the income diversity of urban encounters experienced at places and by individuals, we apply a post-stratification technique, which was used in a previous study [11]. Post-stratification is a well-known sampling tool [13] and is typically used to study the impact of sampling biases in mobile phone location data [7] or (geolocated) social media data [17] on various downstream tasks and analyses. Following the methods employed in Moro et al. [11], we denote by $w_g$ the expansion factor, which is the ratio of the population of census block group $g$ to the population detected in our mobility data. We then weight the time people from census block group $g$ spend at place $\alpha$ by $\hat{\tau}_{g\alpha} = w_g \tau_{g\alpha}$, where the assumption is that $\tau_{g\alpha}$ is proportional to the number of people visiting the place. Using this method, we increase (decrease) the time spent at places by people coming from census block groups that are under-estimated (over-estimated).
Recomputing the income diversity of urban encounters using the corrected duration of stays $\hat{\tau}_{g\alpha}$, as shown in Figure S6b, we observe that the dynamics of the income diversity decrease are very similar between the raw mobility data and the post-stratified data. These results show the robustness of the insights on income diversity, and that even though the representativeness of mobile phone users is not perfect, the effect on our estimates is very limited.
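The re-weighting step can be illustrated with a small sketch. The function names, the dictionary-based inputs, and the toy numbers are assumptions made for the example; only the definition of the expansion factor $w_g$ and the weighting $\hat{\tau}_{g\alpha} = w_g \tau_{g\alpha}$ come from the text.

```python
def expansion_factors(census_population, observed_users):
    """w_g = census population of block group g / number of users detected in g."""
    return {
        g: census_population[g] / observed_users[g]
        for g in census_population
        if observed_users.get(g, 0) > 0
    }

def weighted_quantile_shares(time_by_cbg, cbg_quantile, weights, n=4):
    """Post-stratified share of time at one place for each income quantile 1..n."""
    totals = [0.0] * n
    for g, minutes in time_by_cbg.items():
        # Scale the observed time of block group g by its expansion factor.
        totals[cbg_quantile[g] - 1] += weights[g] * minutes
    grand_total = sum(totals)
    return [t / grand_total for t in totals]

# Hypothetical example: block group A is sampled at half the rate of block group B.
census = {"A": 1000, "B": 1000}
observed = {"A": 20, "B": 40}
weights = expansion_factors(census, observed)        # A: 50.0, B: 25.0
time_at_place = {"A": 60.0, "B": 120.0}              # observed minutes at one place
cbg_quantile = {"A": 1, "B": 4}
shares = weighted_quantile_shares(time_at_place, cbg_quantile, weights)
```

In this toy case the raw time shares are 1/3 and 2/3, while the post-stratified shares are equal, because the under-sampled block group is scaled up.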

Income diversity at places
To measure the income diversity of encounters experienced at each place $\alpha$ in each city, we compute the proportion of total time spent at place $\alpha$ by each income quantile $q$, $\tau_{q\alpha}$. Income thresholds for the quantiles are chosen based on the income distributions in each city, as described in Section 1.4. We also checked that the results for income diversity are independent of the choice of the number of income quantiles in Section 1.4. We define full diversity of encounters at a place as the case in which people from all income quantiles spend the same amount of time, $\tau_{q\alpha} = \frac{1}{4}$ for all $q$. Using the metric used to compute income segregation in urban encounters in previous studies [11], we define the income diversity experienced at each place $\alpha$, $D_\alpha$, as a measure of the evenness of time spent by different income quantiles: $D_\alpha = 1 - \frac{2}{3} \sum_q |\tau_{q\alpha} - \frac{1}{4}|$. The diversity measure is bounded between 0 and 1, where $D_\alpha = 0$ means there is no diversity (the place is visited by people from only one income quantile), and $D_\alpha = 1$ indicates that all income quantiles spent an equal amount of time at the place. Results in Section 3.3 show that using different popular measures of diversity such as entropy does not affect the results on income diversity of encounters. Figure S7 shows the changes in income diversity at places across four time periods: April 2019 (before the pandemic), April 2020, April 2021, and October 2021 in the four CBSAs. Figure S9 shows the income diversity experienced at different types of places across the four cities, across four time periods. Similar to the results for Boston in Figure 1D in the main manuscript, museums and leisure places had the largest decrease in diversity while health and grocery related places had the smallest decrease in diversity.
This result agrees with the large decrease in visits to places such as museums, food places, and leisure places, as shown in Figure S8, indicating that the decrease in the number of visits per user is correlated with the decrease in income diversity experienced at places. We further investigate how much of the reduction in income diversity is due to the decrease in the number of visits in Supplementary Note 4.
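A minimal sketch of the place-level computation is given below, assuming stays are available as `(place_id, income_quantile, dwell_minutes)` tuples. The tuple layout and function name are hypothetical and chosen only for illustration; the formula is the four-quantile measure $D_\alpha$ defined above.

```python
from collections import defaultdict

def place_diversity(stays):
    """Compute D_alpha per place from (place_id, income_quantile in 1..4, dwell_minutes) tuples."""
    minutes = defaultdict(lambda: [0.0] * 4)
    for place_id, quantile, dwell in stays:
        minutes[place_id][quantile - 1] += dwell
    result = {}
    for place_id, per_quantile in minutes.items():
        total = sum(per_quantile)
        if total <= 0:
            continue  # no recorded dwell time, so diversity is undefined
        tau = [m / total for m in per_quantile]  # tau_{q,alpha}
        result[place_id] = 1.0 - (2.0 / 3.0) * sum(abs(t - 0.25) for t in tau)
    return result

# Hypothetical stays: "park" is visited evenly, "club" only by quantile 4.
stays = [
    ("park", 1, 30.0), ("park", 2, 30.0), ("park", 3, 30.0), ("park", 4, 30.0),
    ("club", 4, 90.0), ("club", 4, 30.0),
]
diversity = place_diversity(stays)
```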

Income diversity experienced by individuals
In addition to the income diversity experienced at places, we are interested in measuring the income diversity that each individual experiences across all places they visit. Given the proportion of time individual $i$ spent at place $\alpha$, $\tau_{i\alpha}$, the individual's relative exposure to income quantile $q$, $\tau_{iq}$, can be computed by: $\tau_{iq} = \sum_\alpha \tau_{i\alpha} \tau_{q\alpha}$. Then, the income diversity experienced by individual $i$ can be measured using the same equation used for places: $D_i = 1 - \frac{2}{3} \sum_q |\tau_{iq} - \frac{1}{4}|$. Note that the exposure to income quantiles is calculated in a probabilistic manner across a two-month time horizon to overcome the sparsity in actual encounters observed in the mobility data. Figure S10 shows the average income diversity at places and experienced by individuals for the four CBSAs. Los Angeles has the lowest income diversity both at places and by individuals out of the four cities. Different cities, which are located in different states, were subject to COVID-19 lockdown policies of different levels of strictness. We investigate the regional differences from this perspective in Figure 4 in the main manuscript and in Section 6 in the Supplementary Material. All monthly time series data, including the mean place diversity and individual diversity data, are de-seasonalized by removing the monthly fluctuations (simply the deviations from the annual mean) observed in 2019. Most of the results in the main manuscript are shown as percentage differences, which are computed relative to the income diversity of encounters observed in the same month as $t$ in 2019, before the pandemic.
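The individual-level measure can be sketched in the same way as the place-level one. The sketch below assumes that the individual's time shares and the per-place quantile shares are already available as dictionaries; these data structures and the function name are illustrative assumptions, while the exposure $\tau_{iq} = \sum_\alpha \tau_{i\alpha} \tau_{q\alpha}$ and the four-quantile diversity formula follow the definitions in the text.

```python
def individual_diversity(tau_i_alpha, tau_q_alpha):
    """Return (D_i, exposure) for one individual.

    tau_i_alpha: {place: share of the individual's time spent at that place}
    tau_q_alpha: {place: [share of time at that place by quantiles 1..4]}
    """
    exposure = [0.0] * 4
    for place, share in tau_i_alpha.items():
        for q, tau_q in enumerate(tau_q_alpha[place]):
            exposure[q] += share * tau_q  # tau_iq = sum over places of tau_ia * tau_qa
    d_i = 1.0 - (2.0 / 3.0) * sum(abs(e - 0.25) for e in exposure)
    return d_i, exposure

# Hypothetical individual splitting time between two fully segregated places.
tau_q_alpha = {"A": [1.0, 0.0, 0.0, 0.0], "B": [0.0, 0.0, 0.0, 1.0]}
tau_i_alpha = {"A": 0.5, "B": 0.5}
d_i, exposure = individual_diversity(tau_i_alpha, tau_q_alpha)
```

The example shows that an individual can experience some diversity even when every place they visit has none: the exposure is split between quantiles 1 and 4, giving $D_i = 1/3$.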

Other measure of diversity: entropy
The metric for diversity used in our study captures the (un)evenness of exposure between different income quantile groups and was adopted in previous studies [11]. Another popular metric used to measure the (un)evenness of the distribution across groups is the entropy metric, which has been used in previous studies related to the diversity of communication networks across cities [2]. In our scenario, the entropy of the physical encounters at each place is computed from the time shares $\tau_{q\alpha}$. The left panel in Figure S11 shows the histogram of the diversity (used in our study) and entropy of the encounters taking place at places. The histograms show how the entropy metric is heavily skewed to high values between 0.8 and 1.0, whereas the diversity metric has relatively larger variability, spanning from 0 to 1. Despite these different characteristics, the two metrics are strongly related: the right panel in Figure S11 plots the correlation between the diversity (x-axis) and entropy (y-axis) metrics. The Pearson correlation between these two metrics is very high (ρ = 0.971), indicating that these two different metrics are both able to capture the income diversity of encounters.
Indeed, when using the entropy metric to measure the changes in diversity of encounters experienced at places and by individuals, we obtain similar results to when we use the diversity metric. Figure S12 shows that, similar to Figure 1C in the main manuscript, we observe a decrease in income diversity of encounters during the first and second waves (April 2020 and December 2020). Moreover, the long-term decrease in diversity in late 2021 is consistent with the results using the diversity metric. Because of the consistency in the key insights between the two metrics, both metrics are suitable for measuring the income diversity in encounters. Given its wider variability in the range between 0 and 1, we employ the diversity metric as our main metric for measuring income diversity.
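To make the comparison between the two metrics concrete, the sketch below computes both for one set of time shares. The entropy is assumed here to be the Shannon entropy divided by $\log n$, so that it lies between 0 and 1; this normalization is an assumption consistent with the value range reported above, not a formula taken from the text. Function names and the example shares are hypothetical.

```python
import math

def normalized_entropy(tau):
    """Shannon entropy of time shares, divided by log(n) so that it lies in [0, 1]."""
    n = len(tau)
    return -sum(t * math.log(t) for t in tau if t > 0) / math.log(n)

def evenness_diversity(tau):
    """Diversity metric used in the study, written for four income quantiles."""
    return 1.0 - (2.0 / 3.0) * sum(abs(t - 0.25) for t in tau)

# Hypothetical moderately uneven place.
tau = [0.4, 0.3, 0.2, 0.1]
entropy_value = normalized_entropy(tau)
diversity_value = evenness_diversity(tau)
print(round(entropy_value, 3), round(diversity_value, 3))
```

For these shares the entropy is noticeably closer to 1 than the diversity value, which illustrates why the entropy histogram concentrates at high values while the diversity metric spreads over a wider range.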

Counterfactual simulations
To understand the underlying behavioral changes that contributed to the decrease of income diversity in urban encounters, we design a simulation framework that leverages the pre-pandemic data to create synthetic, counterfactual mobility patterns. The synthetic, counterfactual mobility dataset is designed so that while the fundamental behavioral patterns observed in 2019 are kept consistent, the number of visits to different place categories, in different distance ranges, by different income quantiles are reduced to post-pandemic levels. This way, we are able to delineate the effects of different levels of behavioral changes on the total decrease in income diversity.

Synthetic data generation procedure
The following steps are performed to simulate the synthetic mobility dataset. To create the synthetic counterfactual data for year $y$ and month $m$, denoted as $S_{y,m}$, we use the mobility data observed in the same month $m$ of the year 2019 as input data $D_{2019,m}$. For example, to create a synthetic mobility dataset for April 2020, we use the mobility data observed in April 2019. Three different synthetic datasets are generated.

$S^{(ii-3)}_{y,m}$: Randomly remove visits from $D_{2019,m}$ by income quantile $q$, place taxonomy $c$, and traveled distance $d$ to adjust the total dwell time spent at places.
- Visits are randomly retained at rate $r(y, m, q, d, c) = \min\left(1, \frac{\sum_{i \in D_{y,m}(q,d,c)} \tau_i}{\sum_{i \in D_{2019,m}(q,d,c)} \tau_i}\right)$, where $\sum_{i \in x(q,d,c)} \tau_i$ is the total amount of dwell time spent in dataset $x$ by users from income quantile $q$, at places in major taxonomy $c$, within distance $d$ from the user's home location. Similar to the previous counterfactual, $d$ was binned into the same 7 distance ranges to obtain rates for each category. The 10 taxonomies shown in Table S1 are used.

As a result, we obtain the synthetic datasets for each scenario, including $S^{(ii-3)}_{y,m}$. Using the datasets generated from the observed changes in aggregate behavior metrics, we compute the income diversity of encounters and compare it with the income diversity measured using the actual observed data $D_{y,m}$. Figure S13 shows the percentage changes in income diversity at places $\Delta D_\alpha$ and by individuals $\Delta D_i$ computed using the counterfactual datasets.

(a) Retain rate using total dwell time.

To summarize the findings, counterfactual simulations show that:
1. Using different retain rates across income quantiles has no effect on income diversity measures (no difference between the corresponding $S_{y,m}$ datasets), and;
3. Using different retain rates across place taxonomies (major categories) has no effect on income diversity measures (no difference between the corresponding $S_{y,m}$ datasets), which will be further investigated in the following sections.
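The retain-rate thinning described in this procedure can be sketched as follows. The visit records, their field names (`q`, `d_bin`, `c`, `dwell`), the use of Python's `random` module, and the fixed seed are assumptions for illustration; the retain rate itself follows the $\min(1, \cdot)$ ratio of total dwell times per $(q, d, c)$ stratum given above.

```python
import random
from collections import defaultdict

def retain_rates(visits_2019, visits_observed):
    """r(q, d, c) = min(1, observed dwell time / 2019 dwell time) for each stratum."""
    def totals(visits):
        out = defaultdict(float)
        for v in visits:
            out[(v["q"], v["d_bin"], v["c"])] += v["dwell"]
        return out
    base, obs = totals(visits_2019), totals(visits_observed)
    return {key: min(1.0, obs.get(key, 0.0) / base[key]) for key in base if base[key] > 0}

def thin_visits(visits_2019, rates, seed=0):
    """Keep each 2019 visit independently with the retain rate of its stratum."""
    rng = random.Random(seed)
    return [
        v for v in visits_2019
        if rng.random() < rates.get((v["q"], v["d_bin"], v["c"]), 0.0)
    ]

# Hypothetical visits: quantile q, distance bin index d_bin, taxonomy c, dwell minutes.
visits_2019 = (
    [{"q": 1, "d_bin": 0, "c": "grocery", "dwell": 50.0} for _ in range(2)]
    + [{"q": 4, "d_bin": 6, "c": "arts", "dwell": 50.0} for _ in range(2)]
    + [{"q": 2, "d_bin": 3, "c": "food", "dwell": 100.0}]
)
visits_2020 = [
    {"q": 1, "d_bin": 0, "c": "grocery", "dwell": 150.0},
    {"q": 4, "d_bin": 6, "c": "arts", "dwell": 50.0},
]
rates = retain_rates(visits_2019, visits_2020)
synthetic = thin_visits(visits_2019, rates, seed=42)
```

Because the rate is capped at 1, strata whose dwell time grew relative to 2019 keep all of their 2019 visits rather than gaining new ones, so the synthetic dataset is always a subset of the 2019 data.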

Analysis of the impacts of removal rates under different scenarios

Effects of different retain rates across income quantiles
To understand why using different retain rates across income quantiles (as shown in Figure S14b) yields no difference in diversity decrease, we plot the histograms of $\tau_{q\alpha}$ of each place $\alpha$ for each income quantile $q$ and in aggregate, in Figures S15a and S15b, respectively, for the two counterfactual scenarios (i) and (ii-1). We observe that, in agreement with Figure S14b, during the pandemic $\tau_{q_1}$ and $\tau_{q_2}$ increased and $\tau_{q_4}$ decreased, because lower-income populations visited places disproportionately more than higher-income populations. However, when we aggregate and plot the $\tau_q$ values for all $q \in \{q_1, q_2, q_3, q_4\}$, there is no significant difference across the two counterfactual scenarios, consequently yielding similar values of diversity, since the diversity measure does not differentiate whether $q_1$ or $q_4$ had disproportionate dwell time spent at places.

Effects of different retain rates across distance distributions
The retain rates across distance ranges shown in Figure S14c show that during most periods of the pandemic, shorter distance trips (e.g., [0, 1km), [1km, 3km)) have higher retain rates compared to longer distance trips (e.g., [20km, ∞)), indicating that people preferred shorter trips to longer ones during the pandemic. As shown in previous studies, longer distance trips tend to result in higher diversity, whereas shorter distance trips are less diverse due to stronger effects of residential segregation [11]. Indeed, when we compare results (ii-1) and (ii-2) in Figure S13, especially $\Delta D_i$, scenario (ii-2) has lower diversity during periods when $r(y, m, d)$ for shorter distances is higher than for longer distances (i.e., June to September 2020 and January to June 2021). On the other hand, scenario (ii-2) has higher diversity during periods when $r(y, m, d)$ for shorter distances is lower than for longer distances (i.e., September to December 2020). These observations show that changes in distance distributions do play a role in the income diversity of urban encounters, despite the small magnitude of the effects as shown in Figure S13.
(b) Histograms of $\tau_{q \in \{q_1, q_2, q_3, q_4\}}$ for counterfactual scenarios (i) and (ii-1).
Figure S15: Differences in the distributions of $\tau_q$ between counterfactual scenarios (i) and (ii-1) are significant for each income quantile, but are nearly identical when aggregated across all income quantiles, yielding similar income diversity measures.
(a) Retain rates for different place taxonomies (major categories).
(b) Average baseline diversity metric for each place taxonomy.
(c) Average diversity weighted by category-level retain rates.
Figure S16: Heterogeneous retain rates across place taxonomies (major categories) suggest that income diversity measures would be affected by adding place taxonomies as a constraint for creating counterfactual datasets in (ii-3). However, the effects are close to zero, since place categories which have substantially different retain rates (i.e., grocery and arts/museums) have average-level diversity measures.

Effects of different retain rates across place taxonomies (major categories)
The retain rates across place taxonomies (major categories) shown in Figure S16a indicate that different major categories had varying rates during the pandemic. While most categories follow similar patterns as the overall average retain rates shown in Figure S14a, places such as grocery stores had significantly higher (almost full) retain rates, indicating that dwell times at grocery stores had very small decreases. On the other hand, arts and museums had the largest decrease in retain rates. These heterogeneous rates suggest that using different retain rates across place categories when producing the counterfactual dataset (ii-3) could significantly affect the income diversity of $S_{y,m}$. However, as shown in Figure S13, this additional constraint of controlling by place taxonomies yields negligible effects. We test this by computing the average diversity weighted by category-level retain rates across time. More specifically, we take the 2019-level diversity of each place taxonomy, $D_{c,2019}$, and re-weight it by the time-varying retain rates $r(y, m, c)$. The results in Figure S16c show an almost flat trend across time, indicating that the heterogeneity in the time-varying retain rates has no effect on the overall income diversity. This can be explained by looking at the place taxonomies that had the largest deviations in retain rates: grocery stores and arts/museum places had close to the average diversity measures, as shown in Figure S16b.
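The re-weighting check can be illustrated with a toy calculation. The exact weighting scheme is not specified in the text, so the sketch below assumes a simple retain-rate-weighted mean of the 2019 category-level diversities; the function name and all numbers are invented for illustration and are not taken from the data.

```python
def retain_weighted_diversity(baseline_diversity, retain_rate):
    """Average of 2019 category-level diversity D_{c,2019}, weighted by r(y, m, c)."""
    total_weight = sum(retain_rate[c] for c in baseline_diversity)
    weighted_sum = sum(baseline_diversity[c] * retain_rate[c] for c in baseline_diversity)
    return weighted_sum / total_weight

# Invented values: grocery and arts have the most extreme retain rates
# but sit at the average level of baseline diversity.
baseline = {"grocery": 0.60, "arts": 0.60, "food": 0.70, "office": 0.50}
rates_lockdown = {"grocery": 0.95, "arts": 0.20, "food": 0.50, "office": 0.50}
rates_uniform = {c: 1.0 for c in baseline}

weighted_lockdown = retain_weighted_diversity(baseline, rates_lockdown)
weighted_uniform = retain_weighted_diversity(baseline, rates_uniform)
```

In this toy case both weighted averages are approximately 0.6, mirroring the argument above: strongly heterogeneous retain rates leave the average unchanged when the categories with extreme rates have near-average diversity.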

Summary of counterfactual simulation results
Since the effects of heterogeneous retain rates across place taxonomies were insignificant, results for the counterfactual diversity decrease using the $S^{(ii-1)}_{y,m}$ and $S^{(ii-3)}_{y,m}$ datasets were omitted from Figure 2B in the main manuscript. Figure S13 shows the comparison of counterfactual scenarios for Boston where the visits are reduced based on (i) total activity time ($S_{y,m}$), and (iii) actual income diversity. Scenario (ii-1) was employed in the main manuscript since there was little difference between scenarios (ii-1) and (ii-2). For all cities, the decrease in income diversity when we consider the reduction in users and visits by quantile ($S^{(i)}_{y,m}$) accounts for around 50% of the reduction in diversity in the early stages of the pandemic. The marginal decrease in diversity due to the reduction in visits based on place categories and travel distances ($S^{(ii)}_{y,m}$) is relatively small compared to that due to the reduction in visits. However, as shown in Figures S13 and S17, these reductions in active users and visits to different categories do not account for all of the reduction in income diversity, indicating that more microscopic changes in human behavior contributed to a further decrease in income diversity in cities during the pandemic. To investigate which behavioral changes contributed to the decrease in income diversity, we look for microscopic, individual-level behavior that changed during the later stages of the pandemic. To do so, we analyze the behavioral parameters of the Social-EPR model (proposed in [11], which extended the EPR model proposed in [15]).

Parameters of the Social-EPR model
The social exploration and preferential return (Social-EPR) model [11,15] characterizes visitation patterns of individuals using two mechanisms: exploration (visiting a new place) or preferential return (visiting an already visited place). The probability of exploration when an individual has already visited $S_T$ places is modeled as $P_{new} = \rho S_T^{-\gamma}$, where $\rho$ and $\gamma$ are model parameters. If an individual decides to explore, they then decide whether to socially explore (visit a new place where their income group is not the majority income quantile group) with probability $\sigma_s$. In the case that the individual decides to return, the individual selects the destination $\alpha$ with probability $\Pi_\alpha \sim \tau_{\alpha,i}$, where $\tau_{\alpha,i}$ is the proportion of time already spent at place $\alpha$ by individual $i$.
To investigate whether any of the fundamental behavioral characteristics changed due to the pandemic, we fitted the Social-EPR model to the observed mobility patterns and estimated the model parameters across different periods of time. The fitted parameters are shown in Figure S18. Surprisingly, we find that the key parameters of the Social-EPR model, including $\rho$, $\gamma$, and the linear relationship between $\Pi_\alpha$ and $\tau_{\alpha,i}$, are mostly consistent across time (with the exception of April and May 2020, due to the initial lockdown). This indicates that the fundamental characteristics of individual mobility, including exploration and preferential return, were consistent during the pandemic when controlling for the number of visits an individual makes. The model parameter with the most significant change during the pandemic was the social exploration parameter $\sigma_s$, as shown in Figure 2D in the main manuscript.
Figure S18: Key parameters of the Social-EPR model, $\rho$, $\gamma$, and $\pi$, are fairly consistent during the pandemic. The social exploration parameter $\sigma_s$ (shown in main manuscript Figure 2D) was the only parameter with significant changes.
From the counterfactual experiment, we found that there is an excess decrease in the diversity of urban encounters even when controlling for the number of visits to different place categories by different income groups and by travelled distance. The Social-EPR model revealed that this decrease was not due to changes in exploration and preferential return behavior, but due to a decrease in social exploration behavior and microscopic changes in where people prefer to visit (sub-category level changes), as shown in Figure S19. Across all four CBSAs, we observe that places such as hardware stores, big box stores, banks, and grocery stores had the highest increase in the proportion of individuals who visited them with a top-10 frequency, while gyms, food places (pizza, fast food), apparel, and movie theaters had the largest decrease.
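The decision rule of the Social-EPR model described in this section can be sketched as a single simulation step. This is a minimal illustration, not the fitting procedure used in the analysis: the function name, the return format, and the parameter values in the usage lines are hypothetical, and the choice of the specific new place is left out.

```python
import random


def social_epr_step(time_spent, rho, gamma, sigma_s, rng):
    """Draw one move of an individual under the Social-EPR rules.

    time_spent maps each already visited place to the time spent there
    (assumed positive). Returns (kind, place), where kind is "explore",
    "social_explore", or "return"; place is None when exploring.
    """
    n_visited = len(time_spent)
    # P_new = rho * S_T^(-gamma); with no visited places the only option is to explore.
    p_new = min(1.0, rho * n_visited ** (-gamma)) if n_visited > 0 else 1.0
    if rng.random() < p_new:
        # Exploration: with probability sigma_s the new place is one where the
        # individual's income group is not the majority (social exploration).
        kind = "social_explore" if rng.random() < sigma_s else "explore"
        return kind, None
    # Preferential return: pick a visited place proportionally to time spent (tau).
    places = list(time_spent)
    weights = [time_spent[p] for p in places]
    return "return", rng.choices(places, weights=weights, k=1)[0]


rng = random.Random(42)
history = {"home": 50.0, "office": 30.0, "grocery": 5.0}
# Illustrative parameter values, not estimates from the data.
print(social_epr_step(history, rho=0.6, gamma=0.2, sigma_s=0.3, rng=rng))
```

A lower $\sigma_s$ in this sketch leaves the rate of exploration unchanged but shifts explored places toward those dominated by the individual's own income group, which is the kind of change the fitted parameters point to.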

Explaining spatial heterogeneity in diversity
To further understand how the income diversity of encounters decreased heterogeneously across sociodemographic groups throughout the pandemic, we build simple linear regression models of the form:
$D_{CBG}(t) \sim \{R_{CBG}\} + \{P_{CBG}\} + \{M_{CBG}\}$ and $\Delta D_{CBG}(t) \sim \{R_{CBG}\} + \{P_{CBG}\} + \{M_{CBG}\}$,
where $D_{CBG}(t)$ denotes the income diversity of a Census Block Group (CBG) at time $t$, $\Delta D_{CBG}(t)$ denotes the difference in diversity at time $t$ compared to the same month in 2019, and:
• $\{R_{CBG}\}$ is the set of all residential variables from the census that describe the demographic, transportation, education, race, employment, wealth, and other characteristics of the Census Block Group. The entire list of these variables can be found in Table S3.
• $\{P_{CBG}\}$ is a vector of variables that indicate the places where individuals living in the CBG spent most of their time in 2019, out of the place subcategories which have at least 100 venues. For each individual, we identify the subcategories where the individual stays more than 0.3% of their time and obtain a binary vector of length 564, which is the number of place subcategories. Then, to obtain $\{P_{CBG}\}$, we simply take the average of the vectors of all individuals who live in the corresponding CBG. The threshold method previously employed in [11] is used for sparse and highly skewed human data [1] to minimize the effect of the noisy and long-tailed distribution of human activities.
• $\{M_{CBG}\}$ is a set of variables that describe the geographical mobility behavior of people living in the corresponding CBG. We use two variables: (i) the radius of gyration of all the places visited by each user, and (ii) the average distance traveled to all places from each individual's home.
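The construction of the place vector $\{P_{CBG}\}$ described in the list above can be sketched as follows. The function names and the toy input are hypothetical; only the 0.3% threshold and the averaging step come from the text, and three subcategories stand in for the 564 used in the analysis.

```python
def place_indicator(time_share, subcategories, threshold=0.003):
    """Binary vector: 1 if the individual spends more than `threshold`
    (0.3%) of their time in the subcategory, else 0."""
    return [1 if time_share.get(c, 0.0) > threshold else 0 for c in subcategories]


def cbg_place_vector(individual_shares, subcategories, threshold=0.003):
    """Average the binary vectors of all individuals living in one CBG."""
    if not individual_shares:
        raise ValueError("CBG has no individuals")
    vectors = [place_indicator(s, subcategories, threshold) for s in individual_shares]
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Toy example with three subcategories and two residents (hypothetical data).
subcategories = ["grocery", "gym", "bank"]
residents = [
    {"grocery": 0.020, "gym": 0.001},   # gym share is below the 0.3% threshold
    {"grocery": 0.010, "bank": 0.005},
]
print(cbg_place_vector(residents, subcategories))  # [1.0, 0.0, 0.5]
```

Each entry of the result is the fraction of residents who spend a non-negligible share of their time in that subcategory, which dampens the influence of rarely visited, long-tail places.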
The summary statistics of the residential variables are shown in Table S3. Variables that are highly correlated with each other, such as '% of people above the age 65', '% of people commuting by car', and '% of population between Grades 9 and 12', were removed from the set of variables. As shown in Figure S22, the correlations among the remaining variables are generally low, with the highest magnitude of correlation at ρ = −0.59 between '% with Bachelors degree or higher' and '% below Grade 9'. We also checked that the variance inflation factors (VIFs) are all between 1 and 5, indicating that there is no significant issue of multicollinearity. Figure S20 shows the differences in $\Delta D_{CBG}$ for different periods during the pandemic, and Figure S21 shows scatter plots of the income diversity in each CBG before the pandemic versus during the pandemic at three time points, in the Boston CBSA.
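As a reminder of what the multicollinearity check computes, the VIF of variable $j$ is $1 / (1 - R_j^2)$, where $R_j^2$ is obtained by regressing variable $j$ on all other variables. The following is a minimal NumPy sketch on synthetic data; it is not the code used for the analysis, and the helper names are hypothetical.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_total = np.sum((y - y.mean()) ** 2)
    return 1.0 - np.sum(residuals ** 2) / ss_total


def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), regressing column j on the remaining columns."""
    X = np.asarray(X, dtype=float)
    vifs = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        vifs.append(1.0 / (1.0 - r_squared(others, X[:, j])))
    return np.array(vifs)


# Synthetic data: three independent variables give VIFs close to 1.
rng = np.random.default_rng(0)
independent = rng.normal(size=(500, 3))
print(variance_inflation_factors(independent))
```

Values near 1 indicate that a variable is barely explained by the others; adding a near-copy of an existing column would push the corresponding VIFs well above the threshold of 5 mentioned in the text.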
To evaluate the relative importance of the three groups of variables, we used the approach proposed by Lindeman, Merenda, and Gold (LMG method) [9]. The LMG method measures the additional $R^2$ when the variable group is added to the model, averaged over all orderings in which the groups can be added. Since we have three groups of variables (A, B, C) with six different permutations, the contribution of variable group A, for example, is:
$LMG(A) = \frac{1}{6}\left[ 2R^2(A) + \left(R^2(A,B) - R^2(B)\right) + \left(R^2(A,C) - R^2(C)\right) + 2\left(R^2(A,B,C) - R^2(B,C)\right) \right]$
Figure 3C in the main manuscript shows the relative importance of the three groups of variables for each month, for $D_{CBG}(t)$ and $\Delta D_{CBG}(t)$. Tables S4 to S6 and Tables S7 to S10 show the full regression results. The results in Tables S4 (pre-pandemic), S5, and S6 (during the pandemic) largely agree with the results in [11], where the residential, mobility, and places variables collectively explain the heterogeneity in diversity well ($R^2 = 0.662$ for October 2021). On the other hand, the differences in diversity $\Delta D_{CBG}$ are less well explained by these variables, where the $R^2$ is at most around 0.3 during the COVID-19 outbreak periods (April and May 2020, and December 2020 and January 2021), as shown in Figure 3C in the main manuscript. This indicates that the decrease in income diversity during the pandemic (especially during the off-peak months) is relatively homogeneous across all sociodemographic segments.
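The LMG decomposition over groups of variables can be sketched by enumerating all orderings and averaging the incremental $R^2$ of each group. This is a minimal NumPy illustration on synthetic data, not the implementation used in the analysis; the function names and the data-generating process are assumptions.

```python
from itertools import permutations

import numpy as np


def r_squared(blocks, y):
    """R^2 of an OLS fit (with intercept) on the concatenated variable blocks."""
    design = np.column_stack([np.ones(len(y))] + blocks)
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)


def lmg_group_importance(groups, y):
    """Average, over all orderings of the groups, the increase in R^2
    obtained when each group is added to the model."""
    names = list(groups)
    orderings = list(permutations(names))
    contribution = {name: 0.0 for name in names}
    for order in orderings:
        included, r2_previous = [], 0.0
        for name in order:
            included.append(groups[name])
            r2_current = r_squared(included, y)
            contribution[name] += r2_current - r2_previous
            r2_previous = r2_current
    return {name: total / len(orderings) for name, total in contribution.items()}


# Synthetic example: y depends strongly on group A, weakly on B, not on C.
rng = np.random.default_rng(0)
n = 200
groups = {
    "A": rng.normal(size=(n, 2)),
    "B": rng.normal(size=(n, 1)),
    "C": rng.normal(size=(n, 1)),
}
y = 2.0 * groups["A"][:, 0] + 0.5 * groups["B"][:, 0] + rng.normal(size=n)
importance = lmg_group_importance(groups, y)
print(importance)
```

By construction the group contributions sum to the $R^2$ of the full model, which is what allows them to be read as shares of explained variance.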

COVID-19 intensity and segregation
An interesting aspect of the COVID-19 pandemic was its asynchronicity in terms of outbreaks (number of cases and deaths) and the strictness of implemented policies. To further understand the differences in decreased income diversity across CBSAs, we build simple linear regression models of the form:
$\Delta D_{CBSA}(t) \sim Cases_{CBSA}(t) + Deaths_{CBSA}(t) + Cases_{US}(t) + Deaths_{US}(t) + SI_{CBSA}(t)$,
where $Cases_{CBSA}(t)$, $Deaths_{CBSA}(t)$, $Cases_{US}(t)$, and $Deaths_{US}(t)$ denote the number of cases and deaths in the corresponding CBSA and the entire USA at time $t$, aggregated monthly. Data on the number of cases and deaths in each CBSA and for the entire USA were collected from the New York Times GitHub page (footnote 5). The data were provided at the county scale for each day, and were aggregated into monthly values for each CBSA. The number of cases and deaths for the four CBSAs are shown in Figure S23.
The Oxford Covid-19 Government Response Tracker (OxCGRT) (footnote 6) collects systematic information on policy measures that governments have taken to tackle COVID-19. The different policy responses have been tracked since 1 January 2020, cover more than 180 countries, and are coded into 23 indicators, such as school closures, travel restrictions, and vaccination policy. These policies are recorded on a scale to reflect the extent of government action, and scores are aggregated into a suite of policy indices. The stringency index $SI_{CBSA}(t)$ is a composite metric that measures the strictness of COVID-19 policies, calculated using data collected in OxCGRT [3], and is provided at the state level for the United States. The specific indicators that the stringency index takes into account are detailed in the codebook on the GitHub webpage (footnote 7). The stringency index for each CBSA is shown in the bottom row of Figure S23. While all cities had high stringency until late 2020, the rollout of vaccines in early 2021 significantly lowered the stringency.

Model estimation results
The regression model results for the effects of COVID-19 intensity on income diversity are shown in Table S11. We observe that the stringency index is significant for all CBSAs with a negative coefficient, which indicates that the stricter the COVID-19 policies, the less diverse urban encounters become. In addition to the stringency index, the numbers of deaths at the CBSA and federal levels are also significant for Boston and Seattle. Both coefficients are negative, which indicates that the higher the monthly number of deaths, the less diverse urban encounters become. To remove insignificant variables from the model, we tested a simpler version of the form: $\Delta D_{CBSA}(t) \sim Deaths_{CBSA}(t) + SI_{CBSA}(t)$. The results are shown in Table S12. The significance of the variables and their directions are consistent with the first version of the model. The constants for Boston and Los Angeles are significantly negative, indicating that in the hypothetical scenario with zero monthly COVID-19 deaths and zero policy stringency, income diversity would still be lower than in 2019. Given that such a scenario, in which COVID-19 cases and deaths as well as social distancing policies are completely eliminated, may be reached in the near future as the coronavirus becomes an endemic disease, this result suggests that there could be a long-lasting effect of the pandemic on the income diversity of urban encounters. Regression results when we use only the stringency index are shown in Figure S24.
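To make the interpretation of the constant concrete, the simplified model can be sketched as an ordinary least squares fit with an intercept. The sketch below uses NumPy on noiseless synthetic data with hypothetical coefficient values; it does not reproduce the estimates in Table S12.

```python
import numpy as np


def fit_ols(y, *covariates):
    """Least squares fit of y on the covariates plus an intercept.
    Returns [intercept, coefficient_1, coefficient_2, ...]."""
    design = np.column_stack([np.ones(len(y)), *covariates])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


# Synthetic monthly series (hypothetical values, for illustration only).
rng = np.random.default_rng(1)
deaths = rng.uniform(0, 2000, size=21)       # monthly deaths in the CBSA
stringency = rng.uniform(30, 80, size=21)    # monthly stringency index
delta_d = -0.05 - 2e-5 * deaths - 1e-3 * stringency

intercept, coef_deaths, coef_stringency = fit_ols(delta_d, deaths, stringency)
print(intercept, coef_deaths, coef_stringency)
```

The intercept is the predicted change in diversity when both covariates are zero, so a significantly negative intercept corresponds to a residual decrease in diversity that neither deaths nor policy stringency accounts for.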

Robustness of results via time series modeling
Since the variables used in this model are temporal data, including the decrease in diversity as well as COVID-19 related data, it is important to check stationarity, autocorrelation, and partial autocorrelation, and if applicable test whether such temporal dependencies affect the outcomes of the results.
To check the stationarity of $\Delta D_{CBSA}(t)$, we conduct the Augmented Dickey-Fuller (ADF) test [12]. Table S13 shows the ADF statistic, the p-value, and whether the time series is determined to be stationary. The results show that, except for Boston, the time series are non-stationary and therefore require differencing. Figure S25 shows the autocorrelation and partial autocorrelation of $\Delta D_{CBSA}(t)$ under no differencing and 1st-order differencing for the four CBSAs. We observe that for the three cities other than Boston (which requires no differencing), 1st-order differencing is enough to obtain no autocorrelation beyond 1 time step.
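The effect of 1st-order differencing on autocorrelation can be illustrated with a small NumPy sketch. It uses a synthetic random walk rather than the actual $\Delta D_{CBSA}(t)$ series, and the `autocorrelation` helper is a hypothetical name; the ADF test itself is not reimplemented here (Statsmodels provides it as `adfuller` in `statsmodels.tsa.stattools`).

```python
import numpy as np


def autocorrelation(series, max_lag):
    """Sample autocorrelation of a series for lags 0..max_lag."""
    x = np.asarray(series, dtype=float)
    centered = x - x.mean()
    denominator = np.dot(centered, centered)
    values = [1.0]
    for lag in range(1, max_lag + 1):
        values.append(np.dot(centered[:-lag], centered[lag:]) / denominator)
    return np.array(values)


# Synthetic non-stationary series: a random walk.
rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.normal(size=500))
differenced = np.diff(random_walk)           # 1st-order differencing

acf_levels = autocorrelation(random_walk, max_lag=3)
acf_diff = autocorrelation(differenced, max_lag=3)
print(acf_levels)
print(acf_diff)
```

For the random walk the autocorrelation stays close to 1 at short lags, while after differencing it drops to roughly zero, which is the pattern that motivates setting d = 1 for a non-stationary series.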
To model the temporal dynamics, we apply an ARIMA(p, d, q) model with covariates (the number of local COVID-19 deaths and the local stringency index). The model parameters p, d, and q of the ARIMA model correspond to the autoregressive term (the lag of the dependent variable), the number of differencing steps needed to obtain a stationary time series, and the lagged forecast error term, respectively. For Boston, the ADF test shows that no differencing is needed, thus d = 0. Under no differencing, ARIMA(1,0,0) and ARIMA(0,0,1) were tested for Boston and only the MA term was significant (shown in the first column of Table S14). Using the ARIMA(0,0,1) model for Boston, both the number of local deaths and the local stringency index were statistically significant with p < 0.01, indicating robustness of the OLS results in Table S12. For the other three cities, since d = 1 was determined using the ADF test, ARIMA(1,1,0), ARIMA(0,1,1), and ARIMA(1,1,1) were modeled and the statistical significance of the autoregressive and moving average terms was tested. For Seattle, as shown in the second column of Table S14, the moving average term showed statistical significance with p < 0.05, and both the number of local deaths and the local stringency index were also statistically significant with p < 0.05, indicating robustness of the OLS results in Table S12. For Los Angeles and Dallas, both the autoregressive and moving average terms were statistically insignificant, indicating that the dependent variable can be modeled using OLS instead of time series models. To summarize, for Boston and Seattle, $\Delta D_{CBSA}$ could be modeled as a moving average process, but the coefficients and significance of the independent variables were consistent with the OLS results in Table S12. For Los Angeles and Dallas, the temporal components were insignificant, therefore the results in Table S12 are robust.

Software
Analysis was conducted using Python, Jupyter Lab, and the following libraries and software:
• NumPy [4] for general computation in Python.
• Statsmodels [14] for statistical modeling and econometric analysis.
• A Python implementation of the R Stargazer multiple regression model creation tool (footnote 8) was used to create the regression tables.