Natural hazards continue to threaten lives and property around the world, killing nearly 12,000 people and causing US$130 billion in damage in 2019 alone (EM-DAT, 2019). Societies have set up a variety of ways to mitigate extreme weather events such as flooding; the most effective involves removing residents from vulnerable areas in advance of the hazard. Evacuations are not without challenges, however: effective departure may be impeded by severe traffic, dwindling city-level resources (e.g., fuel shortages), and uninhabitable shelter conditions (Dow and Cutter, 2002; Wolshon et al., 2005; Wolshon and McArdle, 2009). Understanding the underlying patterns of collective, abrupt displacement is, therefore, crucial for emergency management agencies to execute comprehensive, well-ordered plans (Wang et al., 2013; W. Wang et al., 2019).

There has been an increasing interest in linking social vulnerability to disasters through frameworks such as equity and environmental justice (Adeola and Picou, 2017). Previous studies suggest that minority populations from disadvantaged neighbourhoods are more likely to be affected by disasters (Bolin and Kurtz, 2018). Examinations of Hurricane Katrina, for example, showed that black and poor residents of New Orleans suffered disproportionately severe impacts (Lovekamp, 2008). Further, race and class gaps are present from the preparation to response phases of shocks as well as through the recovery period (Laska and Morrow, 2006; Yabe and Ukkusuri, 2020). Much research has argued that White communities are better prepared for disasters than communities of colour and that White residents return at higher rates than Black or Hispanic residents (Aldrich and Meyer, 2015; Tierney, 1989). Another study using post-disaster survey data of residents affected by 2012 tornadoes shows that households with more advantaged characteristics such as social capital and social ties recover faster from the disaster (Sadri et al., 2018). Despite these critical findings, there is still a lack of large-scale quantitative research on the impact of wealth and racial difference on disaster evacuation and return. To this end, scholars have typically relied on retrospective surveys and interviews (Houston et al., 2015; Reinhardt, 2015) which face challenges such as sample bias, faulty retrospective recall, and low response rates (Fussell et al., 2010; Rufat et al., 2015).

Recently, GPS-enabled devices which generate unprecedented amounts of mobility data have enabled more nuanced studies of behaviour (Alessandretti et al., 2018, 2017; González et al., 2008; Hasan et al., 2013), including in large-scale disasters and extreme events (Bagrow et al., 2011; Bengtsson et al., 2011), allowing the construction of probabilistic models (Song et al., 2014, 2013), and highlighting the importance of social networks (Finch et al., 2010; Metaxa-Kakavouli et al., 2018; Yabe et al., 2019) and socio-economic factors (Yabe and Ukkusuri, 2020). By comparing multiple disasters, certain typical behaviours have been established, including an exponential return rate (Yabe et al., 2020).

Hurricane Harvey made landfall on August 25, 2017, and to date is rivalled only by Hurricane Katrina as the costliest tropical cyclone on record. It was accompanied by wind speeds as high as 130 mph and more than 50 inches of rain, making it the wettest tropical cyclone in history and leaving no part of the city unscathed. Yet despite this hurricane’s magnitude, authorities did not order a mandatory evacuation in Houston, providing a rare example of widespread evacuation that occurred emergently through countless individual decisions without the intervention of the state.

Here, we take Hurricane Harvey as a case study and use anonymized mobility data to identify detailed evacuation behaviour. We then link the observed behaviour with information on wealth and race at the level of the census block group, the smallest spatial unit of reporting of the US Census. This allows us to quantify the extent to which these socioeconomic factors impacted evacuation behaviour. We show how these disparities influenced not only who evacuated, but also where they evacuated to, when they evacuated and how long they remained evacuated. Houston itself is an ideal testbed to study the disparities of interest. The Houston Metropolitan Statistical Area (Greater Houston) spans an area of over 24,460 square kilometres. It has a population of approximately 7 million according to the 2018 Census estimate, the fifth-largest among all MSAs in the US. The centre, the city of Houston (see Fig. S1), has a population of more than 2.3 million and is the largest economic hub of the Southern United States.

We use anonymized data generated by more than 2 million opted-in mobile devices between July 1, 2017 and October 1, 2017 to illuminate the class and race interactions with disaster evacuation and recovery. More details about the data can be found in SI Section 1, Data Description, Figs. S1–S5, and Table S1. We present a multi-scale study of the evacuations induced by Hurricane Harvey, looking at who left, where they went, and how long they were gone. First, we infer residents’ evacuation status and destination using spatial data analysis. We then examine the socio-demographic differences across the Houston metropolitan statistical area and integrate the information with the distributions of spatial and temporal patterns at the individual level. Finally, we use residential home neighbourhood characteristics capturing both race and wealth to illuminate the differences in disaster responses across all affected regions.


Data and preprocessing

We use mobility data provided by Cuebiq, a leading private sector firm that handles human mobility data with anonymized identifiers and guarantees user privacy. Prior studies have used these and similar data sets to understand commuting patterns (F. Wang et al., 2019), accessibility to public resources (Akhavan et al., 2018), neighbourhood connectedness (Wang et al., 2018), and social distancing during epidemics (Gao et al., 2020; Zhang et al., 2020). We confine the geographical bounds to the Houston Metropolitan Statistical Area (MSA) and collect the de-identified mobility data for a 3-month period overlapping with Hurricane Harvey (July 1–September 30, 2017), comprising over 3 billion geographical records from over 2.5 million unique anonymous users who opted in to share their location data anonymously for research purposes. Due to differences in hardware/software among the devices in the dataset, the GPS coordinates therein possess different levels of spatial noise. We adopt a uniform minimum spatial granularity of 50 m by implementing an established stay-point algorithm (Jiang et al., 2016; Li et al., 2008) to extract meaningful stays and filter out transient locations. The algorithm works by processing a de-identified GPS trajectory (a timestamped sequence of latitude/longitude pairs) into a series of geographic regions where the user has a stay duration surpassing a specified threshold. The effect is to compress trajectories, allowing efficient analysis of individual mobility patterns; each anonymized device’s trajectory becomes a series of stay points encoded as tuples of the form (latitude, longitude, start_time, end_time). Throughout this study, we use temporal and spatial thresholds of 5 min and 50 m, respectively. To minimize the impact of tourists and other non-residents, we retain only users who have data points from at least 60 unique days and at least 100 stay points. This filtering leaves roughly 30 million data points corresponding to roughly 150,000 unique, anonymous users. The detailed data specifications are shown in SI Table 1.
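
The stay-point extraction described above can be illustrated with a minimal sketch. It follows the general anchor-point idea of Li et al. (2008) but is not the implementation used in this study; the function names `haversine_m` and `extract_stay_points` and the input layout (a time-ordered list of `(lat, lon, unix_time)` tuples) are assumptions made for illustration.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def extract_stay_points(track, dist_m=50.0, min_stay_s=300.0):
    """Compress a time-ordered track [(lat, lon, t), ...] into stay points
    (lat, lon, start_time, end_time) using 50 m / 5 min thresholds."""
    stays = []
    i, n = 0, len(track)
    while i < n:
        j = i + 1
        # Grow the window while points remain within dist_m of anchor point i.
        while j < n and haversine_m(track[i][0], track[i][1],
                                    track[j][0], track[j][1]) <= dist_m:
            j += 1
        start, end = track[i][2], track[j - 1][2]
        if end - start >= min_stay_s:
            window = track[i:j]
            lat = sum(p[0] for p in window) / len(window)
            lon = sum(p[1] for p in window) / len(window)
            stays.append((lat, lon, start, end))
            i = j  # resume after the stay
        else:
            i += 1  # transient point: move the anchor forward
    return stays
```

Anchoring each window on its first point is one common variant; other variants bound the cluster diameter rather than the distance to the anchor, which would give slightly different stay boundaries.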

Sampling bias correction

A potential challenge with crowdsourced large-scale mobility data is the inherent bias in sampling rates for different populations. To test how this may confound the results that follow, for each block group we compare the number of devices in the dataset with the total population reported in the 2018 American Community Survey (ACS 2018), as detailed in SI Section 1.4 Data Representativeness. We observe a strong correlation between the two datasets, although the distributions do not align perfectly.

To minimize potential bias, we employ two strategies. The first (i) is a weighting procedure, in which we weight each block group according to the ratio of the number of actively reporting devices to the population reported in ACS 2018. By comparing the unweighted and weighted results, we can gauge how far the sampling bias shifts the results. However, when the subgroup of evacuees is small, or when evacuees from a small number of specific neighbourhoods predominate, the sampling bias from those neighbourhoods can appear more prominent than the neighbourhood disparity. Therefore, we also (ii) bootstrap the samples from each block group, using a uniform sampling rate proportional to the population, over 100 iterations.
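
The two strategies can be sketched as follows. This is an illustration under stated assumptions rather than the exact procedure used here: the weight is taken to be the inverse sampling rate (population divided by devices), the bootstrap draws with replacement, and the names `poststrat_weights`, `weighted_evacuation_rate`, `bootstrap_rates` and the default `rate=0.01` are hypothetical.

```python
import random

def poststrat_weights(devices, population):
    """Per-block-group weight = population / devices (assumed inverse sampling rate)."""
    return {g: population[g] / devices[g] for g in devices if devices[g] > 0}

def weighted_evacuation_rate(evacuees, devices, population):
    """Evacuation rate after reweighting each block group to its census population."""
    w = poststrat_weights(devices, population)
    num = sum(w[g] * evacuees.get(g, 0) for g in w)
    den = sum(w[g] * devices[g] for g in w)
    return num / den

def bootstrap_rates(users_by_group, population, rate=0.01, iterations=100, seed=0):
    """users_by_group maps a block group to a list of 0/1 evacuee flags.
    Each iteration resamples round(rate * population) users per group, so every
    group is represented in proportion to its census population."""
    rng = random.Random(seed)
    rates = []
    for _ in range(iterations):
        flags = []
        for g, users in users_by_group.items():
            k = max(1, round(rate * population[g]))
            flags.extend(rng.choices(users, k=k))  # sampling with replacement
        rates.append(sum(flags) / len(flags))
    return rates
```

Comparing the weighted rate with the raw share of evacuee devices shows how much the estimate moves once under-sampled block groups are scaled up; the spread of the bootstrap rates gives a rough uncertainty band.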

Home census block group detection

For every anonymized device in the dataset, we first determine the home location prior to the landfall of the disaster by identifying the weekly primary locations using stay points overlapping the weekday evenings (8 p.m. to 7 a.m.). We then apply an agglomerative clustering method using complete linkage (Murtagh and Contreras, 2012), enforcing a maximum spatial diameter of 50 m on each cluster, and assign the cluster region with the longest cumulative stay duration as the candidate home census block group. In addition, we require stays on a minimum of two different days to mitigate the uncertainty of unbalanced usage across different evenings. After obtaining 5 consecutive weeks of potential home census block groups for each individual, we perform weekly cross-validation, tolerating at most one missing or mismatched week and using 50 m as the maximum Haversine deviation distance, which is consistent with the stay-point accuracy.
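
A minimal sketch of the weekly home-candidate step is given below. It implements a naive complete-linkage agglomeration in pure Python, which is adequate for the handful of night-time stay points a user has in one week; the function names and the input layout `(lat, lon, duration_s, day)` are assumptions for illustration, and the weekly cross-validation step is omitted.

```python
import math
from itertools import combinations

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dlmb = math.radians(b[1] - a[1])
    h = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6_371_000.0 * math.asin(math.sqrt(h))

def complete_linkage(points, max_diameter_m=50.0):
    """Merge clusters greedily while the merged diameter stays within the limit."""
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Complete linkage: largest pairwise distance between the two clusters.
        return max(haversine_m(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        d, a, b = min((linkage(clusters[a], clusters[b]), a, b)
                      for a, b in combinations(range(len(clusters)), 2))
        if d > max_diameter_m:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters

def candidate_home(night_stays, min_days=2):
    """night_stays: [(lat, lon, duration_s, day), ...] for one week of evenings.
    Returns the centroid of the cluster with the longest cumulative stay that is
    observed on at least min_days distinct days, or None."""
    pts = [(s[0], s[1]) for s in night_stays]
    best = None
    for c in complete_linkage(pts):
        if len({night_stays[i][3] for i in c}) < min_days:
            continue
        total = sum(night_stays[i][2] for i in c)
        if best is None or total > best[0]:
            lat = sum(pts[i][0] for i in c) / len(c)
            lon = sum(pts[i][1] for i in c) / len(c)
            best = (total, lat, lon)
    return None if best is None else (best[1], best[2])
```

Note that the two-day requirement can reject a location with a very long single stay, which is the intended guard against one unusual evening dominating the week.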

Evacuation detection

To determine each user’s evacuation status and identify the corresponding evacuation and return time, we use the cross-validated home location (above) for each user as the prime baseline location. The key challenges are the uncertainty in daily reported data and the risk of mislabeling short trips as evacuations. We therefore introduce a sliding time window (testing both 5 and 7 days) to mitigate such risks, iteratively identifying the primary location of each sliding time window and computing its deviation distance to the prime location. Within each rolling time window, we apply the same home detection technique and determine the primary location. If a departure of at least three consecutive calendar days from home is observed, we consider this user a potential evacuee and label the new primary location as the evacuation destination. For evacuation detection, we relaxed the maximal cluster diameter from 50 m to a 1 km distance threshold, in order to cope with the fact that the accuracy of signal and cellular service could be affected by Hurricane Harvey. The middle date of the time window is taken as the date of departure.
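
The core decision rule (away from home by more than 1 km for at least three days) can be sketched in simplified form. This sketch works on one primary location per day instead of the sliding-window clustering described above, and it assumes that days without data neither extend nor break a run of away days; the function name `detect_evacuation` and the input layout are hypothetical.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dlmb = math.radians(b[1] - a[1])
    h = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6_371_000.0 * math.asin(math.sqrt(h))

def detect_evacuation(daily_locs, home, min_days=3, dist_m=1000.0):
    """daily_locs[d] is the primary (lat, lon) on day d, or None if no data.
    Returns (departure_day, return_day) for the first qualifying absence,
    with return_day None if the user is still away at the end; else None."""
    run_start, away_days = None, 0
    for day, loc in enumerate(daily_locs):
        if loc is None:
            continue  # assumption: a missing day neither extends nor breaks a run
        if haversine_m(home, loc) > dist_m:
            if run_start is None:
                run_start = day
            away_days += 1
        else:
            if away_days >= min_days:
                return run_start, day
            run_start, away_days = None, 0  # short trip: reset
    if away_days >= min_days:
        return run_start, None
    return None
```

Counting only observed away days is a conservative choice: a single away observation followed by several days of missing data is not flagged as an evacuation.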

Net evacuation intensity estimation

In order to estimate the geographical patterns of intra-urban relocation, we use non-parametric kernel density estimation (KDE) to evaluate the net migration intensity for each urban grid s denoted as \(\widehat K_s^{{\rm {net}}}\). Specifically, we use the pre-disaster and post-disaster latitudinal and longitudinal information of all evacuees’ detected home locations as inputs for two KDE models that capture the probability density of home locations before and after evacuation, respectively. Each density function \(\widehat f\) has the following form:

$$\widehat f\left( s \right) = \sum_{i = 1}^{n} \frac{1}{n r^{2}} K\left( \frac{d_{i,s}}{r} \right)$$

where s represents a specific location of interest and r denotes the bandwidth of the KDE estimate, which relates to model resolution. In this study we use a Gaussian kernel function K, with r = 100 m. Here, i runs over all n home stay points in the time period of interest, and \(d_{i,s}\) is the Haversine (geodesic) distance between home location i and the location of interest, s. We use the set of originally detected home locations of all evacuees to construct the spatial density function \(\widehat K_s^{{\rm {pre}}}\). Similarly, we obtain \(\widehat K_s^{{\rm {post}}}\) from the post-evacuation destinations. We then compute the evacuation intensity of each area as \(\widehat K_s^{{\rm {net}}} = \widehat K_s^{{\rm {post}}} - \widehat K_s^{{\rm {pre}}}\). In order to facilitate the comparison of contour surfaces based on different subgroups with varying sample sizes, the final values of \(\widehat K_s^{{\rm {net}}}\) are normalized to have zero mean and lie in the range [−1, 1]. This allows us to easily depict influxes (outfluxes) of people above (below) the baseline.
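
A direct, unoptimised transcription of the density estimate and the net intensity is sketched below. Two choices are assumptions rather than details given in the text: the Gaussian kernel is taken as the standard bivariate form \(K(u) = \frac{1}{2\pi}e^{-u^2/2}\), and the normalization subtracts the mean and then divides by the largest absolute value. The function names are hypothetical.

```python
import math

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) pairs in degrees."""
    p1, p2 = math.radians(a[0]), math.radians(b[0])
    dlmb = math.radians(b[1] - a[1])
    h = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6_371_000.0 * math.asin(math.sqrt(h))

def kde(points, s, r=100.0):
    """f(s) = sum_i K(d_is / r) / (n r^2) with a Gaussian kernel K."""
    n = len(points)
    total = sum(math.exp(-0.5 * (haversine_m(p, s) / r) ** 2) / (2 * math.pi)
                for p in points)
    return total / (n * r * r)

def net_intensity(pre_homes, post_homes, grid, r=100.0):
    """Post-minus-pre density on each grid location, centred to zero mean
    and scaled so that the largest magnitude is 1."""
    raw = [kde(post_homes, s, r) - kde(pre_homes, s, r) for s in grid]
    mean = sum(raw) / len(raw)
    centred = [v - mean for v in raw]
    scale = max(abs(v) for v in centred) or 1.0  # avoid dividing by zero
    return [v / scale for v in centred]
```

Evaluating every home location at every grid cell is quadratic in cost; a production analysis over 100 m cells across the whole MSA would need a spatial index or a distance cutoff, which this sketch leaves out.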


Who evacuated?

When Hurricane Harvey approached the Texas coastline, officials did not instruct the populace in Houston to evacuate. And yet, a significant percentage of the population chose to evacuate anyway. Without fine-grained data, it is difficult to determine to what extent there were patterns of evacuation based on location or socioeconomic features. Based on our analysis of the mobility data, we detected 10,179 unique evacuees and 141,828 non-evacuees, indicating that 6.7% of the total population evacuated, on par with the official reports. Examples of evacuation detections of three users are shown in Figs. S6–S8. However, as shown in Fig. 1, this baseline evacuation rate varied substantially across different locations and neighbourhood types. In addition to a higher evacuation rate closer to the coast, we also identify many inland areas with evacuation levels as high as the coastal areas, such as Fort Bend and East Houston areas (Fig. 1a). We further identify two distinct geographic patterns for long-distance (90th percentile, >41.25 km) and short-distance (10th percentile, <2.71 km) evacuations, respectively. Whereas long-distance evacuation was almost entirely directed from the coastline inland, short-distance evacuation occurred in distinct, concentrated pockets across the MSA.

Fig. 1: Geographical and socio-demographic information on evacuation from Hurricane Harvey.

a–c Illustrate the net evacuation intensity of each 100 × 100 m grid, defined as the difference \(\widehat K_s^{{\rm {net}}} = \widehat K_s^{{\rm {post}}} - \widehat K_s^{{\rm {pre}}}\), where \(\widehat K_s^{{\rm {post}}}\) (\(\widehat K_s^{{\rm {pre}}}\)) is the KDE estimate of the probability density of home locations after (before) evacuation. These values are normalized to have zero mean and lie in the range [−1,1] (see the “Methods” section). As such, positive (blue) values correspond to an influx of residents above the baseline, with negative (red) values indicating an exodus. We compare the collective evacuation behaviour of all evacuees (a); evacuees who evacuated further than the 90% quantile (b); and evacuees who evacuated closer than the 10% quantile (c). The triangles in (a) represent reported fatalities; dense red cells indicate a greater rate of evacuation, while the blue shaded cells reflect the likelihood of being a relocation destination for the evacuees. The relative evacuation rates, which compare the proportion difference for subgroups of the population, are shown in (d)–(f); an evident disparity in the evacuees’ economic and social compositions can be observed.

We studied social disparity by classifying six types of neighbourhoods. A neighbourhood in this study is a census block group, a geographical unit used by the United States Census Bureau. Each block group contains between 600 and 3000 residents. We divided the census block groups in the Greater Houston area into six classes based on the wealth level (using a threshold of 25% of the residents below the federal poverty level) and the majority racial group (with a threshold of 50%): non-poor Black, non-poor Hispanic, non-poor White, poor Black, poor Hispanic, and poor White. We also conducted a sensitivity analysis by setting different thresholds on the poverty level and the majority racial group; the results can be found in SI Section 4 Sensitivity Analysis and Figs. S9–S11. The population baseline ratio of all neighbourhoods in Greater Houston can be found in Table 1. The total population is predominantly white (about 60%). There were increases of 39.8% and 24.3% for Asian and Hispanic populations, respectively, in the region between 2010 and 2018, making it more racially diverse than the overall population in the US.
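
The six-way classification can be written as a short rule. The sketch below is illustrative only: whether a block group sitting exactly at a threshold counts as poor or as having a majority is an assumption (here, poor at 25% or more and majority strictly above 50%), and the function name and argument layout are hypothetical. The two thresholds are exposed as parameters so that a sensitivity analysis can vary them.

```python
def classify_block_group(poverty_share, race_shares,
                         poverty_cut=0.25, majority_cut=0.50,
                         groups=("Black", "Hispanic", "White")):
    """Return e.g. 'poor Hispanic' or 'non-poor White', or None when no
    studied group exceeds the majority threshold.

    poverty_share: fraction of residents below the federal poverty level.
    race_shares:   mapping from group name to population fraction.
    """
    majority = max(race_shares, key=race_shares.get)
    if majority not in groups or race_shares[majority] <= majority_cut:
        return None  # no clear majority among the studied groups
    wealth = "poor" if poverty_share >= poverty_cut else "non-poor"
    return f"{wealth} {majority}"
```

Block groups without a clear majority are returned as None rather than forced into a class; how such block groups were treated in the study is not specified in this section.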

Table 1 Statistics of each type of neighbourhood included in this study.

To investigate both geographical and sociodemographic features of the evacuation mobility flow induced by Hurricane Harvey, we start with a multivariate kernel density estimate with Gaussian kernels on both origins and destinations (see the “Methods” section) to produce the contour plot of net flow density in each 100 m × 100 m grid. We then compare all evacuees’ residential block group characteristics (see Table 1) with the population breakdown to show the percentage difference as the relative evacuation rate.

Based on the locations from which the individuals evacuated, we can examine whether socioeconomic factors impacted evacuation behaviour. After reweighting the mobility records to account for sampling bias (as described in the “Methods” section), we find that residents of non-poor majority White block groups are significantly overrepresented among evacuees, at 19.8% more than we would expect based on their fraction of the total population (Fig. 1d). All other neighbourhood types evacuated less than the baseline, even though flooding was comparable across the neighbourhoods. On the opposite end of the spectrum, individuals from poor Hispanic communities showed the least tendency to evacuate, at 12.2% below their baseline (Fig. 1d). The result aligns with observations from other natural disasters. For example, wealth has also been shown to correlate positively with evacuation following Hurricane Irma (Yabe and Ukkusuri, 2020).

Through Fig. 1b, c and e, f we compare the subgroup patterns based on the evacuation distances. Figure 1b and e show the individuals who had long-distance evacuations (>41.25 km), i.e., moved further than the 90th percentile of the evacuation distance distribution. Figure 1b shows the spatial density flow corresponding to the evacuees who travelled long distances, where we can observe clear hotspots along the coast moving toward the inland areas. The patterns of evacuation rates from different neighbourhoods remain largely the same as those observed for evacuees overall. The only difference is a slight increase (4.2%) in the individuals from non-poor White communities relative to the overall relative evacuation rates (see Fig. 1e). Similarly, the evacuation discrepancy holds for those who relocated less than the 10th percentile (<2.71 km), even though the origins and destinations of evacuations are more spatially dispersed (Fig. 1c). However, non-poor White neighbourhoods are the only group that provides more evacuees than its baseline (Fig. 1f).

Where did they go?

In addition to detecting who left vulnerable areas during shocks, it is important to understand the locations to which they evacuated. This understanding is vital for disaster managers seeking to efficiently pre-position shelters with food, water, and medical care and for police officials and first responders who seek to ameliorate transportation challenges, such as gridlock. Without fine-grained mobility data, these remain major policy challenges. Figure 1a–c shows the general distribution of evacuation sources and destinations, but the evacuation distance is not discernible. Figure 2 shows the spatial distribution of evacuation activities and the transition matrices considering the sociodemographic characteristics. We find that, despite the wide differences in evacuation rates for different neighbourhood types discussed above, the evacuation distance follows a scale-free distribution with an exponential cutoff (Fig. 2a), in line with findings from other disasters (Yabe et al., 2020). Specifically, the distribution \(P(\Delta d) = (\Delta d_0 + \Delta d)^{-\beta}\exp(-\Delta d/K)\) fits evacuees from different communities with exponent value \(\beta = 1.57\), \(\Delta d_0 = 2.19\) km and cutoff value \(K = 38.29\) km.
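
To give a feel for the shape of such a distribution, the sketch below evaluates a power law with exponential cutoff using the reported parameter values. It assumes the functional form \((\Delta d_0 + \Delta d)^{-\beta}\exp(-\Delta d/K)\), which is the usual truncated power law in the mobility literature; the function names, the 500 km integration limit, and the trapezoid-rule normalization are choices made for illustration only.

```python
import math

def evac_distance_weight(d_km, beta=1.57, d0_km=2.19, cutoff_km=38.29):
    """Unnormalised weight (d0 + d)^(-beta) * exp(-d / K) for a distance in km."""
    return (d0_km + d_km) ** (-beta) * math.exp(-d_km / cutoff_km)

def normalised_density(max_km=500.0, step_km=0.01, **params):
    """Return a density function normalised on [0, max_km] by the trapezoid rule."""
    n = int(round(max_km / step_km))
    ys = [evac_distance_weight(k * step_km, **params) for k in range(n + 1)]
    area = step_km * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    return lambda d_km: evac_distance_weight(d_km, **params) / area

# Compare the density at the short- and long-distance thresholds used in the text.
p = normalised_density()
ratio = p(2.71) / p(41.25)
```

The power-law part dominates at short distances, while the exponential factor suppresses trips much longer than the cutoff K, which is why very long evacuations are rare.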

Fig. 2: Evacuation distance and destination choice.

a The distribution of evacuation distance for residents from different block group categories. b The correspondence between evacuation origin and destination type (note: all rows sum to 1). There is apparent universality between different block group types when viewed from the probability of evacuating a given distance (a). But this apparent similarity breaks down upon examining destination choices, with residents of wealthier (b, right) and/or white neighbourhoods tending to evacuate to block groups of the same composition.

Based on this apparent universality, we might expect that the evacuation process is spatially uniform, with evacuation destinations determined only by one’s distance from their origin. We find that this is not in fact the case; there is remarkable socioeconomic homophily in evacuation behaviour, with people from a given block group type tending to evacuate to the same block group type, even if it is far away. To assess this, we compute the transition matrix between block group types based on the origin–destination information for all evacuees. We normalize the matrix so that each value represents the fraction of evacuees from a given block group type who evacuated to a given block group type. In Fig. 2b and c we see strikingly different behaviour from the spatial pattern. The residents from White neighbourhoods have an 88.1% probability of evacuating to the same type of block group, while for Hispanic communities that ratio drops to 56.8%. Evacuees from Black communities are the least likely to relocate to a neighbourhood that is consistent with their original home community type (16.7%). Similarly, individuals from non-poor communities have a 92.9% probability of relocating to a neighbourhood of similar social background. In contrast, the rate drops to 35.4% for residents of poor neighbourhoods.
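
The row-normalised transition matrix described above can be computed in a few lines. The sketch assumes each evacuee is represented by an `(origin_type, destination_type)` pair; the function names and the toy labels used below are hypothetical and do not reproduce the study's data.

```python
from collections import Counter, defaultdict

def transition_matrix(od_pairs):
    """Row-normalised origin -> destination matrix as nested dicts.
    Each row gives the fraction of evacuees from one origin type who went
    to each destination type, so every row sums to 1."""
    counts = defaultdict(Counter)
    for origin, destination in od_pairs:
        counts[origin][destination] += 1
    matrix = {}
    for origin, row in counts.items():
        total = sum(row.values())
        matrix[origin] = {dest: c / total for dest, c in row.items()}
    return matrix

def homophily(matrix):
    """Diagonal entries: probability of evacuating to the same block group type."""
    return {origin: row.get(origin, 0.0) for origin, row in matrix.items()}
```

Reading off the diagonal with `homophily` gives the same-type evacuation probability for each origin type, which is the quantity compared across racial and wealth groups in the text.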

How long were they gone?

Evacuation duration is a key factor in disaster recovery and resilience (Fussell et al., 2010; Gao et al., 2016; Gehlot et al., 2019; Liu et al., 2020; Sadri et al., 2017) (the definitions of resilience and other terms in this study can be found in the SI Section 2). Identifying the precise beginning and end of evacuation using the mobility data, we find that most evacuation departures occurred in the first few days following landfall, with 95.6% between 1 and 7 days following landfall (Fig. 3). Only a few people (4.3%) evacuated in advance of landfall (Fig. 3a). In contrast, returns began shortly after the peak of departures but spread out over a more skewed, heavier-tailed distribution; though most evacuees returned within two weeks, some took far longer. The exact dates of departure and return were determined via spatial clustering with a 5-day sliding window (see SI Section 3) to reflect whether an individual stabilized in a given location.

Fig. 3: Race and wealth impacts on evacuation/return times.

a The kernel density of both departure and return of relocation activities. b Distribution of individual evacuation durations grouped by income and race differences in communities. Though most departure dates are tightly clustered within the first week of Harvey’s landfall (a, red), we find a right-skewed distribution of return dates (a, blue), with some Houston residents taking far longer to return than others. The tail of this distribution is formed mostly from residents of wealthier neighbourhoods of all racial compositions (b), suggesting only those residents could afford long displacements.

As with the distance distribution, it is tempting to think that the approximately exponential return time distribution indicates a homogeneous behavioural pattern. Yet when we incorporate the sociodemographic information of each evacuee, we uncover marked socio-economic disparities. We perform the bootstrapping on the overall samples iteratively for the different neighbourhood types compared to the baselines based on the cumulative evacuees and returners as a subgroup. From the violin plot of evacuation duration shown in Fig. 3b, we can see that, as with the spatial patterns, basic statistics such as the mean and median show little variation across different types of block groups. The tails, however, differ: 90% of poor Black evacuees had returned within 13 days, whereas at that point only 75% of poor White evacuees had returned, indicating large variability in return times. Additionally, the vast majority of long-duration relocations (over 30 days) involve evacuees from higher social class neighbourhoods. Past research has suggested that poorer residents have greater challenges finding new, short-term housing while wealthier and better-positioned ones can do so more easily (SAMHSA, 2017).

We are further interested in discovering whether socio-demographic factors impact the time of evacuation day by day, for example, to determine whether certain groups are more likely to have evacuated immediately upon landfall or to have waited. A similar question regards return from evacuation: Which groups are over or underrepresented on a given day? To address these questions, we compute the disparity rate in departure and return times. Due to the limited number of data points in certain block group types, we calculate two disparity rates based on two aggregations of neighbourhoods into (i) non-poor vs. poor; and (ii) white majority vs. non-white majority, as defined above. For each such division, we then find the relative difference in the cumulative prevalence of departures/returns on each day. The racial (wealth) disparity rate \(D_i\) for a given day i is thus calculated as \(D_i = (R_i - R)/R\), where R is the total fraction of evacuees coming from white (non-poor) neighbourhoods while \(R_i\) is the same fraction, but only including departures (or returns) up to day i. High values of \(D_i\) thus indicate that evacuations (or returns) from white or non-poor neighbourhoods were overrepresented up to time i in the process. The results can be found in Fig. 4.
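
The disparity rate can be computed directly from a list of dated events. The sketch below assumes each departure (or return) is recorded as a `(day, in_group)` pair, where `in_group` is True for evacuees from white (or non-poor) neighbourhoods; the function name and input layout are hypothetical.

```python
def disparity_by_day(events):
    """events: [(day, in_group), ...] for departures or returns.
    Returns {day: D_i} with D_i = (R_i - R) / R, where R is the overall
    fraction of in-group evacuees and R_i the fraction among events up to day i."""
    overall = sum(1 for _, g in events if g) / len(events)
    if overall == 0:
        raise ValueError("disparity is undefined when no in-group evacuees exist")
    rates = {}
    for day in sorted({d for d, _ in events}):
        upto = [g for d, g in events if d <= day]
        r_i = sum(upto) / len(upto)
        rates[day] = (r_i - overall) / overall
    return rates

# Toy example: in-group evacuees leave early, others leave late.
toy = [(0, True), (0, True), (1, True), (1, False), (2, False), (2, False)]
toy_rates = disparity_by_day(toy)
```

By construction the rate on the final day is zero, since the cumulative fraction then equals the overall fraction; the informative part of the curve is how quickly it approaches zero.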

Fig. 4: Time progression of class and racial disparity in evacuation and return.

The vertical axis shows the relative disparity in the cumulative numbers of departures (a) and returns (b) over time, between residents of poor vs. non-poor neighbourhoods (squares), and white vs. non-white neighbourhoods (circles). A positive value on a given day indicates a disproportionate number of people from white or non-poor neighbourhoods having evacuated (returned) up to that point.

Figure 4a indicates that while most departures took place within a short period, the social disparities persist along the temporal dimension. In terms of poverty (or wealth), we see that for those who departed prior to the disaster, the percentage of residents from wealthy neighbourhoods is over 50% higher than that of all evacuees. Such variation presents a steady decrease as the hurricane strikes the city and falls below 50% after the first day of the landfall. We also observe that the standard deviation is notably larger in the early days which is due to the small number of early evacuees.

Overall, the disparity rates along the wealth dimension were positive over multiple days, suggesting that people with higher incomes were more likely to evacuate ahead of those with lower incomes. The trend based on race also shows that evacuees from neighbourhoods with higher White-population ratios were more likely to evacuate early, yet right after the landfall people from disadvantaged block groups comprised more of the evacuees (about 10% higher than the baseline). Overall, both income and race disparity metrics indicate that the majority of early evacuees came from neighbourhoods of potentially higher social class, yet this disparity decreases at a steady pace.

When we project the disproportionate effect of return behaviour over the period between 28 August and 17 September which is shown in Fig. 4b, we observe that like the departure behaviour, the returners’ daily disparities show greater class than racial difference. We consider Day 5 after Harvey as the beginning of the return period based on local news and weather reports verifying rainfall had stopped and Day 23 as the end, based on the 95th quantile of the overall return time. We see that early returners, especially those who came back to their homes in less than 3 days, primarily came from wealthy communities. However, this class disparity effect continues to decline with a greater proportion of poor evacuees returning and reaches equity 2 weeks after the disaster. Such class differences increase monotonically, reaching the same proportion after 3 weeks, at which point most of the evacuees have returned to their homes in Houston.

Finally, we estimate the extent to which the evacuation and recovery timing varied by race and class by computing the daily disparity rate. We find that a small percentage of the overall evacuees (4.3%) responded to Harvey and relocated before its landfall. The low rate of evacuation may be because urban residents collectively perceived a low level of risk in the absence of mandatory evacuation orders in Houston. Disparities are particularly marked among early evacuees: the share of residents from wealthy neighbourhoods is about 50% above the evacuee baseline, and the share of people residing in White majority neighbourhoods is about 25% above it. As expected, the class and racial minority proportion of the evacuees increases steadily after Harvey’s arrival. This finding confirms that minorities are at higher risk of disruption and adverse impacts. This behavioural difference indicates that income level could be a contributing factor in early-evacuation decision-making. Further, disparities in terms of time away indicate that income plays an important role. We can see that the proportion of evacuees from disadvantaged communities started increasing as the storm ended, echoing previous studies (Fussell et al., 2010; Yabe and Ukkusuri, 2020) showing that class and housing ownership have a strong impact on return dates.


Our analysis of three months of detailed human mobility data before, during, and after Hurricane Harvey reveals the complexity of disaster response patterns, exhibiting both universality and heterogeneity in different dimensions. We first quantify the wealth and racial disparity in terms of the overall evacuee composition as well as the intra-urban displacement patterns. By examining these hotspots, we find that three of the peaks correspond to areas with high fatalities while the other hotspots to the southwest of downtown Houston had no fatalities. We further observe that non-poor White populations were the most likely to evacuate for evacuations of all distances. There are several plausible reasons for such class and race differences, including uneven access to transportation, inequitable regional development, and perhaps even different levels of preparedness and perceived risks. In Harvey’s case, with no evacuation order declared for Houston, the evacuation decision process may have taken place with little prior planning, increasing the role of financial resources and social networks in making such decisions.

Furthermore, we extract the spatial and temporal features of evacuation activities and investigate the heterogeneity among different residential groups for each metric. There is a high degree of universality in evacuation distance after comparing the evacuees’ distributions from different types of communities. This finding aligns with previous studies based on similar extreme disasters such as hurricanes and earthquakes, indicating that evacuation during Harvey followed the heavy-tailed human movement pattern of past crises. The overall temporal features demonstrate limited impact from differences in the socio-demographic features of communities. For most of the block groups, the evacuees’ displacement duration was less than one week, while residents from more affluent communities were displaced for longer. It is also notable that displacements longer than 30 days disproportionately involve residents of socio-economically advantaged neighbourhoods. The longer exodus is likely because only the wealthy possess the financial resources to afford a longer stay away; those with low incomes would struggle with limited resources and potentially higher burdens.

Even with the high spatial and temporal resolution of our data, our analysis is not without limitations. First, we have focused only on mobility in Houston MSA during a three-month period between July and October 2017. Larger areas and longer observation windows would be required to detect larger (e.g., out of state) displacements and long-term impacts such as permanent relocations. A second limitation is the sampling bias amongst different communities owing to different penetration rates of smart devices. Though we have made efforts to quantify and mitigate this bias through resampling (see the “Methods” subsection “Sampling bias correction”), ensuring data balance remains a challenge. Third and finally, owing to the dearth of suitable data sets, our analysis of race and class has not fully controlled for other features known to impact evacuation behaviour, such as home damage, infrastructure conditions, and strength of social ties. We have tested a regression model incorporating data on flooded roads to capture infrastructure conditions and found the overall predictive power of that factor is not significant (see SI Section 5, Evacuation Rate Prediction and Figures S12 and S13). Nonetheless, additional information about the urban fabric, encompassing both physical and social infrastructure, is needed to fully understand the complex interplay between race, wealth, and disaster outcomes.

The significance of income levels in determining evacuation response, with wealthier residents evacuating earlier, longer, and at a higher overall rate, indicates that some residents may have preferred to evacuate but judged that the costs of evacuation were too high, and they, therefore, took on the risks of staying put. This impossible choice between physical harm and economic harm is a recurring theme, notably in the very different crisis of COVID-19. In the first wave of spring 2020, compliance with social distancing guidelines was significantly lower in low-income neighbourhoods across the United States (Oliver et al., 2020; Ruiz-Euler et al., 2020). This points to potential interventions in the form of temporary financial incentives, government messaging, and/or preferential allocation of resources aimed at ameliorating the unequal effects of major urban disruptions. The approaches and high-resolution data sets from this study can help us arrive at concrete conclusions on the disparities and devise effective strategies to mitigate human suffering.