A network percolation-based contagion model of flood propagation and recession in urban road networks

In this study, we propose a contagion model as a simple and powerful mathematical approach for predicting the spatial spread and temporal evolution of the onset and recession of floodwaters in urban road networks. A network of urban roads resilient to flooding events is essential for the provision of public services and for emergency response. The spread of floodwaters in urban networks is a complex spatial–temporal phenomenon. This study presents a mathematical contagion model to describe the spatial–temporal spread and recession process of floodwaters in urban road networks. The evolution of floods within networks can be captured based on three macroscopic characteristics—flood propagation rate (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\beta$$\end{document}β), flood incubation rate (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\alpha$$\end{document}α), and recovery rate (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mu$$\end{document}μ)—in a system of ordinary differential equations analogous to the Susceptible-Exposed-Infected-Recovered (SEIR) model. We integrated the flood contagion model with the network percolation process in which the probability of flooding of a road segment depends on the degree to which the nearby road segments are flooded. The application of the proposed model is verified using high-resolution historical data of road flooding in Harris County during Hurricane Harvey in 2017. The results show that the model can monitor and predict the fraction of flooded roads over time. Additionally, the proposed model can achieve 90% precision and recall for the spatial spread of the flooded roads at the majority of tested time intervals. The findings suggest that the proposed mathematical contagion model offers great potential to support emergency managers, public officials, citizens, first responders, and other decision-makers for flood forecast in road networks.

Scientific RepoRtS | (2020) 10:13481 | https://doi.org/10.1038/s41598-020-70524-x www.nature.com/scientificreports/ intensive computation 17 . Due to delay and computational cost issues, the existing physics-based hydrodynamic models may not be conducive to providing timely and reliable predictions for the spatial-temporal spread of floods and the failure of road segments within short time periods 18 . To overcome the limitations in empirical models, machine learning techniques have been proposed and tested for predicting the spread of floodwaters in urban areas 19,20 . Compared to the hydrodynamic methods, machine learning models, such as multiple linear regressions 21 , deep neural networks 22 , and Bayesian forecasting models 23 require fewer input parameters, so that the models can be easily trained on historical flood event data. For example, Khosravi et al. tested four decision tree-based machine learning models-logistic model trees, reduced error pruning trees, naïve Bayes trees, and alternating decision trees-for flood mapping 24 . The results show that, with adequate training, these models can achieve greater than 80% accuracy for predicting the flooded locations. Youssef et al. integrated frequency ratio and logistic regression models to evaluate the correlation between flood occurrence and various potential factors and developed a model providing an acceptable prediction accuracy 14 . Although existing machine learning models can achieve good predictive performance to capture the flood propagation in urban networks, these models are limited due to their dependence on large sets of historical data for model training. In addition, existing machine learning models are designed to capture only the propagation of flood in urban areas. The flood recession process, which is also important for assessing the resilience of urban networks, is often ignored by the existing machine learning models.
Recognizing the limitations of existing models, there is a real need for mathematical models that can capture the spatial-temporal evolution of flooding without relying on a variety of input parameters and historical data such as the volume of waters and the width of the roads. Recent studies have demonstrated a surprisingly significant similarity among spreading processes in different systems, including the spread of traffic congestion in transportation, the contagion of infectious disease in populations, the diffusion of ideas in social networks 25 , as well as the evolution of flooding in urban road networks 26,27 . Motivated by these studies, our goal in this research was to describe the floodwater spreading process using generalized mathematical contagion models, such as classical epidemic models 28 . Existing epidemic models offer an analytical and numerical framework to quantify and forecast multiple spread phenomena in a variety of contexts. In particular, the popular susceptibleinfectious-recovered (SIR) model created the basic building blocks of epidemic modeling using infectious and recovery rates. These mathematical models have two fundamental hypotheses: compartmentalization, in which each entity is associated with a state or compartment; and homogenous mixing, in which each entity has the same chance of contacting an inflected entity 29 . In the context of flooding, each road cell is associated with a state, functional-flow or flooded. Hence, the flooding propagation problem fits this compartmentalization hypothesis. While these hypotheses simplify the modeling of contagion by eliminating the need to know the structure of the networks, the mathematical models can still capture the temporal evolution of the fraction of infected entities in the networks very well.
Flood risk prediction is a task that should take into account both the temporal and spatial natures of the floodwater in road networks. Urban flood risk characterization requires not only knowledge of the fraction of flooded roads at each timestamp, but also needs to identify the geographic locations of flooded roads as flooding unfolds. Hence, pure mathematical models are not able to satisfy these requirements. To this end, the network percolation process has gained attention recently because it enables to capture of the propagation process through the topological connectivity in networks 30 . As defined in percolation theory, the spread of infection relies on the probability of the infected neighbors, in which the heterogenous mixing assumption is held in local components of the networks 31 . Specifically, the infection spreads from an initial node along edges of the percolated network 32 . Hence, the percolation process reflects the "amplification" effect of neighbors and weakens global network interactions. An infection is more likely to be transmitted to those the node comes encounters. This characteristic is essential for flood propagation prediction in urban networks since it considers the spatial co-location and constraints of urban networks. Without the temporal information about the fraction of flooded roads, however, the percolation process would fail to capture the temporal evolution of flooding in urban networks.
This study proposed and tested a network percolation-based contagion model that integrates the mathematical framework and network percolation process to predict the spatial propagation and temporal evolution of flooding in urban road networks. The mathematical framework fits the temporal dynamics of the flood situations, and the percolation process identifies locations of flooded road segments. To illustrate the performance of the model, we applied it to a case study of flood evolution in Harris County road networks during Hurricane Harvey. Potential control strategies are also identified based on the outcomes of the model in the case study.
Scientific RepoRtS | (2020) 10:13481 | https://doi.org/10.1038/s41598-020-70524-x www.nature.com/scientificreports/ Each point is associated with a longitude and a latitude so that it can be located on a geographical map. Then, a road segment can be represented as lat s , lng s , lat e , lng e , where lat s and lng s are the latitude and longitude of the starting point, while lat e and lng e are the latitude and longitude of the ending point.
Although road segments enable good resolution for understanding the situation in urban networks, flooded areas are usually not restricted in a banded road segment. Floodwaters tend to start from a point and spread in all directions. In addition, massive segments and their complex connections in the networks would also cause intensive computational cost. Hence, in modeling road networks, segment-to-segment modeling is limited to follow the nature of flood spread and achieve efficient computation. To this end, grid decomposition, a commonly used method 33 , is adopted in this study to generate equal-sized cells and divide the study area into small regions (see Fig. 1).

Definition 2.
A road cell is a square over a rectangular projection of the geographical map.
The spatial boundary of the cell can be represented as a set of geo-coordinates lat bl , lng bl , lat ur , lng ur , where lat bl and lng bl are the latitude and longitude of the bottom left corner, while lat ur and lng ur are the latitude and longitude of the upper right corner. To covert the road segments to grid, we apply the following criteria to the road network: After grid decomposition, the road network assembled from road segments could be described as a series of cells. In practice, multiple factors would influence the outcome of grid decomposition. For example, a large cell would include many segments, which might lead to losing spatial resolution. Accordingly, the model would further lose the capability of capturing the spatial spread of flood. On the other hand, a small grid that cannot cover at least one road segment will partition the network into discontinuous components, and subsequently increase the computational cost. Hence, grid decomposition requires pre-testing to ensure that a cell in the grid is able to maintain spatial resolution without unduly burdening computation.
Once the road segments are assigned to cells, we remove cells lacking segments to reduce computational cost. The remaining cells form a network in which the cells are considered as nodes, and their shared borderlines are considered as links. By doing so, we can construct an undirected grid network G with average degree of k to represent the topology of the road networks. Average degree of a road cell is the average number of adjacent cells per cell in the road network. To model the flood propagation and process, we then associate a dynamical binary state variable x to each of the N cells (also called nodes) of the grid network G , such that x i (t) ∈ {0, 1} represents the flood status of node i at time t . Using a standard notation, we divide the cells into two classes, functional flow (F) and flooded (C) , corresponding respectively to the values 0 and 1 of the status variable x . Functional flow is a state in which traffic can utilize a road (regardless of traffic level). In the context of flooding spread process, the status C represents the cells which have been flooded. At each time t , the macroscopic flooding situation is given by the fraction of flooded cells Flood spread characterization. The flood propagation and recession process are temporally and spatially variant. To capture the temporal nature of flood evolution in urban road networks, the proposed model considers macroscopic characteristics to predict the temporal evolution of floodwater spread in urban networks.
In the first step, we define four flooding statuses for a cell: functional flow, exposed, flooded, and recovered, statuses by including the temporal attributes. C(t) represents the number of flooded cells in the network at time t ; F(t) represents the number of functional flow cells at time t ; E(t) represents the number of cells that are in flood incubation stage (i.e., roads in the path of approaching floodwater but on which traffic still moves) at time t ; and R(t) represents the number of cells that have recovered from flooding at time t . Given a grid network G with an average degree of k and N nodes, each cell in the network has functional flow at time t = 0 . That is,F(t = 0) = N and C(t = 0) = 0 . The flood initially occurs at a set of nodes and then propagates throughout the network. From (1) s i ∈ grid j , if lat bl ≤ lat s + lat e 2 ≤ lat ur and lng bl ≤ lng s + lng e 2 ≤ lng ur www.nature.com/scientificreports/ a macroscopic perspective, a cell in the undirected network is on average connected to k other cells. The neighbors of a flooded cell are exposed to flood at a rate of β . In modeling the temporal evolution, the connections of the cells are assumed to be homogeneous, which forms the basis to formulate a general differential equation system. Then, the probability of a flooded cell being connected to a functional-flow link is F(t)/N at time t . Therefore, a flooded cell comes into contact with kF(t)/N . Since C(t) flooded cells have water flowing, each at a rate β , and the exposed cells become flooded at a fixed rate α , the average number of new exposed cells dE(t) during a unit timeframe dt is determined as follows: The exposed cells are not completely inundated, and traffic can still pass through, but they could be flooded within the next few timestamps. We call this stage, flood incubation stage. The defined four statuses of the road cells are the only statuses that a road cell could have. Hence, the sum of the four status variables should be N .
To capture the fraction of the cells in each status, we use the following variables: c(t) to represent the fraction of flooded cells in the grid network at time t ; f (t) to represent the fraction of functional-flow cells in which all road segments are accessible at time t ; r(t) to represent the fraction of recovered cells from flooding at time t ; and e(t) to represent the fraction of cells that are exposed to flooding but still in flood incubation at time t . Then, the differential equation for the change rate of exposed cells can be derived as follows: where, the product of βk is called transmission rate or transmissibility. The transmissibility can be used to measure the capability of floodwater to spread in an urban network under the same propagation rate. This also allows us to understand the effects of the topological structure of an urban network on the transmission of floodwaters.
Since a fraction of the functional-flow cells become exposed at each timestamp, the decreasing rate of the fraction of functional-flow cells can be represented by: Simultaneously, in a unit timeframe, a fraction of exposed nodes would be flooded at a rate α , and some of the flooded cells recover at a rate µ . Hence, the changing rate of the fraction of flooded cells and the fraction of recovered cells can be formulated as: Evidently, in a large-scale network where N is large, the probability of a flooded cell being connected to a functional-flow cell could be close to zero. At the macroscopic scale, the assumption of homogeneous mixing makes the prediction more tractable and robust. It should also be noted that the model does not include mortality (i.e., significantly damaged roads that would not be functional after the floodwater recedes). Since it is not often the case that a flooded road cell is severely damaged, excluding the mortality is reasonable and realistic. Based on the former constructs, the model component derived for capturing the temporal dynamics of flooding scale using macroscopic characteristics is established ( Fig. 2A).
network percolation process. The spatial nature of floodwater spread in urban networks is modeled using a network percolation process. The network percolation process describes the contagion effects of a flooded cell on its network neighbors 34 . Specifically, the cells whose neighbors are flooded are more likely to be flooded than the cells whose neighbors are not flooded 29 . Similarly, floodwater is more likely to recede first from the cells whose neighbors are not flooded compared to the flooded cells whose neighbors are also flooded. To characterize this spatial aspect of flood spread, we define the probability of a node to be flooded or to be recovered in the next timestamp based on the number of flooded neighbors. The propagation and recession processes are modeled as described below, respectively (see Fig. 2B).
In the propagation process, the percolating cluster is a set of road cells which are flooded next to another bounded by functional-flow cells at a specific time interval. Since the extreme event, Hurricane Harvey in this study, affected a large area, multiple percolating clusters presented in the road network. They were spatially scattered across the network at the early stage of flooding. As the event progressed, some clusters broke through the bound and formed larger percolating clusters. We can obtain the number of flooded cells N   p , in the current timestamp t is obtained by: As discussed earlier, the spatial pattern of the propagation process ( F → E → C ) is controlled by the fraction of flooded neighbors of a cell. Hence, we assign the probability of flooding (i.e., the fraction of flooded cells among all neighbors) to the unflooded cells. Each unflooded cell will be assigned a probability p In the next step, we assign the probabilities of flooding (i.e., the fraction of flooded cells among all neighbors) to the flooded cells. Each flooded cell is assigned a probability p . That is, the probability of floodwater receding is 1 − p (t) i . In a manner different from the propagation process (in which the cells are sorted based on their probabilities of flooding from high to low), we sort the cells based on their probabilities of flooding from low to high (i.e., probabilities of floodwater receding from high to low). The recovered cells at timestamp t are assigned a low probability of flooding.
Using the above calibration in the percolation process, we can mitigate the homogeneous mixing assumptions in the flood spread characterization model by adopting local heterogenous flood probabilities and achieve high accuracy in predicting the spatial distribution of flooded cells in urban networks.
Model evaluation. We employed two metrics to evaluate the performance of the model for predicting the temporal evolution and spatial propagation of flooding spread in urban networks. The first component of the model is a system of differential equations to capture the magnitude of flooded cells in networks. The objective of this component is analogous to addressing a fitting curve. Hence, we use the root mean square error (RMSE) 35 to measure the error of the model: www.nature.com/scientificreports/ where, ŷ i is the predicted value, y i is the observed value, and n is the number of observations. Ignoring the division by n under the square root, the formula can be considered as a formula for Euclidean distance between the vectors of predicted values and observed values. Hence, the RMSE is a normalized distance between the predicted outcomes and the observations which can be used to evaluate the model accuracy. This can also serve as a heuristic for a training model, which will be used in a pattern search algorithm to obtain the numerical solution for the model (see "Results" section). The spatial nature of the flooding propagation and receding is captured by the outcomes of the flood spread characterization model and the network percolation process. The intersection between the set of predicted flooded road cells and the set of observed flooded road cells indicates the precision and recall of the model 36 . They are formulated as follows: where true positive is an outcome where the model correctly predicts the flooded class, false positive is an outcome where the model incorrectly predicts the flooded class, and false negative is an outcome where the model incorrectly predicts the unflooded class. The calculated precision and recall allow us to assess the performance of the model by identifying the specific correctly predicted road cells. We collected the traffic data for 19,712 roads in Harris County from INRIX, a private company providing location-based data and analytics. The INRIX traffic data covers all available road roads-from interstates to intersections, country roads to neighborhoods. The data includes the average speed for each road segment in 5-min intervals. The timeframe of data is August 1 to September 30, covering the entire flooding period. In this case study, we used squares with a length of 400 m for generating the grid network.

Study context and data collection.
In the INRIX dataset, the flooded road segments can be identified by a designation of NULL for average speed, meaning there was no vehicle driving through the segment. Comparing the data before and after Hurricane Harvey, we found that the NULL average speed appears only during the flooding period. By cross-checking the flooded roads on government reports, most of the roads with NULL speed were flooded at that time. Although the average speed data was collected in 5-min intervals, the flood situation did not evolve significantly in such a short time period. To better capture the flood propagation and receding process, we aggregated the data at 4-, 8-, and 12-h intervals to test the performance of the model. Each interval was assigned a binary value of flooding status to the road segments based on their average speed. As documented in the model section, the value would be 0 if there is no NULL speed record, and the value would be 1 if there is a NULL speed record in the dataset (indicating functional flow status for the roads). pattern search for parameter estimation. It is often the case that the analytical solution to the differential equation system cannot be generated. That is because the functions are usually not continuous or differentiable 39 . To efficiently estimate the parameters in the proposed model, we applied a global pattern search algorithm 40 as a derivative-free numerical optimization method to fit the curves for each variable (i.e., f (t) , e(t) , c(t) , and r(t)) 26 . The objective function in this optimization process is the RMSE, which will be minimized by searching for the optimal propagation rate β , recovery rate µ , and exposed rate α . To start with this algorithm, we first specified initial values for the three parameters. The change of the parameters could be either increased or decreased in each step. The algorithm would compute the RMSE for each step until it finds an optimal point at which the current RMSE is smaller than the previous one. If the algorithm cannot find the optimal point using the current step size, it will use half of the current step size and repeat the computing process. Then all three parameters will move to the optimal values. The algorithm runs iteratively until one of the stopping criteria is met: the maximum number of iterations is reached or the step size is smaller than a certain threshold. In this study, we set the maximum number of iterations to be 2,000, and the threshold for the step size is 0.0001. The initial values for models with time intervals are the same, β = 1 , α = 1 , and µ = 0 . Through our tests, the selection of values around these initial values would yield similar results for the parameters. Table 1 shows the estimated values for the parameters and the optimized RMSE for three models. The RMSE increases a little bit with the increase of the length of the time intervals. That is because the fraction of road cells that are flooded (i.e., c(t) ) is greater in longer time intervals than in shorter time intervals. But, all three models fit the flood spread patterns in road cells well since the RMSE value accounts for only a very small proportion of the peak value of c(t) . In addition, the propagation rate β remains the same across three models. This result indicates that the length of the time interval does not affect the capability of the model to predict flooding propagation. The flood incubation rate α and recovery rate µ , however, decrease with the increase in the length of the time interval. That is because the number of flooded road cells increases with the increase in time intervals. Hence, www.nature.com/scientificreports/ a great number of road cells would be in the flood incubation state, which leads to a low rate of α (Fig. 3D-F). As such, the recovery rate must be small to represent a large number of flooded road cells. This result reveals that the length of the time interval has an influence on parameter estimation, but it does not significantly affect the fitting performance.
prediction results. Once the optimal values for the parameters are obtained, we can capture the temporal evolution of flood propagation and recession in urban network by showing the spread curves for each variable (i.e., f (t) , e(t) , c(t) , and r(t) ) ( Fig. 3D-F). Basic reproduction number R 0 = β/µ is used as a measure of the number of secondary flooded cells generated by the first flooded cell over the course of the flooding unfolding 41 . Mathematically, when R 0 is smaller than 1, the propagation will not occur as the recovery rate is greater than the propagation rate 42 . In the Hurricane Harvey case study, the estimated R 0 is greater than 1 across all three models and increases with the length of time interval (see Table 1). This result explains how rapidly floodwaters spread. The closer the value of R 0 to 1, the more stable the flooding situation is. In this case, we can observe that the flood situation would change more slightly if the time interval is shorter than 4 h, since R 0 would be close to 1. When the time interval increases from 4 to 8 h, the value of R 0 jumps from 1.20 to 2.45, meaning that in every 4-to 8-h increment, the flood situation would change more drastically.
In addition, the changes in c(t) allow us to predict the extent of flood in the networks and critical timestamps for breakout and peak (see Fig. 3A-C). The figures show the flood spread characterization in Eqs. (3)-(6) are fit to the empirical flooding data very well. We can observe that at the beginning of the hurricane (the hourly timestamps in day 1 and day 2), the number of flooded cells grew exponentially between every 4-, 8-, or 12-h increment, and reached the peak in the middle of day 3. After that timestamp on day 3, the flood receded gradually. Using the results, we can estimate the time that the urban road network needs to recover from flooding. In this case, evidently, the recovery rate was slow, which led to a long period for recovery at some severely flooded locations.
In the next step, we examined the predictive performance for both spatial spread and temporal evolution for the flooding, shown in Fig. 4. All models perform very well in terms of predicting the specific locations of flooded segments based on the flooding data at the last timestamp. The best performance of these models appears during the peak and receding period, while the performance is not particularly good at the beginning of the flooding. That is because the locations of the initial flooded cells depend on the rainfall magnitude and spatial patterns of precipitation in different areas. The emergence of initial flooded cells is difficult to be identified by the model without additional information about the precipitation pattern. After a few hours, however, the flood propagation  To demonstrate the adaptation of the model in other contexts, we conducted further controlled experiments to examine the adaptation of the model to different topological properties of urban networks (average degree), the propagation rate which reflects the stress of rainfall, and flood incubation rate and recovery rate. We change the value of one parameter that was learned from the INRIX data and control other parameters to be constant as shown in Table 1. The results show the extent to which different parameters could affect the predicted outcomes. This experiment also provides evidence for supporting the generalizability of the proposed contagion model for different cities and various intensities of flooding.
First, we tested the effect of average degree of an urban network on the flood dynamics by keeping the same values for other parameters. Figure 6A shows the changes in model results for predicting the flood status in 4-h time intervals. With the increase of the average degree k of the networks, the growth rate of the flooded cells increases significantly during the propagation phase. That is because the network with a large k will allow the flooded road cells to infect more neighbors, which expedites the spread of flood even when the propagation rate remains the same. In addition, a large k would decrease the time to reach the peak of flooding since the flooded cell would have more connected neighbors. In contrast, a small k would lead to a longer period of flooding, although the fraction of flooded cells will not reach to the same flooding scale as the urban networks with a large k.
We further tested the effects of the three macroscopic parameters (i.e., β , α , and µ ) on the predicted flood spread (see Fig. 6B-D). On the one hand, similar to the impact of the average degree, the increase in the propagation rate β and the flood incubation rate α would decrease the time to reach the peak of flooding and the number of flooded cells at the peak, but increase the time to recover to the normal situation. On the other hand, an increase in the recovery rate µ significantly decreases the number of flooded cells at the peak, but the time to reach the peak and the time to recover to a normal situation are not changed significantly. The results of these experiments show that by adjusting parameters, the model can be adapted to different scenarios with various intensities of flood events and topological structures of urban networks.

Discussion and concluding remarks
We have presented a predictive model which integrates flood spread characterization and network percolation processes to forecast the propagation and recession of flood in urban road networks. The model is formulated as a system of ordinary differential equations relying on three characteristics: flood propagation rate β , flood incubation rate α , and recovery rate µ , analogous to the SEIR model. Using the output of the model, the network percolation process is obtained to model the spatial patterns of the flooded road cells over time. The study showed the application of the proposed model in an empirical case study of urban flooding in Harris County road networks during Hurricane Harvey in 2017. The application of the model using empirical data informs about some key findings and implications for urban flood risk prediction. First, the extent to which flooding builds up in a network and how fast it recovers are shown to be dependent on the basic reproduction number R 0 = β/µ , which can help us infer the effective time period in response to the flooding in road networks. In the case study, we found that the basic reproduction number changed significantly when the time interval increased from 4 to 8 h. This result indicates that flood situation evolves slightly in time intervals shorter than 4 h, but, the situation changes dramatically for time intervals that are longer than 4 h. Hence, to better monitor the flood in road networks, the status of roads should be observed in every 4 h. While urban networks in different cities may have different topological properties and experience different rainfall magnitudes, the proposed model can be adapted by adjusting the parameters. Results can inform decision-makers about the spatial propagation and temporal evolution of flooding.
Second, the spatial mechanisms of flood propagation and receding in urban networks is captured based on a network percolation process. This finding highlights the contagion effects of a flooded cell on its neighbors. Since the road cells are spatial networks, different from social contact networks, the spread of flood would be limited by geographical constraints. Hence, the local contagion from a cell to another cell through the link connecting www.nature.com/scientificreports/ them would be the main spreading pattern. Practically, this finding provides an important implication for flood control in urban systems. One effective control strategy would be to increase drainage capacity and to build retention ponds around the roads with a high likelihood of flooding. This strategy can reduce the proportion of flooded cells among the unflooded cell neighbors, which can contribute to reducing the probability of flooding in the next timestamp. Third, the model and its application show good performance in predicting the scale and locations of flooded road cells in disasters. In addition to the response strategies, this model could also support proactive strategies for coping with future flood events. In particular, the proposed model can be incorporated in an early warning system. The system can help officials and the public be aware of the flood situations in the coming hours so that they can perceive flooding risks surrounding them and make proactive preparations. For example, cars and buses drove through floodwater during Hurricane Harvey 43 . With the predicted outcomes of this contagion model, people can be informed about roads at a high risk of flooding in the next few hours. This predictive information could significantly contribute to reducing the economic loss of transportation agencies, and possible loss of life of residents. Finally, the model we introduced in this paper is simple and robust and can be adapted to various phenomena. Unlike hydrodynamic models and machine learning models that rely on significant data and computational resources, the proposed mathematical model provides a simple but powerful tool for predicting the spatial-temporal evolution of flooding in urban networks. Also, the proposed network percolation-based contagion model can be adapted for modeling other network spread phenomena. Future studies can further investigate the proposed model in other general predictive tasks, such as the spread of traffic congestions in road networks 44 , infectious diseases in human contact networks 45 , and innovations in global communities 46 . This model also has some limitations; for instance, for initial flooded segments without flooded neighbors, it is usually difficult to predict these segments without other information. Future studies can focus on improving the model to precisely predict initial flooded road segments based on rainfall magnitude and capacity of urban drainage systems. In addition, the model considers the parameters such as propagation rate being constant throughout the process, in order to provide a simple but fairly accurate tool for flood prediction. Future studies could extend our model by considering the dynamics of the beta to improve the performance of the model.

Data availability
The data that support the findings of this study are available from INRIX, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available.