Revealing the day-to-day regularity of urban congestion patterns with 3D speed maps

In this paper, we investigate the day-to-day regularity of urban congestion patterns. We first partition link speed data every 10 min into 3D clusters that provide a parsimonious sketch of the congestion pulse. We then gather days with similar patterns and use consensus clustering methods to produce a unique global pattern that fits multiple days, uncovering the day-to-day regularity. We show that the network of Amsterdam over 35 days can be synthesized into only 4 consensual 3D speed maps with 9 clusters. This paves the way for a cutting-edge systematic method for travel time predictions in cities. By matching the current observation to historical consensual 3D speed maps, we design an efficient real-time method that successfully predicts 84% of trip travel times with an error margin below 25%. The new concept of consensual 3D speed maps allows us to extract the essence of large amounts of link speed observations and, as a result, reveals a global and previously mostly hidden picture of traffic dynamics at the whole city scale, which may be more regular and predictable than expected.

appears to be sufficient when studying fully connected graphs (as in image segmentation, where each pixel is connected to all its neighbors), it is no longer the case when studying transportation networks. When the similarity is set to 0 for two network links that are not adjacent in the transportation network, we observe that the clusters are identified more as communities, i.e. groups of links with high intra-cluster edge density and low inter-cluster edge density, than as homogeneous regions. To tackle this problem, Saeedmanesh and Geroliminis (2016) propose to use a complete similarity matrix based on the snake similarity for every pair of links. As a consequence, the Ncut algorithm based on snake similarity (S-Ncut) does not guarantee full intra-cluster connectivity of nodes. S-Ncut therefore requires a post-processing algorithm, just like k-means and DBSCAN. The description of the algorithm is provided in section S3.

S2 Rationales behind the choice of the a value - weighting of speed vs. space and time values
To perform network clustering based on Euclidean distances (DBSCAN and k-means methods), after normalizing each factor (speed, space-x, space-y and time), we weight speed three times more heavily (a=3) than each of the other three variables. The weighting factor is the same in all situations. We describe here the rationale for this choice in detail and provide comparisons with other a values. For clarifications about post-treatment operations, readers are referred to section S3.
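As an illustration of the weighting described above, the following minimal sketch builds the feature vectors on which a Euclidean clustering method could operate. It rests on assumptions that are not stated in the text: the normalization is taken to be a z-score, the weight a is applied as a multiplicative factor on the normalized speed, and the function names (`zscore`, `weighted_features`, `euclidean`) are hypothetical.

```python
import math

def zscore(values):
    """Standardize a list of numbers to zero mean and unit variance.
    Assumption: the text only says 'normalizing', z-score is one option."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var) or 1.0  # constant columns: avoid division by zero
    return [(v - mean) / std for v in values]

def weighted_features(speed, x, y, t, a=3.0):
    """Return one [speed, x, y, t] vector per observation.
    Assumption: the normalized speed is simply multiplied by a."""
    cols = [zscore(speed), zscore(x), zscore(y), zscore(t)]
    cols[0] = [a * v for v in cols[0]]
    return [list(row) for row in zip(*cols)]

def euclidean(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# Three observations: speed, space-x, space-y, time
features = weighted_features([10, 20, 30], [0, 1, 2], [0, 0, 0], [0, 10, 20], a=3)
```

With this construction, two observations that differ only in speed are three times farther apart than two observations that differ by the same normalized amount in one space or time coordinate.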
First, it should be noted that a only influences the results of the DBSCAN and k-means methods, as Ncut does not use Euclidean distances but the snake similarity matrix. The results are sensitive to a for the following reasons. When a is low (close to 1), the clustering favors closeness between links, i.e. the spatial distance. The time distance has little influence here, as the points are evenly distributed in time for all links. So, the resulting clusters are not very homogeneous in speed: the intra-cluster G values are high and the inter-cluster G values are low, see figures S2(a) and S2(c). When a is very high, the speed values dominate and the clustering results are very good in terms of intra-cluster homogeneity (low G values) and inter-cluster dissimilarity (high G values). However, clusters can gather links from almost anywhere in the network, as only speed values matter. If we assume that links that are close in distance are more likely to be connected than distant ones, the immediate corollary is that post-treatment is much easier and quicker in terms of computational time when a is low than when it is high. Furthermore, the discrepancies between the clustering results before and after the post-treatment quickly increase with a. This is confirmed by comparing figures S2(a) and S2(b). The values in figures S2(b) and S2(d) stabilize after post-treatment when a is higher than 6, as the post-treatment cancels any gain coming from increasing a during the initial clustering operations. The same figures show that a values lower than or equal to 2 fail in the end (after post-processing) to provide good clustering results, as the intra-cluster homogeneity is poor (high G values) and so is the inter-cluster dissimilarity (low G values). So, the results of this sensitivity analysis show that a should be between 3 and 6 for best results. We can further elaborate on the choice of a=3 in the paper.
First, Figure S2(g) shows a quick increase of the computation times when a increases, because the post-treatment algorithm takes more time to reorganize the clusters to ensure full intra-cluster connectivity. This is again because, as a increases, the initial clustering focuses more and more on speed values only and gathers links that are quite far apart (and that are thus more likely not to be connected). For example, the mean computational time for one day including the post-treatment is equal to 320 s, 604 s, 1080 s and 1800 s when a increases from 3 to 6. As we had to repeat the clustering operations a large number of times in the paper to test different factors such as the clustering method, the number of clusters and the day, a value of a=3 allowed us to speed up the calculations significantly. Second, for the same reason, when a increases the final results depend more on the post-treatment algorithm than on the initial clustering. Since clustering algorithms have been extensively studied in the literature, whereas the post-treatment algorithm was designed specifically for this work, choosing a=3 allowed us to put more weight on the results of the initial clustering while obtaining final intra-cluster and inter-cluster G values (after post-treatment) that are close to what is obtained with higher a, see Figures S2(b) and S2(d). Third, it is important to recall that the results of the k-means method with a=3 clearly outperform those of S-Ncut for all 35 days, see Figure 2b in the paper. So, a=3 appears sufficient to obtain good clustering results while keeping the computational times reasonable. In follow-up work, it would be interesting to see whether the travel time estimates based on the consensual clustering shapes could be further improved by taking a between 4 and 6.
We nonetheless would like to stress here that the results with a=3 are already very good and certainly impressive when considering the simplicity of the method and the spatial coverage that we can achieve (basically the whole city area).

S3 Post-treatment of clustering results to ensure connectivity in each cluster
The three-step post-treatment algorithm (Lopez et al., 2017) is described as follows:
Step 1 - Initialization. The connected components (CCs) are identified within each cluster. A CC is a sub-graph in which links are connected to each other. For each cluster, the corresponding CCs are sorted in decreasing order of their number of links.
Step 2 - Allocation. The biggest CC in each cluster is assigned as the initial cluster. The remaining CCs from every cluster are considered as candidates for merging.
Step 3 - Merging. Each candidate CC is iteratively assigned to an existing cluster so as to minimize the increase of the link speed variance within this cluster. The constraint is the connectivity between a given candidate CC and the candidate cluster. Note that the size of the initial clusters grows as CCs are merged iteratively. The merging process to reach the desired number of clusters was initially proposed by Ji and Geroliminis (2012).
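The three steps above can be sketched as follows. This is only an illustrative reading of the description, not the implementation of Lopez et al. (2017): the data structures (`labels`, `adjacency`, `speed` dictionaries), the function names, the greedy choice of the globally cheapest (candidate CC, cluster) pair at each iteration and the use of the population variance are all assumptions.

```python
from statistics import pvariance

def connected_components(nodes, adjacency):
    """Connected components of the sub-graph induced by `nodes`."""
    nodes, seen, comps = set(nodes), set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:  # depth-first traversal restricted to `nodes`
            n = stack.pop()
            comp.add(n)
            for m in adjacency[n]:
                if m in nodes and m not in seen:
                    seen.add(m)
                    stack.append(m)
        comps.append(comp)
    return comps

def post_treatment(labels, adjacency, speed):
    """Return new labels in which every cluster is a single connected component."""
    clusters, candidates = {}, []
    for cid in sorted(set(labels.values())):
        members = [n for n, c in labels.items() if c == cid]
        # Step 1: CCs of the cluster, sorted by decreasing size
        comps = sorted(connected_components(members, adjacency), key=len, reverse=True)
        # Step 2: the biggest CC seeds the cluster, the others become candidates
        clusters[cid] = set(comps[0])
        candidates.extend(comps[1:])

    def variance(nodes):
        return pvariance([speed[n] for n in nodes])

    # Step 3: merge the candidate/cluster pair with the smallest variance increase
    while candidates:
        best = None
        for k, cc in enumerate(candidates):
            border = {m for n in cc for m in adjacency[n]}
            for cid, members in clusters.items():
                if border & members:  # connectivity constraint
                    cost = variance(members | cc) - variance(members)
                    if best is None or cost < best[0]:
                        best = (cost, k, cid)
        if best is None:  # leftover CCs touch no cluster; they are left unassigned
            break
        _, k, cid = best
        clusters[cid] |= candidates.pop(k)
    return {n: cid for cid, members in clusters.items() for n in members}

# Toy example: six links on a line, two clusters with one stray link each
adjacency = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
labels = {0: 0, 1: 0, 2: 1, 3: 0, 4: 1, 5: 1}
speed = {0: 10, 1: 10, 2: 11, 3: 29, 4: 30, 5: 30}
new_labels = post_treatment(labels, adjacency, speed)
```

In this toy case the stray links 2 and 3 are reassigned to the neighboring cluster with the closest speeds, so both final clusters are connected.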
Figures S3(a) and S3(b) show respectively the effect of post-treatment on the intra-cluster and inter-cluster G values for a particular day and different numbers of clusters. This effect is assessed based on the variation of each indicator between before and after the post-treatment. It appears that the post-treatment has little effect on the results when S-Ncut is applied. For k-means and DBSCAN, the post-treatment can either deteriorate or improve both indicators. Note that a deterioration means a positive value of G* − G for the intra-cluster indicator and a negative value of G* − G for the inter-cluster indicator, where the indexes with a star (*) correspond to the values after post-treatment. A deterioration is mostly observed for the k-means results, whatever the number of clusters. The DBSCAN results often appear to be improved by the post-treatment, in particular when the number of clusters is large (higher than 10). However, it is important to consider not only the variations of the indicators but also their final values. Furthermore, no definitive conclusion can be drawn from a single day. Figures S3(c) and S3(d) give a more comprehensive view of the distribution of the variations of both indicators between before and after post-treatment for all 35 days, for a number of clusters equal to 9 (the optimal number). This confirms that S-Ncut results are not significantly modified by the post-treatment, as the variations of the intra-cluster G values are close to 0% for most days. The post-treatment increases these G values on average by 0.9% for DBSCAN and by 4% for k-means, see Figure S3(c). This is not surprising, as adding requirement (i) about full connectivity between links within each cluster requires reshaping the clusters and thus deteriorates the optimal allocation that was made by the clustering algorithm without considering this constraint. S-Ncut results lead to very small discrepancies compared to the other two methods because S-Ncut resorts to a similarity matrix that already partly accounts for connectivity between links, while DBSCAN and k-means only account for proximity through the distance between links.
Similar trends are observed for the inter-cluster G values, but to a lesser extent. Again, the variations with S-Ncut are mostly close to 0 for all days, see figure S3(d). The variations for DBSCAN are close to 0 on average but their amplitude is large. The variations for k-means are negative and equal on average to -0.6 m/s, see figure S3(d). Thus, the main conclusion here is that the post-treatment has no impact for S-Ncut and mainly deteriorates G values for k-means and DBSCAN. The variations are more pronounced for k-means but remain below 5% for most days (this is one of the reasons for our choice of a=3 when weighting the speed values, see supplementary S2). However, DBSCAN and k-means results on the one hand and S-Ncut results on the other hand cannot be directly compared before post-treatment, as S-Ncut already favors connectivity between links through the definition of the similarity matrix. Furthermore, the DBSCAN method seems to deteriorate both G indicators less, but the results from the initial clustering are much better with k-means. So, because requirement (i) should be fulfilled in the end by all the methods, what matters is the comparison of the G values after post-treatment. There, k-means outperforms the other two methods, see Figure 2b in the paper.

S4 Defining a common network for all days
After mapping the shortest path to the Amsterdam Geographic Information System (GIS) network, there are still 7512 links in the link network, which is quite large. Hence, we employ network coarsening techniques to remove nodes that satisfy certain criteria (Lopez et al., 2017). In this work, the nodes are removed based on the link weights: if two links have the same weight, the node connecting them is removed. The weight of each link is, in our case, its estimated speed. As we only require the speed for one time slice to assign the weights for coarsening, a peak-period time slice (4pm) was chosen. This is because the network exhibits the most variance during the peak period, when the largest differences in link speed occur. If we chose a non-peak period, these variances would be smoothed out. As the network state (speed of each link in the network) at 4pm is different for each day, we obtain 35 different coarsened networks for the 35 days. The differences are especially pronounced on weekends, when some links of the network have too few observations compared to the other days. Thus, we need to extract a common network from these 35 coarsened networks. The common network is the intersection of the links in the 35 coarsened networks, so that the shared dynamics of the data that are present in every day are captured. Meta-clustering methods require the same dataset for all days, which is why a common network is necessary.
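The extraction of the common network can be illustrated with a short sketch. It only covers the intersection step, not the coarsening itself, and it assumes that each daily coarsened network is available as a collection of link identifiers; the function name `common_network` and the link IDs are hypothetical.

```python
from functools import reduce

def common_network(daily_links):
    """Return the links that are present in every daily coarsened network."""
    link_sets = [set(links) for links in daily_links]
    if not link_sets:
        return set()
    return reduce(set.intersection, link_sets)

# Hypothetical link IDs for three days (the paper uses 35 days)
days = [{"l1", "l2", "l3"}, {"l2", "l3", "l4"}, {"l3", "l2"}]
shared = common_network(days)
```

Only the links kept by the coarsening on every day remain, so all days can then be described on the same set of links.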

S5 Quality of the classification of all days
This supplementary section (S5) presents more details about the quality of the classification of days into groups. As explained in the paper, we use the NMI index to assess the similarity between two original clustering shapes related to two different days i and j. Recall that the results of the initial clustering of a day can be put into a single ordered vector of all observations, whose values are the cluster IDs. Figure S5(a) presents the distribution of NMI values for all couples of days among the total population, i.e. the 35 days. The distribution looks Gaussian, with a mean and a standard deviation respectively equal to 0.59 and 0.056. The extreme values are 0.43 and 0.78. Figure S5(b) shows, for each training set (12 replications), the quality of the clustering into two or four groups based on Ncut. The quality is calculated as follows:
Q = \frac{1}{S} \sum_{s=1}^{S} \frac{2}{n_s (n_s - 1)} \sum_{i<j;\, i,j \in s} \mathrm{NMI}(i, j)
where S is the number of groups and n_s is the number of days in group s. This indicator basically represents the global average of the mean NMI for each group. The higher it is, the more homogeneous all groups are. Figure S5(b) demonstrates that considering 4 groups significantly improves the quality of the day clustering compared to only two groups for 11 training sets out of 12 (the exception is replication No. 2). This is why we use 4 groups in the paper. Note that the mean NMI value for all days and each training set is also presented in figure S5(b) (1 group case) to show that the quality of the partitioning really improves when using 4 groups.
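The quality indicator described above (the average over groups of the mean NMI between days of the same group) can be sketched as follows. Several points are assumptions: NMI has more than one normalization and the arithmetic-mean variant 2I/(H(U)+H(V)) is used here, each group is assumed to contain at least two days, and the function names and the toy label vectors are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def entropy(labels):
    """Shannon entropy of a label vector."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())

def nmi(u, v):
    """Normalized mutual information between two label vectors of equal length.
    Assumption: normalization by the arithmetic mean of the two entropies."""
    n = len(u)
    cu, cv, cuv = Counter(u), Counter(v), Counter(zip(u, v))
    mi = sum((c / n) * math.log(c * n / (cu[a] * cv[b])) for (a, b), c in cuv.items())
    hu, hv = entropy(u), entropy(v)
    return 1.0 if hu + hv == 0 else 2 * mi / (hu + hv)

def group_quality(groups, labels_by_day):
    """Average over groups of the mean pairwise NMI within each group.
    Assumption: every group holds at least two days."""
    means = []
    for days in groups:
        pairs = list(combinations(days, 2))
        total = sum(nmi(labels_by_day[i], labels_by_day[j]) for i, j in pairs)
        means.append(total / len(pairs))
    return sum(means) / len(means)

# Toy cluster-ID vectors for four hypothetical days
labels_by_day = {
    "d1": [0, 0, 1, 1],
    "d2": [1, 1, 0, 0],  # same partition as d1 with swapped IDs
    "d3": [0, 1, 0, 1],
    "d4": [0, 1, 0, 1],
}
good = group_quality([["d1", "d2"], ["d3", "d4"]], labels_by_day)
bad = group_quality([["d1", "d3"], ["d2", "d4"]], labels_by_day)
```

Grouping days with identical partitions gives the highest possible quality, while grouping days with independent partitions gives the lowest, which is the behavior expected from the indicator.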

Figure S5 - (a) Similarity distribution between all days and (b) quality measurement for all training sets
Finally, table S5 allows a better assessment of the quality of the partitioning into 4 groups for all training sets. It presents, for all replications, the mean and the amplitude, i.e. the difference between the max and the min, of the NMI values related to all day couples within each group (G1 to G4). These values have to be compared with the same indexes computed for all the days without partitioning (All). First, it appears that the amplitudes within groups are significantly lower than the amplitude of the total population of days for all replications. The mean amplitude for all groups and all replications is 0.19, which is nearly half of the mean amplitude for the total population. Second, most of the mean NMI values are higher than 0.59 (90% in total). Recall that 0.59 is the median of the original distribution of the NMI values as shown in figure S5(a), meaning that 50% of the NMI values over the global population of days are below 0.59. Combined with low amplitudes within groups, this confirms that the partitioning into groups leads to close NMI values for days belonging to the same group, meaning similar clustering shapes. So, the analysis of table S5 allows us to conclude that the 4 groups obtained after the classification of the 28 days are homogeneous, as expected, for all training sets.
Table S5 - Mean and amplitude of the NMI values for all days (All) and for each group (G1 to G4), per replication

S6 Travel time estimation on the training and validation set: absolute errors
This supplementary section (S6) presents the travel time prediction errors in absolute values, see Figure S6. These results confirm the conclusions given in the paper, namely that (i) most of the errors come from averaging speed inside a cluster and (ii) the consensus congestion map for a group (consensus cluster shape and mean speed of all days within each cluster) is sufficient to provide accurate travel time predictions (comparison of M3 vs. M1).
In figure S6(b), the mean and the median values are -0.7 min and -0.3 min, the 25th and 75th percentiles are -2.4 min and 1.2 min, and the 10th and 90th percentiles are -4.3 min and 2.5 min. Again, the conclusions of the paper are confirmed here: despite its simplicity and its very low computational cost, the travel time prediction method, which consists in assigning new observations to a historical consensual congestion map, leads to accurate travel time predictions for most trips.