Betweenness centrality for temporal multiplexes

Betweenness centrality quantifies the importance of a vertex for the information flow in a network. We propose a flexible definition of betweenness for temporal multiplexes, where geodesics are determined accounting for the topological and temporal structure and the duration of paths. We propose an algorithm to compute the new metric via a mapping to a static graph. We show the importance of considering the temporal multiplex structure and an appropriate distance metric by comparing the results with those obtained with static or single-layer metrics on a dataset of $\sim 20$k European flights.

* silvia.zaoli@unibo.it

Centrality metrics are among the most common tools used to characterize the nodes of a network and to identify the key nodes that play an important role in the transmission of information through the network. In particular, betweenness centrality [1] measures the importance of a node for the flow of information between pairs of nodes, assuming that information travels along geodesics, i.e. the shortest paths available between a pair of nodes. This metric has found applications especially in transportation, communication and infrastructure networks, e.g. in evaluating traffic loads [2] or in finding vulnerable nodes [3], thus offering a tool for the planning of these networks.

The original betweenness centrality deals with static, single-layer networks. Recently, however, there has been growing interest in temporal networks [4] and multi-layer networks [5][6][7], driven by the fact that many real networks have these characteristics. Both the temporal and the multi-layer structure have important effects on how the network functions. For example, multiplexity eases cooperation in social interactions [8] and determines a non-trivial optimal condition for mobility in transportation and communication networks [9]. Additionally, the temporal structure of the network critically influences dynamical processes [10], given that these proceed along time-respecting paths. Despite the abundant literature on each of these two aspects of network structure, less attention has been devoted to their interplay (though see [11]) or to methods that apply to temporal multiplexes [12].

Here we argue that both the multiplex and the temporal structure must be accounted for when computing centrality metrics, as they influence the flow of information [13]. In fact, information (e.g. traffic, epidemics, rumours) can only flow along time-ordered paths, and the determination of the shortest paths might depend not only on the topological length but also on time-related properties (e.g. path duration). Furthermore, the identification of the shortest paths also depends on the multiplex structure, because the flow across layers might be hindered, and this can be conveyed by assigning a larger length to inter-layer paths. For these reasons, here we introduce a betweenness centrality that accounts for both the temporal and the multiplex structure of the network in identifying the shortest paths, and we develop a method to compute it.

As a concrete example, we will consider transportation networks, which can be represented as temporal multiplexes in which the layers represent different transportation modes or providers (e.g. different airlines). Transportation networks have particular temporal characteristics: first, a non-zero time is required to travel through a link (e.g. the duration of a flight) and, second, a minimum connecting time at nodes might be needed. These characteristics require special care but have rarely been considered in previous works on temporal networks (though see [13][14][15]).

On a static single-layer network, the betweenness b of a node i is defined as

$$ b_i = \sum_{j \neq k \neq i} \frac{\sigma^{i}_{jk}}{\sigma_{jk}}, \qquad (1) $$

where σ_jk is the number of shortest paths from j to k and σ^i_jk is the number of such shortest paths that pass through i. This standard notion of betweenness centrality (as well as other topological metrics) could be applied to temporal multiplexes by circumventing their structure in two ways: (i) aggregating the network across time and layers, and computing the metric on the resulting static single-layer network, or (ii) computing the metric on each single layer at each single time-step and then aggregating the results [16]. These methods, however, discard structural information relevant for the determination of geodesics, and may result in a wrong estimation of the nodes' importance. For example, if we compute the betweenness in the single layers, all the inter-layer paths are neglected, thus underestimating the importance of nodes acting as a bridge between layers. If instead betweenness is computed on the network obtained by aggregating layers, an intra-layer path and an inter-layer path using n links are considered of the same length although, depending on the application, the latter should be considered longer, as mentioned before. This procedure thus potentially overestimates the importance of bridge nodes. Time-wise, considering single time-steps does not provide meaningful information if the time scale of the information flow is the same as or larger than the time scale of the network evolution (which is the case e.g. for passenger flow in transportation networks and might be for epidemics on contact networks). If we instead aggregate over time, we are not able to distinguish the time-ordered paths, the only ones on which information can travel in the original temporal network. These observations call for a truly multiplex and temporal formulation of betweenness centrality. Here, we propose such a formulation and a method to compute it.

Consider a temporal multiplex G = (V, E, I) where V is a set of N nodes, identical on each of the M layers, E is a set of intra-layer links, and I is a set of inter-layer links connecting a node to its copies on the other layers. Here we consider the case in which all copies of each node are connected at all times, but our centrality can be easily adapted to different choices. In the air traffic application, for example, nodes represent airports and intra-layer links represent flights, while layers represent different airlines. Each link e ∈ E is characterized by a time of appearance and a time of disappearance, their difference representing the time it takes to travel through that link. Non-zero link durations make it possible to describe networks where travel is not instantaneous [13], such as transportation networks. A valid path in this temporal multiplex is a sequence {e_1, ..., e_n} of edges in E such that if e_i is incident to node j ∈ V on layer λ and disappears ('arrives') at time t, then e_{i+1} leaves from node j on any layer and appears ('departs') at time t′ ≥ t + δt, where δt is a minimum connecting time (as also introduced in [15]).

To define a betweenness centrality for temporal multiplexes, we need a notion of distance that identifies the shortest paths. For temporal networks, several definitions have been introduced in the literature [14,15,17-19], ranging from purely topological distances considering the number of links, to purely temporal ones considering the path duration, the time of arrival or other time-related properties. Inspired by Ref. [20], we propose a definition combining duration, topological distance and changes of layer: the shortest path on the temporal multiplex G is the one minimizing

$$ L = \alpha\,(n + \varepsilon m) + (1 - \alpha)\,T, \qquad (2) $$

where n is the number of intra-layer links used, m is the number of inter-layer links, T is the duration of the path (from the departure of the first link to the arrival of the last one), 0 ≤ α ≤ 1 and ε ∈ [0, ∞). The parameter α tunes the relative weight of the topological length and the duration: when α = 1, L is simply the topological length; when α = 0, it is simply the duration. We will comment later on how to choose α. The parameter ε determines how much each inter-layer link is counted: if ε = 0 it is not counted at all, so that a path using m inter-layer links and n intra-layer ones has the same topological length as an intra-layer path using n links; if ε = 1 it is counted as much as an intra-layer link, and so on. The value of ε measures the propensity of information to jump between layers, or the associated cost, therefore its most fitting value depends on the application. For example, for the application to the air transportation network that we show below, a value ε > 0 is realistic, as flight itineraries with inter-airline connections are risky from the passengers' point of view and therefore intra-airline paths are preferred.
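As an illustration of the two definitions above, the following minimal Python sketch checks whether a sequence of links is a valid path and evaluates its length L = α(n + εm) + (1 − α)T. The tuple layout `(src, dst, layer, t_dep, t_arr)`, the function names and the example itinerary are assumptions made for this illustration only; times are taken to be integer time-steps.

```python
def is_valid_path(path, delta_t=0):
    """Check the time-ordering condition: each link must leave from the node
    where the previous one arrived, no earlier than t_arr + delta_t."""
    for (_, dst, _, _, t_arr), (src, _, _, t_dep, _) in zip(path, path[1:]):
        if src != dst or t_dep < t_arr + delta_t:
            return False
    return True


def path_length(path, alpha, eps):
    """Length L = alpha * (n + eps * m) + (1 - alpha) * T of a path given as a
    list of (src, dst, layer, t_dep, t_arr) tuples (hypothetical layout)."""
    n = len(path)                                               # intra-layer links
    m = sum(1 for a, b in zip(path, path[1:]) if a[2] != b[2])  # changes of layer
    duration = path[-1][4] - path[0][3]                         # first departure to last arrival
    return alpha * (n + eps * m) + (1 - alpha) * duration


# Hypothetical two-leg itinerary A -> B (layer X), then B -> C (layer Y).
itinerary = [("A", "B", "X", 0, 2), ("B", "C", "Y", 4, 5)]
print(is_valid_path(itinerary, delta_t=2))       # connection of 2 time-steps is enough
print(path_length(itinerary, alpha=0.5, eps=1))  # 0.5 * (2 + 1) + 0.5 * 5
```

With α = 1 the same itinerary has length n + εm = 3 (purely topological), and with α = 0 it has length T = 5 (pure duration).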
Once the shortest paths between all pairs of the N nodes in V are found according to the proposed definition of path length, betweenness can be computed according to Eq. (1). Note that the shortest path between i and j ∈ V is the shortest among all the paths joining the two nodes on any two layers and at any time. In order to find all such shortest paths, we propose the following algorithm, inspired by the procedures used in single-layer temporal networks in [18][19][20]: (i) the temporal multiplex G is converted into a static single-layer network 𝒢 = (𝒱, ℰ) (see Fig. 1), whose paths are all the feasible temporal paths on G, with path weight (sum of the links' weights) equal to the path length L; (ii) the shortest paths between all pairs of nodes of 𝒢 are found using Dijkstra's algorithm [21]; (iii) among all the shortest paths found, we select only those that are also shortest on the original temporal multiplex G.
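Once the geodesics are available, Eq. (1) reduces to counting. The sketch below assumes, purely for illustration, that the geodesics have already been collected in a dictionary mapping each ordered pair (j, k) to the list of its shortest paths, each path being a sequence of node labels of V; the function name and data layout are hypothetical.

```python
from collections import defaultdict


def betweenness(geodesics):
    """Betweenness b_i = sum over pairs (j, k) of sigma^i_jk / sigma_jk.

    geodesics: dict mapping (j, k) to a list of shortest paths from j to k,
    each path a sequence of node labels starting at j and ending at k.
    """
    b = defaultdict(float)
    for paths in geodesics.values():
        sigma = len(paths)              # number of shortest paths for this pair
        if sigma == 0:
            continue
        for path in paths:
            # Only intermediate nodes count; endpoints are excluded.
            for i in set(path[1:-1]):
                b[i] += 1.0 / sigma
    return dict(b)


# Hypothetical example: two equally short routes from A to C.
example = {
    ("A", "C"): [["A", "B", "C"], ["A", "D", "C"]],
    ("A", "B"): [["A", "B"]],
}
print(betweenness(example))
```

Each of the two routes from A to C contributes a fraction 1/2 to its intermediate node, while the direct path from A to B contributes nothing.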
Let us detail each step. To convert the temporal multiplex into a static single-layer network, we discretize time in windows of length ∆t (see SI for an analysis of the effects of time discretization [22]) and, for each vertex i ∈ V, we place in 𝒱 a set of T M vertices ν = (i, t, λ) with t = 1, ..., T and λ = 1, ..., M, where T is the number of time-steps, so that 𝒱 contains N T M vertices in total. We call these the 'copies' of node i. An intra-layer link e ∈ E that links vertex i to vertex j on layer λ and lasts from t_1 to t_2 becomes a link in ℰ joining ν = (i, t_1, λ) to µ = (j, t_2, λ) in the static network. This link is given a weight α + (1 − α)(t_2 − t_1).

Additionally, ℰ contains 'switching' links and 'waiting' links. Switching links allow paths to switch layer after using an intra-layer link. They are directed links between two copies of i on different layers, (i, t, λ) and (i, t, η). Switching links are present at each time t at which an intra-layer link ends in vertex i on layer λ; they are directed to all other layers and have weight αε.

Finally, we add waiting links, which allow a path to wait in one node between the usage of two intra-layer links. Given a vertex i and a layer λ, a waiting link joins its copy at the time-step t_1 at which a link ends at i on any layer to its copy at the earliest successive time-step t_2 at which a link starts from i. This waiting link from ν = (i, t_1, λ) to (i, t_2, λ) allows a path arriving at ν (via an intra-layer or inter-layer link) to wait until the next available link. Then, every time that a link starts from i on layer λ, a waiting link is present to the earliest successive time-step at which another link starts from i on the same layer, so that the path can still wait and use a successive link. A waiting link joining (i, .
In summary, a path on the static graph can use an intra-layer link, then either wait on the same layer for a further link or jump to another layer and wait there. Waiting links can be used one after the other (in case the path does not use the earliest next link but a further one). As noted in [15], it is possible to account for a minimum connecting time δt needed between one intra-layer link and the next by simply assigning to each original link e ∈ E an extra duration of δt. This will increase the weight of every path by (1 − α)δt, without affecting the ranking of their lengths.
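The construction described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the authors' implementation: links are hypothetical tuples `(src, dst, layer, t_dep, t_arr)` in integer time-steps; the waiting-link weight (1 − α)(t_2 − t_1) is an assumption inferred from the requirement that path weights equal L; and, instead of the exact waiting-link rule, all event times at a node (arrivals on any layer, departures on the given layer) are simply chained in order, which yields the same total waiting weight between an arrival and a later departure.

```python
from collections import defaultdict
from math import inf


def build_static_graph(links, layers, alpha, eps):
    """Return an adjacency dict {(node, t, layer): [((node', t', layer'), weight), ...]}."""
    adj = defaultdict(list)
    arrivals = defaultdict(set)     # node -> arrival time-steps on any layer
    arrivals_on = defaultdict(set)  # (node, layer) -> arrival time-steps on that layer
    departures = defaultdict(set)   # (node, layer) -> departure time-steps on that layer

    # Intra-layer links: weight alpha + (1 - alpha) * duration.
    for src, dst, lam, t1, t2 in links:
        adj[(src, t1, lam)].append(((dst, t2, lam), alpha + (1 - alpha) * (t2 - t1)))
        arrivals[dst].add(t2)
        arrivals_on[(dst, lam)].add(t2)
        departures[(src, lam)].add(t1)

    # Switching links: weight alpha * eps, omitted altogether when eps is infinite.
    if eps != inf:
        for (i, lam), times in arrivals_on.items():
            for t in times:
                for eta in layers:
                    if eta != lam:
                        adj[(i, t, lam)].append(((i, t, eta), alpha * eps))

    # Waiting links (assumed weight (1 - alpha) * wait), chained over event times
    # up to the last departure from the node on that layer.
    for (i, lam), deps in departures.items():
        last = max(deps)
        events = sorted(t for t in arrivals[i] | deps if t <= last)
        for t1, t2 in zip(events, events[1:]):
            adj[(i, t1, lam)].append(((i, t2, lam), (1 - alpha) * (t2 - t1)))
    return dict(adj)


# Hypothetical example: A -> B on layer X (steps 0 to 2), B -> C on layer Y (steps 4 to 5).
links = [("A", "B", "X", 0, 2), ("B", "C", "Y", 4, 5)]
graph = build_static_graph(links, layers=["X", "Y"], alpha=0.5, eps=1.0)
# Flight (1.5) + switch (0.5) + wait (1.0) + flight (1.0) = 4.0,
# which equals alpha * (n + eps * m) + (1 - alpha) * T with n = 2, m = 1, T = 5.
```

A minimum connecting time δt could be handled as in the text, by adding δt to `t_arr` of every link before building the graph.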
Every path on 𝒢 corresponds to a time-ordered path on G, and its weight corresponds to the length L of the original path. It is however possible that one path on G has more than one corresponding path on 𝒢, with the same weight. These 'cloned' paths are a side-effect of our path-counting method, and they coincide except for the fact that they change layer at a different time-step or, in the case α = 1, for an additional 'free' wait at the origin or destination node. In section 3 of the SI we detail when these 'cloned' paths are present and how to exclude them from the counting.

Dijkstra's algorithm is applied to each of the N T M nodes of 𝒱 to find the shortest paths from that node to each of the others, with a run-time for each node scaling as O((N T M + L(M + 2)) log(N T M)), where L here denotes the number of links in G (not the path length). We thus obtain a set of shortest paths of 𝒢 that we can map to paths of G. Not all these paths are geodesics in G: for example, if there are two links between u and v appearing at different time-steps and such that the first has a smaller duration than the second, both would be shortest paths of 𝒢 but only the first would be a geodesic in G. Therefore we select only those that are geodesics in G, and we use them to compute betweenness centrality. Note that the efficient recursive algorithm by Brandes [23] to compute betweenness given the shortest paths cannot be used in this case, because the condition that subpaths of shortest paths are shortest paths themselves is not satisfied for a temporal network.
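Steps (ii) and (iii) can be illustrated with a textbook Dijkstra search followed by a minimum over node copies. The sketch below is only an illustration: the adjacency-dict layout, the function names and the tiny example (the two-links case mentioned above, with α = 0.5) are assumptions, and only geodesic lengths are computed, not the full set of geodesic paths.

```python
import heapq
from math import inf


def dijkstra(adj, source):
    """Shortest-path weights from one copy (node, t, layer) to all reachable copies."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, inf):
            continue  # stale heap entry
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def geodesic_length(adj, i, j):
    """Minimum weight over all pairs of copies of i and j: the length of a geodesic of G."""
    best = inf
    for s in [v for v in adj if v[0] == i]:
        for node, d in dijkstra(adj, s).items():
            if node[0] == j and d < best:
                best = d
    return best


# Two links u -> v on the same layer: duration 1 (departing at step 0)
# and duration 3 (departing at step 3), with alpha = 0.5.
alpha = 0.5
adj = {
    ("u", 0, "X"): [(("v", 1, "X"), alpha + (1 - alpha) * 1)],
    ("u", 3, "X"): [(("v", 6, "X"), alpha + (1 - alpha) * 3)],
}
# Both links are shortest paths between their own copies in the static graph ...
print(dijkstra(adj, ("u", 0, "X")), dijkstra(adj, ("u", 3, "X")))
# ... but only the first one attains the geodesic length between u and v.
print(geodesic_length(adj, "u", "v"))
```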
We illustrate the critical importance of accounting for the temporal multiplex structure by comparing the ranking of nodes obtained with the proposed betweenness centrality for temporal multiplexes with the one obtained with previously available methods, on the European air traffic network [22]. We consider the scheduled departure and arrival times of ∼ 20k flights of September 1st, 2017. This network has N = 435 nodes/airports and M = 32 layers corresponding to single airlines or alliances. We build 𝒢 with ∆t = 15 min (see SI for justification), corresponding to T = 116, and a minimum connecting time δt = 30 min.

We computed betweenness centrality for four different values of α: 0, 4/5, 12/13, 1. The meaning of the value of α can be understood as follows: when α = (1 − α)K, the use of an additional link (flight) weighs as much as an additional wait of K time-steps (on top of the duration of the link). For example, if we deem that from the passenger point of view an itinerary using n + 1 flights has the same distance L as one using n flights but lasting 3h more, we would have K = 3 × 60/∆t = 12 time-steps, therefore α = 12/13. We also consider four different values of ε: 0, 0.5, 1, ∞. The last value corresponds to forbidding inter-layer paths, as they have infinite cost, and is simply obtained by not putting any inter-layer link in the network.

We compare the results obtained with the proposed betweenness metric with those obtained with the standard betweenness b_stat applied on the network aggregated across layers and time-steps. We consider two ways of aggregating: (i) in the aggregated network there is a link between i and j if there is at least one temporal link between them, in any layer; (ii) for each different temporal link between i and j in G, there is one link in the static network (which is, thus, a multi-link network). Note that the case ε = 0 is equivalent to aggregating only across layers (and not across time-steps) according to (ii). We remark that when α = 0 (i.e. only path duration counts), we only compare the case in which inter-layer paths are allowed (ε < ∞) to the case in which they are not (ε = ∞), because the value of ε has no effect.

To compare results, we consider two aspects: 1) the similarity of the rankings for the airports having non-zero betweenness according to at least one of the two compared metrics, measured by the Kendall rank correlation coefficient τ; 2) the similarity between the sets of airports having zero betweenness according to the two metrics, measured by their Jaccard index J. The coefficient τ takes values in [−1, 1], with 1 corresponding to two identical sequences and −1 to two sequences that are one the inverse of the other, while J takes values in [0, 1], with 1 corresponding to identical sets.

The rankings obtained with the proposed betweenness metric are always quite different from those obtained with the standard betweenness; in fact, τ ranges roughly from 0.6 to 0.8 (Fig. 2(a)). As expected, τ increases as α approaches 1 and decreases as ε increases, since the single-layer betweenness does not account for the weight of inter-layer walks. The value of J varies slightly around 0.8 for almost all values of the parameters except α = 1, for which J ∼ 0.95 (Fig. S2). For all those cases, in fact, there are 30 to 74 airports having positive betweenness on the aggregated network but null betweenness with the proposed metric (Fig. 2(b)), while far fewer are found when α = 1, i.e. when path duration is ignored (see Fig. S3). The paths passing through these airports that are geodesics in the aggregated networks are either not temporally ordered or do not minimize the distance L due to their duration (as shown by the fact that fewer such differences are observed when α = 1). For the same reason, some airports lose rank when the temporal multiplex structure of the network is considered. These results confirm the importance of considering the temporal and multiplex structure of the network to correctly identify the geodesics and rank nodes.

To show that the multiplex structure has a non-negligible effect on the ranking, we also compare the ranking obtained with the temporal multiplex betweenness for ε = ∞ with the one obtained by summing the temporal betweenness centrality obtained on each single layer. Note that this is different from setting ε = ∞, as in the latter case we are summing the number of shortest paths across layers, σ_ij = Σ_λ σ_ij,λ, where σ_ij,λ is the number of shortest paths between i and j on layer λ, while in the former we are summing the fractions f_λ = σ^k_ij,λ / σ_ij,λ. The two different ranked sequences have τ = 0.73 (for α = 12/13), showing that even when inter-layer walks are prohibited it is important to take the multiplex structure into consideration. The two rankings, compared graphically in Supplementary Fig. S4, differ already in the highest positions. The Jaccard index is 0.89.

Finally, we compared the rankings obtained with different values of α and ε. The ranking is quite stable when α varies within a meaningful intermediate range (4/5 to 12/13) (τ ∼ 0.93 between the rankings with α = 4/5 and 12/13 for all values of ε), and the sets of airports having zero betweenness are also very similar (J ≥ 0.95), while bringing α to the two extreme values makes a larger difference (e.g. for ε = 0, J = 0.82 and τ = 0.86 between the rankings with α = 0 and α = 4/5, and J = 0.82, τ = 0.88 between those with α = 12/13 and α = 1). We also note that the airports' ranking remains very similar when the value of ε varies between 0 and 1 (τ ≥ 0.93 for α = 4/5, 12/13 between all combinations of the values ε = 0, 0.5, 1), while it changes when inter-layer walks are not counted (τ ∼ 0.7 for α = 4/5, 12/13 between the rankings with ε = 1 and ε = ∞, see also Fig. S5). This suggests that, for a given origin-destination pair, there is rarely a choice between an inter-layer and an intra-layer walk of similar length, such that changing the cost of an inter-layer jump between 0 and 1 could make one more convenient than the other. In other words, if with ε = 0 an inter-layer walk is the shortest between i and j, there is probably no intra-layer walk between them, or it is very lengthy, therefore the inter-layer one will remain the shortest when ε increases.
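Two of the quantities used in this comparison are easy to reproduce. The sketch below, with hypothetical function names, computes α from the trade-off α = (1 − α)K, i.e. α = K/(K + 1), and the Jaccard index of two sets; returning 1 for two empty sets is a convention assumed here, not stated in the text.

```python
from fractions import Fraction


def alpha_from_tradeoff(extra_minutes, dt_minutes):
    """alpha such that one extra link weighs as much as K extra time-steps of
    duration, with K = extra_minutes / dt_minutes: alpha = K / (K + 1)."""
    k = Fraction(extra_minutes, dt_minutes)
    return k / (k + 1)


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b| of two sets (1 if both are empty, by assumption)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# One extra flight traded against 3 extra hours, with 15-minute time-steps: K = 12.
print(alpha_from_tradeoff(180, 15))   # Fraction(12, 13)
# One extra flight traded against 1 extra hour: K = 4.
print(alpha_from_tradeoff(60, 15))    # Fraction(4, 5)
```

The second call shows that the other intermediate value used in the text, α = 4/5, corresponds to K = 4 time-steps, i.e. one hour with ∆t = 15 min.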
In conclusion, we proposed a method to compute betweenness centrality on a temporal multiplex. Our work adds to the previous literature on centrality metrics, which considered only one of the two aspects at a time. We proposed a definition of distance that combines information on the topological distance, the path duration and the number of changes of layer. We proposed a method to find the shortest paths according to this definition by converting the temporal multiplex to an appropriate static single-layer network. The paths found by this method are time-ordered, and account for the potentially non-zero time required to travel through one link and for a minimum connecting time between links. By comparing this new metric to previous ones (static betweenness and temporal single-layer betweenness) on the network of European air transport, we showed that accounting for the temporal multiplex structure of the network has an important effect on the ranking.

This project has received funding from the SESAR Joint Undertaking under the European Union's Horizon 2020 research and innovation programme under grant [...]

[...] and 30 minutes. The airports on the x-axis are ordered according to their centrality with ∆t = 5 min; the y-axis is in log-scale to enhance the differences. We observe that the differences in betweenness centrality are not very large between the different time windows, although they become larger for the less central airports. In particular, note in the left end of the plot a small number of points for which centrality is zero for smaller values of ∆t (points not appearing in the log-scale plot) but not for larger ones. The obtained rankings are very similar for the most central airports, and differ slightly for the less central ones. The Kendall correlation coefficients of the obtained rankings are 0.97 for the rankings obtained with 5 and 10 minutes, 0.96 for the rankings obtained with 5 and 15 minutes, and 0.93 for the rankings obtained with 5 and 30 minutes. For the results shown in the main text we used a time window of 15 minutes together with a minimum connecting time of 30 minutes, meaning that in the worst-case scenario we consider an itinerary with only 15 minutes of real connecting time. As mentioned above, this choice produces a ranking that is very similar to the one obtained with the finer discretization in 5-minute windows.

3 Excluding 'cloned' paths from the counting

Every path on the static single-layer network 𝒢 described in the main text corresponds to a time-ordered path on the temporal multiplex G, and its weight corresponds to the length L of the original path. It is however possible that one path on G has more than one corresponding path on 𝒢, with the same weight. This happens in two cases:

(i) For inter-layer paths, when, between the time-step at which the path arrives in a node v on layer λ and the time-step at which it leaves v from layer µ, more than one inter-layer link is available to jump between layers. In this case, alternative paths that coincide in everything but the time-step at which they change layer are possible on 𝒢. Only one of these alternative paths should be counted, as they all correspond to the same path on G. This is obtained by finding only one shortest path for each pair of nodes in 𝒱, instead of all the possible ones (when running Dijkstra's algorithm). Note that in this way we can still find several shortest paths between each pair of nodes of V. Actual shortest paths are neglected with this procedure only if there are two paths of the same length between (v, t_1, λ) and (u, t_2, µ) that actually correspond to two different paths in G.
However, this seems very improbable in transportation networks for a sufficiently fine time discretization, as it would mean that two itineraries leave at the same time-step on the same layer and arrive at the same time-step on the same other layer.

(ii) When α = 1, i.e. only the topological length of the path is considered. In this case, given a shortest path from i to j, a second path obtained by waiting an additional time in i before the beginning or in j at the end has the same length. Therefore, for each shortest path between i and j in G, several 'cloned' ones are found in 𝒢 that differ by the waiting times in i and j. This problem can be fixed by eliminating, at the beginning and at the end of each shortest path found, the 'excess' copies of node i and node j, and then removing repeated paths from the list of shortest paths.
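The fix for case (ii) can be sketched as follows. The representation of a path as a list of copies `(node, t, layer)` and the function names are assumptions made for this illustration.

```python
def strip_excess_copies(path):
    """Drop the 'excess' copies of the origin at the start and of the destination
    at the end of a path given as a list of (node, t, layer) copies."""
    start = 0
    while start + 1 < len(path) and path[start + 1][0] == path[0][0]:
        start += 1          # skip waits at the origin, keep its last copy
    end = len(path) - 1
    while end - 1 >= start and path[end - 1][0] == path[-1][0]:
        end -= 1            # skip waits at the destination, keep its first copy
    return tuple(path[start:end + 1])


def remove_cloned_paths(paths):
    """Strip every path, then remove repetitions while preserving order."""
    return list(dict.fromkeys(strip_excess_copies(p) for p in paths))


# Hypothetical clones for alpha = 1: the same link i -> j, once with a wait
# at the origin and once with a wait at the destination.
clones = [
    [("i", 0, "X"), ("i", 2, "X"), ("j", 3, "X")],
    [("i", 2, "X"), ("j", 3, "X"), ("j", 5, "X")],
]
print(remove_cloned_paths(clones))   # a single path (i, 2, X) -> (j, 3, X) remains
```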
Note that (i) applies to all values of the parameters, while (ii) applies only to the case α = 1. Another case to treat with care is the one in which changes of layer are free, i.e. ε = 0. In this case, given a shortest path from i to j, a second path that coincides with the first except for some additional changes of layer would have the same weight. For example, the two paths (i, t, λ) → (j, t′, λ) and (i, t, λ) → (j, t′, λ) → (j, t′, η) are counted as two shortest paths of equal length. The solution to avoid these cloned paths is simply to build 𝒢 without copies of each node for each layer, since when ε = 0 the multi-layer structure has no effect on the path length.
Finally, some previous works dealing with shortest paths in temporal networks [?, ?] add dummy nodes to 𝒢, e.g. one outgoing dummy node i_out and one incoming dummy node i_in for each i ∈ V, such that i_out has an outgoing link to all copies of i and i_in has an incoming link from all copies of i. The weight of all links from and to dummy nodes is zero. The advantage of having dummy nodes is that one only needs to find the shortest paths between the N × N pairs of dummy nodes instead of the N T M × N T M pairs. However, with this choice it is no longer possible to find all shortest paths between a pair i, j without also counting the cloned paths mentioned above. In fact, if we only find one shortest path for each pair of dummy nodes, we neglect potential other paths of the same length that are genuinely different paths in G. On the other hand, if we find all shortest paths between a pair (using a modified version of Dijkstra's algorithm), these will include the cloned paths of (i).

The index J is computed as the quotient between the number of elements in the intersection and the number of elements in the union of the two sets.

FIG. 1. Example of conversion of a temporal multiplex G (above) into the corresponding static single-layer network 𝒢 (below). G has M = 2 layers and N = 3 nodes, with links a, b and c having the temporal structure (T = 5) indicated on the side (t_i is the time-step during which the link appears and t_f the one during which it disappears). Dashed lines represent inter-layer links. In 𝒢 each of the three nodes has 10 copies, one for each layer and each of the 5 time-steps of the temporal discretization. Diagonal arrows represent the intra-layer links of G, vertical arrows are the switching links and horizontal ones are the waiting links.

FIG. 2. (a) Correlation between the ranking obtained with the proposed betweenness centrality and with static betweenness centrality computed on the aggregated network obtained with method (i) (see text; results for method (ii) in Fig. S6 are similar); (b) Comparison between the ranking according to the static betweenness on the aggregated network (method (i)) and the betweenness proposed here, computed with ε = 1 and α = 12/13. Each dot represents an airport; red dots are airports having b_stat > 0 but b = 0. The blue line is 1:1.

Figure S1: Comparison between the betweenness centrality obtained with different values of ∆t (5, 10, 15 and 30 minutes). The airports on the x-axis are ordered according to their centrality with ∆t = 5 min; the y-axis is in log-scale to enhance the differences. Results were obtained with α = 12/13, ε = 0.

Figure S2: Jaccard index between the sets of airports with zero betweenness according to the proposed betweenness centrality and to static betweenness centrality computed on the aggregated network obtained with method (i) (see main text), for different values of the parameters α and ε. The index J is computed as the quotient between the number of elements in the intersection and the number of elements in the union of the two sets.

Figure S3: Comparison between the ranking according to the static betweenness on the aggregated network (aggregated with method (i), see main text) and the betweenness proposed here, computed with ε = 1 and α = 1. Compare with Figure 2(b) of the main text. Each dot represents an airport; red dots are airports having b_stat > 0 but b = 0. The red line is the 1:1 line.

Figure S4: Comparison between the ranking according to the sum of temporal betweenness computed on each single layer (b(i) = Σ_{λ=1}^{32} b_λ(i), with b_λ(i) the temporal betweenness of node i on layer λ) and the betweenness proposed here, computed with α = 12/13 and ε = ∞. Each dot represents an airport. The red line is the 1:1 line.

Figure S6: (a) Correlation between the ranking obtained with the proposed betweenness centrality and with static betweenness centrality computed on the aggregated network obtained with method (ii) (see main text); (b) Jaccard index between the sets of airports with zero betweenness according to the proposed betweenness centrality and to static betweenness centrality computed on the aggregated network obtained with method (ii) (see main text), for different values of the parameters α and ε. The index J is computed as the quotient between the number of elements in the intersection and the number of elements in the union of the two sets.