Temporal-topological properties of higher-order evolving networks

Human social interactions are typically recorded as time-specific dyadic interactions and represented as evolving (temporal) networks, where links are activated and deactivated over time. However, individuals can interact in groups of more than two people. Such group interactions can be represented as higher-order events of an evolving network. Here, we propose methods to characterize the temporal-topological properties of higher-order events in order to compare networks and identify their (dis)similarities. We analyzed 8 real-world physical contact networks and found the following: (a) events of different orders that are close in time tend to be close in topology as well; (b) nodes participating in many different groups (events) of a given order tend to be involved in many different groups (events) of another order, so that individuals tend to be consistently active or inactive in events across orders; (c) local events that are close in topology are correlated in time, which supports observation (a). In contrast, in 5 collaboration networks, observation (a) is almost absent; consistently, no evident temporal correlation of local events is observed in collaboration networks. Such differences between the two classes of networks may be explained by the fact that physical contacts are proximity based, in contrast to collaborations. Our methods may facilitate the investigation of how properties of higher-order events affect dynamic processes unfolding on them and possibly inspire the development of more refined models of higher-order time-varying networks.


Introduction
Interactions among individuals are usually measured experimentally as time-resolved records of face-to-face contacts between pairs of people in controlled social settings such as workplaces, hospitals, schools and conferences. These time-specific records are thus collected in the form of dyadic interactions and have been effectively studied in the framework of evolving (temporal) networks, where each link between two nodes is activated only when the node pair interacts [1-3]. The temporal patterns of link activations (or contacts) in real-world networks are neither fully random nor deterministic [4]. Contacts between a pair of nodes usually occur in bursts of many contacts close in time followed by a long period of inactivity [5], and the time between two consecutive interactions usually follows a fat-tailed distribution [6-8]. Such temporal properties of contacts influence the dynamic processes unfolding on the network [9-17].
Despite these tremendous advances in the last decade, studies on temporal networks have traditionally focused on pairwise interactions only. However, pairwise interactions can only partially capture the interactions among the constituents of a system [18,19]. For example, a neuron may receive the output from or send a signal to many different neighbouring neurons [20], individuals may gather in groups [21], and scientific collaborations are not limited to pairs of authors [22]. Such interactions are called higher-order, to emphasize that they involve more than just a pair of nodes. Benson et al. [23] showed that a generalization of triadic closure appears to drive the first activation of a given hyperlink. Cencetti et al. [24], on the other hand, focused on temporal inhomogeneities in the activations of the same hyperlink. The focus so far has thus been on the prediction of hyperlink activations [23] or on purely temporal properties of higher-order events [24]. However, the interplay between temporal and topological properties of higher-order events, e.g. whether higher-order events close in time also tend to occur close in topology, remains far from well understood.
Hence, this work aims to systematically characterize the relation between temporal and topological properties of higher-order events in order to compare higher-order temporal networks. Inspired by our recent work that characterizes temporal and topological properties of dyadic interactions in temporal networks [25], we redesign the characterization method for higher-order events. In particular, we explore such properties from three perspectives: 1) the interrelation between the distance in topology and the temporal delay of events; 2) their correlation or overlap in topological location; 3) the temporal correlation of local events that overlap in component nodes. In order to compare real-world networks of different sizes, we design null models in which temporal and topological properties of events of an arbitrary order are systematically destroyed or preserved.
We applied our methods to 8 real-world physical contact networks and 5 collaboration networks. We show that, in physical contacts, events of different orders with a short temporal delay tend to be close in topology too. We then investigate the correlation of events in topology and discover that events of different orders are likely to overlap in component nodes. In particular, nodes that participate in many different groups (events) of a given order are likely to be involved in many different groups (events) of another order. Thus, individuals do not reduce their number of interactions of one order because of frequent interactions of another order. Finally, we show that local events that overlap in component nodes are correlated in time, which supports the finding that events close in time are also close in topology. In collaboration networks, we observe that events also overlap in component nodes. However, the correlation between topological distance and temporal delay of events is usually either weak or absent. Consistently, in collaboration networks, the temporal correlation of local events that overlap in component nodes is almost absent. Such differences between physical contacts and collaboration networks may be due to the fact that physical interactions are partly driven by proximity, so that individuals who are close to each other tend to interact, among (subsets of) themselves, close in time.
Our methods can be applied to compare real-world higher-order networks and to investigate how the properties of their events affect the dynamic processes unfolding on them. More realistic models of higher-order evolving networks can be further developed to reproduce specific properties of the higher-order interactions observed in this paper.

Definitions
Higher-order evolving networks
Time-varying social interactions or contacts have mostly been measured pairwise and studied with the formalism of (pairwise) temporal networks. A temporal network observed at discrete time within $[0, T)$ can be described by $\mathcal{G} = (\mathcal{N}, \mathcal{C})$, where $\mathcal{N}$ is the set of nodes or individuals and $\mathcal{C}$ is the set of pairwise interactions. If nodes $u$ and $v$ have a contact at time step $0 \le t \le T - 1$, then $(\ell, t) \in \mathcal{C}$, where $\ell = \ell(u, v)$ is the link connecting the pair of nodes between which the contact occurs. The contact $(\ell(u, v), t)$ can be regarded as the activation of the link $\ell(u, v)$ at time $t$. This traditional temporal network representation records social contacts as a set of pairwise interactions. However, individuals may gather in larger groups, so that more than two people interact with each other at the same time. For example, an interaction $(h(i, j, k), t)$ among three nodes at time $t$ is usually measured and recorded as three pairwise interactions $(\ell(i, j), t)$, $(\ell(j, k), t)$ and $(\ell(i, k), t)$.
Social interactions can be represented more precisely as a higher-order evolving network $\mathcal{H} = (\mathcal{N}, \mathcal{E})$ (or temporal hypergraph, following the definition of Cencetti et al. [24]), where $\mathcal{E}$ is the set of events of arbitrary orders. Such a group interaction or higher-order event $(h(u_1, \dots, u_d), t)$ can be regarded as the activation of the corresponding hyperlink $h(u_1, \dots, u_d)$ at time $t$. The size or order of the interaction is $d$, the size of the group. The pairwise time aggregated network of a traditional pairwise temporal network is $G = (\mathcal{N}, \Lambda)$, where any pair of nodes $(i, j)$ is connected by a link $\ell(i, j) \in \Lambda$ if $\ell(i, j)$ has been active at least once during the entire observation time $[0, T)$. Consistently, the higher-order time aggregated network is $H = (\mathcal{N}, \mathcal{L})$, where any set $\{u_1, \dots, u_d\}$ of $d$ nodes is connected by a hyperlink $h(u_1, \dots, u_d) \in \mathcal{L}$ of size $d$ if $h(u_1, \dots, u_d)$ has been activated at least once. The activity of each hyperlink $h$ can be represented by a time series $X_h = \{x_h(t), 0 \le t < T\}$, where $x_h(t) = 1$ if the hyperlink $h$ is active at time $t$, i.e., $e = (h, t) \in \mathcal{E}$, and $x_h(t) = 0$ otherwise.
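As an illustration of these definitions, the following minimal sketch stores a higher-order evolving network as a list of events and derives the set of hyperlinks $\mathcal{L}$ and the activity series $X_h$. The toy events, the `frozenset` representation of a hyperlink and the function names `aggregate` and `activity_series` are assumptions made for this example only.

```python
from collections import defaultdict

# Hypothetical toy data: each event is (hyperlink, time step), where the
# hyperlink is stored as a frozenset of node labels.
events = [
    (frozenset({"i", "j", "k"}), 0),
    (frozenset({"i", "j"}), 1),
    (frozenset({"i", "j", "k"}), 2),
    (frozenset({"k", "m"}), 2),
]
T = 4  # observation window [0, T)


def aggregate(events):
    """Return the hyperlinks that were activated at least once (the set L)."""
    return {h for h, _ in events}


def activity_series(events, T):
    """Map each hyperlink h to its binary series [x_h(0), ..., x_h(T-1)]."""
    series = defaultdict(lambda: [0] * T)
    for h, t in events:
        series[h][t] = 1
    return dict(series)


hyperlinks = aggregate(events)
X = activity_series(events, T)
print(len(hyperlinks), X[frozenset({"i", "j", "k"})])
```

Storing hyperlinks as frozensets makes them hashable and order-independent, so the order of an event is simply `len(h)`.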

Temporal and topological distance of events
The temporal distance or delay between two events $e_1 = (h_1, t)$ and $e_2 = (h_2, s)$ is $\mathcal{T}(e_1, e_2) = |t - s|$.
The topological distance, also called hop-count, between two nodes on a pairwise static network is the number of links contained in the shortest path between these two nodes. We define the topological distance $\eta(e_1, e_2)$ between two events $e_1 = (h_1, t)$ and $e_2 = (h_2, s)$ as the topological distance between the corresponding two hyperlinks $h_1$ and $h_2$, which is defined as follows. The distance between a hyperlink and itself is zero, e.g., $\eta((h_1, t), (h_1, s)) = 0$. The distance between two different hyperlinks $h(u_1, \dots, u_d)$ and $h(v_1, \dots, v_{d'})$ with size $d$ and $d'$, respectively, follows
$$\eta\big(h(u_1, \dots, u_d), h(v_1, \dots, v_{d'})\big) = 1 + \min_{1 \le i \le d,\ 1 \le j \le d'} \delta(u_i, v_j),$$
where $\delta(u, v)$ is the distance or hop-count between nodes $u$ and $v$ on the unweighted pairwise time aggregated network $G$.
The distance between two events is thus one plus the minimal distance between two component nodes, one from each event. For example, the distance between events $e_1 = (h(i, j, k), t)$ and $e_2 = (h(i, m, n), s)$ is $\eta(e_1, e_2) = 1$, because the two hyperlinks share node $i$ and $\delta(i, i) = 0$.
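The two distances can be sketched as follows. This is only an illustration under stated assumptions: the pairwise time aggregated network is built here by projecting every hyperlink onto a clique, hop-counts are obtained with a plain breadth-first search, and all function names and toy events are hypothetical.

```python
from collections import deque
from itertools import combinations


def pairwise_adjacency(hyperlinks):
    """Assumed construction: project each hyperlink onto a clique of links."""
    adj = {}
    for h in hyperlinks:
        for u in h:
            adj.setdefault(u, set())
        for u, v in combinations(h, 2):
            adj[u].add(v)
            adj[v].add(u)
    return adj


def hop_counts(adj, source):
    """Breadth-first search: hop-count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def temporal_distance(e1, e2):
    """Absolute difference between the time stamps of two events."""
    return abs(e1[1] - e2[1])


def topological_distance(e1, e2, adj):
    """0 for the same hyperlink, else 1 + minimal hop-count between members."""
    h1, h2 = e1[0], e2[0]
    if h1 == h2:
        return 0
    best = float("inf")
    for u in h1:
        dist = hop_counts(adj, u)
        for v in h2:
            best = min(best, dist.get(v, float("inf")))
    return 1 + best


e1 = (frozenset({"i", "j", "k"}), 0)
e2 = (frozenset({"i", "m", "n"}), 3)
e3 = (frozenset({"n", "p"}), 5)
adj = pairwise_adjacency([e1[0], e2[0], e3[0]])
print(topological_distance(e1, e2, adj), topological_distance(e1, e3, adj))
```

In the toy data, `e1` and `e2` share node `i`, which gives distance 1 as in the example above, while `e1` and `e3` are one hop apart (via the link between `i` and `n`), which gives distance 2.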

Network randomization -control methods
To detect non-trivial temporal and topological patterns of events, we compare properties obtained from real-world higher-order temporal networks with those of designed null models. We generalize to higher-order temporal networks the randomized reference models of pairwise evolving networks, which gradually preserve and destroy temporal and topological properties of pairwise interactions [25-27]. Given a higher-order evolving network $\mathcal{H}$ and any given order $d$ of events, we introduce 3 randomized null models $\mathcal{H}^1_d$, $\mathcal{H}^2_d$ and $\mathcal{H}^3_d$, which systematically remove or preserve specific temporal or topological properties of order $d$ events only, while preserving the properties of events of any other size $d' \neq d$. We denote by $\mathcal{E}_d$ the set of events with the same size $d$.
The randomized network $\mathcal{H}^1_d$ is obtained by randomly re-shuffling the time stamps of the events in $\mathcal{E}_d$, without changing the topological locations of these events. This randomization changes neither the total number of activations of each hyperlink nor the probability distribution of the topological distance of two randomly selected events.
As mentioned in Subsection 2.2, the activations of a given hyperlink $h$ can be represented by a time series $X_h$. The randomized network $\mathcal{H}^2_d$ is obtained by iteratively swapping the time series of two randomly selected hyperlinks of size $d$. In $\mathcal{H}^2_d$, the inter-event time distribution of the activity of a random hyperlink of order $d$ is preserved as in the original network $\mathcal{H}$.
The third randomized network $\mathcal{H}^3_d$ is obtained by swapping the activity time series of two randomly selected hyperlinks with the same size $d$ and the same total number of activations. This randomization changes neither the number of activations of any hyperlink, nor the distribution of the topological distance of two random events, nor the inter-event time distribution of order $d$ events.
These three randomized models preserve the unweighted higher-order time aggregated network $H$ and the probability distribution of the temporal distance of two random events of size $d$.
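The three randomizations can be sketched as below. This is a simplified illustration, not the procedure used for the reported results: instead of iterating pairwise swaps, the sketch draws one random permutation of the time series (the outcome that many random swaps approach), it does not treat the case in which a shuffle assigns the same time stamp twice to one hyperlink, and the function names and toy events are hypothetical.

```python
import random
from collections import defaultdict


def shuffle_timestamps(events, d, rng):
    """Sketch of H^1_d: permute the time stamps among order-d events."""
    idx = [i for i, (h, _) in enumerate(events) if len(h) == d]
    times = [events[i][1] for i in idx]
    rng.shuffle(times)
    out = list(events)
    for i, t in zip(idx, times):
        out[i] = (events[i][0], t)
    return out


def swap_time_series(events, d, rng, same_count=False):
    """Sketch of H^2_d (same_count=False) and H^3_d (same_count=True)."""
    times = defaultdict(list)
    for h, t in events:
        if len(h) == d:
            times[h].append(t)
    # Hyperlinks may exchange series only within the same group.
    groups = defaultdict(list)
    for h, ts in times.items():
        groups[len(ts) if same_count else None].append(h)
    mapping = {}
    for members in groups.values():
        permuted = members[:]
        rng.shuffle(permuted)
        mapping.update(zip(members, permuted))
    # Events of other orders are untouched; mapping[h] receives h's series.
    out = [(h, t) for h, t in events if len(h) != d]
    out += [(mapping[h], t) for h, ts in times.items() for t in ts]
    return out


events = [
    (frozenset("abc"), 0), (frozenset("abc"), 1),
    (frozenset("bcd"), 5), (frozenset("bcd"), 7),
    (frozenset("cde"), 9), (frozenset("ab"), 2),
]
rng = random.Random(0)
h1 = shuffle_timestamps(events, 3, rng)
h2 = swap_time_series(events, 3, rng)
h3 = swap_time_series(events, 3, rng, same_count=True)
```

Grouping by activation count is what distinguishes the third model from the second: a hyperlink can only receive the series of another hyperlink that was activated equally often.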

Datasets
We apply our method to 13 real-world datasets of human physical interactions and scientific collaborations. The first 8 datasets are collections of face-to-face interactions at a distance smaller than 2 m in several social contexts such as conferences (HT2009, SFHH), a hospital, a primary school (PS), high schools (HS2012, HS2013), a workplace (WP2) and a museum (Infectious). Face-to-face interactions are recorded as a set of pairwise interactions. Based on them, we deduce group interactions by promoting each set of $\binom{d}{2}$ dyadic interactions that occur at the same time and form a fully connected clique of $d$ nodes to an event of size $d$. Since a clique of order $d$ contains all its sub-cliques of order $d' < d$, only the maximal clique is promoted to a higher-order event, whereas sub-cliques are ignored. For example, 3 pairwise contacts $(\ell(i, j), t)$, $(\ell(j, k), t)$ and $(\ell(i, k), t)$ occurring at the same time $t$ are regarded as a single event of order 3, i.e., $(h(i, j, k), t)$, without any order 2 event. This method has already been used by Cencetti et al. [24] to deduce higher-order interactions from datasets of human face-to-face interactions. We further preprocess these datasets by removing nodes that are not connected to the largest connected component of the pairwise time aggregated network. We also remove long periods of inactivity, in which no event occurs in the network. Such periods usually correspond, e.g., to nights and weekends, and are recognized as outliers in the inter-event time distribution of the time series that records the total number of events per time stamp. The same pre-processing method has also been used in our recent work [25].
The other 5 higher-order collaboration networks are obtained from scientific papers recorded on arXiv in various fields: lattice high energy physics (hep-lat), theoretical nuclear physics (nucl-th), quantitative biology (q-bio), quantitative finance (q-fin) and quantum physics (quant-ph). In a collaboration network, each node represents an author, and an event of order $d$ occurs at time $t$ if a paper co-authored by $d$ authors is published at $t$. Assigning papers to the correct authors is not easy: the same author can be named differently, e.g., using the full first name or only its initial, and typographic errors may be present. Thus, we applied standard text preprocessing methods to the authors' names, and we identify each author by the initials of their first names together with their surname, according to the method of Newman et al. [28].
The total number of events of each order in each real-world temporal network is shown in the Appendix (Figures S1 and S2 in Supplementary Material). In each dataset, the number of events with order $2 \le d \le 4$ is not negligible; however, events with an order larger than 4 are rare (if not absent) in most of the physical contact datasets. Details of the datasets after preprocessing are given in Table 1.
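The promotion of simultaneous dyadic contacts to higher-order events can be sketched as follows. The sketch assumes contacts given as `(u, v, t)` triples and enumerates maximal cliques per time stamp with a basic Bron-Kerbosch recursion (no pivoting); the function names and the toy contacts are hypothetical, and the further preprocessing steps (largest component, removal of inactive periods) are not included.

```python
from collections import defaultdict


def maximal_cliques(adj):
    """Enumerate maximal cliques with a basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def promote(contacts):
    """Turn pairwise contacts (u, v, t) into higher-order events (h, t)."""
    by_time = defaultdict(lambda: defaultdict(set))
    for u, v, t in contacts:
        by_time[t][u].add(v)
        by_time[t][v].add(u)
    events = []
    for t in sorted(by_time):
        # Only maximal cliques become events; sub-cliques are ignored.
        for clique in maximal_cliques(by_time[t]):
            events.append((clique, t))
    return events


contacts = [
    ("i", "j", 0), ("j", "k", 0), ("i", "k", 0),  # triangle at t = 0
    ("k", "m", 0),                                # extra dyad at t = 0
    ("i", "j", 1),                                # single dyad at t = 1
]
events = promote(contacts)
print(sorted((sorted(h), t) for h, t in events))
```

In the toy data, the triangle at `t = 0` yields one order 3 event and no order 2 event among `i`, `j` and `k`, while the contact between `k` and `m` remains an order 2 event because it is not contained in a larger clique.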

Characterizing temporal-topological properties of networks
In this Section we introduce a systematic characterization method for higher-order temporal networks. We characterize the temporal and topological properties of events from three different perspectives. In Subsection 4.1, we analyze the interrelation between the temporal and topological distance of two arbitrary events of different orders. In Subsection 4.2, we study the topological correlation of events, i.e., how events of different orders overlap in component nodes. Finally, Subsection 4.3 introduces a method to characterize the temporal correlation of events occurring close in topology.
In this subsection we investigate how the temporal and topological distances of events are related to each other. Specifically, we aim to understand to what extent events close in time are also close in topology. In our previous work [25], we treated all interactions in a temporal network as pairwise interactions and found, in real-world physical and virtual contact networks, that pairwise interactions that are close in time tend to be close in topology (in the pairwise time aggregated network). Here, we generalize the method of characterizing the relation between topological and temporal distance of two dyadic interactions to that of two higher-order events with different orders. In this analysis, normalizations of the topological distance and randomizations of the networks are applied so that we can compare real-world temporal networks that differ in properties such as the number of nodes and contacts. We take order $d = 3$ as an example to illustrate our method and observations. In Figures 1 and 2 we investigate the average topological distance $E[\eta(e, e') \mid \mathcal{T}(e, e') < \Delta t,\ e \in \mathcal{E}_d,\ e' \in \mathcal{E} \setminus \mathcal{E}_d]$ between two events $(e, e')$ with different orders $d \neq d'$, given that their temporal distance is smaller than $\Delta t$, in physical contact and collaboration networks, respectively.
We observe that the normalized average topological distance between events increases with their conditional temporal distance in physical contact networks, but generally not in collaboration networks. Thus, in physical contacts, events of different orders that occur close in time also tend to be close in topology. The slope of this increase indicates the relative strength of the temporal-topological correlation. The highest slopes are observed in the Infectious, Workplace and Hospital networks. In contrast, this slope is close to zero in the corresponding randomized networks $\mathcal{H}^1_d$, $\mathcal{H}^2_d$ and $\mathcal{H}^3_d$. Hence, the randomizations remove the correlation between temporal and topological distance.
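The conditional average can be sketched as follows. Two points are assumptions of this illustration rather than statements from the text: the normalization divides by the unconditional average (so that a value of 1 means no temporal-topological correlation), and the distance function `eta` is a hypothetical stand-in that returns 1 for hyperlinks sharing a node and 2 otherwise; in practice it would be replaced by the hop-count based distance on the pairwise time aggregated network.

```python
def mean_distance_given_delay(events, d, delta_t, eta):
    """Average eta(e, e') over pairs with e of order d, e' of another order,
    and temporal distance strictly smaller than delta_t."""
    total, count = 0.0, 0
    for h1, t in events:
        if len(h1) != d:
            continue
        for h2, s in events:
            if len(h2) == d:
                continue
            if abs(t - s) < delta_t:
                total += eta(h1, h2)
                count += 1
    return total / count if count else float("nan")


def eta(h1, h2):
    """Hypothetical stand-in distance: 1 if the hyperlinks overlap, else 2."""
    return 1 if h1 & h2 else 2


events = [
    (frozenset("abc"), 0),
    (frozenset("ab"), 1),
    (frozenset("cd"), 2),
    (frozenset("de"), 50),
]
conditional = mean_distance_given_delay(events, 3, 5, eta)
unconditional = mean_distance_given_delay(events, 3, float("inf"), eta)
normalized = conditional / unconditional
print(conditional, normalized)
```

Evaluating `normalized` over a range of `delta_t` values, for the original network and for each randomized model, gives the kind of curve whose slope is discussed above.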

Topological correlation of events with different orders
To better understand the observed correlation between temporal and topological distance of events, we explore in this subsection whether higher-order events overlap in component nodes (correlation in topology), and in Subsection 4.3 whether events that overlap in topology are correlated in time. Higher-order events that overlap in component nodes and occur close in time may partially explain the observed temporal and topological correlation between events. Would a node that belongs to many hyperlinks of order $d$ also be connected to many hyperlinks of order $d' \neq d$? To investigate this question, we examine the number of hyperlinks of each order that a node belongs to in the higher-order time aggregated network. The total number of order $d$ hyperlinks that a node $v$ is connected to, denoted by $k_d(v)$, is also called the $d$-degree of node $v$. In Figure 3 (Figure 4), we compare the $d$-degree and the $d'$-degree of a node when $(d', d)$ is equal to (3,2), (4,2) and (4,3), respectively, in each physical contact (collaboration) network. We focus on the case $(d', d) = (3,2)$ as an example. We observe that the $d'$-degree of a node is an increasing function of the $d$-degree of the node in every considered collaboration and physical contact network. Hence, a node that participates in many groups of order 3 tends to be involved in many groups of order 2. When $(d', d)$ equals (4,2) and (4,3), this trend is less evident in physical contact networks (especially in WP2, HS2012, Infectious and HT2009) but remains evident in collaboration networks. This is likely because the number of order 4 hyperlinks is generally low in physical contact networks (see Figure S3 in Supplementary Material), but not in collaboration networks (see Figure S4 in Supplementary Material).
Furthermore, we investigate whether a node that is involved in many order $d$ events tends to join many order $d'$ interactions. The number of order $d$ events that a node $v$ is involved in, denoted by $s_d(v)$, is also called the $d$-strength of node $v$. Similar to our analysis of the $d$-degree and $d'$-degree of a node, we find that the $d$-strength and $d'$-strength of a node are also positively correlated when $(d', d)$ equals (3,2) in each temporal network, as shown in Figures 5 and 6. This trend is less evident only in physical contact networks that have few order 4 events, when $(d', d)$ is equal to (4,3) and (4,2). This suggests that a large number of interactions of one order does not reduce an individual's number of events of another order. Individuals tend to be consistently active or inactive in events across orders.
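The two node-level quantities can be computed directly from the list of events, as in the minimal sketch below. The function names and the toy events are hypothetical; hyperlinks are assumed to be stored as frozensets of node labels.

```python
from collections import Counter


def d_degree(events, d):
    """k_d(v): number of distinct order-d hyperlinks that contain node v."""
    degree = Counter()
    for h in {h for h, _ in events if len(h) == d}:
        degree.update(h)
    return degree


def d_strength(events, d):
    """s_d(v): number of order-d events in which node v takes part."""
    strength = Counter()
    for h, _ in events:
        if len(h) == d:
            strength.update(h)
    return strength


events = [
    (frozenset("abc"), 0), (frozenset("abc"), 1), (frozenset("abd"), 2),
    (frozenset("ab"), 3), (frozenset("ac"), 4), (frozenset("ab"), 5),
]
k3, s3 = d_degree(events, 3), d_strength(events, 3)
k2, s2 = d_degree(events, 2), d_strength(events, 2)
print(k3["a"], s3["a"], k2["a"], s2["a"])
```

The degree counts each hyperlink once regardless of how often it is activated, whereas the strength counts every activation, which is why node `a` has 3-degree 2 but 3-strength 3 in the toy data.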
Note that both axes are presented in logarithmic scale. The horizontal axis is split into 30 logarithmic bins.

Figure 9. Schematic representation of a) the egonetwork of the hyperlink $h(i, j, k)$, i.e. $ego(h(i, j, k))$, b) the time series associated with the links belonging to $ego(h(i, j, k))$, c) the time series of the activity of $ego(h(i, j, k))$, which is the sum of the time series of the hyperlinks belonging to the egonetwork, and its event trains identified when $\Delta t = 2$ s.
We then evaluate the temporal correlation of the time series of an egonetwork $ego(h_d)$, to understand whether the activations of the center hyperlink $h_d$ tend to cluster in time with the activations of the other, lower-order hyperlinks in the egonetwork $ego(h_d)$.
Our analysis method is based on the concept of event trains, proposed by Karsai et al. [5]. A train of events is a sequence of consecutive events whose inter-event times are shorter than or equal to a reference temporal interval $\Delta t$ and which is separated from the other contacts by inter-event times larger than $\Delta t$. Given a $\Delta t$ and an activity time series of an egonetwork $ego(h_d)$, trains can be identified, as exemplified in Figure 9. Given $\Delta t$ and an order $d$, we identify all the trains in the activity series of the egonetwork centered at each order $d$ hyperlink. The size of a train is the number of events the train contains. Then, we examine the size distribution $\Pr[S^*_d = s]$ of the identified trains in which a center hyperlink has been activated at least once. The timescales of physical contact and collaboration networks are different: the two classes are measured in steps of seconds and days, respectively. To illustrate our method and findings, we consider $\Delta t = 60$ s (60 d) in physical contact (collaboration) networks to identify the trains in each egonetwork. The choice $\Delta t = 60$ s is also motivated by the observation in Figure 1 that the positive temporal and topological correlation of higher-order events starts to appear when $\Delta t$ is about 80 s in physical contact networks. Moreover, the analysis below yields the same observations when $\Delta t = 120$ s (120 d) in physical contact (collaboration) networks.
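Train identification can be sketched as follows. The sketch assumes that the activity of one egonetwork is already available as a list of `(time, is_center)` pairs, where `is_center` marks activations of the center hyperlink; this input format, the function name and the toy data are hypothetical, and the construction of the egonetwork itself is not shown.

```python
from collections import Counter


def trains_with_center(ego_events, delta_t):
    """Split events into trains (gaps <= delta_t stay in the same train) and
    keep only the trains that contain a center-hyperlink activation."""
    trains = []
    for t, is_center in sorted(ego_events):
        if trains and t - trains[-1][-1][0] <= delta_t:
            trains[-1].append((t, is_center))
        else:
            trains.append([(t, is_center)])
    return [train for train in trains if any(c for _, c in train)]


# Toy activity of one egonetwork: time stamp and whether the event is an
# activation of the center hyperlink.
ego_events = [
    (0, True), (1, False), (2, False),  # train of size 3 with the center
    (10, False), (11, False),           # train of size 2 without the center
    (30, True),                         # isolated center activation
]
kept = trains_with_center(ego_events, delta_t=2)
sizes = [len(train) for train in kept]
size_distribution = {s: c / len(sizes) for s, c in Counter(sizes).items()}
print(sizes, size_distribution)
```

Pooling `sizes` over the egonetworks centered at all order $d$ hyperlinks gives an empirical estimate of the train size distribution described above.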
Figures 10 and 11 show the train size distribution $\Pr[S^*_3 = s]$ of the egonetworks centered at each order 3 hyperlink in each physical contact and collaboration network $\mathcal{H}$ and its three null models $\mathcal{H}^1_3$, $\mathcal{H}^2_3$, $\mathcal{H}^3_3$. The three randomized reference models [...] distance, which may facilitate the interaction of a subgroup, resulting in events close in time and topology.

Conclusion
In this paper, we have proposed a method to systematically characterize temporal and topological properties of events of arbitrary orders. We applied our methods to 8 physical contact and 5 collaboration higher-order evolving networks and observed their differences. In physical contacts, events close in time also tend to occur close in topology. Moreover, events usually overlap in component nodes, and these local events overlapping in component nodes are usually also correlated in time. Such temporal correlation of local events further supports the correlation between temporal and topological distances of events observed in our first analysis. In contrast, in collaboration networks, the temporal and topological correlation of events is either weak or absent. Although events also overlap in component nodes, their temporal correlation almost disappears in collaboration networks.
The detected dissimilarities between physical contact and collaboration networks could be related to a fundamental difference between the two kinds of networks. In physical contacts, individuals participate in events driven by physical proximity. The physical proximity of the individuals that participate in a higher-order event may facilitate interactions among them, or among a subgroup of them, in the near future. The timing of scientific collaborations is likely driven more by their content and creation process.
Via our analysis of the topological overlap of events with different orders in component nodes, we also observe similarities between the two kinds of networks. Nodes that participate in many events (groups) of a given order tend to interact in many events (groups) of a different order. Hence, nodes are consistent in their interactions with respect to frequency and diversity across different orders.
Our method explores the temporal and topological relation of the basic building blocks of events, the activations of fully connected cliques. A promising direction could be to generalize this method to the activations of relevant motifs and to investigate the interplay between topological location and temporal delay of such structures. Beyond this, our method can be applied to compare different classes of networks (e.g. biological, brain or collaboration networks) and to explore how the detected properties or patterns of a network influence the dynamic processes unfolding on the network. Finally, the topological and
, between an order $d = 2$ event and an event of a different order, in each physical contact network and its corresponding three randomized null models H

Figure 3. The $d'$-degree $k_{d'}(v)$ versus the $d$-degree $k_d(v)$ of a node $v$ when $(d', d)$ is equal to (3,2) (blue line), (4,2) (yellow line) and (4,3) (green line), respectively, in each physical contact network. Each axis (e.g., $k_d(v)$) has been normalized by its maximum (e.g., $\max_v k_d(v)$). Only nodes whose $d$-degree and $d'$-degree are both non-zero are considered. The dashed line represents the reference case

Figure 11. Probability distribution $\Pr[S^*_3 = s]$ of the size $S^*_3$ of trains (obtained from the activity series of egonetworks centered at each order 3 hyperlink) in which a center hyperlink is activated at least once, in each collaboration network $\mathcal{H}$ (blue) and its three randomized reference models $\mathcal{H}^1_3$ (yellow), $\mathcal{H}^2_3$ (green) and $\mathcal{H}^3_3$ (red). To identify the trains, we consider $\Delta t = 60$ d. For each network, the average size of the trains is reported. The maximum average size among network $\mathcal{H}$, $\mathcal{H}^1_3$, $\mathcal{H}^2_3$ and $\mathcal{H}^3_3$

Table 1. Basic features of the empirical higher-order time-evolving networks after data processing. The number of nodes ($|\mathcal{N}|$), the number of hyperlinks ($|\mathcal{L}|$), the total number of events ($|\mathcal{E}|$), the length of the observation time window in time steps ($T$), the time resolution or duration of each time step ($dt$) in seconds or days, and the contact type are shown.
$E[\eta(e, e') \mid \mathcal{T}(e, e') < \Delta t,\ e \in \mathcal{E}_d,\ e' \in \mathcal{E} \setminus \mathcal{E}_d] = E[\eta(e, e') \mid e \in \mathcal{E}_d,\ e' \in \mathcal{E} \setminus \mathcal{E}_d]$ for any $d$. The horizontal axes are presented in logarithmic scale. For each dataset, the results of the three corresponding randomized models are obtained from 10 independent realizations.