Comparing methods for comparing networks

With the impressive growth of available data and the flexibility of network modelling, the problem of devising effective quantitative methods for the comparison of networks arises. Plenty of such methods have been designed to accomplish this task: most of them deal with undirected and unweighted networks only, but a few are capable of handling directed and/or weighted networks too, thus properly exploiting richer information. In this work, we contribute to the effort of comparing the different methods for comparing networks and providing a guide for the selection of an appropriate one. First, we review and classify a collection of network comparison methods, highlighting the criteria they are based on and their advantages and drawbacks. The set includes methods requiring known node-correspondence, such as DeltaCon and Cut Distance, as well as methods not requiring a priori known node-correspondence, such as alignment-based, graphlet-based, and spectral methods, and the recently proposed Portrait Divergence and NetLSD. We test the above methods on synthetic networks and we assess their usability and the meaningfulness of the results they provide. Finally, we apply the methods to two real-world datasets, the European Air Transportation Network and the FAO Trade Network, in order to discuss the results that can be drawn from this type of analysis.

if $\sum_{i,j \in V} \max(a^1_{ij}, a^2_{ij}) = 0$, and the Weighted Jaccard distance (WJAC) as

S2 Codes/executables used for testing
The codes/executables used in the analysis were downloaded from the following URLs (the norms of the differences between adjacency matrices, the global statistics, and the spectral methods were coded directly by ourselves):
-DeltaCon: http://web.eecs.umich.edu/~dkoutra/
-MI-GRAAL: http://www0.cs.ucl.ac.uk/staff/natasa/MI-GRAAL/index.html. The code was customized to use the degree, the clustering coefficient, and the betweenness centrality as node similarities.

S3 Network models
We used three different network models for testing, namely Erdős-Rényi 3 (ER), Barabási-Albert 4 (BA) and Lancichinetti-Fortunato-Radicchi 5, 6 (LFR). Tables S1 and S2 summarize the values of the parameters used to generate instances of each network model. The ER graphs were generated by setting the proper number of edges to achieve the desired densities. For the BA graphs, we tuned the number of edges to add at each step of the algorithm to get as close as possible to the required densities.
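As a rough illustration of how a target density translates into generator parameters, the following Python sketch computes the number of edges for an ER graph and picks the number of links per step for a BA graph. It is not the code used for the experiments: all function names are hypothetical, and the BA edge count m(n - m) is an assumption that holds for a variant starting from m seed nodes and attaching m links per new node.

```python
import random

def edges_for_density(n, density):
    """Number of undirected edges that gives the requested density."""
    return round(density * n * (n - 1) / 2)

def er_graph(n, density, seed=None):
    """ER sample with a fixed number of edges, chosen uniformly at random."""
    rng = random.Random(seed)
    m = edges_for_density(n, density)
    edges = set()
    while len(edges) < m:
        u, v = rng.sample(range(n), 2)      # two distinct nodes, no self-loops
        edges.add((min(u, v), max(u, v)))   # store each undirected edge once
    return edges

def ba_edges_per_step(n, density):
    """Links per step m whose total m * (n - m) is closest to the target."""
    target = edges_for_density(n, density)
    return min(range(1, n // 2 + 1), key=lambda m: abs(m * (n - m) - target))
```

Under these assumptions, n = 1000 and density 0.01 give m = 5, that is 4975 edges against a target of 4995, which shows why the BA densities can only approximate the required ones.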
The LFR networks were generated with the algorithm and code by Lancichinetti, Fortunato and Radicchi 5,6; both the degree and the community size distributions are assumed to be power laws, with exponents γ and β, respectively, and a parameter µ tunes the strength of the community structure: each node shares a fraction µ of its links with nodes outside its community, so lower values of µ denote graphs with stronger community structure. An instance of an undirected LFR graph is generated by an algorithm whose main steps are the following:
-The degree of each node is assigned, from a power law distribution with exponent γ.
-The size of each community is assigned, from a power law distribution with exponent β.
-Nodes are randomly assigned to communities with an iterative procedure.
-Some rewiring steps are performed to enforce the condition that each node shares a fraction µ of its links with nodes outside its community.
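The first two steps and the role of µ can be sketched in Python as follows. This is only a minimal illustration, not the LFR code itself: the function names, the truncation bounds of the power laws, and the rounding rule for the external links are assumptions made for the example.

```python
import random

def sample_power_law(exponent, x_min, x_max, size, rng):
    """Draw integers in [x_min, x_max] with P(x) proportional to x**(-exponent)."""
    values = list(range(x_min, x_max + 1))
    weights = [x ** (-exponent) for x in values]
    return rng.choices(values, weights=weights, k=size)

def split_degree(degree, mu):
    """Split a degree into (internal, external) links for mixing parameter mu."""
    external = round(mu * degree)   # links to nodes outside the community
    return degree - external, external

rng = random.Random(0)
degrees = sample_power_law(2.0, 5, 50, 1000, rng)          # exponent gamma
community_sizes = sample_power_law(1.0, 20, 100, 10, rng)  # exponent beta
targets = [split_degree(k, 0.1) for k in degrees]          # mu = 0.1
```

The assignment of nodes to communities and the rewiring steps are omitted here, since they depend on details of the published algorithm.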
The available code does not produce a network with exactly the required number of edges, so we ran it multiple times and took the network with the number of edges closest to the prescribed value. What "close" means is specified by the Edge tolerance parameter reported in Tables S1 and S2: the number of edges of the generated network belongs to the interval [Required edges ± Edge tolerance].
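One way to organise such repeated runs is sketched below in Python. The helper and its stopping rule are assumptions for illustration: `generator` stands for any hypothetical callable that takes a seed and returns a collection of edges, and the loop keeps the closest instance seen so far, stopping as soon as it falls within the tolerance.

```python
def generate_within_tolerance(generator, required_edges, tolerance, max_tries=100):
    """Call generator(seed) until the edge count is within the tolerance."""
    best = None
    for seed in range(max_tries):
        edges = generator(seed)
        # Keep the instance whose edge count is closest to the required one.
        if best is None or abs(len(edges) - required_edges) < abs(len(best) - required_edges):
            best = edges
        if abs(len(best) - required_edges) <= tolerance:
            return best
    raise RuntimeError("no instance within the edge tolerance")
```

The bound `max_tries` is a safeguard added for the sketch; the source does not state how many runs were performed.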
As for directed networks, for an ER graph the direction of each edge is chosen at random with equal probability. In the case of BA graphs, at each step of the algorithm a new vertex with the chosen fixed number of out-links is added, each link pointing to an already existing node with probability proportional to the in-degree of that node; this yields a BA network whose hubs have large in-degree. Finally, the generation of an LFR network follows the same algorithm presented above, with the only difference that in the first step nodes are assigned the in-degree from a power law distribution with exponent γ and the out-degree from a δ distribution. The constraints needed in the undirected case are generalized to fit the directed case 6.
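The directed ER and BA constructions can be sketched in Python as follows. The function names are hypothetical, and one detail is an assumption not stated in the text: targets are chosen with probability proportional to (in-degree + 1), so that nodes with zero in-degree can still be selected at the start of the process.

```python
import random

def orient_at_random(undirected_edges, rng):
    """Give each edge one of its two directions with probability 1/2."""
    return [(u, v) if rng.random() < 0.5 else (v, u) for u, v in undirected_edges]

def directed_ba(n, m, rng):
    """Directed BA sketch: each new node sends m out-links to existing nodes."""
    in_degree = [0] * n
    edges = []
    for new in range(m, n):                 # nodes 0..m-1 act as seed nodes
        weights = [in_degree[v] + 1 for v in range(new)]
        targets = set()
        while len(targets) < m:             # m distinct targets per new node
            targets.add(rng.choices(range(new), weights=weights)[0])
        for t in targets:
            edges.append((new, t))          # link points to the older node
            in_degree[t] += 1
    return edges
```

Because every link points from a newer to an older node, early nodes accumulate in-links and become the hubs with large in-degree.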
The numerical values of the parameters used to generate both undirected and directed networks are reported in Tables S1 and S2 for networks with 1000 and 2000 nodes, respectively.

Figures S1 and S2 show the results of the perturbation tests on undirected networks with 0.05 edge density. The results are essentially equivalent to those obtained for networks with 0.01 density and described in the main paper: among other features, we again highlight the different behaviour of KNC and UNC methods, the fact that the LFR curve almost always stays above the other curves in the KNC methods, and the high variability of GCD-11. The diameter proves completely inadequate as a distance. The only qualitative difference with respect to the 0.01 case concerns the PDIV distance: the initial step is not as sharp as before, and almost all curves saturate immediately, with the exception of the ER curve in the pADD-test and pREM-test and of the LFR curve in the dSWI-test. In conclusion, in the undirected case a change in the density of the graphs does not seem to heavily influence the behaviour of the different methods.

Figures S3 to S5 show the results of the perturbation tests on directed networks with 0.05 edge density. We essentially observe the same qualitative results as in the 0.01 density case. The DGCD-129 distance now presents a pronounced step in the first few perturbations, except for the pREM-test. As in the undirected case, the methods for directed networks do not seem to be influenced by changes in the network density.

In Figure S6 we show the dendrograms for the clustering tests on undirected networks that were not included in the main paper. As already pointed out, these three methods are able to group networks of the same class, but only if they have the same size and density.

Directed networks
In Tables S3a and S4a we report the values of the AUPR metric obtained from the Precision-Recall analysis for undirected and directed networks, respectively. To verify that the methods perform well in grouping networks of similar size and density, we repeated the test for each one of the four size/density subsets. The results are in Tables S3b and S4b.

Tests on real-world networks
European Air Transportation Network. The European Air Transportation dataset was described and analysed by Cardillo et al. 7. We used a preprocessed version of the dataset downloaded from https://comunelab.fbk.eu/data.php, updated to 2011 and composed of 450 nodes representing airports. Each of them is labelled with its ICAO airport code, and latitudes and longitudes are reported. We found two nodes labelled as "XXXX" and "YYYY", both with zero latitude and longitude: we removed them from the dataset. Table S5 lists all the airlines considered in the dataset. The dendrograms resulting from the cluster analysis are shown in Figure S7. For the discussion of the results, see the main paper.
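The removal of the two placeholder nodes mentioned above can be sketched in Python as follows. The data layout is an assumption made for the example (a dictionary from ICAO code to coordinates plus an edge list), the function name is hypothetical, and the toy values are illustrative only, not taken from the dataset.

```python
def drop_placeholder_airports(coords, edges, placeholders=("XXXX", "YYYY")):
    """Remove placeholder nodes and every edge that touches them."""
    bad = {code for code in coords if code in placeholders}
    clean_coords = {c: xy for c, xy in coords.items() if c not in bad}
    clean_edges = [(u, v) for u, v in edges if u not in bad and v not in bad]
    return clean_coords, clean_edges

# Toy input with approximate coordinates, for illustration only.
coords = {"LIML": (45.45, 9.28), "EGLL": (51.47, -0.45), "XXXX": (0.0, 0.0)}
edges = [("LIML", "EGLL"), ("LIML", "XXXX")]
clean_coords, clean_edges = drop_placeholder_airports(coords, edges)
```

Filtering by label rather than by zero coordinates avoids discarding a legitimate node whose coordinates happen to be missing.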