Community detection in networks by dynamical optimal transport formulation

Detecting communities in networks is important in various domains of applications. While a variety of methods exist to perform this task, recent efforts propose Optimal Transport (OT) principles combined with the geometric notion of Ollivier–Ricci curvature to classify nodes into groups by rigorously comparing the information encoded into nodes’ neighborhoods. We present an OT-based approach that exploits recent advances in OT theory to allow tuning between different transportation regimes. This allows for better control of the information shared between nodes’ neighborhoods. As a result, our model can flexibly capture different types of network structures and thus increase performance accuracy in recovering communities, compared to standard OT-based formulations. We test the performance of our algorithm on both synthetic and real networks, achieving a comparable or better performance than other OT-based methods in the former case, while finding communities that better represent node metadata in real data. This pushes further our understanding of geometric approaches in their ability to capture patterns in complex networks.


Introduction
Complex networks are ubiquitous, hence modeling interactions between pairs of individuals is a relevant problem in many disciplines 1,2. Among the variety of analyses that can be performed on them, community detection [3][4][5][6] is a popular application that involves finding groups (or communities) of nodes that share similar properties. The detected communities may reveal important functional properties of the underlying system. Community detection has been used in diverse areas, including discovering potential friends on social networks 7, evaluating social networks 8, personalized recommendation of items to users 9, detecting potential terrorist activities on social platforms 10, fraud detection in finance 11, and studying epidemic spreading processes 12.
Several algorithms have been proposed to tackle this problem. They rely on different approaches, such as statistical inference 13,14, graph modularity 15, statistical physics 16 and information theory 17. Here, instead, we adopt a recent approach connecting community detection with geometry, where communities are detected using geometric methods like the Ollivier-Ricci curvature (ORC), and we exploit optimal transport theory to calculate this efficiently.
In Riemannian geometry, curvature quantifies how geodesic paths converge or diverge, depending on the curvature's sign. In networks, the ORC plays a similar role: edges with negative curvature are traffic bottlenecks, in terms of network flow along the shortest paths. In the opposite case, positively curved edges contribute to transport on the network along with several others, reflecting the fact that they are well connected. Viewing communities as regions where information is robustly transported along the network, we can cluster edges based on their curvature: those with positive curvature can be clustered together, while those with negative curvature may be seen as "bridges" connecting different communities. The idea of using Ricci curvature to find communities on networks has been recently proposed in 18,19. In this work we follow a similar approach, but generalize it to the case of branched 20,21 and congested 22 optimal transport problems, building on recent results 23. Specifically, our algorithm allows us to efficiently tune the sensitivity to detecting communities in a network, by means of a parameter that controls the flow of information shared between nodes. We perform a comprehensive comparison between the proposed algorithm and existing ones on synthetic and real data. Our algorithm, named ORC-Nextrout, detects communities in synthetic networks with similar or higher accuracy compared to other OT-based methods in the regime where inference is not trivial. This is also observed in a variety of real networks, where the ability to tune between different transportation regimes allows finding at least one result that outperforms other methods, including approaches based on statistical inference and modularity-based community detection.

Related work.
The idea of exploiting geometrical properties of a graph, and in particular curvature, has been explored in different branches of network science, ranging from biological 24 to communication 25 networks. Intuitively, the Ricci curvature can be seen as the amount of volume by which a geodesic ball in a curved Riemannian manifold deviates from the standard ball in Euclidean space 26. When defined on graphs, it indicates whether edges connect nodes inside a cluster (those with positive curvature) or rather bond different clusters together (those with negative curvature).
Two main discrete graph curvature approaches have been proposed: the Ollivier-Ricci (OR) curvature, based on optimal transport theory and introduced by Ollivier 27,28, and the Forman-Ricci curvature introduced by Forman 29. While the graph Laplacian-based Forman curvature is computationally fast and less geometrical, we focus on the OT-based approach due to its more geometric nature. Some applications of the Ollivier-Ricci curvature include network alignment 30 and community detection 18,19,31.
On the other hand, community detection in networks is a fundamental area of network science, with a wide range of approaches proposed for this task 3,4,32. Our work is inspired by recent OT-based methods 18,19 for community detection. These methods use the OR curvature to sequentially identify and prune negatively curved edges from a network to identify communities. While our approach also uses the OR curvature to prune edges, it controls the flow of information exchanged between nodes by means of a parameter, making the edge pruning dynamic. This is detailed in Sec. 2.

β-Wasserstein Community Detection Algorithm
In this section, we describe how our approach solves the community detection problem. As previously stated, we rely on optimal transport principles to find the communities. To solve the optimal transport problem applied in our analysis we use the discrete Dynamic Monge-Kantorovich model (DMK), as proposed by Facca et al. 33,34 to solve transportation problems on networks.
We denote a weighted undirected graph as G = (V, E, W), where V, E, W are the sets of nodes, edges and weights, respectively. We use the information of a node neighborhood N(i) = { j ∈ V | (i, j) ∈ E } to decide whether node i belongs to a given community. We do this by comparing a distribution defined on N(i) to the ones defined on other nodes close to i. This distribution is defined as m_i^α, where

$$ m_i^{\alpha}(k) := \begin{cases} \alpha & \text{if } k = i, \\ (1-\alpha)/|\mathcal{N}(i)| & \text{if } k \in \mathcal{N}(i), \\ 0 & \text{otherwise.} \end{cases} $$

Intuitively, the distribution m_i^α assigns a unit of mass to i and its connections: α controls how much mass node i keeps, and once this is assigned, its neighbors share the remaining mass evenly.
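As a minimal sketch of this construction, the snippet below builds the neighborhood distribution for one node, assuming the graph is stored as a plain dictionary of neighbor sets. The function name `neighborhood_distribution` and the handling of isolated nodes are illustrative assumptions, not part of the original implementation.

```python
def neighborhood_distribution(adj, i, alpha):
    """Return m_i^alpha as a dict {node: mass}.

    adj   : dict mapping each node to the set of its neighbors
    alpha : mass kept on node i itself, with 0 <= alpha <= 1
    """
    neighbors = adj[i]
    if not neighbors:
        # Assumption of this sketch: an isolated node keeps all the mass.
        return {i: 1.0}
    share = (1.0 - alpha) / len(neighbors)  # even split of the remaining mass
    m = {k: share for k in neighbors}
    m[i] = alpha
    return m


# Toy star graph: node 1 is connected to nodes 2, 3 and 5.
adj = {1: {2, 3, 5}, 2: {1}, 3: {1}, 5: {1}}
m1 = neighborhood_distribution(adj, 1, alpha=0.5)
```

With α = 0 all the mass is placed on the neighbors, which is the setting used in Fig. 1.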
The next step is to compare the distribution m_i^α of the node i to that of its neighbors. Consider an edge (i, j) ∈ E and m_j, the distribution defined on the node j, neighbor of i. We assume that if i and j belong to the same community, then both nodes may have several neighbors in common, and therefore m_i and m_j should be similar. Notice that this implies an assortativity assumption, where nodes within the same community are more likely to interact than nodes in different communities 2,4. This has been observed for instance in social or biological networks 13,35. On the contrary, it may not be appropriate to model disassortative datasets, where nodes tend to connect more often across communities. To estimate the similarity between m_i and m_j we use OT principles. Specifically, we compute the cost of transforming one distribution into the other. This is related to the cost of moving the mass from one neighborhood to the other, and it is assumed to be the weighted shortest-path distance between nodes belonging to N(i) and N(j). A schematic representation of the algorithm can be seen in Fig. 1. The OT problem is solved in an auxiliary graph, the complete bipartite network B_ij = (V_ij, E_ij, ω_ij), where V_ij := (V_i, V_j) := (N(i) ∪ {i}, N(j) ∪ {j}) and E_ij is made of all the possible edges between V_i and V_j. The weights of the edges are given by the weighted shortest path distance d between two nodes measured on the input network G.
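The auxiliary bipartite cost graph can be sketched as follows, assuming a connected graph stored as a dictionary of dictionaries `wadj[u][v] = weight`. The helper names `dijkstra` and `bipartite_costs` are hypothetical and only illustrate how the costs ω_ij could be assembled from shortest-path distances.

```python
import heapq


def dijkstra(wadj, source):
    """Weighted shortest-path distances from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in wadj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def bipartite_costs(wadj, i, j):
    """Node sets V_i, V_j and costs omega[(k, l)] = d(k, l) measured on G."""
    V_i = sorted(set(wadj[i]) | {i})
    V_j = sorted(set(wadj[j]) | {j})
    omega = {}
    for k in V_i:
        dist = dijkstra(wadj, k)
        for l in V_j:
            omega[(k, l)] = dist[l]  # assumes l is reachable from k
    return V_i, V_j, omega


# Toy graph: a triangle (1, 2, 3) with a pendant node 4 attached to node 3.
wadj = {
    1: {2: 1.0, 3: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {1: 1.0, 2: 1.0, 4: 1.0},
    4: {3: 1.0},
}
V_3, V_4, omega = bipartite_costs(wadj, 3, 4)
```

Nodes shared by both sides (here 3 and 4) give zero-cost pairs such as (3, 3), so mass sitting on common neighbors does not need to move.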
The similarity between m_i and m_j is the Wasserstein cost W(m_i, m_j, ω_ij) of the solution of the transportation problem. In its standard version, this number is the inner product between the solution Q, a vector of flows defined on edges, and the cost ω_ij. In our case, since the DMK model allows controlling the flow of information through a hyperparameter β ∈ (0, 2], we define the β-Wasserstein cost, W_β(m_i, m_j, ω_ij), as the inner product of the solution Q = Q(β) of the DMK model and the cost ω_ij. For β = 1 we compute the Wasserstein-1 distance between m_i and m_j, while for β ≠ 1 the influence of β on the solution of the transportation problem can be seen in Fig. 2. When β < 1, more edges of B_ij tend to be used to transport the mass, thus we observe congested transportation 22. When β > 1 fewer edges are used, hence we observe branched transportation, and the β-Wasserstein cost coincides with a branched transport distance 21,36. The idea of tuning β to interpolate between various transportation regimes has been used in several works and engineering applications 23,[37][38][39][40][41][42].
Calculating the Wasserstein cost is necessary to determine our main quantity of interest, the discrete Ollivier-Ricci curvature, defined as:

$$ \kappa_\beta(i,j) := 1 - \frac{W_\beta(m_i, m_j, \omega_{ij})}{d_{ij}}, $$

where d_ij is the weighted shortest path distance between i and j as measured in G. Intuitively, if i and j are in the same community, several k ∈ V_i and ℓ ∈ V_j will also be directly connected. Thus, the Wasserstein distance between m_i and m_j will be shorter than d_ij, yielding a positive κ_β(i, j). Instead, when i and j are in different communities, their respective neighbors are unlikely to be connected, hence d_ij < W_β(m_i, m_j, ω_ij), yielding a negative κ_β(i, j).
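The sign argument above can be checked numerically with a few lines, assuming the usual Ollivier-Ricci form κ = 1 − W/d, which is consistent with the discussion of positive and negative curvature in the text. The function name and the numbers are illustrative only.

```python
def ollivier_ricci_curvature(wasserstein_cost, d_ij):
    """Curvature of edge (i, j), assuming kappa = 1 - W / d_ij."""
    if d_ij <= 0:
        raise ValueError("the shortest-path distance d_ij must be positive")
    return 1.0 - wasserstein_cost / d_ij


# Intra-community edge: neighborhoods overlap, so moving the mass is cheap.
kappa_intra = ollivier_ricci_curvature(wasserstein_cost=0.4, d_ij=1.0)
# Bridge edge: neighborhoods are far apart, so W exceeds d_ij.
kappa_bridge = ollivier_ricci_curvature(wasserstein_cost=1.5, d_ij=1.0)
```

Here `kappa_intra` is positive and `kappa_bridge` is negative, matching the two cases described above.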
The Ricci flow algorithm on a network is defined by iteratively updating the weights of the graph G 18,19. These are updated by combining the curvature and shortest path distance information 27. We redefine these updates using our proposal for the Ollivier-Ricci curvature:

$$ w_{ij}^{(t+1)} := d_{ij}^{(t)} - \kappa_\beta^{(t)}(i,j) \, d_{ij}^{(t)}, $$

where w_ij^(t) is the weight of edge (i, j) at iteration t and d_ij^(t) is the shortest path distance between nodes i and j at iteration t. At every time step t, the weights are normalized by their total sum.
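One iteration of the weight update can be sketched as below. The sketch assumes the discrete Ricci flow rule of Ni et al. 18, w ← d − κ·d, followed by division by the total weight. The dictionaries `dist` and `curvature` are hypothetical inputs keyed by edge.

```python
def ricci_flow_step(dist, curvature):
    """One weight update, assuming w_new = d - kappa * d, then normalization.

    dist      : dict {(i, j): shortest-path distance at iteration t}
    curvature : dict {(i, j): curvature kappa_beta(i, j) at iteration t}
    """
    new_w = {e: dist[e] - curvature[e] * dist[e] for e in dist}
    total = sum(new_w.values())
    return {e: w / total for e, w in new_w.items()}  # weights now sum to 1


dist = {(1, 2): 1.0, (2, 3): 1.0}
curvature = {(1, 2): 0.5, (2, 3): -0.5}  # (1, 2) intra-community, (2, 3) bridge
weights = ricci_flow_step(dist, curvature)
```

Under the assumption κ = 1 − W_β/d, the quantity d − κ·d equals the β-Wasserstein cost itself, so positively curved edges shrink and negatively curved edges stretch relative to d.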
The algorithm ORC-Nextrout dynamically changes the weights of the graph G to isolate communities: intra-community edges are shortened, while inter-community ones are lengthened. These changes are reached after a different number of iterations, depending on the input data. We choose the iteration that maximises some predefined quality measure. To find the communities we apply a network surgery criterion, as proposed by Ni et al. 18, based on the stabilisation of the modularity of the network. Notice that our algorithm does not need prior information about the number of communities: edges are either lengthened or shortened according to optimal transport principles that are agnostic to community labelings.
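The final surgery step can be illustrated with a simplified sketch: edges whose weight exceeds a cutoff are removed and the connected components of what remains are returned as communities. The fixed `cutoff` argument is a hypothetical simplification. The criterion of Ni et al. 18 selects it through the stabilisation of modularity, which is not reproduced here.

```python
def communities_after_surgery(weights, cutoff):
    """Cut edges heavier than cutoff and return the connected components.

    weights : dict {(i, j): weight after the Ricci flow iterations}
    cutoff  : hypothetical threshold above which an edge is removed
    """
    adj = {}
    for (i, j), w in weights.items():
        adj.setdefault(i, set())
        adj.setdefault(j, set())
        if w <= cutoff:  # keep only short (intra-community) edges
            adj[i].add(j)
            adj[j].add(i)
    seen, communities = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first search over the pruned graph
            u = stack.pop()
            if u in component:
                continue
            component.add(u)
            stack.extend(adj[u] - component)
        seen |= component
        communities.append(component)
    return communities


# Two triangles joined by a stretched bridge (2, 3).
weights = {
    (0, 1): 0.1, (1, 2): 0.1, (0, 2): 0.1,
    (2, 3): 0.9,
    (3, 4): 0.1, (4, 5): 0.1, (3, 5): 0.1,
}
communities = communities_after_surgery(weights, cutoff=0.5)
```

The number of communities is an output of this step rather than an input, in line with the remark above.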
A pseudo-code of the implementation is shown in Algorithm 1.

Synthetic Networks
To investigate the accuracy of our model in detecting communities, we consider synthetic networks generated using the Lancichinetti-Fortunato-Radicchi (LFR) benchmark 43 and the Stochastic Block Model (SBM) 44. Both models provide community labels used as ground-truth information during the classification tasks.
Lancichinetti-Fortunato-Radicchi benchmark: this benchmark generates undirected unweighted networks G with disjoint communities. It samples node degrees and community sizes from power law distributions, see Fig. 4 for an example. One of its advantages is that it generates networks with heterogeneous distributions of degrees and community sizes. The main input parameters are the number of nodes N, two exponents τ_1 and τ_2 for the power law distributions of the node degree and community size respectively, the expected degree d of the nodes, the maximum number of communities on the network K_max and a fraction µ of inter-community edges incident to each node. To test the performance of our algorithm, we use the set of LFR networks used and provided by the authors of 18. We set
Algorithm 1 (surviving fragment of the pseudo-code): Compute w_β(e); end for; Update w_t = w_β; end for.
Stochastic Block Model: this model probabilistically generates networks with non-overlapping communities. One specifies the number of nodes N and the number of communities K, together with the expected degree d of a node and a ratio r. Networks are generated by connecting nodes with probability r · p_intra if they belong to different communities, and p_intra if they are part of the same community, where p_intra = d × K/N. Notice that the smaller the ratio r is, the fewer inter-community connections exist, which leads to networks with a more distinct community structure. We set N = 500, K = 3, d = 15 and r ∈ [0.01, 0.5] and generate 10 random networks per value of r.
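The SBM generation rule just described can be sketched in a few lines. The equal-sized community assignment, the function name `sample_sbm` and the seed handling are assumptions of this sketch. The paper does not specify how nodes are split among the K communities.

```python
import random


def sample_sbm(N, K, d, r, seed=0):
    """Sample an undirected SBM with p_intra = d * K / N and p_inter = r * p_intra."""
    rng = random.Random(seed)
    p_intra = d * K / N
    p_inter = r * p_intra
    labels = [n % K for n in range(N)]  # assumption: (nearly) equal-sized groups
    edges = []
    for u in range(N):
        for v in range(u + 1, N):
            p = p_intra if labels[u] == labels[v] else p_inter
            if rng.random() < p:
                edges.append((u, v))
    return edges, labels


# Parameters from the text, with one value of r taken from [0.01, 0.5].
edges, labels = sample_sbm(N=500, K=3, d=15, r=0.1)
n_intra = sum(labels[u] == labels[v] for u, v in edges)
n_inter = len(edges) - n_intra
```

With N = 500, K = 3 and d = 15 this gives p_intra = 0.09, so each node expects roughly d neighbors inside its own community, plus a number of inter-community neighbors that grows with r.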
Results. To evaluate the performance of our method in recovering the communities, we use the Adjusted Rand Index (ARI) 45. ARI compares the obtained community partition with the ground truth clustering. It takes values ranging from 0 to 1, where ARI = 0 is equivalent to random community assignment and ARI = 1 denotes perfect matching with the ground truth communities; hence, the higher this value, the better the recovery of communities. We test our algorithm for different types of information spreading in our OT-based model, as controlled by the parameter β, using the software developed in 46. We used β = 1, i.e., standard Wasserstein distance; β ∈ {0.1, 0.5} for congested transportation, enforcing broad spreading across the neighbors; and β ∈ {1.5, 2} to favor branching schemes, where fewer edges are used to decide which community a node should belong to. For the OT-based algorithms, we run 15 iterations and choose the one with the best ARI score. In some cases, high scores are reached in fewer iterations.
The results in Fig. 3 show the performance on both LFR and SBM benchmarks of OT-based methods: our method for various β and one based on the Sinkhorn algorithm (OTDSinkhorn) 47,48. Our main goal is to assess the impact of tuning between different transportation regimes (as done by β) in terms of community detection via OT principles. Nevertheless, to better contextualize the performance of OT-based algorithms in the wide spectrum of community detection methods, we also include comparisons with algorithms that are not OT-based. Namely, we consider a probabilistic model with latent variables (MT) 13, and two modularity-based algorithms, Label Propagation 14 and Infomap 17. Our algorithm outperforms OTDSinkhorn for various values of β in an intermediate regime where OT-based inference is not trivial. This happens in both the LFR and SBM benchmarks, as shown in Fig. 3. For lower and higher values of the parameters, performance is similar and close to the two extremes of ARI = 0 and 1. OT-based methods have a similar sharp decay in performance from the regime where inference is easy to the more difficult one, as also observed in 18. The other community detection methods have a smoother decay, but with lower performance in the regime where OT-based approaches thrive, except for Label Propagation and MT, which are more robust in this sense. In the intermediate regime where inference is not trivial (i.e. along the sharp decay of OT-based methods), we observe that different values of β give higher performance than OTDSinkhorn. For SBM the highest performance is achieved consistently for high β = 2, while for LFR the best β varies with µ. A qualitative example where ORC-Nextrout performs better than OTDSinkhorn, in an instance of LFR in this intermediate regime, is shown in Fig. 4.
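For reference, the ARI can be computed from the contingency table of the two partitions. The sketch below uses only the standard library and follows the usual pair-counting definition. The function name is illustrative, and library implementations such as the one in scikit-learn could be used instead.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same nodes."""
    n = len(labels_true)
    cells = Counter(zip(labels_true, labels_pred))  # contingency table entries
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)  # value expected by chance
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)


# Relabeling the communities does not change the score.
ari_same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Splitting one ground-truth community lowers the score.
ari_split = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
```

The second example shows the kind of penalty incurred when a method returns more communities than the ground truth, a situation discussed later for real networks.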

Analysis of real networks
Next, we evaluate our model on various real datasets 49 containing node metadata that can be used to assess the recovery of communities. While failing to recover communities that align well with node metadata should not be automatically interpreted as a model's failure 50 (e.g. the inferred communities and the chosen node metadata may capture different aspects of the data), having a reference community structure to compare against allows us to inspect quantitatively the differences between models. These real networks differ in structural features like number of nodes, average degree, number of communities and other standard network properties, as detailed in Table 1. Specifically, we consider i) a network of co-appearances of characters in the novel Les Misérables 51 (Les Miserables). Edges are built between characters that encounter each other. ii) A network of 62 bottlenose dolphins in a community living off Doubtful Sound, in New Zealand 52 (Dolphins). Nodes represent dolphins, and edges indicate frequent associations between them. This network is clustered into four groups, conjectured to arise from one population and three sub-populations based on the interactions between dolphins of different sex and ages 53. The dolphins were observed between 1994 and 2001. iii) A network of Division I matches of American Football during a regular season in the fall of 2000 54 (American football). Nodes represent teams and edges are games between teams. Teams can be clustered according to their football college conference memberships. iv) A network of books on US politics published around the 2004 presidential election and sold by an online bookseller 55 (Political books). Nodes represent the books, and edges between books indicate frequent co-purchasing by the same buyers. Books are clustered based on their political orientation as neutral, liberal or conservative.
OT-based algorithms outperform other community detection algorithms in detecting communities aligned with the node metadata, as shown in Fig. 5. In particular, ORC-Nextrout has the highest accuracy when considering the best performing β. The impact of tuning this parameter is noticeable from these plots, as the best performing value varies across datasets. In the Les Miserables and Dolphins networks, β < 1 has better performance, while in American Football the best performing value is for β > 1. Performance is similar across OT-based methods in the Political Books network. In Fig. 6 we show the communities detected by the best performing ORC-Nextrout version together with OTDSinkhorn and Infomap in Les Miserables and Political books. Focusing on Les Miserables, we see how ORC-Nextrout successfully detects three characters in the green communities, in particular a highly connected node in the center of the figure (in dark green). Notice that these are placed in the same (pink or black) community by OTDSinkhorn. Thus ORC-Nextrout achieves a higher ARI than OTDSinkhorn. Both OT-based approaches retrieve well the communities exhibiting clustering patterns, with many connections within the community. Instead, they both divide the communities with a hub-and-spokes structure, due to the lack of common connections within the group.
The communities detected in the Political books dataset highlight the tendency of OT-based methods to extract a larger number of communities (10) than those observed from node metadata (3). Among these extra communities, 3 are made of a few nodes, while 5 of them are made of one isolated node each. This is related to the fact that OT-based methods perform particularly well for networks with internally densely-connected community structures, but may be weaker for community structures that are sparsely connected 19. One could potentially assign these nodes to larger communities, for instance by preferential attachment as done in 19, thus in practice reducing the number of communities. Devising a principled method or criterion to do this automatically is an interesting topic for future work. This tendency is further corroborated by the fact that OT-based algorithms robustly recover the two communities that are mostly assortative (red and pink in the figure), while they struggle to recover the disassortative community depicted in the centre (yellow). This community has several connections with nodes in the other two communities and has been separated into smaller groups by OT-based approaches, as described above. This also highlights the need for methods that are robust against situations where a combination of assortative and disassortative communities coexist in a network.

Conclusion
Community detection on networks is a relevant and challenging open area of research. Several methods have been proposed to tackle this issue, with no "best algorithm" that fits well every type of data. We focused here on a recent line of work that exploits principles from Optimal Transport theory combined with the geometric concept of Ollivier-Ricci curvature applied to discrete graphs. Our method is flexible in that it tunes between different transportation regimes to extract the information necessary to compute the OR curvature on edges. On synthetic data, our model is able to identify communities more robustly than other OT-based methods based on the standard Wasserstein distance in the regime where inference is not trivial. On real data, our model shows either better or comparable performance in recovering community structure aligned with node metadata compared to other approaches, thanks to the ability to tune the parameter β. A relevant advantage of OT-based methods is that the number of communities is automatically learned from data, contrary to other approaches that need this as an input parameter. In this respect, our model has the tendency to overestimate this number, similarly to other OT-based methods. Understanding how to properly incorporate small-size communities into larger ones in a principled and automatic way is an interesting topic for future work. Similarly, it would be interesting to quantify the extent to which various β capture different network topologies. To address this, one could for instance use methods to calculate the structural distance between networks 57 and correlate this against the values of the best performing β. There are a number of directions in which this model could be extended. Nodes can be connected in more than one way, as in multilayer networks. Our model could be extended by considering a different β for each edge type, as done in 42. Similarly, real networks are often rich in additional information, e.g. attributes on nodes. It would be interesting to incorporate a priori additional information to inform community detection 58,59. This information can potentially be used to mitigate the problem of overestimation of the number of communities, as explained above.

Optimal Transport Formulation
Consider the probability distributions q that are defined on pairs of vertices and satisfy the constraints ∑_i q_ij = m_j, ∑_j q_ij = m_i. In other words, these are the joint distributions whose marginals are m_i and m_j. We call these distributions transport plans between m_i and m_j. The Optimal Transport problem we are interested in is that of finding a transport plan q* that minimises the quantity ∑_{i∼j} q_ij d_ij, where i ∼ j means that nodes i and j are neighbors and d_ij is the cost of transporting mass from i to j, e.g. the distance between these two nodes. The quantity W(m_i, m_j, d) := ∑_{i,j} q*_ij d_ij, defined for this optimal q*, is the Wasserstein distance between m_i and m_j.
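To make the notion of a transport plan concrete, the following sketch solves the smallest non-trivial case exactly: two distributions supported on two points each. In that case every feasible plan is determined by a single entry t = q_00, the cost is linear in t, and the minimum is attained at an endpoint of the feasible interval. This is only a worked toy example with hypothetical names, not the solver used in the paper.

```python
def wasserstein_2x2(a, b, d):
    """Exact optimal transport cost between two 2-point distributions.

    a, b : pairs (a0, a1) and (b0, b1), each summing to 1
    d    : 2x2 nested list, d[r][c] = cost of moving mass from r to c
    """
    a0, b0 = a[0], b[0]
    # Feasible range of t = q[0][0] that keeps all four entries non-negative.
    lo, hi = max(0.0, a0 + b0 - 1.0), min(a0, b0)

    def cost(t):
        # Rows of q sum to a, columns sum to b.
        q = [[t, a0 - t], [b0 - t, 1.0 - a0 - b0 + t]]
        return sum(q[r][c] * d[r][c] for r in range(2) for c in range(2))

    return min(cost(lo), cost(hi))  # a linear objective is minimal at an endpoint


d = [[0.0, 1.0], [1.0, 0.0]]
W = wasserstein_2x2((0.5, 0.5), (0.25, 0.75), d)
```

In this example a quarter of the mass has to move across a unit distance, so the optimal cost is 0.25. For larger supports the same problem is a linear program over all entries of q.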

The Dynamical Monge-Kantorovich model
It was recently proved 33,34 that solutions of the optimal transport problem previously stated can be found by turning that problem into a system of differential equations. This section is dedicated to describing this dynamical formulation.
Let G = (V, E, W) be a weighted graph, with N the number of nodes and E the number of edges in G. Let B be the signed incidence matrix of G. Let f^+ and f^− be two N-dimensional discrete distributions such that ∑_{i∈V} f_i = 0 for f = f^+ − f^−; let µ(t) ∈ R^E and u(t) ∈ R^N be two time-dependent functions defined on edges and nodes, respectively. The discrete Dynamical Monge-Kantorovich model can be written as:

$$ f_i = \sum_e B_{ie} \frac{\mu_e(t)}{w_e} \sum_j B_{ej} u_j(t), \qquad (3) $$
$$ \mu_e'(t) = \left[ \frac{\mu_e(t)}{w_e} \Big| \sum_j B_{ej} u_j(t) \Big| \right]^{\beta} - \mu_e(t), \qquad (4) $$
$$ \mu_e(0) > 0, \qquad (5) $$

where | · | is the element-wise absolute value. Equation (3) corresponds to Kirchhoff's law; Eq. (4) is the discrete dynamics, with β a traffic rate controlling the different routing optimization mechanisms; Eq. (5) is the initial distribution for the edge conductivities.
For β = 1 the dynamical system described by Eqs. (3)-(5) is known to reach a steady state, i.e., the updates of µ and u converge to stationary functions µ* and u* as t increases. The flux function q* defined as q*_e := µ*_e |u*_i − u*_j| / w_e, for an edge e = (i, j), is the solution of the optimal transport problem presented in the previous section. Notice that µ and u depend on the chosen traffic rate β, and thus so does q = q(β). Therefore we can introduce a generalized version of the distance W: W_β(m_i, m_j, w) := ∑_{i,j} q*_ij(β) w_ij.
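The dynamics can be illustrated with a forward-Euler sketch using NumPy. It assumes the model takes the form: Kirchhoff's law B diag(µ/w) Bᵀ u = f, the update µ' = (µ|Bᵀu|/w)^β − µ, and a positive initial µ. The step size `dt`, the number of `steps` and the function name are hypothetical choices, and this simple integrator is a stand-in for, not a reproduction of, the solver in the authors' software 46.

```python
import numpy as np


def dmk_flux(B, w, f, beta=1.0, dt=0.1, steps=2000):
    """Integrate the discrete DMK dynamics and return the flux on each edge.

    B : (N, E) signed incidence matrix, w : (E,) edge weights,
    f : (N,) forcing term with zero sum.
    """
    B = np.asarray(B, dtype=float)
    w = np.asarray(w, dtype=float)
    f = np.asarray(f, dtype=float)

    def potential(mu):
        # Kirchhoff's law; least squares copes with the Laplacian's null space.
        laplacian = (B * (mu / w)) @ B.T
        return np.linalg.lstsq(laplacian, f, rcond=None)[0]

    mu = np.ones_like(w)  # positive initial conductivities
    for _ in range(steps):
        u = potential(mu)
        flux = mu / w * np.abs(B.T @ u)
        mu = mu + dt * (flux ** beta - mu)  # explicit Euler step
    return mu / w * np.abs(B.T @ potential(mu))


# Triangle with edges (0,1), (1,2), (0,2): the direct edge (0,2) is expensive.
B = [[1, 0, 1], [-1, 1, 0], [0, -1, -1]]
w = [1.0, 1.0, 3.0]
f = [1.0, 0.0, -1.0]  # one unit of mass moves from node 0 to node 2
q = dmk_flux(B, w, f, beta=1.0)
cost = float(q @ np.asarray(w))
```

For β = 1 the flux concentrates on the two cheap edges and the cost approaches 2, the shortest-path cost from node 0 to node 2, which is the Wasserstein-1 cost in this toy problem. Other values of β can be passed to explore the congested and branched regimes.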
We then redefine the proposed Ollivier-Ricci curvature as:

$$ \kappa_\beta(i,j) := 1 - \frac{W_\beta(m_i, m_j, w)}{d_{ij}} . $$

Probability distributions on neighborhoods
ORC-Nextrout takes as input a graph and a forcing term. While the graph encapsulates the neighborhood information provided by the nodes i and j, the forcing function is related to the distributions one needs to transport. Analogously to what was proposed in 18, we define this graph to be the weighted complete bipartite graph B_ij = (V_ij, E_ij, ω_ij). The weights in ω_ij change iteratively based on the curvature. Notice that a bipartite graph must satisfy N(i) ∩ N(j) = ∅, which does not hold true if i and j have common neighbors (this is always the case since i ∈ N(j)). Nonetheless, this condition does not have great repercussions on the solution of the optimal transport problem, since the weights corresponding to these edges (of the form (i, i)) are equal to 0. As for the forcing function, we define it to be f := f^+ − f^− = m_i − m_j.

Other methods
To evaluate the performance of ORC-Nextrout, we compare it with some well-established community detection algorithms: Infomap 17, MULTITENSOR 13 (MT), discrete Ricci flow 18 (OTDSinkhorn), and Label propagation 14. We briefly describe each of these algorithms as follows:
• The Discrete Ricci flow (here addressed as OTDSinkhorn) 18 is an iterative node clustering algorithm that deforms edge weights as time progresses, by shrinking sparsely traveled links and stretching heavily traveled edges. These edge weights are iteratively updated based on neighborhood transportation Wasserstein costs, in a similar way to what is proposed in this manuscript. After a predefined number of iterations, heavily traveled links are removed from the graph. Communities are then obtained as the connected components of this modified network.
• MULTITENSOR (MT) 13 is an algorithm to find communities in multilayer networks. It is a probabilistic model with latent variables regulating community structure and runs with a complexity of O(EK) with assortative structure (as we consider here), where K is the number of communities. This model assumes that nodes can belong to multiple groups (mixed-membership). Here we apply it to single-layer networks, which are a particular case of multilayer networks.
• Infomap 17 employs an information-theoretic approach to community detection. This method uses the map equation to capture patterns of flow on a network. This flow is simulated using the paths traversed by random walkers. Based on the information-theoretic description of these paths, nodes with quick information flow among them are clustered into the same groups. The algorithm runs in O(E).
• Label propagation 14 assigns each node to the same community as the majority of its neighbors. It starts by initializing each node with a distinct label and converges when every node has the same label as the majority of its neighboring nodes. The algorithm has a complexity scaling as O(E).
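Of the baselines listed above, label propagation is simple enough to sketch directly. The version below is a deterministic variant written for reproducibility: nodes are visited in sorted order and ties are broken by the smallest label, whereas common implementations randomize both. The function name and the toy graph are illustrative assumptions.

```python
from collections import Counter


def label_propagation(adj, max_sweeps=100):
    """Deterministic label propagation on a dict {node: set of neighbors}."""
    labels = {v: v for v in adj}  # every node starts with its own label
    for _ in range(max_sweeps):
        changed = False
        for v in sorted(adj):
            if not adj[v]:
                continue  # isolated nodes keep their label
            counts = Counter(labels[u] for u in adj[v])
            best = max(counts.values())
            top = {lab for lab, c in counts.items() if c == best}
            if labels[v] not in top:
                labels[v] = min(top)  # deterministic tie-breaking (assumption)
                changed = True
        if not changed:
            break  # every node agrees with a majority label of its neighbors
    return labels


# Two disconnected triangles should end up with one label each.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1},
    3: {4, 5}, 4: {3, 5}, 5: {3, 4},
}
labels = label_propagation(adj)
```

Each sweep touches every edge a constant number of times, which is consistent with the O(E) scaling quoted above.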

Figure 1. Left): an example graph G where edges have unitary weights. Center): the edge (1, 5) (bold black line) is selected to define the OT problem between m_1 and m_5; neighborhoods of nodes 1 and 5 are highlighted with blue and red edges and are used to build the corresponding distributions m_1, m_5. Right): the complete bipartite graph B_15 where the OT problem is defined. The color intensity of the edges represents the distance between the associated nodes on the graph G, as shown by the colorbar. m_1 and m_5 are both defined for α = 0, i.e. no mass is left in 1 and 5.

Figure 2. Visualization of how β impacts an intra-community edge. (a) Example intra-community structure between nodes 6 and 7. (b) The weight of edge (6, 7) decreases when 0 < β < 0.6, while for 0.5 < β < 2.0 it reaches a minimum and then slightly increases again. This justifies the better performance in detecting communities obtained for higher values of β, as shown in Figs. 3a and 3b. (c) A similar decreasing behavior is observed for the β-Wasserstein cost: for intra-community edges, β > 1 consolidates traffic in the network as the Wasserstein cost stabilizes. (d-e) Example cost graph B_67 with fluxes solution of the OT problem (edge thickness is proportional to the amount of flux) in the regimes of small (d) and high (e) values of β.

Figure 3. Results on LFR and SBM synthetic data. Performance in detecting ground-truth communities is measured by the ARI score. Markers and shadows are the averages and standard deviations over 10 network realisations with the same value of the parameter used in generation. Marker shapes denote different algorithms. a) LFR graph with N = 500 nodes and different values of K ranging from 17 to 22. b) SBM with N = 500 nodes, K = 3 communities and average degree d = 15. The parameter r is the ratio of inter-community to intra-community edges.

Figure 4. Example of community structure on a synthetic LFR network. The rightmost panel shows the ground-truth community structures to be predicted in an LFR network generated using µ = 0.35. Square-shaped markers denote nodes that are assigned to communities different than those in ground-truth. In middle and last panels, ORC-Nextrout with β = 2 perfectly retrieves the 21 communities, while OTDSinkhorn predicts only 19 communities with an ARI score of 0.73, wrongly assigning ground-truth dark green and light brown (square-shaped) nodes to the light blue community.

Figure 5. Results on real data. Performance in terms of recovering communities using metadata information is calculated in terms of the ARI score. ORC-Nextrout shows competitive results against all methods, with different optimal β across datasets.

Figure 6. Communities in real networks. We show the communities inferred by ORC-Nextrout (β = 0.5, 1.5 for top and bottom rows respectively), OTDSinkhorn and Infomap and compare against those extracted using node attributes (GT). The visualization layout is given by the Fruchterman-Reingold force-directed algorithm 56; therefore, groups of well-connected nodes are located close to each other. Dark nodes represent individual nodes that are assigned to isolated communities by OT-based methods. Square-shaped markers denote nodes assigned to communities different than those obtained from node metadata.

Table 1. Real networks description. We report statistics for the real networks used in our experiments. N and E denote the number of nodes and edges, respectively. K is the number of communities in the ground truth data. AvgDeg, AvgBtw and AvgClust are the average degree, betweenness centrality and average clustering coefficient, respectively.