Exploiting symmetry in network analysis

Sánchez-García, Rubén J.

doi:10.1038/s42005-020-0345-z

Article
Open access
Published: 15 May 2020

Rubén J. Sánchez-García ORCID: orcid.org/0000-0001-6479-3028¹

Communications Physics volume 3, Article number: 87 (2020) Cite this article

Abstract

Virtually all network analyses involve structural measures between pairs of vertices, or of the vertices themselves, and the large amount of symmetry present in real-world complex networks is inherited by such measures. This has practical consequences that have not yet been explored in full generality, nor systematically exploited by network practitioners. Here we study the effect of network symmetry on arbitrary network measures, and show how this can be exploited in practice in a number of ways, from redundancy compression, to computational reduction. We also uncover the spectral signatures of symmetry for an arbitrary network measure such as the graph Laplacian. Computing network symmetries is very efficient in practice, and we test real-world examples up to several million nodes. Since network models are ubiquitous in the Applied Sciences, and typically contain a large degree of structural redundancy, our results are not only significant, but widely applicable.

Introduction

Network models of real-world complex systems have been extremely successful at revealing structural and dynamical properties of these systems¹. The success of this approach is due to its simplicity, versatility, and surprising universality, with common properties and principles shared by many disparate systems^2,3,4.

One property of interest is the presence of structural redundancies, which manifest themselves as symmetries in a network model. Symmetries relate to system robustness^5,6, as they identify structurally equivalent nodes, and can arise from replicative growth processes such as duplication⁷, evolution from basic principles⁸, or functional optimisation⁹, and can be arbitrarily generated in model graphs¹⁰. It has been shown that real-world networks possess a large number of symmetries^{8,11,12,13,14}, and that this has important consequences for network structural¹¹, spectral¹³, and dynamical^{15,16,17,18,19} properties, for instance cluster synchronisation^{14,20,21,22,23,24,25}.

Crucially, network symmetries are inherited by any measure or metric on the network, that is, any structural measurement between pairs of vertices (such as distances), vertex-valued measurements (such as centrality) or even matrices derived from the network (such as the graph Laplacian). However, the effects of symmetry on arbitrary network measures are not yet fully understood nor exploited in network analysis, even though the network symmetry of the large but sparse graphs typically found in applications can be effectively computed and manipulated.

In this article, we show how a network representation of an arbitrary pairwise measure inherits the symmetries of the original network, and uncover the structural and spectral signatures of symmetry on this network representation. Namely, for an arbitrary network measure, we identify subgraphs where the symmetry is generated (symmetric motifs) and their structure (Fig. 1a), use the network quotient to quantify the redundancy due to symmetry (Fig. 1b), develop general compression algorithms that eliminate this redundancy, and study the reduction in computational time obtained by exploiting the presence of symmetries. The eigenvalues and eigenvectors of a network measure also reflect the presence of symmetry: we show how symmetry explains most of the discrete spectrum of an arbitrary network measure, predict the most significant eigenvalues due to symmetry, and use this to develop a fast symmetry-based eigendecomposition algorithm. We achieve remarkable empirical results in our real-world test networks: compression factors up to 26% of the original size, over 90% of the discrete spectrum explained by symmetry, and full eigendecomposition computations in up to 13% of the original time, demonstrating the practical use of symmetry in network analysis. We also discuss the implications of network symmetry in vertex measures. We illustrate our approach in several network measures, providing novel results of independent interest for the shortest path distance, communicability, the graph Laplacian, closeness centrality, and eigenvector centrality. To facilitate dissemination, we provide full implementations of all the algorithms described in this article²⁶. Our results supersede^11,13 and help to understand other network symmetry results thereafter^{12,27,28,29,30,31}. We focus on structural and spectral properties, and symmetries commonly found in real-world networks; for a more general study of arbitrary symmetry in (networks of) dynamical systems, see refs. ^15,16,17,18. To keep our account as self-contained as possible, we include material well known in the algebraic graph theory literature, e.g.^32,33,34,35, without any originality claim.

**Fig. 1: Toy example of a symmetric network.**

Results

Symmetry in complex networks

The notion of network symmetry is captured by the mathematical concept of graph automorphism³². This is a permutation of the vertices (nodes) preserving adjacency, and can be expressed in matrix form using the adjacency matrix of the network. If a network (mathematically, a finite simple graph) ${\mathscr{G}}$ has n vertices, labelled 1 to n, its adjacency matrix A = (a_ij) is an n × n matrix with (i, j)-entry a_ij = 1 if there is an edge between nodes i and j, and zero otherwise. A graph automorphism σ is then a permutation, or relabelling, of the vertices v ↦ σ(v) such that (σ(i), σ(j)) is an edge only if (i, j) is an edge, or, equivalently, a_ij = a_σ(i)σ(j) for all i, j. In matrix terms, this can be written as

$$AP=PA$$

(1)

where P is the permutation matrix corresponding to σ, that is, the matrix with (i, j)-entry 1 if σ(i) = j, and 0 otherwise. The automorphisms of a graph form a mathematical structure called a group, the automorphism group of ${\mathscr{G}}$. In principle, any (finite) group G is the automorphism group of some graph ${\mathscr{G}}$³², but, in practice, real-world networks exhibit very specific types of symmetries generated at some small subgraphs called symmetric motifs¹¹. Namely, we can partition the vertex set V into the asymmetric core of fixed points V₀ (an automorphism σ moves a vertex i ∈ V if σ(i) ≠ i, and fixes it otherwise), and the vertex sets M_i of the symmetric motifs,

$$V={V}_{0}\cup {M}_{1}\cup \ldots \cup {M}_{m},$$

(2)

as shown in Fig. 1a for a toy example. Equation (2) is called the geometric decomposition of the network¹¹.

Real-world networks typically exhibit a core of fixed points (asymmetric core), and a large number of relatively small symmetric motifs, where all the network symmetry is generated, and hence the size of the automorphism group is often extremely large, in stark contrast to random graphs, typically asymmetric³⁸. However, each symmetry is the product (composition) of automorphisms permuting a very small number of vertices within a symmetric motif. For example, the toy graph in Fig. 1a has 2⁷ × 3! × 4! = 18,432 symmetries (size of the automorphism group) but they are generated by (all combinations of) just ten permutations, each permuting a few vertices within a symmetric motif (one permutation per motif except two for M₄, M₅, and M₇).

Each symmetric motif can be further subdivided into orbits of structurally indistinguishable nodes (shown by colour in Fig. 1a), which play the same structural role in the network and, therefore, contribute to network redundancy and thus to the robustness of the underlying system. Our notion of structurally indistinguishable nodes (nodes in the same orbit of the automorphism group) extends the notion of structurally equivalent nodes found in the social sciences³⁹, that is, nodes with the same set of neighbours. It is not equivalent: nodes in the same orbit may not have the same neighbours (e.g., M₁, M₆, or M₇ in Fig. 1a).
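As a small illustration of Eq. (1) and of orbits, the following sketch enumerates the automorphisms of a five-vertex toy graph by brute force. The graph, the function names and the brute-force search are assumptions made for illustration only; this is not the method used for the large networks in Table 1, and it is only feasible for very small graphs.

```python
from itertools import permutations

import numpy as np

# Hypothetical toy graph: path 0-1-2 with two leaves (3 and 4) attached to vertex 2.
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4)]
N = 5


def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected simple graph."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return A


def automorphisms(A):
    """All permutations sigma with a_ij = a_sigma(i)sigma(j) (brute force)."""
    n = A.shape[0]
    return [s for s in permutations(range(n))
            if np.array_equal(A[np.ix_(s, s)], A)]


def perm_matrix(sigma):
    """Permutation matrix with (i, j)-entry 1 if sigma(i) = j."""
    n = len(sigma)
    P = np.zeros((n, n), dtype=int)
    P[np.arange(n), list(sigma)] = 1
    return P


def orbits(autos, n):
    """Orbit of v: the set of images of v under every automorphism."""
    return sorted({tuple(sorted({s[v] for s in autos})) for v in range(n)})


A = adjacency(N, EDGES)
autos = automorphisms(A)
# Every automorphism should satisfy AP = PA.
all_commute = all(
    np.array_equal(A @ perm_matrix(s), perm_matrix(s) @ A) for s in autos
)
print(len(autos), orbits(autos, N), all_commute)
```

In this toy graph the only non-trivial symmetry swaps the two leaves, so the asymmetric core is {0, 1, 2} and there is a single symmetric motif with one orbit.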

Network symmetries of (possibly very large) real-world networks can be effectively computed, stored and manipulated (see “Methods”). For instance, we computed generators of the automorphism group, and the subsequent geometric decomposition, for real-world networks up to several million nodes and edges in a few seconds (see t₁ and t₂ in Table 1).

Table 1 Symmetry in some real-world networks. For each test network, we show the number of vertices (${n}_{{\mathcal{G}}}$), edges (${m}_{{\mathcal{G}}}$), number of generators (gen) of the automorphism group (sizes, 10¹⁵³ to 10^{197,552}, not shown), computing times of generators (t₁) and geometric decomposition (t₂), in seconds, number of symmetric motifs (sm) and proportion of basic symmetric motifs (bsm), proportion of vertices moved by an automorphism (mv), proportion of vertices (${\widetilde{n}}_{{\mathcal{Q}}}={n}_{{\mathcal{Q}}}/{n}_{{\mathcal{G}}}$) and edges (${\widetilde{m}}_{{\mathcal{Q}}}={m}_{{\mathcal{Q}}}/{m}_{{\mathcal{G}}}$) in the quotient, proportion of external edges in the sparse case (ext_s, in percentage), and of internal edges in the full case (int_f, closest power of 10), full compression ratio (${c}_{\text{full}}={\widetilde{n}}_{{\mathcal{Q}}}^{2}$), and spectral computational reduction ($sp={\widetilde{n}}_{{\mathcal{Q}}}^{3}$), all for the largest connected component. The proportion of vertices in the basic quotient (${\widetilde{n}}_{{{\mathcal{Q}}}_{\text{basic}}}$, not shown) is within 1% of ${\widetilde{n}}_{{\mathcal{Q}}}$ except for HumanDisease (${\widetilde{n}}_{{{\mathcal{Q}}}_{\text{basic}}}=52.2 \%$), OpenFlights (79.7%), USPowerGrid (91.6%) and WordNet (79.2%), and similar results hold for ${\widetilde{m}}_{{{\mathcal{Q}}}_{\text{basic}}}$. Data sets available at ref. ⁴⁰, except HumanDisease⁴¹, Yeast⁴², and HumanPPI⁴³. Computations on a desktop computer (3.2 GHz Intel Core i5 processor, 16 GB 1.6 GHz DDR3 memory). All networks are symmetric, although the amount of symmetry (as measured by mv or ${\widetilde{n}}_{{\mathcal{Q}}}$) ranges from several networks with 50% quotient reduction, to CaliforniaRoads with only 4% of vertices participating in any symmetry. However, the effect of compression and computational reduction multiplies as, e.g., ${c}_{\text{full}}={\widetilde{n}}_{{\mathcal{Q}}}^{2}$ and $sp={\widetilde{n}}_{{\mathcal{Q}}}^{3}$, achieving significant results for most of our test networks.

Full size table

Most symmetric motifs in real-world networks (typically over 90%, see the bsm column in Table 1) are of a very specific type, called basic¹¹: they are made of one or more orbits of the same size, and every permutation of the vertices in each orbit is realisable, that is, can be extended to a network automorphism (see Fig. 1a). Basic symmetric motifs (BSMs) have a very constrained structure¹³, which we will generalise to arbitrary network measures and exploit throughout this article. Non-basic symmetric motifs (typically branched trees, as M₇ in Fig. 1a) are called complex; they are rare and can either be studied on a case-by-case basis, or removed from the symmetry computation altogether (by ignoring the symmetries generated by them).

The definition of network automorphism in Eq. (1) carries over to an arbitrary n × n real matrix A = (a_ij). Any such matrix can be seen as the adjacency matrix of a network with n vertices labelled 1 to n, and an edge (link) from node i to node j with weight a_ij if a_ij ≠ 0, and no such edge if a_ij = 0. This means that an automorphism does not only preserve edges, but also their weights and directions. This may not be a realistic assumption for real-world weighted networks, where the weights often come from observational or experimental data, but it applies to the matrix representing a network structural measure, as we illustrate in Fig. 2 and explain next.

Structural network measures

A (pairwise) structural network measure is a function F(i, j) on pairs of vertices which satisfies

$$F(\sigma (i),\sigma (j))=F(i,j)\,{\rm{for}}\,\,{\rm{all}}\, \, i,j\in V$$

(3)

for all automorphisms $\sigma \in \,{\text{Aut}}\,({\mathscr{G}})$. Since automorphisms identify structurally indistinguishable vertices (i and σ(i)) and, similarly, edges ((i, j) and (σ(i), σ(j))), structural network measures are (edge) functions that depend on the network structure alone, and not, for example, on node or edge labels, or other meta-data. Most network measures are structural, including graph metrics (e.g., shortest path), and matrices algebraically derived from the adjacency matrix (e.g., Laplacian matrix). (We identify matrices M with pairwise measures via F(i, j) = [M]_ij.) In particular, structural measures are independent of the ordering or labelling of the vertices. In contrast, functions depending, explicitly or implicitly, on some vertex ordering or labelling, are not structural, for example the shortest path length through a given node.

We can encode a structural measure F as a network with adjacency matrix [F(A)]_ij = F(i, j) (see Fig. 2a for the adjacency matrix and Fig. 2b, c for two examples of structural measures), and write (3) in matrix form as

$$F(A)\ P=P\ F(A),$$

(4)

where P is the permutation matrix corresponding to σ. Comparing this to Eq. (1), we see that the network representation of F, $F({\mathscr{G}})$, with adjacency matrix F(A), inherits all the symmetries of ${\mathscr{G}}$. In particular, the network $F({\mathscr{G}})$ has the same decomposition into symmetric motifs Eq. (2), and orbits, as ${\mathscr{G}}$. The BSMs in $F({\mathscr{G}})$ must occur on the same vertices M_i, although they are now all-to-all weighted subgraphs in general (Fig. 2b). Nevertheless, they have a very constrained structure: the intra and inter orbit connectivity depends on two parameters only. Namely, each orbit in a BSM is uniquely determined by β = F(v_i, v_i) (the connectivity of a vertex with itself) and α = F(v_i, v_j), i ≠ j (the connectivity of a vertex with every other vertex in the orbit), for all v_i, v_j in the orbit. Similarly, the connectivity between two orbits Δ₁ and Δ₂ in the same BSM also depends on two parameters: after a suitable reordering Δ₁ = {v₁, …, v_n} and Δ₂ = {w₁, …, w_n}, we have δ = F(v_i, w_i) and γ = F(v_i, w_j) for all 1 ≤ i, j ≤ n. (For a proof, see Theorem 1 in “Methods.”) This can be observed in Fig. 2c and is represented schematically in Fig. 3a, b. In particular, each BSM takes a very constrained form in the quotient, as shown schematically in Fig. 3c, d.
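Equation (4) can be checked numerically on a small example. The sketch below, under the assumption of a hypothetical five-vertex toy graph whose only non-trivial automorphism swaps vertices 3 and 4, builds two structural measures (shortest path distance and the graph Laplacian) and verifies that both commute with the permutation matrix P. The helper names are illustrative, not part of the article's implementation.

```python
import numpy as np

# Hypothetical toy graph: path 0-1-2 with two leaves (3 and 4) attached to vertex 2.
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4)]
n = 5
A = np.zeros((n, n))
for i, j in EDGES:
    A[i, j] = A[j, i] = 1.0


def shortest_path_matrix(A):
    """All-pairs shortest path lengths (Floyd-Warshall, unweighted graph)."""
    D = np.where(A > 0, 1.0, np.inf)
    np.fill_diagonal(D, 0.0)
    for k in range(A.shape[0]):
        # Relax every pair (i, j) through the intermediate vertex k.
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D


def laplacian(A):
    """Graph Laplacian L = D - A, with D the diagonal matrix of degrees."""
    return np.diag(A.sum(axis=1)) - A


# The automorphism swapping the two leaves, as a permutation matrix.
sigma = [0, 1, 2, 4, 3]
P = np.zeros((n, n))
P[np.arange(n), sigma] = 1.0

D = shortest_path_matrix(A)
L = laplacian(A)
commutes = {name: bool(np.allclose(F @ P, P @ F))
            for name, F in [("distance", D), ("laplacian", L)]}

# Two-parameter structure of the orbit {3, 4} for the distance measure.
beta, alpha = D[3, 3], D[3, 4]
print(commutes, beta, alpha)
```

For the orbit {3, 4}, the distance measure gives β = F(v, v) = 0 and α = F(v, w) = 2, the two parameters that determine the orbit in the text above.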

**Fig. 3: Structure of a basic symmetric motif (BSM) for an arbitrary network measure F.**

The results in this article apply to arbitrary structural measures, although the two most common cases in practice are the following. We call F full if F(i, j) ≠ 0 for all i ≠ j ∈ V (e.g., a graph metric), and sparse if F(i, j) = 0 if a_ij = 0, for all i ≠ j ∈ V (e.g., the graph Laplacian). The graph representation of $F({\mathscr{G}})$ is an all-to-all weighted graph if F is full, and has a sparsity similar to ${\mathscr{G}}$ if F is sparse (cf. Fig. 2c).

From now on, we will assume that ${\mathscr{G}}$ is undirected and F is symmetric, F(i, j) = F(j, i), which may not be the case even if ${\mathscr{G}}$ is undirected (e.g., the transition probability of a random walker $F(i,j)=\frac{{a}_{ij}}{\,{\text{deg}}\,(i)}$), and discuss directed networks and asymmetric measures in the “Methods” section.

Quotient network

The formal procedure to quantify and eliminate structural redundancies in a network is via its quotient network. This is the graph with one vertex per orbit or fixed point (see Fig. 1b) and edges representing average connectivity. Formally, if A is the n × n adjacency matrix of a graph ${\mathscr{G}}$, the quotient network with respect to a partition of the vertex set V = V₁ ∪ … ∪ V_m is the graph ${\mathscr{Q}}$ with m × m adjacency matrix the quotient matrix Q(A) = (b_kl) defined by

$${b}_{kl}=\frac{1}{| {V}_{k}| }\sum _{{{{i\in {V}_{k}}\atop {j\in {V}_{l}}}}}{a}_{ij},$$

(5)

the average connectivity from a vertex in V_k to all vertices in V_l. There is an explicit matrix equation for the quotient. Consider the n × m characteristic matrix S of the partition, that is, [S]_ik = 1 if i ∈ V_k, and zero otherwise, and the diagonal matrix Λ = diag(n₁, …, n_m), where n_k = ∣V_k∣. Then

$$Q(A)={\Lambda }^{-1}{S}^{T}A\ S.$$

(6)

The quotient network is a directed and weighted network in general. An alternative is to use the symmetric quotient, with adjacency matrix Q_sym(A) = Λ^{−1/2}S^TASΛ^{−1/2}, which is weighted but undirected. Note that Q(A) and Q_sym(A) are spectrally equivalent matrices: they have the same eigenvalues, with eigenvectors related by the transformation v ↦ Λ^{1/2}v (from Q(A) to Q_sym(A)).
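The quotient construction of Eqs. (5) and (6) can be sketched in a few lines. The example below assumes a hypothetical toy graph and a hand-written orbit partition (neither comes from the article), builds S and Λ, and compares the spectra of Q(A) and Q_sym(A).

```python
import numpy as np

# Hypothetical toy graph: path 0-1-2 with two leaves (3 and 4) attached to vertex 2.
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4)]
ORBITS = [[0], [1], [2], [3, 4]]  # assumed known (vertices 3 and 4 are interchangeable)
n, m = 5, len(ORBITS)

A = np.zeros((n, n))
for i, j in EDGES:
    A[i, j] = A[j, i] = 1.0

# Characteristic matrix S: [S]_ik = 1 if vertex i belongs to orbit k.
S = np.zeros((n, m))
for k, orbit in enumerate(ORBITS):
    S[orbit, k] = 1.0

sizes = S.sum(axis=0)                    # orbit sizes n_k (column sums of S)
Lam_inv = np.diag(1.0 / sizes)
Lam_inv_sqrt = np.diag(1.0 / np.sqrt(sizes))

Q = Lam_inv @ S.T @ A @ S                            # Eq. (6), directed in general
Q_sym = Lam_inv_sqrt @ S.T @ A @ S @ Lam_inv_sqrt    # symmetric quotient

eig_Q = np.sort(np.linalg.eigvals(Q).real)
eig_Q_sym = np.linalg.eigvalsh(Q_sym)                # returned in ascending order

# An eigenvector u of Q_sym gives the eigenvector Lambda^{-1/2} u of Q.
lam, U = np.linalg.eigh(Q_sym)
V = Lam_inv_sqrt @ U
print(Q)
print(np.allclose(eig_Q, eig_Q_sym), np.allclose(Q @ V, V * lam))
```

Here the entry of Q from orbit {2} to orbit {3, 4} is 2 while the reverse entry is 1, which shows why the quotient is directed in general even when the parent network is not.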

In the context of symmetries, we will always refer to the quotient with respect to the partition of the vertex set into orbits. This quotient removes all the original symmetries from the network: if σ(v_i) = v_j, then v_i and v_j are in the same orbit and hence represented by the same vertex in the quotient network, which is then fixed by σ. We can, therefore, infer and quantify properties arising from redundancy alone by comparing a network to its quotient. The quotients of real-world networks are often significantly smaller (in vertex and edge size) than the original networks^11,12 (see ${\widetilde{n}}_{{\mathscr{Q}}}$ and ${\widetilde{m}}_{{\mathscr{Q}}}$ in Table 1), and this reduction quantifies the structural redundancy present in an empirical network. Not every real-world network is equally symmetric, and, in our test networks, we give examples of network quotient reductions ranging from about 50% to just 2%. Computing the network quotient involves multiplication by very sparse matrices (Λ is diagonal and S has one non-zero element per row) and hence is computationally efficient (a few seconds in all our test networks).

Redundancy in network measures

The amount of structural redundancy in a network (measured by ${\widetilde{n}}_{{\mathscr{Q}}}={n}_{{\mathscr{Q}}}/{n}_{{\mathscr{G}}}$) is amplified in the computation of a typical (full) network measure (see Eq. (7) below). It is therefore natural to ask how to quantify, and eliminate, the symmetry-induced redundancy. If a network has ${n}_{{\mathscr{G}}}$ vertices and ${n}_{{\mathscr{Q}}}$ orbits, there are ${n}_{{\mathscr{G}}}^{2}$ pairs of vertices but only ${n}_{{\mathscr{Q}}}^{2}$ pairs of orbits, achieving a reduction, or compression ratio, of

$${c}_{{\text{full}}}={\left(\frac{{n}_{{\mathscr{Q}}}}{{n}_{{\mathscr{G}}}}\right)}^{2}$$

(7)

for a full network measure, typically much smaller than the ratio ${\widetilde{n}}_{{\mathscr{Q}}}={n}_{{\mathscr{Q}}}/{n}_{{\mathscr{G}}}$. On the other hand, for a sparse network measure, we only need to consider edge values, hence the reduction is the ratio between the number of edges in the quotient and in the graph

$${c}_{\text{sparse}}=\frac{{m}_{{\mathscr{Q}}}}{{m}_{{\mathscr{G}}}}.$$

(8)

For an arbitrary network measure, its compression ratio, which measures the redundancy present (zero values excluded), will range between c_full and c_sparse. The compression ratios c_full and ${c}_{\text{sparse}}={\widetilde{m}}_{{\mathscr{Q}}}$ are shown in Table 1 for our test networks. We found a remarkable amount of redundancy (up to 70%) due to symmetry alone (Fig. 4).
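To make the arithmetic concrete, here is a minimal sketch that follows the Table 1 conventions (ñ_Q = n_Q / n_G, c_full = ñ_Q², c_sparse = m̃_Q = m_Q / m_G and sp = ñ_Q³). The function name and the input counts are hypothetical and do not correspond to any of the test networks.

```python
def compression_ratios(n_g, n_q, m_g, m_q):
    """Ratios in the Table 1 convention (smaller means more redundancy removed)."""
    n_tilde = n_q / n_g          # proportion of vertices kept in the quotient
    m_tilde = m_q / m_g          # proportion of edges kept in the quotient
    return {
        "n_tilde": n_tilde,
        "c_full": n_tilde ** 2,  # full measure: pairs of orbits vs pairs of vertices
        "c_sparse": m_tilde,     # sparse measure: quotient edges vs graph edges
        "sp": n_tilde ** 3,      # spectral reduction for a cubic-time computation
    }


# Hypothetical network whose quotient keeps half of the vertices.
example = compression_ratios(n_g=1000, n_q=500, m_g=4000, m_q=2400)
print(example)
```

With a quotient half the size of the network, a full measure needs only a quarter of the storage and a cubic-time computation an eighth of the time, which is the multiplying effect noted in the Table 1 caption.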

**Fig. 4: Redundancy in some real-world networks.**

Symmetry compression

A natural question, with practical consequences for network analysis, is whether we can easily “eliminate” the symmetry-induced redundancies. This means storing only one value of a network function for each orbit of structurally indistinguishable nodes or edges, all sharing the same such value. Although this has been explored in particular cases, such as shortest path distances²⁷, here we present a general treatment. A simple method is to use the quotient matrix

$$B={S}^{T}AS,$$

(9)

which is easier to store than Λ⁻¹S^TAS. This matrix achieves a compression ratio between c_full and c_sparse (by using a sparse representation of B), as explained before. From this matrix, we can recover all but the internal connectivity inside a symmetric motif, which is replaced by the average connectivity. Namely, let us define

$${\overline{a}}_{ij}=\frac{1}{{n}_{i}}\frac{1}{{n}_{j}}{b}_{kl},$$

(10)

where b_kl is the entry of B for the orbits containing v_i and v_j, and n_i, respectively, n_j, is the size of the orbit containing v_i, respectively, v_j (note that these orbit sizes can be obtained as the column sums of the characteristic matrix S). Then one can show (“Methods”, Theorem 2) that

$${\overline{a}}_{ij}=\left\{\begin{array}{lc}{a}_{ij}\hfill &\,{\text{if}}\; {v}_{i}\; {{\text{and}}}\; {v}_{j}\; {\text{are}}\; {\text{external}},\\ \frac{1}{{n}_{i}}\frac{1}{{n}_{j}}{\sum }_{{{{{v}_{k}\in {\Delta }_{1}}\atop {{v}_{l}\in {\Delta }_{2}}}}}{a}_{kl}& \;\, {\text{if}}\; {v}_{i}\; {\text{and}}\; {v}_{j}\; {\text{are}}\; {\text{internal,}}\end{array}\right.$$

(11)

where we call a pair of vertices external if they belong to two different symmetric motifs, and internal otherwise, and Δ₁ and Δ₂ are the orbits containing v_i and v_j, respectively. Hence, if we are not interested in the exact internal connectivity (inside a symmetric motif), or it can be recovered easily by other means (e.g., one motif at a time), we can use this simple method to eliminate all the symmetry-induced redundancies on an arbitrary network measure encoded as a matrix A. We have included simple average symmetry compression and decompression algorithms (Algs. 1 and 2), where A_avg is the matrix with entries ${\overline{a}}_{ij}$. The original ${n}_{{\mathscr{G}}}\times {n}_{{\mathscr{G}}}$ matrix A is stored using the ${n}_{{\mathscr{Q}}}\times {n}_{{\mathscr{Q}}}$ quotient matrix B plus a very sparse (n non-zero elements) characteristic matrix S.

Algorithm 1.

Average symmetry compression.

Algorithm 2.

Average symmetry decompression.
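The bodies of Algorithms 1 and 2 are not reproduced in this text, so the following is only a minimal sketch of average symmetry compression and decompression derived from Eqs. (9) to (11). The function names `compress` and `decompress`, the toy graph and the orbit partition are assumptions made for illustration.

```python
import numpy as np


def compress(A, orbits):
    """Return B = S^T A S (Eq. (9)) and the characteristic matrix S."""
    S = np.zeros((A.shape[0], len(orbits)))
    for k, orbit in enumerate(orbits):
        S[orbit, k] = 1.0
    return S.T @ A @ S, S


def decompress(B, S):
    """Return A_avg with entries b_kl / (n_k n_l), as in Eq. (10)."""
    sizes = S.sum(axis=0)                  # orbit sizes = column sums of S
    B_avg = B / np.outer(sizes, sizes)     # average value per pair of orbits
    return S @ B_avg @ S.T                 # repeat the average on each orbit pair


def shortest_path_matrix(A):
    """All-pairs shortest path lengths (Floyd-Warshall, unweighted graph)."""
    D = np.where(A > 0, 1.0, np.inf)
    np.fill_diagonal(D, 0.0)
    for k in range(A.shape[0]):
        D = np.minimum(D, D[:, [k]] + D[[k], :])
    return D


# Hypothetical toy graph: path 0-1-2 with two leaves (3 and 4) attached to vertex 2.
adj = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (2, 4)]:
    adj[i, j] = adj[j, i] = 1.0
ORBITS = [[0], [1], [2], [3, 4]]

D = shortest_path_matrix(adj)        # the network measure to be compressed
B, S = compress(D, ORBITS)
D_avg = decompress(B, S)

external = np.ones_like(D, dtype=bool)
external[3:5, 3:5] = False           # pairs inside the only symmetric motif
print(np.allclose(D_avg[external], D[external]), D_avg[3:5, 3:5])
```

In this example every external entry is recovered exactly, while the four internal entries (0 on the diagonal and 2 off the diagonal) are replaced by their average, 1.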

The vast majority of edges in the network representation of a network measure are external (at least 99.999% for a full measure in our test networks, see int_f in Table 1), and hence the information loss by using A_avg instead of A is minimal. We can nevertheless enforce lossless compression, by storing the intra-motif connectivity separately. Indeed, we can exploit the fact that most symmetric motifs in empirical networks are basic, and hence each orbit, or pair of orbits, is uniquely determined by two parameters (Fig. 3). If we disregard the symmetries generated at non-basic symmetric motifs, the corresponding quotient, called basic quotient, written ${{\mathscr{Q}}}_{\text{basic}}$, leaves non-basic motifs unchanged and retains most of the symmetry in a typical real-world network. By annotating this quotient, we can recover the original network representation of the network measure exactly. We have implemented lossless compression and decompression algorithms (“Methods”, Algorithms 6 and 7), and evaluated them in our test networks (Fig. 4).

Computational reduction

Network symmetries can also reduce the computational time of evaluating an arbitrary network measure F. By Eq. (3), we only need to evaluate F on orbits, resulting in a computational reduction ratio of between ${\widetilde{m}}_{{\mathscr{Q}}}$ and ${\widetilde{n}}_{{\mathscr{Q}}}^{2}$ (Table 1) for sparse, respectively full, network measures. Of course, this assumes that the computation on each pair of vertices F(i, j) is independent of one another, which is often not the case. Moreover, the calculation of F(i, j) is still performed on the whole network ${\mathscr{G}}$.

A more substantial computational reduction can be obtained by evaluating F on the (often much smaller) quotient network instead. We call F quotient recoverable if it can be applied to the quotient network ${\mathscr{Q}}$, and $F({\mathscr{G}})$ can be recovered from $F({\mathscr{Q}})$, for all networks ${\mathscr{G}}$. Note that this may involve, beyond evaluating $F({\mathscr{Q}})$, an independent (hence parallelizable) computation on each symmetric motif (typically a very small graph). By evaluating F in the quotient network, we can obtain very substantial computational time savings, depending on the amount of symmetry present and the computational complexity of F. Depending on the network measure, it may not be possible to recover $F({\mathscr{G}})$ exactly from $F({\mathscr{Q}})$, but only partially. We call a network measure F partially quotient recoverable if it can be applied to a quotient network ${\mathscr{Q}}$ of a network ${\mathscr{G}}$, and all the external edges of $F({\mathscr{G}})$ can be recovered from $F({\mathscr{Q}})$, for all networks ${\mathscr{G}}$. Since the quotient averages the network connectivity, we can often recover the average values of F within symmetric motifs. We call F average quotient recoverable if, in addition to external edges, the average intra-motif edges can be recovered from $F({\mathscr{Q}})$. A typical situation is when $F({\mathscr{Q}})$ equals the quotient of F, that is, in symbols,

$$F({\mathscr{Q}})=Q(F({\mathscr{G}})).$$

(12)

In the “Applications” section, we will show that communicability is average quotient recoverable, and shortest path distance is partially, but not average, quotient recoverable. Not every measure can be (partially) recovered from the quotient, for example the number of distinct paths between two vertices, as the internal connectivity within each symmetric motif is lost, and replaced by its average connectivity, in the quotient. Note that the word “partially” can be misleading: typically almost all edges are external (see ext_s and int_f in Table 1).
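Equation (12) can be checked numerically for communicability, taken here to be the matrix exponential of the adjacency matrix. The sketch below uses a hypothetical toy graph and a truncated Taylor series for the exponential, which is adequate only for small matrices with small entries; all names are illustrative.

```python
import numpy as np


def expm_series(M, terms=40):
    """Truncated Taylor series of the matrix exponential (small matrices only)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k      # term now holds M^k / k!
        out = out + term
    return out


def quotient(M, S):
    """Q(M) = Lambda^{-1} S^T M S for the partition encoded by S."""
    sizes = S.sum(axis=0)
    return (S.T @ M @ S) / sizes[:, None]


# Hypothetical toy graph: path 0-1-2 with two leaves (3 and 4) attached to vertex 2.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (2, 4)]:
    A[i, j] = A[j, i] = 1.0
S = np.zeros((5, 4))
for k, orbit in enumerate([[0], [1], [2], [3, 4]]):
    S[orbit, k] = 1.0

G = expm_series(A)                   # communicability on the full network
G_Q = expm_series(quotient(A, S))    # communicability on the quotient
eq12_holds = np.allclose(G_Q, quotient(G, S))

# External entry between vertex 2 and a vertex of the orbit {3, 4} (size 2).
recovered = G_Q[2, 3] / 2
print(eq12_holds, np.isclose(recovered, G[2, 3]))
```

The external value is recovered by dividing the quotient entry by the size of the target orbit, because Q(F) sums over the target orbit and averages over the source orbit.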

The resulting computational time reduction obtained by evaluating F in the quotient can be very substantial, as illustrated by several popular network measures in our test networks (Fig. 5).

**Fig. 5: Quotient computational reduction.**

Spectral signatures of symmetry

The spectrum of the network’s adjacency matrix relates to a multitude of structural and dynamical properties¹. The presence of symmetries is reflected in the spectrum of the network¹³, and indeed in the spectrum of any network measure. Symmetries give rise to high-multiplicity eigenvalues (shown as “peaks” in the spectral density) and, in fact, we can explain and predict most of the discrete part of the spectrum of an arbitrary network measure on a typical real-world network.

Let A be the n × n adjacency matrix of a (possibly weighted) network (such as the network representation of a network measure). First, note that symmetry naturally produces high-multiplicity eigenvalues, since

$$AP{\bf{v}}=PA{\bf{v}}=\lambda P{\bf{v}},$$

(13)

where (λ, v) is an eigenpair of A and P the permutation matrix of a network automorphism (Eq. (1)). This gives another eigenpair (λ, Pv) whenever v and Pv are linearly independent (obviously not always the case).

Let B = Q(A) be the m × m quotient of A (Eq. (6)) with respect to the partition of the vertex set into orbits. This partition satisfies a regularity condition called equitability³⁵, which can be written in matrix form as AS = SB, where S is the characteristic matrix of the partition. In particular, if (λ, v) is a quotient eigenpair, then (λ, Sv) is a parent eigenpair,

$$A(S{\bf{v}})=SB{\bf{v}}=\lambda (S{\bf{v}}).$$

(14)

In fact, one can show (“Methods”, Theorem 3) that A has an eigenbasis of the form

$$\{S{{\bf{v}}}_{1},\ldots ,S{{\bf{v}}}_{m},{{\bf{w}}}_{1},\ldots ,{{\bf{w}}}_{n-m}\},$$

(15)

where {v₁, …, v_m} is any eigenbasis of B, and S^Tw_j = 0 for all j. We can think of a vector ${\bf{v}}\in {{\mathbb{R}}}^{m}$, respectively ${\bf{w}}\in {{\mathbb{R}}}^{n}$, as a vector on (the vertices of) the quotient, respectively the parent, network. Then, each vector Sv_i equals the vector v_i lifted to the parent network by repeating the value on each orbit. Similarly, S^Tw_j = 0 means that the sum of the entries of w_j on each orbit is 0. All in all, we can always find an eigenbasis of A consisting of non-redundant eigenvectors {Sv₁, …, Sv_m} arising from a quotient eigenbasis by repeating values on each orbit, and redundant eigenvectors {w₁, …, w_{n−m}} arising from the network symmetries, which add up to zero on each orbit (hence “disappearing” in the quotient). Similarly, we call their respective eigenvalues non-redundant and redundant.
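The structure of the eigenbasis in Eq. (15) can be verified on a small case. The sketch below assumes a hypothetical toy graph with a single two-vertex orbit, checks the equitability condition AS = SB, lifts the quotient eigenvectors, and adds the one redundant eigenvector by hand; it is an illustration, not the article's implementation.

```python
import numpy as np

# Hypothetical toy graph: path 0-1-2 with two leaves (3 and 4) attached to vertex 2.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (2, 4)]:
    A[i, j] = A[j, i] = 1.0
S = np.zeros((5, 4))
for k, orbit in enumerate([[0], [1], [2], [3, 4]]):
    S[orbit, k] = 1.0
sizes = S.sum(axis=0)

B = (S.T @ A @ S) / sizes[:, None]          # quotient matrix Q(A)
equitable = np.allclose(A @ S, S @ B)       # the orbit partition is equitable

# Quotient eigenpairs, computed through the symmetric quotient for stability.
B_sym = (S.T @ A @ S) / np.sqrt(np.outer(sizes, sizes))
lam, U = np.linalg.eigh(B_sym)
V = U / np.sqrt(sizes)[:, None]             # eigenvectors of B
lifted = S @ V                              # non-redundant eigenvectors of A

# Redundant eigenvector: +1 and -1 on the orbit {3, 4}, zero elsewhere.
w = np.array([0.0, 0.0, 0.0, 1.0, -1.0])

basis = np.column_stack([lifted, w])
print(equitable,
      np.allclose(A @ lifted, lifted * lam),
      np.allclose(S.T @ w, 0.0),
      np.allclose(A @ w, 0.0 * w),
      np.linalg.matrix_rank(basis))
```

The redundant eigenvector has eigenvalue 0 here, it sums to zero on every orbit, and together with the four lifted eigenvectors it spans all of ℝ⁵.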

Analogous to the way that symmetry is generated at symmetric motifs, the redundant eigenvectors and eigenvalues arise directly from certain eigenvectors and eigenvalues of the symmetric motifs, considered as networks on their own (Fig. 6). In fact, each symmetric motif ${\mathscr{M}}$ contributes the same (called redundant) eigenpairs to any network containing ${\mathscr{M}}$ as a symmetric motif: One can show (“Methods”, Theorem 4) that if ${\mathscr{M}}$ is a symmetric motif of a network ${\mathscr{G}}$ and (λ, w) is a redundant eigenpair of ${\mathscr{M}}$ (that is, the values of w add up to zero on each orbit of ${\mathscr{M}}$), then $(\lambda ,\widetilde{{\bf{w}}})$ is an eigenpair of ${\mathscr{G}}$, where $\widetilde{{\bf{w}}}$ is equal to w on (the vertices of) ${\mathscr{M}}$, and zero elsewhere. We call such a vector $\widetilde{{\bf{w}}}$ localised on the motif ${\mathscr{M}}$¹³, as it is zero outside the motif. Moreover, if ${\mathscr{M}}$ has n vertices and k orbits, then its eigenbasis contains n − k redundant eigenpairs, which are inherited by any network containing ${\mathscr{M}}$ as a symmetric motif (Fig. 6, Theorem 4 in “Methods”).

Furthermore, since most symmetric motifs in real-world networks are basic, and thus have a very constrained structure (Fig. 3), we can in fact determine the redundant spectrum of BSMs with up to a few orbits, that is, we can predict where the most significant “peaks” in the spectral density of an arbitrary network function will occur. The formulae for the redundant spectra of BSMs with one or two orbits (which covers most BSMs, up to 99% of them in our test networks) are given in Table 2.

Table 2 Redundant spectra of basic symmetric motifs (BSMs) with one or two orbits. A BSM with one orbit is a uniform graph ${K}_{n}^{\alpha ,\beta }$ with n vertices and adjacency matrix ${A}_{n}^{\alpha ,\beta }=({a}_{ij})$, where a_ij = α if i ≠ j and a_ii = β, for all i, j and some constants α and β. A BSM with two orbits consists of the (γ, δ)-uniform join of two uniform graphs ${K}_{n}^{{\alpha }_{1},{\beta }_{1}}$ and ${K}_{n}^{{\alpha }_{2},{\beta }_{2}}$, that is, the graph with 2n vertices and block adjacency matrix (after a suitable labelling of the vertices) of the form $\left(\begin{array}{ll}A & C \\ C & B\end{array}\right)$, where $A={A}_{n}^{{\alpha }_{1},{\beta }_{1}}$, $B={A}_{n}^{{\alpha }_{2},{\beta }_{2}}$ and $C={A}_{n}^{\gamma ,\delta }$, each defined as above. We write e_i for the vector with non-zero entries 1 at position 1, and − 1 at position i (2 ≤ i ≤ n), κ₁ and κ₂ for the two solutions of the quadratic equation cκ² + (b − a)κ − c = 0, where a = α₁ − β₁, b = α₂ − β₂ and c = γ − δ, and use (v∣w) to represent the concatenation of two vectors.

Full size table

We now give more details of the computation of the redundant spectrum of BSMs up to two orbits (Table 2), with full details in the “Methods” section. A BSM with one orbit is an (α, β)-uniform graph ${K}_{n}^{\alpha ,\beta }$ with adjacency matrix ${A}_{n}^{\alpha ,\beta }=({a}_{ij})$ given by a_ij = α and a_ii = β for all i ≠ j. Then ${K}_{n}^{\alpha ,\beta }$ has eigenvalues (n − 1)α + β (non-redundant), with multiplicity 1, and − α + β (redundant), with multiplicity n − 1. The corresponding eigenvectors are ${\bf{1}}$, the constant vector of ones (non-redundant), and e_i, the vectors with non-zero entries 1 at position 1, and − 1 at position i, 2 ≤ i ≤ n (redundant). For unweighted graphs without loops (β = 0, α ∈ {0, 1}), we recover the redundant eigenvalues 0 and − 1 predicted in ref. ¹³.
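The one-orbit case is easy to confirm numerically. The sketch below builds the matrix of a uniform graph for hypothetical values of n, α and β (chosen only for illustration) and compares its spectrum and eigenvectors with the formulae above.

```python
import numpy as np


def uniform_graph(n, alpha, beta):
    """Adjacency matrix with alpha off the diagonal and beta on the diagonal."""
    return alpha * (np.ones((n, n)) - np.eye(n)) + beta * np.eye(n)


def e(i, n):
    """Vector with 1 at position 1 and -1 at position i (1-based, 2 <= i <= n)."""
    v = np.zeros(n)
    v[0], v[i - 1] = 1.0, -1.0
    return v


n, alpha, beta = 4, 2.0, 0.5       # hypothetical parameters
M = uniform_graph(n, alpha, beta)

computed = np.linalg.eigvalsh(M)   # ascending order
predicted = np.sort([beta - alpha] * (n - 1) + [(n - 1) * alpha + beta])

redundant_ok = all(
    np.allclose(M @ e(i, n), (beta - alpha) * e(i, n)) for i in range(2, n + 1)
)
constant_ok = np.allclose(M @ np.ones(n), ((n - 1) * alpha + beta) * np.ones(n))
print(computed, predicted, redundant_ok, constant_ok)
```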

A BSM with two orbits must be a uniform join of the form ${K}_{n}^{{\alpha }_{1},{\beta }_{1}}\mathop{\leftrightarrow }\limits^{\gamma ,\delta }{K}_{n}^{{\alpha }_{2},{\beta }_{2}}$ (Fig. 3). Let κ₁ and κ₂ be the two solutions of the quadratic equation cκ² + (b − a)κ − c = 0, where a = α₁ − β₁, b = α₂ − β₂ and c = γ − δ. Then, the redundant eigenvalues of this BSM are (“Methods”, Theorem 5)

$${\lambda }_{1}=-b-c{\kappa }_{1}=\frac{-(a+b)+\sqrt{{(a-b)}^{2}+4{c}^{2}}}{2},\,{\text{and}}\,$$

(16)

$${\lambda }_{2}=-b-c{\kappa }_{2}=\frac{-(a+b)-\sqrt{{(a-b)}^{2}+4{c}^{2}}}{2},$$

(17)

each with multiplicity n − 1, with eigenvectors (κ₁e_i∣e_i) and (κ₂e_i∣e_i), respectively, 2 ≤ i ≤ n. For unweighted graphs without loops, we recover the redundant eigenvalues predicted in ref. ¹³, that is,

$$-2,\,-\varphi ,\,-1,\,0,\,\varphi -1\,{\text{and}}\,1,$$

(18)

where $\varphi =\frac{1+\sqrt{5}}{2}$, the golden ratio.
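The two-orbit formulae can be checked numerically in the same spirit. The sketch below is an illustration under stated assumptions: the example motif (a triangle whose vertices each carry one pendant vertex, so α₁ = 1, α₂ = 0, γ = 0, δ = 1 and all other constants 0) and the helper names are ours, and the block layout [[A, C], [C, B]] is assumed with C symmetric.

```python
import math

def redundant_eigenvalues(a, b, c):
    """Eqs. (16)-(17) with a = alpha1 - beta1, b = alpha2 - beta2, c = gamma - delta."""
    root = math.sqrt((a - b) ** 2 + 4 * c ** 2)
    return (-(a + b) + root) / 2, (-(a + b) - root) / 2

def uniform_join(n, alpha1, beta1, alpha2, beta2, gamma, delta):
    """Block matrix [[A, C], [C, B]] of a (gamma, delta)-uniform join."""
    def block(alpha, beta):
        return [[beta if i == j else alpha for j in range(n)] for i in range(n)]
    A, B, C = block(alpha1, beta1), block(alpha2, beta2), block(gamma, delta)
    return [A[i] + C[i] for i in range(n)] + [C[i] + B[i] for i in range(n)]

n = 3
M = uniform_join(n, 1, 0, 0, 0, 0, 1)   # triangle joined to 3 isolated vertices by a matching
a, b, c = 1 - 0, 0 - 0, 0 - 1
lam1, lam2 = redundant_eigenvalues(a, b, c)
phi = (1 + math.sqrt(5)) / 2

def residual(lam):
    """Largest entry of M v - lam v over the vectors v = (kappa e_i | e_i), 2 <= i <= n."""
    kappa = (-b - lam) / c              # from lambda = -b - c * kappa; requires c != 0
    worst = 0.0
    for i in range(1, n):               # 0-based index of the -1 entry
        e = [0.0] * n
        e[0], e[i] = 1.0, -1.0
        v = [kappa * x for x in e] + e
        Mv = [sum(m * x for m, x in zip(row, v)) for row in M]
        worst = max(worst, max(abs(y - lam * x) for y, x in zip(Mv, v)))
    return worst

r1, r2 = residual(lam1), residual(lam2)
```

For this motif the two redundant eigenvalues are φ − 1 and −φ, two of the values listed in Eq. (18), and both residuals vanish up to rounding.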

Eigendecomposition algorithm

Decoupling the contribution to the network spectrum from the symmetric motifs and from the quotient network, as explained above, naturally leads to an eigendecomposition algorithm that exploits the presence of symmetries: the spectrum and eigenbasis of an undirected network (equivalently, a diagonalisation of its adjacency matrix A = UDU^T) can be obtained from those of the quotient, and of the symmetric motifs, reducing the computational time (cubic on the number of vertices) to up to a third in our test networks (Fig. 5, left column of the spectral case), in line with our predictions ($sp={\widetilde{n}}_{{\mathscr{Q}}}^{3}$ in Table 1). The algorithm is shown and explained below. A MATLAB implementation is available at a public repository²⁶.

Our eigendecomposition algorithm (Algorithm 3) applies to any undirected matrix with symmetries (identifying a matrix with the network it represents). It first computes the eigendecomposition of the quotient matrix, then, for each motif, the redundant eigenpairs. Namely, it first computes the spectral decomposition eig of the symmetric quotient B_sym = Λ^−1/2S^TASΛ^−1/2, where Λ is the diagonal matrix of the orbit sizes (which can be obtained as the column sums of S). This matrix is symmetric and has the same eigenvalues as the left quotient. Moreover, if ${B}_{\text{sym}}={U}_{q}{D}_{q}{U}_{q}^{-1}$ then the left quotient eigenvectors are the columns of Λ^−1/2U_q. These become, in turn, eigenvectors of A by repeating their values on each orbit, and can be obtained mathematically by left multiplying by the characteristic matrix S. Then, for each motif, we compute the redundant eigenpairs using a null space matrix (explained below), storing eigenvalues and localised (zero outside the motif) eigenvectors.

Only redundant eigenvectors of a symmetric motif (that is, those which add up to zero on each orbit) become eigenvectors of A by extending them as zero outside the symmetric motif. Therefore, we need to construct redundant eigenvectors from the output of eig on each motif (the spectral decomposition of the corresponding submatrix). If ${U}_{\lambda }=\left(\begin{array}{ccc}{{\bf{v}}}_{1}&\ldots &{{\bf{v}}}_{k}\end{array}\right)$ are λ-eigenvectors of a symmetric motif with characteristic matrix of the orbit partition S_sm, we need to find linear combinations such that

$${S}_{{\text{sm}}}^{T}\left({\alpha}_{1}{{\bf{v}}}_{1}+\ldots +{\alpha}_{k}{{\bf{v}}}_{k}\right)={\bf{0}}\ \iff \ {S}_{{\text{sm}}}^{T}{U}_{\lambda }\left(\begin{array}{c}{\alpha}_{1}\\ \vdots \\ {\alpha }_{k}\end{array}\right)={\bf{0}}.$$

(19)

Therefore, if the matrix Z ≠ 0 represents the null space of ${S}_{\,\text{sm}\,}^{T}{U}_{\lambda }$, that is, ${S}_{\,\text{sm}\,}^{T}{U}_{\lambda }Z=0$ and Z^TZ = I (orthonormal columns), then the columns of U_λZ are precisely the redundant eigenvectors. This is implemented in Algorithm 3 within the innermost for loop.

Algorithm 3.

Eigendecomposition algorithm.
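The steps described above can be sketched in Python with NumPy as follows. This is a reading of the prose under stated assumptions, not a transcription of the authors' MATLAB code: the function name `symmetry_eig`, its arguments, the tolerance and the small example graph are ours, the orbits and motifs are assumed to be precomputed, and the null space is obtained here from a singular value decomposition.

```python
import numpy as np

def symmetry_eig(A, orbits, motifs, tol=1e-9):
    """Eigenpairs of a symmetric matrix A from its quotient and symmetric motifs.

    orbits: list of vertex lists (fixed points are orbits of size 1).
    motifs: list of lists of orbit indices, one list per symmetric motif.
    Both are assumed to be given (hypothetical inputs for this sketch).
    """
    n, nq = A.shape[0], len(orbits)
    S = np.zeros((n, nq))                      # characteristic matrix of the orbit partition
    for k, orbit in enumerate(orbits):
        S[orbit, k] = 1.0
    inv_sqrt = 1.0 / np.sqrt(S.sum(axis=0))    # diagonal of Lambda^{-1/2}

    # Quotient part: B_sym = Lambda^{-1/2} S^T A S Lambda^{-1/2}.
    B_sym = inv_sqrt[:, None] * (S.T @ A @ S) * inv_sqrt[None, :]
    vals_q, U_q = np.linalg.eigh(B_sym)
    values = list(vals_q)
    vectors = [S @ (inv_sqrt[:, None] * U_q)]  # lift: repeat values on each orbit

    # Motif part: keep only eigenvectors that sum to zero on every orbit.
    for motif in motifs:
        verts = [v for k in motif for v in orbits[k]]
        S_sm = S[np.ix_(verts, motif)]
        vals, U = np.linalg.eigh(A[np.ix_(verts, verts)])
        start = 0
        while start < len(vals):
            stop = start
            while stop < len(vals) and abs(vals[stop] - vals[start]) < tol:
                stop += 1
            U_lam = U[:, start:stop]           # basis of one eigenspace of the motif
            _, s, Vt = np.linalg.svd(S_sm.T @ U_lam)
            Z = Vt[int(np.sum(s > tol)):].T    # orthonormal null-space basis
            if Z.shape[1] > 0:
                block = np.zeros((n, Z.shape[1]))
                block[verts, :] = U_lam @ Z    # localised: zero outside the motif
                vectors.append(block)
                values.extend([vals[start]] * Z.shape[1])
            start = stop
    return np.array(values), np.hstack(vectors)

# Hypothetical example: hub 0 with pendant vertices 1, 2, 3 and a triangle 0, 4, 5.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (4, 5)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
orbits = [[0], [1, 2, 3], [4, 5]]
motifs = [[1], [2]]                            # each motif consists of a single orbit
vals, U = symmetry_eig(A, orbits, motifs)
```

In this example the quotient contributes three eigenpairs and the two motifs contribute the redundant eigenvalues 0 (twice) and −1, giving a full orthonormal eigenbasis of the 6 × 6 adjacency matrix.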

Vertex measures

We have so far considered network measures of the form F(i, j), where i and j are vertices. However, many important network measurements are vertex based, that is, of the form G(i) for each vertex i. We say that a vertex measure G is structural if it only depends on the network structure and, therefore, satisfies

$$G(i)=G(\sigma (i))$$

(20)

for each automorphism $\sigma \in \,{\text{Aut}}\,({\mathscr{G}})$, that is, it is constant on orbits (Fig. 1).

Although for vertex measures we do not have a network representation, we can still exploit the network symmetries. First, G needs only to be computed/stored once per orbit, resulting in a reduction/compression ratio of ${\widetilde{n}}_{{\mathscr{Q}}}={n}_{{\mathscr{Q}}}/{n}_{{\mathscr{G}}}$ (Table 1).

Secondly, when quotient recovery holds (that is, we can recover G from its values on the quotient and symmetry information alone), it amounts to a further computational reduction (Fig. 5), depending on the computational complexity of G. Finally, many vertex measures arise nevertheless from a pairwise function, such as G(i) = F(i, i) (subgraph centrality from communicability), or $G(i)=\frac{1}{n}{\sum }_{j}F(i,j)$ (closeness centrality from shortest path distance), allowing the symmetry-induced results on F to carry over to G.

Applications

We illustrate our methods on several popular pairwise and vertex-based network measures. Although novel and of independent interest, these are example applications: Our methods are general and the reader should be able to adapt our results to the network measure of their interest.

Adjacency matrix: the methods in this paper can be applied to the network itself, that is, to its adjacency matrix. We recover the structural and spectral results in refs. ^11,13, and the quotient compression ratio reported in ref. ¹², here ${c}_{\text{sparse}}={\widetilde{m}}_{{\mathscr{Q}}}$ in Table 1. The network (adjacency) eigendecomposition can be significantly sped up by exploiting symmetries (Fig. 5).

Communicability: communicability is a very general choice of structural measure, consisting of any analytic function f(x) = ∑a_nxⁿ applied to the adjacency matrix, $f(A)=\mathop{\sum }\nolimits_{n = 0}^{\infty }{a}_{n}{A}^{n},$ and it is a natural measure of network connectivity, since the matrix power A^k counts walks of length k³⁷. The most common choice of coefficients is ${a}_{n}=\frac{1}{n!}$, which gives the exponential matrix ${e}^{A}=\mathop{\sum }\nolimits_{n = 0}^{\infty }\frac{{A}^{n}}{n!}$. Communicability is a structural network measure and its network representation, the graph $f({\mathscr{G}})$ with adjacency matrix f(A), inherits all the symmetries of ${\mathscr{G}}$ and thus it has the same symmetric motifs and orbits. The BSMs are uniform joins of orbits, and each orbit is a uniform graph (Figs. 3 and 2b) characterised by the communicability of a vertex to itself (a natural measure of centrality³⁶), and the communicability between distinct vertices. As a full network measure, the compression ratio c_full applies (Table 1), indicating the fraction of storage needed by using the quotient to eliminate redundancies (Fig. 4). Moreover, average quotient recovery holds for communicability since f(Q(A)) = Q(f(A)) (Methods, Theorem 6). Alternatively, we can use the spectral decomposition algorithm on the adjacency matrix (A = UDU^T implies f(A) = Uf(D)U^T) reducing the computation, typically cubic on the number of vertices, by $sp={\widetilde{n}}_{{\mathscr{Q}}}^{3}$ (Table 1 and Fig. 5). For the spectral results, note that f(A) = Uf(D)U^T has eigenvalues f(λ), and the same eigenvectors, as A. Thus,

$$f(-2),f(-\varphi ),f(-1),f(0),f(\varphi -1),\,{\text{and}}\,f(1)$$

(21)

account for most of the discrete part of the spectrum f(A), for the adjacency matrix A of a typical (undirected, unweighted) real-world network (Eq. (18)).
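The identity f(Q(A)) = Q(f(A)) behind average quotient recovery can be illustrated on a small example. The sketch below makes several assumptions of our own: Q(M) is taken to be the left quotient Λ⁻¹SᵀMS, f is a truncated exponential series (30 terms), and the graph (hub 0 with pendant vertices 1, 2, 3 and a triangle 0, 4, 5) and its orbit partition are hypothetical inputs.

```python
def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def exp_series(M, terms=30):
    """Truncated series sum_{k < terms} M^k / k!, an approximation of e^M."""
    n = len(M)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]
    for k in range(1, terms):
        power = [[x / k for x in row] for row in matmul(power, M)]
        result = [[r + p for r, p in zip(rr, pr)] for rr, pr in zip(result, power)]
    return result

def quotient(M, orbits):
    """Left quotient Lambda^{-1} S^T M S: block sums divided by the row-orbit size."""
    return [[sum(M[i][j] for i in Vk for j in Vl) / len(Vk) for Vl in orbits]
            for Vk in orbits]

edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (4, 5)]
orbits = [[0], [1, 2, 3], [4, 5]]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

fA = exp_series(A)                          # communicability of the parent network
lhs = exp_series(quotient(A, orbits))       # f(Q(A)), computed on a 3 x 3 matrix
rhs = quotient(fA, orbits)                  # Q(f(A))
max_diff = max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))
```

Here `max_diff` is zero up to rounding, and the entries of `fA` are constant on orbits, for instance the self-communicability of the three pendant vertices coincides.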

Shortest path distance: this is the simplest metric on a (connected) network, namely the length of a shortest path between vertices. A path of length n is a sequence (v₁, v₂, …, v_n+1) of distinct vertices, except possibly v₁ = v_n+1, such that v_i is connected to v_i+1 for all 1 ≤ i ≤ n. The shortest path distance ${d}^{{\mathscr{G}}}(u,v)$ is the length of the shortest (minimal length) path from u to v. If p = (v₁, v₂, …, v_n) is a path and $\sigma \in \,{\text{Aut}}\,({\mathscr{G}})$, we define σ(p) = (σ(v₁), σ(v₂), …, σ(v_n)), also a path since σ is a bijection that preserves adjacency.

One can show that (i) automorphisms preserve shortest paths and their lengths; (ii) shortest paths between vertices in different symmetric motifs do not contain intra-orbit edges; and (iii) shortest path distance is a partially quotient recoverable structural measure (“Methods”, Theorem 7). In particular, automorphisms σ preserve the shortest path metric, $d(i,j)=d\left(\sigma (i),\sigma (j)\right)$, and we can compute shortest distances from the quotient,

$${d}^{{\mathscr{G}}}(\alpha ,\beta )={d}^{{\mathscr{Q}}}(i,j),\quad \alpha \in {V}_{i},\beta \in {V}_{j},$$

(22)

whenever V_i and V_j are orbits in different symmetric motifs. This accounts for all but the (small) intra-motif distances and reduces the computation as shown in Fig. 5.
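A small breadth-first-search experiment illustrates Eq. (22). It is only a sketch under our own assumptions: the example tree, its orbit partition, and the treatment of the quotient as an unweighted graph on the orbits are chosen here for illustration. In this example every symmetric motif is a single orbit, so all pairs of vertices in different orbits are covered by Eq. (22).

```python
from collections import deque

def bfs(adj, source):
    """Shortest path lengths from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Hypothetical tree: leaves 0, 1 on vertex 2, the path 2-3-4, and leaves 5, 6, 7 on vertex 4.
edges = [(0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (4, 7)]
orbits = [[0, 1], [2], [3], [4], [5, 6, 7]]
orbit_of = {v: k for k, orbit in enumerate(orbits) for v in orbit}

adj = {v: set() for v in orbit_of}
q_adj = {k: set() for k in range(len(orbits))}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)
    ku, kv = orbit_of[u], orbit_of[v]
    if ku != kv:                       # inter-orbit edges define the quotient adjacency
        q_adj[ku].add(kv)
        q_adj[kv].add(ku)

d_parent = {v: bfs(adj, v) for v in adj}
d_quot = {k: bfs(q_adj, k) for k in q_adj}

# Pairs in different orbits whose parent distance differs from the quotient distance.
mismatches = [(u, v) for u in adj for v in adj
              if orbit_of[u] != orbit_of[v]
              and d_parent[u][v] != d_quot[orbit_of[u]][orbit_of[v]]]
```

The list `mismatches` is empty, while the intra-motif distance between the twin leaves 0 and 1 (which is 2) is not visible in the quotient, where both vertices collapse to one node.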

Distances between points within the same motif cannot in general be directly recovered from the quotient, not even for BSMs. (Consider for instance the double star, motif M₁, in Fig. 1: the distance from the top red to the bottom blue vertex is three, while in the quotient it is one.) In general, therefore, the shortest path distance is partially, but not average, quotient recoverable. Intra-motif distances, if needed, could still be recovered one motif at a time.

Note that these results can be exploited for other graph-theoretic notions defined in terms of distance, for example eccentricity (and thus radius or diameter), which only depends on maximal distances and thus it can be computed directly in the quotient.

In terms of symmetry compression, the compression ratio c_full applies, accounting for the amount of structural redundancy due solely to symmetries. The spectral results, although perhaps less relevant, still apply for $d({\mathscr{G}})$, the graph encoding pairwise shortest path distances. The adjacency matrix $d(A)=({d}^{{\mathscr{G}}}(i,j))$ is non-zero outside the diagonal, hence $d({\mathscr{G}})$ is an all-to-all weighted network without self-loops and with integer weights, and so is each symmetric motif. Using the formula in Table 2, we can easily compute values of the most significant part of the discrete spectrum (redundant eigenvalues) of d(A), namely −3, −2, −1, 0, $-2\pm \sqrt{2}$, $-3\pm \sqrt{2}$, $\frac{-3\pm \sqrt{5}}{2}$, $\frac{-5\pm \sqrt{5}}{2}$, and $\frac{-5\pm \sqrt{13}}{2}$.

Laplacian matrix: the Laplacian matrix of a network L = D − A, where D is the diagonal matrix of vertex degrees, is a (sparse) network measure and therefore inherits all the symmetries of the network. The matrix L can be seen as the adjacency matrix of a network ${\mathscr{L}}$ with identical symmetric motifs, except that all edges are weighted by −1 and all vertices have self-loops weighted by their degrees in ${\mathscr{G}}$ (Fig. 2c). In particular, the motif structure (namely, the self-loop weights) depends on how the motif is embedded in the network ${\mathscr{G}}$.

Quotient compression and computational reduction are less useful in this case, however the spectral results are more interesting. The spectral decomposition applies, and we can compute redundant Laplacian eigenvalues directly from Table 2, for instance positive integers for BSMs with one orbit (“Methods”, Corollary 2). This explains and predicts most of the “peaks” (high-multiplicity eigenvalues) in the Laplacian spectral density, confirmed on our test networks (Fig. 7). Using the formula in Table 2, one can similarly compute the redundant spectrum for 2-orbit BSMs, and for other versions of the Laplacian (e.g., normalised, vertex weighted). Finally, observe that the spectral decomposition applies, thus Algorithm 3 provides an efficient way of computing the Laplacian eigendecomposition with an expected $sp={\tilde{n}}_{{\mathscr{Q}}}^{3}$ (see Table 1) computational time reduction.
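To make the one-orbit Laplacian case concrete, the following sketch (plain Python, with a hypothetical example graph and helper names of our own) reads α and β off the Laplacian entries of an orbit and checks that −α + β is a Laplacian eigenvalue with a localised eigenvector.

```python
# Hypothetical example: hub 0 with pendant vertices 1, 2, 3 and a triangle 0, 4, 5.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (4, 5)]
n = 6
A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = A[j][i] = 1
deg = [sum(row) for row in A]
L = [[(deg[i] if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]

def redundant_value(orbit):
    """-alpha + beta for a one-orbit BSM of the Laplacian network (orbit size >= 2)."""
    i, j = orbit[0], orbit[1]
    alpha, beta = L[i][j], L[i][i]     # edge weight (0 or -1) and self-loop (degree)
    return -alpha + beta

def is_eigenpair(lam, orbit):
    """Check L v = lam v for v = +1 on orbit[0], -1 on orbit[1], 0 elsewhere."""
    v = [0] * n
    v[orbit[0]], v[orbit[1]] = 1, -1
    Lv = [sum(a * x for a, x in zip(row, v)) for row in L]
    return Lv == [lam * x for x in v]

lam_leaves = redundant_value([1, 2, 3])   # alpha = 0, beta = 1
lam_pair = redundant_value([4, 5])        # alpha = -1, beta = 2
```

The pendant orbit gives the redundant Laplacian eigenvalue 1 and the adjacent pair gives 3, both positive integers as stated above.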

**Fig. 7: Spectral signatures of network symmetry.**

Commute distance and matrix inversion: the commute distance is the expected time for a random walker to travel between two vertices and back⁴⁴. In contrast to the shortest path distance, it is a global metric, which takes into account all possible paths between two vertices. The commute distance is equal up to a constant (the volume of the network) to the resistance metric r⁴⁵, which can be expressed in terms of ${L}^{\dagger }=({l}_{ij}^{\dagger })$, the pseudoinverse (or Moore-Penrose inverse) of the Laplacian, as $r(i,j)={l}_{ii}^{\dagger }+{l}_{jj}^{\dagger }-2{l}_{ij}^{\dagger }$. The commute (or resistance) distance is a (full) structural measure, and all our structural and spectral results apply. Crucially, we can use the eigendecomposition algorithm (Algorithm 3) to obtain L = UDU^T (and hence L^† = UD^†U^T, and r) from the quotient and symmetric motifs, resulting in significant computational gains (Fig. 5). More generally, if M_F is the matrix representation of a network measure, its pseudoinverse ${M}_{F}^{\dagger }$ is also a network measure, and the comments above apply. Note that ${M}_{F}^{\dagger }$ is generally a full measure even if M_F is sparse (the inverse of a sparse matrix is not generally sparse).

Vertex symmetry compression: as a vertex measure G is constant on orbits, we only need to store one value per orbit. Let S be the characteristic matrix of the partition of the vertex set into orbits, and Λ the diagonal matrix of orbit sizes (column sums of S). If G is represented by a vector v = (G(i)) of length ${n}_{{\mathscr{G}}}$, we can store one value per orbit by taking w = Λ⁻¹S^Tv, a vector of length ${n}_{{\mathscr{Q}}}$, and recover v = Sw (“Methods”, Theorem 9).
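In code, compression amounts to averaging over each orbit and recovery to repeating each stored value on its orbit, which is multiplication by the characteristic matrix S. A minimal sketch, with hypothetical function names and example data of our own:

```python
def compress(values, orbits):
    """w = Lambda^{-1} S^T v: the average of v over each orbit (one value per orbit)."""
    return [sum(values[i] for i in orbit) / len(orbit) for orbit in orbits]

def expand(w, orbits, n):
    """v = S w: repeat the stored value on every vertex of the orbit."""
    v = [None] * n
    for value, orbit in zip(w, orbits):
        for i in orbit:
            v[i] = value
    return v

orbits = [[0], [1, 2, 3], [4, 5]]   # hypothetical orbit partition of 6 vertices
degree = [5, 1, 1, 1, 2, 2]         # a structural vertex measure, constant on orbits
w = compress(degree, orbits)        # 3 stored values instead of 6
recovered = expand(w, orbits, len(degree))
```

Because the measure is constant on orbits, the round trip is lossless; for a vector that is not constant on orbits, `compress` would instead return orbit averages.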

Degree centrality: the degree of a node (in- or out-degree if the network is directed) is a natural measure of vertex centrality. As expected, the degree is preserved by any automorphism σ, which can also be checked directly,

$${d}_{i}=\sum_{j\in V}{a}_{ij}=\sum_{j\in V}{a}_{\sigma (i)\sigma (j)}=\sum_{j\in V}{a}_{\sigma (i)j}={d}_{\sigma (i)},$$

(23)

as automorphisms permute the vertices (so j ∈ V and σ(j) ∈ V are the same elements but in a different order). In particular, the degree is constant on orbits. We recover the degree centrality from the quotient as the out-degree (“Methods”, Proposition 2).
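The recovery of the degree from the quotient can be checked on a toy example. In this sketch the quotient is assumed to be the left quotient Λ⁻¹SᵀAS (so that its row sums are out-degrees), and the graph and orbit partition are hypothetical.

```python
# Hypothetical example: hub 0 with pendant vertices 1, 2, 3 and a triangle 0, 4, 5.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (4, 5)]
orbits = [[0], [1, 2, 3], [4, 5]]
n = 6
A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = A[j][i] = 1
degree = [sum(row) for row in A]

# Left quotient: entry (k, l) is the number of neighbours in orbit l of a vertex in orbit k.
Q = [[sum(A[i][j] for i in Vk for j in Vl) / len(Vk) for Vl in orbits] for Vk in orbits]

# Out-degree in the (directed, weighted) quotient, self-loops included.
out_degree = [sum(row) for row in Q]
consistent = all(degree[i] == out_degree[k] for k, Vk in enumerate(orbits) for i in Vk)
```

Note that the quotient self-loop on the orbit {4, 5}, which records the edge inside that orbit, contributes to the out-degree.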

Closeness centrality: the closeness centrality of a node i in a graph ${\mathscr{G}}$, $c{c}^{{\mathscr{G}}}(i)$, is the average shortest path length to every node in the graph. As symmetries preserve distance, they also preserve closeness centrality, explicitly,

$$cc(i)=\frac{1}{{n}_{{\mathscr{G}}}}\sum _{j\in V}d(i,j)=\frac{1}{{n}_{{\mathscr{G}}}}\sum _{j\in V}d(\sigma (i),\sigma (j))=\frac{1}{{n}_{{\mathscr{G}}}}\sum_{j\in V}d(\sigma (i),j)=cc(\sigma (i)),$$

(24)

and centrality is constant on each orbit, as expected. Moreover, closeness centrality can be recovered from the quotient (shortest paths do not contain intra-orbit edges, except between vertices in the same symmetric motif, see above), as

$$c{c}^{{\mathscr{G}}}(i)=\sum_{l\ne k}\frac{{n}_{l}}{{n}_{{\mathscr{G}}}}{d}^{{\mathscr{Q}}}(k,l)+\frac{{n}_{i}}{{n}_{{\mathscr{G}}}}{d}_{k}$$

(25)

if i belongs to the orbit V_k and d_k is the average intra-motif distance, that is, the average distance from a vertex in V_k to any vertex in ${\mathscr{M}}$, the motif containing V_k. By annotating each orbit by d_k, we can recover closeness centrality exactly. Alternatively, as d_k ≪ n (note that d_k ≤ m if ${\mathscr{M}}$ has m orbits), we can approximate $c{c}^{{\mathscr{G}}}(i)$ by the first summand, or simply by the quotient centrality $c{c}^{{\mathscr{Q}}}(\alpha )$, in most practical situations.

Betweenness centrality: this is the sum of proportions of shortest paths between pairs of vertices containing a given vertex. It can be computed from shortest path distances and number of shortest paths⁴⁶, both pairwise structural measures, reducing the computation of a naive O(n³) time, O(n²) space implementation by ${\widetilde{n}}_{{\mathscr{Q}}}^{3}$ and ${\widetilde{n}}_{{\mathscr{Q}}}^{2}$. It would be interesting to adapt a faster algorithm, e.g., ref. ⁴⁶ to exploit symmetries, but this is beyond our scope.

Eigenvector centrality: eigenvector centrality is obtained from a Perron–Frobenius eigenvector (i.e., of the largest eigenvalue) of the adjacency matrix of a connected graph¹. Since this eigenvalue must be simple, it cannot be a redundant eigenvalue. Hence it is a quotient eigenvalue, and, as those are a subset of the parent eigenvalues, it must still be the largest (hence the Perron–Frobenius) eigenvalue of the quotient. Its eigenvector can then be lifted to the parent network, by repeating entries on orbits. That is, if (λ, v) is the Perron–Frobenius eigenpair of the quotient, then (λ, Sv) is the Perron–Frobenius eigenpair of the parent network. In practice, we use the symmetric quotient B_sym = Λ^−1/2S^TASΛ^−1/2 for numerical reasons (Algorithm 4). Hence the computation (quadratic time by power iteration) can be reduced by ${\widetilde{n}}_{{\mathscr{Q}}}^{2}$ (Fig. 5).

Algorithm 4.

Eigenvector centrality from the quotient network.
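The procedure described in the text (power iteration on the symmetric quotient, then lifting to the parent network) can be sketched as follows. This is our own illustrative reading, not the published Algorithm 4: the function name, the fixed iteration count and the example graph are assumptions, and plain power iteration is assumed to converge, which requires a connected, non-bipartite network.

```python
import math

def eigenvector_centrality_from_quotient(A, orbits, iterations=500):
    """Power iteration on B_sym, followed by lifting to the parent network."""
    sizes = [len(orbit) for orbit in orbits]
    nq = len(orbits)
    # B_sym = Lambda^{-1/2} S^T A S Lambda^{-1/2}
    B = [[sum(A[i][j] for i in orbits[k] for j in orbits[l]) / math.sqrt(sizes[k] * sizes[l])
          for l in range(nq)] for k in range(nq)]
    u = [1.0] * nq
    for _ in range(iterations):
        u = [sum(b * x for b, x in zip(row, u)) for row in B]
        norm = math.sqrt(sum(x * x for x in u))
        u = [x / norm for x in u]
    Bu = [sum(b * x for b, x in zip(row, u)) for row in B]
    lam = sum(x * y for x, y in zip(u, Bu))        # Rayleigh quotient of the unit vector u
    # Left-quotient eigenvector Lambda^{-1/2} u, repeated on each orbit (multiplication by S).
    centrality = [0.0] * len(A)
    for k, orbit in enumerate(orbits):
        for i in orbit:
            centrality[i] = u[k] / math.sqrt(sizes[k])
    return lam, centrality

# Hypothetical example: hub 0 with pendant vertices 1, 2, 3 and a triangle 0, 4, 5.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (4, 5)]
orbits = [[0], [1, 2, 3], [4, 5]]
A = [[0] * 6 for _ in range(6)]
for i, j in edges:
    A[i][j] = A[j][i] = 1

lam, c = eigenvector_centrality_from_quotient(A, orbits)
Ac = [sum(a * x for a, x in zip(row, c)) for row in A]
residual = max(abs(y - lam * x) for y, x in zip(Ac, c))
```

The iteration runs on a 3 × 3 matrix rather than the 6 × 6 adjacency matrix, and the lifted vector satisfies the eigenvector equation of the parent network up to rounding.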

Discussion

We have presented a general theory to describe and quantify the effects of network symmetry on arbitrary network measures, and explained how this can be exploited in practice in a number of ways.

Network symmetry of the large but sparse graphs typically found in applications can be effectively computed and manipulated, making it an inexpensive pre-processing step. We showed that the amount of symmetry is amplified in a pairwise network measure but can be easily discounted using the quotient network. We can for instance eliminate the symmetry-induced redundancies, or use them to simplify the calculation by avoiding unnecessary computations. Symmetry has also a profound effect on the spectrum, explaining the characteristic “peaks” observed in the spectral densities of empirical networks, and occurring at values we are able to predict.

Our framework is very general and applies to any pairwise or vertex-based network measure beyond the ones we discuss as examples. We emphasised practical and algorithmic aspects throughout, and provide pseudocode and full implementations²⁶. Since real-world network models and data are very common, and typically contain a large degree of structural redundancy, our results should be relevant to any network practitioner.

Methods

Geometric decomposition and symmetric motifs

We write $\,{\text{Aut}}\,({\mathscr{G}})$ for the automorphism group of an (unweighted, undirected, possibly very large) network ${\mathscr{G}}=(V,E)$ (see below for a discussion of directed and weighted networks). Each automorphism (symmetry) $\sigma \in \,{\text{Aut}}\,({\mathscr{G}})$ is a permutation of the vertices and its support is the set of vertices moved by σ,

$$\,{\text{supp}}\,(\sigma )=\{i\in V\,{\text{such}}\;{ \text{that}}\,\sigma (i)\, \ne \, i\}.$$

(26)

Two automorphisms σ and τ are support-disjoint if the intersection of their supports is empty, $\,{\text{supp}}\,(\sigma )\cap \,{\text{supp}}\,(\tau )={{\emptyset}}$. The orbit of a vertex i is the set of vertices to which i can be moved by an automorphism, that is,

$$\{\sigma (i)\;{\text{such}}\;{ \text{that}}\;\sigma \in {\text{Aut}}\,({\mathscr{G}})\}.$$

(27)
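Both notions are straightforward to compute once a set of generators is known. The sketch below assumes a representation of our own choosing (each generator is a Python dict mapping a vertex to its image, listing moved vertices only) and hand-written generators for a small hypothetical network. It uses the fact that the orbits of the generated group are the connected components of the relation linking i to σ(i) for each generator σ.

```python
def support(sigma):
    """Vertices moved by the permutation sigma (a dict vertex -> image)."""
    return {i for i, j in sigma.items() if i != j}

def vertex_orbits(vertices, generators):
    """Orbits of the group generated by `generators`, computed with union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for sigma in generators:
        for i, j in sigma.items():
            parent[find(i)] = find(j)       # i and sigma(i) lie in the same orbit
    classes = {}
    for v in vertices:
        classes.setdefault(find(v), []).append(v)
    return sorted(classes.values())

# Hypothetical generators: the transpositions (1 2), (2 3) and (4 5) on vertices 0..5.
generators = [{1: 2, 2: 1}, {2: 3, 3: 2}, {4: 5, 5: 4}]
orbit_list = vertex_orbits(range(6), generators)
```

Here vertex 0 is fixed by every generator and therefore forms an orbit of size one.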

One can show¹¹ that there is a partition of a set X of generators of $\,{\text{Aut}}\,({\mathscr{G}})$ into its finest support-disjoint classes X = X₁ ∪ … ∪ X_m, which is unique up to permutation of the sets X_i. The vertex sets ${M}_{i}={\cup }_{\sigma \in {X}_{i}}\,{\text{supp}}\,(\sigma )$ give the geometric decomposition Eq. (2), and the subgraphs induced by them are, by definition, the symmetric motifs of ${\mathscr{G}}$. (The next section explains how to compute the geometric decomposition in practice.) Since support-disjoint automorphisms must commute (the order in which they are composed is irrelevant), the subgroups of $\,{\text{Aut}}\,({\mathscr{G}})$ generated by X₁ to X_m, call them H₁ to H_m, give a direct product decomposition $\,{\text{Aut}}\,({\mathscr{G}})={H}_{1}\times \ldots \times {H}_{m}$. The geometric decomposition is defined from the finest support-disjoint partition of a special set of generators (called essential), as explained in ref. ¹¹. However, the results in this article are valid for any support-disjoint decomposition of any set of generators (essential or not) of $\,{\text{Aut}}\,({\mathscr{G}})$.

If all the orbits of a symmetric motif have the same size k and every permutation of the vertices in each orbit can be extended to a network automorphism supported on the motif, we call the symmetric motif basic (or BSM) of type k. (In particular, the corresponding subgroup H_i must be Sym(k), the symmetric group of all permutations of k elements.) If a symmetric motif is not basic, we call it complex or of type 0.

Network symmetry computation

First, we compute a list of generators of the automorphism group from an edge list (we use saucy3⁴⁷, which is extremely fast for the large but sparse networks typically found in applications). Then, we partition the set of generators X into support-disjoint classes X = X₁ ∪ … ∪ X_m, that is, σ and τ are support-disjoint whenever σ ∈ X_i, τ ∈ X_j and i ≠ j. To find the finest such partition, we use a bipartite graph representation of vertices V and generators X. Namely, let ${\mathscr{B}}$ be the graph with vertex set V ∪ X and edges between i and σ whenever i ∈ supp(σ). Then X₁, …, X_m are the connected components of ${\mathscr{B}}$ (as vertex sets intersected with X). Each X_i corresponds to the vertex set M_i of a symmetric motif ${{\mathscr{M}}}_{i}$, as ${M}_{i}={\bigcup }_{\sigma \in {X}_{i}}\,{\text{supp}}\,(\sigma ).$ Finally, we use GAP⁴⁸ to compute the orbits and type of each symmetric motif (Algorithm 5). Full implementations of all the procedures outlined above are available at a public repository²⁶.
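The partition step can be sketched as follows. This is an illustration under our own assumptions, not the repository code: generators are given as dicts of moved vertices (written by hand here rather than produced by saucy), and the connected components of the bipartite graph are found by merging generators that share a vertex in their supports.

```python
def motif_vertex_sets(generators):
    """Finest support-disjoint classes X_i (as generator indices) with their vertex sets M_i."""
    supports = [{i for i, j in sigma.items() if i != j} for sigma in generators]
    parent = list(range(len(generators)))

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]   # path halving
            g = parent[g]
        return g

    # Bipartite graph on vertices and generators: vertex i is joined to generator g
    # when i lies in supp(g). Two generators sharing a vertex fall in the same component.
    owner = {}                              # vertex -> one generator that moves it
    for g, supp in enumerate(supports):
        for i in supp:
            if i in owner:
                parent[find(g)] = find(owner[i])
            else:
                owner[i] = g

    classes = {}
    for g in range(len(generators)):
        classes.setdefault(find(g), []).append(g)
    return [(gens, sorted(set().union(*(supports[g] for g in gens))))
            for gens in classes.values()]

# Hypothetical generators: (1 2), (4 5) and (2 3) on a network with vertices 0..5.
generators = [{1: 2, 2: 1}, {4: 5, 5: 4}, {2: 3, 3: 2}]
motifs = motif_vertex_sets(generators)
```

The first and third generators share vertex 2 and are grouped together, giving the motif vertex set {1, 2, 3}, while the second generator forms its own class with vertex set {4, 5}.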