Symmetries and cluster synchronization in multilayer networks

Real-world systems in epidemiology, social sciences, power transportation, economics and engineering are often described as multilayer networks. Here we first define and compute the symmetries of multilayer networks, and then study the emergence of cluster synchronization in these networks. We distinguish between independent layer symmetries, which occur in one layer and are independent of the other layers, and dependent layer symmetries, which involve nodes in different layers. We study stability of the cluster synchronous solution by decoupling the problem into a number of independent blocks and assessing stability of each block through a Master Stability Function. We see that blocks associated with dependent layer symmetries have a different structure to the other blocks, which affects the stability of clusters associated with these symmetries. Finally, we validate the theory in a fully analog experiment in which seven electronic oscillators of three kinds are connected with two kinds of coupling.

In Table 1 of the main manuscript we compute the symmetries of several real multilayer networks. In Supplementary Table 1 we provide an extensive description of each dataset. In particular, we list information about what the nodes, the edges, and the layers represent in each dataset, as well as whether the edges are weighted or directed.
To provide a better connection between real multilayer networks and cluster synchronization, we focus on the US power-grid dataset [1]. The network is composed of N b = 4492 buses (layer b) and N g = 449 generators (layer g), which are connected in the power-grid network via both interlayer end intralayer links. We assume that all the buses have identical power demand P b i = −100 kW , while each generator can supply roughly P g i = 1M W . We thus suppose to have N b + N g rotatory machines, with equal inertia (I b i = I g i = 10 7 ton m −2 ) and damping coefficient (γ b i = γ g i = 10 5 ton s m −2 ). Finally, we assume the admittance on all the power-grid connections is equal to 10 4 M S. The power-flow in the network can be described by the well-known swing equation [2,3,4,5], and, the system dynamics is described bẏ is the phase of the signal at the bus/generator with respect to a nominal signal. Figure 1 shows the power-grid network (left) together with the time evolutions of the phases for the four largest clusters of the multi-layer network (right). As can be seen from the right panel, after a transient, the nodes in each cluster reach CS. London [6] Transportation • Edges encode existing routes between stations.
3 layers: Each node represents a train station in London.
• Edges weights indicate multiple routes.

Underground aggregated lines
• Edges are undirected 2. Overground 3. DLR EU-Air [7] Transportation • Edges stand for direct flights between two airports 37 layers Each node represents an airport.
• Edges are unweighted (either flight is present or not) • Each layer corresponds to a different airline.

layers
Each node represents a co-author • Edges are weighted.
• Each layer corresponds to a different arXiv category.
PIERRE AUGER [8] Coauthorship • Edges stand for coauthorship connections 16 layers Each node represents a co-author • Edges are weighted.
• Each layer corresponds to a different working task within the Pierre Auger collaboration.
CKM Physicians [9] Social • Edges represent information being exchanged regarding a specific drug.

layers
Each node represents a physician.
• Each layer corresponds to a different type of social interaction: • Edges are directed.

layers
Each node represents a protein.
• Each layer corresponds to interactions of a different nature.
• Edges are directed.

layers
Each node represents a protein.
• Each layer corresponds to interactions of a different nature.
• Edges are directed.

layers
Each node represents a protein.
• Each layer corresponds to interactions of a different nature.
• Edges are directed.

layers
Each node represents either a generator or a load.
• Each layer corresponds to a different node kind (generator or load.)

Supplementary Note 2: Symmetries of Model Multiplex Networks
In this section, we compute the symmetries of model multiplex networks, with M = 2 layers, constructed as follows. Both layers are Scale Free (SF) networks generated by using the static model [12] as implemented in [13], with the same average degree κ av = 5 and the same number of nodes N = 1000, but variable power-law degree distribution exponents γ 1 and γ 2 . We randomly matched each node i = 1, ..., N from layer α with another node from layer β to obtain a multiplex network. The inter-layer coupling A αβ ij = A αβ ji = 1 if node i from layer α was matched to node j from layer β, and A αβ ij = A αβ ji = 0 otherwise. In case the resulting multiplex network was not connected, only the largest connected component was considered. Supplementary Figure 2 shows the number of symmetries of the resulting multiplex network as a function of the pair (γ 1 , γ 2 ), 2 ≤ γ 1 , γ 2 ≤ 3, averaged over 200 realizations. We observe a large number of symmetries when both the power low exponents γ 1 and γ 2 are small. When both exponents exceed 2.6, the number of symmetries (except for the identity) is typically zero. The case of Erdos Renyi layers is recovered in the limit of very large exponent γ, for which symmetries are very rare.
Supplementary Figure 3: An example of a multi-layer graph, color coded according to the orbits of its automorphism group. There are two types of nodes, denoted as circles and rectangles. There are also three types of edges, denoted as solid, dashed, and dotted lines connecting pairs of nodes. The orbits of the automorphism group are colored as red, green, and blue.

Supplementary Note 3: Generating Multilayer Networks with Symmetries
Consider a multilayer network with M node colors (or types) and K edge colors (or types), where, if the nodes or edges are not colored, then M = 1 or K = 1, respectively. Let the set of nodes of type j be denoted X α , α = 1, . . . , M , and the set of edges of type k be denoted E k , k = 1, . . . , K. Let Aut(G) be the automorphism group of the graph as described in the main text and let O be the orbital partition of the nodes induced by the automorphism group. The orbital partition consists of |O| = n o orbits, and each orbit O k consists of nodes of the same type only.
An example of the structures defined is shown in Supplementary Figure 3. The graph consists of a total of 14 nodes and a total of 42 edges, with M = 2 node types (circles and rectangles) and K = 3 edge types (solid, dashed, and dotted). The nodes are colored according to their orbits in the automorphism group; red, green, and purple. The orbits are found using nauty.
The orbital partition of a graph can be represented by a quotient graph, that is, a multilayer graph Q = (O, F) where the nodes are the orbits of the original graph (colored according to the single color of its constituents) and the edges are, The minimal representation of the original graph only preserves the size of these sets, that is, n k = |O k | and m j,k = |F j,k |, losing the detailed information of the original set of edges. The quotient graph of the multilayer graph shown in Supplementary Figure 3 is shown in Supplementary Figure 4. The quotient nodes, labeled O 1 , O 2 , and O 3 , have the shape of their constituent nodes and are colored according to the orbit that they represent. The original population of each orbit is shown below each node, and the number of edges of each type passing between nodes in each orbit is shown above each edge.
Going from a graph and its automorphism group to the representative quotient graph is straightforward computationally. The reverse process is not straightforward, and a single quotient graph represents an infinite number of original graphs. Before proceeding though, the definition of a quotient graph must be made more restrictive, that is, the set of orbit populations n k , k = 1, . . . , n o , and the number of intra-and inter-orbit edges m j,k , j, k = 1, . . . , n o , = 1, . . . , K, must satisfy a set of conditions such that it is capable of representing some original graph.
The requirements can be succinctly stated as the following four conditions.
• The number of inter-cluster edges, m j,k , j = k, for each edge color , must be divisible by both n j and n k . Note that if m j,k = 0 this is trivially satisfied.
• The number of inter-cluster edges must be less than or equal to the maximum number of inter-cluster edges, that is, m j,k ≤ n j n k , = 1, . . . , K, j = k.
• Twice the number of intra-cluster edges, 2m j,j must be divisible by the number of nodes n j , for j = 1, . . . , n j and = 1, . . . , K.
• The number of intra-cluster edges must be less than the maximum possible number of edges, that is, These conditions can be expressed mathematically as, where in the above equation LCM(n j , n k ) stands for the least common multiple of the integers n j and n k . If these four conditions are met, then the quotient graph is capable of representing at least one original graph such that the quotient graph represents the orbits of its automorphism group.
The question remains how to construct the desired graph. To do this, we start by creating n o sets of disconnected nodes O k , k = 1, . . . , n o , such that V = no k=1 O k colored according to the colors of the orbits. For ease of notation in the following descriptions, we label the nodes according to their originating orbit, v j i , i = 1, . . . , n j , j = 1, . . . , n o . The edges are added so that the resulting graph's automorphism group contains permutations of the type, which can be thought of as cycles of order q within each orbit. We then add the intra-orbit edges for each value m j,j > 0, j = 1, . . . , n o and = 1, . . . , K. To do this, let k j,j = 2m j,j nj be the intra-cluster degree of color of orbit j.
• If k j,j is odd, then we choose . The set of neighbors for a node in this orbit with edges of type are, • If k j,j is even, then we choose k j,j 2 unique integers between 1 and nj 2 , labeled c i , i = 1, . . . , k j,j 2 . The set of neighbors for a node in this orbit with edges of type are, Note that any cycle of order q of the nodes in the orbits does not affect the set of edges. We now turn to the inter-cluster edges, which is somewhat more involved. Let m j,k be the number of edges between orbits O j and O k of color and let there be n j = |O j | nodes in orbit j and n k = |O k | nodes in orbit k. Let g = GCD(n j , n k ) be the greatest common divisor of integers n j and n k so that there exist two other integers, d j and d k , such that, n j = d j g and n k = d k g Define h = m j,k g nj n k and a sequence of integers f = {f 1 , . . . , f h } such that, If v j a ∈ O j , then its neighbors in O k can be listed as or, vice versa, if v b ∈ O k then its neighbors in orbit v a ∈ O j with edges of color are, Proofs of these expressions for the set of edges can be found in [14,15] A single quotient graph may represent multiple original graphs. Different choices of the values c i for the intra-orbit edges and the values f i in the inter-orbit edges can lead to a sampling of the possible graphs. Also, if we determine a set of populations n j , j = 1, . . . , n o , and edge amounts m j,k for each edge between quotient nodes, the quotient graph can be scaled up by multiplying every quantity by a positive integer. It is trivial to show that all of the constraints hold when multiplying the quantities by a constant positive integer.

Supplementary Note 4: Effects on CS Stability of Heterogeneous Systems in Different Layers
The equations describing the dynamics of a multilayer network are, For simplicity we focus on the case of M = 2 layers, and set the matrices A αα , A ββ , A αβ = (A βα ) T to represent the multilayer network in Figure 1 of the main manuscript. An example with the Van der Pol oscillators is presented in Figure 2 of the main manuscript.
Chaotic systems Here we consider the cases of the Lorenz and of the Roessler oscillators. For all the cases considered, we see that increasing the level of heterogeneity between the oscillators in different layers leads to loss of stability of DLS blocks. In this section we further investigate the effects of varying heterogeneity between the systems in the two layers on stability of DLS's. In the case of the Lorenz oscillator, we set n α = n β = 3, which both correspond to the dynamics of the Lorenz oscillator, with layer-dependent parameters c α and c β . Moreover, we set for all pairs α, β = 1, 2, σ αβ = −2 for all pairs α, β = 1, 2. We set the layer-dependent parameters c α = 10(1 + δ c ) and c β = 10(1 − δ c ), so that δ c = 0 corresponds to identical systems in the two layers and increasing values of δ c indicate the systems in the two layers are increasingly different. We then numerically investigate stability of DLS symmetries as the parameter δ c is increased. This can be seen in panel A) of Supplementary Figure 5, which shows that synchronization between oscillators 1 and 3 from the α layer and synchronization between oscillators 1 and 3 from the β layer is simultaneously lost for δ c 0.25. We then consider the case of the Roessler oscillators, for which n α = n β = 3,

Moreover, we set
for all pairs α, β = 1, 2, σ αβ = −0.04 for all pairs α, β = 1, 2. We set the layer-dependent parameters c α = 10(1+δ c ) and c β = 10(1−δ c ), so that δ c = 0 corresponds to identical systems in the two layers and increasing values of δ c indicate the systems in the two layers are increasingly different. We then numerically investigate stability of DLS symmetries as the parameter δ c is increased. Panel B) of Supplementary Figure 5 shows that synchronization between oscillators 1 and 3 from the α layer and synchronization between oscillators 1 and 3 from the β layer is simultaneously lost for δ c 0.06.

Time-varying heterogeneity
We now reconsider the example with the Van der Pol oscillator illustrated in Figure 2 of the main manuscript. We copy below the functions F and H for the reader's convenience: and we set, for all pairs α, β = 1, 2, σ αβ = 0.15. We set the layer-dependent parameters c α = (1+δ c ) and c β = (1−δ c )(1+ε sin ωt), with δ c = 0.35. In this way, when ε = 0 we are in the situation already considered in the main text, namely Figure 2 of the main manuscript shows that for δ c = 0.35 (and ε = 0) cluster synchronization is achieved in both layers. We now consider how the emergence of cluster synchronization is affected if the parameter variation is periodic in time, for different intensity of the variation ε and frequency ω. Notice that for increasing ε the time-averaged parametric difference between the systems in the two layers remains unvaried.  Figure 6 shows the results of our numerical experiments. In panel A) we plot the cluster synchronization error for increasing time-varying amplitude ε when ω = 0.04. We see that cluster-synchronization is achieved as long as ε 0.5, and thus it is robust with respect to small periodic time-varying heterogeneity, while it is lost for larger values of ε. Panel B) instead plots the cluster synchronization error for different frequencies ω at a fixed amplitude ε = 0.7. The case with ω = 0 corresponds to a case without time-varying parameters. Cluster-synchronization is achieved when the time-varying perturbation is either sufficiently slow or fast and it is lost when the time-varying perturbation has a frequency that is comparable with the frequencies of the two Van der Pol systems, which is about 0.08.

Effects of noise on cluster synchronization
Here we study the effects of noise on cluster synchronization, e.g. due to some external unpredictable factor that affects a layer parameter. To do that, we consider again that the β-layer parameter is not constant but is affected by noise c β (t) = (1 − δ c )(1 + εζ(t)), where ζ(t) is a noisy term with a prescribed spectrum and standard deviation equal to 1. We consider three kinds of noise: a standard white noise (a signal with equal power in every band of a given bandwidth-power spectral density), a pink noise (a signal with equal power in bands that are proportionally wide, i.e. with a spectral power density that decreases by 3 dB per octave), and a blue noise (a signal with spectral power density that increases by 3 dB per octave).
Supplementary Figure 7 shows the results of our numerical experiments. In panel A) ζ(t) is a white noise. On the left panel, we show its power spectral density, while on the right panel we plot the DLS synchronization error against the noise intensity ε. Panels B) and panels C) show the same information, but for the cases of a pink and a blue noise, respectively. We see in all the cases that cluster synchronization is achieved if the noise intensity is sufficiently small, but is lost above a threshold. The threshold is smaller when the power spectral density of the noise is biased towards the frequencies that are able to destabilize the system. Recalling figure 6B, the band of frequencies that are able to destabilize the cluster synchronized solution is between 0.008 and 0.04. As a result, pink noise, that has higher low band power spectrum-−11dB = 0.28 at frequency 0.01-destabilizes the cluster synchronized solution with noise intensity ε ∼ 1.5 (note that 1.5 × 0.28 = 0.42, that is approximately the threshold we have found in Supplementary Figure 6A). White noise has a lower low band power spectrum -−17dB = 0.14 at frequency 0.01-and requires a larger noise intensity ε ∼ 2.8 (note that 2.8 × 0.14 = 0.39) to yield DLS desynchronization. Finally, blue noise, the one most biased on the higher frequencies -−22dB = 0.08 at frequency 0.01-is the one for which cluster synchronization is more robust, requiring intensity ε ∼ 4.8 (note that 4.8 × 0.08 = 0.38) to destabilize the DLS CS solution.

Supplementary Note 5: Effects of the intra-and inter-layer connectivity strengths on CS stability
Here we consider a situation for which the structure of the inter-layer connections does not affect the group of symmetry of the multilayer network. This is the case in which the compatibility relations gA αβ = A αβ h and hA βα = A βα g are satisfied for all the possible symmetries in the different layers, e.g., the case with no inter-layer connections (A αβ = (A βα ) T = 0 N α ×N β , ∀α, β) or all-to-all interlayer connections (A αβ = (A βα ) T = 1 N α ×N β , ∀α, β) -note that other non-trivial cases are possible. Then, K α 1 = G α 1 , ∀α, and as a result all the symmetries of each layer are ILS-symmetries of the multilayer network. Consider for example the multilayer network in Figure 1 of the main manuscript, but with allto-all inter-layer connections. Moreover, let us consider different inter-layer coupling strengths σ αβ . While the matrix T remains the same one reported in Eq. 28 of the main manuscript, the matrix B = T AT T becomes: Note that the transverse blocks are, as expected, all ILS blocks, since the intra-layer coupling is all-to-all. It is therefore possible to identify a maximum transversal Lyapunov Exponent for each of the layers. Moreover, in this specific case, all the transverse blocks are one-dimensional. This is a consequence of the particular networks considered on each layer: in fact, without considering the inter-layer couplings, each layer has no intertwining symmetry breakings. We now want to answer the following question: under this setup, how is cluster synchronization affected by the intra-and inter-layer coupling strength? To answer this question, we focus our attention on the case of low coupling (reference value for σ αα , σ αβ , σ ββ is 0.01), and we take, as an example, the Van der Pol model discussed in the previous section, with δ c = 0.5. Supplementary Figure 8 shows that if we increase only the intra-layer coupling σ αα , we can achieve cluster-synchronization in the layer α; on the other hand, the cluster-synchronized solution of layer β becomes unstable as σ αα increases. If instead we leave the intra-layer coupling strength unvaried and we increase only the inter-layer coupling strength σ αβ , we see that cluster synchronization is achieved in both the layers. This simple example highlights the importance of properly tuning the inter-layer coupling strengths, which is key in achieving cluster synchronization in both the layers.
Supplementary Note 6: Stability analysis of the three CS patterns possible in the experiment From the analysis of the group of symmetries of the multilayer network, we have identified three possible clustered synchronized patterns. To analyze the stability of each pattern we write down a suitable T matrix.
Pattern 1a We now write down the T α , T β and T γ matrices of the multilayer network corresponding to pattern 1a. Layer α includes only one node, layer β and γ are fully synchronized, thus: The first row of each T matrix describes the synchronous solution, while the remaining rows describe motion transverse to the synchronous manifold. The matrix T that we apply to transform the system (asterisks) and E β 13 (diamonds) (top panels) and transversal MLEs (bottom panels) vs the intra-layer coupling strength σ αα (left) and the interlayer coupling strength σ αβ (right). Blue/red shadowed area indicates cluster synchronization only on the α/β layer, respectively, while in gray shadowed area both the layers are clustered synchronized. Increasing σ αα it is possible to achieve cluster synchronization in layer α (approximately, with σ αα > 0.012) but we cannot obtain cluster synchronization in layer β. Conversely, increasing the inter-layer coupling strength σ αβ , cluster synchronization can be achieved in both the layers.
is the direct sum of the three matrices, i.e., where we have reordered the rows of the matrix in order to group them. Note that we have colored lavender (pink) the rows of the matrix T 1a that correspond to layer β (γ.) Applying the transformation to the multilayer network we obtain the transformed coupling matrices In accordance with the notation introduced in the main manuscript, we write the supra-adjacency matrix A and the matrix B, Reordering the rows and columns of B in order to better identify its block structure, we obtain,

LEs of the sync manifold
LEs transverse to solutions 1 LEs transverse to solutions 1a and 2 (13) (the same result can be obtained directly applying the transformation T 1a to the matrix A). The fully synchronous solution has 3 transverse blocks. The first transverse block B 2 corresponds to the breaking of a DLS and is associated with a perturbation of the form of Eq. 27 of the main manuscript; this describes the breaking of the β layer, which also splits the γ layer into two clusters. This gives solutions of kind 2. There are a total of N β + N γ = 5 LEs associated with this block. The second and the third transverse blocks B 1b 1 and B 1b 2 (each composed of N γ = 3 LEs) correspond to ILS breakings and are associated with perturbations of the form of Eq. 26 of the main manuscript. They describe the splitting of the γ layer into two clusters, giving solutions of kind 1b. Note that the two blocks are equal: this means both breakings have the same stability property. This is because the two breakings describe the same kind of behavior; the only difference between the two breakings is the labeling order used in the γ layer.
The quotient network 1a is always monostable. In Supplementary Figure 9(a), we show the MLEs of B 2 (solid) and B 1b (dashed). This figure is the same as Figure 5 of the main manuscript and is copied here for the reader's convenience. Note that the 1a solution is stable when both the MLEs are negative, i.e. only if k C > 0.2.
The synchronous manifold 1a is contained in both the synchronous manifold 1b and the synchronous manifold 2. Thus, this diagram also shows the region where the 1a solution is present and stable in the other two manifolds. The symmetry breaking (pitchfork) bifurcations where each of the two curves change sign are highlighted in the figure with vertical lines, red if the breaking involves a solution of type 1b, yellow if the breaking involves a solution of type 2.
Patterns 1b Two equivalent patterns of type 1b are possible, due to the different possible labelings in the γ layer (i.e., Colpitts 1 & 2 and 3 & 4 are interchangeable in the network). These patterns can be obtained applying two last breakings of T γ 1a (labeled as 'transverse directions towards 1b' in Eq. (12)) to the synchronous manifold. The two possible 1b patterns have therefore: The resulting T 1b and B matrices are: As expected, the two B matrices are equal (in fact, the stability property of each of these solutions are the same). The 3-dimensional transverse block of the 1b clustered solution corresponds to the breaking of a DLS and is associated with a perturbation of the form of Eq. 27 of the main manuscript and it indicates that it is not possible for the γ layer to desynchronize while the β layer is synchronized. The transverse block is composed by N β + 2N γ = 8 LEs; it describes the breaking of the β and γ layers, giving rise to solutions without any clustered synchronization. Solution 1a is also present when we reside on the 1b synchronous manifold, see Figure 9(b). Since no catastrophic bifurcation destroys this solution, it is always present; it is stable for the k C values for which the dotted curve in figure 9(a) is negative, and unstable otherwise. When the dotted curve of figure 9(a) crosses zero, a symmetry-breaking bifurcation (pitchfork bifurcation) occurs on the 1b invariant manifold. The first crossing from the right, at k c ≈ 0.08, is a supercritical pitchfork bifurcation: the solution of type 1a loses its stability and another solution of type 1b homotopically emerges. This solution exists for all k c < 0.08. The 1a solution regains stability at k C = 0; in this case the pitchfork bifurcation is subcritical. This means that the 1a solution becomes stable because an unstable solution of type 1b is destroyed at that point. That unstable solution was generated at an analogous bifurcation at k C = −0.22, where, again, the type 1a solution loses stability. In the region k c ∈ [−0.22, 0] the quotient system 1b is bistable, with a solution of type 1a and a solution of type 1b; it is otherwise monostable.
The maximum transverse Lyapunov exponent computed around each solution is shown in Supplementary Figure 9(b), in blue when the solution is of type 1a, in red when the solution is of type 1b.

Pattern 2
The resulting block structure in (13) shows that the breaking of the β layer (second row of the matrix T β 1a ) and the first breaking of the γ layer (second row of the matrix T γ 1a ) intertwine. Therefore, they must be applied together in order to break pattern 1a towards pattern 2 (as labeled in (12)). So doing, we obtain the following transformations for each layer: The resulting T 2 and B matrices are: The matrix B has two identical transverse blocks corresponding to ILS breakings and are associated with perturbations of the form of Eq. 26 of the main manuscript. These two blocks can be studied separately, indicating that the separation of Colpitts 1 from 2 and of Colpitts 3 from 4 are not interdependent (as we expect, since they do not communicate). However, since the LEs are governed by the same equation, we expect when one pair breaks the other should also break (they are identical, and there is no reason why two Colpitts should maintain their synchrony while the other two should not). As stated earlier when discussing stability of pattern (1a), if we reside on the 2 synchronous manifold, the 1a solution is present when the solid line in Supplementary Figure 9(a) is negative. Like the 1b manifold, no catastrophic bifurcations destroy the solution of type 1a; it is always present, either stable or unstable, see Supplementary Figure 9(c). Both the stability changes are due to supercritical pitchfork bifurcations, involving a stable symmetry-broken solution of type 2. The leftmost supercritical pitchfork bifurcation, at k C = −0.32, generates a solution that exists up to k C = 0.14, where it undergoes a saddle-node bifurcation and it is destroyed. The solution that is generated at the supercritical pitchfork bifurcation at k C = 0.2 is also destroyed through a saddle-node bifurcation at k C = 0. This manifold, thus, supports bistability when k c ∈ (0, 0.14).
Around each stable solution on the manifold, we compute the two groups of N γ (identical) transverse LEs; figure 9(c) shows the maximum LE values corresponding to solution 1a (blue) and solution 2 (yellow).

Supplementary Note 7: Experimental observations
Different clustered patterns are identified in the experiment by a combined analysis of the phases and frequencies of the individual experimental oscillators.
Supplementary Figure 10 shows the observed cluster states of the experimental system as inductive Colpitts coupling k c is varied. In the top wide panels, we show four phase differences as k c is varied: Again, phase is calculated from the peaks of the oscillations and linearly interpolated in between. The average phase difference which is plotted in the top panels of Supplementary Figure 10 is calculated over 0.12 ms. Note that when the phase difference is not stationary, this value is not truly representative of the dynamics; however the average phase difference is a useful way of visually differentiating the different clustered behaviors.
In the top wide panel, experiments occur from right to left. We begin on the right side of the top panel with large positive coupling (when the inductors are close and anti-parallel); we travel leftward on this figure as k c is reduced (by increasing the separation of the inductors). We cross k c = 0 by switching the inductors from anti-parallel to parallel, and then we bring the inductors closer to reach maximum parallel inductive coupling with k c = −0.4. We reverse this procedure to produce the bottom wide panel, in which experiments occur from left to right.
Supplementary Figure 11 shows the frequencies of the oscillators in the cluster states shown in Supplementary Figure 10. The oscillator frequencies are an additional tool for differentiating between cluster states; we see that different cluster states have different frequencies and changes in the magnetic coupling produce different changes depending upon the cluster state. • Pattern 1a (green in Supplementary Figures 10 and 11) exists with strong anti-parallel coupling (k C large and positive). Pattern 1a is globally stable for a small range of k c ; it is bistable with pattern 2-(π/2) for another range of k c . All oscillators have the same frequency; the frequency increases as k c decreases.
• Pattern 2 exists with two configurations: pattern 2-(π/2) (yellow in Supplementary Figures 10  and 11) has a lower frequency and a phase offset around π/2; pattern 2-(π) (lavender in Supplementary Figures 10 and 11) has a higher frequency and a phase offset around π. Pattern 2-(π/2) has two regions of bistability (with pattern 2-(π) and pattern 1a) and one region of monostability. Pattern 2-(π) has two additional regions of bistability (with pattern 1b and the no pattern) and its own region of monostability.
• Pattern 1b (blue in Supplementary Figures 10 and 11) exists with strong parallel inductive coupling. Here, ∆φ C12 = ∆φ C43 = π. We expect ∆φ C13 = π or 0, since there are two possible indistinguishable patterns 1b; this is true for some experiments in this region. For others, the Colpitts pairs 1/2 and 3/4 oscillate at slightly different frequencies, which causes the phase difference between the left and right Colpitts ∆φ C13 (red curve on the figure) to slowly slip. As we stated, this averaged phase difference does not represent a stationary behavior, and merely serves to visually distinguish the clustered state 1b from the modified version we identify in the theory section in which synchronization is not achieved (positive but small MLE, red curve in Figure 6 of the main manuscript). This result is in accordance with the theoretical prediction: in this region the MLE associated with solution 1b varies around zero, undergoing many supercritical symmetry breaking bifurcations. In Supplementary Figure 11, we can see the slight frequency differences between the Colpitts pairs 1 & 2 and 3 & 4; the Colpitts oscillate considerably faster than the FHNs.
• No pattern (white in Supplementary Figures 10 and 11) exists with weak parallel inductive coupling. Again, the oscillators are not frequency locked. Again, the averaged phase difference shown in figure 10 does not have a real meaning, and serves to identify the white shaded region. This state appears to be related to the pattern 2-(π/2) solution, as the phase difference ∆φ F 12 slows down near (but passes through) ∆φ F 12 = π/2.
Supplementary Figure 11 shows the frequencies of the oscillators in the cluster states shown in Supplementary Figure 10; again, the upper panel reads from right to left, followed by the lower panel which reads from left to right. The shading again refers to the cluster state. The oscillator frequencies are an additional tool for differentiating between cluster states; we see that different cluster states have different frequencies and changes in the magnetic coupling produce different changes depending upon the cluster state. We show the frequencies of both FHNs and all four Colpitts oscillators.
Supplementary Figure 12 shows the phase differences of the observable cluster states reported in Figure 6 of the main manuscript This figure bridges the theoretical predictions and the experimental cluster observations which we identify by both phase difference and frequency. Phase is calculated from the peaks of the oscillations and linearly interpolated in between. We show three phase differences: ∆φ F 12 = φ FHN2 − φ FHN1 (black); ∆φ C12 = φ Colp 2 − φ Colp 1 (blue); and ∆φ C13 = φ Colp 3 − φ Colp 1 (red). We do not report the fourth phase difference, φ Colp 3 − φ Colp 4 ; because Colpitts 1 & 2 and 3 & 4 are interchangeable, this phase difference is identical to ∆φ C12 . We repeat the shading from Figure 6 of the main manuscript. In the bottom panel, we show the Pattern 2-(π) solution; in the top panel we show all other solutions. We show no phase difference in the top panel for 0 < k C < 0.07; in this region we see an unsynchronous solution and thus there is no average phase difference. We can see the slightly desynchronized pattern 1b solutions that occur in the blue region when the red curve from Figure 6 of the main manuscript is positive; in these regions the phase difference ∆φ C12 varies slightly from π and the phase differences ∆φ F 12 and ∆φ C13 vary slightly from 0.

Supplementary Note 8: Discrepancies between simulation and experiment
The main discrepancy we highlight between simulation and experiment is in the modeling of the FHN circuits. Note that in Figure 3 of the main manuscript, each FHN has four resistances, R 1,i , R 2,i , R 3,i and R 4,i , i = 1, 2. In the model, only two resistances are present (R 1,i and R 3,i ), while only the ratio of the other two appears in the model (see Equation (62) in the paper). Nevertheless, changing the value of these resistances changes the experimental waveform, which is not captured by our modeling framework. As shown in Supplementary Figure 13, changing the value of R 2,i = R 4,i changes the amplitude, frequency, and waveform of the oscillation; when R 2,i = R 4,i is sufficiently large, no oscillation occurs.
The model for the transistor 2N2222 of each Colpitts circuit is also simplified; represented with a linear piecewise function. The AD844 operational amplifier is also represented as an ideal component. Finally, component heterogeneity, battery degradation, stray inductance, and experimental noise can also generate differences between the simulations and the experiments. As a result, while we find a good qualitative agreement between the experiment and the simulations, some discrepancies exist. Some of these discrepancies are quantitative: • theoretically, we expect the blue 2 pattern to survive up to k C = −0.27 while we see this pattern only up to k C = −0.16; • theoretically, we expect the 1b pattern to survive for all the negative k C , while experimentally it disappears for k C > −0.12; • theoretically, we expect pattern 2-(π/2) to survive down to k C = 0.07, while experimentally it disappears for k C < 0.12.
In these cases, we lose a solution experimentally before its theoretical loss of stability. The center and stable manifolds of the unstable (saddle) solutions delimit the basin of attraction of the stable solution. When we approach the bifurcation, the basin of attraction of the stable solution shrinks; the unstable limit cycles collapse around the stable limit cycle, and the solution must stay strictly in this shrinking region to retain the solution it represents. If the experimental noise knocks the trajectory out of this region, it can cause the solution to transition earlier than we would expect in the absence of noise. Two other qualitative discrepancies are present: • A region of bistability between green and yellow exists in experiments, but not in the theory. This can be explained theoretically by the heterogeneity of the system; in fact, simulations with heterogeneous components show the symmetry breaking (pitchfork) bifurcation breaks into a limit-point bifurcation, thus generating bistable regions.
• Because the unclustered white region is next to the blue region and not next to the yellow region, one may wonder if the experimental white region is related with some loss of stability of the blue pattern (rather than the yellow one). Analysis of the experimental phase difference of the FHNs, ∆φ F 12 , in the white region reveals a preferred phase difference near π/2; ∆φ F 12 = π/2 characterizes the yellow pattern (while ∆φ F 12 = 0 characterizes the blue pattern). We infer this phase preference because ∆φ F 12 slows down as it passes through ∆φ F 12 = π/2. This relates the white experimental pattern with the theoretical one. We explain its shift to k C < 0 by the huge role heterogeneity plays when the coupling is small.
The interplay of theory and experiments helped us to understand the system's properties. The FHN and Colpitts oscillators have fundamentally different governing equations and behavior. They are characterized by different levels of noise, and different sources of nonlinearity. As theory seeks to explain natural systems with heterogeneous components, further validations will need to be performed in order to understand where modeling excels and where it can improve.
Top: all clustered solutions other than the pattern 2-(π) solution; bottom: the pattern 2-(π) solution. We denote clustered behavior by the shading of the background, as identified and described in Figure 6 of the main manuscript. Shading with two-colored striping indicates bistability between the two states represented by each color. Note that phase difference is shown from 0 to π radians. Supplementary Figure 13: Simulation of single FHN using OrCad PSpice software as resistances R 2,i = R 4,i are increased. Experimental value, R 2,i = R 4,i = 2kΩ, is shown in black bold.