## Introduction

The connection between searches for the ground states of physical systems and optimization problems has stimulated research and development into new types of computation1. The realization of such architectures has been led by Ising-model solvers2,3,4,5,6,7,8,9. These physical-model-solver architectures have solved certain problems much faster than conventional digital architectures such as CPUs10,11,12,13. However, embedding a problem in a physical model sometimes requires a resource overhead, which can be large enough to become a computational bottleneck. This embedding overhead can be reduced by choosing a more appropriate physical model as a solver instead of the Ising model14,15,16,17,18.

The Potts model is a fundamental model describing various physical and mathematical problems19, such as those of percolation theory20. This model is a generalization of the Ising model to multivalued spins; its Hamiltonian is given by

$$H_{\mathrm{Potts}} = \sum_{ij} J_{ij}\,\delta(S_i, S_j),$$
(1)

where $$S_i = \{0, 1, 2, \ldots, M-1\}$$ is an M-component spin on the ith node of the model, where $$i = \{1, 2, 3, \ldots, N\},$$ and $$\delta(a, b)$$ is the Kronecker delta function. Since multivalued spins naturally express integers, various integer optimization problems can be straightforwardly mapped onto ground-state search problems based on this model19. For example, graph coloring can be described as a Potts model with a smaller Hilbert space than that of the standard Ising model21. In the standard Ising-model mapping, M colors on each node are represented by M Ising spins21; thus, the size of its Hilbert space is $$2^{NM},$$ which is larger than that of the original M-color problem, i.e., $$M^{N} (= 2^{N \log_2 M}).$$ The Ising Hamiltonian requires constraint terms to reduce the size of the enlarged space to that of the original problem, whereas the Potts model without any constraint Hamiltonian has the same size as the original. Thus, the Potts-model mapping allows us to avoid the embedding overhead (see Supplementary Note 1). On the other hand, there are several challenges in realizing a physical Potts solver, namely the implementation of multivalued spins and of interactions described by a Kronecker delta within physical systems. Recently, it has been proposed that physical systems based on a lattice of nonequilibrium Bose–Einstein condensates17 and a network of three-photon down-conversion oscillators18 can be used to solve specific Potts problems.

In this study, we demonstrated a scheme for solving the Potts problem using a hybrid architecture consisting of a physical Ising-model solver and digital processing (Fig. 1a). The Potts problem can be approximately solved by iterative calculations of Ising problems with updated interactions evaluated from one-way feedforward connections. Hybrid computation enjoys the advantages of physical solvers through the aid of digital computers22,23. The physical solver quickly obtains a low-energy solution of a complex Ising problem (known to be NP-hard1)10,11,12,13, while the digital computer can accurately handle input and output, such as interactions and energies, and also run the learning logic (see Fig. 1a). We implemented a Potts solver by using a coherent Ising machine (CIM)5,10,24,25,26 and a standard CPU (Fig. 1b). The CIM is a physical Ising-model solver based on coupled degenerate optical parametric oscillators (DOPOs)24, in which Ising spins are encoded by utilizing the bifurcation transition of each DOPO. We experimentally solved two integer optimization problems—clustering and coloring—on the same graph (see Fig. 2).

## Results and discussion

### Theoretical framework

First, we explain how to map a Potts problem onto iterative calculations of Ising models. Given an integer $$L \equiv \mathrm{ceil}(\log_2 M),$$ a multivalued spin $$S_i$$ can be written in terms of a set of Ising spins $$\sigma_i^{(l)} = \{-1, 1\}\;(l = 1, 2, \ldots, L)$$ with a standard binary representation as $$S_i = \sum_{l=1}^{L} \frac{1 + \sigma_i^{(l)}}{2}\, 2^{l-1}.$$ The Hamiltonian in Eq. 1 is then rewritten as $$H_{\mathrm{Potts}} = \sum_{ij} J_{ij} \prod_{l=1}^{L} \frac{1 + \sigma_i^{(l)} \sigma_j^{(l)}}{2},$$ where the delta-functional Potts interaction is transformed into the multibody Ising-spin interaction $$\prod_{l=1}^{L} \frac{1 + \sigma_i^{(l)} \sigma_j^{(l)}}{2}.$$ This complicated interaction can be simplified by decomposing it into sets of two-body interactions $$\sigma_i^{(l)} \sigma_j^{(l)}$$ on L Ising problems with one-way feedforward connections: $$H_{\mathrm{Ising}}^{(l)} = \sum_{ij} J_{ij}^{(l)} \sigma_i^{(l)} \sigma_j^{(l)}.$$ Here, l represents an iteration number and is called a stage. The interaction matrix $$J_{ij}^{(l+1)}$$ is determined recursively from the interactions $$J_{ij}^{(l)}$$ and solution $$s_i^{(l)}$$ of the previous stage:

$$J_{ij}^{(l+1)} = \frac{1 + s_i^{(l)} s_j^{(l)}}{2}\, J_{ij}^{(l)},$$
(2)

where the initial Ising interactions are the same as the original Potts interactions: $$J_{ij}^{(1)} = J_{ij}.$$
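As a concrete check of this binary decomposition, the following sketch (our illustration, not the paper's implementation) verifies that the Kronecker delta of two binary-encoded Potts spins equals the product of the stage-wise two-body factors:

```python
from itertools import product

M, L = 4, 2            # M = 2**L Potts states, L Ising stages

def to_ising(S):
    """Binary-decompose a Potts spin S into L Ising spins in {-1, +1}."""
    return [2 * ((S >> l) & 1) - 1 for l in range(L)]

# delta(S_i, S_j) should equal prod_l (1 + sigma_i^(l) sigma_j^(l)) / 2
for Si, Sj in product(range(M), repeat=2):
    si, sj = to_ising(Si), to_ising(Sj)
    factor = 1.0
    for a, b in zip(si, sj):
        factor *= (1 + a * b) / 2
    assert factor == (1.0 if Si == Sj else 0.0)
print("delta identity holds for all", M * M, "pairs")
```

Each factor is 1 when the corresponding bits agree and 0 otherwise, so the product is 1 exactly when all bits agree, i.e., when the two Potts spins are equal.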

Figure 1a illustrates the framework of the Potts solver based on hybrid computation. We repeat operation stages consisting of two parts: the Ising-solver part, which obtains a solution $$s_j^{(l)}$$ for input $$J_{ij}^{(l)},$$ and the digital part, which calculates $$J_{ij}^{(l+1)}$$ via Eq. 2. The digital part also calculates the Potts energy defined by $$E_{\mathrm{Potts}}^{(l)} = \sum_{ij} J_{ij}\, \delta(S_i^{(l)}, S_j^{(l)}),$$ where the Potts spin $$S_i^{(l)} = \{0, 1, \ldots, M^{(l)} - 1\},$$ with $$M^{(l)} = 2^{l},$$ is given by

$$S_i^{(l)} = S_i^{(l-1)} + \frac{1 + s_i^{(l)}}{2}\, M^{(l-1)}.$$
(3)

From Eq. 2, we can derive two other expressions for the Potts energy: $$E_{\mathrm{Potts}}^{(l)} = \sum_{ij} J_{ij}^{(l)} \frac{1 + s_i^{(l)} s_j^{(l)}}{2}$$ and $$E_{\mathrm{Potts}}^{(l)} = \sum_{ij} J_{ij}^{(l+1)}.$$ When l = L (the final stage), we obtain the solution $$S_i^{*} (= S_i^{(L)})$$ and Potts energy $$E_{\mathrm{Potts}}^{*} (= E_{\mathrm{Potts}}^{(L)}).$$

Let us discuss the convergence of the above approximation by considering the energy improvement $$\Delta E_{\mathrm{Potts}}^{(l)} \equiv E_{\mathrm{Potts}}^{(l)} - E_{\mathrm{Potts}}^{(l-1)}.$$ From the above two expressions for $$E_{\mathrm{Potts}}^{(l)},$$ we can rewrite $$\Delta E_{\mathrm{Potts}}^{(l)}$$ as $$\frac{1}{2}\left(E_{\mathrm{Ising}}^{(l)} - F_{\mathrm{Ising}}^{(l)}\right),$$ where $$E_{\mathrm{Ising}}^{(l)} = \sum_{ij} J_{ij}^{(l)} s_i^{(l)} s_j^{(l)}$$ is the Ising energy of a solution in stage l, and $$F_{\mathrm{Ising}}^{(l)} = \sum_{ij} J_{ij}^{(l)}$$ is the energy of the ferromagnetic states in stage l (e.g., $$s_i^{(l)} = -1$$ for all i). We can conclude that $$E_{\mathrm{Potts}}^{(l)}$$ decreases at iteration l if each stage yields a solution whose energy is lower than that of the ferromagnetic states: $$E_{\mathrm{Ising}}^{(l)} < F_{\mathrm{Ising}}^{(l)}.$$ Note that the ferromagnetic states are trivially obtained for any $$J_{ij}^{(l)}.$$ Meanwhile, the convergence condition $$\Delta E_{\mathrm{Potts}}^{(l)} = 0$$ is satisfied when the obtained solution is the same as a ferromagnetic state. Equation 3 indicates that the spins then converge, $$S_i^{(l)} = S_i^{(l-1)},$$ for the ferromagnetic solutions (see Methods for details).
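The iterative scheme of Eqs. 2 and 3 and the stage-wise energy relation can be checked end to end on a toy instance. In the sketch below (our illustration; a brute-force ground-state search stands in for the physical Ising solver, and the random instance is hypothetical):

```python
import itertools
import random

random.seed(1)
N, L = 6, 2                                   # small instance: 6 nodes, M = 4
J = [[0.0] * N for _ in range(N)]
for i in range(N):
    for j in range(i + 1, N):
        J[i][j] = J[j][i] = random.choice([-1.0, 1.0])

def ising_energy(Jl, s):
    return sum(Jl[i][j] * s[i] * s[j] for i in range(N) for j in range(N))

def solve_ising(Jl):
    """Stand-in for the physical solver: exhaustive ground-state search."""
    best = min(itertools.product([-1, 1], repeat=N),
               key=lambda s: ising_energy(Jl, s))
    return list(best)

Jl = [row[:] for row in J]
S = [0] * N
prev_E = None
for l in range(1, L + 1):
    s = solve_ising(Jl)
    E_ising = ising_energy(Jl, s)
    F_ising = sum(Jl[i][j] for i in range(N) for j in range(N))  # ferromagnetic energy
    E_potts = sum(Jl[i][j] * (1 + s[i] * s[j]) / 2
                  for i in range(N) for j in range(N))
    if prev_E is not None:
        # Delta E_Potts = (E_Ising - F_Ising) / 2 holds stage by stage
        assert abs((E_potts - prev_E) - (E_ising - F_ising) / 2) < 1e-9
    assert E_ising <= F_ising     # ground state beats (or ties) the FM state
    S = [S[i] + (1 + s[i]) // 2 * 2 ** (l - 1) for i in range(N)]   # Eq. 3
    Jl = [[(1 + s[i] * s[j]) / 2 * Jl[i][j] for j in range(N)]      # Eq. 2
          for i in range(N)]
    prev_E = E_potts
print("final Potts spins:", S, " E_Potts:", prev_E)
```

Since the exhaustive search always satisfies $$E_{\mathrm{Ising}}^{(l)} \le F_{\mathrm{Ising}}^{(l)},$$ the Potts energy never increases across stages in this sketch, in line with the convergence argument above.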

A low-energy solution $$S_i^{*}$$ is, however, not assured to be the ground state of the original Potts model. This problem is mainly attributed to the one-way feedforward connection of $$J_{ij}^{(l)}$$ described by Eq. 2. Namely, as l increases, the interaction matrix and graph are divided into more and more submatrices and subgraphs, as illustrated in Fig. 1a. The loss of information from $$J_{ij}$$ and the reduction of the graph degrade the solution accuracy. Such errors can be circumvented by implementing two kinds of feedback. One is recurrent Potts-problem feedback with a new (learned) interaction matrix $$J_{ij}^{\mathrm{new}}$$ (long arrow in Fig. 1a). The other is digital feedback: in each digital operation, $$J_{ij}^{(l)}$$ and $$S_i^{(l)}$$ are modified to improve the Potts energy $$E_{\mathrm{Potts}}^{(l)}$$ (rounded arrows in Fig. 1a). The simplest example of digital feedback is filtering: for $$\Delta E_{\mathrm{Potts}}^{(l)} > 0,$$ we can filter out a bad solution without additional calculations by choosing a better solution from the previous stage. In the next section, we experimentally demonstrate that heuristic feedback algorithms clearly improve the performance of the Potts solver.

Finally, we generalize $$J_{ij}^{(l)}$$ defined in Eq. 2 so that the feedback algorithms can be used. For convenience, we introduce a weight matrix $$W_{ij}^{(l)}$$ defined by $$W_{ij}^{(l+1)} = \delta(S_i^{(l)}, S_j^{(l)}),$$ which gives a more general form of Eq. 2: $$J_{ij}^{(l)} = W_{ij}^{(l)} J_{ij}.$$ In summary, the weights for the interactions $$J_{ij}^{(l)}$$ are “learned” from the solutions of the previous Potts-model computation so as to decrease the Potts energy. This framework can be regarded as an artificial-neural-network-like algorithm using a physical Ising solver, where a decrease in the energy cost function (Potts energy) is assured if the solver can find a low-energy Ising solution (see Supplementary Note 9). Note that this property of convergence may allow us to utilize the advantage of physical solvers that can find low-energy solutions quickly.

### Graph clustering

We solved a graph clustering problem, i.e., the task of finding the best grouping of nodes. This problem arises widely in various fields, such as community detection in social27 and biological networks28,29. Modularity Q is a good measure for graph clustering problems30, and the task of maximizing Q can be directly mapped onto a search for the ground state of the Potts model in Eq. 131. The multivalued spin $$S_i$$ identifies the group number to which the ith node belongs. The interaction matrix for this problem is defined as $$J_{ij} \equiv B_i B_j - C A_{ij},$$ where $$A_{ij}$$ is the adjacency matrix, $$B_i = \sum_j A_{ij},$$ and $$C = \sum_i B_i.$$ The Potts energy is related to the modularity by $$Q = -\frac{E_{\mathrm{Potts}}}{C^2}.$$ There is room for further study on the definition of $$J_{ij}$$32, but that is beyond the scope of this paper. Competition between antiferromagnetic and ferromagnetic correlations (namely, the positive $$B_i B_j$$ and negative $$-C A_{ij}$$ terms in $$J_{ij},$$ respectively) is the intrinsic difficulty of this problem. Although the number of groups is not given, an optimized M is spontaneously obtained through this ferromagnetic–antiferromagnetic competition, as discussed below.
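For illustration, the modularity mapping above can be sketched as follows (our toy graph of two triangles joined by one edge, not the paper's 47-node instance):

```python
import numpy as np

# Toy graph: two triangles joined by one bridge edge
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
N = 6
A = np.zeros((N, N))
for i, j in edges:
    A[i, j] = A[j, i] = 1

B = A.sum(axis=1)               # node degrees B_i
C = B.sum()                     # C = sum_i B_i = 2 * number of edges
J = np.outer(B, B) - C * A      # Potts interaction for modularity maximization

def modularity(S):
    """Q = -E_Potts / C^2 for a grouping S (one group label per node)."""
    same = np.array(S)[:, None] == np.array(S)[None, :]
    return -np.sum(J * same) / C ** 2

print(modularity([0, 0, 0, 1, 1, 1]))   # the two triangles as two groups
print(modularity([0] * 6))              # one big group gives Q = 0
```

Placing each triangle in its own group yields a positive Q, while grouping all nodes together gives Q = 0, since $$\sum_{ij} J_{ij} = C^2 - C \cdot C = 0.$$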

We solved a clustering problem on the graph shown in Fig. 2, which was compiled from the prefectures of Japan and has N = 47 nodes and $$N_{\mathrm{edge}} = 92$$ edges. The physical solver used here was a CIM implemented with 512 fully connectable nodes (see Methods and Fig. 1b), of which 470 nodes were used in parallel. It has been demonstrated that, if $$J_{ij}$$ is a dense matrix, a CIM shows better performance than a standard CPU running simulated annealing10 and the D-Wave system33. This suggests that a CIM is appropriate for clustering problems, in which $$J_{ij}$$ is dense owing to the $$B_i B_j$$ term with $$B_i \neq 0.$$

Each CIM calculation took 500 μs (100 5-μs steps), and each digital part took about 30 μs or less. However, the current setup used a slow serial communications interface, and the data transfer of $${J}_{{ij}}^{(l)}$$ between the CIM and CPU took a few seconds. This bottleneck can be removed by coding the $${J}_{{ij}}$$-update logic in field-programmable gate array (FPGA) modules (see Methods).

Figure 3a–d shows the evolution of 47 DOPOs during 100 operation steps (circulations in the cavity) in four stages (l = 1, 2, 3, and 4). Positive (negative) DOPO amplitudes represent up (down) Ising spins. The black lines in Fig. 3e show the change in modularity Q and the number of groups M over the same operation steps as in Fig. 3a–d. For l = 1 and 2, the DOPO amplitudes show that antiferromagnetic states appeared after several tens of steps (Fig. 3a, b). As a result, each group split in two, and M doubled in value (Fig. 3e). At l = 3, down-spin DOPOs were in the majority (Fig. 3c), indicating that ferromagnetic correlations were dominant, and M converged to an optimized value of 5 (Fig. 3e). At the beginning of the steps in each stage, Q decreased drastically, while by the end of the steps, the CIM had selected a higher value of Q than that in the previous stage. At l = 4, the complete ferromagnetic state finally prevailed (Fig. 3d), meaning that the stationary condition was satisfied. As mentioned in Methods, the obtained grouping with Q of 0.646 (see Fig. 2) is the same as the best solution obtained by reliable algorithms, such as the Louvain greedy34 and Infomap35 algorithms. Figure 3f shows that the rate of reaching the highest Q over 1000 trials is about 20%. We can conclude that the ferromagnetic–antiferromagnetic competition solved by the CIM provided good groupings with high modularity (see Methods and Supplementary Note 6).

We found that two digital feedback algorithms—domain separation and group reunion—improved the performance of our Potts solver. Figure 3e, f (red and blue lines) reveals that they almost doubled the rate of reaching the highest Q. In addition, the highest Q was reached in earlier stages, so the calculation time was shortened. The digital processing, including both the domain separation and group reunion algorithms, takes at most 30 μs; both algorithms run in $$O(N^2)$$ time (or $$O(N_{\mathrm{edge}})$$ for sparse $$A_{ij}$$).

#### Domain separation

By detecting magnetic domains (regions in which the spins are in the same state), we can recalculate $$S_i^{(l)}$$ to decrease the Potts energy. As illustrated in Fig. 3g, an Ising solver sometimes yields a solution consisting of two or more separated magnetic domains (see Supplementary Notes 3 and 4 and also ref. 25). Nodes in separated domains should be in different groups owing to the lack of ferromagnetic correlations ($$-A_{ij}$$). Note that the antiferromagnetic connections ($$B_i B_j$$) still remain: the subgraph consisting of red nodes in Fig. 3g is separated, while the corresponding matrix (red square) is not block diagonal because of the antiferromagnetic interactions (see Supplementary Note 4). As depicted in Fig. 3g, by removing the antiferromagnetic elements from $$J_{ij}^{(l)},$$ this domain separation feedback yields a new matrix $$J_{ij}^{(l)\prime}.$$ As a result, the Potts energy decreases by the sum of the removed antiferromagnetic elements, i.e., $$\sum_{ij}\left[J_{ij}^{(l)\prime} - J_{ij}^{(l)}\right].$$

Detecting domains and numbering them requires a calculation time of $$O(N^2)$$ or $$O(N_{\mathrm{edge}}).$$ The updated $$S_i^{(l)\prime}$$ is determined from the domain number. Now the number of groups is unlimited, whereas it was limited to at most $$2^{l}$$ in the case without feedback. As shown in Fig. 3e, f (red lines), domain separation feedback allows us to reach the best solution with M = 5 in the early stages $$l < 3.$$ Thus, the calculation time can be shortened by up to half.
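The domain detection and renumbering can be sketched as follows (our illustration; a simple flood fill over a dense adjacency matrix, with a hypothetical five-node example):

```python
def separate_domains(S, adj):
    """Split each group into its connected components (domain separation):
    nodes keep a common label only if they share a group AND are connected."""
    N = len(S)
    new_S = [-1] * N
    label = 0
    for start in range(N):
        if new_S[start] != -1:
            continue
        stack, new_S[start] = [start], label
        while stack:                      # flood fill within one group
            u = stack.pop()
            for v in range(N):
                if adj[u][v] and S[v] == S[u] and new_S[v] == -1:
                    new_S[v] = label
                    stack.append(v)
        label += 1
    return new_S

# Nodes 0-1 and 3-4 share label 0 but form two disconnected domains
adj = [[0, 1, 0, 0, 0],
       [1, 0, 0, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 1, 0, 1],
       [0, 0, 0, 1, 0]]
print(separate_domains([0, 0, 1, 0, 0], adj))   # → [0, 0, 1, 2, 2]
```

Scanning every node once and visiting each edge at most twice gives the $$O(N^2)$$ (dense) or $$O(N_{\mathrm{edge}})$$ (sparse) cost quoted above.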

#### Group reunion

The group reunion feedback algorithm is schematically shown in Fig. 3h. On the basis of the obtained grouping described by $$S_i^{(l)},$$ we can calculate the group–group interactions $$J_{g g^{\prime}} \equiv \sum_{ij} J_{ij}\, \delta(S_i^{(l)}, g)\, \delta(S_j^{(l)}, g^{\prime}).$$ As shown in Fig. 3h, this interaction matrix is defined on a new graph with $$M^{(l)}$$ nodes. The new Potts Hamiltonian $$H_{\mathrm{Potts}} = \sum_{g g^{\prime}} J_{g g^{\prime}}\, \delta(S_g, S_{g^{\prime}})$$ describes the task of finding the best way to decrease the Potts energy by executing reunions of groups. A ferromagnetic group–group interaction $$J_{g g^{\prime}} < 0$$ calls for a reunion. Group reunion feedback can restore the information lost in the approximation: $$J_{g g^{\prime}}$$ includes information about the original $$J_{ij},$$ whereas a block-diagonal $$J_{ij}^{(l)}$$ loses it.

The Potts problem for the group reunion task can be efficiently solved with the following approximation. We unify the two groups $$g_a$$ and $$g_b$$ with the negative and minimum $$J_{g_a g_b},$$ without considering the other negative elements, and repeat the same calculation after updating $$J_{g g^{\prime}}.$$ In each step, the Potts energy is reduced by $$2 J_{g_a g_b}.$$ This approximation works very well for small M (see Supplementary Note 5). The calculation takes $$O(N^2)$$ or $$O(N_{\mathrm{edge}})$$ time. Group reunion feedback combined with domain separation feedback improves the rate of reaching the highest Q, as shown in Fig. 3e, f.
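The greedy reunion described above can be sketched as follows (our illustration with a hypothetical four-node interaction matrix):

```python
def group_reunion(S, J):
    """Greedy group reunion (sketch): repeatedly merge the pair of groups
    with the most negative (ferromagnetic) group-group interaction."""
    N = len(S)
    while True:
        groups = sorted(set(S))
        best, pair = 0.0, None
        for a in groups:
            for b in groups:
                if a >= b:
                    continue
                # J_gg' = sum_ij J_ij delta(S_i, g) delta(S_j, g')
                Jab = sum(J[i][j] for i in range(N) for j in range(N)
                          if S[i] == a and S[j] == b)
                if Jab < best:
                    best, pair = Jab, (a, b)
        if pair is None:          # no ferromagnetic pair left: stop
            return S
        a, b = pair
        S = [a if x == b else x for x in S]   # unify the two groups

J = [[0, -1, 0, 0],
     [-1, 0, 0, 0],
     [0, 0, 0, 2],
     [0, 0, 2, 0]]
print(group_reunion([0, 1, 2, 3], J))   # merges only groups 0 and 1
```

Here groups 0 and 1 interact ferromagnetically (negative element) and are unified, while groups 2 and 3 interact antiferromagnetically and stay separate.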

Despite the improvement afforded by the domain separation and group reunion feedback schemes, the experimental results indicated that the success probability of our machine is still not better than those of the Infomap and Metropolis algorithms (see Methods and Supplementary Note 6). We consider that one main reason for the relatively low success probability is instability in the optical system, which resulted in fluctuation of the operational conditions in each computation trial. This includes instability of the optical parametric oscillation caused by thermal fluctuation of the long-distance fiber in the cavity and instability of the relative phase between the DOPOs and the injected light. The optical stability can be improved by implementing precise temperature control of the long-distance fiber in the cavity and by suppressing the phase noise of the pump laser for second-harmonic generation (see Fig. 1b).

### Graph coloring

Graph coloring is the task of assigning colors to nodes such that connected nodes have different colors19. We experimentally solved a four-color problem on the graph in Fig. 2a. We set L = 2 for M = 4. The interaction matrix was totally antiferromagnetic, $$J_{ij} \equiv A_{ij} > 0,$$ requiring adjacent nodes to have different colors. The four-color theorem36 assures the existence of a ground state with $$E_{\mathrm{Potts}}^{*} = 0.$$ The CIM operated under the same conditions as described above (see Supplementary Note 7).

Figure 4a shows the conditional success rates for 50 instances of $$J_{ij,k}^{(2)}$$ $$(k = 1, 2, \ldots, 50),$$ which were obtained as follows. Identical Ising models with $$J_{ij}$$ in stage one were solved 50 times, yielding 50 solutions $$s_{i,k}^{(1)}$$ and 50 stage-two Ising models $$J_{ij,k}^{(2)},$$ where $$J_{ij,k}^{(2)} = W_{ij,k}^{(1)} J_{ij}$$ and $$W_{ij,k}^{(1)} = \delta(s_{i,k}^{(1)}, s_{j,k}^{(1)}).$$ Then, the conditional success rate was estimated by solving each of the 50 Ising models with $$J_{ij,k}^{(2)}$$ 100 times. The total success rate averaged over k is about 50%. Figure 4a also shows the stage-one energy $$E_{\mathrm{Potts},k}^{(1)}$$ for each $$s_{i,k}^{(1)}.$$ Successful and failed instances in stage two are clearly separated irrespective of the energy in stage one. This result can be understood from the reduction of the graph in stage two described by $$W_{ij,k}^{(1)} A_{ij}$$ (see Supplementary Note 3). Coloring fails with 100% probability, regardless of the energy in stage one, if the reduced graph in stage two has geometrical frustrations, such as a triangular structure (see Supplementary Note 8).

Such frustrations can be resolved by implementing recurrent Potts-problem feedback based on a “learning by mistake” approach. We iteratively execute the Potts solver with $$J_{ij}^{\mathrm{new}} = J_{ij}^{\mathrm{old}} + w L_{ij}$$ and $$J_{ij}^{\mathrm{initial}} = w_0 A_{ij},$$ where w and w0 are weights that control the learning. The feedback matrix is defined by $$L_{ij} \equiv W_{ij}^{(2)} A_{ij} (= \{0, 1\})$$ and is related to the Potts energy as $$E_{\mathrm{Potts}}^{*} = \sum_{ij} L_{ij}.$$ Here, $$L_{ij} = 1$$ represents adjacent nodes with the same color, while $$L_{ij} = 0$$ represents nodes having different colors (or not adjacent). Thus, a finite $$L_{ij}$$ directly represents a mistake. In the new (learned) Potts problem with $$J_{ij}^{\mathrm{new}},$$ the interactions on the “mistaken edges” are enlarged, so the pairs of nodes on these edges are correctly colored with high priority in stage one. As a result, the frustrations caused by the reduction of the graph in stage two are eliminated by learning.
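A minimal sketch of this update rule (ours, not the paper's implementation; `learn_step` is a hypothetical helper, and the triangle graph is a toy example):

```python
import numpy as np

def learn_step(J, A, S, w):
    """Boost interactions on 'mistaken' edges, i.e., adjacent nodes that
    ended up with the same color (a sketch of the learning update)."""
    S = np.asarray(S)
    same = S[:, None] == S[None, :]
    L_mat = A * same                   # L_ij = 1 on mistaken edges, else 0
    return J + w * L_mat, L_mat.sum()  # new J and mistake count (= E_Potts*)

# Toy example: a triangle colored with only two colors must contain a mistake
A = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
w0, w = 40.0, 10.0
J_new, n = learn_step(w0 * A, A, [0, 1, 0], w)
print(n)   # 2.0: the mistaken edge 0-2 is counted in both matrix directions
```

The mistaken edge receives a larger weight in `J_new`, so it is prioritized in the next stage-one computation, mirroring the learning loop described above.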

Figure 4b shows how learning affects the success rate for w = 5, 10, 20, and 40 with $$w_0 = 40.$$ Each success rate was obtained by performing 50 trials of the two-stage experiment. In each learning step, $$L_{ij}$$ was determined from the worst-case instance with the largest $$E_{\mathrm{Potts}}^{*}$$ among the 50 trials. As shown in Fig. 4b, the success rate improved from 50% to over 80% after a few learning steps, eventually reaching nearly 100%. A larger w provides faster but less stable improvement (see Supplementary Note 10).

Figure 4c represents the sum of $$L_{ij},$$ corresponding to the total counts of mistakes, for four independent learning processes. In each learning process, there was no more than one mistake on the same edge; thus, an edge accumulated at most four mistakes across the four processes. The red-colored edges and nodes in Fig. 4c are those frequently involved in these mistakes: they can be regarded as the intrinsic origins of frustration in the graph. By detecting such nodes and edges, the learning process increased the success rate to over 80%.

The present Potts solver can be applied to general coloring problems with M colors37. For instance, simple scheduling problems38 and number puzzles such as Sudoku39 can be described as graph colorings. It is straightforward to apply the present solver to cases with $$M = 2^{L},$$ while a small number of additional nodes (at most N) is required to deal with cases in which $$M \neq 2^{L}$$ (see Supplementary Note 2). For comparison, the usual Ising mapping21 requires MN nodes (up to $$N^2$$). Thus, the Potts solver can save node resources, which benefits physical solvers with limited node resources.

## Conclusions

We demonstrated a Potts-model solver based on a hybrid architecture composed of a physical Ising solver and digital processing. The Potts problem is mapped onto iterative Ising problems with learning of the weights for the interactions, where convergence is assured if the physical solver can find a low-energy Ising solution. We experimentally realized this solver with a CIM and a standard CPU (Intel(R) Xeon(R)). We showed that graph coloring and clustering problems can be solved by using simple Ising models (no magnetic fields were needed to introduce constraints, and there was no need for a large number of spins). The resource overhead for embedding the problem is significantly suppressed. As a tradeoff, iterative calculations with learning (namely, additional computational time) are required. We expect this additional time to be insignificant if the physical solver is fast enough. The cost of communication between the physical and digital systems is an essential problem of hybrid computation, but it can be significantly suppressed by directly coding the learning logic in the CIM's measurement-feedback system.

The proposed method approximates an M-state Potts problem with $$L\,(= \lceil \log_2 M \rceil)$$ Ising problems, which means that it does not guarantee that the ground state of the given Potts problem will be obtained. Although the heuristic feedback schemes that we call domain separation and group reunion significantly improved the solution accuracy, the feedback presumably cannot completely compensate for the information lost in dividing a Potts problem into Ising problems for general instances. Nevertheless, we consider the method important as a scheme for obtaining approximate but useful solutions to integer optimization problems in a short time by utilizing the very fast computation speed of physical Ising machines and related algorithms on non-CPU digital devices40,41.

## Methods

### Ferromagnetic solutions as a sign of convergence

By considering $$\Delta E_{\mathrm{Potts}}^{(l)},$$ we can find a convergence condition characterized by ferromagnetic solutions. Note that there are degenerate ferromagnetic solutions due to spin-inversion symmetry, and the degeneracy $$d_{FM}$$ increases as $$d_{FM} = 2^{l}$$ (or, more precisely, $$d_{FM} = 2 M^{(l)}$$) because of the reduction of the graph and interaction matrix. Equation 3 indicates that these ferromagnetic solutions, except for the complete-down state ($$s_i^{(l)} = -1$$ for all i), cause only a (trivial) change in the multivalued spins, which can be reduced to the steady spin states $$S_i^{(l+1)} = S_i^{(l)}.$$ Accordingly, we find another expression for the convergence condition: $$J_{ij}^{(l+1)} = J_{ij}^{(l)}.$$

### Experimental setup of CIM

As shown in Fig. 1b, the CIM contains a phase-sensitive amplifier (PSA), a 1-km fiber ring cavity, and an FPGA module. We employ a periodically poled lithium niobate (PPLN) waveguide as the PSA, which amplifies only light whose phase component is 0 or π relative to the pump phase, as a result of signal–idler degenerate optical parametric amplification42,43. These two amplified components express the two Ising spin states. Because the cavity round-trip time is 5 μs and the pump pulse interval is 1 ns, over 5000 DOPO pulses are generated inside the 1-km cavity, of which 512 DOPO pulses are used as artificial Ising spins. The 512 DOPO pulses are mutually coupled by using the measurement-and-feedback scheme with the FPGA module10,26. We can encode the interaction matrix $$J_{ij}$$ in the FPGA module with eight-bit integers ranging from −128 to 128. For solving the clustering problem, $$\max |J_{ij}| \sim C = 2 N_{\mathrm{edge}} = 184$$ exceeds the maximum range of the FPGA module. Thus, $$J_{ij}$$ in the CIM is rounded off as $$R(B_i B_j - C A_{ij})$$ with $$R = 1/2.$$ By performing simulated annealing44 calculations without rounding, we confirmed that the error caused by this rounding is not critical to the ground-state search.

### Actual computational time

The Ising-solver process of the CIM is completed in 500 μs, which is the time for 100 round trips of DOPO pulses in the 1-km cavity. The obtained spin configurations are stored in the FPGA module and transferred to the CPU to update the interaction matrix of the next stage. The calculations in the digital part take at most 30 μs when both the domain separation and group reunion algorithms are used simultaneously. Since the current FPGA module uses a slow serial communications interface (RS-232C), it takes a few seconds to transfer the annealing results and the updated matrix between the FPGA module and CPU. Although this technical issue is beyond the current scope, it is important to discuss how much we can shorten the transfer time.

For example, by using 10 Gigabit Ethernet (10 Gbps), the transfer time for $$J_{ij}^{(l)}$$ consisting of $$8 \times 512^2$$ bits is estimated to be 0.2 ms ideally. Note that $$J_{ij}^{(l)}$$ can be written as $$J_{ij}^{(l)} = W_{ij}^{(l)} J_{ij}$$ with $$W_{ij}^{(l+1)} \equiv \delta(S_i^{(l)}, S_j^{(l)}).$$ Except for the first stage, it is enough to transfer $$S_i^{(l)}$$ of $$\log_2 M \times 512$$ bits, which takes a few microseconds. Furthermore, we can directly write the $$J_{ij}^{(l)}$$-update logic in the FPGA module, which eliminates the data transfer except for the first input and final output. System-on-chip FPGA devices could be used to implement more complicated feedback algorithms.
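The back-of-the-envelope estimate can be reproduced directly (our arithmetic check; the spin-transfer line assumes M = 4, i.e., two bits per spin, as in the coloring experiment):

```python
# Figures from the text: 8-bit entries, 512 x 512 matrix, 10 Gbps link
bits_matrix = 8 * 512 ** 2             # full interaction matrix J^(l)
bits_spins = 2 * 512                   # log2(M) * 512 bits for M = 4
rate = 10e9                            # 10 Gigabit Ethernet, bits per second
print(bits_matrix / rate * 1e3, "ms")  # ~0.21 ms, matching the 0.2 ms estimate
print(bits_spins / rate * 1e6, "us")   # ideal link time for the spins alone
```

The spin-only transfer is orders of magnitude smaller than the full-matrix transfer, which is why sending $$S_i^{(l)}$$ instead of $$J_{ij}^{(l)}$$ fits within the few-microsecond budget quoted above.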

### Comparison with other algorithms

We compared our experimental clustering results with other algorithms running on a standard CPU. We used reliable algorithms45,46, namely the Louvain greedy34 and Infomap35 algorithms. The greedy algorithm reached the same best solution of Q = 0.646 at a small rate (about 2%) and frequently reached the second- and third-best solutions with Q ∼ 0.643 (over 70%). The Infomap algorithm reached the best solution with the highest probability, about 60% (see Supplementary Note 6). Louvain ran in a few milliseconds, while Infomap took a few seconds on an Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70 GHz. However, the number of nodes was too small to evaluate the run times of these algorithms. A further benchmark study like the one in ref. 45 is left for future work, because the number of nodes is strictly limited in the current setup.