Abstract
The development of physical simulators, called Ising machines, that sample from low energy states of the Ising Hamiltonian has the potential to transform our ability to understand and control complex systems. However, most of the physical implementations of such machines have been based on a similar concept that is closely related to relaxational dynamics such as in simulated, meanfield, chaotic, and quantum annealing. Here we show that dynamics that includes a nonrelaxational component and is associated with a finite positive Gibbs entropy production rate can accelerate the sampling of low energy states compared to that of conventional methods. By implementing such dynamics on field programmable gate array, we show that the addition of nonrelaxational dynamics that we propose, called chaotic amplitude control, exhibits exponents of the scaling with problem size of the time to find optimal solutions and its variance that are smaller than those of relaxational schemes recently implemented on Ising machines.
Introduction
Many complex systems such as spin glasses, interacting proteins, largescale hardware, and financial portfolios, can be described as ensembles of disordered elements that have competing frustrated interactions^{1} and rugged energy landscapes. There has been a growing interest in using physical simulators called Ising machines in order to reduce time and resources needed to identify configurations that minimize their total interaction energy, notably that of the Ising Hamiltonians \({{{{{{{\mathcal{H}}}}}}}}\) with \({{{{{{{\mathcal{H}}}}}}}}({{{{{{{\boldsymbol{\sigma }}}}}}}})=\frac{1}{2}{\sum }_{ij}{\omega }_{ij}{\sigma }_{i}{\sigma }_{j}\) (with ω_{ij} the symmetric Ising couplings, i.e., ω_{ij} = ω_{ji}, and σ_{i} = ±1) that is related to many nondeterministic polynomialtime hard (NPhard) combinatorial optimization problems and various realworld applications^{2} (see ref. ^{3} for a review). Recently proposed implementations include memresistor networks^{4}, micro or nanoelectromechanical systems^{5}, micromagnets^{6,7}, coherent optical systems^{8}, hybrid optoelectronic hardware^{9,10,11}, integrated photonics^{12,13,14}, flux qubits^{15}, and BoseEinstein condensates^{16}. In principle, these physical systems often possess unique properties, such as coherent superposition in flux qubits^{17} and energy efficiency of memresistors^{4,18}, which could lead to a distinctive advantage compared to conventional computers (see Fig. 1(a) and (b)) for the sampling of low energy states. In practice, the difficulty in constructing connections between constituting elements of the hardware is often the main limiting factor to scalability and performance for these systems^{15,19}. Moreover, these devices often implement schemes that are directly related to the concept of annealing (either simulated^{20,21}, meanfield^{22,23}, chaotic^{18,24}, and quantum^{17,25}) in which the escape from the numerous local minima^{26} and saddle points^{27} of the free energy function can only be achieved under very slow modulation of the control parameter (see Fig. 1(c)). These methods are dependent on nonequilibrium dynamics called aging that, according to recent numerical studies^{28}, is strongly nonergodic and seems to explore only a confined subspace determined by the initial condition rather than wander in the entire configurational space^{29} for meanfield spin glass models. In general, such systems find the solutions of minimal energy only after many repetitions of the relaxation process.
Alternative dynamics that is not based on the concepts of annealing and relaxation may perform better for solving hard combinatorial optimization problems^{30,31,32}. Various kinds of dynamics have been proposed^{3,33,34,35,36}, notably chaotic dynamics^{18,37,38,39,40}, but have either not been implemented onto specialized hardware^{37,41} or use chaotic dynamics merely as a replacement to random fluctuations^{18,38}. It was recently shown that the control of amplitude in meanfield dynamics can improve the performance of Ising machines by introducing errorcorrection terms (see Fig. 1(d)), effectively doubling the dimensionality of the system, whose role is to correct the amplitude heterogeneity^{30}. Because of the similarity of such dynamics with that of a neural network, it can be implemented especially efficiently in electronic neuromorphic hardware where memory is distributed with the processing^{42,43,44}.
In this paper, we show that the addition of the nonrelaxational part to the relaxation process makes the dynamics able to escape at a much faster rate than relaxational ones from local minima and saddles of the corresponding energy function. The exponential scaling factor with respect to system size of the time needed to reach optimal configurations of spin glasses that can be found after a very long computation time using stateoftheart heuristics, called time to solution, is smaller in the former case, which raises the question whether the nonrelaxational component makes the dynamics qualitatively different from the very slow relaxation observed in classic MonteCarlo simulations of spin glasses. In order to extend numerical analysis to large problem sizes and limit finitesize effects, we implement a scheme that we name chaotic amplitude control (CAC) on a field programmable gate array (FPGA, see Fig. 1(b)) and show that the developed hardware can be faster for finding these optimal configurations in the limit of large problem sizes than many stateoftheart algorithms and Ising machines for some reference benchmarks with enhanced energy efficiency.
Results
Meanfield dynamics
For the sake of simplicity, we consider the classical limit of Ising machines for which the state space is often naturally described by analog variables (i.e., real numbers) noted x_{i} in the following. The variables x_{i} represent measured physical quantities such as voltage^{4} or optical field amplitude^{8,9,10,11,12,13} and these systems can often be simplified to networks of interacting nonlinear elements whose time evolution can be written as follows:
where f_{i} and g_{i} represent the nonlinear gain and interaction, respectively, and are assumed to be monotonic, odd, and invertible “sigmoidal” functions for the sake of simplicity; η_{i}, experimental white noise of standard deviation σ_{0} with \(\langle {\eta }_{i}(t){\eta }_{j}(t^{\prime} )\rangle ={\delta }_{ij}\delta (tt^{\prime} )\); and N, the number of spins. δ_{ij} and δ(t) are the Kronecker delta symbol and Dirac delta function, respectively. Ordinary differential equations similar to eq. (1) have been used in various computational models that are applied to NPhard combinatorial optimization problems such as HopfieldTank neural networks^{45}, coherent Ising machines^{46}, and correspond to the “soft” spin description of frustrated spin systems^{47}. Moreover, the steady states of eq. (1) correspond to the solutions of the “naive” ThoulessAndersonPalmer (nTAP) equations that arise from the meanfield description of SherringtonKirkpatrick spin glasses when the Onsager reaction term has been discarded^{48}. In the case of neural networks in particular, the variables x_{i} and constant parameters ω_{ij} correspond to firing rates of neurons and synaptic coupling weights, respectively.
It is well known that, when β_{i} = β for all i and the noise is not taken into account (σ_{0} = 0), the time evolution of this system is motion in the state space that seeks out minima of a potential function^{49} (or Lyapunov function) V given as \(V=\beta {{{{{{{\mathcal{H}}}}}}}}({{{{{{{\boldsymbol{y}}}}}}}})+{\sum }_{i}{V}_{b}({y}_{i})\), where V_{b} is a bistable potential with \({V}_{b}({y}_{i})=\int\nolimits_{0}^{{y}_{i}}{f}_{i}({g}_{i}^{1}(y))\,{{{{{{{\rm{d}}}}}}}}y\) and \({{{{{{{\mathcal{H}}}}}}}}({{{{{{{\boldsymbol{y}}}}}}}})=\frac{1}{2}{\sum }_{ij}{\omega }_{ij}{y}_{i}{y}_{j}\) is the extension of the Ising Hamiltonian in the real space with y_{i} = g_{i}(x_{i}) (see Supplementary Note 1.1). The bifurcation parameter β, which can be interpreted as the inverse temperature of the naive TAP equations^{48}, the steepness of the neuronal transfer function in HopfieldTank neural networks^{45}, or to the coupling strength in coherent Ising machines^{8,9,10}, is usually decreased gradually in order to improve the quality of solutions found. This procedure has been called meanfield annealing^{23}, and can be interpreted as a quasistatic deformation of the potential function V (see Fig. 1(c)). There is, however, no guarantee that a sufficiently slow deformation of the landscape V will ensure convergence to the lowest energy state contrary to the quantum adiabatic theorem^{50} or the convergence theorem of simulated annealing^{51}. At fixed β, global convergence to the minimum of the potential V can be assured if σ_{0} is gradually decreased with \({\sigma }_{0}{(t)}^{2} \sim \frac{c}{log(2+t)}\) and c sufficiently large^{52}. The parameter \({\sigma }_{0}^{2}\) is analogous to the temperature in simulated annealing in this case. The global minimum of the potential V does not, however, generally correspond to that of the Ising Hamiltonians \({{{{{{{\mathcal{H}}}}}}}}\) at finite β. Moreover, the statistical analysis of spin glasses suggests that the potential V is highly nonconvex at low temperature and that simple gradient descent very unlikely reaches the global minimum of \({{{{{{{\mathcal{H}}}}}}}}({{{{{{{\boldsymbol{\sigma }}}}}}}})\) because of the presence of exponentially numerous local minima^{26} and saddle points^{27} as the size of the system increases. The slow relaxation time of MonteCarlo simulations of spin glasses, such as when using simulated annealing, might also be explained by similar trapping dynamics during the descent of the free energy landscape obtained from the TAP equations^{27}. In the following, we consider in particular the soft spin description obtained by taking \({f}_{i}({x}_{i})=(1+p){x}_{i}{x}_{i}^{3}\) and y_{i} = g(x_{i}) = x_{i}, where p is the gain parameter, which is the canonical model of the system described in eq. (1) at proximity of a pitchfork bifurcation with respect to the parameter p (see theory of weakly connected neural networks^{53}). In this case, the potential function V_{b} is given as \({V}_{b}({x}_{i})=(1p)\frac{{x}_{i}^{2}}{2}+\frac{{x}_{i}^{4}}{4}\) and eq. (1) can be written as \(\frac{\,{{{{{{{\rm{d}}}}}}}}{x}_{i}}{\,{{{{{{{\rm{d}}}}}}}}t}=\frac{\partial V}{\partial {x}_{i}}\), ∀i.
Chaotic amplitude control
In order to define deterministic dynamics that is inclined to visit spin configurations associated with lower Ising Hamiltonian without relying entirely on the descent of a potential function, we introduce error signals, noted \({e}_{i}\in {\mathbb{R}}\), that modulate the strength of coupling β_{i} to the ith nonlinear element such that β_{i}(t) defined in eq. (1) is expressed as β_{i}(t) = βe_{i}(t) with β > 0. The time evolution of the error signals e_{i} are given as follows^{30}:
where a and ξ are the target amplitude and the rate of change of error variables, respectively, with a > 0 and ξ > 0. If the system settles to a steady state, the values \({y}_{i}^{* }=g({x}_{i}^{* })\) become exactly binary with \({y}_{i}^{* }=\pm\!\sqrt{a}\). When p < 1, the internal fields h_{i} at the steady state, defined as h_{i} = ∑_{j}ω_{ij}σ_{j} with \({\sigma }_{j}={y}_{j}^{* }/ {y}_{j}^{* }\), are such that h_{i}σ_{i} > 0, ∀i^{30}. Thus, each equilibrium point of the analog system corresponds to that of a zerotemperature local minimum of the binary spin system.
The dynamics described by the coupled equations (1) and (2) is not derived from a potential function because error signals e_{i} introduce asymmetric interactions between the x_{i} and the computational principle is not related to a gradient descent. Rather, the addition of error variables results in additional dimensions in the phase space via which the dynamics can escape local minima. The mechanism of this escape can be summarized as follows. It can be shown (see the Supplementary Note 1.2) that the dimension of the unstable manifold at equilibrium points corresponding to local minima σ of the Ising Hamiltonian depends on the number of eigenvalues μ(σ) with μ(σ) > F(a) where μ(σ) are the eigenvalues of the matrix \({\{\frac{{\omega }_{ij}}{ {h}_{i} }\}}_{ij}\) (with internal field h_{i}) and F a function given as \(F(y)=\frac{\psi ^{\prime} (y)}{\psi (y)}y\) and \(\psi (y)=\frac{f({g}^{1}(y))}{({g}^{1})^{\prime} (y)}\). Thus, there exists a value of a such that all local minima (including the ground state) are unstable and for which the system exhibits chaotic dynamics that explores successively candidate boolean configurations. The energy is evaluated at each step and the best configuration visited is kept as the solution of a run. This chaotic search is particularly efficient for sampling configurations of the Ising Hamiltonian close to that of the ground state using a single run although the distribution of sampled states is not necessarily given by the Boltzmann distribution. Note that the use of chaotic dynamics for solving Ising problems has been discussed previously^{18,54}, notably in the context of neural networks, and it has been argued that chaotic fluctuations may possess better properties than Brownian noise for escaping from local minima traps. In the case of the proposed scheme, the chaotic dynamics is not merely used as a replacement for noise. Rather, the interaction between nonlinear gain and errorcorrection results in the destabilization of states associated with lower Ising Hamiltonian. Increasing the amplitude of additive noise σ_{0} in eq. (1) does not, in fact, significantly increase the efficiency of the system and σ_{0} is thus set to zero.
Ensuring that fixed points are locally unstable does not guarantee that the system does not relax to periodic and chaotic attractors. We have previously proposed that nontrivial attractors can also be destabilized by ensuring the positive rate of Gibbs entropy production using a modulation of the target amplitude^{30}. In this paper, we propose an alternative heuristic modulation of the target amplitude that is more suitable for a digital implementation on FPGA than the one proposed in^{30} without significant decrease in performance for most problem instances (see Supplementary Note 1.3 for a comparison of the two schemes). Because the value of a for which all local minima is unstable is not known a priori, we propose instead to destabilize the local minima traps by dynamically modulating a depending on the visited configurations σ as follows:
where \({{\Delta }}{{{{{{{\mathcal{H}}}}}}}}(t)={{{{{{{{\mathcal{H}}}}}}}}}_{{{\mbox{opt}}}}{{{{{{{\mathcal{H}}}}}}}}(t)\); \({{{{{{{\mathcal{H}}}}}}}}(t)\), the Ising Hamiltonian of the configuration visited at time t; and \({{{{{{{{\mathcal{H}}}}}}}}}_{{{\mbox{opt}}}}\), a given target energy. In practice, we set \({{{{{{{{\mathcal{H}}}}}}}}}_{{{\mbox{opt}}}}\) to the lowest energy visited during the current run, i.e., \({{{{{{{{\mathcal{H}}}}}}}}}_{{{\mbox{opt}}}}(t)={{{\mbox{min}}}}_{t^{\prime} \le t}{{{{{{{\mathcal{H}}}}}}}}(t^{\prime} )\). The function tanh is the tangent hyperbolic. ρ and δ are positive real constants. Lastly, the parameter ξ (see eq. (2)) is modulated as follows: \(\frac{\,{{{{{{{\rm{d}}}}}}}}\xi }{\,{{{{{{{\rm{d}}}}}}}}t}={{\Gamma }}\) when t − t_{r} < Δt, where t_{r} is the last time for which either the best known energy \({{{{{{{{\mathcal{H}}}}}}}}}_{{{\mbox{opt}}}}\) was updated or ξ was reset. Otherwise, ξ is reset to 0 if t − t_{r} ≥ Δt and t_{r} is set to t. Numerical simulations shown in the following suggest that this modulation results in the destabilization of nontrivial attractors (periodic, chaotic, etc.) for typical problem instances.
Hardware independent timetosolution scaling
In order to test if the nonrelaxational dynamics of chaotic amplitude control might be able to accelerate the search of meanfield dynamics for finding the ground state of typical frustrated systems, we look for the ground states of SherringtonKirkpatrick (SK) spin glass instances using the numerical simulation of eqs. (1) to (3) and compare time to solutions with those of two closely related relaxational schemes: noisy meanfield annealing (NMFA)^{22} and the simulation of the coherent Ising machine (simCIM, see Supplementary Note 4.12). Because finding the ground states of SK spin glasses is a NPhard problem, it cannot be assured that the states found by heuristic solvers, even after a very long computation time, are the real ground states. In order to compare the performance of a set of heuristic solvers, we can define for each instance the “optimal” energy as the one equal to the lowest energy found repeatedly (>10^{6} times) by these heuristics after a very long computation time. States that have this optimal energy are called solution states in the following. Because the arithmetic complexity of calculating one step of these three schemes is dominated by the matrixvector multiplication (MVM), it is sufficient for the sake of comparison to count the number of MVM, noted ν, to find the solution state energy of a given instance with 99% success probability, with \(\nu (K)=K\frac{\,{{\mbox{ln}}}(10.99)}{{{\mbox{ln}}}\,(1{p}_{0}(K))}\) and p_{0}(K) the probability of visiting a solution state configuration at least once after a number K of MVMs in a single run. In Fig. 2, NMFA (a) and the CAC (b) are compared using the averaged success probability 〈p_{0}〉 of finding the solution state for 100 randomly generated SK spin glass instances per problem size N (see Supplementary Note 3 for details about the benchmark set). Note that the success probability of the meanfield annealing method does not seem to converge to 1 even for large annealing time (see Fig. 2(a)). Because the success probability of NMFA and simCIM remains low at larger problem size, its correct estimation requires simulating a larger number of runs, which we achieved by using GPU implementations of these methods. On the other hand, the average success probability 〈p_{0}〉 of CAC is of order 1 when the maximal number of MVM is large enough, suggesting that the system rarely gets trapped in local minima of the Ising Hamiltonian or nontrivial attractors. In Fig. 2(c) and (d) are shown the q^{th} percentile (with q = 50, i.e., the median) of the MVM to solution distribution, noted ν_{q}(K; N), for various duration of simulation K, where K is the number of MVMs of a single run. The minimum of these curves, noted \({\nu }_{q}^{* }(N)\) with \({\nu }_{q}^{* }(N)={{{\mbox{min}}}}_{K}{\nu }_{q}(K;N)\), represents the optimal scaling of MVM to solution vs. problem size N^{15}. Using the hypothesis of an exponential scaling with the square root of problem size N, CAC exhibits significantly smaller scaling exponent (γ = 0.18 ± 0.06) than the NMFA (γ = 0.47 ± 0.04) and simCIM (γ = 0.54 ± 0.03, see inset in Fig. 2(e)). We have verified that this scaling advantage holds for various parameters of the meanfield annealing (see Supplementary Note 4.1). Note that a rootexponential scaling behavior of the median time to solution has been previously reported for SK spin glass problems^{15,55} and other NPHard problems^{56}. We consider both cases of a rootexponential and exponential scaling as possible hypotheses in the following. Note that a similar rootexponential scaling at finite size N is observed when we apply the CAC algorithm to solving problems from the recently proposed Wishart planted instances^{57,58} in the “easy” regime (see Supplementary Note 3.3).
Benchmark of the FPGA implementation
Although comparison of algebraic complexity indicates that CAC has a scaling advantage over meanfield annealing, it is in practice necessary to compare its physical implementation against other stateoftheart methods because the performance of hardware depends on other factors such as memory access and information propagation delays. To this end, CAC is implemented into a FPGA because its similarity with neural networks makes it wellfitted for a design where memory is distributed with processing (see Supplementary Note 2 for the details of the FPGA implementation). The organization of the electronic circuit can be understood using the following analogy. Pairs of analog values x_{i} and e_{i}, which represent averaged activity of two types of neurons, are encoded within neighboring circuits. This microstructure is repeated N times on the whole surface of the chip, which resembles the columnar organization of the brain. The nonlinear processes f_{i}(x_{i}), which model the localpopulation activation functions and are independent for i ≠ j, are calculated in parallel. The coupling between elements i and j ∈ {1, …, N} that is achieved by the dot product in eq. (1) is implemented by circuits that are at the periphery of the chip and are organized in a summation tree reminiscent of dendritic branching (see Fig. 1(b)). The power consumption of the developed hardware never exceeds 5W because of limitations of the development board that we have used.
First, we compare the FPGA implementation of CAC against stateoftheart CPU algorithms: breakout local search^{59} (BLS) that has been used to find many of the best known maximumcuts (equivalently, Ising energies) from the GSET benchmark set (https://web.stanford.edu/~yyye/yyye/Gset/), a welloptimized singlecore CPU implementation of parallel tempering (or random replica exchange MonteCarlo Markov chain sampling)^{60,61} (PT, courtesy of S. Mandrà), simulated annealing (SA)^{62}. Figure 3(a) shows that the CAC on FPGA has the smallest real time to solution \({\tau }_{q}^{* }\) against most other stateoftheart algorithms despite just 5W powerconsumption where \({\tau }_{q}^{* }(N)\) is the optimal q^{th} percentile of time to solution with 99% success probability and is given as \({\tau }_{q}^{* }(N)={{{\mbox{min}}}}_{T}{\tau }_{q}(T;N)\) where τ(T) of a given instance is \(\tau (T)=T\frac{\,{{\mbox{ln}}}(10.99)}{{{\mbox{ln}}}\,(1{p}_{0}(T))}\) and T is the duration in seconds of a run. The probability p_{0}(T) is evaluated using 100 runs per instance. The results of the CAC algorithm run on a CPU are also included in Fig. 3. The CPU implementation of CAC written in python for this work is not optimized and is consequently slower than other algorithms. However, its scaling of time to solution with problem size is consistent with that of CAC on FPGA. Figure 3 shows that CAC implemented on either CPU or FPGA has a significantly smaller increase of time to solution with problem size than SA run on CPU. Note that the power consumption of transistors in the FPGA and CPU scales proportionally to their clock frequencies. In order to compare different hardware despite the heterogeneity in their power consumption, the q^{th} percentile of energytosolution \({E}_{q}^{* }\), i.e., the energy \({E}_{q}^{* }\) required to solve SK instances with \({E}_{q}^{* }=P{\tau }_{q}^{* }\) and P the power consumption, is plotted in Fig. 3(b). For the sake of simplicity, we assume a 20 watts power consumption for the CPU. These numbers represent typical orders to magnitude for contemporary digital systems. CAC on FPGA is 10^{2} to 10^{3} times more energy efficient than stateoftheart algorithms running on classical computers.
The MonteCarlo methods SA and PT have moreover been recently implemented on a specialpurpose electronic chip called Digital Annealer (DA)^{20}. In Fig. 4, we show the scaling exponents of 50th and 80th percentiles of the MVM and time to solution distribution for problem sizes N = 800 to N = 1100 based on the hypothesis of scaling in e^{γN} obtained by fitting data shown in Figs. 2 and 3 (see “CAC fully parallel” and ‘CAC 100 × 100”, respectively, in Fig. 4 for the numerical values of the scaling exponents), and compare them to that reported for SA and PT implemented on CPU and DA. Note that we replicated the benchmark method of ref. ^{20} for the sake of the comparison by fitting time to solution from N = 800 up to N = 1100. The scaling obtained from fitting the MVM to solution in Fig. 2 is based on the assumption that the matrixvector multiplication can be calculated fully in parallel in a time that scales as log(N) instead of N^{2} (see Methods section) at least up to N = 1100. We include this hypothesis because many other Ising machines exploit the parallelization of matrixvector multiplication for speed up^{20,63}, whereas the current implementation of CAC iterates on block matrices of size 100 by 100 and is thus only partially parallel because of resource limitations specific to the downscale FPGA used in this work. Note that the number of matrixvector multiplications necessary to find the solution state dominates the exponential scaling rather than the time to compute one matrixvector multiplication. The scaling of time to solution for chaotic amplitude control observed is significantly smaller than the ones of standard MonteCarlo methods SA and PT^{20}, especially in the case of a fully parallel implementation. The scaling exponents of fully parallel CAC is smaller than that of DA and on par with that of PT on DA (PTDA), although CAC does not require simulating replica of the system and is thus faster in absolute time than PTDA^{20}.
Next, the proposed implementation of chaotic amplitude control is compared to other recently developed Ising machines (see Fig. 5). The relatively slow increase of time to solution with respect to the number of spins N when solving larger SK problems using CAC suggests that our FPGA implementation is faster than the Hopfield neural network implemented using memresistors (memHNN)^{4}, the restricted Boltzmann machine using a FPGA (FPGARBM)^{55} at large N. Extrapolations are based on the hypotheses of scaling in e^{γN} and \({e}^{\gamma \sqrt{N}}\) by fitting the available experimental data up to N = 100 for memHNN and FPGARBM, N = 150 for NTT CIM, and N = 1100 for FPGACAC. Figure 5 shows that memHNN, FPGARBM, and NTT CIM have similar scaling exponents, although FPGARBM tends to exhibit a scaling in e^{γN} rather than \({e}^{\gamma \sqrt{N}}\) for N ≈ 100^{55}. It can be nonetheless expected that the algorithm implemented in memHNN, which is similar to meanfield annealing, has the same scaling behavior as simCIM and NMFA (see Fig. 2). Note that the lines showing the scaling in \({e}^{\gamma \sqrt{N}}\) in Fig. 5 can be seen as a lower bound of the TTS if the subexponential scaling is finitesize effect.
It is noteworthy to mention that a recent implementation of the simulated bifurcation machine^{63} (SBM), which is not based on the gradient descent, similarly to CAC, but based on adiabatic evolutions of energy conservative systems performs well in solving SK problems. Both SBM and CAC exhibit smaller time to solution than other gradient based methods. SBM has been implemented on a FPGA (the Intel Stratix 10 GX) that has approximately 5 to 10 times more adaptive logic modules than the KU040 FPGA used to implement CAC. In order to compare SBM and CAC if implemented on an equivalent FPGA hardware, we plot in Fig. 5 the estimation of the time to solution for a fully parallel implementation of CAC using the hypothesis that one matrixvector multiplication of size 1100 × 1100 can be achieved in 0.3 μs. This is the same time to compute a MVM that we can infer from time to solution reported in ref. ^{63} for SBM with binary connectivity given that problems of size N = 100 (N = 700) are solved in 29 μs (55 ms) and 94 (81000) MVMs, respectively. Note that SBM can reach the solution states after approximately 20 times less MVMs than CAC at N = 100 but only 5 times less MVM at N = 1000, suggesting that the speed of SBM depends largely on hardware rather than an algorithmic scaling advantage. Moreover, the simulated bifurcation machine^{63} does not perform significantly better than our current implementation of CAC for solving instances of the reference MAXCUT benchmark set called GSET (https://web.stanford.edu/~yyye/yyye/Gset/) (see Table 1) because CAC can find better cuts and exhibits smaller time to solution for multiple GSET instances even compared to the case of the implementation on the smaller KU040 FPGA with the probability of finding maximum cuts of the GSET in a single run that is much smaller with SBM. Comparison of the scaling behavior of time to solution between CAC and SBM is unfortunately not possible based on available data^{63}.
Estimation of time to solution distribution
Next, we consider the whole distribution of time to solution in order to compare the ability of various methods to solve harder instances. As shown in Fig. 6(a), the cumulative distribution function (CDF) P(τ; T) of time to solution with 99% success probability τ is not uniquely defined as it depends on the duration T of the runs. We can define an optimal CDF P^{*}(τ) that is independent of the runtime T as follows: P^{*}(τ) = max_{T}P(τ; T). Numerical simulations show that this optimal CDF is well described by lognormal distribution, that is \({P}^{* }(log(\tau )) \sim {{{{{{{\mathcal{N}}}}}}}}(\mu ,\sqrt{v})\) where \(\sqrt{v}\) is the standard deviation of log(τ) (see Fig. 6(b), (c), and (d) for the cases of CAC, SA, and NMFA, respectively). In Fig. 6(e), it is shown that the scaling of the standard deviation \(\sqrt{v}(N)\) with the problem size N is significantly smaller for CAC, which implies that harder instances can be solved relatively more rapidly than using other methods. This result confirms the advantageous scaling of higher percentiles for CAC that was observed in Figs. 2 and 3.
Conclusions
The framework described in this paper can be extended to solve other types of constrained combinatorial optimization problems such as the traveling salesman^{45}, vehicle routing, and lead optimization problems. Moreover, it can be adapted to a variety of recently proposed Ising machines^{4,5,6,7,8,9,10,11,12,13,15}, which would benefit from implementing a scheme that does not rely solely on the descent of a potential function. In particular, the performance of CIM^{9,10}, memHNN^{4}, and chipscale photonic Ising machine^{14}, which have small time to solution for small problem sizes (N ≈ 100) for which it may be sufficient to do rapid sampling based on convex optimization^{64} but with a relatively large scaling exponent, which limits their scalability, could be improved by adapting the implementation we propose if these hardware can be shown to be able to simulate larger numbers of spins experimentally. Rapid progress in the growing field of Ising machines may allow to verify scaling behaviors of the various methods at larger problem sizes and, thus, limit further finitesize effects.
The scaling exponents we have reported in this paper are based on the integration of the chaotic amplitude control dynamics using a Euler approximation of its ODEs. We have noted that the scaling of MVMs to solution of SK spin glass problems is reduced when the Euler time step is decreased (see Supplementary Note 2.8). The scaling exponents of CAC might thus be smaller than reported in this paper in the limit of a more accurate numerical integration over continuous time. It is therefore important future work to evaluate the scaling for N ≫ 1000 using a faster numerical simulation method. Such numerical calculations require a careful analysis of the integration method of ODEs, numerical precision, and tuning of parameters. It is also of considerable interest to implement CAC on multiple FPGAs for very high parallelization^{65} and on analog physical systems for further reduction of power consumption.
Nonrelaxational dynamics described herein is not limited to artificial simulators and likely also emerge in natural complex systems. In particular, it has been hypothesized that the brain operates out of equilibrium at large scales and produces more entropy when performing physically and cognitively demanding tasks^{66,67} in particular for the ones involving creativity for maximizing reward rather than memory retrieval. Such neural processes cannot be explained simply by the relaxation to fixed point attractors^{49}, periodic limit cycles^{68}, or even lowdimensional chaotic attractors whose selfsimilarity may not be equivalent to complexity but set the conditions for its emergence^{67,69}. Similarly, evolutionary dynamics that is characterized as nonergodic when the time required for representative genome exploration is longer than available evolutionary time^{70} may benefit from nonrelaxational dynamics rather than slow glassy relaxation for faster identification of highfitness solutions. A detailed analytic comparison between the slow relaxation dynamics observed in MonteCarlo simulations of spin glasses with the one proposed in this paper is needed in order to explain the apparent difference in their scaling of time to solution exhibited by our numerical results.
Methods
We target the implementation of a lowpower system with maximum power supply of 5W using a XCKU040 Kintex Ultrascale Xilinx FPGA integrated on an Avnet board. The implemented circuit can process Ising problems of up to at least 1100 spins fully connected and of >2000 spins sparsely connected within the 5W power supply limit. Data is encoded into 18 bits fixed point vectors with 1 sign, 5 integer and 12 decimal bits to optimize computation time and power consumption. An important feature of our FPGA implementation of CAC is the use of several clock frequencies to concentrate the electrical power on the circuits that are the bottleneck of computation and require a highspeed clock. For the realization of the matrixvector multiplication, each element of the matrix is encoded with 2 bits precision (w_{ij} is − 1, 0 or 1). An approximation based on the combination of logic equations describing the behavior of a multiplexer allows to achieve 10^{4} multiplications within one clock cycle. The results of these multiplications are summed using cascading DSP and CARRY8 connected in a tree structure. Using pipelining, a matrixvector multiplication for a squared matrix of size N is computed in \(2+5\frac{\log(N4)}{\log\,(5)}+(\frac{N}{u})^{2}\) clock cycles (see Supplementary Note 2.4) at a clock frequency of 50MHz with u = 100, which is determined by the limitation of the number of available electronic component of the XCKU040 FPGA. The block size u can be made at least 3 times larger using commercially available FPGAs, which implies that the number of clock cycles needed to compute a dot product can scale almost logarithmically for problems of size N ≈ 1000 (see Supplementary Note 2.4 for discussions of scalability) and that the calculation time can be further significantly decreased using a higherend FPGA. The calculation of the nonlinearity f_{i} and error terms is achieved at higher frequency (300MHz and 100Mhz) using DSP in 8 + (N/u) and 9 + (N/u) clock cycles, respectively. In order to minimize energy resources and maximize speed, the nonlinear and error terms are calculated multiple times during the calculation of a single matrixvector multiplication (see Supplementary Note 2).
Prior to computing the benchmark on the SherringtonKirkpatrick instances, the parameters of the system (see Supplementary Note 2.8) are optimized automatically using Bayesian optimization and banditbased methods^{71}. The automatic tuning of parameters for some previously unseen instances is out of the scope of this work but can be achieved to some extent using machine learning techniques^{72}.
Data availability
SherringtonKirkpatrick instances used in this paper are available upon request to T. Leleu. The GSET instances are available at https://web.stanford.edu/~yyye/yyye/Gset/.
Code availability
Requests for code availability should be addressed to T. Leleu.
References
Parisi, G., Mézard, M. & Virasoro, M. Spin Glass Theory and Beyond. Vol 187, 202 (World Scientific,1987).
Kochenberger, G. et al. The unconstrained binary quadratic programming problem: a survey. J. Comb. Optim. 28, 58–81 (2014).
Vadlamani, S. K., Xiao, T. P. & Yablonovitch, E. Physics successfully implements Lagrange multiplier optimization. Proc. Natl Acad. Sci. USA 117, 26639–26650 (2020).
Cai, F. et al. Powerefficient combinatorial optimization using intrinsic noise in memristor Hopfield neural networks. Nat. Electron. 3, 1–10 (2020).
Mahboob, I., Okamoto, H. & Yamaguchi, H. An electromechanical Ising Hamiltonian. Sci. Adv. 2, e1600236 (2016).
Camsari, K. Y., Faria, R., Sutton, B. M. & Datta, S. Stochastic pbits for invertible logic. Phys. Rev. X 7, 031014 (2017).
Camsari, K. Y., Sutton, B. M. & Datta, S. pbits for probabilistic spin logic. Appl. Phys. Rev. 6, 011305 (2019).
Marandi, A., Wang, Z., Takata, K., Byer, R. L. & Yamamoto, Y. Network of timemultiplexed optical parametric oscillators as a coherent Ising machine. Nat. Photon. 8, 937–942 (2014).
McMahon, P. L. et al. A fully programmable 100spin coherent Ising machine with alltoall connections. Science 354, 614–617 (2016).
Inagaki, T. et al. Largescale Ising spin network based on degenerate optical parametric oscillators. Nat. Photon. 10, 415–419 (2016).
Pierangeli, D., Marcucci, G. & Conti, C. Largescale photonic Ising machine by spatial light modulation. Phys. Rev. Lett. 122, 213902 (2019).
RoquesCarmes, C. et al. Heuristic recurrent algorithms for photonic Ising machines. Nat. Commun. 11, 1–8 (2020).
Prabhu, M. et al. Accelerating recurrent Ising machines in photonic integrated circuits. Optica 7, 551–558 (2020).
Okawachi, Y. et al. Demonstration of chipbased coupled degenerate optical parametric oscillators for realizing a nanophotonic spinglass. Nat. Commun. 11, 1–7 (2020).
Hamerly, R. et al. Experimental investigation of performance differences between coherent Ising machines and a quantum annealer. Sci. Adv. 5, eaau0823 (2019).
Kalinin, K. P. & Berloff, N. G. Global optimization of spin Hamiltonians with gaindissipative systems. Sci. Rep. 8, 17791 (2018).
Johnson, M. W. et al. Quantum annealing with manufactured spins. Nature 473, 194 (2011).
Kumar, S., Strachan, J. P. & Williams, R. S. Chaotic dynamics in nanoscale NbO2 Mott memristors for analogue computing. Nature 548, 318 (2017).
Heim, B., Rønnow, T. F., Isakov, S. V. & Troyer, M. Quantum versus classical annealing of Ising spin glasses. Science 348, 215–217 (2015).
Aramon, M. et al. Physicsinspired optimization for quadratic unconstrained problems using a digital annealer. Front. Phys. 7, 48 (2019).
Kirkpatrick, S., Gelatt, C. D. & Vecchi, M. P. Optimization by simulated annealing. Science 220, 671–680 (1983).
King, A. D., Bernoudy, W., King, J., Berkley, A. J. & Lanting, T. Emulating the coherent Ising machine with a meanfield algorithm. Preprint at https://arxiv.org/abs/1806.08422 (2018).
Bilbro, G. et al. in Advances in Neural Information Processing Systems, (ed. Touretzky, D. S.), 91–98 (Morgan Kaufmann Publishers Inc., 1989).
Chen, L. & Aihara, K. Chaotic simulated annealing by a neural network model with transient chaos. Neural Netw. 8, 915–930 (1995).
Kadowaki, T. & Nishimori, H. Quantum annealing in the transverse Ising model. Phys. Rev. E 58, 5355 (1998).
Tanaka, F. & Edwards, S. Analytic theory of the ground state properties of a spin glass. i. Ising spin glass. J. Phys. F: Metal Phys. 10, 2769 (1980).
Biroli, G. Dynamical TAP approach to mean field glassy systems. J. Phys. 32, 8365 (1999).
Bernaschi, M., Billoire, A., Maiorano, A., Parisi, G. & RicciTersenghi, F. Strong ergodicity breaking in aging of meanfield spin glasses. Proc. Natl Acad Sci USA 117, 17522–17527 (2020).
Cugliandolo, L. F. & Kurchan, J. On the outofequilibrium relaxation of the SherringtonKirkpatrick model. J. Phys. A 27, 5749 (1994).
Leleu, T., Yamamoto, Y., McMahon, P. L. & Aihara, K. Destabilization of local minima in analog spin systems by correction of amplitude heterogeneity. Phys. Rev. Lett. 122, 040607 (2019).
ErcseyRavasz, M. & Toroczkai, Z. Optimization hardness as transient chaos in an analog approach to constraint satisfaction. Nat. Phys. 7, 966–970 (2011).
Molnár, B., Molnár, F., Varga, M., Toroczkai, Z. & ErcseyRavasz, M. A continuoustime MaxSat solver with high analog performance. Nat. Commun. 9, 4864 (2018).
Aspelmeier, T. & Moore, M. Realizable solutions of the ThoulessAndersonPalmer equations. Phys. Rev. E 100, 032127 (2019).
Boettcher, S. & Percus, A. G. Optimization with extremal dynamics. Phys. Rev. Lett. 86, 5211–5214 (2001).
Zarand, G., Pazmandi, F., Pal, K. & Zimanyi, G. Using hysteresis for optimization. Phys. Rev. Lett. 89, 150201 (2002).
Leleu, T., Yamamoto, Y., Utsunomiya, S. & Aihara, K. Combinatorial optimization using dynamical phase transitions in drivendissipative systems. Phys. Rev. E 95, 022118 (2017).
Aspelmeier, T., Blythe, R., Bray, A. J. & Moore, M. A. Freeenergy landscapes, dynamics, and the edge of chaos in meanfield models of spin glasses. Phys. Rev. B 74, 184411 (2006).
Hasegawa, M., Ikeguchi, T. & Aihara, K. Combination of chaotic neurodynamics with the 2opt algorithm to solve traveling salesman problems. Phys. Rev. Lett. 79, 2344 (1997).
Horio, Y. & Aihara, K. Analog computation through highdimensional physical chaotic neurodynamics. Phys. D: Nonlin. Phenom. 237, 1215–1225 (2008).
Aihara, K. Chaos engineering and its application to parallel distributed processing with chaotic neural networks. Proc. IEEE 90, 919–930 (2002).
Montanari, A. Optimization of the SherringtonKirkpatrick Hamiltonian. SIAM J. Comput. FOCS191–FOCS1938 https://doi.org/10.1137/20M132016X (2018).
Furber, S. B., Galluppi, F., Temple, S. & Plana, L. A. The spinnaker project. Proc. IEEE 102, 652–665 (2014).
Davies, M. et al. Loihi: A neuromorphic manycore processor with onchip learning. IEEE Micro 38, 82–99 (2018).
Benjamin, B. V. et al. Neurogrid: A mixedanalogdigital multichip system for largescale neural simulations. Proc. IEEE 102, 699–716 (2014).
Hopfield, J. J. & Tank, D. W. "Neural” computation of decisions in optimization problems. Biol. Cybern. 52, 141–152 (1985).
Wang, Z., Marandi, A., Wen, K., Byer, R. L. & Yamamoto, Y. Coherent Ising machine based on degenerate optical parametric oscillators. Phys. Rev. A 88, 063853 (2013).
Sompolinsky, H. & Zippelius, A. Relaxational dynamics of the EdwardsAnderson model and the meanfield theory of spinglasses. Phys. Rev. B 25, 6860 (1982).
Thouless, D. J., Anderson, P. W. & Palmer, R. G. Solution of ’solvable model of a spin glass’. Philos. Magaz. 35, 593–601 (1977).
Hopfield, J. J. Neurons with graded response have collective computational properties like those of twostate neurons. Proc. Natl Acad Sci. USA 81, 3088–3092 (1984).
Farhi, E., Goldstone, J., Gutmann, S. & Sipser, M. Quantum computation by adiabatic evolution. Preprint at https://arxiv.org/abs/quantph/0001106 (2000).
Granville, V., Krivánek, M. & Rasson, J.P. Simulated annealing: a proof of convergence. IEEE Trans. Pattern Anal. Mach. Intell. 16, 652–656 (1994).
Geman, S. & Hwang, C.R. Diffusions for global optimization. SIAM J. Contr. Optimiz. 24, 1031–1043 (1986).
Hoppensteadt, F. C. & Izhikevich, E. M. Weakly connected neural networks, Vol. 126 (Springer Science & Business Media, 2012).
Goto, H. Bifurcationbased adiabatic quantum computation with a nonlinear oscillator network. Sci. Rep. 6, 21686 (2016).
Patel, S., Chen, L., Canoza, P. & Salahuddin, S. Ising model optimization problems on a FPGA accelerated restricted boltzmann machine. Preprint at https://arxiv.org/abs/2008.04436 (2020).
Hoos, H. H. & Stützle, T. On the empirical scaling of runtime for finding optimal solutions to the travelling salesman problem. Eur. J. Operat. Res. 238, 87–94 (2014).
Hamze, F., Raymond, J., Pattison, C. A., Biswas, K. & Katzgraber, H. G. Wishart planted ensemble: A tunably rugged pairwise Ising model with a firstorder phase transition. Phys. Rev. E 101, 052102 (2020).
Perera, D. et al. Chook–a comprehensive suite for generating binary optimization problems with planted solutions. Preprint at https://arxiv.org/abs/2005.14344 (2021).
Benlic, U. & Hao, J.K. Breakout local search for the MaxCut Problem. Eng. Appl. Artif. Intell. 26, 1162–1173 (2013).
Hukushima, K. & Nemoto, K. Exchange Monte Carlo method and application to spin glass simulations. J. Phys. Soc. Japan 65, 1604–1608 (1996).
Mandra, S., Villalonga, B., Boixo, S., Katzgraber, H. & Rieffel, E. Stateoftheart classical tools to benchmark NISQ devices. (APS Meeting Abstracts, 2019).
Isakov, S. V., Zintchenko, I. N., Rønnow, T. F. & Troyer, M. Optimised simulated annealing for Ising spin glasses. Comput. Phys. Commun. 192, 265–271 (2015).
Goto, H. et al. Highperformance combinatorial optimization based on classical mechanics. Sci. Adv. 7, eabe7953 (2021).
Ma, Y.A., Chen, Y., Jin, C., Flammarion, N. & Jordan, M. I. Sampling can be faster than optimization. Proc. Natl Acad. Sci. USA 116, 20881–20885 (2019).
Tatsumura, K., Yamasaki, M. & Goto, H. Scaling out Ising machines using a multichip architecture for simulated bifurcation. Nat. Electron. 4, 208–217 (2021).
Lynn, C. W., Papadopoulos, L., Kahn, A. E. & Bassett, D. S. Human information processing in complex networks. Nat. Phys. 16, 965–973 (2020).
Wolf, Y. I., Katsnelson, M. I. & Koonin, E. V. Physical foundations of biological complexity. Proc. Natl Acad. Sci. USA 115, E8678–E8687 (2018).
Yan, H. et al. Nonequilibrium landscape theory of neural networks. Proc. Natl Acad. Sci. USA 110, E4185–E4194 (2013).
Ageev, D., Aref’eva, I., Bagrov, A. & Katsnelson, M. I. Holographic local quench and effective complexity. J. High Energy Phys. 2018, 71 (2018).
McLeish, T. C. Are there ergodic limits to evolution? ergodic exploration of genome space and convergence. Interf. Focus 5, 20150041 (2015).
Falkner, S., Klein, A. & Hutter, F. Bohb: Robust and efficient hyperparameter optimization at scale. In: International Conference on Machine Learning, 1437–1446 (PMLR, 2018).
Dunning, I., Gupta, S. & Silberholz, J. What works best when? a systematic evaluation of heuristics for MaxCut and QUBO. INFORMS J. Comput. 30, 608–624 (2018).
Ma, F. & Hao, J.K. A multiple search operator heuristic for the maxkcut problem. Ann. Oper. Res. 248, 365–403 (2017).
Acknowledgements
The authors thank Salvatore Mandrà for providing results of the PT algorithm, Davide Venturelli, David Bernal, Sam Reifenstein, and the anonymous referee for their suggestion of applying CAC on the Wishart planted instances. This research is partially supported by the BrainMorphic AI to Resolve Social Issues (BMAI) project (NEC corporation), the University of Tokyo GAP Fund Program, AMED under Grant Number JP21dm0307009, UTokyo Center for Integrative Science of Human Behavior(CiSHuB), and Japan Science and Technology Agency Moonshot R&D Grant Number JPMJMS2021
Author information
Authors and Affiliations
Contributions
T. Leleu proposed the project. F.K. and T. Leleu performed the FPGA implementation, numerical simulations, and data analysis. R.H. and T. Levi supported theoretical analysis and experimental data. T. Leleu, F.K., T. Levi, R.H., T.K., and K.A. wrote the manuscript with inputs from all authors.
Corresponding author
Ethics declarations
Competing interests
T. Leleu and K.A. are inventors on patent WO2020027785A1 submitted by the University of Tokyo that covers amplitude heterogeneity errorcorrection scheme. The remaining authors declare no competing interests.
Peer review information
Communications Physics thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Leleu, T., Khoyratee, F., Levi, T. et al. Scaling advantage of chaotic amplitude control for highperformance combinatorial optimization. Commun Phys 4, 266 (2021). https://doi.org/10.1038/s42005021007680
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s42005021007680
This article is cited by

Recent progress on coherent computation based on quantum squeezing
AAPPS Bulletin (2023)

Simulated bifurcation assisted by thermal fluctuation
Communications Physics (2022)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.