## Introduction

Unconventional computing with special-purpose hardware devices for solving combinatorial optimization problems has attracted growing interest owing to its practical importance. Many combinatorial optimization problems can be mapped onto finding the ground states of Ising spin models1,2, which is referred to as the Ising problem. Special-purpose hardware devices for the Ising problem are called Ising machines. Ising machines utilizing natural phenomena have been developed, such as quantum annealers3,4,5,6,7,8 with superconducting circuits9, coherent Ising machines with pulse lasers10,11,12,13,14, oscillator-based Ising machines15,16,17,18,19, and Ising machines based on various other systems, such as stochastic nanomagnets20, gain-dissipative systems21, spatial light modulators22, memristor Hopfield neural networks23, and spin-torque nano-oscillators24,25.

Ising machines have also been implemented with special-purpose digital processors26,27,28,29,30,31,32 using simulated annealing (SA)33 and other algorithms. Among such algorithms, simulated bifurcation (SB) is a recently proposed heuristic algorithm34. SB originates from numerical simulations of Hamiltonian dynamics with a bifurcation that is a classical counterpart of quantum adiabatic bifurcation in nonlinear oscillators35,36, which itself has been studied actively37,38,39,40,41,42,43,44,45. Simulation-based approaches such as SB can handle dense spin–spin interactions with high precision, which can be challenging for physical implementations. Moreover, the classical dynamics can be simulated efficiently, unlike quantum dynamics. SB can be accelerated by parallel processing with, e.g., field-programmable gate arrays (FPGAs)34,46,47,48, because its variables can be updated simultaneously. Recently proposed variants of SB have achieved faster and more accurate optimization49 than the original SB.

To further improve the performance of SB, here we introduce thermal fluctuation into SB. SA can yield high-accuracy solutions by modeling thermal fluctuation33, while quantum annealing utilizes quantum fluctuation3. These fluctuations can assist escape from local minima of the Ising problem and thus lead to higher solution accuracy. Our method is based on the Nosé–Hoover method50,51, which enables simulations of Hamiltonian dynamics at finite temperatures52. Unlike SA, the Nosé–Hoover method does not use random numbers, that is, it is deterministic, and thus the simplicity of SB is preserved. We find that a simplified dynamics with only a heating process can improve the performance of SB, where an ancillary dynamical variable in the Nosé–Hoover method is replaced by a constant. We numerically demonstrate this improvement by solving instances of the Ising problem with up to 2000 spin variables and all-to-all connectivity, which corresponds to the Sherrington–Kirkpatrick (SK) model introduced in studies of spin glasses53,54,55. The SK model has been widely used to measure the performance of Ising machines12,14,28,29,30,32,34,49,56. The proposed heated SB is also suitable for massively parallel implementations with, e.g., FPGAs.

## Results and discussion

### SB with thermal fluctuation

First, we briefly explain the Ising problem and SB. The Ising problem is to find N Ising spins si = ±1 minimizing a dimensionless Ising energy,

$${E}_{{{{{{\rm{Ising}}}}}}}=-\frac{1}{2}\mathop{\sum }\limits_{i=1}^{N}\mathop{\sum }\limits_{j=1}^{N}{J}_{{ij}}{s}_{i}{s}_{j},$$
(1)

where Jij represents the interaction between si and sj (Jij = Jji and Jii = 0). SB has two recent variants, ballistic SB (bSB) and discrete SB (dSB)49. Both bSB and dSB are based on the following Hamiltonian equations of motion,

$${\dot{x}}_{i}={a}_{0}{y}_{i},$$
(2)
$${\dot{y}}_{i}=-\left[{a}_{0}-a(t)\right]{x}_{i}+{c}_{0}{f}_{i},$$
(3)

where xi and yi are respectively the positions and momenta corresponding to si, the dots denote time derivatives, a(t) is a control parameter, and a0 and c0 are constants. The forces due to the interactions, fi, are given by

$${f}_{i}=\mathop{\sum }\limits_{j=1}^{N}{J}_{{ij}}{x}_{j},\quad{{{{{\rm{for}}}}}}\;{{{{{\rm{bSB}}}}}},$$
(4)
$${f}_{i}=\mathop{\sum }\limits_{j=1}^{N}{J}_{{ij}}{{{{{\rm{sgn}}}}}}\left({x}_{j}\right),\quad{{{{{\rm{for}}}}}}\;{{{{{\rm{dSB}}}}}},$$
(5)

where sgn(xj) is the sign of xj. Time evolutions of xi are calculated by solving Eqs. (2) and (3) with the symplectic Euler method52, where the positions xi are confined within |xi| ≤ 1 by perfectly inelastic walls at xi = ±1; that is, if |xi| > 1 after a time step, xi and yi are set to xi = sgn(xi) and yi = 0. With a(t) increasing from zero to a0, bifurcations to xi = ±1 occur, and the signs si = sgn(xi) yield a solution to the Ising problem. A solution at the final time is at least a local minimum of the Ising problem49. The ballistic behavior in bSB leads to fast convergence to a local or approximate solution, while the discretized fi in dSB enable higher solution accuracy at the cost of a longer time.
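The bSB update described above can be sketched in NumPy as follows. This is only an illustrative reading of Eqs. (2)–(4) with the inelastic walls, not the authors' implementation; the function name and all parameter values are our own assumptions.

```python
import numpy as np

def bsb_step(x, y, J, a, a0, c0, dt):
    """One symplectic-Euler step of bSB (Eqs. 2-4) with inelastic walls."""
    f = J @ x                                 # interaction forces, Eq. (4)
    y = y + (-(a0 - a) * x + c0 * f) * dt     # momentum update, Eq. (3)
    x = x + a0 * y * dt                       # position update, Eq. (2)
    wall = np.abs(x) > 1.0                    # perfectly inelastic walls at x = +/-1
    x[wall] = np.sign(x[wall])
    y[wall] = 0.0
    return x, y
```

Increasing a from 0 to a0 over the time steps drives the bifurcations to xi = ±1, after which the solution spins are read off as si = sgn(xi).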

Here we apply the Nosé–Hoover method50,51,52 with a finite temperature T to Eqs. (2) and (3), obtaining

$${\dot{x}}_{i}={a}_{0}{y}_{i},$$
(6)
$${\dot{y}}_{i}=-\left[{a}_{0}-a(t)\right]{x}_{i}+{c}_{0}{f}_{i}-\xi {y}_{i},$$
(7)
$$\dot{\xi }=\frac{1}{M}\left(\mathop{\sum }\limits_{i=1}^{N}{y}_{i}^{2}-{NT}\right),$$
(8)

where ξ is an ancillary variable playing the role of thermal fluctuation, and M is a parameter (a mass). The variable ξ controls an instantaneous temperature defined by

$${T}_{{{{{{\rm{inst}}}}}}}=\frac{1}{N}\mathop{\sum }\limits_{i=1}^{N}{y}_{i}^{2},$$
(9)

so that it approaches a given T, as follows. When Tinst is smaller than T, $$\dot{\xi }$$ is negative according to Eq. (8), which drives ξ negative. Then |yi| increase owing to the last term in Eq. (7), and thus Tinst rises toward T, which can be regarded as heating. Conversely, when Tinst > T, cooling occurs.
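A minimal Euler-discretized sketch of the Nosé–Hoover feedback in Eqs. (6)–(9) may clarify this heating/cooling mechanism. The walls are omitted for brevity, and all names and values below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def nose_hoover_step(x, y, xi, J, a, a0, c0, T, M, dt):
    """One Euler step of Eqs. (6)-(8); xi steers T_inst toward the target T."""
    N = x.size
    f = J @ x                                          # bSB forces, Eq. (4)
    y = y + (-(a0 - a) * x + c0 * f - xi * y) * dt     # Eq. (7)
    x = x + a0 * y * dt                                # Eq. (6)
    xi = xi + (np.sum(y ** 2) - N * T) / M * dt        # Eq. (8)
    T_inst = np.mean(y ** 2)                           # Eq. (9)
    return x, y, xi, T_inst
```

When T_inst < T, the update of xi pushes xi negative, so the −xi·y term in Eq. (7) amplifies the momenta (heating); when T_inst > T, the sign reverses and the momenta are damped (cooling).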

We found that SB gives better solutions when ξ is kept negative, which is achieved by a negative initial ξ and a large M (leading to $$\dot{\xi }\simeq 0$$). This observation suggests that the heating can improve SB, whereas the cooling is unnecessary. A possible reason is that the |yi| increased by the heating can lead to escape from local minima of the Ising problem. Furthermore, the small $${{{{{\rm{|}}}}}}\dot{\xi }{{{{{\rm{|}}}}}}$$ due to the large M implies that a constant ξ can play a similar role, and indeed we found that replacing ξ by a negative constant −γ (γ > 0) yields even higher performance. The constant γ is regarded as the rate of the heating.

Thus, in this paper, we propose SB with a heating term, which we call heated bSB (HbSB) and dSB (HdSB), as follows,

$${\dot{x}}_{i}={a}_{0}{y}_{i},$$
(10)
$${\dot{y}}_{i}=-\left[{a}_{0}-a(t)\right]{x}_{i}+{c}_{0}{f}_{i}+\gamma {y}_{i}.$$
(11)

We numerically solve Eqs. (10) and (11) by discretizing the time as $${t}_{k+1}={t}_{k}+\Delta t$$ with a time interval Δt and calculating $${x}_{i}\left({t}_{k+1}\right)$$ and $${y}_{i}\left({t}_{k+1}\right)$$ from $${x}_{i}\left({t}_{k}\right)$$ and $${y}_{i}\left({t}_{k}\right)$$ in each time step. Note that the symplectic Euler method is not applicable to Eqs. (10) and (11), because these equations are no longer Hamiltonian equations owing to the term γyi52. We empirically found that solution accuracy can be improved by performing the same update as in the previous bSB and dSB49 followed by an update corresponding to the term γyi. This ordering results in nonzero momenta after the heating and can prevent the system from getting stuck at the walls. See “Methods” for a detailed algorithm.

In the following, we compare heated SB with previous SB by solving instances of the Ising problem with all-to-all connectivity, where Jij are randomly chosen from ±1 with equal probabilities (corresponding to the SK model). The control parameter a(t) is linearly increased from 0 to a0. The constant parameters are set as a0 = 1 and

$${c}_{0}=\frac{{c}_{1}}{\sqrt{N}},$$
(12)

where c1 is a parameter tuned around 0.5, a value based on random matrix theory34,49. The variables xi and yi are initialized with uniform random numbers in the interval (−1, 1).

### Performance for a 2000-spin Ising problem

We first solve a benchmark instance called K2000, which is a 2000-spin instance of the Ising problem with all-to-all connectivity12,30,34,49. K2000 is often expressed as a MAX-CUT problem, which is defined by symmetric weights wij = wji and asks to maximize the following cut value C,

$$C=\frac{1}{2}\mathop{\sum }\limits_{i=1}^{N}\mathop{\sum }\limits_{j=1}^{N}{w}_{{ij}}\frac{1-{s}_{i}{s}_{j}}{2}$$
(13)
$$=-\frac{1}{2}{E}_{{{{{{\rm{Ising}}}}}}}-\frac{1}{4}\mathop{\sum }\limits_{i=1}^{N}\mathop{\sum }\limits_{j=1}^{N}{J}_{{ij}},$$
(14)

where in Eq. (14), C has been related to EIsing [Eq. (1)] by setting wij = −Jij. Thus the MAX-CUT problem can be reduced to the Ising problem.
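The identity between Eqs. (13) and (14) can be verified numerically. The following snippet is only a sanity check on a small random symmetric instance; the size and seed are arbitrary choices of ours.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
w = rng.choice([0.0, 1.0], size=(N, N))
w = np.triu(w, 1)
w = w + w.T                       # symmetric weights, zero diagonal
J = -w                            # w_ij = -J_ij relates MAX-CUT to the Ising problem
s = rng.choice([-1.0, 1.0], size=N)

C_direct = 0.25 * np.sum(w * (1.0 - np.outer(s, s)))    # Eq. (13)
E_ising = -0.5 * s @ J @ s                              # Eq. (1)
C_from_E = -0.5 * E_ising - 0.25 * np.sum(J)            # Eq. (14)
assert np.isclose(C_direct, C_from_E)
```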

To confirm the effect of the heating, we calculate the instantaneous temperature Tinst in Eq. (9) and the cut value C in Eq. (14) at every time step. Figure 1a, b shows typical examples of the time evolution of Tinst. Here the parameters are those optimized in advance, as explained later. For bSB, Tinst rapidly decreases owing to collisions with the perfectly inelastic walls, while Tinst for HbSB is kept much higher owing to the heating, as expected. For dSB, Tinst is also higher than that for bSB, because the discretized forces fi in Eq. (5) increase the energy, violating the conservation of energy49. Comparing HbSB and dSB, Tinst for HbSB is higher than that for dSB near the end of the evolution. HdSB shows Tinst similar to that of dSB, implying that the heating does not substantially alter the dynamics of dSB.

Figure 1c, d shows the last parts of typical time evolutions of C. (These final values of C lie around the middle of the distributions over many trials, not at the best values, as we will see in Fig. 2.) For bSB, C is almost constant over the last 1000 time steps, while C for HbSB continues to fluctuate until nearly the end of the evolution owing to the heating. The fluctuation for HbSB looks similar to those for dSB and HdSB.

Next, we solve K2000 in 10^4 trials with random initial xi and yi. In each trial, C is evaluated every 100 time steps, and the best value is output. Figure 2 shows examples of cumulative distributions of C, where the number of trials giving cut values lower than C is normalized by the total number of trials, 10^4. For bSB, the majority of trials result in one of a few values of C between 33200 and 33300. On the other hand, dSB, HbSB, and HdSB yield broader distributions, owing to the fluctuations (higher instantaneous temperatures) in their dynamics. It is notable that the heating changes the distribution of bSB to one similar to the distributions of dSB (and HdSB), and that the heating substantially improves bSB. The insets in Fig. 2 show magnifications around the best known cut value, Cbest = 33337 (ref. 49). The distributions for dSB and HdSB almost overlap, indicating similar performance of dSB and HdSB. The insets also show that HbSB leads to more trials with C close to Cbest than dSB and HdSB, suggesting that HbSB has the highest performance.

We then evaluate the average and maximum C in the 10^4 trials and the probability P of obtaining Cbest. Here, P is estimated by dividing the number of trials obtaining Cbest by the total number of trials. Also, using P and the number of time steps, Ns, we calculate the number of time steps required to find Cbest with a probability of 99%, which we call the step-to-solution S, given by

$$S={N}_{{{{{{\rm{s}}}}}}}\frac{\log 0.01}{\log \left(1-P\right)}.$$
(15)
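Equation (15) translates directly into code; in this hypothetical example the values of Ns and P are arbitrary.

```python
import math

def step_to_solution(Ns, P):
    """Eq. (15): time steps needed to find C_best with 99% probability."""
    return Ns * math.log(0.01) / math.log(1.0 - P)

# With P = 0.99 per trial, a single run of Ns steps suffices (S = Ns);
# with P = 0.5, roughly log(0.01)/log(0.5) ~ 6.6 runs are needed.
S = step_to_solution(1000, 0.5)
```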

The step-to-solution is a useful measure of the performance of an algorithm32. (The time-to-solution is often used to measure the performance of Ising machines28,31,32,49, but it depends not only on algorithms but also on their implementations on hardware devices. The time-to-solution equals the step-to-solution multiplied by the computation time for one time step.) Here the parameters Δt, c1, and γ are set such that S is minimized. For bSB, the average C is maximized instead, because we found P = 0 for bSB and thus could not estimate S.

Figure 3a shows the average and maximum C as functions of Ns. For large Ns, the average C for dSB, HbSB, and HdSB is larger than that for bSB. Cbest is reached by the three SBs other than bSB. Figure 3b shows that, for large Ns, P is the highest for HbSB, followed by HdSB, dSB, and bSB in this order.

Figure 3c shows S as a function of Ns. Each SB has a minimum of S at a certain Ns, and in the following we compare the values of S minimized with respect to Ns. The black cross represents a value obtained from previously reported data for dSB49. The present value of S for dSB is smaller than the previous one, because in this study the parameters Δt and c1 are optimized for K2000, whereas they were not in the previous study. In Fig. 3c, at the optimal Ns, S for HbSB is the smallest. Compared with dSB, S is reduced by 32.7% for HbSB and by 19.7% for HdSB. These results demonstrate that the heating improves the performance. Although dSB performs better than bSB, bSB is improved by the heating much more than dSB, and the resulting HbSB shows higher performance than HdSB.

### Performance for 100 instances of a 700-spin Ising problem

Finally, we examine the performance for instances other than K2000 by solving 100 instances of the Ising problem with 700 spin variables and all-to-all connectivity32,49. For these instances, reference solutions for estimating the step-to-solution were obtained by SA with sufficiently long annealing times and many iterations, and they are expected to be close to optimal solutions49. Here each instance is solved in 10^4 trials, and C (or EIsing) is evaluated at the last time step. We set the parameters to the values optimized for K2000.

Figure 4 shows the medians of S over the 100 instances32,49 as functions of Ns. The values of S for dSB, HbSB, and HdSB are much smaller than that for bSB, presumably because the larger fluctuations in these three SBs assist escape from local minima, as suggested by Figs. 1 and 2 for K2000. Moreover, HbSB results in the smallest S among the four SBs. In comparison with dSB, HbSB reduces S by 9.55%. This result demonstrates that HbSB can improve the performance not only for K2000 but also for the other instances. The reason for the highest performance of HbSB is left for future work.

## Conclusions

We have demonstrated that SB can be improved by introducing fluctuation via a heating term, which has been obtained by replacing an ancillary dynamical variable in the Nosé–Hoover method with a constant heating rate. We have compared the previous and heated SBs by solving all-to-all connected 2000-spin and 700-spin instances of the Ising problem (the SK model), and have found that HbSB gives a better step-to-solution than bSB, dSB, and HdSB. This result indicates that the heating is effective, especially for bSB.

Since the proposed heated SB shares its simple dynamics with the previous SB, we expect that heated SB can be accelerated by massively parallel processing implemented with, e.g., FPGAs. This study also implies that further improvements of SB may be possible through simple physics-inspired modifications like the heating term introduced here. For example, fluctuations in Hamiltonian dynamics can also be modeled by stochastic methods57.

## Methods

### Heated simulated bifurcation

First, the symplectic Euler method52 is formally applied to the terms other than γyi in Eqs. (10) and (11),

$${\widetilde{y}}_{i}={y}_{i}\left({t}_{k}\right)+\left\{-\left[{a}_{0}-a\left({t}_{k}\right)\right]{x}_{i}\left({t}_{k}\right)+{c}_{0}{f}_{i}\right\}\Delta t,$$
(16)
$${\widetilde{x}}_{i}={x}_{i}\left({t}_{k}\right)+{a}_{0}{\widetilde{y}}_{i}\Delta t,$$
(17)

where fi are calculated from xi(tk) with Eq. (4) or (5), and the variables with tildes denote temporary variables used within a time step. Then, the perfectly inelastic walls act as

$${x}_{i}\left({t}_{k+1}\right)=g\left({\widetilde{x}}_{i}\right),$$
(18)
$${\widetilde{\widetilde{y}}}_{i}=h\left({\widetilde{x}}_{i},{\widetilde{y}}_{i}\right),$$
(19)

where the functions g(x) and h(x,y) are given by

$$g(x)=\left\{\begin{array}{cc}x, & {|}x{|}\le 1,\\ 1, & x \; > \; 1,\\ -1, & x \, < \, -1,\end{array}\right.$$
(20)
$$h(x,y)=\left\{\begin{array}{cc}y, & {|}x{|}\le 1,\\ 0, & {|}x{|} > 1.\end{array}\right.$$
(21)

Finally, we include the heating term, referring to the usual Euler method, as

$${y}_{i}\left({t}_{k+1}\right)={\widetilde{\widetilde{y}}}_{i}+\gamma {y}_{i}\left({t}_{k}\right)\Delta t.$$
(22)

Equations (16)–(19) and (22) are solved numerically, with the variables represented as single-precision floating-point numbers.
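Putting Eqs. (16)–(22) together, one HbSB time step and a simple driver loop can be sketched as follows. This is a sketch under our own assumptions (double precision, illustrative parameter values, and a(t) ramped linearly as in the main text), not the single-precision implementation described above.

```python
import numpy as np

def hbsb_step(x, y, J, a, a0, c0, gamma, dt):
    """One HbSB step: symplectic update (16)-(17), walls (18)-(21), heating (22)."""
    y_prev = y                                  # y(t_k), reused in Eq. (22)
    f = J @ x                                   # bSB forces, Eq. (4)
    y_t = y + (-(a0 - a) * x + c0 * f) * dt     # Eq. (16)
    x_t = x + a0 * y_t * dt                     # Eq. (17)
    wall = np.abs(x_t) > 1.0
    x_new = np.where(wall, np.sign(x_t), x_t)   # Eq. (18) with g(x), Eq. (20)
    y_tt = np.where(wall, 0.0, y_t)             # Eq. (19) with h(x, y), Eq. (21)
    y_new = y_tt + gamma * y_prev * dt          # heating term, Eq. (22)
    return x_new, y_new

def hbsb(J, Ns=1000, dt=0.5, c1=0.5, gamma=0.01, a0=1.0, seed=0):
    """Run HbSB and return the spin configuration s = sgn(x)."""
    rng = np.random.default_rng(seed)
    N = J.shape[0]
    c0 = c1 / np.sqrt(N)                        # Eq. (12)
    x = rng.uniform(-1.0, 1.0, N)               # random initial positions
    y = rng.uniform(-1.0, 1.0, N)               # random initial momenta
    for k in range(Ns):
        a = a0 * k / Ns                         # a(t): linear ramp from 0 to a0
        x, y = hbsb_step(x, y, J, a, a0, c0, gamma, dt)
    return np.where(x >= 0.0, 1.0, -1.0)
```

Replacing `f = J @ x` with `J @ np.sign(x)` turns this sketch into HdSB, per Eq. (5).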