Analog Approach to Constraint Satisfaction Enabled by Spin Orbit Torque Magnetic Tunnel Junctions

Boolean satisfiability (k-SAT) is an NP-complete (k ≥ 3) problem that constitute one of the hardest classes of constraint satisfaction problems. In this work, we provide a proof of concept hardware based analog k-SAT solver, that is built using Magnetic Tunnel Junctions (MTJs). The inherent physics of MTJs, enhanced by device level modifications, is harnessed here to emulate the intricate dynamics of an analog satisfiability (SAT) solver. In the presence of thermal noise, the MTJ based system can successfully solve Boolean satisfiability problems. Most importantly, our results exhibit that, the proposed MTJ based hardware SAT solver is capable of finding a solution to a significant fraction (at least 85%) of hard 3-SAT problems, within a time that has a polynomial relationship with the number of variables(<50).

where x i ∈{0, 1} is a variable and each clause is the disjunction (OR, ∪) of k (k = 3 in this case) such variables or their negation (x i ). The propositional formula  is a conjunction (AND, ∩) of M number of such clauses. The hardness of a SAT problem can be measured as the ratio between the number of clauses and the variables, known as the constraint density α c (Supplementary section S1).
Analog computational approaches have recently demonstrated promising results in a diverse array of applications 14 including aforementioned constraint satisfaction 8,9,15 . A recent analog formulation of a k-SAT solver has demonstrated its potential on locating a solution for the Boolean satisfiability problem in polynomial continuous-time 8 . However, implementing this set of analog formulae using a digital computer will diminish the polynomial time benefits, due to varying computational complexities between different time steps. Also, a hardware implementation of this analog k-SAT solver 8 is not ideal 16 , due to the exponential energy fluctuations in the system. Consequently, a Cellular Neural Network (CNN) based analog SAT solver with bounded variables was proposed 9 , and it is more appealing for hardware implementations. Although this bounded system does not have polynomial time complexity, noise effects in the analog hardware can potentially reduce the long transient times 9 in the system, as we demonstrate in this work. The dynamics of this analog SAT solver, which is also the framework of our work, can be defined by the following set of equations 9 . Here the variable s i represents the state of the i th (i = 1, 2, ..., N) Boolean variable (x i ) and a m represents the "satisfiedness" of the m th (m = 1, 2, ..., M) clause of the Boolean function. C is the problem specific 'interconnection matrix' of size M × N (Supplementary section S1). The functions f() and g() are the thresholding functions applied on variables s i and a m , respectively, as follows This system is explained in detail in the Supplementary section S1. It is mathematically shown 9 that these set of equations satisfy three theorems that demonstrate the properties of the model. The same theorems are used in Supplementary section S4, to show that our hardware SAT solver demonstrates the same properties. Following are the three theorems.
• Theorem 1: Variables s and a remain bounded.
• Theorem 2: Every k-SAT solution has a corresponding stable fixed point.
• Theorem 3: A stable fixed point always corresponds to a solution.
We present a hardware platform built on nano-scale spintronic devices, which can successfully emulate the behaviour of the aforementioned SAT solver. As a matter of fact, recent studies have demonstrated efficient hardware models that utilize the underlying device physics of nano-electronic structures to perform computationally intensive calculations [17][18][19][20][21] . In this work, each of the above differential equations is modeled by a single MTJ with an underlying heavy metal (Ta, Pt, etc.) layer. Our numerical results demonstrate that, the MTJ based SAT solver exhibits a polynomial dependency, between the number of variables and the real time for convergence, even for SAT problems that are known to be hardest to solve (note that this is not a mathematical proof that shows a guaranteed polynomial time complexity). We conjecture that, this is due to the non-deterministic nature of our system caused by the random thermal noise, and also due to the added complexities associated with MTJs.
Using the behaviour of Magnetic Tunnel Junctions for a SAT solver. The proposed hardware based SAT-solver is a collection of heavy metal-MTJ (HM-MTJ) structures, interfaced through simple CMOS peripheral circuitry as described in the next section. The device-circuit structure we propose is generic and can be adapted to solve a given k-SAT problem. The HM-MTJ structure is composed of two ferromagnetic layers called the Pinned Layer (PL) and the Free Layer (FL), separated by a thin tunneling oxide (MgO) layer and an HM under-layer ( Fig. 1(c)). The PL magnetization direction (p) is fixed and acts as a reference. In contrast, the magnetization direction m of the FL can be switched by passing a current through the HM under-layer, using the Spin Orbit Torque (SOT) phenomenon. Such a technique has emerged as an energy-efficient mechanism for magnetization reversal [22][23][24] . Furthermore, the three terminal structure of this MTJ with a heavy metal under layer is beneficial in this work due to the possibility of simultaneous read and write to an MTJ 25 . This is impossible to achieve in a two terminal MTJ structure that requires the write current to flow through the tunnel junction.
The resistance measured across the MTJ varies with the magnetization of the FL, and shows two stable states; high resistive (R AP ) anti-parallel (AP) state, and low resistive (R P ) parallel (P) state. Equations 6 and 7 depict how the resistance across the MTJ (R MTJ ) varies with the direction of the FL magnetization, where θ fp is the angle between the directions of FL and PL magnetizations 26 . The magnetization reversal dynamics of an MTJ with an applied current is explained in the Supplementary section S2 using the Landau-Lifshitz-Gilbert-Slonczewski (LLGS) equations 27 . The speed of this magnetization reversal can be controlled by the magnitude of the current passing through the HM layer. However, due to the effect of random thermal noise on nano-scale magnets, the MTJ switching speed follows a Gaussian distribution.
−ˆTh e resistance across an MTJ, R MTJ , can be easily converted into a voltage by using a simple resistor divider circuit. In the proposed hardware implementation of the SAT solver, each s i and a m variable from equations 2 and 3 are represented using a single HM-MTJ structure. The state of these variables at a particular time instant are given by the resistance across the corresponding MTJ device. The couplings between the s and a variables (terms In addition to the mathematical explanation that can be found in Supplementary section S4, we now intuitively explain how our MTJ based SAT solver mimics the system elaborated in equations (2-3). One main feature of this system is that, the current values of the variables depend on their previous states as well as some inputs. Similarly, the FL magnetization of an MTJ depends on its previous magnetization as well as the driving current. Another feature of the system in (2-3) is that, when the feedback from a m towards the dynamics of s i (i.e., ∑ c g a t ( ( )) m mi m 0 ) is zero after a particular time t 0 , s i will move towards +A if s i (t 0 ) > 0, and −A if s i (t 0 ) < 0, provided A > 1. Similarly, in an MTJ, an instantaneous removal of a current through the HM layer, will lead the free layer magnetization to settle down either to the parallel state or to the anti-parallel state. The state to which the magnet settles down is highly dependent upon the resistance it had at the time of removal of the current, in the absence of thermal noise. When the angle between the PL and FL magnetization directions θ > π fp 2 ( ) fp 2 θ < π , and if the drive current is zero, then the final FL magnetization will settle down to the anti-parallel (parallel) state. However, it should be noted that this phenomenon occurs under certain conditions. In this work, we optimized the FL thickness according to the following equation to exhibit the above behaviour. Structure of the SAT solver. In this section, we will elaborate how we mapped the system in (2)(3) to an array of HM-MTJs with a CMOS control interface. Each s and a variable in (2-3) is represented by the resistance of an MTJ. The resistance of the MTJ can be read as a voltage difference, when a constant read current (I READ ) flows through the MTJ.
Note that this read current must be sufficiently small (<1μA) to not to interfere with the proper operation of the system. The functions f() and g() can be generated efficiently using a differential amplifier shown in Fig. 3(a). Note that the amplifier is connected to the bottom of the magnet (not the heavy metal layer). This is to avoid the small variable voltage (ΔV) induced across the HM layer due to the varying current that flows through it. These amplifiers will increase the voltage differences incurred due to the changing resistance of the MTJs. The state AP results in a larger voltage difference between nodes A and B with respect to that resulting from P state. In our structure, the AP and the P states of an MTJ that represents an s variable, gets mapped in to +1 and −1 states in equation (2) respectively. In a k-SAT problem,   The reference circuit of the differential amplifier. V OUT is the non-inverting output and V OUT is the inverting output. (c) The outputs (f(s i )/g(a m )), varying with the MTJ resistance from R P to R AP .
Scientific REPORTS | (2018) 8:6940 | DOI:10.1038/s41598-018-24877-z a variable can appear as x i , or its negation (x i ) in the m th clause. This information is encoded in the elements of the connection matrix c mi , as explained in Supplementary section S1. In order to account for different values of c mi at the circuit level, we generate the 'state' (f(s i )/g(a m )) and the 'inverse-state' ( f s g a ( ) / ( ) i m ) signals of an MTJ. These signals are produced at the amplification stage outputs, as shown in Fig. 3(b). A differential amplifier is employed to read the voltage difference across an MTJ. Additionally, a source degenerated common source amplifier is used as a second amplification stage, to boost the voltage to the desired levels. A third amplifier is employed in the design to generate the aforementioned inverse functions ( f () and g()). The complete schematic of the amplification stages used in this work is shown in Fig. 3(b). The same amplifier architecture with different control voltages was used for interfacing with MTJs representing both a and s variables. The outputs V OUT (f(s i ) or g(a m )) and V OUT f s ( ( ) i or g a ( )) m are used to drive the MOSFETs controlling the current through the heavy metal layers (Fig. 4). Figure 3 (c) elaborates how the above mentioned 'states' and 'inverse states' vary with the resistance of an MTJ. Each differential amplifier output will vary between a predefined high voltage (V sH , V aH ), and a low voltage (V sL ,V aL ). Therefore, the state −1 and +1 of variable s will be mapped to V sL and V sH at the non-inverting (V sH and V sL at the inverting) output. For the variable a, the non-inverting (inverting) output will be V aL (V aH ) when the resistance of the MTJ is less than (R ap + R p )/2 and V aH (V aL ) when the resistance is R ap .
The term c g a t ( ( )) m mi m ∑ in equation 2, and the term c f s t k ( ( )) 1 i mi i −∑ + − in equation 3 (the coupling between variable s i and a m ) are mapped as currents through the HM layers of MTJs, that represent s variables and a variables, respectively. At a particular time instant when the m th clause is not satisfied, if the connection parameter c mi is positive, the current should drive the MTJ that represents variable s i towards the AP state. Similarly, when c mi is negative, the current should drive that MTJ towards the P state, and when c mi is zero, the current through the HM should be zero. Figure 4(a,b) graphically explains how this is realized at the circuit level. Figure 4(c,d) shows the circuit realization of the feedback from the s i variable acting on the a m variable, depending upon the connection parameter c mi . When c mi is positive (negative), −c mi f(s i (t)) should drive the MTJ that represents a m towards the P (AP) state (for a case where s i is in AP state). The two transistor structures (heavy metal current controllers) in Fig. 4(c,d) should provide an output voltage of V o , when the input . This is to make sure that there is no current through the HM layer when s i (t) = 0. For the Fig. 4, we assume that a charge current from left to right through an HM layer, drives the MTJ on top, towards the AP state. Figure 4(e,f) illustrates the final structure of the SAT solver with all the control logic. The connections in the 'network' depends on the SAT problem to be solved. Therefore, the connecting switches must be initialized depending upon the problem. Note that when the number of clauses of a problem increases, the connections become more complex.

Results
In order to observe the functionality of our SAT solver, we conducted circuit level simulations. Figure 5 illustrates the currents through the heavy metal layers of the two MTJs representing s variable and a variable, that correspond to a 10-variable hard SAT instance. The results were obtained from HSPICE simulations using IBM 45 nm technology node. The resultant evolution of the free layer magnetization along the x direction (M x ) is shown on the right (Fig. 5(c) and (d)). Note that a positive current drives the MTJ towards AP (+1) state and a negative current drives the MTJ towards P (−1) state in the figure.
As elaborated in the Supplementary section S1, the constraint density (α c ) is an indicator of the hardness to solve a particular SAT instance. In order to observe the functionality of our solver for SAT instances with different hardness levels, we solved randomly generated 3-SAT problems with different constraint densities, and different number of variables. Figure 6 shows the magnetization dynamics of three MTJs that correspond to three variables in two 20 variable 3-SAT problems, each having a constraint density of 4.25 and 3.00, respectively. The colour of the trajectories in Fig. 6(c,d) indicates the normalized energy of the system at that particular point. This energy of the system can be defined by the following equations. where M and N are the number of clauses and the number of variables in the k-SAT problem, respectively. The energy is a function of the number of clauses not satisfied at a particular instant. This can be used as a cost function to determine the "satisfiedness" of a particular problem at a given instant. Notice that the trajectories in Fig. 6(c),(d) pass through higher energy states as the system tries to converge to a solution. This shows that our system escapes local minimum points naturally, unlike other algorithms 28 where simulated annealing is necessary to escape from such local minimum points.
Approximate polynomial time solution from the proposed SAT solver. We also solved randomly generated satisfiable 3-SAT problems in the hard regime (α c = 4.25) for different number of variables (20,30,40,50). We have calculated the number of problems solvable within 10 μs for the purpose of illustration. However, since our system has no limit cycles owing to thermal noise (more details are available in the next section), we argue that our proposed method will probably reach a solution if sufficient time has been provided, given that  Fig. 7. It is evident that the fraction of problems not solved p(t), has an exponential decay with time t. The relationship between p(t) and t can be approximated by where r and γ are constants. The decay rate λ obeys λ(N) = bN −β , with β ≈ 1.1. Therefore the continuous time t needed to solve a (1−p) fraction of problems can be written as 1 This implies that the time to solve a (1−p) fraction from a set of k-SAT problems is of polynomial complexity (for the range of N we have considered). This polynomial relationship still holds when the fraction of problems left unsolved is a fixed number (irrespective of the number of variables) 8 . That is, the time taken to solve all possible k-SAT formulae for a given N and α c (Θ(k, N, α c )), except for a constant amount of problems c (p(t) = c/Θ(k, N, α c )), would follow a relationship as shown below when N→∞ 8 (if we assume that the relationship in equation (11) and (12) 1 A previous proposal 8 shows a similar relationship for the time required to solve k-SAT problems. However, the relationship is not valid in real time if a digital computer is to be used for the calculations. It is because, the complexity of the analog system varies over time, and when solving each step, it would take different real time values depending upon the complexity. Since our method is purely based on hardware, we argue that the above approximate polynomial time relationship is valid in real time for our proposed system. We conjecture that this behaviour is due to multiple reasons including the thermal noise associated with the MTJs. It has also been predicted 9 that, the noise effects may avoid long transient oscillations. Our results too suggest that higher amounts of noise leads to faster convergence (explained in the next section). It must be noted that, this proposed MTJ based SAT solver does not behave identical to the cellular neural network based solver in equations (2)(3). Our solver has some added complexities not present in the CNN based system (refer to the set of equations in Supplementary section S4). We conjecture that such complexities offered by the device physics, acts favorable to give faster solutions to SAT problems as well. However, these benefits come at the cost of handling the circuit limitations (fan-out etc.) that can arise when solving problems with larger N.
Effect of thermal noise. Thermal noise has significant impact on the switching dynamics of nano-magnets. Equation 14 explains the renowned Brown's model 29 that captures the behaviour of thermal noise which can be used as a random magnetic field in the LLGS equations.
Thermal B s ς → is a vector with components that are zero mean, unity standard deviation, Gaussian random variables. V is the volume of the free layer, T is the temperature, and k B is the Boltzmann's constant. The time discretization value dt must be included in the numerator when solving the equation numerically. The existence of thermal noise is mandatory for the proper operation of our SAT solver. This is due to the tilt of the FL magnetization with the easy axis, induced by thermal noise without which a magnet cannot be switched. Results indeed show that, under zero thermal noise, the magnetizations evolve to frozen non-solution states (Supplementary section S5). In the next two subsections, we will explain how the thermal noise assists in avoiding limit cycles and the impacts of larger thermal noise on the time to converge to a solution.
Not behaving as a chaotic dynamic system and absence of limit cycles. We define the states of all the MTJs that represent variable s as H N = [−1, 1] N . The P state is mapped to −1 and the AP state is mapped to +1. The boundary of H N is the N hypercube Q N , with vertices V N = {−1, 1} N . The solution space to a particular problem can be a subset of these V N . Let us denote such a solution by = ...
Due to the effect of thermal noise, the solution to which the system will ultimately converge has minimal dependency with the initial states of the MTJs. For example, let us consider a SAT problem that has multiple solutions V N sol and solving it in two trials with the same initial states of MTJs. The output solutions in the two trials may not be the same even though the starting conditions were identical. This implies that our system does not show any chaotic behaviour (it is not deterministic) in contrast to the system given by equations 2 and 3.
If the states of the MTJs that represent variable s continuously change in a periodic manner (it has entered a limit cycle), it is possible that the system never reaches a solution (even if there exists one). That is, s i (t) = s i (t + nT) for ∀n = 1, 2, … and s i (t) is not a solution of the system. This is known as the system getting trapped in a limit cycle 8 . It is shown that for the system explained in equations 2-3, this can occur for certain coupling parameter (A, B) choices 9 . However, due to the added stochasticity from the thermal noise, we argue that our MTJ based SAT solver does not get trapped in limit cycles. It is highly probable that our system reaches a solution if sufficient amount of time is provided.
Increasing temperature resulting in faster solution convergence. Now let us consider how different amounts of thermal noise will affect the operation of our MTJ based SAT solver. Temperature can affect the amount of thermal noise applied on an MTJ device (equation 14). Sufficiently higher temperatures on MTJs with smaller switching energy barriers, can cause the state of an MTJ to oscillate over time, even without any input current or magnetic field 25 . We solved randomly generated 20 variable, 3-SAT problems at different temperatures and observed the percentage of problems that can be solved within 10 μs. As Fig. 8(a) illustrates, it is evident that in the range of 20°C−130°C, there is no significant degradation in the percentage of problems solved within 10 μs. However, according to Fig. 8(b), it appears that the average time to solve a k-SAT problem decreases by ∼100 ns with increasing temperature in the range 20°C-130°C.
Effects of process variations. In our design, we selected a particular thickness (t ss ) for the free layers of the MTJs. In reality, it is impossible to achieve the exact thickness due to number of reasons including atomic limitations, process variations, fabrication limitations etc. How much precession is present along the easy axis is dependent upon how much the actual thickness has deviated from t ss . In order to observe the effects of thickness variations on the operation of our proposed SAT solver, we consider two scenarios.
1. Global variation of thickness. 2. Local variation of thickness.
In the first case, we perturb all the thicknesses of MTJs in our system by some constant percentage from t ss . For a particular percentage global variation in thickness, we solved randomly generated 20 variable, SAT problems. Then the percentage of problems solvable within 10 μs, and the average time to solve a single problem was observed. Figure 9(a) illustrates that there is no significant change in the percentage of solvable problems when the global variations in thickness is changed from −5% to +10%. However, as Fig. 9(b) shows, the average convergence time increases when the deviations in thickness increases. We also observed that the solver no longer works if the thickness is less than a particular value. This is the limit at which the perpendicular magnetic anisotropy (PMA) becomes dominant and the FL magnetization stabilizes in ẑ axis instead of x axis. We observed this when the actual thickness is ∼−10% deviated from the t ss , for the choice of materials and dimensions used in this work. In the second case, we change thicknesses of all the MTJs according to a Gaussian distribution with a 3σ value (where σ is the standard deviation) of 10% from the t ss . The solver gave an average convergence time of 588.31ns and was able to solve 97% of randomly generated 20 variable, 3-SAT instances within 10 μs.
As described in the previous section, the temperature increments inversely affect the computation time of the MTJ based system. To see the effect of both thickness and temperature variations simultaneously, we conducted the above experiment at different temperatures. The results are summarized in Fig. 9(c) and it shows that the changes in computation time due to thickness deviation is more prominent than that due to temperature drift.
During the fabrication process, other non-idealities such as edge damages 30,31 can be present in the MTJs. We introduced variations in interface anisotropy constant, width and length of the free layer of MTJs, to observe the effect of aforementioned non-idealities. The effects of anisotropy constant variations on computation time follow a similar trend as the 'computation time vs thickness' curve ( Fig. 9). We further noticed that the computation time decreases when the free layer dimensions decrease. The percentage of problems solved remained almost at 100%. More detailed results are included in the Supplementary section S6.
Power consumption and computation time of the SAT solver. In this section, we will present the power consumption and computation time of our proposed system, and compare with existing methodologies. In order to calculate the power consumption, we used SPICE simulations in IBM 45 nm technology. The measured average power consumption over solving 1,000, 20-variable (our software based LLGS solving framework could not handle bigger benchmarks such as 'ais8' 32 ) hard SAT problems (constraint density α c = 4.25) was 1.37 mW. The power requirements of each section of the solver are presented in Table 1. We observed that the peripheral circuits such as amplifiers and voltage controlled current drivers consume a significant portion of power. The power consumption of the total MTJ and heavy metal layer structures is approximately about 20% of the total power consumption of the system. Similarly, a recent work of a hardware realization 16 of an analog approach 8 , also explains that their overall power consumption to be significant (numerical value not specified), due to the usage of op-amps. In another design 15 , the CNN based SAT solving algorithm 9 used in our work was realized using op-amp based integrators. The power consumption was reported as 140 μW for a 4-variable, 4-clause problem. For comparison, we evaluated the average power consumption of our MTJ based solver for similar 4-variable 4-clause SAT instances, and witnessed a power consumption of 84 μW, which is about 40% smaller than the aforementioned analog hardware design 15 .
Digital hardware for realizing typical SAT solving algorithms such as GRASP 11 , DPLL 33 etc. can be found in literature [34][35][36] . A custom IC 34 designed for such an algorithm reported a power consumption of 871 μW. This is about 37% lower than our design. However, these digital approaches are slower than analog solvers due to the step-by-step decision making and backtracking required for solving a problem. It is also noteworthy that our proposal is an asynchronous method in contrast to the above synchronous methods. Therefore, the need for a clock signal and the power associated with it is not present in our design. Our system automatically goes to a minimum energy point (which is a solution) and stabilizes there as explained mathematically in Supplementary section S4.
In order to obtain the computation time of our system, we conducted system level simulations using random 1,000, 20-variable hard SAT instances. The time taken to solve all 1,000 problems (100%) were evaluated and the average computation time per instance was 553 ns. This computation time was then compared with a state-of-the-art software SAT solver, Minisat 37 . The same 1,000 problems were solved using Minisat in a 3.6 GHz processor and the average computation time was evaluated (1.44 ms). We observed an average speed-up  Table 2 for comparison. Note that the Boolean Constraint Propagation (BCP) accelerator 38 in Table 2 has reported the speed up with respect to a purely software based algorithm (modified zChaff 10 ) which has the same performance for BCP when compared with Minisat. Furthermore, the analog CNN based system 9 built using op-amp based integrators 15 has an average computation time of 15 μs for 10 variable, 2-SAT problems. Our MTJ based proposal has an average run time (over 1,000 instances) of 186 ns for 10 variable 3-sat (α c = 4.25) problems (showing that the MTJ based solver is ∼30× faster).

Conclusion
Boolean satisfiability is an NP-complete problem (k ≥ 3) that finds utility in vast array of applications [1][2][3][4][5] . Analog solutions to the satisfiability problem has recently appeared attractive 8,9,15 due to the massive parallelism available when solving, in contrast to the stepwise search algorithms. In this work, we provide a proof of concept hardware based analog SAT-solver using Magnetic Tunnel Junctions driven by the Spin Orbit Torque Phenomenon. We have mathematically shown how the inherent device physics of MTJs closely mimics an existing analog approach 9 to solving the Boolean satisfiability problem. Device and circuit level simulations were conducted to solve hard satisfiability problems in order to observe the performance and functionality of our proposed system. According to the observations, we witnessed that the proposed SAT solver is capable of finding a solution to a significant fraction (>85%) of hard SAT problems in polynomial time. We conjecture that this is due to the inherent thermal noise present in MTJs and the device complexities added on top of the existing analog approach 9 . The SAT solver automatically comes out of local minimum points and limit cycles due to thermal noise. Therefore, it is highly probable that the system reaches a solution if sufficient time is provided, given that the SAT problem has a solution. Further, the variation analysis illustrates that our proposed solver is robust to variations in the MTJ thickness in the range of −5% to 10%. Larger variations result in higher average convergence time. The proposed MTJ based SAT solver is 2.6 × 10 3 times faster than a state-of-the-art software solver, Minisat.

Methods
The set of equations involved in modelling the HM-MTJ structures is provided in Supplementary sections S2, S3 and S4. The t ss of the MTJs was calculated by self consistently solving the equation (8) and the analytical equations for the demagnetization factors 39 . IBM 45 nm technology node was used to simulate the CMOS interface circuitry. The material parameters (selected according to the experimental papers 24,40 ) and the device dimensions are summarized in supplementary Table S1.