Efficient Solution of Boolean Satisfiability Problems with Digital MemComputing

Boolean satisfiability is a propositional logic problem of interest in multiple fields, e.g., physics, mathematics, and computer science. Beyond a field of research, instances of the SAT problem, as it is known, require efficient solution methods in a variety of applications. It is the decision problem of determining whether a Boolean formula has a satisfying assignment, believed to require exponentially growing time for an algorithm to solve for the worst-case instances. Yet, the efficient solution of many classes of Boolean formulae eludes even the most successful algorithms, not only for the worst-case scenarios, but also for typical-case instances. Here, we introduce a memory-assisted physical system (a digital memcomputing machine) that, when its non-linear ordinary differential equations are integrated numerically, shows evidence for polynomially-bounded scalability while solving "hard" planted-solution instances of SAT, known to require exponential time to solve in the typical case for both complete and incomplete algorithms. Furthermore, we analytically demonstrate that the physical system can efficiently solve the SAT problem in continuous time, without the need to introduce chaos or an exponentially growing energy. The efficiency of the simulations is related to the collective dynamical properties of the original physical system that persist in the numerical integration to robustly guide the solution search even in the presence of numerical errors. We anticipate our results to broaden research directions in physics-inspired computing paradigms ranging from theory to application, from simulation to hardware implementation.

[Fig. 1 caption, beginning truncated] … converted into 10 voltage nodes (inner nodes) and 43 self-organizing OR gates [11]. The black nodes traditionally associated with the output of the OR gates are fixed to TRUE to enforce the constraints. Dashed lines in the circuit represent NOT gates on the OR gate terminals.
Ignoring the black nodes, the circuit can be interpreted as a factor graph with the gates becoming function nodes (see also Fig. 3). The clause represented by the highlighted self-organizing OR gate is (ȳ_i ∨ y_j ∨ ȳ_k), where NOT gates invert the polarity of the voltages. The double-headed arrow indicates this is a self-organizing logic gate with no distinction between an input and an output (terminal agnosticism). The circular representation of the linear circuit is a reminder that the ordering of gates is irrelevant to the solution search.

[…] in Ref. 8. As the problem size grows, the chaotic behavior translates into an exponentially increasing number of integration steps required to find the equilibrium points of the corresponding ODEs.

The digital memcomputing approach
In recent years, a different physics-inspired computational paradigm has been introduced, known as digital memcomputing [10,12]. Digital memcomputing machines (DMMs) are non-linear dynamical systems specifically designed to solve constraint satisfaction problems, e.g., 3-SAT, with the assistance of memory [10] (Fig. 1). The only equilibrium points of the DMM are the solutions of the original problem. Unlike previous work, DMMs are designed so that they have no other equilibrium points; see Sec. VI.D of the supplementary material (SM). Additionally, the […]

Box 1. DMM for 3-SAT

The 3-SAT formula is constructed by applying conjunction (AND), disjunction (OR), and negation (NOT) operations to Boolean variables (TRUE or FALSE), with parentheses used to indicate the order of operations [25].
A formula contains N Boolean variables (y_i), M clauses, and 3M literals. Each clause (constraint) consists of three literals connected by logical OR operations, i.e., (l_i ∨ l_j ∨ l_k), where a literal, l_i, is simply one of the Boolean variables (l_i = y_i) or its negation (l_i = ȳ_i). A clause is satisfied if at least one literal is TRUE (OR operations), and the formula is satisfiable if all clauses (AND operations) are simultaneously satisfied. The complexity of the problem emerges from the interaction among constraints, and is observed in the well-studied easy-hard-easy transition in 3-SAT, where easy and hard regimes are identified by the ratio α_r = M/N, with the complexity peak (hardest instances) occurring around α_r ≈ 4.27 [26].
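As a concrete illustration of these definitions, the following Python sketch (an illustration, not the paper's code) checks whether an assignment satisfies a 3-SAT formula, using a DIMACS-style encoding in which a signed integer denotes a literal:

```python
def is_satisfied(clauses, assignment):
    """Check whether a truth assignment satisfies a 3-SAT formula.

    clauses: list of 3-tuples of signed ints (DIMACS style), e.g. (1, -2, 3)
             means (y1 OR NOT y2 OR y3).
    assignment: dict mapping variable index -> bool.
    """
    def literal_true(lit):
        value = assignment[abs(lit)]
        return value if lit > 0 else not value

    # The formula (AND of clauses) holds iff every clause (OR of literals) holds.
    return all(any(literal_true(l) for l in clause) for clause in clauses)

# (y1 OR NOT y2 OR y3) AND (NOT y1 OR y2 OR y3)
clauses = [(1, -2, 3), (-1, 2, 3)]
print(is_satisfied(clauses, {1: True, 2: True, 3: False}))   # True
print(is_satisfied(clauses, {1: True, 2: False, 3: False}))  # False
```

The encoding and function names are our own choices for illustration.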
To construct a DMM that finds a satisfying assignment for 3-SAT we follow the general procedure outlined in Ref. 12. To begin, the Boolean variables, y i , are transformed into continuous variables for use in the DMM. The continuous variables can be realized in practice as voltages on the terminals of a self-organizing OR gate [10].
Such a gate can influence its terminals to push voltages towards a configuration satisfying its OR logic, regardless of whether the signal received by the gate originates from the traditional input or the traditional output (see […]). To represent a negated variable, one trivially multiplies that quantity by −1. The self-organizing logic circuit that comprises the DMM is built by connecting all of the self-organizing OR gates (see Fig. 1). See Sec. III.A of SM for an extended discussion of the thresholding procedure for the voltages.
Next, we interpret a Boolean clause as a dynamical constraint function, with its state of satisfaction determined by the voltages. The m-th Boolean clause, (l_{i,m} ∨ l_{j,m} ∨ l_{k,m}), becomes a constraint function,

C_m(v_i, v_j, v_k) = (1/2) min(1 − q_{i,m} v_i, 1 − q_{j,m} v_j, 1 − q_{k,m} v_k),

where q_{i,m} = 1 if l_{i,m} = y_i, and q_{i,m} = −1 if l_{i,m} = ȳ_i. The function is bounded, C_m ∈ [0, 1], and a clause is necessarily satisfied when C_m < 1/2. The instance is solved when C_m < 1/2 for all clauses. By thresholding the clause function we avoid the ambiguity associated with v_i = 0. If some voltage is ambiguous (v_j = 0) and all clauses are satisfied, then any Boolean assignment to y_j will be valid in that configuration. The use of a minimum function in C_m preserves an important property of 3-SAT: a clause is a constraint, and, by itself, a clause can only constrain one variable (via its literal). The values of two literals are irrelevant to the state of the clause if the third literal results in a satisfied clause. (Note that the minimum operation introduces some form of discontinuity to the dynamical system, for which we develop the formalism in Secs. IV and V of SM.)
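The clause function described above, a bounded minimum over the three literal terms, can be sketched in a few lines of Python (an illustration in our own notation, not the paper's MATLAB code):

```python
def clause_function(q, v):
    """Continuous clause function C_m: half the minimum of the three
    literal terms (1 - q_{n,m} v_n), so C_m is in [0, 1] and the clause
    is necessarily satisfied when C_m < 1/2.

    q: polarities (+1 or -1) of the three literals; v: the three voltages in [-1, 1].
    """
    return 0.5 * min(1 - qi * vi for qi, vi in zip(q, v))

# Clause (y_i OR NOT y_j OR y_k) -> polarities (+1, -1, +1).
q = (1, -1, 1)
print(clause_function(q, (1.0, 1.0, -1.0)))   # 0.0  (v_i = +1 satisfies the clause)
print(clause_function(q, (-1.0, 1.0, -1.0)))  # 1.0  (all three literals false)
```

Note how a single satisfying literal drives the minimum, and hence C_m, to zero, regardless of the other two voltages.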
Finally, a DMM employs memory variables to assist with the computation [10,12]. The memory variables transform equilibrium points that do not correspond to solutions of the Boolean formula into unstable points in the voltage space (see Sec. VIII of SM), leaving the solutions of the 3-SAT problem as the only minima. We choose to introduce two memory variables per clause: short-term memory, x_{s,m}, and long-term memory, x_{l,m}.
The terminology intuitively describes the behavior of their dynamics. The short-term memory x_{s,m} lags C_m, acting as an indicator of the recent history of the clause. The long-term memory x_{l,m} collects information so it can "remember" the most frustrated clauses, weighting their dynamics more than clauses that are "historically" easily satisfied. Both the number and type of memory variables, as well as the form of the resulting dynamical equations, are not unique, provided neither chaotic dynamics nor periodic orbits are introduced [12].
We choose for the dynamics of voltages and memory variables the following:

v̇_n = Σ_m x_{l,m} x_{s,m} G_{n,m}(v_n, v_j, v_k) + (1 + ζ x_{l,m})(1 − x_{s,m}) R_{n,m}(v_n, v_j, v_k),   (8)

ẋ_{s,m} = β (x_{s,m} + ε)(C_m − γ),   (9)

ẋ_{l,m} = α (C_m − δ),   (10)

with the gradient-like and rigidity terms

G_{n,m}(v_n, v_j, v_k) = (1/2) q_{n,m} min(1 − q_{j,m} v_j, 1 − q_{k,m} v_k),
R_{n,m}(v_n, v_j, v_k) = (1/2)(q_{n,m} − v_n) if C_m = (1/2)(1 − q_{n,m} v_n), and 0 otherwise,

where G_{n,m} and R_{n,m} equal 0 when variable n does not appear in clause m, and the summation is taken over all clauses. The parameters α and β are the rates of growth for the long-term and short-term memory variables, respectively.
Each memory variable has a threshold parameter used for evaluating the state of C_m, and the two parameters are restricted to obey δ < γ < 1/2. (This also guarantees that there is a sufficiently large basin of attraction for the solutions; see Sec. VII of SM for a detailed explanation.) Eq. (9) has a small, strictly-positive parameter, 0 < ε ≪ 1, to remove the spurious solution (x_{s,m} = 0). However, ε additionally serves as a trapping rate, in the sense that smaller values of ε make it more difficult for the system to flip voltages when some C_m begins to grow larger than γ.
In Eq. (8), the first term in the summation is a "gradient-like" term, and the second term is a "rigidity" term [16].
The gradient-like term attempts to influence the voltage in a clause based on the value of the other two voltages in the associated clause. Consider the two extremes: if the minimum results in G_{i,m} = 1, then v_i needs to be influenced to satisfy the clause. Conversely, if the minimum gives G_{i,m} = 0, then v_i does not need to influence the clause state (see Sec. II.A of SM).
The purpose of the three rigidity terms for a constraint is to attempt to hold one voltage at a value satisfying the associated m-th clause, while doing nothing to influence the evolution of the other two voltages in the constraint. Again, this aligns with the 3-SAT interpretation that a clause can only constrain one variable. The short-term memory variable acts as a switch between gradient-like dynamics and rigid dynamics. During the solution search, G_{n,m} will seek to influence the three voltages of clause m until the clause has been satisfied. Then, as x_{s,m} decays to zero, R_{n,m} takes over. The long-term memory variables weight the gradient-like dynamics, giving greater influence to clauses that have been more frustrated during the solution search. The rigidity is also weighted by x_{l,m}, but reduced by ζ.
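To make the interplay of the terms in Eqs. (8)-(10) concrete, here is a minimal Python sketch of the right-hand side of the dynamics for one evaluation. It is a didactic reconstruction, not the paper's MATLAB implementation, and the parameter values shown are placeholders rather than the paper's:

```python
def rhs(clauses, v, xs, xl, alpha=5.0, beta=20.0, gamma=0.25, delta=0.05,
        eps=1e-3, zeta=0.1):
    """Right-hand side of the DMM dynamics (Eqs. (8)-(10), reconstructed).

    clauses: list of (indices, polarities) per clause, e.g. ([0, 4, 7], [1, -1, 1]).
    v, xs, xl: lists of voltages, short-term and long-term memories.
    Returns the time derivatives (dv, dxs, dxl).
    """
    dv = [0.0] * len(v)
    dxs, dxl = [], []
    for m, (idx, q) in enumerate(clauses):
        terms = [1 - q[a] * v[i] for a, i in enumerate(idx)]
        C = 0.5 * min(terms)  # clause function C_m
        for a, i in enumerate(idx):
            # Gradient-like term: minimum over the OTHER two literals of the clause.
            others = [terms[b] for b in range(3) if b != a]
            G = 0.5 * q[a] * min(others)
            # Rigidity term: acts only on the literal realizing the clause minimum.
            R = 0.5 * (q[a] - v[i]) if terms[a] == min(terms) else 0.0
            # x_s switches between gradient-like and rigid dynamics; x_l weights both.
            dv[i] += xl[m] * xs[m] * G + (1 + zeta * xl[m]) * (1 - xs[m]) * R
        dxs.append(beta * (xs[m] + eps) * (C - gamma))  # Eq. (9)
        dxl.append(alpha * (C - delta))                 # Eq. (10)
    return dv, dxs, dxl
```

For a clause already satisfied with certainty (all three terms zero), both G and R vanish and the memories decay, matching the behavior described above.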

Numerical results and discussion
It is important to realize that any simulation of a dynamical system is an algorithm, because the continuous-time dynamics of the system must be discretized. Identifying our simulation as an algorithm invites a method to compare our results with those of popular algorithms, specifically, WalkSAT [23] and survey inspired decimation (SID) [5].

[Displaced figure caption: The transition at left is characterized by 13 clauses becoming satisfied; the transition at right results in 4 clauses becoming satisfied. Neither transition results in satisfied clauses becoming unsatisfied.]
Before we can compare results, however, we need a general definition of a step.
We define an algorithmic step to be all the computation that occurs between checks of satisfiability. The WalkSAT algorithm flips one variable at a time then checks the satisfiability of the formula. Therefore, a WalkSAT step is a single variable flip. SID uses WalkSAT as part of its solution search, so the interpretation of steps is the same when SID uses WalkSAT. Prior to entering into WalkSAT, SID performs a message-passing procedure known as survey propagation [5]. In the SID implementation we used [27] there is no check for satisfiability during the decimation procedure, so we generously identify the entire survey propagation with decimation as a single step. Our DMM algorithm checks the satisfiability of the formula after each time step of the integration. Of course, the amount of computation within a step may vary greatly based on the algorithm, but this does not affect comparison of the scalability. In fact, if an algorithm is exponential in the number of steps, then the amount of computation within a step cannot improve its scalability. For our DMM, each step has a constant amount of computation per time step of integration. With this definition of an algorithmic step, we have a method to meaningfully compare the different algorithms.
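The step convention above can be illustrated with a minimal WalkSAT sketch in Python. This is our own didactic version, not the optimized implementation benchmarked in the paper; one algorithmic step is one variable flip followed by a satisfiability check:

```python
import random

def walksat(clauses, n_vars, max_steps=10**5, p=0.5, seed=0):
    """Minimal WalkSAT sketch (illustrative, not the benchmarked solver).

    clauses: 3-tuples of signed ints (DIMACS style); returns (assignment, steps)
    or (None, max_steps) on failure.
    """
    rng = random.Random(seed)
    assign = {i: rng.choice([True, False]) for i in range(1, n_vars + 1)}
    lit = lambda l: assign[abs(l)] if l > 0 else not assign[abs(l)]
    for step in range(1, max_steps + 1):
        unsat = [c for c in clauses if not any(lit(l) for l in c)]
        if not unsat:
            return assign, step  # satisfiability check passed
        clause = rng.choice(unsat)
        if rng.random() < p:
            # Random-walk move: flip a random variable of the broken clause.
            var = abs(rng.choice(clause))
        else:
            # Greedy move: flip the variable leaving the fewest unsatisfied clauses.
            def unsat_after_flip(x):
                assign[x] = not assign[x]
                count = sum(not any(lit(l) for l in c) for c in clauses)
                assign[x] = not assign[x]
                return count
            var = min((abs(l) for l in clause), key=unsat_after_flip)
        assign[var] = not assign[var]  # one WalkSAT step = one flip
    return None, max_steps
```

Counting steps this way is what allows the flip count of WalkSAT to be compared against the integration-step count of the DMM simulation.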
We can now test these approaches on CDC instances with planted solutions. In Sec. III.C of the SM, we give an account of how these instances are generated, and why they are difficult to solve. Here, we just note that difficult CDC instances are created when α r > 4.25 and 0.077 < p 0 < 0.25, where p 0 is the probability that the planted solution results in a clause with zero false literals [18]. We have performed no preprocessing on the 3-SAT instances to reduce their size, not even the removal of pure literals (those appearing wholly negated or unnegated) [28].
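To make the instance class concrete, here is a hedged Python sketch of planted-solution generation in the spirit of the CDC construction of Ref. 18. The exact clause-type balancing used there is not reproduced; the p1, p2 weights below are placeholders, and only the defining property is kept: clause types are drawn by their number of false literals under the planting, and clauses with three false literals are excluded:

```python
import random

def planted_cdc_instance(n_vars, n_clauses, p=(0.08, 0.46, 0.46), seed=0):
    """Sketch of planted-solution 3-SAT generation (CDC-style, simplified).

    A planted assignment is fixed (all-TRUE, w.l.o.g.). Each clause is drawn
    with 0, 1, or 2 FALSE literals under the planting, with probabilities
    p = (p0, p1, p2); clauses with 3 false literals are excluded, so the
    planted assignment satisfies the formula by construction.
    """
    rng = random.Random(seed)
    clauses = []
    while len(clauses) < n_clauses:
        vars_ = rng.sample(range(1, n_vars + 1), 3)
        # Under the all-TRUE planting, a positive literal is TRUE, a negative one FALSE.
        n_false = rng.choices([0, 1, 2], weights=p)[0]
        signs = [-1] * n_false + [1] * (3 - n_false)
        rng.shuffle(signs)
        clauses.append(tuple(s * x for s, x in zip(signs, vars_)))
    return clauses

clauses = planted_cdc_instance(20, 86)  # alpha_r = 4.3
# The all-TRUE assignment satisfies every clause by construction.
assert all(any(l > 0 for l in c) for c in clauses)
```

The hardness tuning via p0 discussed above corresponds to the first entry of the weight vector p.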
In Fig. 2, for the problem sizes tested, we find a power-law scaling for the median number of integration steps for the simulations of DMMs. We also find that the integration time variable (t), CPU time, and long-term memory (x_l) are bounded by a polynomial scaling, and the average step size shows power-law decay (see Sec. II.C of SM). The optimized WalkSAT algorithm [29] we have used instead exhibits an exponential scaling at relatively small problem sizes, confirming the previous results of Ref. 18. An exponential scaling is also observed for the SID algorithm [27].
The CDC instances are structured to confuse stochastic local-search algorithms, so the exponential scaling of WalkSAT is expected (right inset of Fig. 2). To understand the exponential performance of SID (left inset of Fig. 2), we need to understand the success of SID on random 3-SAT. When generating uniform random 3-SAT at the complexity peak with a general method (no planted solutions), the typical case can be exploited by SID due to the existence of treelike structures in the factor graph [30]. (For those unfamiliar with factor graphs: if the factor graph were a tree, then one would be able to visually, and thus easily, find the solution from the graph [31].) However, as demonstrated in Fig. 2, SID performs poorly when given a 3-SAT instance with a factor graph that is not locally treelike. It is also known that SID performs poorly at high ratios (α_r ≳ 4.25) [6], as loops in the factor graph become more common, explaining the opposite scaling trend seen in Fig. 2.
To further confirm that the usefulness of our DMM algorithm on CDC instances is independent of our generation of formulae, we have solved generalized CDC instances [19] used in the 2017 [20] and 2018 [21] SAT competitions (satcompetition.org). Our modified competition DMM solves all tested competition CDC instances on its first attempt with random initial conditions, and does so within the 5000-second timeout established by the competition (see Sec. II.E of SM). Even with the simple forward-Euler integration scheme, the overhead of numerically simulating ODEs does not prevent our DMM from being competitive.

Long-range order and analytical properties of DMMs for 3-SAT
We finally show that collective behavior (long-range order) [14,15] in DMMs is responsible for the observed efficiency in the solution search. In order to do this, it is helpful to visualize subgraphs of the factor graph generated from a 3-SAT instance. In Fig. 3, we visualize the change in state of local factor graphs during a single time step of integration as our DMM approaches a solution. It is apparent that the system explores many paths in the factor graph, collecting information as it does. However, unlike SID, when the DMM explores a path leading to contradiction it can correct itself. The factor graphs shown in Fig. 3 only include clauses (function nodes) that are unsatisfied (red) or recently unsatisfied (green), and all variable nodes connected to these clauses. A clause, m, is identified as recently unsatisfied if the short-term memory is x s,m > 0 but the clause is currently satisfied. The factor graph transitions show that collective events occur that satisfy multiple clauses. This is in agreement with many results on DMMs for different types of problems [14,32]. Additionally, the factor graph transition on the left of Fig. 3 breaks up the graph into smaller, disconnected factor graphs, making the search exponentially more efficient.
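The subgraph-selection rule used for this visualization can be expressed compactly. The following Python sketch (our own illustration, using plain sets rather than a graph library) keeps exactly the function nodes described above, unsatisfied clauses and satisfied clauses with lingering short-term memory, plus the variable nodes they touch:

```python
def active_subgraph(clauses, sat, xs):
    """Build the Fig.-3-style subgraph of the factor graph.

    Keeps function nodes for clauses that are unsatisfied, or satisfied but
    with short-term memory x_{s,m} > 0 ("recently unsatisfied"), plus every
    variable node connected to them.

    clauses: list of literal tuples (signed ints); sat: list of bools;
    xs: list of short-term memory values.
    Returns (function_nodes, variable_nodes, edges).
    """
    fnodes = {m for m in range(len(clauses))
              if (not sat[m]) or xs[m] > 0}  # "red" or "green" nodes
    edges = [(m, abs(l)) for m in fnodes for l in clauses[m]]
    vnodes = {v for _, v in edges}
    return fnodes, vnodes, edges

f, v, e = active_subgraph([(1, -2, 3), (2, 4, -5)], sat=[False, True], xs=[0.3, 0.0])
print(sorted(f), sorted(v))  # [0] [1, 2, 3]
```

Tracking how the connected components of this subgraph split apart over integration steps is what reveals the collective transitions described above.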
As anticipated, to strengthen these numerical results, we have also analytically demonstrated that the dynamics described by Eqs. (8), (9), and (10) terminate only when the system has found the solution to the 3-SAT problem; namely, the phase space has only saddle points and the minima corresponding to the solutions of the given problem (Secs. VI and VII of SM), and the continuous-time dynamics reach those solutions in a time growing at most polynomially with problem size (Sec. XI of SM). However, note that such scalability does not necessarily translate to the same scalability of the numerical integration of Eqs. (8), (9), and (10), where the discretization of time is necessary. Nevertheless, due to the absence of chaos, we empirically find that the scalability of our numerical simulations is still polynomially bounded for typical-case CDC instances.

Conclusions
We have presented an efficient dynamical-system approach to solving Boolean satisfiability problems. Along with arguments for polynomial-time scalability in continuous time, we have found that the numerical integration of the corresponding ODEs shows power-law scalability for typical cases of 3-SAT instances that require exponential time to solve with successful algorithms. The efficiency derives from collective updates to the variables during the solution search (long-range order).
In contrast to previous work [8], our dynamical systems do not suffer from exponential fluctuations in the energy function due to chaotic behavior. The dynamical systems we propose find the solution of a given problem without ever entering a chaotic regime, by virtue of the variables being bounded. The implication is that a hardware implementation of DMMs would only require a polynomially-growing energy cost. Our work then also serves as a counterexample to the claim of Ref. 8 that chaotic behavior is necessary for the solution search of hard optimization problems.

Supplementary Material

For the benefit of the reader we summarize the major results presented in this Supplementary Material (SM).
• In Section II we describe the numerical method and implementation we used to solve Eqs. (2)-(4) in the main text. We also show several other numerical results on additional 3-SAT instances to support the ones reported in the main text. In particular, we show that the time variable of integration, CPU time, and slow memory variables all scale as a power law in the size of the problem. We also show that the average time step of the integration needs only to decrease as a power law with increasing problem size.
• In Sections IV and V, we show that a unique dynamical trajectory can be constructed for the discontinuous flow field governing our dynamics. For practical purposes, the analytic trajectory is constructed such that it is approximated by the numerical trajectory obtained with the forward Euler integration method used in our numerical analysis reported in the main text.
• In Section VI, we show that our dynamics are bounded by a positively invariant compact set, and the dynamics terminate only when the system has found the solution to the 3-SAT problem. This guarantees a correspondence between the fixed points of the dynamics and the solutions of the 3-SAT problem, and the absence of local minima.
• In Section VII, we show that the basin of attraction of the solution for our flow field contains a large hypercube in the voltage space. In other words, once the trajectory has entered this region, the dynamics are guaranteed to converge to a solution.
• In Section IX, we show the absence of periodic orbits in the voltage dynamics. This result, augmented by the fact that the memcomputing flow is not topologically transitive, implies absence of chaos (à la Devaney).
• In Section X, we show that our system is dissipative, in the sense that the volume of any initial set in the phase space contracts under the flow field.
• In Section XI we show, using topological field theory, that the continuous-time dynamics reach the fixed points in a time that scales with problem size, n, as O(n^α) with α ≤ 1. This result does not necessarily apply to the numerical solution of the dynamical equations due to integration overhead and numerical noise.

A. Numerics
For ease of discussion, the equations of motion for the digital memcomputing machine (DMM) are reproduced here.
The m-th Boolean clause, (l_{i,m} ∨ l_{j,m} ∨ l_{k,m}), becomes a clause function,

C_m(v_i, v_j, v_k) = (1/2) min(1 − q_{i,m} v_i, 1 − q_{j,m} v_j, 1 − q_{k,m} v_k),   (7)

where q_{n,m} = 1 if l_{n,m} = y_n, and q_{n,m} = −1 if l_{n,m} = ȳ_n. The DMM's equations then read:

v̇_n = Σ_m x_{l,m} x_{s,m} G_{n,m}(v_n, v_j, v_k) + (1 + ζ x_{l,m})(1 − x_{s,m}) R_{n,m}(v_n, v_j, v_k),   (8)

ẋ_{s,m} = β (x_{s,m} + ε)(C_m − γ),   (9)

ẋ_{l,m} = α (C_m − δ),   (10)

where G_{n,m} and R_{n,m} equal 0 when variable n does not appear in clause m.
In Eq. (8), each of the N voltages (variables) is guided by M constraints (clauses). Each constraint influences three voltages simultaneously, while switching between two dynamical terms containing a gradient-like function, G_{n,m}, and a "rigidity" function, R_{n,m}.
In addition to the voltages, memcomputing utilizes memory variables to assist with the computation. The short-term memory, x_{s,m}, controls the switching between G_{n,m} and R_{n,m}. The long-term memory, x_{l,m}, collects information so it can "remember" the most frustrated constraints (unsatisfied clauses), weighting their dynamics more than clauses that are "historically" easily satisfied.
To understand the gradient-like function better, consider the two extremes: if G n,m = 1, then v n needs to be influenced to satisfy the clause. (Recall, there are three voltages associated with the m-th constraint, but, independent of information from other constraints, no determination can be made on which voltage needs to be influenced.) Conversely, if G n,m = 0, then v n does not currently need to influence the m-th constraint state. The purpose of the rigidity term, R n,m , is to attempt to hold one voltage at a value satisfying the associated m-th constraint, but to do nothing to influence the evolution of the other two voltages in the constraint.
The long-term memory variable weights the gradient-like dynamics, giving greater influence to constraints that have been more frustrated during the solution search. The rigidity term is also weighted by x_{l,m}, but reduced by ζ. The parameter ζ can be thought of as a "learning rate". More difficult instances, as characterized by their clause-to-variable ratio, require more time for x_{l,m} to evolve (slower learning rate) so the phase space can be more efficiently explored.
Note that the memory dynamics generate a dynamical energy landscape under which the voltages evolve. This guarantees that the trajectory has the ability to escape any local minima of the original, static energy landscape of the Boolean satisfiability problem. Visually, whenever the voltages fall into a local minimum of the original problem, the memory variables "deform" the energy landscape in such a way that the local minimum is transformed into a saddle point, and the trajectory is allowed to continue exploring the energy landscape until it finds the global minimum, which is left invariant by the memory variables (see proposition VI. 5). An extended discussion of such dynamical properties is given in Section VIII C.
It is advised to avoid γ = 1/2 or γ = δ. When C_m = 1/2 and γ = 1/2 we find ẋ_{s,m} = 0, and the system has difficulty leaving the ambiguous state. To avoid this complication, assign γ < 1/2. Eq. (10) gives x_{l,m} the ability to decay, and it aids the dynamics to have 0 < δ < γ. Assigning δ less than γ allows the system an indirect means to influence x_{s,m} when ẋ_{s,m} = 0 and C_m ≠ 0, by allowing x_{l,m} to continue to grow (ẋ_{s,m} = 0 and ẋ_{l,m} > 0 ⟹ γ = C_m > δ). The parameter 0 < ε ≪ 1 is chosen as a small positive number to guarantee that the dynamics of the short-term memory do not terminate at x_{s,m} = 0.
Equations (8)-(10) have been numerically integrated with the forward Euler method using an adaptive time step, Δt ∈ [2^−7, 10^3], until all clauses have been satisfied, as determined from thresholding Eq. (7) for all m. The code has been written in interpreted MATLAB R2019b. Each attempt at solving a clause distribution control (CDC) instance was performed on a single core (no parallelization employed) of an AMD EPYC 7401 server.
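The integration loop just described can be sketched as follows. This is a minimal Python analogue (the paper's code is MATLAB); the accept/reject rule used to adapt the step, based on the largest component change, is our own assumption, not the paper's exact criterion:

```python
def adaptive_euler(rhs, state, dt_min=2**-7, dt_max=1e3, dt0=1.0,
                   grow=1.2, shrink=0.5, max_dv=1.0, max_steps=10**6):
    """Forward Euler with an adaptive step clamped to [2^-7, 10^3].

    rhs: function mapping the state (list of floats) to its derivative.
    The step is shrunk when the proposed update is too violent (assumed
    criterion: max component change > max_dv) and grown after accepted steps.
    Returns (final_state, steps_taken).
    """
    dt = dt0
    for step in range(1, max_steps + 1):
        deriv = rhs(state)
        if max(abs(d) for d in deriv) * dt > max_dv and dt > dt_min:
            dt = max(dt * shrink, dt_min)   # reject: retry with a smaller step
            continue
        state = [s + dt * d for s, d in zip(state, deriv)]
        dt = min(dt * grow, dt_max)         # accept: cautiously grow the step
        if all(abs(d) < 1e-9 for d in deriv):
            return state, step              # reached an equilibrium point
    return state, max_steps
```

In the DMM setting, the termination test would instead be "all clauses satisfied" via the thresholded clause functions; the derivative-based test here keeps the sketch self-contained.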
Note that the above integration scheme is the most basic and, hence, the most unstable we could implement. We thus expect more refined integration schemes may provide both better stability and scaling.

We find power-law scaling (Fig. 4(b)) in the time variable, t (arb. units), and in the CPU time (a = 3.2 in Fig. 4(c)), measured in seconds by MATLAB.
We also monitored the growth of x_l to make sure there were no exponential "energy" costs. For each instance, we collect the maximum value of x_l, then find the median of those values. Finally, we observe a power-law decay in the mean size of the time step of our adaptive integration scheme as a function of problem size (Fig. 4(a)). In other words, as the problem size increases, the average time step decreases with a power-law lower bound, rather than decaying exponentially. This observation invites modifications for speeding up solutions without introducing exponential growth into the system.

C. Trends for different values of p0
In the generation of Barthel instances [18], the parameter p_0 increases the backbone size as p_0 → 0.25 (see also Sec. III C 2). A large backbone suggests, though does not guarantee, a more difficult instance to solve, because the solution space is smaller (fewer solutions). In Fig. 5, we indeed see that the exponent of the power-law scaling of the typical-case (median) CDC instances increases with increasing p_0.
The increase of backbone size also seems to cause issues with the forward-Euler integration scheme. We observe that our DMM algorithm encounters integration issues when attempting to extend these trends farther. This indicates that reducing the lower bound of the time step and/or a better integration scheme would be beneficial. Thus, we terminate simulations when the median number of steps is beyond 10 8 .
To effectively sample the distribution for typical-case analysis requires a larger sample size per data point. We also include the competition winner, YalSAT [34], which clearly showcases exponential scaling. Here, we show results beyond our typical-case analysis without changing any parameters or integration scheme. We find a power-law trend as a function of problem size, N, at both the 10-th percentile and 90-th percentile for p_0 = 0.08.
Notice how the slopes of α_r = 5, 6 appear to be converging. This may indicate that finite-size effects contribute to the variance of solution steps. In Fig. 6, the α_r = 6 data points at N = 10^6 fall below their respective power-law trend lines. This behavior was also observed in Fig. 2 of the main text for α_r = 6, 7, 8.

E. Competition instances
We sought an independent verification of our DMMs by applying them to instances taken from previous SAT competitions [20,21]. Our solver was not designed for competition, so we added a heuristic to enhance its performance.
Some competition instances are labeled "barthel" (α_r = 4.3), "komb" (α_r = 5.205), and "qhid" (α_r = 5.5). We chose to focus on the "small" competition instances because so many competition solvers failed to solve them. For instance, in the 2017 Random Track there were 120 "small" instances that should be "easy" to solve in 5000 seconds.
However, the 2017 Random Track winner (YalSAT) solved 124 out of 300 competition instances [20]. Similarly, in the 2018 Random Track there were 165 "small" instances that should be "easy" to solve in 5000 seconds. The 2018 Random Track winner (Sparrow2Riss-2018) solved 188 out of 255 competition instances [21]. With the addition of more heuristics to our system, our DMM algorithm could possibly surpass previous competition performances.
We modified our algorithm to perform in the context of competition, by making one major modification: each constraint has its own α m associated with C m , and it is modified in regular intervals during the solution search.
Initially, for all clauses, α_m = 5, and all other parameters remain unchanged from the main text. The search for the solution is initialized as before, but after 10^4 arbitrary time units the simulation is paused to modify the values of α_m. The procedure starts by finding the median of the x_{l,m} values for all m. If x_{l,m} is greater than the median, then the corresponding α_m is increased by a multiplicative factor of 1.1; otherwise, the corresponding α_m is decreased by a multiplicative factor of 0.9. To prevent decay to zero, α_m = 1 is the minimum value. If x_{l,m} grows to its maximum cutoff, the process restarts by setting x_{l,m} = 1 and α_m = 1. The integration is resumed without modification to any other variables or parameters, and will repeat after another 10^4 arbitrary time units.
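The per-clause rate update just described can be sketched in a few lines of Python (an illustration of the stated rule; the cutoff value x_l_max below is a placeholder, not the paper's):

```python
import statistics

def update_alphas(alpha_m, x_l, alpha_min=1.0, x_l_max=1e4):
    """One pause-and-adjust pass of the competition heuristic.

    Each clause's rate alpha_m is multiplied by 1.1 if its long-term memory
    x_{l,m} exceeds the median over all clauses, and by 0.9 otherwise
    (floored at alpha_min). A clause whose x_{l,m} hits its cutoff is reset
    to x_{l,m} = alpha_m = 1. Modifies and returns (alpha_m, x_l).
    """
    med = statistics.median(x_l)
    for m in range(len(alpha_m)):
        if x_l[m] >= x_l_max:                 # restart this clause's learning
            x_l[m], alpha_m[m] = 1.0, 1.0
        elif x_l[m] > med:
            alpha_m[m] *= 1.1                 # frustrated clause: push harder
        else:
            alpha_m[m] = max(alpha_m[m] * 0.9, alpha_min)
    return alpha_m, x_l

a, x = update_alphas([5.0, 5.0, 5.0], [10.0, 2.0, 6.0])
print(a)  # approximately [5.5, 4.5, 4.5]
```

In the full solver this pass would be invoked each time the integration is paused, i.e., every 10^4 arbitrary time units.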

F. Random 3-SAT
While we have chosen to use planted-solution 3-SAT instances for the stated reasons (solution existence known), other authors [24,30] choose to work with 3-SAT instances that lack clause distribution control and have no guarantee of the existence of a solution. Recall that the algorithms discussed herein are all incomplete SAT solvers, meaning they cannot prove a solution does not exist (UNSAT). Therefore, scalability tests on general random 3-SAT instances have a degree of uncertainty regarding whether it is possible to find solutions to all instances tested. The SID algorithm removes much of the uncertainty by manipulating a property of the SAT/UNSAT transition: for α r < 4.267, the probability that a randomly generated instance has a solution approaches 1 as N grows [30].
When N is small, it is unlikely all generated instances will be satisfiable, so the numerical simulation of AnalogSAT [24] takes another approach to generate satisfiable instances. Starting with random 3-SAT instances, the authors use another algorithm, MiniSAT [35], to filter the instances. That is, AnalogSAT is only tested on instances that MiniSAT can solve. However, this has the drawback of excluding 3-SAT instances that the filtering algorithm is incapable of solving.
The authors of Ref. [24] prefer wall time as the indicator used to show polynomial scaling, claiming it is a realistic measure of hardware. Therefore, we show scaling of both steps and wall time. In Fig. 8(d), for α_r = 3.4, we see the DMM's scalability of the maximum solution times, ∼ N^1.25, is approximately the same as that reported for AnalogSAT's scaling of the mean, ∼ N^1.26 [24]. In Fig. 8(e), for α_r = 3.8, we see the DMM's scalability of the maximum solution times, ∼ N^1.11, is better than that reported for AnalogSAT's scaling of the mean, ∼ N^1.63 [24]. Additionally, our range of N goes beyond N = 10^4. In Fig. 8(f), for α_r = 4.25, we see the DMM's scalability of the maximum solution times, ∼ N^3.55, is better than that reported for AnalogSAT's scaling of the mean, ∼ N^4.35 [24]. For the largest value of N tested, N = 463, the maximum wall time is 177 seconds, whereas AnalogSAT's mean wall time for the same value of N is ∼ 10^3 seconds.
For another test, we generated random 3-SAT instances at α_r = 4.25 with no planted solution (0-hidden). Because α_r = 4.25 lies to the left of the SAT/UNSAT transition, a randomly generated 3-SAT instance is satisfiable with high probability. Therefore, we should be able to solve more than 50% of the instances generated, and can use the median as another measure of scalability. We use the same DMM and parameters as presented in the main text, with ζ = 10^−2. In Fig. 9, we find power-law scalability for these instances as well.
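The power-law exponents quoted above can be extracted with an ordinary least-squares fit in log-log space. Below is a minimal sketch with made-up illustrative numbers (the actual benchmark data are those of Figs. 8 and 9):

```python
import math

# Hypothetical data for illustration only: problem sizes N and
# median solution times (the real data appear in Figs. 8 and 9).
N = [100, 200, 400, 800, 1600]
t_med = [0.9, 2.1, 4.8, 11.0, 25.0]

# A power law t ~ a * N^b is a straight line in log-log space, so the
# exponent b is the slope of a least-squares fit to (log N, log t).
xs = [math.log(n) for n in N]
ys = [math.log(t) for t in t_med]
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / \
    sum((x - x_bar) ** 2 for x in xs)
print(f"fitted scaling exponent: {b:.2f}")
```

The same fit applied to the maximum (rather than the median) solution times yields the worst-case exponents quoted in the text.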

III. CONTINUOUS 3-SAT
In this Section, we establish the formalism for studying the continuous version of the 3-SAT problem we have solved in the main text. This continuous version generates an energy landscape that we explore with the memcomputing dynamics (see Section VI). In addition, we provide a brief discussion on the class of planted 3-SAT instances [18] that we used in this paper as benchmarks. To facilitate the theoretical analysis we will also slightly change the notation so that we can write Eqs. (7)-(12) in a more compact way.

A. From Discrete to Continuous Variables
Consider a 3-SAT Boolean formula with n variables and m clauses, where α_r = m/n is commonly referred to as the clause density, as it is the ratio between the number of clauses and the number of Boolean variables 1 . We let +1 correspond to the true assignment of a Boolean variable, and −1 to the false assignment. We then map the n Boolean variables into n continuous variables, v ∈ [−1, +1]^n, which we term voltages. For each clause, we can define various energy functions.

Definition III.1 (Polarity and Constraint). Consider a 3-SAT Boolean formula with n Boolean variables and m clauses. We denote the i-th Boolean variable as x_i, and its polarity in the j-th clause as q_ij, with q_ij = +1 if x_i appears positively in clause j, q_ij = −1 if x_i appears negated in clause j, and q_ij = 0 if x_i does not appear in clause j. The polarity matrix, Q, is the matrix with the element on the i-th row and j-th column being q_ij. Note that a 3-SAT Boolean formula is completely specified by Q.
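A minimal sketch of the polarity-matrix representation just defined (the helper name and the signed-literal encoding are our own):

```python
# Polarity matrix sketch: Q[i][j] = +1 if x_i appears unnegated in clause j,
# -1 if negated, and 0 if x_i does not appear in clause j.
def polarity_matrix(n, clauses):
    """clauses: list of clauses, each a list of signed 1-based literals,
    e.g. [1, -2, 3] encodes (x_1 OR NOT x_2 OR x_3)."""
    Q = [[0] * len(clauses) for _ in range(n)]
    for j, clause in enumerate(clauses):
        for lit in clause:
            i = abs(lit) - 1            # 0-based variable index
            Q[i][j] = 1 if lit > 0 else -1
    return Q

Q = polarity_matrix(3, [[1, -2, 3], [-1, 2, 3]])
# x_2 is negated in clause 0 and unnegated in clause 1:
print(Q[1])  # -> [-1, 1]
```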
Given a voltage assignment v ∈ [−1, +1]^n, we rewrite Eq. (7), the constraint of the j-th clause, as

C_j(v) = (1/2) min_{i : q_ij ≠ 0} (1 − q_ij v_i). (13)

The global constraint is the sum of the constraints of all clauses,

C(v) = Σ_{j=1}^m C_j(v). (14)

For any v_0 such that C(v_0) = 0 is satisfied, we call v_0 a solution vector.
Remark. v_0 is called a solution vector because if we take the corresponding Boolean vector x_0 by thresholding v_0 (converting v_0,i > 0 to true and v_0,i ≤ 0 to false), then x_0 must be a solution to the original 3-SAT problem. This is because the global constraint being zero, C(v_0) = 0, implies that every clause constraint must also be zero, C_j(v_0) = 0, which further implies that every clause is satisfied under the assignment x_0. Note that the converse is also true: if x_0 is a solution of the 3-SAT problem, then the vector v_0 with entries +1 for true and −1 for false variables satisfies C(v_0) = 0.
To ease the burden of notation, it is useful to define the following index,

σ_j(v) = argmin_{i : q_ij ≠ 0} (1 − q_ij v_i),

which can be simply interpreted as the index of the voltage whose assignment is closest to satisfaction among all voltages in clause j. Note that by this definition, we have q_{σ_j,j} = ±1, denoting the polarity of the Boolean variable whose assignment determines the value of C_j(v). This notation allows us to simplify the expression of the clause constraint as given in Eq. (13),

C_j(v) = (1/2)(1 − q_{σ_j,j} v_{σ_j}).

Note that if the goal is an effective numerical implementation of the memory dynamics solely as a means to find a solution, rather than relaxing into an equilibrium point, one can exploit the fact that if an assignment of v is such that C_j(v) < 1/2 for every clause, then the original 3-SAT problem is solved by thresholding v to generate x_0.

Proposition III.1. Given an assignment of the voltages v such that C_j(v) < 1/2 for every clause j, the Boolean assignment x_0 obtained by thresholding v is a solution to the original 3-SAT problem.
Remark. This means that once we have discovered an assignment of voltages such that the constraints of all clauses are less than 1 2 , we can simply threshold the voltages to obtain the corresponding Boolean variables for a solution of the original 3-SAT problem.
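The thresholding criterion of Proposition III.1 can be sketched as follows, assuming the clause constraint takes the form C_j(v) = (1/2) min over the literals of clause j of (1 − q_ij v_i); names and the toy formula are our own:

```python
# Sketch (not the paper's code) of the clause constraint and the
# thresholding step: if C_j(v) < 1/2 for every clause j, thresholding v
# yields a Boolean solution.
def clause_constraint(Q, v, j):
    return 0.5 * min(1 - Q[i][j] * v[i]
                     for i in range(len(v)) if Q[i][j] != 0)

def try_threshold(Q, v):
    """Return the thresholded Boolean assignment if C_j(v) < 1/2 for all j,
    and None otherwise (Proposition III.1)."""
    m = len(Q[0])
    if all(clause_constraint(Q, v, j) < 0.5 for j in range(m)):
        return [vi > 0 for vi in v]   # v_i > 0 -> TRUE, else FALSE
    return None

# Toy formula (x1 OR NOT x2 OR x3) AND (NOT x1 OR x2 OR x3):
Q = [[1, -1],
     [-1, 1],
     [1, 1]]
print(try_threshold(Q, [0.3, -0.8, 0.1]))  # -> [True, False, True]
```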
The global constraint defined in Eq. (14) is not everywhere differentiable with respect to the voltages due to the use of a minimum operation, and this causes some inconvenience in analyzing certain properties of the 3-SAT problem structure from the perspective of statistical mechanics (see Eq. (21)). We then construct an energy function that is continuous (and also smooth) in anticipation of such analysis.
Definition III.2 (Energy). Given a 3-SAT Boolean formula defined by an n × m polarity matrix Q, we define the energy of the j-th clause for any voltage assignment v ∈ [−1, +1]^n as

E_j(v) = Π_{i : q_ij ≠ 0} (1 − q_ij v_i)/2. (16)

The global energy is the sum of the energies for all clauses,

E(v) = Σ_{j=1}^m E_j(v). (17)

Remark. We can show in a similar fashion (see the remark of Definition III.1) that if the 3-SAT problem is satisfiable, then the global energy of a solution vector v_0 will also be zero, E(v_0) = 0, which is also its global minimum. The converse is also true. Therefore, the problems of minimizing the global constraint, C, and minimizing the global energy, E, are in fact equivalent.
The flow field of the memory dynamics for the voltages (see Section VI) contains two terms: one is similar to the gradient of E(v) (see Eq. (24)), which we name the gradient-like term, and the other is closely related to the clause function C_j(v) (see Eq. (25)), which we name the rigidity term. At certain hyperplanes, the gradient-like term is not differentiable and the rigidity term is discontinuous (see Section VI A). We develop the mathematical formalism for studying such irregular flow fields in Section V. 2 While it is possible for v_i = 0, resulting in sgn(0) = 0, this rare event does not prevent the remaining nonzero voltages from satisfying all clauses. In such a scenario, x_i can be set to TRUE or FALSE without affecting the satisfiability of the solution vector.

B. Gauging the 3-SAT Problem
If the original 3-SAT Boolean formula has a known solution, analysis can be simplified by converting the 3-SAT formula into an equivalent 3-SAT formula in such a way that the known solution of the original formula is now a solution to the gauged formula with an all-true assignment of the Boolean variables 3 . After the conversion, there will be a restriction on the possible clause types that can appear in the formula (no clause appears with all variables negated). This will allow for a natural description of the clause distribution control (CDC) class of planted instances, and greatly simplify the analysis of memory dynamics.
Definition III.3 (Gauge Fixing). Consider a satisfiable 3-SAT Boolean formula given by an n × m polarity matrix Q. Given any solution x_0 to the 3-SAT problem (represented as a vector with entries ±1), we gauge fix the polarity matrix Q with respect to x_0 via the mapping G_x0 defined element-wise as

(G_x0(Q))_ij = q_ij x_0,i.

We refer to Q̃ = G_x0(Q) as the gauged polarity matrix.
Remark. It can be easily shown that the formulas given by Q and Q̃ have the same structure. In particular, given some mapping of the polarity matrix G_x0, we can simultaneously map each Boolean state x to a new one as follows,

x̃ = x * x_0,

where * denotes component-wise multiplication. It is then obvious that the satisfaction state of each literal q_ij x_i is invariant under this mapping. A similar procedure applies for mapping the voltages as well,

ṽ = v * x_0.

Note that the performance of most SAT solvers (including the canonical Walk-SAT algorithm [23] and our memcomputing one as presented in Eqs. (7)-(12)) is invariant under gauge conjugation [? ]. Informally, this means that nothing is gained or lost in terms of the efficiency of optimization by gauging the problem before running the algorithm, as the behavior of a SAT solver at each time step does not change under a gauge mapping (see Section VI C). The choice to gauge fix a solution to +1 is purely for analytic convenience.
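The gauge mapping and the literal-level invariance noted in the remark can be sketched as follows (function names hypothetical):

```python
# Gauge fixing sketch: q'_ij = q_ij * s_i and v'_i = v_i * s_i, where
# s_i = +1/-1 encodes the known solution x0. Each literal value q_ij * v_i
# is invariant, since (q_ij * s_i) * (v_i * s_i) = q_ij * v_i * s_i^2.
def gauge_fix(Q, s):
    return [[q * si for q in row] for row, si in zip(Q, s)]

def gauge_voltages(v, s):
    return [vi * si for vi, si in zip(v, s)]

Q = [[1, -1], [-1, 1], [1, 1]]
s = [-1, 1, -1]
v = [0.5, -0.2, 0.9]
Qg, vg = gauge_fix(Q, s), gauge_voltages(v, s)
for j in range(2):
    for i in range(3):
        assert Qg[i][j] * vg[i] == Q[i][j] * v[i]   # literal values unchanged
```

Since every literal value is unchanged, every clause constraint and clause energy is unchanged as well, which is the content of Lemma III.3 below.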
An important property of a gauge fixed 3-SAT formula is that no clause can contain three negated Boolean variables.
Lemma III.2. Given a gauged polarity matrix Q̃ of a k-SAT problem [4], we have the following: ∀j, ∃i, q̃_ij = +1.
In other words, all clauses must contain at least one literal that is an unnegated variable.
Proof. We prove this by contradiction. Assume the negation of the Lemma: ∃j such that q̃_ij = −1 for every variable i appearing in clause j. Then, without loss of generality (WLOG), we can assume that the j-th clause is

(¬x_1 ∨ ¬x_2 ∨ ¬x_3).

Since Q̃ is a gauged polarity matrix, x_0 = +1 must be a solution. However, this assignment evaluates the above clause to false, so it cannot be a solution. We therefore have a contradiction.
Remark. It should be noted that the inclusion of clauses with all negations does not preclude the possibility of the formula being satisfiable, as solutions other than +1 may still exist.
Lastly, we point out that the clause constraint defined in Eq. (13) has the important property of being invariant under a gauge mapping.
Lemma III.3 (Gauge Invariance of Constraints). Given a satisfiable 3-SAT instance with some solution vector v_0, the clause constraints are unchanged by gauge fixing: C̃_j(ṽ) = C_j(v) for every clause j, where C̃_j is computed from Q̃ = G_x0(Q) and ṽ = v * x_0.

Remark. It directly follows that the global constraint must be gauge invariant as well. It can be shown in a similar fashion that the energy of each clause is also gauge invariant.

C. Planted Instances
Here, we consider a class of random 3-SAT instances generated with a planted solution to guarantee that an instance is satisfiable; however, the solution is planted in such a way as to be hard for local-search SAT solvers to find [18]. In particular, we consider instances whose polarity matrix Q satisfies Lemma III.2 up to a gauge mapping. In other words, when we construct Q, we cannot allow the appearance of columns whose nonzero elements are all −1. We formally describe a particular method of constructing such matrices in the following section.

Randomly Planted Formula
We first consider the general method of generating satisfiable formulas where every clause is formed independently by randomly including Boolean variables, with the clause type randomly sampled from some given distribution [17].
Definition III.4 (Planted Instance). We consider a random matrix Q generated by parameters {α_r, p_0, p_1, p_2} that satisfy the normalization condition

p_0 + 3p_1 + 3p_2 = 1. (18)

For each column j, we randomly select three distinct rows {i_j,1, i_j,2, i_j,3} uniformly. We then randomly assign the elements (q_{i_j,1 j}, q_{i_j,2 j}, q_{i_j,3 j}) a triple of polarities from the set of all ±1 triples containing at most two entries equal to −1, with sampling probability p_0 for the triple (+1, +1, +1), p_1 for each of the three triples with exactly one −1, and p_2 for each of the three triples with exactly two −1. We then assign all other elements in column j to zero.
Remark. To explain this construction in simple terms, we can consider a 3-SAT Boolean formula where each clause is independently generated through the inclusion of 3 randomly chosen Boolean variables out of the n total variables without replacement. The negations of the Boolean variables in the clause are randomly assigned such that there is a probability p 0 that all variables appear without negation; there is a probability 3p 1 that only one variable is negated (the prefactor of 3 is to account for the fact that there are 3 possible variables to negate); and there is a probability 3p 2 that two variables are negated (the prefactor of 3 arises similarly).
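The construction just described can be sketched as follows, assuming signed 1-based literals and the normalization p_0 + 3p_1 + 3p_2 = 1 (helper names our own):

```python
import random

# Sketch of a planted-instance clause generator: three distinct variables,
# and a negation pattern drawn with probability p0 (no negations),
# p1 (each single-negation pattern), or p2 (each double-negation pattern).
# The all-negated pattern never occurs, so x = all-TRUE satisfies the clause.
def planted_clause(n, p0, p1, p2, rng=random):
    i1, i2, i3 = rng.sample(range(n), 3)        # three distinct variables
    patterns = [(+1, +1, +1),
                (-1, +1, +1), (+1, -1, +1), (+1, +1, -1),
                (-1, -1, +1), (-1, +1, -1), (+1, -1, -1)]
    weights = [p0, p1, p1, p1, p2, p2, p2]
    signs = rng.choices(patterns, weights=weights)[0]
    return [s * (i + 1) for s, i in zip(signs, (i1, i2, i3))]
```

Generating m = α_r · n such clauses yields a formula that the all-true assignment satisfies by construction, since every clause retains at least one positive literal.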

Clause Distribution Control Instances
We now consider a class of hard instances [18] generated with the method described in Definition III.4. In particular, the generation method is restricted by the presence of new constraints on the parameters {α_r, p_0, p_1, p_2}, in addition to the normalization condition given in Eq. (18). This leaves only 4 − 2 = 2 degrees of freedom in the selection of the parameters, α_r and p_0.
Definition III.5 (Clause Distribution Control Instances). A Clause Distribution Control 4 (CDC) instance generated with the parameters α_r and p_0 is an instance whose polarity matrix Q is randomly generated by the method given in Definition III.4, with the remaining parameters constrained as

p_1 = (1 − 4p_0)/6,  p_2 = (1 + 2p_0)/6,  0 ≤ p_0 < 1/4. (19)
Remark. It has been claimed that this class of instances is difficult for local-search procedures [18]; however, it has been shown that the difficulty does not persist beyond some upper limit on α_r that depends on the problem size, n [36]. The results from the Walk-SAT algorithm confirm that the instances generated for our numerical simulations are difficult, in that they showcase exponential scalability.
The reason for enforcing the condition p_0 < 1/4 is twofold. First, p_0 is restricted so that the parameter p_1 is non-negative, as it represents a probability. Second, the instances created with p_0 = 1/4 are known to be solvable in polynomial time using a global algorithm [18]. It can be easily verified that the probabilities given in Eq. (19) satisfy the normalization condition (Eq. (18)) in addition to the following condition,

p_0 + p_1 = p_2. (20)

If the above constraint is satisfied, then it can be shown that a greedy local-search SAT solver initialized with a random assignment of variables will not be biased towards the planted solution [18]. In the language of statistical mechanics, we say that the instance is equivalent to an instance of a disordered diluted spin glass with couplings of up to three spins [37]. The Hamiltonian of this diluted spin glass can be written as

H(S) = Σ_j Π_{i : q_ij ≠ 0} (1 − q_ij S_i)/2, (21)

which is equivalent to the global energy as defined in Eq. (17). If Eq. (20) is enforced, then the average of the local field H_i over the disorder is zero for all spins, so there is typically no direct bias towards the planted state S = +1. An extended discussion of the CDC instances can be found in the literature on the statistical mechanics of Boolean satisfiability problems [17].
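As a quick consistency check, and under our assumed parametrization p_1 = (1 − 4p_0)/6 and p_2 = (1 + 2p_0)/6, both the normalization and the zero-field condition hold identically for any allowed p_0:

```python
# Consistency check of the (assumed) CDC parametrization: it satisfies both
# the normalization p0 + 3*p1 + 3*p2 = 1 and the zero-field condition
# p0 + p1 = p2 for every 0 <= p0 < 1/4.
def cdc_params(p0):
    assert 0 <= p0 < 0.25, "p0 must lie in [0, 1/4) so that p1 >= 0"
    p1 = (1 - 4 * p0) / 6
    p2 = (1 + 2 * p0) / 6
    return p1, p2

p1, p2 = cdc_params(0.08)
assert abs(0.08 + 3 * p1 + 3 * p2 - 1) < 1e-12   # normalization
assert abs(0.08 + p1 - p2) < 1e-12               # zero mean local field
```

Note that at the excluded endpoint p_0 = 1/4 one has p_1 = 0, so every clause carries an even number of negations, which is the polynomially solvable case mentioned above.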

Solution Backbone and Cluster
As briefly addressed in the remark of Lemma III.2, planting the +1 solution in an instance does not forbid the existence of additional solutions. In fact, multiple solutions may exist; however, their locations in phase space with respect to one another, and the similarity of the solutions, are generally what determine the difficulty of an instance.
In most cases, some solutions will overlap non-trivially, meaning that their assignments will coincide for a certain number of variables. For instances admitting overlapping solutions, there are two concepts (occurring non-exclusively) important for analytic studies.
For the first concept, given a solution to an instance, we can define a solution cluster as the subset of all solutions that can be reached from the given solution via a sequence of single spin flips (Boolean variable negations) [8]. Note that after each flip the assignment must remain a solution to be considered part of the cluster. While the clustering of solutions into one big cluster may intuitively seem to make an instance more difficult, knowing only that one solution cluster exists is not enough information to categorize an instance as more difficult than others. The second concept gives additional information about the difficulty. Given the set of all solutions, we define the backbone to be the set of variables that appear with only one parity in all solutions [17]. In other words, for a SAT solver to find a solution, it is necessary for the backbone to be assigned correctly 5 . In general, the emergence of a backbone in a 3-SAT instance results in variables that must be assigned to a particular value to find any solution (an inherent difficulty); however, there can still exist a local field that can guide a greedy local-search SAT solver to the solution.
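The backbone just defined can be computed directly from an exhaustive solution set; a minimal sketch (only practical for toy instances):

```python
# Sketch: given the full set of solutions (as +/-1 assignments), the backbone
# is the set of variables taking the same value in every solution.
def backbone(solutions):
    n = len(solutions[0])
    return [i for i in range(n)
            if all(s[i] == solutions[0][i] for s in solutions)]

sols = [[+1, +1, -1], [+1, -1, -1]]   # toy solution set
print(backbone(sols))  # -> [0, 2]: the first and third variables are frozen
```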
To understand why the CDC instances (planted solution) are difficult, it helps to describe the solution-cluster distribution in uniform random 3-SAT (no guaranteed solution). Using the replica symmetry approximation [37-39], a variational approach accounting for replica-symmetry breaking [40], and the cavity method [5,41] from statistical mechanics, it was shown that the 3-SAT problem undergoes phase transitions as the clause density is increased. The approximate backbone size for CDC instances ranges from 0.72n at p_0 ≈ 0.077 to 0.94n at p_0 = 0.25 [18].
Therefore, with all factors considered above, p 0 serves as a measure of difficulty for the CDC instances.
In this material, we focus on the study of the dynamical properties of our DMM by defining solution planes on hyperfaces of the voltage hypercube [8]. When the solution vector is on a hyperface that corresponds to a solution plane (see Section VI D), the voltage dynamics are near a branch of a solution cluster, effectively solving the CDC instance.
To further associate the concepts, when a solution is found on a vertex of the hypercube, the solution cluster can be traversed by traveling along the hyperedges of the hypercube that connect to other solution vertices.

IV. LIPSCHITZ CONTINUITY
Before we present the equations governing the dynamics of our memcomputing solver in Section VI, it is necessary to first introduce a few formal mathematical arguments that will help establish the existence and uniqueness of the solution trajectory under an ordinary differential equation (ODE). For instance, the requirement for the existence and uniqueness of a local solution to a first order autonomous ODE is the Lipschitz continuity of the flow field [42]. We begin by formally defining Lipschitz continuity.
Definition IV.1 (Lipschitz Continuity). Let X and Y be two metric spaces. A function f : X → Y is Lipschitz continuous if there exists a constant K ≥ 0 such that

d_Y(f(x_1), f(x_2)) ≤ K d_X(x_1, x_2), ∀x_1, x_2 ∈ X,

where d_X and d_Y denote the metrics on X and Y, respectively. The smallest such K is called the Lipschitz constant of f.
Remark. This definition can be easily specialized to a vector field V : R n → R n .
Theorem IV.1 (Picard-Lindelöf theorem). Given a Lipschitz continuous vector field V : R^n → R^n, the classical solution x(x_0, t) to the first-order autonomous ODE, ẋ(t) = V(x), exists and is unique for all t ∈ R.
Our dynamics are governed by a high-dimensional vector flow field, F : R^n → R^n. To study the Lipschitz continuity of the vector field F, it suffices to study the Lipschitz continuity of the field components in the factor spaces, by the following lemma.
Lemma IV.2. Given a metric space X and the product metric space Y = R^n equipped with a p-product metric, a function f = (f_1, ..., f_n) : X → Y is Lipschitz continuous if and only if every component f_i : X → R is Lipschitz continuous.

Proof. We first assume that f_i is Lipschitz continuous ∀i, with Lipschitz constant K_i. Then, ∀x_1, x_2 ∈ X,

d_Y(f(x_1), f(x_2)) = (Σ_i |f_i(x_1) − f_i(x_2)|^p)^{1/p} ≤ (Σ_i K_i^p d_X(x_1, x_2)^p)^{1/p} ≤ max_i(K_i) n^{1/p} d_X(x_1, x_2).

In other words, a Lipschitz constant for f is simply max_i(K_i) n^{1/p}, so f is Lipschitz continuous. Now, assume that f_{i'} is not Lipschitz continuous for some i'. Then, for any K > 0, ∃x_1, x_2 ∈ X such that |f_{i'}(x_1) − f_{i'}(x_2)| > K d_X(x_1, x_2). We then have

d_Y(f(x_1), f(x_2)) ≥ |f_{i'}(x_1) − f_{i'}(x_2)| > K d_X(x_1, x_2),

meaning that f is also not Lipschitz continuous.
For our work, we are also interested in the Lipschitz continuity of a vector field that is projected onto another vector field. In particular, in Definition V.7, we show how a vector field can be projected onto a regular surface. In the following lemma, we give the condition for this "projected" vector field to be Lipschitz continuous. From here on, we shall use the notation ⟨a, b⟩ to denote the inner product of vectors a and b.
Lemma IV.3 (Continuity of Projection). Let proj_v : R^n → R^n be the projection mapping defined as

proj_v(u) = (⟨u, v⟩/⟨v, v⟩) v.

Let X be a metric space, let f_1 : X → R^n be a Lipschitz continuous function bounded from below in norm by some m > 0, and let f_2 : X → R^n be a Lipschitz continuous function bounded from above by some M > 0. Then the function f(x) = proj_{f_1(x)}(f_2(x)) is Lipschitz continuous.

Proof. Let K_1, K_2 > 0 be Lipschitz constants of f_1 and f_2, respectively. For the sake of simplicity, we denote f̂_1 = f_1/||f_1||, so that f = ⟨f_2, f̂_1⟩ f̂_1, and we split the difference f(x_1) − f(x_2) into a term involving the change of f_2 and a term involving the change of direction of f_1 (Eq. (22)). The first term is bounded in norm by K_2 d(x_1, x_2). To bound the second term, it is convenient to denote φ = arccos⟨f̂_1(x_1), f̂_1(x_2)⟩; it can be easily shown that φ ≤ 2K_1 d(x_1, x_2)/m. We then see that the second term in the last line of Eq. (22) is bounded in norm by a constant multiple of M φ. Therefore, f is Lipschitz continuous.

Remark.
We use this lemma to study the Lipschitz continuity of a vector flow field projected onto some regular boundary, which allows for the existence of a solution at the boundary that follows the projected flow field almost everywhere. We formalize this discussion in Section V B.
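A minimal numerical sketch of the projection map assumed in Lemma IV.3 and of the boundary decomposition it enables (names our own):

```python
# Sketch of the projection map: proj_v(u) = (<u, v> / <v, v>) * v.
def proj(u, v):
    c = sum(ui * vi for ui, vi in zip(u, v)) / sum(vi * vi for vi in v)
    return [c * vi for vi in v]

# Decomposition of a field value V at a boundary point with unit normal n:
# the normal component is proj(V, n), the tangential part is the remainder.
V, n = [1.0, 1.0], [1.0, 0.0]
normal = proj(V, n)
tangential = [a - b for a, b in zip(V, normal)]
print(normal, tangential)  # -> [1.0, 0.0] [0.0, 1.0]
```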
In Section VIII, techniques of linear algebra are used extensively to relate the dynamics of the voltage flow field to the trajectory of the auxiliary variable, so to conclude this Section, we provide the following useful lemma in anticipation.
V. CARATHEODORY SOLUTIONS

Definition V.1 (Caratheodory Solution). Given a vector field V : R^n → R^n, an absolutely continuous function x(t) is a Caratheodory solution to the ODE ẋ = V(x) with initial value x_0 if

x(t) = x_0 + ∫_0^t V(x(s)) ds,

where ∫ denotes the Lebesgue integral. Remark.
An equivalent definition states that the Caratheodory solution follows the vector field everywhere along the solution trajectory except for a subset of measure zero [43].
We construct the Caratheodory solution in a way such that the analytic trajectory is closely mimicked by the dynamics governed by numerical simulations. In particular, the memory dynamics are governed by a discontinuous flow field, where occasionally the discretized trajectories will oscillate at certain hyper-planes of discontinuities until they "escape" the planes when the fields become sufficiently regular to allow so. The analytic construction of the Caratheodory solution is given such that the oscillatory dynamics at these hyperplanes are accounted for in a similar fashion. An extended discussion of how the analytic trajectory is simulated effectively by forward Euler is given in Section VI A.
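A toy illustration (not the DMM equations) of the oscillatory behavior described above, using forward Euler on a one-dimensional field whose two branches collide at a plane of discontinuity:

```python
# Forward Euler on the discontinuous field f(x) = -1 for x > 0, +1 otherwise.
# Both branches point toward x = 0, so the Caratheodory solution slides along
# the discontinuity, while the discretized trajectory chatters around it
# within a band of width set by the time step dt.
def euler_step(x, dt):
    return x + dt * (-1.0 if x > 0 else 1.0)

x, dt = 0.3, 0.1
traj = [x]
for _ in range(10):
    x = euler_step(x, dt)
    traj.append(x)
# After reaching the discontinuity, the iterates stay within about [-dt, dt]:
print(traj[-3:])
```

This mirrors, in the simplest possible setting, how the numerical trajectory of the memory dynamics "escapes" a hyperplane only once the surrounding field allows it.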

A. Patching Vector Fields
Before we discuss the construction of Caratheodory solutions, we first formally define the class of discontinuous vector fields of interest, referred to as patchy vector fields. As the name suggests, such a vector field is the result of patching together two different vector fields in a way such that a Caratheodory solution is admitted. For ease of analysis, we first assume some regularity condition on the boundary at which the fields are patched together. For any vector field with domain ∂Ω, there is a unique "projection" of the field onto the boundary, such that the projection is in the tangent bundle of ∂Ω.
For v, w ∈ R^n, we denote the parallel and orthogonal components of v with respect to w as

v_{∥w} = (⟨v, w⟩/⟨w, w⟩) w,  v_{⊥w} = v − v_{∥w}.

Definition V.4 (Decomposition at Boundary). Let Ω ⊂ R^n be a regular domain, and let n(x) be the unit normal vector of Ω at x ∈ ∂Ω. Let V : R^n → R^n be some vector field. We then denote the decomposition of the vector field at the boundary, V_{∂Ω,∥} : ∂Ω → R^n and V_{∂Ω,⊥} : ∂Ω → R^n, as

V_{∂Ω,⊥}(x) = V(x)_{∥n(x)},  V_{∂Ω,∥}(x) = V(x)_{⊥n(x)} = V(x) − V_{∂Ω,⊥}(x).

Lemma V.1. If V : ∂Ω → R^n is bounded above and Lipschitz continuous, then V_{∂Ω,∥} and V_{∂Ω,⊥} are Lipschitz continuous as well.
Proof. Note that since V is bounded, ∃M such that ||V(x)|| ≤ M, ∀x ∈ ∂Ω. Furthermore, n(x) is bounded from below, as ||n(x)|| = 1 by definition of a unit vector. It can also be easily shown that n(x) is Lipschitz continuous due to the regularity of Ω. Therefore, using Lemma IV.3, we see that V_{∂Ω,⊥}(x) is Lipschitz continuous, which implies that V_{∂Ω,∥} = V − V_{∂Ω,⊥} is Lipschitz continuous as well.

Since a projected vector field is Lipschitz continuous, it admits a classical solution on the boundary (see Theorem IV.1).
However, at some point the trajectory has to escape the boundary once the field outside the boundary admits it. This escape condition depends on the direction of the field relative to the curvature of the boundary (see Proposition V.6).
It is difficult to give a general definition of curvature for high dimensional hyper-surfaces. However, the definition of a directional curvature is relatively straightforward.
Definition V.5 (Directional Curvature). Let Ω ⊂ R^n be a regular domain. Given a point on the boundary x_0 ∈ ∂Ω and a vector field V : R^n → R^n, let γ(t) ∈ ∂Ω be a trajectory such that γ(0) = x_0 and γ̇(t) = V_{∂Ω,∥}(γ(t)). We then define the m-th order directional curvature at point x_0 with respect to V as

κ_V^(m)(x_0) = (d^m/dt^m)⟨V(γ(t)), n(γ(t))⟩ |_{t=0},

for m ≥ 0. For notational compactness, we define κ_V(x_0) as the lowest-order curvature that does not vanish.

Remark.
This definition of the curvature informs the patching operation of two vector fields at the boundary.
Definition V.6. Let Ω ⊂ R n be a regular domain, and V : R n → R n be some vector field. For x ∈ ∂Ω, we define the function ψ ∂Ω,V : ∂Ω → {0, 1} as follows Similarly, we define the function φ ∂Ω,V : ∂Ω → {0, 1} as follows Remark. Note that the definition of ψ and φ is symmetric with respect to the exchange of the interior and exterior of the domain Ω.
Definition V.7 (Patching). Let Ω ⊂ R n be a smooth open domain, and V, W : R n → R n be two distinct vector fields.
We define the patching of the two vector fields with respect to domain Ω as Remark. Note that the vector field P Ω (V, W ) is piecewise Lipschitz continuous, with its discontinuity being at the boundary ∂Ω. We can refer to V as the interior vector field and W as the exterior vector field. Visually, we can view the patched field at the boundary ∂Ω as some form of "projection" of the interior field V and exterior field W .

B. Solution in the Boundary
It is clear that the patched field P_Ω(V, W)(p) is Lipschitz continuous in Ω and ext(Ω) separately. This implies that a classical solution to the ODE ẋ = P_Ω(V, W)(x) with initial value x_0 ∈ Ω exists up to the boundary ∂Ω (and similarly for x_0 ∈ ext(Ω)). Naturally, we also have to discuss the existence of a classical solution with x_0 ∈ ∂Ω. To do so, we first make a preliminary definition that specifies two important subsets of ∂Ω, relative to which we attach the start- and end-points of the solution segments.
Definition V.8. Given a regular domain Ω ⊂ R^n and two vector fields V, W : R^n → R^n, we denote

D_1 = {x ∈ ∂Ω : ψ_∂Ω,V(x) = 0},  D_2 = {x ∈ ∂Ω : φ_∂Ω,W(x) = 0},  D = D_1 ∩ D_2.

Remark. Visually, D_1 describes a region of the boundary where the interior field points outward, and D_2 describes a region of the boundary where the exterior field points inward. This gives rise to an irregular region D = D_1 ∩ D_2 where the two fields "collide" at the boundary, which generates a Lipschitz continuous field that admits a classical solution in the boundary.

Proof. From the definitions of κ and ψ (see Definitions V.5 and V.6), we can express ∂Ω/D_1 as an intersection of countably many sets. We first assume that ∂Ω/D_1 is non-empty; otherwise, D_1 = ∂Ω = D is clearly open. Note from Lemma V.1 that V_{∂Ω,⊥} is Lipschitz continuous in ∂Ω, so κ^(0) is a continuous mapping from ∂Ω to R. Furthermore, ∂Ω is compact, so its image must also be compact, with the infimum denoted as −C = inf_{x∈∂Ω} κ^(0)(x).

To show that the field P_Ω(V, W) is Lipschitz continuous in D, we first note that ψ_∂Ω,V(x) = φ_∂Ω,W(x) = 0, ∀x ∈ D, which follows directly from the definition of D and Definition V.6. Then, from Definition V.7, we see that the patched field in D reduces to the projected field U.

Proof. We here provide a brief proof sketch. We begin by treating ∂Ω as an (n − 1)-dimensional differentiable manifold (equipped with the pullback of the Euclidean metric by the natural embedding ∂Ω → R^n); then, U : ∂Ω → T∂Ω is clearly Lipschitz continuous on the manifold. This implies that there is a unique classical solution to the ODE ẋ = U(x) on the manifold ∂Ω (under some suitable connection).

Proposition V.3 (Containment in Boundary). Given a regular domain Ω ⊂ R^n and two bounded Lipschitz continuous vector fields V, W : R^n → R^n, denote x(t, x_0) as the classical solution to the ODE ẋ = U(x) with x_0 ∈ D, where U is defined in Corollary V.2.1. If we restrict the solution to t ∈ [0, t_0), where t_0 = inf{t ≥ 0 | x(t, x_0) ∈ ∂D}, with ∂D being the boundary of D with respect to ∂Ω, then x(t, x_0) is a classical solution to the ODE ẋ = P(V, W)(x).
Proof. From Definition V.8, the trajectory of U remains in D up to t_0, where the patched field P(V, W) coincides with U. To conclude, we have shown that the patched field admits a classical solution in D = D_1 ∩ D_2 at least up to some positive time t_0.

C. Solution in the Domain
In the previous Section, we have shown how a solution segment can be constructed in the boundary ∂Ω. In this subsection, we focus on the construction of a solution to the ODE ẋ(t) = P(V, W)(x) in the interior Ω and the exterior ext(Ω).
WLOG, we can assume that the initial point is in the interior (see the remark of Definition V.6).
There are three possibilities for the evolution of the trajectory. First, the trajectory never leaves the interior Ω. Second, the trajectory escapes to the exterior ext(Ω), intersecting the boundary ∂Ω as required by the Jordan-Brouwer separation theorem [44]. Finally, the trajectory hits the boundary ∂Ω and "returns" back to the interior Ω.
Clearly, in the first case, the trajectory is simply the classical solution to the ODE,ẋ = V (x), and in the last two non-trivial cases, the trajectory reaches the boundary ∂Ω at some point. We first begin by noting that if the trajectory were to reach the boundary, it must enter ∂Ω through its subset D 1 .
Proposition V.4. Given a regular domain Ω ⊂ R^n and two bounded Lipschitz continuous vector fields V, W : R^n → R^n, let x(t, x_0) be the solution to the ODE ẋ = V(x) with initial value x_0 ∈ Ω. If the solution intersects the boundary ∂Ω at some time t', then x(t', x_0) ∈ D_1.

Proof. We provide here a sketch of the proof. Note that the condition ⟨V(x), n(x)⟩ ≥ 0, required at the point of intersection, implies x ∈ D_1. This condition follows from the facts that the trajectory x(t) intersects the boundary ∂Ω from the interior, that x(t) is continuously differentiable, and that the boundary ∂Ω is smooth.
At this point, we have shown how a trajectory initialized in the interior Ω reaches the boundary ∂Ω. In order for the trajectory to be extended, we also have to consider how a solution exits the boundary. In order to guarantee that the trajectory does not violate the patched vector field in a non-zero measure set, we have to carefully specify the direction at which the trajectory exits the boundary to avoid "collision" with the field. We first formally define the notion of existence for a Caratheodory solution in a manner that suits our purpose.
Definition V.9. Given a regular domain Ω ⊂ R^n and a bounded Lipschitz continuous vector field V : R^n → R^n, the solution x(t, x_0) to the ODE ẋ = V(x) is said to exist in Ω up to t_0 if ∃t_0 > 0 such that x(t) ∈ Ω for all t ∈ (0, t_0).
Lemma V.5. Given a regular domain Ω ⊂ R^n, a bounded Lipschitz continuous vector field V : R^n → R^n, and a solution to the ODE ẋ = V(x) initialized at x_0, the following statements are true: • If x_0 ∈ Ω, then a solution always exists in Ω. • If x_0 ∈ ∂Ω, then a solution exists in Ω only if the field does not point outward at x_0, i.e., only if κ_V(x_0) ≤ 0.
Proof. The proof of the first statement is simple. We first let x(t) be a classical solution to the ODE ẋ = V(x) initialized at x_0 ∈ Ω. Note that since Ω is open, it is possible to find an open ball in Ω, B_δ(x_0) ⊂ Ω, centered at x_0 with radius δ. Since x(t) is continuous with respect to t, it is possible to find a t' > 0 such that x(t) ∈ B_δ(x_0) for all t ∈ (0, t'). Therefore, we see that x(t) exists in Ω.
The proof of the second statement is more involved, and we here only provide a proof sketch. We first let x(t) be a classical solution to ẋ = V(x) initialized at x_0 ∈ ∂Ω. We can then express a small neighborhood of x_0 as a graph of some analytic function f : R^{n−1} → R. We can then "project" the trajectory x(t) onto the boundary, and denote its projection as x'(t). We can time-evolve the trajectory and its projection simultaneously forward infinitesimally by δt, and find the displacement between the solution trajectory and its projection along the direction of the normal, whose sign is controlled to leading order by κ_V(x_0).

Remark.
The visual interpretation of this lemma is rather straightforward. It essentially states that a trajectory initialized at the boundary of a regular domain can enter into the interior only if the field points inward at that point.
If the trajectory is initialized in D, then the trajectory clearly must remain in the boundary, as discussed in the remark of Definition V.8. Therefore, the trajectory can exit the boundary only at points where κ_V = 0 or κ_W = 0, in which case a solution exists in the interior and the exterior, respectively. Proof. The proof follows directly from Definition V.7 and Lemma V.5.

Remark.
To interpret this proposition visually, we imagine a point in the boundary such that either the interior field or the exterior field points away from the boundary. If the interior field points away from the boundary, then the trajectory should enter Ω from ∂Ω, and the trajectory will "follow" the field initially, as both the trajectory and the field vary continuously.

Definition V.10. Given a set X and two functions x_1 : [0, t_1] → X and x_2 : [t_1, t_2] → X with x_1(t_1) = x_2(t_1), we define their concatenation as the function equal to x_1 on [0, t_1] and to x_2 on (t_1, t_2].

Proof. This is obvious if we note that the procedure of attaching the two solution segments can result in a violation of the ODE only at the single point x_1(t_1).

Theorem V.8 (Construction of Maximal Caratheodory Solution). Given an open regular domain Ω ⊂ R^n and two bounded Lipschitz continuous vector fields V, W : R^n → R^n, it is possible to construct a unique Caratheodory solution to the ODE ẋ = P(V, W)(x), where P is the patching operation defined in Definition V.7.
Proof. WLOG, we assume that the initial value of the ODE is x_0 ∈ Ω. Let x_1(t) be the classical solution to ẋ = V(x) existing up to t_1 in Ω. If t_1 = +∞, then x_1(t) is trivially a Caratheodory solution as well. We then consider the case where t_1 is finite, meaning that the trajectory enters the boundary ∂Ω at some point p = x_1(t_1) ∈ D_1 (see Proposition V.4). Note that κ_V(p) ≥ 0, so we are left with the following cases: • If κ_W(p) ≥ 0, then we extend x_1(t) with the maximal classical solution to the ODE ẋ = W(x) in ext(Ω) initialized at p. The extended solution violates the ODE only at p.
• If κ W (p) < 0 and κ V (p) = 0, then we extend x 1 (t) with the maximal classical solution to the ODEẋ = V (x) in Ω initialized at p. The extended solution violates the ODE only at p.
• If κ W (p) < 0 and κ V (p) > 0, then p ∈ D, and let x 2 (t) be the maximal classical solution to the ODEẋ = U (x) existing in D up to t 2 , where U (x) is defined in corollary V.2.1. If t 2 = +∞, then we are done; if t 2 is finite, then we let q = x 2 (t 2 ) ∈ ∂D, implying that κ V (q) = 0 or κ W (q) = 0, reducing to the previous two cases. The extended solution violates the ODE only at p and q.
We iterate this procedure every time the trajectory enters the boundary, with the treatment of an entrance from the exterior ext(Ω) mirroring that of an entrance from the interior Ω. This gives us the maximal Caratheodory solution as we take t → ∞. It is clear that the solution can only be extended countably many times, and each segment is classical in nature (see Proposition V.6), meaning that the ODE is violated at only countably many points; the maximal solution is therefore Caratheodory by Definition V.1.
Remark. Visually, for a trajectory initialized in Ω that enters the boundary ∂Ω, we have three scenarios. In the first scenario, the trajectory is guided by the interior field in a way such that it barely "scrapes" the boundary and returns to the interior. In the second scenario, the trajectory "crosses" the boundary and continues its path into the exterior, if the exterior field at the intersection points outward. Finally, if the trajectory enters the boundary at a point where the interior and exterior fields both point inward, then the trajectory "tunnels" in the boundary to avoid the two fields, and continues to do so until it reaches a point where one of the two fields begins pointing outward, at which point the trajectory begins to follow that field. If neither field ever points outward, then the trajectory remains in the boundary forever.
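The three boundary scenarios can be summarized with a small dispatch function. The sketch below is illustrative only: the function names, and the use of the first-order proxy κ ≈ ⟨field, n⟩ for the directional curvature, are our own assumptions rather than the paper's definitions.

```python
import numpy as np

def kappa(field, n_hat):
    """First-order proxy for the directional curvature: the outward
    component of the field at a boundary point (our simplification)."""
    return float(np.dot(field, n_hat))

def classify_entry(V_p, W_p, n_hat):
    """Classify a boundary entry from the interior, per the three cases:
    'cross' -- exterior field points outward: follow W into ext(Omega);
    'graze' -- interior field is tangent: return to the interior field V;
    'slide' -- both fields point into the boundary: follow the projected
               field until one field begins pointing outward."""
    k_V, k_W = kappa(V_p, n_hat), kappa(W_p, n_hat)
    if k_W >= 0:
        return "cross"
    if k_V == 0:
        return "graze"
    return "slide"  # k_W < 0 and k_V > 0

def sliding_field(V_p, n_hat):
    """Tangential projection used while sliding inside the boundary
    (our stand-in for the projected field U of Corollary V.2.1)."""
    return V_p - np.dot(V_p, n_hat) * n_hat
```

For instance, with boundary normal n = (1, 0), an interior field (1, 0) and an exterior field (−1, 0) both press into the boundary, so the trajectory slides along the projected tangential field.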
This concludes the section, which establishes the mathematical formalism necessary for discussing the memcomputing dynamics autonomously guided by such patchy vector fields (see Eq. (23)).

VI. MEMORY DYNAMICS
To find an assignment v that minimizes the constraint in Eq. (14), we can time-evolve the voltages autonomously with the ODEs of Eqs. (23) (an equivalent way of writing Eqs. (7)-(12)). From now on, we shall refer to this particular ODE as the memory dynamics, where v ∈ [−1, +1]^n are voltages corresponding to the Boolean variables of the original 3-SAT problem with n variables and m clauses. Furthermore, we refer to x_s ∈ [0, 1]^m as the short-term memory and x_l ∈ [1, x_max]^m as the long-term memory, where x_max > 1 is some upper bound to the slow variable dynamics 8. The parameters {α, β, γ, δ, ζ, ε} are positive constants empirically tuned to provide the regularity and convergence of the dynamics on a sufficiently fast time scale (see Sec. II A). We will use the non-subscript symbol x = {v, x_s, x_l} ∈ R^{n+2m} to denote the collection of all dynamic variables, allowing us to write the ODE as ẋ = F(x), where F is the flow field corresponding to the RHS of Eqs. (23).
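Since the display equations of Eqs. (23) are not reproduced in this excerpt, the sketch below implements one standard form of digital-memcomputing dynamics consistent with the gradient-like, rigidity, and memory terms described in this section. The parameter values, the helper name `memory_rhs`, and the array layout (`idx`, `pol`) are our own illustrative choices, not the paper's.

```python
import numpy as np

def memory_rhs(v, xs, xl, idx, pol,
               alpha=5.0, beta=20.0, gamma=0.25, delta=0.05,
               zeta=0.1, eps=1e-3):
    """RHS of memory dynamics for a 3-SAT instance (illustrative sketch).
    idx: (m, 3) int array of variable indices per clause.
    pol: (m, 3) array of polarities q_ij = +/-1.
    Returns (dv, dxs, dxl, C) with C the per-clause constraints."""
    n, m = len(v), len(xs)
    dv = np.zeros(n)
    C = np.zeros(m)
    for j in range(m):
        lit = 1.0 - pol[j] * v[idx[j]]   # "unsatisfaction" of each literal
        k = int(np.argmin(lit))          # most satisfied literal
        C[j] = 0.5 * lit[k]              # clause constraint, cf. Eq. (13)
        for a in range(3):
            i = idx[j, a]
            others = [b for b in range(3) if b != a]
            G = 0.5 * pol[j, a] * min(lit[b] for b in others)  # gradient-like
            R = 0.5 * (pol[j, a] - v[i]) if a == k else 0.0    # rigidity
            dv[i] += xl[j] * xs[j] * G + (1 + zeta * xl[j]) * (1 - xs[j]) * R
    dxs = beta * (xs + eps) * (C - gamma)  # short-term memory
    dxl = alpha * (C - delta)              # long-term memory
    return dv, dxs, dxl, C
```

At a satisfying assignment every C_j vanishes, so the voltage flow is zero and both memory derivatives become negative, consistent with the fixed-point discussion of Section VI D.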
For the sake of a more compact expression for the ODEs, it is convenient to borrow the notation of Eq. (11) and denote G_ij(v) = (1/2) q_ij min_{k≠i, q_kj≠0} (1 − q_kj v_k) as the gradient-like term, as it approximately follows the directional gradient of the energy of the j-th clause along the direction of v_i (see Eq. (16)). Note that the actual directional gradient is similar to Eq. (24), with the only exception being the min operation replaced with a product. The magnitude of the gradient-like term for a voltage in the j-th constraint is related to the values of the other two voltages in the constraint. Similarly, we borrow the notation of Eq. (12) and denote R_ij(v) = (1/2)(q_ij − v_i) δ_iσj as the rigidity term. Its magnitude is equivalent to the clause constraint C_j defined in Eq. (13) if v_i is the voltage that defines C_j, and zero otherwise.
We can then succinctly write the voltage dynamics as v̇ = G(x_l * x_s) + R((1 + ζ x_l) * (1 − x_s)), where G and R are treated as n × m matrices dependent on v, the operator * denotes element-wise multiplication, and x_s and x_l are treated as column vectors for the sake of the matrix operations. In this form, we can clearly see that the gradient-like and rigidity dynamics are weighted clause-wise by the memory variables. The presence of dynamic memory is a central feature of our dynamics.
For certain analyses of dynamical properties, it is sufficient and more convenient to focus on the analytic properties of the simplified dynamics of Eq. (27), in which the voltage flow is linear in a single memory variable. All the dynamical properties derived in this work under the assumption of these simplified dynamics can be easily generalized to the full dynamics if we assume sufficiently general forms for G and C (see Sections VIII and IX).

A. Discontinuous Hyperplanes
We first make the important observation that the gradient-like term G is not differentiable everywhere, and the rigidity term R is not Lipschitz continuous. These irregular points form hyperplanes, generated by the minimum operation, in the voltage space. In this section, we construct the hyperplanes that contain all the points of discontinuity of the rigidity term. These hyperplanes are generated by the binary values of δ_iσj (which contains implicitly a minimum operation), and they form (n − 1)-dimensional hyperplanes in the voltage space R^n. A similar construction also applies to the gradient-like term 9.
Proposition VI.1 (Hyperplanes). There exists a union of countably many (n − 1)-dimensional hyperplanes in R n such that it contains all the points where the field F is discontinuous.
Proof. To lessen the burden of notation, we let N = [[1, n]] and M = [[1, m]]. We first recall from Eq. (15) that the field can only be discontinuous at a point where some j ∈ M can be chosen such that the argmin operation is degenerate, which is equivalent to the condition that two distinct literals of clause j attain the minimum simultaneously, i.e., q_i1j v_i1 = q_i2j v_i2 for some i_1 ≠ i_2 (condition (28)). We denote the set of all points x that satisfy this condition as ∂Ω.
For any two distinct indices of the Boolean variables, i.e., ∀i_1, i_2 ∈ N with i_1 ≠ i_2, we can define a positive hyperplane H_P and a negative hyperplane H_N as follows: H_P(i_1, i_2) = {v ∈ R^n | v_i1 = v_i2} and H_N(i_1, i_2) = {v ∈ R^n | v_i1 = −v_i2}. Note that both are (n − 1)-dimensional. If we recall that q_ij = ±1 for all nonzero elements of the polarity matrix Q, then it can be shown that any voltage assignment v satisfying condition (28) must lie in one of these hyperplanes.
Therefore, the union of all such hyperplanes must contain ∂Ω. Note that there are n(n − 1)/2 positive and n(n − 1)/2 negative hyperplanes, so there are n(n − 1) hyperplanes in total, which is certainly a countable number. This proves the proposition.
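As a quick sanity check of the counting and the membership condition, assuming the reconstructed hyperplane equations v_i1 = ±v_i2 (our reading of the degeneracy condition):

```python
import numpy as np
from itertools import combinations

def num_hyperplanes(n):
    """n(n-1)/2 positive plus n(n-1)/2 negative hyperplanes in total."""
    return n * (n - 1)

def on_hyperplane(v, tol=1e-12):
    """True if v lies on some H_P (v_i1 = v_i2) or H_N (v_i1 = -v_i2),
    i.e. on a candidate discontinuity of the rigidity term."""
    for i1, i2 in combinations(range(len(v)), 2):
        if abs(v[i1] - v[i2]) < tol or abs(v[i1] + v[i2]) < tol:
            return True
    return False
```

A generic voltage vector avoids all n(n − 1) hyperplanes, consistent with the measure-zero remark that follows.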
Remark. An immediate consequence of this proposition is that the rigidity term is only discontinuous at a measure zero subset of the phase space, as all the hyperplanes of discontinuities are of measure zero, and there are only countably many of them. Therefore, the rigidity term is smooth almost everywhere. Note that these hyperplanes also contain the points at which the gradient-like term is non-differentiable, meaning that the gradient-like term is also smooth almost everywhere.
As the field is continuous almost everywhere, it clearly admits a Caratheodory solution for any initial value, if the fields are patched appropriately at the hyperplanes according to the procedure in Definition V.7. Note that the phase space of the dynamics is an (n + 2m)-dimensional hypercube (see Section VI B), which is partitioned into disjoint subsets by the hyperplanes. A caveat here is that the domains are only almost regular, as the intersections of the hyperplanes generate regions of non-smoothness. However, these intersections have zero measure relative to the hyperplanes, so it is unlikely for a trajectory to encounter them. For the sake of analytic completeness, even if we assume that a trajectory were to encounter an intersection of planes, this does not invalidate our method of constructing a Caratheodory solution, as there is still a unique projection of vector fields on these intersecting regions.

9 Finding these hyperplanes for the gradient-like term is not strictly necessary, as the gradient-like term is already Lipschitz continuous. The hyperplanes only contain points of non-differentiability, which do not affect the existence and uniqueness of the dynamical trajectory (see Section V).
As for using the directional curvature as the exit condition, the zeroth order directional curvature can be defined as ±∞ accordingly at these regions, and the exit protocol as given in proposition V.6 remains unchanged.

B. Compact Positive Invariant Set
To respect the Boolean structure of the original 3-SAT problem, the dynamics given in Eqs. (23) must be bounded explicitly. First of all, we choose to bound the voltages explicitly in a compact set, which is [−1, +1]^n for our work 10. Furthermore, the short-term memory x_s has to be bounded in [0, 1]^m, as a way to completely stop either the gradient-like or the rigidity contribution to the dynamics for each clause. Finally, the long-term memory x_l has to be bounded in [1, x_max]^m in practice 11. In fact, for the analysis in the following sections, we will regularly assume that the bound x_max on the long-term memory is absent, meaning that x_l ∈ [1, +∞)^m, in an effort to increase the generality of certain propositions. The bounds on the short-term memory are crucial, however, and will always be assumed present.
Putting everything together, this means that the dynamics must be fully contained within the region O = [−1, 1]^n × [0, 1]^m × [1, x_max]^m, which is a compact set in R^{n+2m}. To put this formally, we have to show that O is an invariant set: any trajectory with initial value in O must remain in O forever. To do so, we consider a general ODE with the flow field defined in a regular domain Ω, such that a Caratheodory solution exists in the domain. In other words, we have ẋ = F(x), where F : R^{n+2m} → R^{n+2m} is some sufficiently regular vector field in Ω ⊂ R^{n+2m}. Suppose we now wish to modify the vector field in such a way that, for any initial value x(0) ∈ Ω, the trajectory is contained entirely within the closure Ω̄ of that domain, or x(t) ∈ Ω̄ for ∀t ≥ 0. This has to be done carefully, such that the original flow field in Ω remains the same. We do so by patching the original vector field with a "bounding" vector field in ext(Ω), as follows.
Lemma VI.2 (Bounding Field). Let Ω ⊂ R^n be a smooth open domain, and let F : R^n → R^n be some bounded vector field that admits a Caratheodory solution in Ω. Let G : R^n → R^n be some Lipschitz continuous vector field satisfying ⟨G(x), n(x)⟩ = −M for ∀x ∈ ∂Ω, where M > 0 can be any positive constant, and n(x) is the outward-pointing unit normal vector of the boundary ∂Ω at x. Then any construction of the Caratheodory solution (see Theorem V.8) to the ODE ẋ = P(F, G)(x), with initial value x_0 ∈ Ω, has the property that x(t) ∈ Ω̄ for ∀t ≥ 0.
Proof. Note that, based on the construction given in Theorem V.8, it is sufficient to show that κ_G(x) < 0 for ∀x ∈ ∂Ω, as then the trajectory will never be able to exit into the region ext(Ω). By construction, we have ⟨G(x), n(x)⟩ = −M for ∀x ∈ ∂Ω, so it follows directly from Definition V.5 that κ_G(x) = −M < 0.

Remark.
By adding the "bounding" vector field G, we are essentially "projecting" any "stray" fields onto the boundary ∂Ω, such that whenever a trajectory enters the boundary, it will continue to "flow" inside the boundary (see corollary V.2.1) and never escape Ω. An important point to note is that the dynamics do not stop after reaching ∂Ω.
Corollary VI.2.1. Let G : R^{n+2m} → R^{n+2m} be some Lipschitz continuous vector field such that, ∀i ∈ [[1, n + 2m]], ⟨G(x), ê_i⟩ = −M beyond the upper bound of the i-th coordinate of O and ⟨G(x), ê_i⟩ = +M beyond the lower bound, where M > 0 can be any positive constant, and ê_i is the i-th element of the standard basis. Then O is a positive invariant set under the ODE ẋ = P(F, G)(x). Furthermore, the i-th component of the superposed flow field on the bounding hyperplanes is F_i(x)(1 − H(±F_i(x))), where H denotes the Heaviside step function and the sign is chosen so that the outward-pointing component is suppressed.

Remark.
To visualize the bounding flow field, one can imagine a hypercube O where the internal field remains unchanged, and the exterior field is "pressing against" the faces of the cube to ensure that any trajectory initialized inside the cube remains inside. The flow field on the "faces" of the cube is simply the projection of the field onto the plane if the field were to point outward. This bounding procedure effectively mimics the numerical technique that we use to bound the dynamics, where any outward pointing component of the flow field on the boundary is simply ignored.
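The "ignore the outward-pointing component" rule described above takes only a few lines of code. This is a sketch of the numerical bounding technique on a coordinate box, with the function name `clip_flow` our own.

```python
import numpy as np

def clip_flow(x, f, lo, hi):
    """Suppress the outward-pointing components of the flow f on the
    faces of the box [lo, hi]^d: a state pinned to a face can only
    flow along the face or back into the box."""
    f = f.copy()
    f[(x <= lo) & (f < 0)] = 0.0  # lower faces: drop outward (negative) flow
    f[(x >= hi) & (f > 0)] = 0.0  # upper faces: drop outward (positive) flow
    return f
```

Components of the flow that point inward, or that act on coordinates strictly inside the box, are left untouched, so the original field in the interior remains the same.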
From now on, when we refer to the memory dynamics, we are referring to the system of ODEs given in Eqs. (23), with the bounds of the dynamics enforced by the exterior field G as constructed in Corollary VI.2.1. To lessen the burden of notation, we shall refer to the patched flow field of the memory dynamics, P(F, G), simply as F. The set O = [−1, 1]^n × [0, 1]^m × [1, x_max]^m is then a positive invariant set of the memory dynamics.

C. Gauge Invariance of Dynamics
In this Section, we primarily focus on formalizing the notion of gauge invariance for the dynamics governed by Eqs. (23). To do so, it is convenient to first reformulate the flow field as a group action.
Definition VI.1 (Time Mapping). Given a vector field V : R^n → R^n such that there is a unique positive solution x(x_0, t) to the ODE ẋ = V(x) for any initial value x_0 ∈ R^n, we define a mapping T_s : R^n → R^n for ∀s ≥ 0 as T_s(x_0) = x(x_0, s).

Remark.
It should first be noted that T_s is a well-defined operator for ∀s ≥ 0, as the solution to the ODE with any initial value is unique. It can also be easily checked that the operators T_s form a semigroup with the identity element being T_0; in fact, we have T_s ∘ T_t = T_{s+t} for ∀s, t ≥ 0. The reason why the operators form only a semigroup is that they do not necessarily have group inverses, as we do not require the negative-time solution to the ODE to exist or be unique.
For our memory dynamics, an important property of T_s is that it is invariant under gauge conjugation. This is important as it essentially allows us to simplify the analysis of the memory dynamics by assuming that a solution vector is v_0 = +1.
Proposition VI.3 (Gauge Invariance of Dynamics). Given a polarity matrix Q corresponding to a satisfiable 3-SAT instance with some solution vector v_0, and an operator T_s corresponding to the memory flow field F, the operator T_s is invariant under gauge conjugation, G_v0 ∘ T_s ∘ G_v0 = T_s, provided the polarity matrix is simultaneously gauged, where G_v0 is the gauge mapping operation in Definition III.3.

Remark.
We here provide a proof sketch of this proposition. We first begin by noting that the operators T_s form a semigroup, so it is sufficient to show that the infinitesimal group generator F is invariant under gauge conjugation. This is equivalent to showing that transforming both the LHS and RHS of the equations in (23) does not violate the equalities, which can be easily shown by recalling that C_j(v) is gauge invariant (see Lemma IX.4.1). We then see that a prefactor of v_{0,i} appears in both the LHS and RHS of the voltage equations. Furthermore, we can also easily show that the discontinuous hyperplanes (see Section VI A) and the boundaries of the hypercube containing the dynamics (see Corollary VI.2.1) are also invariant under the gauge mapping. Therefore, the operator T_s must be invariant under gauge conjugation for ∀s ≥ 0.
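The gauge invariance of the clause constraint, on which the proof sketch relies, can be verified numerically. The script below assumes the clause-constraint form C_j(v) = (1/2) min_i (1 − q_ij v_i) and the gauge mapping v ↦ v_0 * v with q_ij ↦ q_ij v_{0,i}; since v_{0,i}^2 = 1, each literal 1 − q_ij v_i is unchanged.

```python
import numpy as np

def clause_constraints(v, idx, pol):
    """C_j(v) = 0.5 * min over literals of (1 - q_ij v_i)."""
    lit = 1.0 - pol * v[idx]          # (m, 3) literal "unsatisfaction"
    return 0.5 * lit.min(axis=1)

rng = np.random.default_rng(0)
n, m = 12, 50
idx = np.array([rng.choice(n, size=3, replace=False) for _ in range(m)])
pol = rng.choice([-1.0, 1.0], size=(m, 3))
v = rng.uniform(-1, 1, size=n)
v0 = rng.choice([-1.0, 1.0], size=n)  # gauge vector

# Gauge both the voltages and the polarity matrix simultaneously.
v_g = v0 * v
pol_g = pol * v0[idx]

# Every clause constraint is unchanged under the gauge mapping.
assert np.allclose(clause_constraints(v, idx, pol),
                   clause_constraints(v_g, idx, pol_g))
```

This is exactly the WLOG step used later: gauging by a solution vector lets one assume the solution is v_0 = +1 without changing any clause constraint.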

D. Correspondence between Fixed Points and Solutions
The fixed points of the dynamics must correspond to solutions of the original 3-SAT instance (if the instance is satisfiable). Otherwise, the correctness of the memory dynamics as a SAT solver cannot be guaranteed, as it would be possible for the dynamics to terminate at a point corresponding to a non-solution. We dedicate this section to the correspondence between the fixed points of the dynamics and the solutions of a 3-SAT instance. Before we continue this discussion, we first note that it is possible to solve a 3-SAT Boolean formula with a partial assignment of the Boolean variables, which corresponds to hyperfaces of the voltage hypercube (see Section III C 3). In other words, it is possible for the dynamics to solve a 3-SAT instance by converging to a hyperface instead of any particular solution vector, and the solution can be extracted by choosing an arbitrary vertex of that hyperface.

Definition VI.2 (Solution Plane). Let v be a solution vector, and I be a proper index set. We define the solution plane to be H_{v,I} = {u ∈ [−1, +1]^n | u_i = v_i for ∀i ∈ I}. The vertices of this plane (which are solution vectors) are said to be connected by it. Any solution vector that is not connected by a solution plane is said to be isolated.

Remark.
Note that for a given solution vector v, its proper index set is not necessarily unique, and depends on the polarity matrix of the 3-SAT Boolean formula. The solution plane is, however, unique given a solution vector and its proper index set.

Lemma. A voltage assignment v satisfies C(v) = 0 if and only if v is a solution vector or lies in a solution plane.

Proof. The proof follows trivially from the definition of the solution plane (see Definition VI.2) and the definition of the clause constraint (see Eq. (13)).
Remark. One immediate implication of this lemma is that once we have found a voltage assignment such that the global constraint (or energy) is zero, then the voltage vector must be either a solution vector, or it must be in some solution plane. If it is in a solution plane, then we can take any vertex of that plane as a solution to the 3-SAT problem.
Since a solution vector and a vector in a solution plane both solve the 3-SAT problem, we can treat a solution vector equivalently to a solution plane. Then, for an isolated solution vector v*, its solution plane simply refers to itself.

Proposition VI.5. If the voltage vector reaches a solution plane, then the dynamics converge to a fixed point; conversely, every fixed point of the memory dynamics corresponds to a voltage vector in a solution plane.

Proof. We first show the first part of the proposition. Given any x_s and x_l, we denote x* = {v*, x_s, x_l}, where v* is in a solution plane. WLOG, we can assume that the solution plane is H_{+1,[[1,n']]}, where n' < n and sgn(v*) = +1 (if not, we can simply gauge the polarity matrix and relabel the indices so that this is true). We first begin by showing that C(v(x*, t)) = 0 for ∀t > 0. To do so, it is sufficient to show that for all such v* (and arbitrary memory), the voltage flow field is non-negative, meaning that the trajectory will be pressed against the solution plane.
WLOG, we first focus only on the dynamics of v_1 influenced by clause j (assuming that q_1j ≠ 0). Note that the gradient-like term G_1j is non-positive only if q_1j = −1, which implies that q_ij v_j = +1 for one of the other literals (otherwise C_j ≠ 0), in which case we have G_1j = 0; therefore G_1j ≥ 0 in all cases. The rigidity term is proportional to the clause constraint, which is necessarily zero as C_j(v) = 0. Therefore, all possible contributions to v̇_1 are non-negative, and this applies for ∀i ∈ I. This means that C(v(t, x*)) = 0 for ∀t > 0, and then ẋ_s(t) < 0 and ẋ_l(t) < 0, so the two memory variables decay and terminate at 0 and 1, respectively.
The proof of the second part of this proposition is shown as Corollary VII.2.1, immediately after we establish certain properties of the basin of attraction for our dynamics.
Remark. The proposition essentially states that once the voltage vector reaches a solution plane, then the dynamics will flow to a fixed point. On the other hand, if the voltage vector has not reached a solution plane yet, then the dynamics will continue to evolve (until it finds the solution). If the original 3-SAT problem is unsatisfiable, then the dynamics will continue to evolve forever.

VII. BASIN OF ATTRACTION
From Proposition III.1, we see that the 3-SAT problem is essentially solved once we have discovered a voltage assignment such that C(v) < 1/2, and the dynamics can be terminated. However, in some cases, the implementation of this termination condition may not be feasible, so we have to allow the dynamics to fully converge to a solution vector v_0. In this case, it is necessary to determine the basin of attraction in which the dynamics are guaranteed to evolve toward the solution. We first formally define the basin of attraction as follows.
Definition VII.1. Given some flow field F : R^n → R^n, let x* be a fixed point of this field. We define the basin of attraction of x* as B(x*) = {x_0 ∈ R^n | lim_{t→∞} x(t, x_0) = x*}.

Remark. From the first part of Proposition VI.5, we see that every solution plane must contain a fixed point. We can then modify the above definition so that the basin of attraction of a solution plane is the set of initial values whose trajectories converge to that plane.

WLOG, we let v* = +1 and the proper isolated index set be I = [[1, n']]; then J(v*) = J(+1) = [1 − 2γ, +1]^{n'} × [−1, +1]^{n−n'}, which we simply refer to as J from here on. We first note that if x_s = 0, then for ∀v ∈ J we have v̇(0) ≥ 0 (see the first line of Eqs. (23)). Furthermore, for ∀v ∈ J, we have ẋ_s ≤ 0 (which follows from the second line of Eqs. (23) and Lemma VII.1). We first show, by contradiction, that for any point initialized in the supposed subset of the basin, the evolution of each isolated component of the voltage vector must be weakly monotonic, i.e., v̇_i(t) ≥ 0 for ∀i ∈ [[1, n']] and ∀t > 0.
We let some initial point be x_0 ∈ J × [0, 1]^m × [1, x_max]^m, and the solution trajectory be x(t).
WLOG, we assume that v_1(t) is not monotonically increasing, and is the first voltage in time to violate the inequality v̇_1(t) ≥ 0; we denote this time as T. It is clear that v_1(t) ∈ J for ∀t ∈ [0, T]. In addition, it is required that x_s(T) = 0, which is only possible if the voltage vector exits J at some earlier time t' < T. But as v(t') ∈ J, the above condition is not possible. Therefore, by contradiction, we must have v̇(t) ≥ 0 for ∀t ≥ 0.
To complete the proof, it is sufficient to show that ∀i ∈ I, we have lim_{t→∞} v_i(t) = +1. Again, we prove this by contradiction. We first assume that the statement is not true; then ∃i ∈ I (WLOG let i = n'), ∃ε' > 0 such that lim_{t→∞} v_i(t) = 1 − ε', and lim_{t→∞} v̇_i(t) = 0 (as v_i is monotonic). This means that there is a time T after which v_i can no longer appear as the most satisfied literal in any clause. If this were not the case, then ∀T, ∃t' > T at which v̇_i(t') receives a rigidity contribution proportional to 1 − v_i(t'), which is clearly not possible, as the limits of the LHS and RHS converge to different values. As v_i is no longer the most satisfied literal in any clause, we can set its value arbitrarily in [−1, +1], and the condition C(v) ≤ γ will still remain true, as the clause energy of each clause only depends on its most satisfied literal (see Eq. (13)).
From Lemma VII.1, this implies that the restricted solution orthant should be [+1/2, +1]^{n'−1} × [−1, +1]^{n−n'+1} instead. However, the restricted solution orthant of a solution vector is unique given a proper index set I (see the remark of Definition VI.2), so we have a contradiction. Therefore, the dynamics must converge to a solution plane, and thus also to a fixed point by Proposition VI.5.

Remark.
Note that this basin of attraction is a superset of the basin of attraction proven in another work [45] using continuous dynamics for solving k-SAT problems. This means that the basin of attraction for our dynamics is larger, which is a desirable property for using our dynamics as a SAT solver.
Corollary VII.2.1. If x* is a fixed point of the memory dynamics given in Eqs. (23), then v* is in a solution plane.
Proof. If x* is a fixed point, then clearly we require C_j(v*) ≤ δ for ∀j, as otherwise x_{l,j} would increase. And since δ < γ, we have C_j(v*) < γ, meaning that x_{s,j} = 0, as otherwise x_{s,j} would decrease. Since C_j(v*) ≤ δ < 1/2, sgn(v*) is a solution vector (see Proposition III.1), and x* ∈ B(sgn(v*)) (see Theorem VII.2). If v* is in a solution plane, then we are done; if not, then x* necessarily evolves to a solution plane by Theorem VII.2, meaning that it cannot be a fixed point, creating a contradiction. Therefore, v* must already be in a solution plane to begin with.

VIII. DYNAMIC VOLTAGE FLOW
Oftentimes we are interested only in the convergence properties of the voltage dynamics, as they correspond directly to the states of the Boolean variables. On the other hand, the evolution of the memory variables is important in influencing the trajectory of the voltages indirectly, by changing the strength of the gradient-like and rigidity terms (see Eqs. (23)).
It then makes sense to restrict our attention to only the components of the flow field that govern the dynamics of v directly, which we can denote as F v , and refer to as reduced flow field in the voltage space, or simply the voltage flow.
It should be noted that this flow is not autonomous and is, in fact, dynamically governed by the memory. In this Section, we establish the tools needed to study such a reduced flow field, which we will use to show certain properties, such as the absence of periodic orbits (see Section IX), in the voltage space. For the remainder of this material, we shall assume that the full flow field is always Lipschitz continuous to simplify the discussion, since we have seen in Section VI A that the existence of measure-zero discontinuities does not significantly alter the behavior of our dynamics. Oftentimes, we will focus on the simplified dynamics as given in Eq. (27) and assume continuity for G and C.

A. Reduced Flow
In this Section, we will first factor the full phase space into the reduced space and the auxiliary space, which will allow us to formalize the notion of a reduced flow field. Visually, the reduced flow field can be viewed as the full flow field "projected" onto a subspace. We proceed with the following series of definitions.
Definition VIII.1. Let X be a set and Y = ∏_{j=1}^m Y_j be a product space. Given any mapping F : X → Y, we define the decomposition of F as the mappings F_j : X → Y_j for ∀j ∈ [[1, m]] such that F(x) = (F_1(x), …, F_m(x)).

Definition VIII.2. Let F = (F_1, F_2) : R^n × R^m → R^n × R^m be a flow field, and x(t, x_0) be a trajectory under this flow field. If we denote R^n as the reduced space and R^m as the auxiliary space, then we define the reduced trajectory and the auxiliary trajectory, x_1(t, x_0) ∈ R^n and x_2(t, x_0) ∈ R^m, such that x(t, x_0) = (x_1(t, x_0), x_2(t, x_0)).

Definition VIII.3 (Reduced Flow). Let F = (F_1, F_2) : R^n × R^m → R^n × R^m be a flow field, and x(t, x_0) be a trajectory under this flow field. For a given initial point x_0, we construct the reduced flow field F_r : R × R^n → R^n such that ∀t ≥ 0, F_r(t, x_{0,1}) = F_1(x_{0,1}, x_2(t, x_0)).

Remark.
It can be easily verified that the reduced flow field is well defined at any given time. Visually, if we view the reduced space as a hyperplane that "cuts" the full flow field, then the reduced flow field is simply the "cross section" of the field in the plane. As the auxiliary variables evolve in time, the plane will move in the auxiliary space, or simply some direction orthogonal to the plane, thereby changing the cross section. We then see that the reduced flow field is effectively a dynamic flow field governed by the auxiliary trajectory.

B. Flow Kernel and Complement
In this subsection, we will relate the dynamics of the reduced flow field to the auxiliary trajectory explicitly. We will focus specifically on the case where the reduced flow field is linear in the auxiliary variables, with the simplified memory dynamics in Eq. (27) as an example. In particular, at every point in the reduced space, the auxiliary space can be factored into two subspaces, one of which the auxiliary trajectory can evolve in without affecting the reduced flow field. We refer to this subspace as the flow kernel of the reduced flow field at that point, and to the other factor subspace as the flow complement. We formally define the two subspaces as follows.

Definition VIII.4 (Flow Kernel). Let F = (F_1, F_2) : R^n × R^m → R^n × R^m be a vector field. We refer to K_F1(x_1) = {x_2 ∈ R^m | F_1(x_1, x_2) = 0} as the flow kernel of F_1 at x_1.

Remark.
Clearly, the flow kernel is a vector space. In fact, given any fixed x_1, the operation F_1(x_1, x_2) can be regarded as a mapping from R^m to R^n via an n × m matrix, with its kernel being the flow kernel. Let the rank of the matrix be n' ≤ n. If n' ≥ m, then clearly the kernel is trivial. On the other hand, if n' < m, then the dimension of the kernel is m − n' by the rank-nullity theorem. Hard 3-SAT instances generally have clause ratios near 4, meaning that m ≈ 4n (if we let the voltage space be the reduced space), so the kernel is generally non-trivial.
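The rank-nullity argument can be illustrated numerically, treating the reduced flow at a fixed x_1 as an n × m matrix; the random Gaussian matrix below is purely an illustrative stand-in for the actual clause-dependent matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
m = 4 * n                         # clause ratio ~ 4 for hard 3-SAT
A = rng.standard_normal((n, m))   # stand-in for x_2 -> F_1(x_1, x_2)

rank = np.linalg.matrix_rank(A)   # n' <= n
nullity = m - rank                # dimension of the flow kernel
```

With m ≈ 4n the nullity is at least 3n, so the auxiliary (memory) trajectory has a large subspace in which it can evolve without changing the voltage flow at all.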
Definition VIII.5 (Flow Complement). Let F = (F_1, F_2) : R^n × R^m → R^n × R^m be a vector field. We refer to the orthogonal complement G_F1(x_1) of the flow kernel at the point x_1 as the flow complement of F_1 at x_1.
Remark. For any fixed x_1, the flow complement is likewise a vector space, and the auxiliary space factors as the direct sum of the flow kernel and the flow complement.

Definition VIII.6 (Auxiliary Relevance). Let F = (F_1, F_2) : R^n × R^m → R^n × R^m be a vector field. Given x_1 ∈ R^n and x_2 ∈ R^m, we refer to the projection of x_2 onto K_F1(x_1) as the irrelevant component, and the projection onto G_F1(x_1) as the relevant component, which we denote as x_2*.
Remark. Given a reduced flow field that is linear in the auxiliary variables, it can be shown that the time derivative of the field is zero at time t and location x_{0,1} if and only if the auxiliary variable evolves in the flow kernel of F_1, or ẋ_2(t, x_0) ∈ K_F1(x_{0,1}). Equivalently, this means that the time derivative of the relevant component of x_2 must be zero, or ẋ_2*(t, x_0) = 0.

C. Unstable Non-solution Fixed Points
In Proposition VI.5, it was shown that every fixed point in the full phase space R^{n+2m} must correspond to a solution plane. For simplicity, we focus on the simplified memory dynamics as given in Eq. (27). We here temporarily relax the specific forms of the functions C and G (Eqs. (13) and (24), respectively), and simply require that they be general and non-singular. A voltage fixed point means that v̇ = 0, implying that the memory must be in the flow kernel, or x ∈ K(v). If the condition v̇ = 0 is to be held in time (or v̈ = 0), then the memory must also evolve in the flow kernel, or ẋ ∈ K(v). Equivalently, G(v) · C(v) = 0. The LHS is simply an R^n → R^n mapping, so the preimage of 0 consists of finitely many points in general, and they constitute a measure-zero set in R^n. This shows the unlikeliness of the dynamics being trapped in a non-solution fixed point.
To show that the gradient-like influence is unstable, we first note that a Jacobian element of the gradient-like contribution can be written as J_ik = ∂v̇_i/∂v_k = Σ_j x_j ∂G_ij(v)/∂v_k, where the last equality follows from Eq. (24), and derivatives across the discontinuous hyperplanes are neglected. In this form, it is clear that the diagonal elements of the Jacobian are zero, or J_ii = 0 for ∀i. To see this, we simply note that ∂_vi v_j = δ_ij, and the condition i ≠ j is imposed by the min function, so G_ij never depends on v_i. This means that the trace of the Jacobian is zero, meaning that any fixed point cannot be stable (otherwise the trace, being the sum of the real parts of the eigenvalues, would necessarily be negative). 12

12 If we were to extend the analysis of this subsection to the full memory dynamics (by including the rigidity and fast memory dynamics), then the RHS of v̇ can be decomposed into two terms, one quadratic in x and the other dependent only on v. The equation v̇ = 0 would still be a polynomial equation for x, and the solution space of x can similarly be decomposed into a hyperface defined by the corresponding algebraic variety and its complement; the analysis in this subsection can then be easily extended to the full memory dynamics by considering the local tangent space.
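The vanishing Jacobian diagonal can be checked by finite differences on a single clause, using our reading of the gradient-like term of Eq. (24) (so the exact prefactor is an assumption; the vanishing diagonal does not depend on it).

```python
import numpy as np

def G_term(v, q):
    """Gradient-like term of one clause: G_a = 0.5 * q_a * min of the
    other two literals' (1 - q_b v_b). G_a never depends on v_a."""
    lit = 1.0 - q * v
    out = np.empty(3)
    for a in range(3):
        others = [b for b in range(3) if b != a]
        out[a] = 0.5 * q[a] * min(lit[b] for b in others)
    return out

# Finite-difference check that dG_a/dv_a = 0 for every literal a.
q = np.array([1.0, -1.0, 1.0])
v = np.array([0.2, -0.6, 0.4])
h = 1e-6
for a in range(3):
    vp = v.copy()
    vp[a] += h
    diag = (G_term(vp, q)[a] - G_term(v, q)[a]) / h
    assert abs(diag) < 1e-9
```

Since each G_a depends only on the other two voltages, the diagonal derivative is exactly zero, and hence so is the trace of this Jacobian block.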

IX. NON-PERIODICITY OF DYNAMICS
In dimensions greater than 2, a dissipative system 13 may admit periodic orbits. Therefore, we shall show the absence of periodic orbits explicitly in this Section. This result directly precludes the possibility of chaos (see Section IX D).
We formulate the proof of non-periodicity on the voltage space by making use of the formalism developed in Section VIII. Note that showing the absence of periodic orbits in the full state space (voltages plus memories) is not sufficient for our purpose, as it does not preclude the existence of periodic orbits in the reduced voltage space, which is directly relevant to the solution of the 3-SAT problem. For analytic convenience, we shall assume that all fields mentioned in this Section are sufficiently well-behaved (i.e., Lipschitz continuous in space and continuous in time) such that they admit a unique classical solution for all initial values.

A. Generalized Periodicity
As the voltage dynamics by itself is not autonomous (since it is memory dependent), we first have to construct a non-standard definition of periodicity for dynamic fields that suffices in the context of optimization. In general, a dynamic field admits periodic orbits of non-constant periods. We first recall that the classical definition of periodicity for static fields is given as follows, and generalize this definition for dynamic fields.
Definition IX.1 (Regular Periodic Orbit). Let x : [0, +∞) → R^n be a trajectory with initial value x_0. The trajectory is said to be periodic if ∃T > 0 such that x(t + T, x_0) = x(t, x_0) ∀t ≥ 0. The periodic orbit of x_0 is γ_{x_0} = {x(t, x_0) | t ∈ [0, T)}, and the period of this orbit is T.
Remark. It is fairly easy to show that γ_{x_0} = {x(t, x_0) | t ≥ 0}, meaning that the periodic orbit is also the maximal positive orbit of x_0. This makes sense because the trajectory cannot escape the periodic orbit even given infinite time. This property generalizes naturally to dynamic fields.
Definition IX.2 (Speed). Let x : [0, +∞) → R^n be some trajectory with initial value x_0. If the trajectory is everywhere differentiable in time, we define the velocity along the trajectory to be ẋ(t, x_0), and the speed to be s(t, x_0) = ||ẋ(t, x_0)||.

13 See Section X for a detailed discussion of the dissipativeness of the memory dynamics.

Remark.
Clearly, the velocity and speed of the trajectory are also periodic, with the same period as the trajectory itself. If the trajectory is governed by the flow field F, then the period of the orbit is given by the contour integral T = ∮_γ ||dx|| / ||F(x)||, i.e., arc length over speed. This integral is well-defined for a static flow field, but it is no longer well-defined if F is explicitly time dependent, in which case the period may be time-dependent as well.
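As a sanity check of the contour formula, consider a toy static field chosen here for illustration (it is not the paper's dynamics): F(x, y) = (−y, x) has the unit circle as a periodic orbit with unit speed, so the arc-length-over-speed integral should return 2π.

```python
import numpy as np

# Period of a closed orbit of a static field F: T = ∮_gamma ||dx|| / ||F(x)||.
# Toy check: F(x, y) = (-y, x), unit-circle orbit, exact period 2*pi.
def F(p):
    x, y = p
    return np.array([-y, x])

theta = np.linspace(0.0, 2.0 * np.pi, 20001)
pts = np.stack([np.cos(theta), np.sin(theta)], axis=1)   # orbit samples
seg = np.diff(pts, axis=0)                               # dx along the orbit
ds = np.linalg.norm(seg, axis=1)                         # ||dx|| per segment
mid = 0.5 * (pts[:-1] + pts[1:])                         # segment midpoints
speed = np.linalg.norm(np.array([F(p) for p in mid]), axis=1)
T = np.sum(ds / speed)                                   # discretized contour integral

assert abs(T - 2.0 * np.pi) < 1e-3
```

For a time-dependent F the integrand would itself depend on when each point of the orbit is visited, which is precisely why the period of a general periodic orbit defined next may vary in time.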
Definition IX.3 (General Periodic Orbit). Given a time-dependent flow field F : R × R^n → R^n, a general periodic orbit is said to exist for x_0 if ∃T > 0 such that x(T, x_0) = x(0, x_0) = x_0. Let the periodic orbit be γ_x = {x(t) | t ∈ [0, T)}; then it is required that x(t, x_0) ∈ γ_x ∀t > 0. Furthermore, for a given time t, we let t* = inf{s > t | x(s, x_0) = x(t, x_0)}; the period at time t is then given as T(t) = t* − t. If T(t) is constant in time, then the periodic orbit is said to be regular; otherwise, the periodic orbit is said to be irregular.
Remark. Essentially, a general periodic orbit is a closed trajectory which contains the maximal positive solution.
Furthermore, there must be a point (and a neighborhood of it) on the orbit which the trajectory visits at two separate times, with the period simply being the time elapsed until the next visit. Technically, the period can be zero if the flow field is zero at that particular time and point, and infinite if the trajectory never revisits the point; but there must be at least one point in time where the period is positive and finite.
Lemma IX.1. If a dynamic flow field F : R × R^n → R^n admits an irregular periodic orbit, then ∃x_0 and times t_1 ≠ t_2 with x(t_1, x_0) = x(t_2, x_0) = x_0 such that k F(t_1, x_0) = F(t_2, x_0) ≠ 0 for some k > 0, k ≠ 1. Proof. The proof is omitted. See remark instead.
Remark. Essentially, there must be at least one point on the periodic orbit where the dynamic flow field aligns (or anti-aligns) with itself at two separate times. This is clearly required so that the trajectory can "revisit" the orbit in the neighborhood of that point.
Corollary IX.1.1 (Change in Relevant Component). Given a static flow field F = (F_1, F_2) on R^{n+m} such that F_1 is linear in x_2, if an irregular periodic orbit γ exists in R^n, then ∃x_0 ∈ R^{n+m} and ∃t_1 ≥ 0, t_2 ≠ t_1, such that x*_2(t_1, x_0) ≠ x*_2(t_2, x_0), where x*_2(t, x_0) is the relevant component of x_2 at time t as defined in Definition VIII.6.
Proof. The proof follows directly from Definition VIII.6 and Lemma IX.1.
Remark. In terms of the relevant component of the auxiliary variable, the periodic orbit is regular if and only if the relevant component returns to the same value at every revisit of the orbit.

B. Absence of Irregular Periodic Orbits
In the previous subsection, we saw that a periodic orbit under a dynamic field can be categorized as either regular or irregular. To show the absence of periodic orbits, we treat the two cases separately, as they require different proof techniques. In this subsection, we focus on the irregular case, which requires the physical notion of speed as defined in Definition IX.2; we treat the regular case in the next subsection, by formulating the problem in the geometric context of hypersurface intersections.
For the sake of simplicity, we again focus on the simplified dynamics as given in Eq. (27).^14 We require the functions C and G to be everywhere differentiable^15 (which automatically guarantees Lipschitz continuity). This also guarantees that the image of any compact set in R^n is bounded above in norm. For C, we can additionally assume that it is bounded below in norm, since otherwise C = 0 implies that the trajectory is in a solution plane, in which case it must converge to a fixed point (see Proposition VI.5)^16. To make the definition of speed (see Definition IX.2) useful, we first have to formally define the generalized concept of location for trajectories governed by a dynamic field.
Remark. The location is quite literally the location on the real line obtained by unwinding the trajectory locally into a straight line. As long as the trajectory keeps moving "forward" in some time interval, the mapping ϕ is bijective, meaning that there is a one-to-one correspondence between time and location. This is why the condition ẋ(t, x_0) ≠ 0 is required locally.
Lemma IX.2. In a time interval in which location can be defined, the speed is differentiable with respect to location in that interval. In other words, the mapping s ∘ u^{-1} is locally differentiable.
Proof. First of all, we have s(t) = ||v̇(t)|| ≠ 0 in the time interval, and we note that the time derivative of s can be expressed through G, which is well-defined as G is everywhere differentiable, meaning that s(t) is differentiable with respect to t (as long as s(t) ≠ 0). Furthermore, we note that u′(t) = s(t), meaning that u is also differentiable with respect to time. Since s(t) ≠ 0, the inverse u^{-1} is differentiable in the interval as well. Therefore, s ∘ u^{-1} is differentiable in the interval, being the composition of two differentiable mappings.

14 The rigidity influence is negligible in the periodicity analysis, as the dynamics is dominated by the gradient-like term when the system is continuously in an unsatisfied state, which is clearly the case when the dynamics is trapped in a periodic orbit.
15 Note that the actual gradient-like term G defined in Eq. (24) is everywhere differentiable except at certain hyperplanes, which constitute a measure-zero set in the voltage space (see Section VI A). It is easy to see that the presence of these hyperplanes does not affect the periodicity analysis.
16 In fact, it can be assumed that |C(v)| ≥ δ for the full equations by Proposition VII.2.
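The chain of differentiability in this proof can be illustrated numerically. The toy trajectory below is an assumption chosen for its closed form (x(t) = (e^{-t}, 0), not the paper's dynamics): its speed s(t) = e^{-t} is positive, the location u(t) = ∫₀ᵗ s dτ = 1 − e^{-t} is strictly increasing, and s as a function of location is the differentiable map s(u) = 1 − u, so ds/du = −1 everywhere.

```python
import numpy as np

# Toy trajectory x(t) = (exp(-t), 0): speed s(t) = exp(-t) > 0 and location
# u(t) = 1 - exp(-t).  Eliminating t gives s(u) = 1 - u, hence ds/du = -1;
# this is the reparameterization s o u^{-1} used in Lemma IX.2.
t = np.linspace(0.0, 2.0, 100001)
s = np.exp(-t)
# Trapezoid-rule cumulative integral of the speed gives the location u(t):
u = np.cumsum(np.concatenate([[0.0], 0.5 * (s[1:] + s[:-1]) * np.diff(t)]))
ds_du = np.gradient(s, u)          # numerical derivative of speed w.r.t. location

# Away from the endpoints the derivative should be -1 to good accuracy:
assert np.max(np.abs(ds_du[10:-10] + 1.0)) < 1e-3
```

Because u′(t) = s(t) > 0, the map t ↦ u is invertible, which is exactly what makes the derivative with respect to location well-defined.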
Theorem IX.3. An irregular orbit does not exist in the voltage space.
Proof. We prove this by contradiction, assuming that an irregular orbit does exist. Then by Lemma IX.1, ∃v_0 such that k v̇(t_1) = v̇(t_2) ≠ 0, where t_2 ≠ t_1, v(t_1) = v(t_2) = v_0, and WLOG k ∈ (0, 1). We let u_0 be the location of v_0, and δu be an infinitesimal change in location from u_0. Furthermore, we let s_1 be the speed at time t_1, and δs be the change in speed with respect to δu (which is well-defined, as shown in Lemma IX.2). If we let δθ be the change in direction, then we have the following equality, where we discarded third-order terms on the LHS and applied Eq. (27) and u′(t) = s(t) on the RHS. Rearranging the terms gives the relation below. If we let the speed at time t_2 be s_2 (with s_2 = k s_1), then the same relation holds there as well. We can assume that s′(u_0) = 0 at t_2, which is justified as s is bounded and differentiable everywhere. Since s_1 > s_2, and {x, C, G} are everywhere differentiable, we must have s′(u_0)² < 0 at time t_1 by the above relationship, which is clearly impossible.
Therefore, an irregular orbit cannot exist.

C. Absence of Regular Periodic Orbits
In the previous subsection, we showed the absence of irregular orbits; so if a periodic orbit were to exist in the voltage space, it must be a regular periodic orbit. In this Section, we show that regular periodic orbits are also absent in general. The problem can be described geometrically: a regular orbit corresponds to the intersection of two low-dimensional hypersurfaces in a high-dimensional space, which cannot occur if the two surfaces are in general position.
Theorem IX.4. In general, a periodic orbit does not exist in the voltage space.
Proof. In Theorem IX.3, we showed the absence of irregular periodic orbits in the voltage space; it is therefore sufficient to show that a regular periodic orbit is absent as well.
If a regular orbit were to exist in the voltage space, then we can denote its period as T and its initial point as x_0 ∈ R^{n+m}. The change in the memory over a period starting from time t is then given by an integral over one period, where in the last equality we used the periodicity of v to remove the explicit dependence on t in the integration bounds.
This allows us to set the result equal to some vector K that is constant in time.
Clearly, K must be in the flow kernel of the reduced flow field ∀t ≥ 0 (otherwise the velocity would not be the same after a period), which implies condition (29). It can be assumed that any sensible matrix G dictating the evolution of the voltages must coincide with the polarity matrix Q exactly in its nonzero elements, so G : R^n → R^{3m}, as there are m clauses and 3 literals per clause, which gives us an injective mapping. Furthermore, it can be assumed that the mapping is a C^∞ diffeomorphism onto its image and general, so that the image of R^n is a smooth n-dimensional hypersurface in general position in R^{3m}. On the other hand, condition (29) is a system of m linear equations, so the set of all matrices (with nonzero elements matching the polarity matrix) solving the system for a given K forms a (3m − m) = 2m dimensional hyperplane in R^{3m}, which is also in general position as K is general. The n-dimensional hypersurface generated by the voltages and the 2m-dimensional hyperplane do not intersect if they are in general position, as 2m + n < 3m, where we have assumed n < m (see the remark of Definition VIII.4).
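The dimension count at the heart of this argument can be probed numerically: two random affine subspaces of dimensions d₁ and d₂ in R^D generically intersect only if d₁ + d₂ ≥ D. The sketch below uses affine subspaces as a stand-in for the smooth hypersurface (a simplifying assumption) and tests intersection via a least-squares residual:

```python
import numpy as np

rng = np.random.default_rng(1)

def affine_intersect(D, d1, d2):
    """True if two random affine subspaces of dims d1, d2 in R^D intersect."""
    a, b = rng.normal(size=(2, D))        # base points
    A = rng.normal(size=(D, d1))          # spanning directions, subspace 1
    B = rng.normal(size=(D, d2))          # spanning directions, subspace 2
    # The subspaces meet iff A @ alpha - B @ beta = b - a has an exact solution.
    M = np.hstack([A, -B])
    sol, *_ = np.linalg.lstsq(M, b - a, rcond=None)
    return np.linalg.norm(M @ sol - (b - a)) < 1e-8

# Mirroring the proof with m = 3 clauses and n = 2 < m variables:
# an n-dim surface and a 2m-dim plane in R^{3m}, with n + 2m = 8 < 9 = 3m.
assert not affine_intersect(9, 2, 6)      # generic position: no intersection
assert affine_intersect(9, 6, 6)          # 6 + 6 = 12 >= 9: they generically meet
```

With probability one the random subspaces are in general position, so the first pair misses and the second pair meets, matching the 2m + n < 3m criterion in the proof.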

Remark.
Note that the dimension of the surface containing the periodic orbit must be at least 2, which means that the intersection of the two surfaces in R^{3m} would have to be at least 2-dimensional as well; this makes the existence of periodic orbits even less likely. Even assuming that n > m, so that the intersection of the two surfaces is non-trivial, the existence of a periodic orbit in the voltage space is still unlikely: we would require an initial memory value generating a reduced flow field that keeps the voltage trajectory completely contained in the intersection, and such a value does not exist in general.

D. Absence of Chaos
Devaney's definition of chaos [46] requires a dense set of periodic orbits and topological transitivity. We have shown in the previous Sections that periodic orbits are not supported by the dynamics defined by Eqs. (23), and this directly precludes the existence of chaos. We can then state the following:
Corollary IX.4.1 (Absence of Chaos). The voltage dynamics are non-chaotic.

Remark.
Although not required to show the absence of chaos, we also note that the voltage dynamics are not topologically transitive if a solution of the 3-SAT problem exists, which implies the existence of fixed points (see Proposition VI.5). This is because the dynamics are confined in a compact positive invariant set O with nonempty interior (see Section VI B), and there is at least one fixed point in that set whose ω-limit set is not O.

X. DISSIPATIVENESS
A rather important property of the memory dynamics is dissipativeness. In other words, the measure (or volume) of an initial set contracts under the flow field, eventually evolving to a measure-zero set. To show dissipativeness for well-behaved (everywhere differentiable) vector fields, it is sufficient to show that the divergence is negative everywhere.
However, our dynamics are governed by a discontinuous flow field, so we have to carefully account for the regions of discontinuity (see Section VI A).

A. Preliminaries
Before we discuss the dissipative property of the memory dynamics, we first have to formally define the notion of dissipativeness for a continuous dynamical system.
Let T_s denote the flow map of the system, and for an initial bounded domain Ω_0 let µ(s, Ω_0) be the measure of T_s(Ω_0). If ∀Ω_0 and ∀s > 0, µ(s, Ω_0) < µ(0, Ω_0), then the system is said to be dissipative. If the last inequality is not strict, then the system is said to be weakly dissipative.
An equivalent definition of a dissipative system would then be the following: ∀Ω_0, µ̇(0, Ω_0) < 0, meaning that any initial domain must continually shrink in time. For the sake of clarity, we can discard the trivial argument s = 0 and simply write µ̇(Ω_0) = µ̇(0, Ω_0) from here on.
It is well known that a bounded domain can be approximated^17 as the union of regular domains [44] (see Definition V.2). Therefore, to show that a system is dissipative, it is sufficient to show that the volume of any regular domain contracts under the flow field. In the case where the flow field is C^1, the mapping T_s is diffeomorphic ∀s > 0, meaning that the shape of the boundary is preserved (it remains diffeomorphic to a sphere), allowing us to make use of the following Lemma.
Lemma X.1. Given a vector field F : R^n → R^n differentiable everywhere, the system is dissipative if ∀x ∈ R^n, ∇ · F(x) < 0.
Proof. The proof follows directly from the divergence theorem, µ̇(Ω) = ∮_{∂Ω} F · n̂ dA = ∫_Ω (∇ · F) dV < 0, where Ω ⊆ R^n is any smooth domain. This implies that F is dissipative.
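A numerical illustration of this volume-contraction argument, using a linear field x′ = Ax as a stand-in (an assumption chosen so the volume ratio has the closed form det(e^{tA}) = e^{t·tr A}, by Liouville's formula):

```python
import numpy as np

# Under x' = A x, volumes evolve as mu(t) = mu(0) * det(expm(t A))
#                                         = mu(0) * exp(t * tr(A)).
# Negative divergence (here tr(A) < 0 everywhere) forces contraction.
A = np.array([[-1.0, 2.0],
              [0.5, -0.5]])       # tr(A) = -1.5 < 0
t = 0.7

def expm(M, terms=30):
    """Matrix exponential via a truncated Taylor series (dependency-free;
    scipy.linalg.expm would serve equally well)."""
    out, term = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

ratio = np.linalg.det(expm(t * A))      # volume ratio mu(t)/mu(0)
assert abs(ratio - np.exp(t * np.trace(A))) < 1e-9
assert ratio < 1.0                      # the flow is volume-contracting
```

For a general smooth field the same conclusion follows locally, since the divergence is the trace of the Jacobian of F at each point.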

Remark.
The converse of Lemma X.1 is almost true, in the sense that if the vector field F contains regions of positive^18 divergence, then the system cannot be dissipative. To show this, assume ∃x ∈ R^n such that ∇ · F(x) > 0; then, by continuity, the region where the divergence is positive is an open set. This means that for any ball B(x_0) contained in this region, ∫_B (∇ · F) dV > 0, implying that µ̇_B(x_0) > 0, so the system cannot be dissipative.
The analysis in this subsection assumes that the field is differentiable everywhere. In the next subsection, we generalize this analysis to fields that are differentiable everywhere except at certain hyperplanes. This class of fields contains the flow field of the memory dynamics (see Section VI A).

17 Here, we are speaking of approximation in the measure-theoretic sense. In other words, the measure of the domain and the measure of its approximation are the same.
18 Note that it is not sufficient that the divergence be non-negative, as it is possible for the divergence to be zero on a closed set. This is why the converse of Lemma X.1 is not strictly true.
In Section VI A, we argued that the memory flow field is separated into continuous regions by hyperplanes. The divergence cannot be defined at the hyperplanes as the field is discontinuous, so we restrict the divergence analysis to a domain where the field is in C 1 .
The divergence of the flow field in the voltage space^19 is given below, where the divergence of the gradient-like term is zero because the element G_ij never depends on v_i (see Eq. (24)).
The expression for the divergence of the rigidity term can be further simplified, and the divergence expression then reduces to a form showing that the voltage dynamics is dissipative at any point where the field is continuous.
It is easy to study how the addition of discontinuous hyperplanes affects the dissipativeness of the voltages. For the sake of simplicity, we focus on a simple 2-SAT formula with only one clause, where, WLOG, the polarity can be assumed positive for both literals (see Section III B). The discontinuity of the voltage flow field then clearly lies on the line v_1 = v_2. Note that the gradient-like field is continuous everywhere, while the rigidity field is not: in the upper-left and lower-right regions, the rigidity field points away from the line, meaning that the volume of a domain approaching this boundary will be expanded, with the expansion being greater the more unsatisfied the clause is. This is in fact a desired feature of the rigidity field, as it attempts to expand the volume when the initial domain is in a frustrated region, which allows for a more thorough exploration of the voltage space.

19 The reason why we only care about dissipativeness in the voltage space is that it is directly relevant to the convergence of the 3-SAT solution search. It is possible for the solver to be efficient even if the memory is non-dissipative.
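A hedged toy version of this expansion (the piecewise-constant field below only mimics the sign structure of the rigidity field near the line v_1 = v_2; the actual field's magnitude depends on how unsatisfied the clause is):

```python
import numpy as np

# Toy stand-in for the rigidity field near the discontinuity line v1 = v2
# (single positive-polarity clause, as in the text): the field pushes
# states away from the line, so a small set straddling it is stretched
# in the transversal direction.
def R(v, c=1.0):
    if v[0] < v[1]:
        return np.array([-c, c])     # upper-left region: push further up-left
    else:
        return np.array([c, -c])     # lower-right region: push further down-right

dt, steps = 0.01, 100
p = np.array([0.49, 0.51])           # just above the line v1 = v2
q = np.array([0.51, 0.49])           # just below the line
d0 = np.linalg.norm(p - q)
for _ in range(steps):               # forward-Euler integration of both points
    p = p + dt * R(p)
    q = q + dt * R(q)
d1 = np.linalg.norm(p - q)

assert d1 > d0                       # separation grows across the discontinuity
```

Two points starting on opposite sides of the line separate linearly in time, illustrating how the discontinuity locally expands volume even though the field is dissipative away from it.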
In conclusion, the flow field in the voltage space is dissipative everywhere except at certain hyperplanes, where the domain may be expanded in a manner that facilitates the finding of the fixed points.
XI. O(n^α), α ≤ 1, SCALING WITH PROBLEM SIZE

In this Section we employ results from the (supersymmetric) topological field theory (TFT) of dynamical systems [47] to prove that the continuous-time dynamics defined by Eqs. (23) are such that the system reaches a fixed point/solution plane (solution of the 3-SAT instance) in a time that scales with the 3-SAT instance size, n, as O(n^α), with α ≤ 1. WLOG, by "size" we mean the number of variables, n, in the instance at a fixed clause-to-variable density, α_r = m/n (see Section III). When the system has reached an equilibrium point, all clauses, C_j(v_0), in the problem instance are strictly zero at the solution vector, v_0.

Remark.
Note that for continuous-time dynamics the time a physical system requires to reach an equilibrium point is strictly infinite, irrespective of the size of the problem/phase space. In practice, as is done in numerical simulations, we say that the system has found the solution to the problem when all clauses are less than a threshold whose value does not depend on the size of the instance. As shown in Proposition III.1, this threshold can be chosen to be as large as 1/2: namely, when C(v) < 1/2, then sgn(v) is a solution vector. Therefore, when we discuss the time to find the solution, we mean the shortest time for the dynamical system to cross a fixed threshold, in either the clause or voltage variables.
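A sketch of this threshold argument, assuming the clause function takes the form C_j(v) = ½ min_i(1 − q_i v_i) common in DMM constructions (an assumption here; the paper's precise definition is in Section III): C_j(v) < ½ forces the minimizing literal to have q_i v_i > 0, so the Boolean assignment sgn(v) satisfies the clause.

```python
import numpy as np

# Assumed clause form (see lead-in): for clause j with literal polarities
# q in {-1, +1} and voltages v in [-1, 1],
#     C_j(v) = 0.5 * min_i (1 - q_i * v_i).
# C_j(v) < 1/2  =>  min_i (1 - q_i v_i) < 1  =>  q_i v_i > 0 for the
# minimizing literal  =>  sgn(v) satisfies clause j.
rng = np.random.default_rng(2)

def clause_value(q, v):
    return 0.5 * np.min(1.0 - q * v)

def clause_satisfied(q, v_bool):
    # v_bool in {-1, +1}: the clause holds if any literal matches its polarity
    return bool(np.any(q * v_bool > 0))

for _ in range(10000):
    q = rng.choice([-1.0, 1.0], size=3)
    v = rng.uniform(-1.0, 1.0, size=3)
    if clause_value(q, v) < 0.5:
        assert clause_satisfied(q, np.sign(v))
```

This is why crossing the fixed threshold 1/2 suffices to read off a Boolean solution, independently of instance size.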
Using supersymmetric TFT, it was shown that instantons are the only "low-energy" (collective) dynamics of digital memcomputing machines, such as those described by Eqs. (23) [14,48]. Instantons are families of classical trajectories in the phase space connecting critical points of a given index (number of unstable directions) to critical points of lower index (fewer unstable directions). The difference between the indexes of the two critical points is typically 1, but it can be larger. The reverse process (anti-instantons), connecting critical points of increasing index, can only occur in the presence of noise and is "gapped", which means that even in the presence of noise it is exponentially suppressed compared to the instantonic process [47].
As shown in [14,48], the dynamics described by Eqs. (23) then proceed via a succession of instantonic "jumps" that "shed" unstable directions in going from one critical point to the next. In addition, as proved in Section IX, if the dynamics admit solutions, periodic orbits and chaos cannot coexist with them.
Since the instantonic trajectories are bounded (see Section VI B), the number of unstable directions is at most equal to the dimensionality of the phase space, n + 2m = n(1 + 2α_r), which grows linearly with problem size (at fixed density). Therefore, the total number of instantonic steps to reach equilibrium can grow at most linearly with system size [48]. The number of steps could even scale sub-linearly with the system size, because an instanton can connect critical points that differ by more than one unstable direction. We now want to translate this result into the actual physical time to reach a solution.
The time associated with each instanton (the instanton "width") is independent of the size of the instance and depends only on the parameters α and β in Eqs. (23), the rates of change of the x_s and x_l variables, respectively. This can be seen by considering the path-integral form of the topological action associated with Eqs. (23), where the symbol {Q, Ψ} denotes the topological term (summation over repeated indexes is understood), with B the vector of momenta conjugate to the bosonic variables x, and the vectors χ and χ̄ representing pairs of Faddeev-Popov ghosts and anti-ghosts, respectively (fermionic/Grassmann variables).
The Lagrangian of the system can be read from Eq. (30). By taking the second derivative of this Lagrangian with respect to the memory variables we obtain the frequency of the instanton, whose inverse is its time width [49]. Let us call this time T_inst,j for each instanton j. To this time we need to add the time, T_cr,j, the system spends on the initial critical point (local supersymmetric vacuum) before each instantonic jump. This time is also independent of the size of the problem and depends only on the degree of memory in the system, which is again dictated by the parameters α and β [11]. Let T_max = max_j (T_cr,j + T_inst,j) be the maximum time required to perform an instantonic jump in the phase space, including the time the system spends on the initial critical points. Again, this time does not depend on the size of the instance (size of the phase space), only on the parameters α and β. Putting all this together, the maximum physical time, T_phys, required by the system to reach the solution of a given 3-SAT problem of size n and density α_r is T_phys ≤ n(1 + 2α_r) T_max. We have then proved:
Proposition XI.1. Given solvable 3-SAT instances of n variables and fixed density α_r, the dynamics described by Eqs. (23) reach the solution of these instances in a physical time O(n^α), with α ≤ 1.
Remark. Although the physical time scales (sub-)linearly with problem size, its actual magnitude depends on T_max (the slope of the growth with respect to n), which in turn depends on the rates of change of the x_s and x_l variables.
In addition, in the presence of physical noise, anti-instantons appear in the dynamics. Since anti-instantons are gapped (exponentially suppressed) they may increase somewhat the degree of the polynomial, but cannot transform a polynomial scalability into an exponential one.
Note also that the (sub-)linear scalability obtained above does not necessarily apply to the numerical integration of Eqs. (23). The reason is that time discretization transforms continuous dynamics to an effective discrete map. For discrete maps topological supersymmetry is broken explicitly, namely the evolution operator does not commute with