## Main

Boolean satisfiability (k-SAT, $k\ge 3$; ref. 1) is the quintessential constraint-satisfaction problem, lying at the basis of many decision, scheduling, error-correction and computational applications. k-SAT is in NP (refs 1, 2, 3), that is, its solutions are efficiently (polynomial time) checkable, but no efficient (polynomial time) algorithms are known to compute those solutions. If such algorithms were found for k-SAT, all NP problems would be efficiently computable, because k-SAT is NP-complete (refs 2, 3).

In k-SAT we are given $N$ Boolean variables $\{x_1,\dots,x_N\}$, $x_i\in\{0,1\}$, and $M$ clauses (constraints), each clause being the disjunction (OR, denoted as $\vee$) of $k$ variables or their negations ($\bar{x}_i$). One has to find an assignment of the variables such that all clauses (collectively called a formula) are satisfied (TRUE=‘1’). When the number of constraints is small, it is easy to find solutions, whereas for too many constraints it is easy to decide that the formula is unsatisfiable (UNSAT). Deciding satisfiability in the ‘intermediate range’, however, can be very hard: the worst-case complexity of all known algorithms for k-SAT is exponential in $N$.
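To make the definition concrete, the following minimal sketch checks whether an assignment satisfies a formula and, for very small $N$, enumerates all solutions by brute force. The encoding of a clause as a tuple of signed 1-based integers (`+i` for $x_i$, `-i` for its negation) and all function names are assumptions of this illustration, not part of the original text.

```python
from itertools import product

# Assumed encoding: a clause is a tuple of signed 1-based variable indices,
# +i meaning x_i and -i meaning NOT x_i. A formula is a list of clauses.

def clause_satisfied(clause, assignment):
    """A clause (OR of literals) holds if at least one literal is TRUE."""
    return any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)

def formula_satisfied(formula, assignment):
    """A formula holds if every one of its clauses holds."""
    return all(clause_satisfied(clause, assignment) for clause in formula)

def brute_force_solutions(formula, n_vars):
    """Enumerate all 2^N assignments; feasible only for very small N."""
    return [bits for bits in product((False, True), repeat=n_vars)
            if formula_satisfied(formula, bits)]

# A small 3-SAT example with N = 3 variables and M = 3 clauses.
example_formula = [(1, 2, -3), (-1, 2, 3), (1, -2, 3)]
solutions = brute_force_solutions(example_formula, 3)
```

Checking a single assignment costs time linear in the formula size, whereas the enumeration visits all $2^N$ assignments, which illustrates the gap between verifying and finding solutions.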

Inspired by the mechanisms of information processing in biological systems, analog computing has received increasing interest from both the theoretical (refs 14, 15, 16) and engineering (refs 17, 18, 19, 20, 21) communities. Although the theoretical possibility of efficient computation using chaotic dynamical systems has been shown previously (ref. 15), nonlinear dynamical systems theory has not been exploited for NP-complete problems, even though k-SAT can be formulated as a continuous global optimization problem (ref. 19) and even cast as an analog dynamical system (refs 20, 21).

Here we present a continuous-time dynamical system for k-SAT, with a dynamics that is rather different from previous approaches. Let us introduce the continuous variables (ref. 19) $s_i\in[-1,1]$, such that $s_i=-1$ if the $i$th variable ($x_i$) is FALSE and $s_i=1$ if it is TRUE. We define $c_{mi}=1$ for the direct form ($x_i$), $c_{mi}=-1$ for the negated form ($\bar{x}_i$) and $c_{mi}=0$ for the absence of the $i$th variable from clause $m$. Defining the constraint function

$$K_m(\mathbf{s}) = 2^{-k}\prod_{i=1}^{N}\left(1-c_{mi}s_i\right)$$

corresponding to clause $m$, we have $K_m\in[0,1]$ and $K_m=0$ if and only if clause $m$ is satisfied. The goal would be to find a solution $\mathbf{s}^*$ with $s_i^*\in\{-1,1\}$ to $E(\mathbf{s}^*)=0$, where $E$ is the energy function $E(\mathbf{s})=\sum_{m=1}^{M}K_m(\mathbf{s})^2$. If such $\mathbf{s}^*$ exists, it will be a global minimum for $E$ and a solution to the k-SAT problem. However, finding $\mathbf{s}^*$ by a direct minimization of $E(\mathbf{s})$ will typically fail owing to non-solution attractors trapping the search dynamics. To avoid such traps, here we define a modified energy function $V(\mathbf{s},\mathbf{a})=\sum_{m=1}^{M}a_m K_m(\mathbf{s})^2$, using auxiliary variables $a_m\in(0,\infty)$ similar to Lagrange multipliers (refs 20, 21).

Let us denote by $\mathcal{H}_N$ the continuous domain $[-1,1]^N$. Its boundary is the $N$-hypercube $\mathcal{Q}_N$ with vertex set $\mathcal{V}_N=\{-1,1\}^N\subset\mathcal{H}_N$. The set of solutions for a given k-SAT formula, called the solution space, occupies a subset of $\mathcal{V}_N$. Solution clusters are formed by solutions that can be connected through single-variable flips, always staying within satisfying assignments (ref. 22). Clearly, $V\ge 0$ in $\Omega=\mathcal{H}_N\times(0,\infty)^M$, and $V(\mathbf{s},\mathbf{a})=0$ within $\Omega$ if and only if $\mathbf{s}$ is a k-SAT solution, for any $\mathbf{a}\in(0,\infty)^M$. We now introduce a continuous-time dynamical system on $\Omega$ through:

$$\frac{\mathrm{d}s_i}{\mathrm{d}t} = \left(-\nabla_{\mathbf{s}}V(\mathbf{s},\mathbf{a})\right)_i = \sum_{m=1}^{M} 2a_m c_{mi}K_{mi}(\mathbf{s})K_m(\mathbf{s}), \quad i=1,\dots,N \tag{1a}$$

$$\frac{\mathrm{d}a_m}{\mathrm{d}t} = a_m K_m(\mathbf{s}), \quad m=1,\dots,M \tag{1b}$$

where $\nabla_{\mathbf{s}}$ is the gradient operator with respect to $\mathbf{s}$ and $K_{mi}=K_m/(1-c_{mi}s_i)$. The initial conditions for $\mathbf{s}$ are arbitrary, $\mathbf{s}(0)\in\mathcal{H}_N$; however, for $\mathbf{a}$ they have to be strictly positive, $a_m(0)>0$ (for example, $a_m(0)=1$). The k-SAT solutions $\mathbf{s}^*$ are fixed points of (1), for any $\mathbf{a}\in(0,\infty)^M$. The k-SAT solution clusters are spanning piecewise compact, connected sets in $\mathcal{Q}_N$, and every point in them is a fixed point of (1) (Supplementary Section SA).

System (1) has a number of key properties (see Supplementary Information). First, the dynamics in $\mathbf{s}$ stays confined to $\mathcal{H}_N$. Second, the k-SAT solutions are attractive fixed points of (1). In particular, every point $\mathbf{s}$ from the orthant of a k-SAT solution $\mathbf{s}^*$ with the property $|\mathbf{s}|^2\ge N-1+(k-1)^2/(k+1)^2$ is guaranteed to flow into the attractor corresponding to $\mathbf{s}^*$. Third, there are no limit cycles. Fourth, for satisfiable formulae the only fixed point attractors of the dynamics are the global minima of $V$ with $V=0$. Note that in principle, the projection of the dynamics onto $\mathcal{H}_N$ could be stuck in some point $\mathbf{s}'$, while $\mathrm{d}\mathbf{a}/\mathrm{d}t\neq 0$ indefinitely. This does not happen here, as shown in Supplementary Section SE. Moreover, analytical arguments supported by simulations indicate that the trajectory will leave any domain that does not contain solutions; see the discussion in Supplementary Section SE1. Note that the constraint functions (hence their satisfiability) depend directly only on the location of the trajectory in $\mathcal{H}_N$, $K_m=K_m(\mathbf{s})$, and not on the auxiliary variables. The dynamics in the $\mathbf{a}$-space is simple expansion, and for this reason the features of the full phase space $\Omega$ lie within its projection onto $\mathcal{H}_N$. One can actually eliminate the auxiliary variables entirely from the equations by first solving (1b) to give $a_m(t)=a_m(0)\exp\left(\int_0^t K_m(\mathbf{s}(\tau))\,\mathrm{d}\tau\right)$ and then inserting it into (1a).
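As an illustration of system (1), the sketch below integrates the flow for a tiny 3-SAT formula. It assumes the constraint function has the product form $K_m(\mathbf{s})=2^{-k}\prod_i(1-c_{mi}s_i)$ and that $V=\sum_m a_m K_m^2$, so that $-\partial V/\partial s_i=\sum_m 2a_m c_{mi}K_{mi}K_m$. The forward-Euler stepping, the clipping of $\mathbf{s}$ to $[-1,1]$, the stopping rule (stop as soon as the sign pattern of $\mathbf{s}$ satisfies every clause) and all function names are choices made for this illustration only; they are not claimed to be the integration scheme used for the reported results.

```python
import math

def flow(c, s, a, k):
    """Right-hand sides of (1a) and (1b); c[m][i] is +1, -1 or 0."""
    ds = [0.0] * len(s)
    da = []
    for row, am in zip(c, a):
        factors = [1.0 - cmi * si for cmi, si in zip(row, s)]
        km = math.prod(factors) / 2 ** k
        da.append(am * km)  # (1b): da_m/dt = a_m K_m
        for i, cmi in enumerate(row):
            if cmi:
                # K_mi = K_m / (1 - c_mi s_i), written as a product over
                # the other factors to avoid dividing by zero.
                kmi = math.prod(f for j, f in enumerate(factors) if j != i) / 2 ** k
                ds[i] += 2.0 * am * cmi * kmi * km  # (1a)
    return ds, da

def satisfies(c, s):
    """True if the sign pattern of s satisfies every clause."""
    return all(any(cmi * si > 0 for cmi, si in zip(row, s)) for row in c)

def solve(c, s0, k=3, dt=0.01, max_steps=100_000):
    """Forward-Euler integration (an assumption of this sketch)."""
    s, a = list(s0), [1.0] * len(c)  # a_m(0) = 1 > 0
    for step in range(max_steps):
        if satisfies(c, s):
            return [1 if si > 0 else -1 for si in s], step * dt
        ds, da = flow(c, s, a, k)
        # Clipping is a numerical safeguard against Euler overshoot.
        s = [max(-1.0, min(1.0, si + dt * dsi)) for si, dsi in zip(s, ds)]
        a = [am + dt * dam for am, dam in zip(a, da)]
    return None, max_steps * dt

# (x1 v x2 v ~x3) ^ (~x1 v x2 v x3) ^ (x1 v ~x2 v x3)
c_example = [[1, 1, -1], [-1, 1, 1], [1, -1, 1]]
assignment, t_found = solve(c_example, s0=[-0.1, -0.2, 0.3])
```

The initial point is chosen in the orthant of a non-solution, so the trajectory has to cross an orthant boundary before the sign pattern becomes a satisfying assignment. For hard instances a fixed-step Euler scheme is unlikely to be adequate, because the trajectories can fluctuate strongly.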

Another fundamental feature of (1) is that it is deterministic: for a given formula $f$, any initial condition generates a unique trajectory, and any set from the domain $\mathcal{H}_N=[-1,1]^N$ has a unique pre-image arbitrarily back in time. Hence, the characteristics of the solution space are reflected in the properties of the invariant sets (ref. 7) of the dynamics (1) within the hypercube $\mathcal{H}_N$. The deterministic nature of (1) allows us to define basins of attraction of solution clusters by colouring every point in $\mathcal{H}_N$ according to which cluster the trajectory flows to, if started from there. These basins fill $\mathcal{H}_N$ up to a set of zero (Lebesgue) measure, which forms the basin boundary (ref. 7), from where the dynamics (by definition) cannot flow to any of the attractors.

A k-SAT formula $f$ can be represented as a hypergraph $G(f)$ (or equivalently, a factor graph) in which nodes are variables and hyperedges are clauses connecting the nodes/variables in the clause. Pure literals are those that participate in one or more clauses but always in the same form (direct or negated); hence, they can always be chosen such as to satisfy those clauses. The core of $G(f)$ is the subgraph left after sequentially removing all of the hyperedges having pure literals (ref. 23). For simple formulae (such as those without a core), the dynamics of (1) is laminar flow and the basin boundaries form smooth, non-fractal sets (Figs 1a,c and 2, top two rows). Adding more constraints develops a core, the spin equations (1a) become mutually coupled, and the trajectories may become chaotic (Fig. 1b, Supplementary Section SF and Fig. S8) and the basin boundaries fractal (refs 7, 8, 9; Figs 1d, 2 and Supplementary Fig. S4). Therefore, as the constraint density $\alpha=M/N$ is increased within predefined ensembles of formulae (random k-SAT, occupation problems, k-XORSAT and so on), a sharp change to chaotic behaviour is expected at a chaotic transition point $\alpha_\chi$, where a chaotic core appears with non-zero statistical weight in the ensemble as $N\to\infty$. As an example, let us consider 3-XORSAT.
In this case, owing to its inherently linear nature, it is actually better to work directly with the parity check equations as constraints, instead of their conjunctive normal form. The chaotic core here is a small finite hypergraph, and thus $\alpha_\chi$ coincides with the so-called dynamical transition point $\alpha_d$ computed exactly in ref. 24 (see Supplementary Section SG and Fig. S4). Note that a core can be non-chaotic; thus the existence of a core is only a necessary condition for the appearance of chaos, and in general the two transitions might not coincide.

On further increasing the number of constraints (within any formula ensemble), unsatisfiability appears at the threshold value $\alpha_s>\alpha_\chi$, beyond which almost all formulae are unsatisfiable (UNSAT regime; refs 11, 12, 22, 24, 25, 26, 27, 28). The closer $\alpha$ is to $\alpha_s$, the harder it is to find solutions, and beyond the so-called freezing transition point $\alpha_f<\alpha_s$ (the frozen regime) all known algorithms take exponentially long times or simply fail to find solutions (refs 11, 12). A variable is frozen if it takes on the same value for all solutions within a cluster, and a cluster is frozen if an extensive number of its variables are frozen. In the frozen regime all clusters are frozen and they are also far apart ($O(N)$ Hamming distance; refs 11, 12). For random 3-SAT (clauses chosen uniformly at random for fixed $\alpha$), the values of $\alpha_s$ and $\alpha_f$ are reported in refs 27 and 28; all known local search algorithms become exponential or fail beyond $\alpha=4.21$ (ref. 29), and survey-propagation-based algorithms (ref. 25) fail beyond $\alpha=4.25$ (ref. 28).

As the frozen regime is very thin in random 3-SAT, the so-called locked occupation problems (LOPs) have been introduced (ref. 12). In LOPs all clusters are formed by exactly one solution; hence, they are completely frozen and the frozen regime extends from the clustering (dynamical) transition point $\ell_d$ to the satisfiability threshold $\ell_s$, and thus it is very wide (ref. 12). An example LOP is random ‘+1-in-3-SAT’ (ref. 12), made of constraints that have no negated variables, where a constraint is satisfied only if exactly one of its variables is 1 (TRUE). In +1-in-3-SAT, beyond $\ell_d$ all known algorithms have exponential search times or fail to find solutions (here $\ell=3M/N$).
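The notion of a core used in this section (the clauses left after sequentially removing every clause that contains a pure literal) can be sketched in a few lines. The signed-integer clause encoding (`+i` for $x_i$, `-i` for its negation) and the function name `core` are assumptions of this illustration.

```python
def core(formula):
    """Return the clauses left after repeated pure-literal removal.

    A literal is pure if its negation occurs in no remaining clause.
    Every clause containing a pure literal is deleted, which may make
    further literals pure, so the step is repeated until nothing changes.
    """
    clauses = [tuple(clause) for clause in formula]
    while True:
        literals = {lit for clause in clauses for lit in clause}
        pure = {lit for lit in literals if -lit not in literals}
        remaining = [c for c in clauses if not pure.intersection(c)]
        if len(remaining) == len(clauses):
            return remaining
        clauses = remaining

# Every variable appears in both forms: nothing can be removed.
with_core = [(1, 2, -3), (-1, 2, 3), (1, -2, 3)]
# x3, x4 and x5 are pure, so every clause is removed: no core.
without_core = [(1, 2, 3), (-1, 2, 4), (-2, 4, 5)]
# Only x1 is pure; removing its clause leaves a two-clause core.
partial = [(1, 2, -3), (-2, 3, -4), (2, 4, -3)]
```

Under this definition, a formula with an empty core can be satisfied by fixing pure literals one after another, which is consistent with the simple (laminar) dynamics described for formulae without a core.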

As chaos is present for satisfiable formulae, that is, when system (1) has attracting fixed points, it is necessarily of transient type. Transient chaos (refs 4, 5, 6, 7) is ubiquitous in systems with many degrees of freedom, such as fluid turbulence (ref. 30). It appears as the result of homoclinic/heteroclinic intersections of the invariant manifolds of hyperbolic (unstable) fixed points of (1) lying within the basin boundary (refs 7, 8, 9), leading to complex (fractal) foliations of the phase space (see Supplementary Section SF). We observed the prevalence of transient chaos in the whole region $\alpha_\chi<\alpha<\alpha_s$ for all of the problem classes we studied. Interestingly, the velocity fluctuations of trajectories in the chaotic regime are qualitatively similar to those of fluid parcels in turbulent flows, as shown in Supplementary Section SK. Our findings indicate that chaotic behaviour may be a generic feature of algorithms searching for solutions in hard optimization problems, corroborating previous observations (ref. 10) using a heuristic algorithm based on iterated maps.

In the following we show results on random 3-SAT and +1-in-3-SAT formulae in the frozen regime; however, the same conclusions hold for other ensembles that we tested. To investigate the complexity of computation by the flow (1), we monitored the fraction of problems $p(t)$ not solved by continuous time $t$, as a function of $N$ and $\alpha$. Figure 3a,c shows that even in the frozen phase, the fraction of unsolved problems by time $t$ decays exponentially with $t$, that is, by a law $p(t)=r\,e^{-\lambda(N)t}$. The decay rate $\lambda(N)$ obeys $\lambda(N)=bN^{-\beta}$, with $\beta\approx1.6$ in both cases, see Fig. 3b,d. From these two equations, the continuous time $t(p,N)$ needed to solve a fixed $(1-p)$th fraction of random formulae (or to miss solving the $p$th fraction of them) is:

$$t(p,N)=b^{-1}N^{\beta}\ln\!\left(\frac{r}{p}\right) \tag{2}$$

indicating that the continuous time needed to find solutions scales as a power law with $N$. Equation (2) also implies power-law scaling for almost all hard instances in the limit $N\to\infty$ (Supplementary Section SH). The length in the hypercube $\mathcal{H}_N=[-1,1]^N$ of the corresponding continuous trajectories also scales as a power law with $N$ (Supplementary Fig. S7b and Section SJ). However, note that this does not mean that the algorithm itself is a polynomial-cost algorithm, as the energy function $V$ can have exponentially large fluctuations.

As the numerical integration happens on a digital machine, it approximates the continuous trajectory with discrete points. Monitoring the fraction of formulae left unsolved as a function of the number of discretization steps $n_{\mathrm{step}}$ in the frozen phase, we find exponential behaviour for $n_{\mathrm{step}}(p,N)$ (Supplementary Sections SI, SJ and Fig. S6). The difference between the continuous- and discrete-time complexities is due to the wildly fluctuating nature of the chaotic trajectories (see Fig. 1b and Methods) in the frozen phase. Compounding this, we also observe the appearance of the Wada property (refs 7, 8) in the basin boundaries (Fig. 4). A fractal basin boundary has the Wada property if its points are simultaneously on the boundary of at least three colours/basins. (An amusing method that creates such sets uses four Christmas ball ornaments; ref. 7.) Although the Wada property does not affect the true/mathematical analog trajectories, owing to numerical errors it may switch the numerical trajectories between the basins. As the clusters are far apart, the switched trajectory will flow towards another cluster into a practically opposing region of $\mathcal{H}_N$ until it may come close again to the basin boundary and so on, partially randomizing the trajectory in $\mathcal{H}_N$.
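The continuous-time scaling discussed in this section follows from combining an exponential decay of the unsolved fraction, $p(t)=r\,e^{-\lambda(N)t}$, with a power-law decay rate, $\lambda(N)=bN^{-\beta}$; solving for $t$ gives $t=\ln(r/p)/\lambda(N)$. The sketch below evaluates this relation. Only $\beta\approx1.6$ comes from the text; the values used for $b$ and $r$ are placeholders chosen for illustration, and the function names are hypothetical.

```python
import math

def decay_rate(n, b, beta=1.6):
    """lambda(N) = b * N**(-beta); beta ~ 1.6 is the value quoted in the text."""
    return b * n ** (-beta)

def time_to_solve(p, n, b, r, beta=1.6):
    """Continuous time at which the unsolved fraction has dropped to p.

    Obtained by inverting p = r * exp(-lambda(N) * t).
    """
    return math.log(r / p) / decay_rate(n, b, beta)

# Placeholder constants (assumptions, not fitted values from the paper).
B, R = 1.0, 1.0

t_100 = time_to_solve(p=0.5, n=100, b=B, r=R)
t_200 = time_to_solve(p=0.5, n=200, b=B, r=R)
ratio = t_200 / t_100  # equals 2**beta, independent of b, r and p
```

Doubling $N$ multiplies the continuous solving time by $2^{\beta}\approx 3.03$ whatever the values of $b$, $r$ and $p$, which is the power-law behaviour stated in the text. This concerns continuous time only; the number of discretization steps behaves differently, as noted above.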