On speeding up factoring with quantum SAT solvers

There have been several efforts to apply quantum SAT solving methods to factor large integers. While these methods may provide insight into quantum SAT solving, to date they have not led to a convincing path to integer factorization that is competitive with the best known classical method, the Number Field Sieve. Many of the techniques tried involved directly encoding multiplication to SAT or an equivalent NP-hard problem and looking for satisfying assignments of the variables representing the prime factors. The main challenge in these cases is that, to compete with the Number Field Sieve, the quantum SAT solver would need to be superpolynomially faster than classical SAT solvers. In this paper the use of SAT solvers is restricted to a smaller task related to factoring: finding smooth numbers, which is an essential step of the Number Field Sieve. We present a SAT circuit that can be given to quantum SAT solvers such as annealers in order to perform this step of factoring. If quantum SAT solvers achieve any asymptotic speedup over classical brute-force search for smooth numbers, then our factoring algorithm is faster than the classical NFS.

Scientific Reports | (2020) 10:15022 | https://doi.org/10.1038/s41598-020-71654-y www.nature.com/scientificreports/

Contributions of this paper. We show in general that a few approaches for smoothness detection with SAT circuits are not enough to speed up the NFS. Moreover, we run benchmarks and find that a classical SAT solver does not appear to pick up on any patterns that allow one to claim otherwise for these approaches. Most importantly, we present a circuit that, when used as an NFS subroutine, yields an algorithm with the same asymptotic runtime as the classical NFS, and faster if quantum SAT solvers achieve any non-trivial speedup. In the optimistic case that a quantum SAT solver achieves a full quadratic speedup, the algorithm would be as fast as the low-resource quantum algorithm, while not necessarily requiring a fault-tolerant quantum computer to operate.
Nomenclature. We refer to the algorithm of Buhler et al. 2 as the classical NFS, to the algorithm in Bernstein, Biasse and Mosca 4 as the low-resource quantum NFS, and to our algorithm that uses ECM for smoothness detection as circuit-NFS.

Organization. We start by reviewing previous work related to SAT solving as well as factoring, including a factoring algorithm that encodes a multiplication circuit as a SAT instance whose solution represents the prime factors. We also recall work done to speed up the Number Field Sieve using Grover's search on a quantum computer with $(\log N)^{2/3+o(1)}$ logical qubits, where N is the number being factored. Then, we present a few encodings of smoothness detection into circuits. We show in general that the SAT instances belonging to smoothness-detection circuits which have the prime exponents or the factors of the number being tested for smoothness as variables cannot be solved fast enough to speed up factoring. Most importantly, we present a circuit implementing the ECM, analyze it and make statements about the solver runtime relative to the speedup obtained by a quantum SAT solver. Lastly, we discuss the results of this paper as well as future work.

Previous work
The work by Mosca and Verschoor 1 investigates the use of SAT solvers for factoring semi-primes, that is, numbers with only two prime factors of similar size. It encodes a multiplication circuit into a SAT instance, fixing the output as the number being factored and making the multiplicands variable. Therefore, solving such a SAT instance is equivalent to factoring the semi-prime. The paper finds no evidence that this approach to factoring via classical SAT solvers provides any advantage, or even matches the classical NFS. It also points out that quantum SAT solvers are not expected to do much better if factoring is encoded as a SAT instance in this direct fashion. Further work by Benjamin et al. 7 exhibits a quantum algorithm for solving 3-SAT, an instance of SAT where all clauses contain 3 literals. The numerical simulations for small systems presented in the paper indicate that the algorithm has performance comparable to that of one of the best known classical algorithms. This is of importance to our paper since any significant speedup for SAT solving implies a speedup for factoring as well.
The general number field sieve (NFS) improves on the special number field sieve 8 by removing any restrictions on the numbers that can be factored. The NFS algorithm is conjectured to factor any integer N in $L^{\sqrt[3]{64/9}+o(1)}$ time, where $L = \exp((\log N)^{1/3}(\log\log N)^{2/3})$. Here we give a brief overview of the algorithm, highlighting the details relevant to the present paper. Note that the NFS is explained and analyzed in thorough detail by Buhler et al. 2. For a simplified overview, see 4, Section 2, whose notation we follow.
The algorithm takes in an integer N to be factored and parameters d, y, u, with $y \in L^{\beta+o(1)}$, $u \in L^{\epsilon+o(1)}$ and $d \in (\delta+o(1))(\log N)^{1/3}(\log\log N)^{-1/3}$, where β, δ, ε are parameters to be optimized in the analysis. Further, define … From the above, one can see that d represents the degree of the polynomial f and that u is, in a sense, a bound on the search space U. Moreover, as explained below, y is taken to be the smoothness bound on F(a, b). N is assumed to be odd.
The NFS attempts to find a suitable set S ⊆ U such that the product over (a, b) ∈ S of the corresponding elements is a square on the rational side, Eq. (1), and on the algebraic side, Eq. (2). In order to find an appropriate S, the algorithm looks for a T ⊆ U such that F(a, b) is y-smooth for every (a, b) ∈ T. After T is found, a linear dependence relation between the exponent vectors (reduced modulo 2) of F(a, b) for (a, b) ∈ T reveals a suitable set S ⊆ T such that both Eq. (1) and Eq. (2) are satisfied. The two main bottlenecks of the NFS are (i) finding T and (ii) finding the linear dependence relation. In the classical NFS, (i) takes $L^{2\epsilon+o(1)}$ time, since that is the size of U, and (ii) takes $L^{2\beta+o(1)}$ with Wiedemann's algorithm 9. By balancing both, one obtains a total runtime of $L^{1.923}$. The low-resource algorithm does (i) using Grover's search and yields a better runtime, namely $L^{1.387}$. Note as well that, if (ii) is assumed to take $L^{2.5\beta+o(1)}$, as considered by Bernstein 10, the classical NFS ends up with runtime $L^{1.976}$ and the low-resource algorithm with $L^{1.456}$. For completeness, we repeat the derivations in Corollary 5 and Corollary 6.

Circuits for smoothness detection
The circuit SAT problem asks whether there exists an input for a given Boolean circuit, encoded as a SAT instance, such that the output will be TRUE. For a satisfiable circuit SAT formula in v variables one can easily find a solution with v queries to a decision oracle for SAT. In practice, the best known algorithms for deciding SAT implicitly also provide a solution and thus the repeated applications of a SAT decision algorithm are not necessary. Using binary encoding for integers we construct circuits that encode a predicate on numbers, so that solving the corresponding SAT instance is a search for numbers satisfying the predicate. From here on we refer to this process as "solving the circuit".
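The search-to-decision remark can be illustrated with a minimal Python sketch. The oracle here is a brute-force stand-in for a real SAT decision procedure, and the names `decide` and `find_solution`, as well as the representation of a circuit as a Python callable on a tuple of bits, are assumptions made for illustration only.

```python
from itertools import product

def decide(circuit, v, fixed=()):
    """Stand-in decision oracle: can `fixed` be extended to an input that makes the circuit TRUE?"""
    free = v - len(fixed)
    return any(circuit(fixed + rest) for rest in product((0, 1), repeat=free))

def find_solution(circuit, v):
    """Recover a satisfying input bit by bit with v oracle queries.

    The circuit is assumed to be satisfiable; otherwise the result is meaningless.
    """
    fixed = ()
    for _ in range(v):
        # If some solution extends fixed + (0,), keep 0; otherwise the bit must be 1.
        fixed += (0,) if decide(circuit, v, fixed + (0,)) else (1,)
    return fixed

# Toy predicate on three input bits: (b0 XOR b1) AND b2.
toy = lambda bits: (bits[0] ^ bits[1]) == 1 and bits[2] == 1
solution = find_solution(toy, 3)
```

Each loop iteration fixes one input bit, so a satisfiable instance in v variables costs exactly v oracle calls in this sketch.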
Instead of using Grover's search to look for (a, b) ∈ T as in the low-resource quantum NFS 4 , we let a SAT solver find these using the encoded circuit. In particular, we encode the predicate "F(a, b) is a y-smooth number" on the input pair (a, b), while we assume the conditions |a| ≤ u and 0 < b ≤ u are enforced by the input encoding. Similar to the low-resource quantum NFS, we assume that the case gcd{a, b} > 1 is handled by post-processing.
A naive algorithm for circuit SAT simply evaluates the full circuit for every possible input until a one is found at the output. For a circuit with v input variables and size g this strategy has runtime $O(2^v g)$, which is the runtime we assume for solving circuits. Given that circuit SAT is an NP-complete problem, it is widely believed that no efficient algorithm exists. However, in practice modern SAT solvers perform well on solving large SAT instances for certain problems, so that the conjectured runtime requires some confirmation in the form of benchmark results.
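As a rough illustration of the naive strategy, the following Python sketch enumerates all $2^v$ inputs and evaluates a predicate on each. The predicate used here ("the encoded integer is at least 50 and 3-smooth") is a toy stand-in chosen for illustration, not the predicate on F(a, b); the helper names are hypothetical, and smoothness is tested by plain trial division.

```python
from itertools import product

def naive_circuit_sat(circuit, v):
    """Evaluate the circuit on every v-bit input; return the first satisfying input or None."""
    for bits in product((0, 1), repeat=v):
        if circuit(bits):
            return bits
    return None

def bits_to_int(bits):
    """Decode a tuple of bits (most significant bit first) into an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value

def is_smooth(n, y):
    """Trial-division test: True if every prime factor of n is at most y."""
    if n < 1:
        return False
    for p in range(2, y + 1):
        while n % p == 0:  # composite p never divides here: its prime factors are already gone
            n //= p
    return n == 1

# Toy predicate on a 7-bit input: the encoded number is >= 50 and 3-smooth.
predicate = lambda bits: bits_to_int(bits) >= 50 and is_smooth(bits_to_int(bits), 3)
found = naive_circuit_sat(predicate, 7)
```

The loop performs up to $2^v$ circuit evaluations, each costing time proportional to the circuit size g, which matches the $O(2^v g)$ bound assumed in the text.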
In this section we analyze a few natural circuits for implementing the required predicate and prove that these approaches do not offer any improvement over the classical NFS. We show that, in general, circuits encoding all primes $p_i \le y$ or the prime exponents $e_i$ cannot be solved efficiently enough. On the other hand, solving a circuit implementing the Elliptic Curve Method (ECM) 6 is shown to achieve runtimes comparable to that of the classical NFS. We recall a few results important for the analysis.

Lemma 1
3. Taking logs of the expression above, the dominant term is log log N.
where ω(n) is the number of prime divisors of n without multiplicity.
Proof. All the prime factors of F(a, b) are at least 2, so $\Omega(F(a, b)) \le \log_2 F(a, b)$, where Ω(n) is the number of prime divisors of n with multiplicity. Since ω(n) ≤ Ω(n), the result follows from Lemma 1.
Circuit with variable exponents. A natural idea is to hard-code all the primes $p_i \le y$ into the circuit (see Fig. 1), and let a, b and $e_i$ be the variables, where $1 \le i \le \pi(y)$, and π(x) counts the number of primes ≤ x. A satisfying assignment finds the exponent $e_i$ for each prime $p_i$ that forms the factorization of F(a, b): $F(a, b) = \prod_{i=1}^{\pi(y)} p_i^{e_i}$.

The circuit provides no improvement over the classical NFS. Indeed, the number of bits necessary to represent $\vec{e} = (e_1, e_2, \ldots, e_{\pi(y)})$ is lower-bounded by $\pi(y) \in y^{1+o(1)}$, which implies that the time to solve the circuit is at least exponential in $L^{\beta+o(1)}$, much larger than the overall NFS complexity. This also proves the following.

Proposition 1 Any circuit that has $\vec{e}$ as variable input to be found by an exponential-time SAT solver is not sufficient to speed up integer factorization.
Despite the theoretical result above, one might hope that SAT solvers are able to pick up on specific patterns of this circuit and exploit them to improve the overall runtime. In order to investigate this possibility, we encoded this circuit into a satisfiability instance and ran benchmarks using MapleCOMSPS 11 .
A circuit is generated for each number N, with all other parameters generated as described before and by setting o(1) = 0. In order to keep the circuit from growing too large, intermediate values in the computation of $p_i^{e_i}$ are truncated to $\log_2 F(u, u)$ bits and multiplication is computed by schoolbook multiplication. Despite these techniques the SAT instances can grow large: on the tested range they contain up to eighty thousand variables after simplification. This is partially explained by the fact that F (both the bound F(u, u) and the found values F(a, b)) is much larger than N for these small values of N. With the parameters used, the desired F(u, u) < N will only occur for values of N of 140 bits and greater. All code for generating circuits (including tests for correctness), benchmarks and measurements is made available online 12.

Figure 2 shows the benchmarking results. For each $N \le 2^{18}$ we measured the median time of solving the same instance many times; for larger N we report the solver runtime directly. Each measured runtime is multiplied by y(N).
Since there are many (a, b) that satisfy the predicate, we could run the solver many times to find multiple (a, b) ∈ T. Closer inspection of our results indicates that the SAT solver does indeed find many valid pairs. If collisions are a problem, we could arbitrarily partition the search space by putting restrictions on the input and have multiple solvers work in parallel. Alternatively we could encode the negation of found solutions as a new clause.

Given the asymptotic behaviour displayed in Fig. 2, the optimizations of the SAT solver do not seem large enough to provide a speedup to the NFS. Although this is not a statement about quantum SAT solvers, it is one more argument supporting the lack of speedup attributable to the SAT solver learning specific structures of this problem.

Circuit with variable factors. Exploiting the small number of prime factors of F(a, b) following from Lemma 2, one can hope to turn the factors into variables (see Fig. 3). At the end, the factors $q_i$ must multiply to F(a, b). Note that the $q_i$ need not be prime, but only ≤ y. This restriction could be enforced at no cost by allowing at most $\lceil \log_2 y \rceil$ bits to encode each $q_i$ or by an efficient test on each input.
However, this strategy is too costly: the number of variables in the circuit is $2\log u + \sum_i \log q_i > \log \prod_i q_i$. In the very best case, where the $q_i$ are encoded with exactly the necessary number of bits, namely $\log F(a, b)$, Lemma 1 implies that solving the circuit takes $L_N[2/3, \cdot]$ time. This also implies the following.

Proposition 2
If $\prod_i q_i = F(a, b)$, any circuit that has the $q_i$ as variables to be found by an exponential-time SAT solver is not sufficient to speed up integer factorization.

ECM circuit. The Elliptic Curve Method (ECM) is a factoring algorithm devised by Lenstra 6. One of its key features is that its runtime is conjectured to depend on the smallest prime factor of the number being factored, making it very suitable for smoothness detection. We create a circuit that executes repeated runs of the ECM to obtain prime factors $p_i \le y$ of F(a, b). For each prime obtained, repeated divisions are performed in order to eliminate that prime from the factorization. Figure 4 shows a simplified circuit. There are implicit operations such as checking if the obtained prime is ≤ y and only performing division when the remainder is zero. RAND represents a random choice of parameters for the ECM, more specifically a, x, y, using the notation by Lenstra 6, Section (2.5). Note that, for a given SAT instance, the random generator seeds are fixed.
This circuit meets the desirable time complexity by decreasing the number of variables significantly. Indeed, the only variables are a, b, so the search space is just U. The following theorem establishes the size and probability of success of the ECM circuit.  (1) . It is uncertain that the found non-trivial divisor is the smallest prime dividing n, but in practical circumstances this will often be the case 6 , (2.10). For our purposes the divisors are allowed to be any factor of F(a, b), as long as it is ≤ y.
Hence, let the circuit repeat the ECM step $O((\log N)^{2/3}(\log\log N)^{4/3})$ times and perform $O((\log N)^{2/3}(\log\log N)^{1/3})$ conditional divisions of an obtained prime, since this is the maximum power a prime factor can have in the factorization of F(a, b), by Lemma 1. Each ECM run has a different runtime since the least prime p changes and n is subsequently divided by the discovered factors. For upper-bound estimations, however, one can fix p = y and n = N. In order to estimate the size of the ECM block, one can multiply the time and space complexity. The former is K(y)M(N) and the latter is estimated to be O(log N). This yields a total circuit size of $L_N[1/6, \sqrt{2\beta/3} + o(1)]$.
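The structure of the ECM circuit (a fixed number of ECM runs, each followed by conditional divisions) can be mimicked in software. The following is a minimal sketch under several assumptions that are not taken from our circuit generator: affine Weierstrass arithmetic, a simple stage 1 with scalar b1!, default values for `b1` and `runs` chosen arbitrarily, and hypothetical function names. As in the circuit, the random seed is fixed and any divisor ≤ y is accepted, prime or not. A `False` result does not prove that n is not smooth, since the ECM may simply fail to find a factor.

```python
import math
import random

class FoundDivisor(Exception):
    """Raised when a modular inversion fails; carries gcd(value, n) > 1."""
    def __init__(self, divisor):
        super().__init__(divisor)
        self.divisor = divisor

def _inverse(value, n):
    value %= n
    g = math.gcd(value, n)
    if g != 1:
        raise FoundDivisor(g)  # g == n is possible (trivial divisor)
    return pow(value, -1, n)   # modular inverse, Python 3.8+

def _add(p, q, a, n):
    """Affine addition on y^2 = x^3 + a*x + b over Z/nZ; None is the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % n == 0:
        return None
    if p == q:
        slope = (3 * x1 * x1 + a) * _inverse(2 * y1, n) % n
    else:
        slope = (y2 - y1) * _inverse(x2 - x1, n) % n
    x3 = (slope * slope - x1 - x2) % n
    y3 = (slope * (x1 - x3) - y1) % n
    return (x3, y3)

def _multiply(k, p, a, n):
    """Double-and-add scalar multiplication k * p."""
    result = None
    while k:
        if k & 1:
            result = _add(result, p, a, n)
        p = _add(p, p, a, n)
        k >>= 1
    return result

def ecm_trial(n, b1, rng):
    """One ECM run on n; returns a divisor d with 1 < d < n, or None."""
    # RAND block: random curve parameter a and point (x, y); b is implicit.
    a, x, y = (rng.randrange(n) for _ in range(3))
    point = (x, y)
    try:
        for k in range(2, b1 + 1):  # overall scalar is b1!
            point = _multiply(k, point, a, n)
    except FoundDivisor as found:
        return found.divisor if found.divisor < n else None
    return None

def is_smooth_ecm(n, y, b1=50, runs=200, seed=0):
    """Try to certify that n >= 1 is y-smooth (y >= 2) by dividing out ECM-found divisors <= y."""
    rng = random.Random(seed)  # fixed seed, like the hard-wired RAND blocks
    remaining = n
    for _ in range(runs):
        if remaining <= y:
            return True  # every prime factor of remaining is <= remaining <= y
        d = ecm_trial(remaining, b1, rng)
        if d is not None and d <= y:
            while remaining % d == 0:  # conditional divisions
                remaining //= d
    return remaining <= y
```

Because only true divisors bounded by y are ever divided out, a `True` answer is always correct; the number of runs plays the role of the repetition count in the circuit and governs the probability of success.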
In order to analyze the runtime of solving the ECM circuit to find smooth F(a, b), we need the following.

Definition 1
If a search space E has size #E, an algorithm that is able to search through E within time $O(\#E^{1/\gamma})$ is said to achieve a γ-speedup.
For instance, Grover's search achieves a 2-speedup. The following establishes a generalization of the runtime analysis by Bernstein, Biasse and Mosca 4 .

Theorem 4
If an algorithm A achieves a γ-speedup, for γ > 0, and the linear algebra step in the NFS is assumed to take $L^{2\beta+o(1)}$, the NFS can use A to run in time $L^{\sqrt[3]{\frac{32(\gamma+1)}{9\gamma^2}}+o(1)}$.
The following two corollaries are restatements of the results for the low-resource algorithm 4. The final runtime of circuit-NFS depends on the runtime of the SAT solver used. Figure 5 shows the exponent α in the final runtime $L^{\alpha+o(1)}$ of circuit-NFS achieved if the SAT solver used achieves a γ-speedup, that is, solves a circuit with v variables in $2^{v/\gamma+o(1)}$ time.
The following results portray the two extreme scenarios highlighted in Fig. 5: a classical solver with $2^{v+o(1)}$ runtime versus an ideal quantum SAT solver that achieves a $2^{v/2+o(1)}$ runtime. The naive circuit SAT algorithm applied to the ECM circuit achieves runtime $O(2^{2\log_2 u} L_N[1/6, \cdot]) = L^{\sqrt[3]{64/9}+o(1)}$, corresponding to γ = 1. Note that we do not expect γ > 2 since γ = 2 has been proved optimal for a quantum computer 13 in a black-box context.
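As a numerical sanity check of the two extremes, the sketch below evaluates the exponent of Theorem 4, read here as $\sqrt[3]{32(\gamma+1)/(9\gamma^2)}$. This reading is an interpretation of the theorem statement; it reproduces the values 1.923 (γ = 1) and 1.387 (γ = 2) quoted for the classical and the low-resource algorithm. The function name is hypothetical.

```python
def circuit_nfs_exponent(gamma):
    """Exponent alpha in L^(alpha + o(1)), assuming alpha = cbrt(32 (gamma + 1) / (9 gamma^2))."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return (32 * (gamma + 1) / (9 * gamma ** 2)) ** (1 / 3)

# Tabulate alpha between the classical (gamma = 1) and ideal quantum (gamma = 2) extremes.
for gamma in (1.0, 1.25, 1.5, 1.75, 2.0):
    print(f"gamma = {gamma:.2f}: alpha = {circuit_nfs_exponent(gamma):.3f}")
```

Under this reading the exponent decreases monotonically in γ on the interval from 1 to 2, so any γ > 1 gives an exponent strictly below that of the classical NFS.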

(6)
Pr(x ≥ ω (F(a, b)

Theorem 7 is not an improvement on the classical NFS, but it shows that the circuit-NFS approach is asymptotically at least as good. Under the assumption that quantum annealers can achieve the aforementioned 2-speedup in solving SAT circuits, one can obtain the same asymptotic runtime as the low-resource quantum algorithm. However, this does not require a fault-tolerant quantum computer capable of running Grover's algorithm.
We emphasize that the speedup is computed over the naive circuit SAT algorithm. A standard translation of the circuit to CNF-SAT results in a SAT instance of size $L_N[1/6, \cdot]$ and a superpolynomial speedup over exponential-time CNF-SAT solvers would be required for speeding up factoring. Given the highly structured nature of the resulting SAT instance this might be feasible. Alternative solutions to avoid the superpolynomial overhead, such as direct translations from the ECM to quantum annealer instances, are left as an open question for future work.
It is harder to make a statement about the qubit requirement of circuit-NFS. Instead of SAT, one can reduce to other NP-hard problems like QUBO for more direct application of D-Wave's quantum annealer. If the smoothness detection circuit could be simplified and written as an instance of QUBO in terms of the variables a, b only, that would total $2\log u \in (\log N)^{1/3+o(1)}$ qubits. However, simplification is not trivial and does not seem to come without overhead, given our preliminary tests. It is more likely that intermediate wires of the circuit would also have to be QUBO variables, increasing the qubit requirement up to the full circuit size $L_N[1/6, \sqrt{2\beta/3} + o(1)]$. Therefore it remains an open question how many annealing qubits circuit-NFS requires. On the other hand, annealing qubits are currently produced in much higher quantity than other types of qubits, suggesting the possibility that circuit-NFS could be implemented sooner than the low-resource quantum NFS.

Conclusion
A potential speedup to integer factorization comes from replacing the search for smooth numbers in the NFS by finding those numbers using a SAT solver. This requires solving a circuit that detects if F(a, b) is smooth upon input a and b. Two natural circuits for that task are the circuit with variable exponents of Fig. 1 which explicitly lists all primes that can be factors and the circuit with variable factors of Fig. 3 which relaxes the requirement that these factors are prime. Both have too many input wires for any exponential-time SAT solver to provide any asymptotic speedup over brute-force search.
Despite the exponential upper bound on the runtime of SAT solvers, practical solvers are known to perform well on certain problems by picking up on patterns in the problem instances. One could hope that a speedup over the theoretical upper bound is therefore achieved in practice on these particular circuits, although this speedup would have to be superpolynomial in order to result in more efficient integer factorization. Benchmarks on the variable exponents circuit suggest that no such speedup is realized in practice.
The circuit-NFS algorithm is specialized to the smoothness detection problem in the sense that the ECM performs well for finding small factors. Our algorithm has at least the same asymptotic runtime as the classical NFS. Measurements of solving smoothness detection circuits however indicate that there is a massive overhead to this approach. Any speedup in SAT solving (be it quantum or classical) needs to make up for this overhead before resulting in a speedup for factoring. Still, if the overhead is only constant then any γ-speedup will eventually be sufficient. Given a quantum annealer that solves SAT instances with any γ-speedup ( γ > 1 ) over classical search, circuit-NFS performs asymptotically better than the classical NFS. If a full quadratic speedup is attained, circuit-NFS achieves the asymptotic time complexity of the low-resource quantum NFS, while perhaps not requiring a fault-tolerant quantum computer (depending on the quantum SAT solving device).
Open problems remain, such as benchmarking circuit-NFS on the ECM circuit and estimating its quantum resource requirements.