Introduction

A fundamental tenet of classical computer science is the Church-Turing thesis, which asserts that any practically realizable computational device can be simulated by a universal computer known as the Turing machine1. However, this hypothesis implicitly relies on the laws of classical physics2 and was challenged by Feynman3 and others, who suggested that computational devices behaving according to quantum mechanics could be qualitatively more powerful than classical computers. A first proof of this conjecture was given in 1993 by Bernstein and Vazirani4. They showed that a quantum mechanical Turing machine is capable of simulating other quantum mechanical systems in polynomial time, an exponential improvement in computational power over the classical Turing machine. Their proof did not give an actual fast quantum algorithm, but in the following year, Peter Shor came up with his famous factoring algorithm5, which solves the integer factorization problem in polynomial time, exponentially faster than any known classical algorithm. The essential part of this algorithm is a solution of the order-finding problem, which can be formulated as a hidden subgroup problem (HSP)6. Solving a hidden subgroup problem resembles finding the period of a given periodic function. The structure of the function's periodicity may be so complicated that it cannot be easily determined by classical means. The importance of the HSP is that various instances (e.g., Pell's equation, the principal ideal problem, unit group computing) and variants like the hidden shift problem and hidden nonlinear structures encompass most of the quantum algorithms found so far that are exponentially faster than their classical counterparts7. This relatively narrow range of existing fast quantum algorithms shows the urgent need for different types of quantum algorithms that will make other classes of problems accessible to efficient solutions.

Here we describe such a quantum algorithm that does not fall into the framework of HSP. It solves two number-theoretical problems in polynomial time, i.e., testing the square-freeness and computing the square-free part of a given integer. Compared to the known classical algorithms, this provides an exponential increase in computational efficiency. While these problems are related to the factorization problem solved by Shor, our algorithm relies on a different approach. Furthermore, while Shor's algorithm is probabilistic, the algorithm presented here is exact and its computational complexity is lower.

We consider a positive integer N with its unique prime factor decomposition $N = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k}$ (the $p_i$ are primes). N is called square-free if no prime factor occurs more than once, i.e., $\alpha_i = 0$ or 1 for all i = 1, 2, …, k. An arbitrary positive integer can always be written as

$$N = r s^2,$$

where r is square-free and this square-free decomposition is unique. Thus, r and s² are usually called the square-free part and the square part of N, respectively. The square-freeness testing problem corresponds to determining whether s = 1. An additional problem consists in finding the square-free part r of N. These problems were listed as two open problems8, since no efficient algorithm is currently known for either of them. In fact, they may be no easier than the general problem of integer factorization9. It was found10 that the factorization of N = pq² (p, q both prime) is almost as hard as the factorization of N = pq. This fact has been used in a proposed digital signature scheme called TSH-ESIGN, which is reported to be more efficient than representative signature schemes such as elliptic-curve and RSA-based signatures10. A concrete estimate of the lower bound of the classical Boolean circuit complexity11 showed that testing square-freeness by unbounded fan-in circuits of bounded depth requires superpolynomial size. On the other hand, the square-free part problem appears to be representative of a larger class of computational problems. As an example, computing the ring of integers of an algebraic number field, one of the main tasks of computational algebraic number theory, reduces to it in deterministic polynomial time12,13.
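To make the definitions concrete, the following classical sketch computes the decomposition N = r s² by trial division. It is a brute-force illustration whose running time grows exponentially with the number of digits of N, not the algorithm proposed in this paper; the function names are illustrative assumptions.

```python
def square_free_decomposition(n: int) -> tuple[int, int]:
    """Return (r, s) with n == r * s**2 and r square-free (trial division)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    r, s = 1, 1
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:          # strip every power of p from n
            n //= p
            exponent += 1
        s *= p ** (exponent // 2)  # pairs of p go into the square part
        if exponent % 2:           # one unpaired p goes into the square-free part
            r *= p
        p += 1
    r *= n                         # the remainder is 1 or a single prime
    return r, s


def is_square_free(n: int) -> bool:
    """N is square-free exactly when s = 1."""
    return square_free_decomposition(n)[1] == 1
```

For instance, `square_free_decomposition(45)` separates 45 = 5 · 3² into r = 5 and s = 3.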

We now describe an efficient, exact quantum algorithm that solves both problems. It uses the Gauss sum32,33,34, an important object that has been extensively investigated in mathematics (see supplementary information). Throughout this paper, we assume that N is an odd integer (the case of even numbers can be trivially reduced to this case). The Gauss sum is defined as

$$G(a, \chi_N) = \sum_{m=0}^{N-1} \chi_N(m)\, e^{2\pi i a m / N}, \tag{1}$$

where a is an integer and the function χN(m) represents the Jacobi symbol of m relative to N14.

The evaluation of the Gauss sum is closely related to the square-freeness of N. Let (x, y) denote the greatest common divisor (GCD) of x and y. If N is square-free, then we have

$$|G(a, \chi_N)| = \begin{cases} \sqrt{N}, & (a, N) = 1, \\ 0, & (a, N) > 1. \end{cases} \tag{2}$$

Conversely15, if N is not square-free

$$G(a, \chi_N) = 0 \quad \text{if } (a, N) = 1. \tag{3}$$

This remarkable fact suggests a dichotomy criterion for testing square-freeness; it represents the cornerstone of our algorithm.
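The dichotomy can be checked numerically for small N. The sketch below assumes the standard form of the Gauss sum, i.e., the sum over m of χN(m) exp(2πi a m / N), and evaluates it by brute force; the helper names `jacobi`, `gauss_sum` and `nonzero_arguments` are illustrative and do not come from the paper.

```python
import cmath
import math


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd positive n; 0 when gcd(a, n) > 1."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a:
        while a % 2 == 0:              # pull out factors of 2: (2/n) rule
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                    # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def gauss_sum(a: int, n: int) -> complex:
    """Brute-force Gauss sum with the Jacobi symbol as character."""
    return sum(jacobi(m, n) * cmath.exp(2j * cmath.pi * a * m / n)
               for m in range(n))


def nonzero_arguments(n: int, tol: float = 1e-9) -> list[int]:
    """Residues a for which the Gauss sum does not vanish (up to tol)."""
    return [a for a in range(n) if abs(gauss_sum(a, n)) > tol]


if __name__ == "__main__":
    for n in (15, 45):                 # 15 is square-free, 45 = 5 * 3**2 is not
        gcds = sorted({math.gcd(a, n) for a in nonzero_arguments(n)})
        print(n, "nonzero Gauss sums occur for gcd(a, N) in", gcds)
```

For the square-free example N = 15 the nonvanishing sums occur only for a coprime to N, whereas for N = 45 they occur only for a sharing a factor with N.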

Results

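We present the algorithm first for the relatively simple case where N = pq² (p, q both prime) and subsequently generalize it. The algorithm consists of two parts, as illustrated in Fig. 1. In the first part, we generate the state

$$|\varphi\rangle = \frac{1}{\sqrt{\phi(N)}} \sum_{\substack{1 \le m < N \\ (m, N) = 1}} \chi(m)\, |m\rangle, \tag{4}$$
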
where ϕ(N) in the normalization coefficient is Euler's function (the number of positive integers smaller than N that are coprime to N). van Dam and Seroussi16 proposed a general method for preparing such a superposition state. They gave the example of computing the Legendre symbol, which is the special case of the Jacobi symbol for prime N. They also computed the Jacobi symbol for the case when the factorization of N is known. In our case, the factors of N are not known. We therefore adopt another technique for computing the Jacobi symbol17, which we discuss in the following. The second part of the algorithm is to apply the quantum Fourier transform (QFT) to |φ〉. The resulting state encodes the factors p and q of N, which can be retrieved by performing measurements on the qubits.

Figure 1

Outline of the quantum circuit for computing the square-free part for N = pq².

The procedure denoted as Ω in the text consists of two main parts. In the first part, we generate the state |φ〉; in the second part, we apply the quantum Fourier transform (QFT) to it. Single lines represent qubits and boxes represent operations. Time runs from left to right. The transformations U1 and U2 are defined by Eq. (5) and Eq. (6). The meters M1 and M2 represent the measurements. The double lines coming from M1 carry classical bits; the algorithm continues only if register B collapses to 1.

Now we discuss the details of the algorithm. Set $n = \lceil \log N \rceil$, the smallest integer for which $2^n \ge N$. We need two main registers A and B, both initialized to $|0\rangle^{\otimes n}$. Additional registers needed for storing auxiliary variables and constants are not represented explicitly for simplicity. The first part starts with a uniform superposition of the states from 1 to N − 1, which is prepared by an (N − 1)-dimensional Fourier transform on register A and a subsequent addition of 1

$$|0\rangle_A |0\rangle_B \rightarrow \frac{1}{\sqrt{N-1}} \sum_{m=1}^{N-1} |m\rangle_A |0\rangle_B.$$

Note that this Fourier transform is of order N − 1, and it is known18 that the quantum fast Fourier transform can be made exact for arbitrary orders. Next we compute the greatest common divisor of m and N into register B

$$U_1: \; |m\rangle_A |0\rangle_B \rightarrow |m\rangle_A |(m, N)\rangle_B. \tag{5}$$

Classically, the GCD problem can be solved efficiently by the Euclidean algorithm in quadratic time. To avoid the complicated division arithmetic of the Euclidean algorithm, we prefer to adopt the extended Euclidean algorithm19. The extended Euclidean algorithm can be directly generalized to a quantum GCD algorithm that operates on a superposition state with the same computational complexity (see supplementary information for the quantum network construction).

We then take a measurement M1 of register B. If the result is not 1, then it must be a nontrivial factor of N (p, q, pq or q²) and the algorithm has already succeeded. However, it is highly likely that we do not obtain such a result and the algorithm continues. This is because the probability of obtaining (m, N) = 1 is ϕ(N)/(N − 1) = q(p − 1)(q − 1)/(pq² − 1), which asymptotically approaches 1 for sufficiently large p and q. If M1 results in 1, we get

$$\frac{1}{\sqrt{\phi(N)}} \sum_{(m, N) = 1} |m\rangle_A |1\rangle_B.$$

The next step is to obtain the state |φ〉 as given in (4), i.e., we apply the following unitary operation to register A

$$U_2: \; |m\rangle \rightarrow \chi(m)\, |m\rangle, \tag{6}$$

where χ(m) is 1 or −1 by the definition of the Jacobi symbol and register B is omitted. The key part of U2 is to compute the Jacobi symbol χ(m) for all m with (m, N) = 1. Classically, the Jacobi symbol can be computed efficiently by many algorithms. There exists20 a binary algorithm which has the advantage of lower complexity and easier implementation on a binary computer. The binary algorithm can be seen as a variant of the extended Euclidean algorithm and hence can also be extended to a quantum algorithm (see supplementary information for the quantum network construction).
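A classical sketch of a binary Jacobi-symbol computation, which after an initial reduction uses only parity tests, shifts, comparisons and subtractions, is given below. It is a textbook-style variant written for illustration; it is not taken from ref. 20 or from the supplementary information, and the name `jacobi_binary` is an assumption.

```python
def jacobi_binary(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd positive n, computed with shifts and subtractions."""
    if n <= 0 or n & 1 == 0:
        raise ValueError("n must be odd and positive")
    a %= n                      # initial reduction only; the loop itself is division-free
    t = 1
    while a != 0:
        while a & 1 == 0:       # remove factors of 2, using (2/n) = -1 iff n = 3, 5 mod 8
            a >>= 1
            if n & 7 in (3, 5):
                t = -t
        if a < n:               # quadratic reciprocity for odd a, n
            a, n = n, a
            if a & 3 == 3 and n & 3 == 3:
                t = -t
        a = (a - n) >> 1        # (a/n) = ((a - n)/n) = (2/n) * (((a - n)/2)/n)
        if n & 7 in (3, 5):
            t = -t
    return t if n == 1 else 0   # n now holds gcd(a, n) of the inputs
```

The loop mirrors the binary GCD recursion: when it terminates, `n` holds the greatest common divisor of the inputs, so the symbol is nonzero only if that divisor is 1.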

As the last step of the algorithm, we apply a Fourier transform to |φ〉 and obtain

$$\frac{1}{\sqrt{N \phi(N)}} \sum_{k=0}^{N-1} \left[ \sum_{(m, N) = 1} \chi(m)\, e^{2\pi i m k / N} \right] |k\rangle.$$

The amplitude of |k〉 is thus proportional to the Gauss sum evaluated at a = k. According to the properties (2) and (3) of the Gauss sum, all amplitudes vanish unless k shares a nontrivial common factor with N. If we perform a measurement M2 on the register, it always collapses to a state |k0〉 whose GCD with N is a non-trivial factor of N. It therefore yields the complete decomposition of N.
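The effect of the final Fourier transform can be simulated classically for a small example. The sketch below builds the amplitudes χ(m)/√ϕ(N) on the residues coprime to N, applies a discrete Fourier transform of order N by brute force and returns the outcome probabilities of the measurement M2. It is an exponential-cost classical emulation for illustration only; the function names and the example N = 45 are assumptions.

```python
import cmath
import math


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd positive n; 0 when gcd(a, n) > 1."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def m2_distribution(n: int, tol: float = 1e-12) -> dict[int, float]:
    """Probabilities of the outcomes k after Fourier-transforming the state of Eq. (4)."""
    units = [m for m in range(1, n) if math.gcd(m, n) == 1]
    norm = math.sqrt(len(units) * n)          # sqrt(phi(N) * N)
    probabilities = {}
    for k in range(n):
        amplitude = sum(jacobi(m, n) * cmath.exp(2j * cmath.pi * m * k / n)
                        for m in units) / norm
        p = abs(amplitude) ** 2
        if p > tol:                           # keep only outcomes that can occur
            probabilities[k] = p
    return probabilities


if __name__ == "__main__":
    n = 45                                    # p = 5, q = 3
    for k, p in sorted(m2_distribution(n).items()):
        print(f"k = {k:2d}  probability = {p:.4f}  gcd(k, N) = {math.gcd(k, n)}")
```

For N = 45 every outcome with nonzero probability shares a factor with N, so a single GCD computation on the measured value reveals a divisor.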

We now determine the computational complexity of this algorithm. All the transformations involved in the algorithm, including the extended Euclidean algorithm for the GCD and the Jacobi symbol and the QFT, require O((log N)²) elementary gate operations6. Thus the algorithm has only polynomial-time complexity.

For a general N with possibly many distinct prime factors (square-freeness of N is unknown), the procedure outlined above may not work. However, it can be generalized to include this case and the generalized algorithm remains simple and efficient. We refer to the algorithm described above as Ω and discuss now the generalized algorithm, which includes Ω as a subroutine.

As discussed above, the algorithm Ω includes two measurements, M1 and M2. With a certain probability, M1 yields a nontrivial factor of N. If this does not happen, we proceed to the second measurement M2. Two outcomes are possible at M2 due to the dichotomy property of the Gauss sum (2, 3): we obtain (i) a non-trivial factor of N if N is not square-free, or (ii) a result coprime to N, which signifies that N is definitely square-free. As a result, no matter whether Ω ends at M1 or M2, it either yields a non-trivial factor (say c) of N or determines that N is square-free. In the latter case, we have succeeded already and the algorithm finishes. In the former case, if the two parts c and N/c share a common factor d = (c, N/c), we know that d² is a factor of the square part s² of N. We can thus split the problem of finding the square-free part r of N into two smaller problems: finding the square-free parts of c/d and N/(cd). Because c/d and N/(cd) are coprime, the solutions of these subproblems give the corresponding parts of N as

$$R(N) = R(c/d)\, R(N/(cd)), \qquad S(N) = d^2\, S(c/d)\, S(N/(cd)).$$

Here, R(·) and S(·) represent the square-free part and the square part of their argument, respectively. Clearly, this procedure can be iterated until all branches have determined that the arguments are square-free. Figure 2 illustrates this recursive procedure.
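The recursion can be sketched classically as follows. The quantum subroutine Ω is replaced by a hypothetical brute-force stand-in, `omega_stand_in`, which returns a nontrivial factor when its argument is not square-free and `None` otherwise. The combination rule used here, R(N) = R(c/d) · R(N/(cd)) and S(N) = d² · S(c/d) · S(N/(cd)), relies on the fact that c/d and N/(cd) are coprime. All names are illustrative assumptions.

```python
from math import gcd
from typing import Callable, Optional


def omega_stand_in(n: int) -> Optional[int]:
    """Classical placeholder for the subroutine: a factor of n, or None if n is square-free."""
    p = 2
    while p * p <= n:
        if n % (p * p) == 0:
            return p
        p += 1
    return None


def square_free_split(
    n: int, find_factor: Callable[[int], Optional[int]] = omega_stand_in
) -> tuple[int, int]:
    """Return (R(n), S(n)): the square-free part and the square part of n."""
    if n == 1:
        return 1, 1
    c = find_factor(n)
    if c is None:                     # this branch terminates: n is square-free
        return n, 1
    d = gcd(c, n // c)                # d**2 divides the square part of n
    r1, s1 = square_free_split(c // d, find_factor)
    r2, s2 = square_free_split(n // (c * d), find_factor)
    return r1 * r2, d * d * s1 * s2   # valid because c/d and n/(cd) are coprime
```

Passing a different `find_factor` callable allows the same recursion to be driven by any routine with the same contract, for example one that returns composite factors.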

Figure 2

Schematic flow chart of the recursive quantum algorithm for computing the square-free part of an arbitrary odd integer N.

(a) Possible outcomes of applying the algorithm Ω to an arbitrary odd integer N: either it returns a factor c or it establishes that N is a square-free number with the square-free part r = N. If a factor c > 1 is returned and N is found not to be square-free, then the problem is converted into two smaller sub-problems for c/d and N/(cd), where d is the greatest common divisor of c and N/c. This serves as the subroutine of the recursive quantum algorithm. (b) Recursive algorithm for a general N. Different colors designate the two different outcomes of applying the subroutine Ω. Red denotes that the number is square-free, so this branch terminates. Blue denotes the other outcome; in this case, the algorithm proceeds to the next step of the recursion. The Ω operation needs to be performed at most log N times to solve this problem.

The execution time of the extended algorithm reaches a maximum when each execution of Ω yields just one factor, but the number of repetitions is still bounded by O(log N). Each execution of the subroutine Ω requires at most O((log N)²) steps. The worst-case complexity of the extended algorithm is therefore O((log N)³). In fact, a better estimate of the time until the algorithm succeeds is available. By virtue of the observation that M2 yields the square part with high probability, calculations show that the algorithm finishes with high probability in just O((log N)²(log log N)²) steps (see Methods).

Discussion

Classically, finding the square-free part of an integer is believed to be very difficult. It was argued10 that the best method known for its solution is through factorization. The fastest known classical algorithm for factorization is the number field sieve21, which requires O(exp(c(log N)^(1/3)(log log N)^(2/3))) steps. Thus the quantum algorithm presented here offers an exponential speed-up over the classical algorithm. A feasible alternative to our algorithm would be to use Shor's algorithm to obtain the complete decomposition of N, also in polynomial time. Application of Shor's algorithm yields, with some probability, two divisors of N in time O((log N)² log log N log log log N)6. Like our algorithm, Shor's algorithm would thus also be applied repeatedly, with the number of iterations bounded by O(log N). The overall computational complexity using Shor's algorithm would be O((log N)³ log log N log log log N). We further remark that achieving complete factorization through Shor's algorithm raises additional subtleties. A necessary part of complete factorization is primality testing; however, Shor's algorithm fails to recognize a prime number with probability 1, which increases the algorithmic complexity30,31. Figure 3 compares the computational costs of the three algorithms described above, clearly showing the increase in computational efficiency achieved by the algorithm presented here.
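The three scaling laws quoted above can be compared numerically with a short sketch. It evaluates only the leading terms, drops all constant prefactors and assumes natural logarithms; the number field sieve constant c = (64/9)^(1/3) is the commonly quoted heuristic value and is an assumption here, as are the function names. The numbers indicate relative growth, not actual running times.

```python
import math

NFS_CONSTANT = (64 / 9) ** (1 / 3)   # assumed heuristic constant c for the number field sieve


def cost_square_free(ln_n: float) -> float:
    """(log N)^2 (log log N)^2: high-probability cost of the algorithm described here."""
    return ln_n ** 2 * math.log(ln_n) ** 2


def cost_shor(ln_n: float) -> float:
    """(log N)^3 log log N log log log N: repeated application of Shor's algorithm."""
    return ln_n ** 3 * math.log(ln_n) * math.log(math.log(ln_n))


def cost_nfs(ln_n: float) -> float:
    """exp(c (log N)^(1/3) (log log N)^(2/3)): number field sieve."""
    return math.exp(NFS_CONSTANT * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3))


if __name__ == "__main__":
    for digits in (50, 100, 200, 400):
        ln_n = digits * math.log(10)          # log N for a number with this many digits
        print(digits, f"{cost_square_free(ln_n):.3e}",
              f"{cost_shor(ln_n):.3e}", f"{cost_nfs(ln_n):.3e}")
```

Under these assumptions, the ratio between the Shor-based estimate and the estimate for the present algorithm reduces to log N · log log log N / log log N, which grows almost linearly with the number of digits.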

Figure 3

Comparison between the computational costs of the three algorithms discussed in the text.

Both quantum algorithms offer an exponential speedup over the classical methods. For numbers with hundreds of digits, our algorithm is almost two orders of magnitude faster than Shor's algorithm.

Our algorithm relies on the mathematical properties of the Gauss sums. The possibility of using the periodicity properties of Gauss sums for factorization was suggested earlier22,23 and the feasibility of this approach was demonstrated in various physical systems including nuclear magnetic resonance24,25,26, cold atoms27 and superconducting circuits28. However, these schemes did not use the specific properties of quantum mechanical systems. They can be implemented in classical as well as in quantum systems and the scaling properties are therefore not superior to other classical algorithms32,33. In contrast, the algorithm that we have described in this paper relies on quantum superpositions, is both efficient and exact in solving the square-free part computation problem, and even demonstrates advantages over Shor's approach. In Shor's algorithm, the major cost comes from the modular exponentiation operation, while Gauss sums can be generated through O((log N)²) modular square operations. In our algorithm, we have noticed that Gauss sum evaluations are closely related to the factorization of N. While we have not found such an algorithm so far, it may thus be possible to develop a quantum algorithm on the basis of Gauss sums that solves integer factorization.

Methods

Realization of U1

U1 computes the greatest common divisor of m and N. Classically, the GCD problem can be solved efficiently with the well-known extended Euclidean algorithm. There is a variant of this algorithm, called the binary GCD algorithm, which can be performed more conveniently on a binary computer. We adopt this method here and construct a quantum network that performs the binary GCD algorithm on a quantum superposition state (see supplementary information for details).
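For reference, a classical version of the binary GCD algorithm is sketched below. It uses only parity tests, shifts, comparisons and subtractions, which is what makes it convenient for binary hardware. This is the standard textbook form, not the quantum network of the supplementary information, and the name `binary_gcd` is an assumption.

```python
def binary_gcd(a: int, b: int) -> int:
    """Greatest common divisor of two non-negative integers without division."""
    if a < 0 or b < 0:
        raise ValueError("arguments must be non-negative")
    if a == 0:
        return b
    if b == 0:
        return a
    shift = 0
    while (a | b) & 1 == 0:     # factor out the common power of two
        a >>= 1
        b >>= 1
        shift += 1
    while a & 1 == 0:           # make a odd; remaining twos in a are not common
        a >>= 1
    while b:
        while b & 1 == 0:       # twos in b cannot be shared with the odd a
            b >>= 1
        if a > b:               # keep a <= b so that the subtraction stays non-negative
            a, b = b, a
        b -= a                  # gcd(a, b) = gcd(a, b - a); the difference is even
    return a << shift
```

In the algorithm described in the text, N is odd, so the common power of two is always trivial and only the subtract-and-shift loop is exercised.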

Realization of U2

In Fig. 1, the operation U2 is realized through the following steps

where we use the phase kickback trick16 and the identity $e^{i\pi(\chi(m)-1)/2} = \chi(m)$, which holds for χ(m) = ±1. The computation of the Jacobi symbol can be implemented by the binary Jacobi algorithm (see supplementary information for details).

Complexity estimation of the algorithm

In the following, we discuss the complexity of the algorithm for a general N. To do this, we slightly modify the algorithm presented in the text. Our analysis is based on the following finding: if at the measurement M2 we obtain a result whose common divisor with N is a square number, then the common divisor must be the square part s² of N; the probability of this case is larger than (ϕ(N)/N)² (see supplementary information for proofs). Hence the algorithm can be altered such that if any branch proceeds to M2 and results in a square number, that branch terminates.

Let P(·) denote the probability of obtaining the square part of its argument by application of Ω, and let pk denote the probability that the algorithm succeeds at the k-th iteration step. Obviously

where P(N) ≥ (ϕ(N)/N)². If Ω does not succeed at the first step and we have obtained c, N/c and d = (c, N/c), then

Here, the second inequality is valid because of a basic property of the Euler function

Analogously, we will have

Therefore, after k steps, the probability that the algorithm still does not succeed is

According to the inequality (Theorem 8.8.7 of ref. 29)

where γ = 0.5772… is the Euler-Mascheroni constant, and for large N, ϕ(N)/N > 1/(2 log log N).

So we have

When k = O((log log N)²), Q → 0. This means that the algorithm does not need k = O(log N) iterations, but finishes with high probability in O((log log N)²) steps.
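The last step can be sketched as follows, under the assumption that every iteration succeeds with probability at least ε = (ϕ(N)/N)² > 1/(4 (log log N)²). This is consistent with the bounds stated above but is not a reproduction of the intermediate equations; the symbols ε and λ are introduced here for illustration.

```latex
Q \;\le\; (1 - \varepsilon)^{k} \;\le\; e^{-k\varepsilon},
\qquad
\varepsilon = \frac{1}{4(\log\log N)^{2}}
\quad\Longrightarrow\quad
Q \;\le\; e^{-\lambda/4}
\quad\text{for } k = \lambda\,(\log\log N)^{2}.
```

The failure probability can therefore be made as small as desired by increasing the constant λ, while k remains of order (log log N)².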