Variational quantum support vector machine based on Γ matrix expansion and variational universal-quantum-state generator

We analyze a binary classification problem by using a support vector machine based on a variational quantum-circuit model. We propose to solve the linear equation of the support vector machine by using a Γ matrix expansion. In addition, it is shown that an arbitrary quantum state is prepared by optimizing a universal quantum circuit, representing an arbitrary element of U(2^N), based on the steepest-descent method. It may be regarded as a quantum generalization of the field-programmable gate array (FPGA).


Introduction
Quantum computation is one of the hottest topics in contemporary physics [1-3]. An efficient application of quantum computation is machine learning, which is called quantum machine learning [4-15]. A support vector machine is one of the most fundamental algorithms of machine learning [16-18], which classifies data into two classes by a hyperplane. The optimal hyperplane is determined by an associated linear equation F|ψ_in⟩ = |ψ_out⟩, where F and |ψ_out⟩ are given. A quantum support vector machine solves this linear equation on a quantum computer [10,13,19]. Usually, the linear equation is solved by the Harrow-Hassidim-Lloyd (HHL) algorithm [20]. However, this algorithm requires many quantum gates. Thus, the HHL algorithm is hard to execute on a near-term quantum computer. Actually, it has been verified experimentally only for two and three qubits [21-23]. In addition, it requires a unitary operator executing $e^{iFt}$, which is quite hard to implement.
The number of qubits in current quantum computers is restricted. Variational quantum algorithms, which use both quantum and classical computers, are appropriate for these small-qubit quantum computers. Various methods have been proposed, such as the Quantum Approximate Optimization Algorithm (QAOA) [24], the variational eigenvalue solver [25], quantum circuit learning [26] and quantum linear solvers [27,28]. In QAOA, we use wave functions with variational parameters, which are optimized by minimizing the expectation value of the Hamiltonian. In quantum circuit learning [26], a quantum circuit has variational parameters, which are optimized by minimizing a certain cost function. A quantum linear solver solves a linear equation by a variational ansatz [27,28]. The simplest optimization method is the steepest-descent method.
In this paper, we present a variational method for a quantum support vector machine by solving the associated linear equation based on variational quantum circuit learning. We propose a method to expand the matrix F in terms of the Γ matrices, which yields simple quantum circuits. We also propose a variational method to construct an arbitrary state by using a universal quantum circuit representing an arbitrary unitary matrix in U(2^N). We prepare internal variational parameters for the universal quantum circuit, which we optimize by minimizing a certain cost function. Our circuit is capable of determining the unitary transformation U satisfying U|ψ_initial⟩ = |ψ_final⟩ for arbitrary given states |ψ_initial⟩ and |ψ_final⟩. It will be a quantum generalization of the field-programmable gate array (FPGA), which can produce arbitrary outputs from arbitrary inputs.

Results
Support vector machine.
A support vector machine (SVM) is a computer algorithm that learns from examples to assign labels to objects. It is a typical method to solve a binary-classification problem [16]. The simplest example reads as follows. Suppose that there are red and blue points whose distributions are almost separated in two dimensions. We classify these data points into two classes by a line, as illustrated in Fig. 1.
In general, M data points are scattered in D dimensions, which we denote x_j, where 1 ≤ j ≤ M. The problem is to determine a hyperplane separating the data into two classes with the use of a support vector machine. We set ω·x_j + ω_0 > 0 for red points and ω·x_j + ω_0 < 0 for blue points. These conditions are implemented by introducing a function f(x) = sgn(ω·x + ω_0), which assigns f(x) = 1 to red points and f(x) = −1 to blue points. In order to determine ω_0 and ω for a given set of data x_j, we introduce real numbers α_j by
$$\omega = \sum_{j=1}^{M}\alpha_j x_j.$$
A support vector machine enables us to determine ω_0 and α_j by solving the linear equation
$$F\,(\omega_0,\alpha_1,\ldots,\alpha_M)^{T} = (0,y_1,\ldots,y_M)^{T}, \qquad (6)$$
where y_i = f(x_i) = ±1, and F is an (M+1) × (M+1) matrix given by
$$F = \begin{pmatrix} 0 & 1^{T} \\ 1 & K + \gamma^{-1}I \end{pmatrix}.$$
Here, $K_{ij} = x_i\cdot x_j$ is a kernel matrix, and γ is a certain fixed constant which assures the existence of the solution of the linear equation (6) even when the red and blue points are slightly inseparable. Note that γ → ∞ corresponds to the hard-margin condition. Details of the derivation of Eq. (6) are given in Method A.
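As a classical cross-check, the linear system (6) can be assembled and solved directly. The following is a minimal NumPy sketch assuming the linear kernel K_ij = x_i·x_j; the toy data and variable names are illustrative assumptions, not taken from the paper.

```python
# Build F = [[0, 1^T], [1, K + I/gamma]] and solve Eq. (6) classically.
import numpy as np

def build_F(X, gamma):
    """The (M+1) x (M+1) matrix F from data points X (rows are x_j)."""
    M = X.shape[0]
    K = X @ X.T                          # kernel matrix K_ij = x_i . x_j
    F = np.zeros((M + 1, M + 1))
    F[0, 1:] = 1.0
    F[1:, 0] = 1.0
    F[1:, 1:] = K + np.eye(M) / gamma    # gamma softens the margin
    return F

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(+2.0, 1.0, (5, 2)),    # "red" points
               rng.normal(-2.0, 1.0, (5, 2))])   # "blue" points
y = np.array([+1] * 5 + [-1] * 5)

sol = np.linalg.solve(build_F(X, gamma=1.0), np.concatenate([[0.0], y]))
omega0, alpha = sol[0], sol[1:]
omega = alpha @ X                        # omega = sum_j alpha_j x_j
print(np.sign(X @ omega + omega0))       # recovers the labels y
```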
Quantum linear solver based on Γ matrix expansion.
We solve the linear equation (6) on a quantum computer. In general, we solve a linear equation
$$F|\psi_{\rm in}\rangle = c|\psi_{\rm out}\rangle \qquad (9)$$
for an arbitrary given non-unitary matrix F and an arbitrary given state |ψ_out⟩. Here, the coefficient c is introduced to preserve the norm of the state, and it is given by
$$c = \left\| F|\psi_{\rm in}\rangle \right\|.$$
The HHL algorithm [20] is the most famous algorithm to solve this linear equation on a quantum computer. We first construct a Hermitian matrix by
$$H = \begin{pmatrix} 0 & F \\ F^{\dagger} & 0 \end{pmatrix}.$$
Then, a unitary matrix associated with F is uniquely obtained by $e^{iHt}$. Nevertheless, it requires many quantum gates. In addition, it is a nontrivial problem to implement $e^{iHt}$. Recently, variational methods have been proposed [27] to solve the linear equation (9). In one of these methods, the matrix F is expanded in terms of unitary matrices U_j as
$$F = \sum_j c_j U_j.$$
In general, a complicated quantum circuit is necessary to determine the coefficients c_j.
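As a sanity check of the dilation step, the following sketch builds H from a random non-Hermitian F (an illustrative assumption) and confirms it is Hermitian, so that $e^{iHt}$ is unitary.

```python
# Hermitian dilation H = [[0, F], [F^dagger, 0]] of a non-Hermitian F.
import numpy as np

rng = np.random.default_rng(1)
F = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Z = np.zeros((4, 4))
H = np.block([[Z, F], [F.conj().T, Z]])
assert np.allclose(H, H.conj().T)        # H is Hermitian by construction
```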
We start with a trial state |ψ̃_in⟩ to determine the state |ψ_in⟩. The application of each unitary matrix to this state is efficiently done by a quantum computer, $F|\tilde\psi_{\rm in}\rangle = \tilde c\,|\tilde\psi_{\rm out}\rangle$, where |ψ̃_out⟩ is an approximation of the given state |ψ_out⟩. We tune the trial state |ψ̃_in⟩ by a variational method so as to minimize the cost function [27]
$$E_{\rm cost} = 1 - \left|\langle\psi_{\rm out}|\tilde\psi_{\rm out}\rangle\right|^2,$$
which measures the similarity between the approximate state |ψ̃_out⟩ and the state |ψ_out⟩ in (9). We have 0 ≤ E_cost ≤ 1, where E_cost = 0 for the exact solution. The merit of this cost function is that the inner product is naturally calculated by a quantum computer.
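A one-line classical analogue of this cost for normalized state vectors (the helper name is ours); on a quantum device the overlap would instead come from a measurement:

```python
# E_cost = 1 - |<psi_out|psi~_out>|^2 for normalized state vectors.
import numpy as np

def e_cost(psi_out, psi_out_trial):
    return 1.0 - np.abs(np.vdot(psi_out, psi_out_trial)) ** 2
```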
Let the dimension of the matrix F be 2^N. It is enough to use N satisfying $2^{N-1} < D \le 2^N$ without loss of generality by adding trivial 2^N − D components to the linear equation. We propose to expand the matrix F by the gamma matrices Γ_j as
$$F = \sum_{j=1}^{4^N} c_j\Gamma_j,$$
with
$$\Gamma_j = \sigma_{\alpha_1}\otimes\sigma_{\alpha_2}\otimes\cdots\otimes\sigma_{\alpha_N},$$
where each α takes the values 0, x, y and z, with σ_0 the 2 × 2 identity matrix.
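A sketch of this expansion for general N (names are ours): it enumerates the 4^N tensor products of Pauli matrices and checks that F is reproduced exactly.

```python
# Gamma-matrix expansion F = sum_j c_j Gamma_j with c_j = Tr(Gamma_j F) / 2^N,
# where Gamma_j runs over all tensor products of {sigma_0, sigma_x, sigma_y, sigma_z}.
import itertools
import numpy as np

PAULI = {"0": np.eye(2, dtype=complex),
         "x": np.array([[0, 1], [1, 0]], dtype=complex),
         "y": np.array([[0, -1j], [1j, 0]], dtype=complex),
         "z": np.array([[1, 0], [0, -1]], dtype=complex)}

def gamma_matrices(N):
    for labels in itertools.product("0xyz", repeat=N):
        G = np.array([[1.0 + 0j]])
        for a in labels:
            G = np.kron(G, PAULI[a])
        yield labels, G

N = 2
F = np.random.default_rng(2).normal(size=(2 ** N, 2 ** N))
coeffs = {lab: np.trace(G @ F) / 2 ** N for lab, G in gamma_matrices(N)}

F_rec = sum(coeffs[lab] * G for lab, G in gamma_matrices(N))
assert np.allclose(F_rec, F)             # the expansion reproduces F exactly
```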
The merit of our method is that it is straightforward to determine c_j by the well-known formula
$$c_j = \frac{1}{2^N}\,\mathrm{Tr}\left[\Gamma_j F\right]. \qquad (17)$$
In order to construct a quantum circuit to calculate c_j, we express the matrix F by column vectors as
$$F = \left(|f_0\rangle, |f_1\rangle, \ldots, |f_{2^N-1}\rangle\right).$$
We have $(|f_{q-1}\rangle)_p = F_{pq}$, where the subscript p denotes the p-th component of |f_{q−1}⟩. Then c_j is given by
$$c_j = \frac{1}{2^N}\sum_q \langle q|\Gamma_j|f_q\rangle,$$
where the subscript q denotes the (q+1)-th component of Γ_j|f_q⟩. We have introduced the notation |q⟩ ≡ |n_1 n_2 ⋯ n_N⟩ with n_i = 0, 1, where q is the decimal representation of the binary number n_1 n_2 ⋯ n_N. The state |q⟩ is generated as follows. We prepare the NOT gate σ_x for the i-th qubit if n_i = 1. Using all these NOT gates we define
$$U_q = \prod_{i=1}^{N}\left(\sigma_x^{(i)}\right)^{n_i}.$$
We act it on the initial state |0⟩ ≡ |00⋯0⟩ and obtain $U_q|0\rangle = |q\rangle$. Next, we construct a unitary gate U_{f_q} generating |f_q⟩, $U_{f_q}|0\rangle = |f_q\rangle$. We will discuss how to prepare U_{f_q} by a quantum circuit later; see Eq. (28). By using these operators, c_j is expressed as
$$c_j = \frac{1}{2^N}\sum_q \langle 0|U_q^{\dagger}\,\Gamma_j\,U_{f_q}|0\rangle,$$
which can be executed by a quantum computer. We show explicit examples in Fig. 2.
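The column-vector form of c_j can be verified classically. In this sketch (names ours), the q-th component of Γ_j|f_q⟩ is summed over q and compared with the trace formula (17).

```python
# Check that sum_q <q|Gamma_j|f_q> reproduces Tr(Gamma_j F), where |f_q> is
# the q-th column of F (prepared by U_fq on actual hardware).
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
F = np.random.default_rng(3).normal(size=(2, 2))

def c_via_columns(Gamma, F):
    N = int(np.log2(F.shape[0]))
    total = 0j
    for q in range(F.shape[1]):
        f_q = F[:, q]                    # column vector |f_q>
        total += (Gamma @ f_q)[q]        # matrix element <q|Gamma|f_q>
    return total / 2 ** N

assert np.isclose(c_via_columns(sx, F), np.trace(sx @ F) / 2)
```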
Once we have c_j, the final state is obtained by applying Γ_j to |ψ̃_in⟩ and taking the sum over j, which leads to
$$F|\tilde\psi_{\rm in}\rangle = \sum_j c_j\Gamma_j|\tilde\psi_{\rm in}\rangle.$$
The implementation of the Γ matrix in a quantum circuit is straightforward, because the Γ matrix is composed of the Pauli matrices, as shown in Fig. 2. We may use the steepest-descent method to find the optimal trial state |ψ̃_in⟩ closest to the state |ψ_in⟩. We calculate the difference ΔE_cost of the cost function when we slightly change the trial state |ψ̃_in(t)⟩ at step t by the amount Δ|ψ̃_in(t)⟩. We explain how to construct |ψ̃_in(t)⟩ by a quantum circuit later; see Eq. (28). Then, we renew the state as
$$|\tilde\psi_{\rm in}(t+1)\rangle = |\tilde\psi_{\rm in}(t)\rangle - \eta_t\,\frac{\Delta E_{\rm cost}}{\Delta|\tilde\psi_{\rm in}(t)\rangle},$$
where we use an exponential function for the learning rate η_t,
$$\eta_t = \xi_1 e^{-\xi_2 t}.$$
We choose appropriate constants ξ_1 and ξ_2 for an efficient search of the optimal solution, whose explicit values are given in the caption of Fig. 2. We stop the renewal when the difference Δ|ψ̃_in(t)⟩ becomes sufficiently small, which gives the optimal solution of the linear equation (9).
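A classical emulation of this descent might look as follows; the finite-difference gradient and the numerical values are our assumptions, used only to illustrate the decaying step size $\eta_t = \xi_1 e^{-\xi_2 t}$.

```python
# Steepest descent on E_cost with eta_t = xi_1 * exp(-xi_2 * t), emulated
# classically on normalized state vectors.
import numpy as np

rng = np.random.default_rng(4)
F = rng.normal(size=(4, 4))
psi_out = rng.normal(size=4)
psi_out /= np.linalg.norm(psi_out)

def e_cost(psi_in):
    v = F @ psi_in
    v /= np.linalg.norm(v)               # trial |psi~_out>
    return 1.0 - np.abs(np.vdot(psi_out, v)) ** 2

xi1, xi2, eps = 0.5, 0.0005, 1e-6
psi = rng.normal(size=4)
psi /= np.linalg.norm(psi)
for t in range(3000):
    eta = xi1 * np.exp(-xi2 * t)         # exponentially decaying step size
    grad = np.array([(e_cost(psi + eps * e) - e_cost(psi - eps * e)) / (2 * eps)
                     for e in np.eye(4)])
    psi -= eta * grad
    psi /= np.linalg.norm(psi)           # keep the trial state normalized
print("final cost:", e_cost(psi))
```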
Variational universal-quantum-state generator.
In order to construct the trial state |ψ̃_in(t)⟩, it is necessary to prepare an arbitrary state |ψ⟩ by a quantum circuit. Equivalently, we need a unitary transformation U such that
$$U|0\rangle = |\psi\rangle. \qquad (28)$$
It is known that any unitary transformation is realized by a sequential application of the Hadamard, the π/4 phase-shift and the CNOT gates [29,30]. Indeed, an arbitrary unitary matrix is decomposable into a sequential application of quantum gates [29,30], each of which is systematically constructed as a universal quantum circuit [31-36]. Universal quantum circuits have so far been demonstrated experimentally for two and three qubits [37-40].
We may use a variational method to construct U satisfying Eq. (28). Quantum circuit learning is a variational method [26], where angle variables θ_i are used as variational parameters in a quantum circuit U, and the cost function is optimized by tuning θ_i. We propose to use quantum circuit learning for a universal quantum circuit. We show that an arbitrary state |ψ(θ_i)⟩ can be generated by tuning U(θ_i), starting from the initial state |0⟩, as
$$|\psi(\theta_i)\rangle = U(\theta_i)|0\rangle.$$
We adjust θ_i by minimizing the cost function
$$E_{\rm cost} = 1 - \left|\langle\psi|\psi(\theta_i)\rangle\right|^2,$$
which is the same as that of the variational quantum support vector machine. We present explicit examples of universal quantum circuits for one, two and three qubits in Method C.
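For a single qubit the idea can be emulated directly. Here (θ, φ) parametrize the prepared state up to a global phase, and a finite-difference descent drives E_cost to zero; the parametrization and values are our assumptions.

```python
# Variational state preparation for one qubit: tune (theta, phi) so that
# the prepared state approaches a random target up to a global phase.
import numpy as np

rng = np.random.default_rng(5)
target = rng.normal(size=2) + 1j * rng.normal(size=2)
target /= np.linalg.norm(target)

def prepared(theta, phi):
    return np.array([np.cos(theta / 2),
                     np.exp(1j * phi) * np.sin(theta / 2)])

def e_cost(p):
    return 1.0 - np.abs(np.vdot(target, prepared(*p))) ** 2

p, eta, eps = np.array([0.3, 0.3]), 0.2, 1e-6
for _ in range(1000):
    grad = np.array([(e_cost(p + eps * e) - e_cost(p - eps * e)) / (2 * eps)
                     for e in np.eye(2)])
    p -= eta * grad
print("residual cost:", e_cost(p))       # approaches 0
```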
We next consider the problem of finding a unitary transformation U_ini-fin which maps an arbitrary initial state |ψ_initial⟩ to an arbitrary final state |ψ_final⟩,
$$U_{\text{ini-fin}}|\psi_{\rm initial}\rangle = |\psi_{\rm final}\rangle. \qquad (31)$$
Since we can generate an arbitrary unitary matrix as in Eq. (28), it is possible to generate matrices U_ini and U_fin such that
$$U_{\rm ini}|0\rangle = |\psi_{\rm initial}\rangle,\qquad U_{\rm fin}|0\rangle = |\psi_{\rm final}\rangle.$$
Then, Eq. (31) is solved as
$$U_{\text{ini-fin}} = U_{\rm fin}U_{\rm ini}^{\dagger},$$
since
$$U_{\rm fin}U_{\rm ini}^{\dagger}|\psi_{\rm initial}\rangle = U_{\rm fin}|0\rangle = |\psi_{\rm final}\rangle.$$
An FPGA is a classical integrated circuit which can be programmed by a customer or a designer after manufacturing. An FPGA can execute any classical algorithm. On the other hand, our variational universal-quantum-state generator creates an arbitrary quantum state, which we program by using the variational parameters θ_i. In this sense, the above quantum circuit may be considered as a quantum generalization of the FPGA, i.e., a quantum FPGA (q-FPGA).
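The composition rule can be verified classically. In this sketch the variationally trained circuits are replaced by QR-based unitaries mapping |0⟩ to the given states; this construction is an illustrative stand-in, not the paper's circuit.

```python
# Verify U_ini-fin = U_fin U_ini^dagger on random states, with U_ini and
# U_fin built classically so that U|0> equals the given state.
import numpy as np

rng = np.random.default_rng(6)

def random_state(d):
    v = rng.normal(size=d) + 1j * rng.normal(size=d)
    return v / np.linalg.norm(v)

def unitary_from_zero(psi):
    """Some unitary U with U|0> = psi, via QR completion of psi to a basis."""
    d = len(psi)
    A = np.column_stack([psi, rng.normal(size=(d, d - 1))])
    Q, R = np.linalg.qr(A)
    Q[:, 0] *= R[0, 0]                   # undo the QR phase: now Q[:, 0] = psi
    return Q

psi_i, psi_f = random_state(4), random_state(4)
U_if = unitary_from_zero(psi_f) @ unitary_from_zero(psi_i).conj().T
assert np.allclose(U_if @ psi_i, psi_f)  # Eq. (31) is satisfied
```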
We show explicitly how the cost function evolves at each variational step for the two- and three-qubit universal quantum circuits in Fig. 3, where the initial and final states are generated randomly. We optimize 15 parameters θ_i for the two-qubit universal quantum circuit and 82 parameters θ_i for the three-qubit universal quantum circuit. We find that U_ini-fin is well determined by the variational method, as shown in Fig. 3.

Variational quantum support vector machine.
We demonstrate a binary classification problem in two dimensions based on the support vector machine. We prepare a data set where red points are distributed around (r cos Θ, r sin Θ) with variance r, while blue points are distributed around (−r cos Θ, −r sin Θ) with variance r. We assume the Gaussian normal distribution and choose Θ randomly. We note that there are some overlaps between the red and blue points, which corresponds to the soft-margin model.
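The data set described above can be generated as follows; the seed and the reading of "variance r" as the Gaussian variance (standard deviation √r) are our assumptions.

```python
# Two Gaussian clusters with variance r around antipodal centers at angle
# Theta, matching the 31 red / 32 blue points used in Fig. 1.
import numpy as np

rng = np.random.default_rng(7)
r = 2.0
Theta = rng.uniform(0.0, 2.0 * np.pi)
center = r * np.array([np.cos(Theta), np.sin(Theta)])
red = rng.normal(loc=center, scale=np.sqrt(r), size=(31, 2))
blue = rng.normal(loc=-center, scale=np.sqrt(r), size=(32, 2))
```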
As an example, we show in Fig. 1 the distribution of red and blue points together with the lines obtained by the variational method (cyan) and by the direct solution of (6) (magenta). They agree well with one another, and both lines separate the red and blue points well. We have prepared 31 red points and 32 blue points, and used six qubits.

Discussion
We have proposed that the matrix F is efficiently inputted into a quantum computer by using the Γ-matrix expansion method. There are many quantum algorithms that take a matrix as input, such as linear regression and principal component analysis. Our method will be applicable to these cases as well.
Although it is possible to obtain the exact solution of the linear equation by the HHL algorithm, it requires many gates. On the other hand, it is often hard to obtain the exact solution by variational methods, since trial functions may be trapped in a local minimum. However, this problem is not serious for machine learning, because it is more important to obtain an approximate solution efficiently than to obtain an exact solution by using many gates. Indeed, our optimized hyperplane separates the red and blue points well, as shown in Fig. 1(a).
In order to classify M data points, we need to prepare log_2 M qubits. It is hard to handle a large number of data points on current quantum computers. Recently, it has been shown that electric circuits may simulate universal quantum gates [41-43], based on the fact that the Kirchhoff law can be rewritten in the form of the Schrödinger equation [44]. Our variational algorithm can be simulated by using them.

Methods
A: Support vector machine.
A support vector machine is an algorithm for supervised learning [16-18]. We first prepare a set of training data, where each point is marked either in red or blue. Then, we determine a hyperplane separating the red and blue points. After learning, input data are classified into red or blue by comparing them with the hyperplane. The support vector machine maximizes the margin, which is the distance between the hyperplane and the data points. If the red and blue points are perfectly separated by the hyperplane, the problem is called a hard-margin problem [Fig. 4(a)]. Otherwise, it is called a soft-margin problem [Fig. 4(b)].
We consider the distance d_j between a data point x_j and the hyperplane ω·x + ω_0 = 0, given by
$$d_j = \frac{|\omega\cdot x_j + \omega_0|}{|\omega|}.$$
We define support vectors x as the points closest to the hyperplane. There is one such vector on each side of the hyperplane, as shown in Fig. 4(a). This is the origin of the name of the support vector machine. Without loss of generality, we set
$$|\omega\cdot x + \omega_0| = 1$$
for the support vectors, because the hyperplane lies at equal distance from the two closest data points and because it is possible to set the magnitude of |ω·x + ω_0| to be 1 by scaling ω and ω_0. Then, we maximize the distance
$$d = \frac{1}{|\omega|},$$
which is identical to minimizing |ω|. First, we consider the hard-margin problem, where the red and blue points are perfectly separable. All red points satisfy ω·x_j + ω_0 ≥ 1 and all blue points satisfy ω·x_j + ω_0 ≤ −1. We introduce variables y_j, where y_j = 1 for red points and y_j = −1 for blue points. Using them, the condition is rewritten as
$$y_j(\omega\cdot x_j + \omega_0) \ge 1$$
for each j. The problem is reduced to finding the minimum of |ω|^2 under the above inequalities. The optimization under inequality conditions is done by the Lagrange multiplier method with the Karush-Kuhn-Tucker condition [45]. It is expressed in terms of the Lagrangian
$$L = \frac{1}{2}|\omega|^2 - \sum_j\beta_j\left[y_j(\omega\cdot x_j + \omega_0) - 1\right],$$
where β_j are Lagrange multipliers ensuring the constraints.
For the soft-margin case, we cannot separate the two classes exactly. In order to treat this case, we introduce slack variables ξ_j satisfying
$$y_j(\omega\cdot x_j + \omega_0) \ge 1 - \xi_j,\qquad \xi_j \ge 0,$$
and redefine the cost function as
$$\frac{1}{2}|\omega|^2 + \frac{\gamma}{2}\sum_j\xi_j^2.$$
Here, γ = ∞ corresponds to the hard margin. The second term represents the penalty for those data points that have crossed over the hyperplane. The Lagrangian is modified as
$$L = \frac{1}{2}|\omega|^2 + \frac{\gamma}{2}\sum_j\xi_j^2 - \sum_j\beta_j\left[y_j(\omega\cdot x_j + \omega_0) - 1 + \xi_j\right] - \sum_j\nu_j\xi_j,$$
where ν_j are the Lagrange multipliers for the conditions ξ_j ≥ 0. The stationary points are determined by
$$\frac{\partial L}{\partial\omega} = 0,\qquad \frac{\partial L}{\partial\omega_0} = 0,\qquad \frac{\partial L}{\partial\xi_j} = 0,\qquad \frac{\partial L}{\partial\beta_j} = 0.$$
We may solve these equations to determine ω and ν_j as
$$\omega = \sum_j\beta_j y_j x_j$$
from (42), and
$$\nu_j = \gamma\xi_j - \beta_j$$
from (44). The Karush-Kuhn-Tucker condition ν_j ξ_j = 0 then gives ξ_j = β_j/γ whenever ξ_j ≠ 0. Inserting them into (45), we find
$$y_j\Big(\sum_i\beta_i y_i\,x_i\cdot x_j + \omega_0\Big) = 1 - \xi_j.$$
Since y_j^2 = 1, it is rewritten as
$$\sum_i\beta_i y_i\,x_i\cdot x_j + \omega_0 = y_j - y_j\xi_j.$$
Since β_j always appears in a pair with y_j, we introduce a new variable defined by
$$\alpha_j = \beta_j y_j,$$
and we define the kernel matrix K_ij as
$$K_{ij} = x_i\cdot x_j.$$
Then, ω_0 and α_j are obtained by solving the linear equations
$$\sum_j\alpha_j = 0,\qquad \omega_0 + \sum_j\left(K_{ij} + \gamma^{-1}\delta_{ij}\right)\alpha_j = y_i,$$
which are summarized as Eq. (6) in the main text. Finally, ω is determined by
$$\omega = \sum_j\alpha_j x_j.$$
Once the hyperplane is determined, we can classify new input data x into red if
$$\omega\cdot x + \omega_0 > 0,$$
and blue if
$$\omega\cdot x + \omega_0 < 0.$$
Thus, we obtain the hyperplane for binary classification.
We explicitly show how to calculate c_j in (17), based on the Γ matrix expansion, for one and two qubits.

One qubit:
We show an explicit example of the Γ-matrix expansion for one qubit.
The coefficient c_j in (17) is calculated as
$$c_j = \frac{1}{2}\left(\langle 0|\Gamma_j|f_0\rangle + \langle 1|\Gamma_j|f_1\rangle\right) = \frac{1}{2}\mathrm{Tr}\left[\Gamma_j F\right], \qquad (69)$$
with Γ_j = σ_0, σ_x, σ_y, σ_z. The one-qubit universal quantum circuit is constructed as in Eq. (70).

FIG. 1: (a) Binary classification of red and blue points based on a quantum support vector machine with soft margin. The magenta (cyan) line is obtained by the exact solution (the variational method). (b) Evolution of the cost function. The vertical axis is log_10 E_cost; the horizontal axis is the variational step number. We have used r = 2, ξ_1 = 0.001, ξ_2 = 0.0005 and γ = 1. We have run the simulations ten times.

FIG. 3: Evolution of the cost function for (a) two qubits and (b) three qubits. The vertical axis is log_10 E_cost; the horizontal axis is the number of variational steps. We use c_1 = 0.005 and c_2 = 0.005 for both the two- and three-qubit universal quantum circuits. We prepare random initial and final states, and we have run the simulations ten times.

FIG. 4: Illustration of the hyperplane and the support vectors. Two support vectors are marked by red and blue squares. (a) Hard margin, where red and blue points are separated perfectly, and (b) soft margin, where they are separated imperfectly.
One qubit is represented by a 2 × 2 matrix, for which the Γ matrices reduce to the Pauli matrices σ_0, σ_x, σ_y and σ_z.