Introduction

Information has become a key resource owing to the advance and expansion of digitalization and control1. However, programming explicit algorithms with good performance may become unfeasible due to the vertiginous growth in (i) the amount of available information with which classical algorithms have to deal2, and (ii) the inherent difficulty of finding efficient algorithms for specific problems3. Both factors limit our current capabilities in information processing tasks. Two alternative approaches that may overcome these limitations are machine learning and quantum computing4,5.

On the one hand, machine learning (ML) is a branch of artificial intelligence that uses statistical techniques to give computers the ability to progressively learn from input data without being explicitly programmed. ML is based on the generation of a hypothesis that is optimized from sample inputs and re-used to generate new predictions6. Thus, the algorithms can learn from data and overcome static program instructions by making data-driven decisions from sample inputs. Among the distinct hypothesis models, neural networks7 are widely used due to the blooming of deep learning8,9. Artificial neural networks are organized in layers, and each layer learns new behavior patterns10. The computational power of artificial neural networks relies on this architecture, in which neurons in each layer feed signals into other neurons, allowing parallel-processed computing11,12. In this manner, several calculations can be performed at the same time, and large computational problems can often be divided into smaller ones that are then solved simultaneously. The versatility of neural networks in classifying complex data relies on the universal approximation theorem, which grants artificial neural networks the capacity to approximate any function13. As a result, they span a broad range of applications such as speech14 or object recognition15, spam filters16, vehicle control17,18, trajectory prediction19, decision making20, game-playing21, or automated trading systems22.

On the other hand, quantum computing represents a different paradigm from classical information processing. It is based on an alternative information encoding that exploits the quantum properties of matter: systems comprising several quantum bits (qubits) are exponentially hard to simulate with classical devices5, showing that quantum systems do not seem to obey the Church–Turing thesis23 and, consequently, are not polynomially equivalent to classical systems. Quantum systems harnessed as computational devices might therefore be dramatically more powerful than any classical system24. The universality of quantum computing25 enables a broad range of applications. Some illustrative examples are linear systems solvers26, molecule simulators27, combinatorial optimizers28, black-box29 and factorization problems30, or Hamiltonian simulations31. Although a set of single-qubit (\(N=1\)) and two-qubit (\(N=2\)) gates is universal32, larger multi-qubit (\(N>2\)) gates may offer a computational advantage that reduces the complexity of existing algorithms33,34,35.

Quantum machine learning36,37,38,39,40,41,42,43,44 aims at a symbiosis of both paradigms for their mutual reinforcement. Achieving improvements in machine learning protocols by leveraging quantum resources, in comparison to their classical counterparts, is the goal for the future. Conversely, the universality of artificial neural networks may enhance the accuracy and efficiency of quantum protocols45,46,47,48,49. By analogy with classical neural networks, a quantum neural network (QNN) consists of quantum perceptrons (neurons) with nonlinear activation functions arranged in different layers. The hidden layers of a QNN are the intermediate ones, composed of quantum perceptrons, each of which is a qubit encoded in an Ising Hamiltonian44. By measuring the excitation probability of the reduced eigenstate of such a Hamiltonian, one obtains the nonlinear activation function. In this work, we propose an extension of a universal QNN enabling multi-qubit (\(N>2\)) interactions that reduce the network depth while keeping the approximation power. The resulting simplification of existing protocols not only requires shorter operation times, but may also accumulate fewer errors due to the reduced number of required gates. The article is structured as follows. We first define the quantum perceptron as a single-output quantum neuron with multi-qubit interactions, connected to several input quantum neurons without hidden layers; nesting several quantum perceptrons yields a QNN. We show that a quantum perceptron with multi-qubit interactions can implement an XOR gate and search prime numbers from 3 to 5 bits, improving the approximation power compared to its classical counterpart, as well as to standard QNNs (i.e. QNNs that nest perceptrons without multi-qubit interactions). Then, we show that quantum gates such as CNOT, Toffoli, and Fredkin can be implemented by QNNs involving quantum perceptrons with multi-qubit interactions, thus reducing the circuit depth since no hidden layers need to be added.

Results

Quantum perceptrons with multi-qubit potentials

A quantum perceptron, or quantum neuron, is the basic building block of a QNN. It can be constructed as a qubit whose excitation probability responds nonlinearly to an input potential \(\hat{x}_j\). This response can be written as the following quantum gate acting on the jth qubit that encodes the quantum perceptron44,45:

$$\begin{aligned} \hat{U}_j(\hat{x}_j; f) |0_j\rangle = \sqrt{1-f(\hat{x}_j)} |0_j\rangle + \sqrt{f(\hat{x}_j)} |1_j\rangle , \end{aligned}$$
(1)

where

$$\begin{aligned} f(x) = \frac{1}{2} \left( 1+\frac{x}{\sqrt{1+x^2}}\right) \end{aligned}$$
(2)

corresponds to a nonlinear function. The transformation in Eq. (1) can be engineered, e.g., by adiabatically evolving the qubit under the Hamiltonian

$$\begin{aligned} \hat{H} = \frac{1}{2}\left[ \hat{x}_j \hat{\sigma }^z_j + \Omega (t)\hat{\sigma }_j^x\right] \end{aligned}$$
(3)

where \(\hat{x}_j\) is the potential exerted by other neurons on the perceptron, and the applied external field \(\Omega (t)\) leads to a tunable energy gap in the dressed-state qubit basis \(|\pm \rangle\), with \(\hat{\sigma }^x_j|\pm \rangle = \pm |\pm \rangle\). Typically, \(\hat{x}_j = \sum _{i=1}^k (w_{ji} \hat{\sigma }^z_i) + b_j\)44,45, which implies that the jth perceptron is coupled to k neurons (labelled by i) in the previous/input layer via standard spin-spin interactions. The Hamiltonian in Eq. (3) has the following reduced eigenstate (i.e. when the degrees of freedom of all other neurons are traced out):

$$\begin{aligned} |\Phi (x_j/ \Omega (t)) \rangle = \sqrt{1- f(x_j / \Omega (t))} |0_j\rangle + \sqrt{f(x_j / \Omega (t))} |1_j\rangle , \end{aligned}$$
(4)

with f(x), the excitation probability, in the form of Eq. (2). Specifically, to accomplish \(\hat{U}_j(\hat{x}_j;f) |0_j\rangle\), one can first apply a Hadamard gate to obtain the transformation \(|0_j\rangle \rightarrow |+_j\rangle\) and then reach \(|\Psi \rangle = |\Phi (x_j/ \Omega (t_f)) \rangle\) at a certain time \(t_f\) by evolving the system adiabatically with Hamiltonian (3). By fixing the trajectory to follow the instantaneous eigenstate of the Hamiltonian, one can deduce the external driving \(\Omega (t)\) from the fast quasi-adiabatic passage44. In this manner, the nonlinear activation function of the quantum perceptron is encoded in the probability of the excited state, \(P_j = f(x_j) = \frac{1}{2} (1+\langle \hat{\sigma }_j^z \rangle )\), during the adiabatic evolution44. To speed up the operation of this perceptron, one can also use inverse-engineering techniques, which directly impose conditions on the wave-function evolution at the initial and final times, resulting in the nonlinear response of the quantum perceptron45. Correspondingly, a smoother control \(\Omega (t)\) that is easier to implement experimentally can also be found. In addition, this accelerated activation mechanism based on inverse engineering would reduce both decoherence and the variation of the input potential induced by neurons in the previous layer, leading to enhanced performance.
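
For concreteness, the following minimal sketch (Python/NumPy; the potential value is an illustrative assumption, not taken from the article) evaluates the activation of Eq. (2) and builds the single-qubit state of Eq. (1), checking that the excitation probability equals f(x):

```python
import numpy as np

def f(x):
    """Nonlinear activation of Eq. (2)."""
    return 0.5 * (1.0 + x / np.sqrt(1.0 + x**2))

def perceptron_state(x):
    """Qubit state produced by the gate of Eq. (1) for a scalar potential x."""
    return np.array([np.sqrt(1.0 - f(x)), np.sqrt(f(x))])  # (|0>, |1>) amplitudes

psi = perceptron_state(2.0)
print(psi @ psi)            # normalization: 1.0
print(psi[1]**2, f(2.0))    # excitation probability equals f(x)
```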

Now, we introduce a different type of potential, which relies on the possibility of implementing multi-qubit interactions. In particular, we consider a potential of the kind

$$\begin{aligned} \hat{x}_j = \sum _{i=1}^k (w_{ji} \hat{\sigma }^z_i) + w_{\textrm{m}} \hat{\sigma }_{l_1}^z \cdots \hat{\sigma }_{l_n}^z + b_j, \end{aligned}$$
(5)

where \(w_{\textrm{m}}\) is a multi-qubit coefficient marked by the subscript \(\textrm{m}\), and \(l_p \in \{1, 2,\ldots , k\}\) (namely, the term involving several Pauli matrices includes products of an arbitrary number of neurons in the previous layer). For simplicity of presentation, Eq. (5) contains a single multi-qubit term; however, the potential can include several products of distinct neurons in the input layer (i.e. additional multi-qubit terms). Later we provide specific examples of these interactions associated with definite problems.
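
As an illustration of Eq. (5), the following sketch builds the diagonal potential operator for a small register; the register size, the weights, and the qubit list `l_list` are hypothetical, and we adopt the convention \(\hat{\sigma }^z|0\rangle = -|0\rangle\), \(\hat{\sigma }^z|1\rangle = +|1\rangle\) used in the text:

```python
import numpy as np
from functools import reduce

Z = np.diag([-1.0, 1.0])   # sigma^z with |0> -> -1, |1> -> +1
I = np.eye(2)

def z_on(i, k):
    """sigma^z acting on qubit i (0-indexed, leftmost tensor factor first)."""
    return reduce(np.kron, [Z if j == i else I for j in range(k)])

def potential(w, b, w_m, l_list):
    """Diagonal operator of Eq. (5) for k = len(w) input qubits."""
    k = len(w)
    x = b * np.eye(2 ** k) + sum(w[i] * z_on(i, k) for i in range(k))
    x += w_m * reduce(np.matmul, [z_on(l, k) for l in l_list])  # multi-qubit term
    return x

x_hat = potential(w=[0.3, -0.7], b=0.1, w_m=0.5, l_list=[0, 1])
print(np.diag(x_hat))  # potential value for each basis state |00>, |01>, |10>, |11>
```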

In the following, we show that the multi-qubit potential enables tasks such as (i) constructing XOR gates at the perceptron level, (ii) searching prime numbers, and (iii) encoding quantum gates. All of these are implemented without hidden layers or ancillary qubits, showing the significant role of multi-qubit potentials in the simplification of QNNs.

XOR gate

As a classical perceptron is a linear separator, a nonlinear logic gate such as the well-known XOR problem (i.e., the exclusive-OR Boolean function) requires at least one hidden layer to be implemented in classical neural networks51. Now we show that a quantum perceptron with multi-qubit interactions in the neural potential is a nonlinear classifier. In particular, we illustrate the construction of an XOR gate by a single quantum perceptron with multi-qubit interactions. We also show that the lack of hidden layers prevents classical neural networks and standard QNNs from achieving the same task.

Figure 1

In order to (a) encode an XOR gate and (b) search prime numbers among the integers from 0 to 7, we show the value of the cost function at different epochs, using a multi-qubit interaction perceptron (solid blue) and a classical perceptron (dotted red). The latter is equivalent to a quantum perceptron with only two-qubit terms, as follows from comparing the weights and bias during the training process, see Eqs. (7)–(9), where the learning rate is \(\eta = 1.5\). The transfer function in Eq. (2) is used in both cases. For the quantum perceptron, this function implies that the system evolves adiabatically.

As the input examples with output values 0 and 1 cannot be separated linearly, a classical perceptron with two inputs and one output (i.e. without hidden layers) fails to solve the XOR gate. To show this, we use the standard gradient descent algorithm to train this simple classical perceptron with a sigmoidal activation function f(x), see Eq. (2), and \(x_j=\sum _{i=1}^k w_{ji}s_i+b_j\) with classical inputs \(s_i \in \{0,1\}\), whose cost function is the mean squared error

$$\begin{aligned} C=\frac{1}{2N}\sum _{n=1}^N (y^{(n)} -t^{(n)})^2. \end{aligned}$$
(6)

Here, \(N=4\) corresponds to the four possible examples 00, 10, 01, 11, while \(y^{(n)}\) and \(t^{(n)}\) are the output and the target, respectively, for the nth example. Note that, besides a sigmoid function, neurons with other nonlinear activation functions, such as the ReLU, can also form a neural network that learns the XOR gate. For simplicity, the subscript j labelling the perceptron is omitted in the following. During the training, the parameters of the classical perceptron are updated after each epoch as

$$\begin{aligned} \tilde{w}_i=& \,w_i - \eta \frac{\partial C}{\partial w_i} = w_i - \frac{\eta }{N} \sum _{n=1}^N \left( y^{(n)} - t^{(n)}\right) f'(x) s_i, \nonumber \\ \tilde{b}= & \,b -\eta \frac{\partial C}{\partial b} = b - \frac{\eta }{N} \sum _{n=1}^N (y^{(n)} - t^{(n)}) f'(x), \end{aligned}$$
(7)

with learning rate \(\eta\). Any other nonlinear function g(x), similar to f(x), can be used to train this perceptron with the same approximation power. As shown in Fig. 1a, the cost function of this classical perceptron gets stuck at \(C=0.125\) (dotted-red line with superimposed diamonds for easier identification). Since a classical perceptron can only converge on linearly separable data, it is unable to imitate the XOR function. To complete an XOR, a hidden layer with two neurons is needed; these two neurons can be regarded as performing an OR and a NAND gate.
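
This failure can be reproduced numerically. The sketch below (initialization, seed, and epoch count are illustrative; the learning rate \(\eta =1.5\) matches Fig. 1) trains the classical perceptron of Eqs. (6)–(7) on the four XOR examples:

```python
import numpy as np

def f(x):  return 0.5 * (1 + x / np.sqrt(1 + x**2))   # Eq. (2)
def df(x): return 0.5 / (1 + x**2) ** 1.5             # f'(x)

S = np.array([[0,0],[0,1],[1,0],[1,1]], dtype=float)  # classical inputs s_i
t = np.array([0., 1., 1., 0.])                        # XOR targets

rng = np.random.default_rng(1)
w, b, eta = rng.normal(size=2), 0.0, 1.5

for epoch in range(1000):
    x = S @ w + b
    y = f(x)
    g = (y - t) * df(x) / len(t)
    w -= eta * (g @ S)                 # weight update of Eq. (7)
    b -= eta * g.sum()                 # bias update of Eq. (7)

print(0.5 * np.mean((y - t) ** 2))     # plateaus near C = 0.125 (cf. Fig. 1a)
```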

To implement an XOR gate with a quantum perceptron without the multi-qubit term in Eq. (5), that is, by using the potential \(\hat{x}= w_1 \hat{\sigma }_1^z + w_2 \hat{\sigma }_2^z +b\), one can follow the procedure described in Ref.44. To encode an XOR gate, the neural potential is derived from the four basis states \(|00\rangle\), \(|01\rangle\), \(|10\rangle\), \(|11\rangle\), which play the role of the four input examples of the XOR gate (namely, 00, 01, 10, 11). In this case, as the input values of the XOR gate are bits, one can map them onto the measurement value of \(\hat{\sigma }^z\) of the input qubits, i.e. \(s_{\textrm{in}} = \frac{1}{2} (1+\langle \hat{\sigma }^z\rangle )\). Therefore, the input values 0 and 1 correspond to the ground state \(|0\rangle\) (\(\langle \hat{\sigma }^z\rangle = -1\)) and the excited state \(|1\rangle\) (\(\langle \hat{\sigma }^z \rangle =1\)), respectively. Such a quantum perceptron, without hidden layers and with only two-qubit interactions, is equivalent to a classical perceptron from the point of view of the following training process. To train it with the gradient descent method, one updates the weights and bias as

$$\begin{aligned} \tilde{w}_i= & \,w_i - \frac{\eta }{N} \sum _{n=1}^N \left( y^{(n)} - t^{(n)}\right) \frac{\partial y^{(n)}}{\partial x} \frac{\partial x}{\partial w_i}, \nonumber \\ \tilde{b}= & \,b - \frac{\eta }{N} \sum _{n=1}^N \left( y^{(n)} - t^{(n)}\right) \frac{\partial y^{(n)}}{\partial x}, \end{aligned}$$
(8)

where

$$\begin{aligned} \frac{\partial y^{(n)}}{\partial x}= & \,\frac{1}{2} \left( \left\langle \frac{\partial \Psi }{\partial x} \bigl | \hat{\sigma }^z \bigr | \Psi \right\rangle + \left\langle \Psi \bigl | \hat{\sigma }^z \bigr | \frac{\partial \Psi }{\partial x} \right\rangle \right) \nonumber \\= & \,f'(x) \end{aligned}$$
(9)

and \(|\Psi \rangle = \hat{U}(\hat{x};f) |0\rangle\) is the solution of the Schrödinger equation driven by the Hamiltonian (3) with \(\hat{x}= w_1 \hat{\sigma }_1^z + w_2 \hat{\sigma }_2^z +b\). From the previous equations we see that the weights and bias are updated in the same way as their classical counterparts; this follows from comparing Eq. (7) with Eqs. (8) and (9), provided that \(\frac{\partial x}{\partial w_i}=\hat{\sigma }_i^z \rightarrow s_i\). This indicates that a single quantum perceptron with two-qubit interactions and the basis states \(|00\rangle\), \(|01\rangle\), \(|10\rangle\), \(|11\rangle\) as inputs is equivalent to a classical perceptron.

To implement the XOR gate with this quantum perceptron, one would need to perform two adiabatic passages with different controls \(\Omega (t)\), using different neural potentials in each passage by appropriately changing the weights and biases44. Equivalently, one could implement the XOR gate by including one hidden layer with two additional quantum neurons and applying a single \(\Omega (t)\). It is worth mentioning that Eq. (9) holds only for a single quantum perceptron, not for a QNN with hidden layers. If a QNN has more layers, one only needs to measure the output qubit value

$$\begin{aligned} y=P(x_{\textrm{out}}) = \frac{1}{2} (1+\langle \Psi | \hat{\sigma }^z_{\textrm{out}}|\Psi \rangle ) \end{aligned}$$
(10)

instead of the intermediate ones, i.e., \(|\Psi \rangle = \hat{U}_{\text {tot}} |0\rangle\) with \(\hat{U}_{\text {tot}} = \Pi _{j=1}^M \hat{U}_j\), where M is the total number of quantum perceptrons in the QNN. Otherwise, shot noise is introduced by measuring the neurons in the hidden layers.
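
To make the perceptron gate and the output measurement of Eq. (10) concrete in the simplest single-perceptron case, the following sketch (illustrative weights; sign convention \(\hat{\sigma }^z|1\rangle = +|1\rangle\) as in the text) builds \(\hat{U}(\hat{x};f)\) as a rotation of the output qubit controlled on the input register, and reads out only the output excitation probability:

```python
import numpy as np

def f(x): return 0.5 * (1 + x / np.sqrt(1 + x**2))

def perceptron_gate(x_vals):
    """U = sum_s |s><s| (x) R(f(x_s)): the rotation of Eq. (1) applied to the
    output qubit, controlled on each basis state s of the input register."""
    d = 2 * len(x_vals)
    U = np.zeros((d, d))
    for s_idx, xv in enumerate(x_vals):
        c, s = np.sqrt(1 - f(xv)), np.sqrt(f(xv))
        U[2*s_idx:2*s_idx + 2, 2*s_idx:2*s_idx + 2] = [[c, -s], [s, c]]
    return U

# two-qubit input with x = w1*z1 + w2*z2 + b; weights are illustrative
w1, w2, b = 1.0, -0.5, 0.2
x_vals = [w1*z1 + w2*z2 + b for z1 in (-1, 1) for z2 in (-1, 1)]
U = perceptron_gate(x_vals)

psi_in = np.zeros(8); psi_in[2 * 2] = 1.0   # input |10>, output qubit in |0>
psi = U @ psi_in
P = psi[2 * 2 + 1] ** 2                     # output excitation probability, Eq. (10)
print(P, f(w1 - w2 + b))                    # both equal f(x) for the input |10>
```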

This procedure for the XOR gate can be simplified by considering a quantum perceptron with multi-qubit interactions. We find the output of the quantum perceptron using Eq. (10), where \(|\Psi \rangle = \hat{U}(\hat{x};f) |0\rangle\). In the Heisenberg picture, the unitary transformation implemented by the quantum perceptron gate,

$$\begin{aligned} \hat{U}^\dag \hat{\sigma }^z \hat{U} = [1-2f(\hat{x})] \hat{\sigma }^z + 2 \sqrt{f(\hat{x})[1-f(\hat{x})]} \hat{\sigma }^x, \end{aligned}$$
(11)

leads to \(y = P(x) = f(x)\), classified as \(y=1\) if \(x>1/2\) and \(y=0\) if \(x \le 1/2\). In particular, we explore the following multi-qubit potential

$$\begin{aligned} \hat{x}= w_1 \hat{\sigma }_1^z + w_2 \hat{\sigma }_2^z +b + w_{\textrm{m}} \hat{\sigma }_1^z \hat{\sigma }_2^z. \end{aligned}$$
(12)

In this case, the weights \(w_1\), \(w_2\) and the bias b are updated as in Eq. (8), while the update formula for the weight of the \(w_{\textrm{m}} \hat{\sigma }_1^z \hat{\sigma }_2^z\) term is

$$\begin{aligned} \tilde{w}_{\textrm{m}}= & \,w_{\textrm{m}}- \frac{\eta }{N} \sum _{n=1}^N \left( y^{(n)} - t^{(n)}\right) \frac{\partial y^{(n)}}{\partial x} \hat{\sigma }^z_1 \hat{\sigma }^z_2. \end{aligned}$$
(13)

Considering the construction of an XOR gate through the inequalities

$$\begin{aligned}{} & {} 0\times w_1 + 0 \times w_2 + b \le \frac{1}{2} \Leftrightarrow b - \frac{1}{2} \le 0, \end{aligned}$$
(14)
$$\begin{aligned}{} & {} 0\times w_1 + 1 \times w_2 + b> \frac{1}{2} \Leftrightarrow b - \frac{1}{2} > - w_2, \end{aligned}$$
(15)
$$\begin{aligned}{} & {} 1\times w_1 + 0 \times w_2 + b> \frac{1}{2} \Leftrightarrow b - \frac{1}{2} > - w_1, \end{aligned}$$
(16)
$$\begin{aligned}{} & {} 1\times w_1 + 1 \times w_2 + b + w_{\textrm{m}} \le \frac{1}{2} \Leftrightarrow b - \frac{1}{2} + w_{\textrm{m}} \le -w_1 - w_2, \end{aligned}$$
(17)

we find that Eqs. (14)–(17) are contradictory if the multi-qubit interaction term is absent. However, one can always find appropriate values of \(w_{\textrm{m}}\) to satisfy the above inequalities. The presence of \(w_{\textrm{m}} \hat{\sigma }_1^z \hat{\sigma }_2^z\) in the neural potential thus enables the quantum perceptron to construct an XOR gate as a nonlinear separator. We test the cost function (see Eq. (6)) for our quantum perceptron and find that \(C< 1\%\) is reached at epoch 197, as shown in Fig. 1a (solid-blue line with squares). The numerical calculation shows that the cost function continues to decrease. In contrast, the classical perceptron and the perceptron with only two-qubit interactions are unable to produce an XOR gate; see the plateau of the cost function (dotted red), which remains constant after epoch \(\sim 50\).
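
A minimal numerical sketch of this training is given below; it assumes that the basis-state inputs enter through their \(\hat{\sigma }^z\) eigenvalues \(\pm 1\), and the initialization, seed, and epoch count are illustrative. It implements the potential of Eq. (12) with the updates of Eqs. (8) and (13):

```python
import numpy as np

def f(x):  return 0.5 * (1 + x / np.sqrt(1 + x**2))   # Eq. (2)
def df(x): return 0.5 / (1 + x**2) ** 1.5             # f'(x)

S = np.array([[-1,-1],[-1,1],[1,-1],[1,1]], dtype=float)  # sigma^z eigenvalues
t = np.array([0., 1., 1., 0.])                            # XOR targets
prod = S[:, 0] * S[:, 1]

rng = np.random.default_rng(0)
w, wm, b, eta = rng.normal(size=2), rng.normal(), 0.0, 1.5

for epoch in range(2000):
    x = S @ w + wm * prod + b          # potential of Eq. (12) on basis states
    y = f(x)
    g = (y - t) * df(x) / len(t)
    w  -= eta * (g @ S)                # Eq. (8)
    wm -= eta * (g * prod).sum()       # Eq. (13)
    b  -= eta * g.sum()

print(0.5 * np.mean((y - t) ** 2))     # cost C of Eq. (6), dropping below 1%
```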

Searching prime numbers

In Ref.44, specific examples with two, three, and four perceptrons per layer were illustrated for detecting prime numbers from 0 to \(2^i -1\), where i ranges from \(i=3\) to \(i=7\) bits. In particular, it was shown that a QNN with two perceptrons, one in the hidden layer and the other as the output, can classify prime numbers from 0 to 7 (3 bits). This exemplifies the better performance of QNNs compared with classical networks, where a hidden layer with two neurons is necessary to accomplish the same task. A scheme of these networks with 3 input bits is shown in Fig. 2a,b, while details regarding QNN training to search prime numbers are given in “Methods”.

Figure 2

(a) Schematic configuration of a classical neural network with one hidden layer of two neurons for the task of searching prime numbers among input numbers \(A = (a_1, a_2, a_3)\in \{0, 1,\ldots , 7\}\) with 3 bits. Once well trained, the network gives the output \(y=Q(A)\). (b) The same task can be achieved by a QNN with multi-qubit interactions in the neural potential (red junction involving the second and third neurons) and no hidden layers. (c) Truth table for prime-number search with 3 input bits for a QNN. The input values \(A = (a_1, a_2, a_3)\) are the binary representations of the integers from 0 to 7. The output value \(y = Q(A)\) is 1 for prime numbers and 0 for non-prime ones.

Now, we introduce the multi-qubit term into the neural potential and find that the same task is achieved by a QNN without hidden layers, i.e. at the single-quantum-perceptron level. To search prime numbers of 3 bits (truth table in Fig. 2c) using a single quantum perceptron, we consider the potential

$$\begin{aligned} \hat{x}= w_1 \hat{\sigma }_1^z + w_2 \hat{\sigma }_2^z + w_3 \hat{\sigma }_3^z +b + w_\text {m} \hat{\sigma }_2^z \hat{\sigma }_3^z. \end{aligned}$$
(18)

As we demonstrate below, adding the multi-qubit term \(w_\text {m} \hat{\sigma }_2^z \hat{\sigma }_3^z\) is enough to fulfill the prime-number searching task. One could include additional multi-qubit terms in the neural potential; however, we use the simplest example in Eq. (18) to keep the network and the training process simple.

We note that, when the inputs of a quantum perceptron with only two-qubit interactions are basis states, the perceptron is equivalent to a classical one, as shown in Eqs. (7)–(9). For this reason, we now compare the cost function of a classical perceptron with that of a quantum perceptron using the potential in Eq. (18) for the specific purpose of searching prime numbers from 0 to 7. As shown in Fig. 1b, the quantum perceptron achieves \(C < 1\%\) at epoch 667 (solid blue with squares), while C saturates at 0.126 for the classical counterpart (dotted red with diamonds). This indicates that our perceptron achieves the search task without hidden layers. We have also verified the ability of the quantum perceptron, by minimizing the cost function to an acceptable error, to search prime numbers of 4 bits with the potential

$$\begin{aligned} \hat{x}= w_1 \hat{ \sigma }_1^z + w_2 \hat{\sigma }_2^z + w_3 \hat{\sigma }_3^z + w_4 \hat{\sigma }_4^z + b + w_\text {m} \hat{\sigma }_2^z \hat{\sigma }_3^z, \end{aligned}$$
(19)

and for 5 bits with

$$\begin{aligned} \hat{x}= & \,w_1 \hat{ \sigma }_1^z + w_2 \hat{\sigma }_2^z + w_3 \hat{\sigma }_3^z + w_4 \hat{\sigma }_4^z + w_5 \hat{\sigma }_5^z + b \nonumber \\{} & {} + w_\text {m} \hat{\sigma }_2^z \hat{\sigma }_3^z. \end{aligned}$$
(20)

The multi-qubit terms in the above two neural potentials for 4 and 5 bits are chosen such that the cost function reaches \(C<1\%\) at epoch 1692 for Eq. (19) and at epoch 1698 for Eq. (20), while adopting the minimal number of multi-qubit terms. Further training of the above quantum perceptrons shows that the value of C continues to decrease, proving that these tasks are achieved successfully.
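
A sketch of the 3-bit case with the potential of Eq. (18) follows the same pattern as the XOR example; the bit-order convention (here \(a_1\) taken as the least-significant bit), initialization, and epoch count are illustrative assumptions:

```python
import numpy as np

def f(x):  return 0.5 * (1 + x / np.sqrt(1 + x**2))
def df(x): return 0.5 / (1 + x**2) ** 1.5

# 3-bit inputs; a1 is taken as the least-significant bit (assumed convention)
bits = np.array([[(n >> p) & 1 for p in range(3)] for n in range(8)], dtype=float)
S = 2.0 * bits - 1.0                                   # sigma^z eigenvalues
t = np.array([1.0 if n in (2, 3, 5, 7) else 0.0 for n in range(8)])

rng = np.random.default_rng(2)
w, wm, b, eta = rng.normal(size=3), rng.normal(), 0.0, 1.5

for epoch in range(3000):
    x = S @ w + wm * S[:, 1] * S[:, 2] + b             # potential of Eq. (18)
    y = f(x)
    g = (y - t) * df(x) / len(t)
    w  -= eta * (g @ S)                                # Eq. (8)
    wm -= eta * (g * S[:, 1] * S[:, 2]).sum()          # Eq. (26)
    b  -= eta * g.sum()

print(0.5 * np.mean((y - t) ** 2))   # cost C, decreasing toward zero
```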

Quantum gates

Now we show that one can use a QNN with multi-qubit interactions to construct quantum gates such as the CNOT, Toffoli, and Fredkin gates without the need for hidden layers. In comparison, a QNN with only two-qubit interactions cannot construct the above-mentioned quantum gates under the same conditions, as shown below. As quantum gates are reversible, the numbers of input and output qubits of a quantum gate are equal; hence, the number of neurons in the input equals that in the output. Correspondingly, the cost function becomes

$$\begin{aligned} C=\frac{1}{2Nk}\sum _{n=1}^N \sum _{i=1}^k (y_i^{(n)} -t_i^{(n)})^2, \end{aligned}$$
(21)

where k is the number of quantum perceptrons.

For the CNOT gate, where \(k=2\), the truth table and the schematic configuration of the QNN are shown in Fig. 3a,b, respectively, where each perceptron (output neuron) gives the output value \(y_{i} = \frac{1}{2}(1+\langle \hat{\sigma }_{\textrm{out},i}^z\rangle )\). The first perceptron should reproduce the value of the first input neuron, while the second one aims at the XOR of the two input neurons, i.e. \(\hat{\sigma }^z_{\textrm{out},2} = \hat{\sigma }_1^z \oplus \hat{\sigma }_2^z\)52. The neural potential of the second perceptron is

$$\begin{aligned} \hat{x}_2 = w_{21} \hat{\sigma }_1^z + w_{22} \hat{\sigma }_2^z + b_2 + w_{\textrm{m}} \hat{\sigma }_1^z\hat{\sigma }_2^z, \end{aligned}$$
(22)

while the first perceptron has only two-qubit interaction terms. We find that this is the most concise choice, as it requires the least connectivity. We obtain a cost function value (using Eq. (21)) of \(C<1\%\) at epoch 116, with \(C\rightarrow 0\) at larger epochs, as shown in Fig. 4a (solid blue with squares), indicating that the CNOT gate is trained successfully. In contrast, training such a CNOT gate with a QNN having only two-qubit interaction terms leads to C tending to 0.0625 at large epoch, see Fig. 4b. This indicates that a standard QNN cannot encode the CNOT gate without hidden layers.
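
The two output perceptrons of Fig. 3b can be trained jointly with the cost of Eq. (21), as in the following sketch (initializations and epoch count are illustrative): perceptron 1 carries only two-qubit terms and learns to copy qubit 1, while perceptron 2 includes the multi-qubit term of Eq. (22).

```python
import numpy as np

def f(x):  return 0.5 * (1 + x / np.sqrt(1 + x**2))
def df(x): return 0.5 / (1 + x**2) ** 1.5

S = np.array([[-1,-1],[-1,1],[1,-1],[1,1]], dtype=float)  # sigma^z inputs
T = np.array([[0,0],[0,1],[1,1],[1,0]], dtype=float)      # CNOT truth table
prod = S[:, 0] * S[:, 1]

rng = np.random.default_rng(3)
W, b, wm, eta = rng.normal(size=(2, 2)), np.zeros(2), rng.normal(), 1.5

for epoch in range(3000):
    X = S @ W.T + b                     # potentials of the two perceptrons
    X[:, 1] += wm * prod                # multi-qubit term of Eq. (22)
    Y = f(X)
    G = (Y - T) * df(X) / (2 * len(S))  # gradient factor from Eq. (21), k = 2
    W  -= eta * (G.T @ S)
    b  -= eta * G.sum(axis=0)
    wm -= eta * (G[:, 1] * prod).sum()

print(0.5 * np.mean((Y - T) ** 2))      # cost of Eq. (21), approaching zero
```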

Figure 3

(a) Truth table of a CNOT gate constructed by a QNN, whose schematic configuration with the multi-qubit interaction term \(w_{\textrm{m}} \hat{\sigma }_1^z \hat{\sigma }_2^z\) is illustrated in (b).

Figure 4

Cost function at different epochs for (a) a QNN with the multi-qubit term and (b) a QNN with only two-qubit interactions. The solid-blue line is the cost function obtained when constructing a CNOT gate, the dashed-red line a Toffoli gate, and the dotted-black line a Fredkin gate. As illustrated in the inset of (a), the cost function reaches \(C<1\%\) at the 116th, 140th, and 162nd epoch for the three quantum gates, respectively, indicating that these gates can be constructed by QNNs with multi-qubit interactions without hidden layers. In contrast, the C values of quantum perceptrons with only two-qubit interactions approach their respective plateaus.

Figure 5

(a) Truth table of a Toffoli gate constructed by a QNN, whose schematic configuration with the multi-qubit interaction term \(w_{\textrm{m}} \hat{\sigma }_1^z\hat{\sigma }_2^z \hat{\sigma }_3^z\) is illustrated in (b).

A QNN with multi-qubit terms also allows constructing a Toffoli gate without hidden layers. The truth table and the QNN for the Toffoli gate are shown in Fig. 5a,b. Known as the controlled-controlled-NOT gate, the Toffoli gate has 3-qubit inputs and outputs. In the QNN, the outputs of the first two qubits should equal their inputs, \(\hat{\sigma }_{\textrm{out},i}^z = \hat{\sigma }^z_i\) (\(i = 1, 2\)), while the output of the third qubit aims at \(\hat{\sigma }_{\textrm{out},3}^z = \hat{\sigma }_3^z \oplus \hat{\sigma }^z_1 \hat{\sigma }^z_2\)52. In this case, we propose the following neural potential for the third perceptron

$$\begin{aligned} \hat{x}_3= & \,w_{31} \hat{\sigma }_1^z + w_{32} \hat{\sigma }_2^z + w_{33} \hat{\sigma }_3^z \nonumber \\{} & {} + b_3 + w_\text {m} \hat{\sigma }_1^z \hat{\sigma }_2^z \hat{\sigma }_3^z, \end{aligned}$$
(23)

with a minimal number of multi-qubit terms. As demonstrated in Fig. 4a,b, the value \(C<1\%\) (dashed-red line with crosses) is reached at epoch 140, and C goes to 0 at large epoch during the training of the Toffoli gate. In close similarity to the previous case, a standard QNN without hidden layers fails to achieve the Toffoli gate, as C saturates at 0.0238.

Another example that can be successfully constructed by our QNNs with multi-qubit interactions and without hidden layers is the Fredkin gate (also known as the controlled-SWAP gate), whose truth table and schematic configuration are shown in Fig. 6a,b. In this case, the adopted potential of the third perceptron is

$$\begin{aligned} \hat{x}_3= & \,w_{31} \hat{\sigma }_1^z + w_{32} \hat{\sigma }_2^z + w_{33} \hat{\sigma }_3^z \end{aligned}$$
(24)
$$\begin{aligned}{} & {} + b_3 + w_\text {m} \hat{\sigma }_2^z \hat{\sigma }_3^z. \end{aligned}$$
(25)

According to the numerical calculations, the cost function C tends to zero, as shown in Fig. 4a (dotted-black line with asterisks), indicating that this gate can be constructed by quantum perceptrons without hidden layers. Again, C saturates at 0.0417 at large epoch for a standard QNN without hidden layers, see Fig. 4b.

Figure 6

(a) Truth table of a Fredkin gate constructed by a QNN, whose schematic configuration with the multi-qubit interaction term \(w_{\textrm{m}} \hat{\sigma }_2^z\hat{\sigma }_3^z\) is illustrated in (b).

Discussion

We have shown different applications of QNNs possessing multi-qubit interactions, whose performance is enhanced compared to QNNs of the same topology without multi-qubit interactions. The multi-qubit interaction terms induce connectivities among quantum perceptrons that deviate from the current network paradigm of additive activations. It is due to the multi-qubit terms in the potentials that one can avoid hidden layers without sacrificing approximative power. Such an architecture allows us to address the connectivity challenge in scaling up QNNs. Moreover, the simple configuration helps to control the efficiency of the training process. During the training of all the examples shown above, the activation function (Eq. (2)), based on the adiabatic evolution of the system, is used. Alternatively, one may use shortcuts to adiabaticity to accelerate the formation of the activation function in physical registers45.

The Hamiltonian of the jth perceptron, given by Eq. (3) with the neural potential of Eq. (5), corresponds to a linear Ising Hamiltonian with higher-order interactions. Such a model is available, experimentally and theoretically, in distinct quantum platforms53,54,55,56,57,58, although the developments vary between platforms. An optical Ising machine hosting adjustable four-body interactions with all-to-all connections over a large number of spins was experimentally demonstrated in Ref.54. Moreover, four-body interactions have been designed to be realized via superconducting quantum interference devices (SQUIDs)55, while a single-shot method for executing an i-Toffoli gate, a three-qubit gate with two control qubits and one target qubit, was proposed in Ref.56 using currently existing superconducting hardware. The use of resonant microwave-mediated interactions between distant electron spins to implement multi-qubit potentials57 marks an important milestone for all-to-all qubit connectivity and scalability in silicon-based quantum circuits. Quantum evolutions governed by terms involving up to six-qubit interactions have already been implemented in trapped-ion systems58, where the qubit-qubit interactions can be tuned, allowing the same architecture to implement a variety of quantum gates. Each of the examples considered above corresponds to a particular case of multi-qubit interaction, and its physical implementation would depend on the specific platform considered. These developments in quantum hardware highlight the potential for practical implementations of multi-qubit potentials in QNNs; progress in quantum hardware and in QNN protocols will mutually boost both fields.

Methods

Hamiltonian and training of the neural networks

In this article, we propose a QNN with multi-qubit interactions, where the Hamiltonian of one perceptron, Eq. (3), has the neural potential expressed in Eq. (5). For a QNN to demonstrate advantages over its classical counterparts, we need concepts that improve the scalability of QNNs, for which controlling the network depth becomes crucial. A main objective in developing a QNN is therefore to minimize the depth without sacrificing approximative power. To this end, we explore deviations from the current network paradigm of additive activations and include multi-qubit interactions in the neural potential, leading to a reduction of the network depth. In all calculations, we use standard gradient descent to train the neural networks with the cost function of Eq. (21). Correspondingly, the weight of the multi-qubit interaction term \(w_{\textrm{m}} \hat{\sigma }_{l_1}^z\cdots \hat{\sigma }_{l_n}^z\) in the neural potential of Eq. (5) is updated as

$$\begin{aligned} \tilde{w}_{\textrm{m}}= & \,w_{\textrm{m}}- \frac{\eta }{N} \sum _{n=1}^N \left( y^{(n)} - t^{(n)}\right) \frac{\partial y^{(n)}}{\partial x} \hat{\sigma }^z_{l_1}\cdots \hat{\sigma }^z_{l_n}. \end{aligned}$$
(26)

Training for prime-number search

To train a QNN for the task of searching prime numbers with i bits, a set of \(2^i\) pairs of input and target values \(\{A^{(n)}, t^{(n)}\}^{2^i}_{n=1}\) is taken, where the inputs are binary numbers \(A^{(n)} = (a_1, a_2,\ldots , a_i)\) corresponding to the integers in \(\{0, 1,\ldots ,2^i -1\}\). As the targets are \(t \in \{0,1\}\), the trained network output satisfies \(y^{(n)} = Q(A^{(n)}) = 1\) if and only if the input number is prime.
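
As a small illustration, such a training set can be built as follows (the bit-order convention, with \(a_1\) as the least-significant bit, is an assumption):

```python
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def training_set(i):
    """Pairs (A, t) for all i-bit integers; a1 is the least-significant bit."""
    return [(tuple((n >> p) & 1 for p in range(i)), int(is_prime(n)))
            for n in range(2 ** i)]

print(training_set(3))  # e.g. ((0, 1, 0), 1) for the prime n = 2
```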