Abstract
We put forward a strategy to encode a quantum operation into the unmodulated dynamics of a quantum network without the need for external control pulses, measurements or active feedback. Our optimisation scheme, inspired by supervised machine learning, consists in engineering the pairwise couplings between the network qubits so that the target quantum operation is encoded in the natural reduced dynamics of a network section. The efficacy of the proposed scheme is demonstrated by finding uncontrolled four-qubit networks that implement either the Toffoli gate, the Fredkin gate or remote logic operations. The proposed Toffoli gate is stable against imperfections, has a fidelity high enough for fault-tolerant quantum computation and is fast, being based on the non-equilibrium dynamics.
Introduction
Computational devices based on the laws of quantum mechanics hold promise to speed up many algorithms known to be hard for classical computers.^{1} The implementation of a full-scale computation with existing technology requires an outstanding ability to maintain quantum coherence (i.e., isolation from the environment) without compromising the ability to control the interactions among the qubits in a scalable way. Among the most successful paradigms of quantum computation are the ‘circuit model’, in which the algorithm is decomposed into a universal set of single- and two-qubit gates,^{2} and, to some extent, the so-called adiabatic quantum computation (AQC),^{3} in which the output of the algorithm is encoded in the ground state of an interacting many-qubit Hamiltonian. A different approach^{4} is based on the use of always-on interactions, naturally occurring between physical qubits, to accomplish the computation. Compared with the circuit model, this scheme has the advantage of requiring minimal external control and of avoiding the continuous switching off and on of the interactions between all but two qubits, whereas compared with AQC it has the advantage of being faster, being based on the non-equilibrium evolution of the system. Quantum computation with always-on interactions is accomplished by combining the natural couplings with a moderate external control, e.g., with a smooth shifting of Zeeman energies,^{5} via feed-forward techniques,^{6} using measurement-based computation^{7} or quantum control.^{8,9} Most of these approaches are based on the assumption that the natural couplings are fixed by nature and not tunable, whereas local interactions can be modulated with external fields. However, the amount of external control required can be minimised if the couplings between the qubits can be statically tuned,^{10} e.g., during the creation of the quantum device.
The recent advances in the fabrication of superconducting quantum devices have opened the way to the realisation of interacting quantum networks. In a superconducting device, the qubits are built with a Josephson tunnel element, an inductance and a capacitor,^{11} whereas local operations and measurements are performed by coupling the qubit to a resonator.^{12} The interactions can be designed using lithographic techniques by jointly coupling two qubits via a capacitor^{13} or an inductance,^{14} and can be modelled via an effective two-body Hamiltonian $\sum_{\alpha}J_{\alpha}\sigma_{\alpha}\otimes\sigma_{\alpha}$,^{15,16} where $\sigma_{\alpha}$ are the Pauli matrices. Because of the flexibility in wiring the pairwise interactions among the qubits, it is possible to arrange them in a planar graph structure, namely a collection of vertices and links, in which the vertices correspond to the qubits and the links correspond to the two-body interactions between them. Moreover, thanks to the development of three-dimensional superconducting circuits,^{17} it may be possible in the near future to wire also non-planar configurations, namely a general qubit network.
Motivated by the above, we ask the following question: is it possible to encode a quantum algorithm into the unmodulated dynamics of a suitably large quantum network of pairwise interacting qubits? This would be extremely interesting, as it would enable quantum computation by simply ‘waiting’, without the need of continuously applying external control pulses or measurements. Even when sequential operations cannot be avoided, our scheme can enable the in-hardware implementation of recurring multi-qubit operations of a quantum algorithm (see e.g., Figure 1), such as quantum arithmetic operations,^{18} and possibly also the quantum Fourier transform or error-correcting codes.^{1} We focus on two-body interactions, as they are the most common in physical setups, and we consider an enlarged network in which auxiliary qubits enrich the quantum dynamics. The central question analysed in this paper is as follows: given a target unitary operation U_{Q} on a given set of qubits Q, we consider an extended network Q∪A in which A is a set of auxiliary qubits (ancillae), and we ask whether it is possible to engineer the pairwise interactions in Q∪A, modelled by the time-independent Hamiltonian H_{QA}, such that $e^{-itH_{QA}}=U_{Q}\otimes V_{A}$ after some time t (V_{A} may be an extra unitary operation on the auxiliary space). More generally, the target operation can depend also on the initial state of the ancillae: if $e^{-itH_{QA}}=\sum_{n}U_{Q}^{(n)}\otimes|A_{n}\rangle\langle A_{n}|$, where $\{|A_{n}\rangle\}$ forms a basis of the ancillae Hilbert space and, e.g., $U_{Q}=U_{Q}^{(1)}$, then the target operation is implemented when A is initialised in $|A_{1}\rangle$.
Our method is particularly useful for implementing quantum gates that require k-body interactions (k>2), such as the Toffoli or Fredkin gates,^{1,19,20} where $U_{Q}\ne e^{-itH_{Q}}$ for any two-local H_{Q}, and for remote logic, namely for applying a gate to qubits that are not directly connected but rather interact via intermediate systems. Our approach is completely different from the simulation of k-local Hamiltonians with pairwise interactions discussed in the AQC literature,^{21,22} being based on the unmodulated dynamics. Moreover, being based on unmodulated (time-independent) interactions and ancillary qubits, it is significantly different from quantum optimal control.^{23}
Our quantum network design procedure is inspired by supervised learning in feedforward networks,^{24} in which the training procedure involves the optimisation of the network couplings (i.e., the weights between different nodes) such that the output corresponding to some input data has a desired functional form (e.g., for data classification). Although there are many recent developments on using a quantum device to speed up machine learning algorithms^{25,26,27,28,29} or to store data,^{30} our optimisation procedure is entirely classical, but specifically developed for quantum hardware design. Our scheme is completely different from other recent proposals^{31,32,33} because it avoids measurements or active feedback and requires minimal external control.
Results
Supervised quantum network design
Supervised learning is all about function approximation: given a training set {(I_{1}, O_{1}), (I_{2}, O_{2}), …}, namely a collection of inputs I_{k} and the corresponding known outputs O_{k}, the goal is to find a function f with two desired properties: (i) O_{k}≃f(I_{k}) for any training pair, and (ii) f should be able to infer the unknown output of an input not contained in the training set. In classical feedforward networks, the function f is approximated with a directed graph organised in layers, where the first layer is the input register and the last one encodes the output. The value $s_{k}^{(\ell)}$ of the kth node in layer ℓ is updated via the equation $s_{k}^{(\ell)}=A^{\ell}\left[\sum_{j}\lambda_{kj}^{(\ell-1)}s_{j}^{(\ell-1)}\right]$, where $A^{\ell}$ is an appropriate (typically nonlinear) activation function and $\lambda_{kj}^{(\ell-1)}$ is the weight between node k in layer ℓ and node j in layer ℓ − 1. The training procedure consists in finding the optimal weights λ by minimising a suitable cost function such as $\mathcal{C}=\sum_{k}\left|O_{k}-f(I_{k})\right|^{2}$.
A quantum network consists, on the other hand, of an undirected graph (V, E) of vertices V and links E described by a two-local Hamiltonian $$\begin{array}{}\text{(1)}& \mathcal{H}=\sum _{\left(n,m\right)\in E}\sum _{\alpha ,\beta}{J}_{nm}^{\alpha \beta}\frac{{\sigma}_{n}^{\alpha}{\sigma}_{m}^{\beta}}{4}+\sum _{n\in V}\sum _{\alpha}{h}_{n}^{\alpha}\frac{{\sigma}_{n}^{\alpha}}{2},\end{array}$$ where ${\sigma}_{n}^{\alpha}$, α=x, y, z, are the Pauli matrices acting on qubit n and, to simplify the notation, we call $\lambda =\left\{{J}_{nm}^{\alpha \beta},{h}_{n}^{\alpha}\right\}$ the set of parameters. The vertices are composed of two disjoint sets V=Q∪A, where Q consists of register qubits and A consists of auxiliary qubits. Given a separable initial state $|\psi_{Q}\rangle\otimes|\psi_{A}\rangle$, the time evolution according to Hamiltonian (1) generates a quantum channel^{1} $\mathcal{E}_{\lambda}\left[|\psi_{Q}\rangle\langle\psi_{Q}|\right]=\mathrm{Tr}_{A}\left[e^{-i\mathcal{H}\tilde{t}}\,|\psi_{Q}\rangle\langle\psi_{Q}|\otimes|\psi_{A}\rangle\langle\psi_{A}|\,e^{i\mathcal{H}\tilde{t}}\right]$ on subsystem Q. As we are interested in a fixed operational time $\tilde{t}$, for simplicity we set $\tilde{t}=1$, reabsorbing $\tilde{t}$ into the definition of $\mathcal{H}$. Depending on the flexibility of the experimental apparatus in reliably initialising the auxiliary qubits, one can add $|\psi_{A}\rangle$ to the set λ. Network design consists in the following procedure: given a target unitary operation U_{Q} that we want to implement, the goal is to find the parameters λ, if they exist, such that $\mathcal{E}_{\lambda}\left[\rho_{Q}\right]=U_{Q}\rho_{Q}U_{Q}^{\dagger}$ for any ρ_{Q}. To simplify the notation, we assume that the gate output is encoded in Q, but it is straightforward to generalise the formalism to the case in which the output sites differ from the input ones.
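As an illustration of the definitions above, the following minimal sketch builds the Hamiltonian of equation (1) and evaluates the reduced channel on Q for a pure product input, with $\tilde{t}=1$. It is not the authors' code: it uses plain NumPy rather than QuTiP, the function names (`embed`, `hamiltonian`, `evolve`, `channel`) and the dictionary layout of the couplings are hypothetical, the register qubits are assumed to be the leading qubits of the tensor product, and the numerical couplings in the usage lines are arbitrary.

```python
import numpy as np

# Pauli matrices; index 0 -> x, 1 -> y, 2 -> z
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def embed(op, site, n):
    """Single-qubit operator `op` acting on qubit `site` of n qubits."""
    out = np.array([[1.0 + 0j]])
    for k in range(n):
        out = np.kron(out, op if k == site else np.eye(2))
    return out

def hamiltonian(n, J, h):
    """Two-local Hamiltonian of equation (1).

    J maps (n, m, alpha, beta) -> coupling, h maps (n, alpha) -> local field.
    """
    H = np.zeros((2**n, 2**n), dtype=complex)
    for (a, b, al, be), val in J.items():
        H += val * embed(PAULI[al], a, n) @ embed(PAULI[be], b, n) / 4
    for (a, al), val in h.items():
        H += val * embed(PAULI[al], a, n) / 2
    return H

def evolve(H):
    """exp(-i H) for Hermitian H (the time is absorbed into H)."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * w)) @ v.conj().T

def channel(H, psi_q, psi_a):
    """Reduced state of Q after evolving the product state |psi_q>|psi_a>."""
    dq, da = len(psi_q), len(psi_a)
    psi = evolve(H) @ np.kron(psi_q, psi_a)
    m = psi.reshape(dq, da)          # amplitudes indexed by (Q, A)
    return m @ m.conj().T            # partial trace over A

# Usage: one register qubit (0) and one ancilla (1), arbitrary couplings
J = {(0, 1, 0, 0): 1.3, (0, 1, 2, 2): -0.7}
h = {(0, 2): 0.5, (1, 0): 0.9}
H = hamiltonian(2, J, h)
up = np.array([1, 0], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho = channel(H, plus, up)
```

When the couplings between Q and A vanish, the output state stays pure, which gives a quick sanity check of the partial trace.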
Motivated by the similarity with classical supervised learning, where the weights λ are tuned to maximise the ability of the network to reproduce a known output given the corresponding input, we create a training set $\mathcal{T}$ with a random set of initial input states. For each input $|\psi\rangle\in\mathcal{T}$ the expected known output is $U_{Q}|\psi\rangle$, whereas the output of the network evolution is $\mathcal{E}_{\lambda}[|\psi\rangle\langle\psi|]$. The ‘learning’ procedure involves the minimisation of the difference between the output of the network and the expected output, and corresponds to the maximisation of the fidelity $$\begin{array}{}\text{(2)}& \mathcal{F}=\sum_{|\psi\rangle\in\mathcal{T}}\frac{\mathcal{F}_{\psi}}{\left|\mathcal{T}\right|},\qquad \mathcal{F}_{\psi}=\langle\psi|U_{Q}^{\dagger}\,\mathcal{E}_{\lambda}\left[|\psi\rangle\langle\psi|\right]U_{Q}|\psi\rangle.\end{array}$$
If the average is performed over all possible states, then equation (2) can be substituted by the average gate fidelity $\overline{\mathcal{F}}=\int\mathcal{F}_{\psi}\,d\psi$, where the formal integration can be explicitly evaluated,^{10,34,35} yielding $$\begin{array}{}\text{(3)}& \overline{\mathcal{F}}=\frac{1}{D+1}+\frac{1}{D\left(D+1\right)}\sum _{ijkl}{U}_{ik}^{*}{\mathcal{E}}_{\lambda}^{ij,kl}{U}_{jl},\end{array}$$ where $\mathcal{E}_{\lambda}^{ij,kl}=\langle q_{i}|\mathcal{E}_{\lambda}\left[|q_{k}\rangle\langle q_{l}|\right]|q_{j}\rangle$, $U_{ij}=\langle q_{i}|U_{Q}|q_{j}\rangle$ and $\{|q_{j}\rangle\}$ form the computational basis of the D-dimensional Hilbert space of qubits Q. The typical value of the fidelity for a random non-optimal evolution of the qubit network is $\overline{\mathcal{F}}=D^{-1}$, obtained using Haar integration techniques.^{36} This value is independent of the details of the ancillae, as it depends only on the dimension of the target Hilbert space, and provides an estimate for the initial fidelity of an untrained network.
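As a consistency check of equation (3), consider the special case, assumed here only for illustration, in which the ancillae decouple and the channel on Q is unitary, $\mathcal{E}_{\lambda}[\rho]=V\rho V^{\dagger}$. Then $\mathcal{E}_{\lambda}^{ij,kl}=V_{ik}V_{jl}^{*}$ and the sum factorises:

$$\sum_{ijkl}U_{ik}^{*}V_{ik}V_{jl}^{*}U_{jl}=\Big(\sum_{ik}U_{ik}^{*}V_{ik}\Big)\Big(\sum_{jl}U_{jl}V_{jl}^{*}\Big)=\left|\mathrm{Tr}\left(U_{Q}^{\dagger}V\right)\right|^{2},$$

so that

$$\overline{\mathcal{F}}=\frac{1}{D+1}+\frac{\left|\mathrm{Tr}\left(U_{Q}^{\dagger}V\right)\right|^{2}}{D(D+1)}=\frac{D+\left|\mathrm{Tr}\left(U_{Q}^{\dagger}V\right)\right|^{2}}{D(D+1)}.$$

This gives $\overline{\mathcal{F}}=1$ when V equals U_{Q} up to a global phase. In the same unitary case, a Haar-random V has $\left|\mathrm{Tr}(U_{Q}^{\dagger}V)\right|^{2}$ equal to 1 on average, so the mean value is $(D+1)/[D(D+1)]=D^{-1}$, in agreement with the typical fidelity quoted above.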
The gate-learning procedure corresponds to a global maximisation of the fidelity (3). However, because of the many parameters in the Hamiltonian (1), $\overline{\mathcal{F}}$ can have many local maxima, making the global optimisation extremely complicated. As most global optimisation algorithms introduce stochastic strategies, rather than introducing unphysical random jumps, we take advantage of the explicit stochastic nature of the problem ($\overline{\mathcal{F}}$ is a uniform average over random states) and we propose the following learning algorithm to design the interactions of the quantum network.
1: Choose an initial parameter set λ (e.g., at random), and choose an initial learning rate ε;
2: Repeat
3: Generate a random $|\psi\rangle$;
4: Update L times the coupling strengths as $$\begin{array}{}\text{(4)}& \lambda\to\lambda+\epsilon\,\nabla_{\lambda}\langle\psi|U_{Q}^{\dagger}\,\mathcal{E}_{\lambda}\left[|\psi\rangle\langle\psi|\right]U_{Q}|\psi\rangle;\end{array}$$
5: Decrease ε (see Materials and methods);
6: Until convergence (or maximum number of operations).
Specifically, we combine the above algorithm with the maximisation of the average fidelity (see below) and we observe a drastic speed up of the optimisation process. The parameter L tunes the number of deterministic steps in the learning procedure, and can be set to the minimum value L=1, so that the state is changed after each iteration, or to a higher value. In our simulations, we use L=1, for simplicity. Our algorithm is an application of the stochastic gradient descent (SGD) method^{37} to the maximisation of the function (2). In classical feedforward networks, SGD is the de facto standard algorithm for network training^{24,37} and is specifically used for large training data sets, when the evaluation of the cost function and its gradient are computationally intensive. On the other hand, the average in equation (2) can be evaluated explicitly over a uniform distribution of an infinite number of initial states, giving equation (3). Although $\mathcal{F}_{\psi}$ is easier than $\overline{\mathcal{F}}$ to compute, the major advantage of SGD for quantum network design comes from its ability to escape local maxima. The crucial observation to show the latter point is that the statistical variance over random states $\mathrm{Var}\,\mathcal{F}=\overline{\mathcal{F}^{2}}-\overline{\mathcal{F}}^{2}$ vanishes when $\overline{\mathcal{F}}=1$ (see e.g., refs 34 and 35). Indeed, intuitively, as both $\overline{\mathcal{F}}$ and $\mathcal{F}_{\psi}$ are bounded in [0, 1], $\overline{\mathcal{F}}$ can achieve its maximum only if $\mathcal{F}_{\psi}=1$ for all the states, apart from a set of measure zero. On the other hand, if $0<\overline{\mathcal{F}}<1$, then $\mathrm{Var}\,\mathcal{F}>0$ and the fluctuations can be so high that a local maximum of $\overline{\mathcal{F}}$ may not correspond to a maximum of $\mathcal{F}_{\psi}$ for some state ψ.
This is indeed shown in Figure 2 with a real example for the implementation of the Toffoli gate (see the application section below). In Figure 2, the average fidelity $\overline{\mathcal{F}}$ has three local maxima at $\lambda_{k}^{\mathrm{loc.}}$ (k=1, 2, 3) and a single global maximum at λ^{gl.}, namely the optimal parameters, whereas the fidelities $\mathcal{F}_{\psi}$ for different random states ψ have a more complicated behaviour. In view of the argument discussed above, all the state fidelities $\mathcal{F}_{\psi}$ have a global maximum at λ^{gl.}, whereas, remarkably, at least one fidelity $\mathcal{F}_{\psi}$ has no local maximum at $\lambda_{k}^{\mathrm{loc.}}$. Our stochastic learning algorithm uses a gradient technique for locally maximising the function $\mathcal{F}_{\psi}(\lambda)$. Therefore, if we are around the slopes of a local maximum of $\mathcal{F}_{\psi}(\lambda)$ (say $\lambda_{k}^{\mathrm{loc.}}$ from the previous example) and the state $|\psi\rangle$ is randomly changed to $|\phi\rangle$, that local maximum may disappear from $\mathcal{F}_{\phi}(\lambda)$, allowing the algorithm to escape from this non-optimal region when the parameters are updated via equation (4). On the other hand, when the algorithm is probing the neighbourhood of a true optimal point for which $\overline{\mathcal{F}}(\lambda)=1$ (e.g., λ^{gl.} in the previous example), then the maximum of $\mathcal{F}_{\psi}(\lambda)$ does not disappear when the state ψ is changed, allowing the ‘climbing’ procedure to continue.
The above stochastic algorithm may be combined with a deterministic maximisation of equation (3). In our simulations, we use stochastic learning for the initial global span of the parameter manifold, and if it reaches a suitably high fidelity (e.g., $\overline{\mathcal{F}}>95\%$) then it is reasonable to suppose that the algorithm has found a global maximum. Starting from this point, we perform a local maximisation of equation (3), and if $\overline{\mathcal{F}}$≃1 is reached the learning has been successful. Otherwise, we repeat the procedure.
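The learning loop (steps 1 to 6, with L=1 and the update of equation (4)) can be sketched on a deliberately small toy problem. The example below is an assumption-laden illustration rather than the simulation used in the paper: the target is a single-qubit X gate, there are no ancillae, the Hamiltonian contains only the three local fields, the gradient is approximated by central finite differences, the learning rate decreases as the inverse square root of the step counter, and all names (`train`, `state_fidelity` and so on) are hypothetical. Plain NumPy is used instead of QuTiP.

```python
import numpy as np

rng = np.random.default_rng(0)

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
TARGET = X  # toy target gate U_Q (assumption: single qubit, no ancillae)

def evolution(lam):
    """exp(-i H) with H = (lam . sigma) / 2, via eigendecomposition."""
    H = (lam[0] * X + lam[1] * Y + lam[2] * Z) / 2
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * w)) @ v.conj().T

def state_fidelity(lam, psi):
    """F_psi = |<psi| U^dagger exp(-iH) |psi>|^2 for a pure input state."""
    return abs(np.vdot(TARGET @ psi, evolution(lam) @ psi)) ** 2

def average_fidelity(lam):
    """Average gate fidelity of a unitary evolution, here with D = 2."""
    D = 2
    tr = np.trace(TARGET.conj().T @ evolution(lam))
    return (D + abs(tr) ** 2) / (D * (D + 1))

def random_state():
    """Random normalised single-qubit state (Gaussian amplitudes)."""
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    return psi / np.linalg.norm(psi)

def gradient(lam, psi, delta=1e-6):
    """Central finite-difference gradient of F_psi with respect to lam."""
    g = np.zeros_like(lam)
    for k in range(len(lam)):
        step = np.zeros_like(lam)
        step[k] = delta
        g[k] = (state_fidelity(lam + step, psi)
                - state_fidelity(lam - step, psi)) / (2 * delta)
    return g

def train(lam0, eps0=0.5, steps=2000):
    lam = np.array(lam0, dtype=float)          # step 1: initial parameters
    for m in range(1, steps + 1):              # step 2: repeat
        psi = random_state()                   # step 3: new random state
        eps = eps0 / np.sqrt(m)                # step 5: decreasing rate
        lam = lam + eps * gradient(lam, psi)   # step 4: update, L = 1
    return lam                                 # step 6: fixed step budget

lam_start = [0.5, 0.3, -0.2]
lam_final = train(lam_start)
print(average_fidelity(np.array(lam_start)), average_fidelity(lam_final))
```

In this toy landscape every maximum of the average fidelity is a global one, so the sketch only illustrates the mechanics of the update rule, not the escape from local maxima discussed above.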
It is worth emphasising that, given a target gate U, it is an open question to understand a priori whether a solution may exist for a graph with a certain set of interactions (e.g., Heisenberg, Ising and so on). The situation is different in quantum control where, given a time-dependent Hamiltonian $\mathcal{H}(t)=\mathcal{H}_{0}+\beta(t)\mathcal{V}$, one can check in advance whether $U=T\left[\exp\left(-i\int_{0}^{1}\mathcal{H}(t)\,dt\right)\right]$ for some control profile β(t): such a profile can exist only if U is contained in the group associated with the algebra generated by the repeated commutators of $\mathcal{H}_{0}$ and $\mathcal{V}$. Although no complete algebraic characterisation is known for our case (see, however, the Materials and methods for a necessary condition) and we have to study each problem numerically, in the next sections we find some structures that enable the implementation of important quantum gates. All numerical simulations have been obtained on a laptop computer using QuTiP.^{38}
Application: Toffoli gate
The Toffoli gate is a key component for many important quantum algorithms, notably the Shor algorithm,^{39} quantum error correction,^{20} fault-tolerant computation^{40} and quantum arithmetic operations,^{18} and, together with the Hadamard gate, is universal for quantum computation.^{41} Experimental implementations of this gate have been obtained with trapped ions,^{42} superconducting circuits^{19,43} or photonic architectures.^{44} The Toffoli gate is a controlled-controlled-NOT (CCNOT) operation acting on three qubits. It can be implemented in a circuit using five two-qubit gates,^{1} or it can be obtained in coupled systems via quantum control techniques.^{45,46} Efficient schemes require higher dimensional systems (i.e., qudits).^{44} On the other hand, the direct implementation using natural interactions is complicated, as the Hamiltonian $\mathcal{H}_{\mathrm{CCNOT}}$ corresponding to the gate, i.e., $\mathrm{CCNOT}=e^{-i\mathcal{H}_{\mathrm{CCNOT}}}$, has three-body interactions, which are unlikely to appear in nature.
By applying our quantum hardware design procedure, we show that the Toffoli gate can be implemented in a four-qubit network using only pairwise interactions and constant control fields. Our findings enable the construction of a device that implements the Toffoli gate with a fidelity $\overline{\mathcal{F}}=99.98\%$ by simply ‘waiting’ for the natural dynamics to occur, without the need for external control pulses. We consider a four-qubit network as displayed in Figure 3, in which the control qubits are labelled by the indices 1, 2, the target is qubit 3 and the ancilla is qubit 4. We start our analysis by considering a fully connected graph in which each qubit interacts with the others using XX- and ZZ-type pairwise interactions, as this kind of interaction can be obtained in superconducting circuits.^{15} Because of the symmetries of the Toffoli gate (see Materials and methods), we consider the two control qubits to be equally coupled to the target and the ancilla: $J_{1m}^{\alpha\beta}=J_{2m}^{\alpha\beta}$ for m=3, 4, and similarly we set $h_{1}^{\alpha}=h_{2}^{\alpha}$. Moreover, as the Toffoli gate is real, we only consider local fields in the X and Z directions and set $|\psi_{A}\rangle=\cos\eta\,|\uparrow\rangle+e^{i\xi}\sin\eta\,|\downarrow\rangle$.
By combining SGD with the maximisation of equation (3), we find the following optimal parameters, $$\begin{array}{}\text{(5)}& \begin{array}{lll}J_{12}^{zz}=8.940, & J_{13}^{zz}=4.957, & J_{14}^{zz}=5.657,\\ h_{1}^{z}=2.428, & h_{3}^{z}=J_{13}^{zz}, & h_{4}^{z}=0.165,\\ h_{3}^{x}=19.08, & h_{4}^{x}=4.267, & J_{34}^{xx}=15.06,\\ \eta=0.8182, & \xi=0.0587, & \end{array}\end{array}$$ in which the other XX- and ZZ-type interactions not displayed in equation (5) are found to be zero by the learning algorithm, so the optimal configuration is the one summarised in Figure 3, where the XX coupling is only between qubits 3 and 4. In more physical terms, if the maximal allowed coupling is fixed to J/2π≈40 MHz, then we find a gate time of 60 ns and $$\begin{array}{}\text{(6)}& \begin{array}{lll}J_{12}^{zz}=149.2\ \mathrm{MHz}, & J_{13}^{zz}=82.71\ \mathrm{MHz}, & J_{14}^{zz}=94.39\ \mathrm{MHz},\\ h_{1}^{z}=40.52\ \mathrm{MHz}, & h_{3}^{z}=J_{13}^{zz}, & h_{4}^{z}=2.751\ \mathrm{MHz},\\ h_{3}^{x}=318.4\ \mathrm{MHz}, & h_{4}^{x}=71.2\ \mathrm{MHz}, & J_{34}^{xx}=251.3\ \mathrm{MHz}.\end{array}\end{array}$$ With the optimal parameters of equations (5) and (6), we obtain an average gate fidelity of 99.98%, above the threshold for topological fault tolerance for single- and two-qubit gates, whereas, if the extra phase is avoided by fixing ξ=0, we still obtain $\overline{\mathcal{F}}=99.92\%$.
Moreover, our gate fidelity is above the Toffoli gate accuracy threshold (755/756≃99.87%) for fault-tolerant computation in the limit in which Clifford gate errors are negligible.^{47}
The optimal parameters (5) and (6) are stable against an imperfect tuning of the interactions. Indeed, we considered a perturbation $\lambda_{k}\to\lambda_{k}+\epsilon r_{k}$, r_{k}∈[0, 1] being a random number and ε being the strength of the static perturbation, and found that $\overline{\mathcal{F}}>99.9\%$ if $\epsilon<0.04$ ($\epsilon<0.7\ \mathrm{MHz}$) and $\overline{\mathcal{F}}>99\%$ if $\epsilon<0.18$ ($\epsilon<3\ \mathrm{MHz}$).
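The relation between the dimensionless parameters of equation (5) and the values in MHz of equation (6) can be reproduced with a few lines of arithmetic. The sketch below rests on an inference, not on a statement in the text: it assumes that the largest two-body coupling, $J_{34}^{xx}$, is the one pinned to 2π × 40 MHz, that the gate time follows from it, and that every other parameter is divided by that gate time. The dictionary keys are hypothetical labels, and the values are entered as printed in equation (5).

```python
import math

# Dimensionless parameters, as printed in equation (5)
dimensionless = {
    "J12_zz": 8.940, "J13_zz": 4.957, "J14_zz": 5.657,
    "h1_z": 2.428, "h4_z": 0.165,
    "h3_x": 19.08, "h4_x": 4.267, "J34_xx": 15.06,
}

# Assumption: the largest two-body coupling equals 2*pi*40 MHz
j_max = 2 * math.pi * 40.0                  # in units of 10^6 rad/s

# With t = 1 absorbed into H, a dimensionless J means J_physical * t
gate_time_us = dimensionless["J34_xx"] / j_max
gate_time_ns = 1e3 * gate_time_us

# Physical values in the units used by equation (6)
physical_mhz = {k: v / gate_time_us for k, v in dimensionless.items()}

print(f"gate time = {gate_time_ns:.2f} ns")
for name, value in physical_mhz.items():
    print(f"{name} = {value:.2f} MHz")

# The same factor converts the perturbation strengths quoted above
print(0.04 / gate_time_us, 0.18 / gate_time_us)
```

Under this assumption the gate time comes out close to the quoted 60 ns, and the converted couplings agree with equation (6) to within the rounding of the printed digits.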
Application: Fredkin gate
The Fredkin gate is a controlled-swap (CSWAP) operation acting on three qubits, which is universal for reversible computation.^{1} We found that this gate can be obtained with perfect fidelity (up to the numerical precision) in a four-qubit network with Hamiltonian (1) where $J_{12}^{xx}=J_{13}^{xx}=13.60$ (227.0 MHz), $J_{23}^{\alpha\alpha}=-4.712$ (−78.62 MHz), $J_{24}^{xx}=J_{34}^{xx}=8.400$ (140.2 MHz), $J_{12}^{zz}=J_{13}^{zz}=11.15$ (186.1 MHz), $h_{4}^{x}=1.025$ (17.11 MHz), $h_{1}^{z}=\pi$ (54.42 MHz). The values in MHz correspond to a gate time of 60 ns. Moreover, $e^{-i\mathcal{H}}=\mathrm{CSWAP}_{123}\otimes U_{4}$, so the gate is independent of the initial state of the ancilla. As for the Toffoli gate, this optimal configuration has been obtained by starting the training procedure with a fully connected graph with all the interactions, and thus the fact that some interactions are zero is a result of the optimisation process.
Application: remote logic
We study a qubit network that implements a maximally entangling gate between two sites that are not directly coupled. Remote logic has been studied extensively in spin chains for achieving entangling operations between the boundary sites,^{4,10,48,49} and it is a building block for a proposed architecture for solid-state quantum computation at room temperature.^{50} For simplicity, we consider an SU(2)-invariant four-qubit network, interacting with a Heisenberg Hamiltonian $H=\sum_{i\ne j=1}^{4}\sum_{\alpha=x,y,z}J_{ij}\sigma_{i}^{\alpha}\sigma_{j}^{\alpha}/4$ where there is no direct coupling between qubits 1 and 4 (J_{14}=0). Applying our learning algorithm, we found that the $\sqrt{\mathrm{SWAP}}$ gate, which is universal for quantum computation when paired with single-qubit operations,^{1} can be achieved between qubits 1 and 4 with unit fidelity with different choices of J_{12}=J_{24}, J_{13}=J_{34} and J_{23} when the initial state of the ancillae is $\left(|\uparrow\downarrow\rangle-|\downarrow\uparrow\rangle\right)/\sqrt{2}$. Given this simplification, one can then find a solution analytically: $J_{12}=\alpha+\pi\sqrt{(2n)^{2}-1}/\sqrt{8}$, $J_{13}=\alpha-\pi\sqrt{(2n)^{2}-1}/\sqrt{8}$, $J_{23}=\alpha+(-1)^{n}\pi$, where n is an integer; this choice gives perfect fidelity irrespective of α. Our strategy has not found any three-qubit configurations that implement a remote $\sqrt{\mathrm{SWAP}}$ gate, and thus the four-qubit network is the minimal non-trivial example. Remarkably, some of our four-qubit configurations are more stable to noise than the direct implementation of the gate in a two-qubit system (namely when J_{14}=π/2 and the other couplings are zero).
For instance, if $J_{ij}=J_{ij}^{\mathrm{optimal}}+\epsilon$, $J_{ij}^{\mathrm{optimal}}$ being the optimal value for implementing the gate, we found that when ε is randomly distributed in [0, 1/2], then the four-qubit system with n=1 still has, on average, $\overline{\mathcal{F}}$≃99.1%, whereas the direct two-qubit case has $\overline{\mathcal{F}}$≃98.8%.
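The direct two-qubit reference case (J_{14}=π/2, all other couplings zero) can be checked by hand. The derivation below assumes that each pair of qubits is counted once in the Heisenberg sum and uses the evolution operator $e^{-iH}$ with unit time; both are conventions adopted here for illustration. With S denoting the SWAP operator on qubits 1 and 4, the identity $\sum_{\alpha}\sigma_{1}^{\alpha}\sigma_{4}^{\alpha}=2S-\mathbb{1}$ gives

$$H=\frac{J_{14}}{4}\left(2S-\mathbb{1}\right),\qquad e^{-iH}=e^{iJ_{14}/4}\left[\cos\frac{J_{14}}{2}\,\mathbb{1}-i\sin\frac{J_{14}}{2}\,S\right],$$

where $S^{2}=\mathbb{1}$ has been used. For $J_{14}=\pi/2$ this becomes

$$e^{-iH}=e^{i\pi/8}\,\frac{\mathbb{1}-iS}{\sqrt{2}}=e^{-i\pi/8}\,\sqrt{\mathrm{SWAP}},\qquad \sqrt{\mathrm{SWAP}}=\frac{1+i}{2}\,\mathbb{1}+\frac{1-i}{2}\,S=e^{i\pi/4}\,\frac{\mathbb{1}-iS}{\sqrt{2}},$$

so the direct coupling reproduces $\sqrt{\mathrm{SWAP}}$ up to an irrelevant global phase, and squaring the right-hand expression returns S as expected.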
Towards a scalable architecture for quantum computation
Current architectures for quantum computation, e.g., with superconducting qubits^{51} or ion traps,^{52} are based on arrays of interacting qubits that are continuously controlled via external pulses to implement the desired operation. This approach may suffer from scalability issues because, even assuming the ability to maintain quantum coherence for a long time, extremely large (classical) control units will be necessary to generate the sophisticated pulse sequences required to implement a full-scale quantum algorithm. On the other hand, the approach that we have in mind shares more similarities with integrated circuits in present-day electronics, where a set of special-purpose logic units (modules) are wired together to achieve computation (or other tasks). In our vision, different modules can be fabricated with qubit networks designed to perform a specific logic task, namely a quantum gate, automatically without the need of external control. As in Figure 1, the different logic and memory units can be reciprocally connected using a quantum bus, whose purpose is to transfer the qubit states between the quantum registers/memory and the input/output qubits of the modules. In Figure 1, for simplicity the input/output ports of the modules are placed on the same physical qubits, although this can be easily extended to more general cases.
The quantum bus can be realised with different technologies, e.g., with microwave resonators,^{53} or it can also be implemented via quantum-state transfer in a qubit network.^{54} The modules shown in Figure 1 can be designed to produce either simple basic operations, such as the CNOT or the Toffoli gate, or, in principle, they can directly implement larger components of a quantum algorithm such as the quantum Fourier transform or error-correcting codes.^{1} In this respect, to treat systems with many parameters, one can easily combine our optimisation strategy, based on fidelity statistics, with metaheuristic strategies,^{55} which simultaneously deal with many candidate solutions and are known to be fast in global optimisation with high-dimensional parameter spaces. Moreover, highly optimised deep learning algorithms are already used to train neural networks with 60 million parameters.^{56} However, given the difficulty in numerically simulating large quantum systems, this approach may be reasonable for networks up to, say, 20–30 qubits.
Discussion
Inspired by classical supervised learning, we have proposed an optimisation scheme to encode a quantum operation into the unmodulated dynamics of a qubit register, which is part of a bigger network of pairwise interacting qubits. Our strategy is based on the static engineering of the pairwise couplings, and it enables the creation of a quantum device that implements the desired operation by simply waiting for the natural dynamics to occur, without the need of external control pulses. Our findings show that machine learning-inspired techniques can be combined with quantum mechanics not only for data classification speed-up^{25,26} or quantum black-box certification,^{57,58} but also for quantum hardware design.
This paper opens up the topic of encoding quantum gates and operations into the unmodulated dynamics of qubit networks. Although we have focused on small systems, larger networks can be considered using more efficient training schemes. These would enable the simulation of larger components of a quantum algorithm, as different multi-qubit gates can be combined into a unique quantum operation, which can be simulated in a large quantum network. Moreover, when combined with a quantum bus as in Figure 1, our strategy can provide an alternative approach to universal quantum computation, which avoids the decomposition of the algorithm into one- and two-qubit gates. Note that most quantum algorithms take classical inputs, and thus the extra control required for initialisation demands the further ability to fully polarise the spins globally. The latter step is, however, typically much easier than the implementation of entangling gates, which have been considered in this paper. Moreover, in view of the recent experimental measurements of the average gate fidelity,^{59} it is tempting to predict an all-quantum version of our learning procedure, where $\overline{\mathcal{F}}$ is not classically simulated, but rather directly measured. This would require a further highly controlled system to infer the optimal parameters of an uncontrolled quantum network, which can be used to industrialise the production of unmodulated quantum devices implementing the desired algorithm.
Our results demonstrate the efficacy of the proposed scheme in designing four-qubit networks that implement the Toffoli and Fredkin gates or remote logic operations. The proposed Toffoli gate is fast, has a fidelity high enough for fault-tolerant computation and only uses static XX- and ZZ-type interactions, which can be achieved in superconducting systems.^{15} The key advantage of our method is in exploiting all the permanent interactions in the qubit network without trying to suppress some of them sequentially to implement pairwise gates. Moreover, being based on non-equilibrium dynamics, our gate is fast: if J/2π≈40 MHz, then the total operation time is ~60 ns, which matches the current gate times for single- and two-qubit operations.^{51}
Materials and methods
Learning rate
The choice of the learning rate ε is crucial. If the initial learning rate is too small, the algorithm may fail to escape from the various local maxima, whereas if it is too large the algorithm will keep jumping randomly without ever settling near a local maximum. To maximise the speed and precision of SGD, the learning rate ε has to decrease as a function of the steps, a common choice being $\epsilon \propto {m}^{-1/2}$, where m is the step counter.^{37} However, when the gradient in equation (4) cannot be evaluated analytically, one can use more sophisticated techniques^{60} in which both the learning rate and the finite difference approximation of the gradient change as a function of m.
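As an illustration of the decaying schedule described above, the following minimal Python sketch runs stochastic gradient ascent with $\epsilon_m = \epsilon_0/\sqrt{m}$. The objective is a hypothetical one-dimensional toy function with a noisy gradient, used here only as a stand-in for the average gate fidelity; the function names, the initial rate `eps0` and the noise level are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def learning_rate(m, eps0=0.1):
    """Decaying learning rate eps_m = eps0 / sqrt(m), for step counter m >= 1."""
    return eps0 / np.sqrt(m)

def noisy_gradient(x, rng, noise=0.5):
    """Noisy gradient of the toy objective f(x) = -(x - 2)^2 (hypothetical stand-in)."""
    return -2.0 * (x - 2.0) + noise * rng.normal()

def stochastic_ascent(x0=0.0, steps=2000, eps0=0.1, seed=0):
    """Stochastic gradient ascent with the m^(-1/2) learning-rate schedule."""
    rng = np.random.default_rng(seed)
    x = x0
    for m in range(1, steps + 1):
        x += learning_rate(m, eps0) * noisy_gradient(x, rng)
    return x

# The iterate should end up close to the maximiser x = 2 of the toy objective.
x_final = stochastic_ascent()
print(x_final)
```

Early steps are large enough to move quickly towards the maximum, while later steps shrink so that the gradient noise is progressively averaged out.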
Symmetries
In the design of the quantum network and its couplings, the number of parameters can be drastically reduced if the target unitary operation U_{Q} has some symmetries, namely if there exists some unitary matrix S_{Q} such that [U_{Q}, S_{Q}]=0. This condition requires the quantum channel ${\mathcal{E}}_{\lambda}\left[\rho \right]={\text{Tr}}_{A}[{e}^{-i\mathcal{H}}\,\rho \otimes {\rho}_{A}\,{e}^{i\mathcal{H}}]$ to satisfy ${\mathcal{E}}_{\lambda}({S}_{Q}\rho {S}_{Q}^{\dagger})={S}_{Q}{\mathcal{E}}_{\lambda}\left(\rho \right){S}_{Q}^{\dagger}$ for each state ρ, which holds, e.g., if $\left[\mathcal{H},{S}_{Q}\otimes {\mathbb{1}}_{A}\right]=0$. Conversely, if the interaction type is fixed by nature (for instance, only Ising or Heisenberg interactions are allowed), then one has to check whether the Lie algebra spanned by the operators in $\mathcal{H}$ contains the generators of U_{Q}.
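The covariance condition above can be checked numerically on a small example. The sketch below assumes a hypothetical two-qubit network (one register qubit Q and one ancilla A) with the symmetry S_{Q} = σ^{z}; the Hamiltonians, coupling values and test state are illustrative choices, not the networks studied in the paper. It verifies that a Hamiltonian commuting with $S_Q \otimes \mathbb{1}_A$ yields a covariant channel, and that a local σ^{x} field on Q breaks the covariance for the chosen state.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def evolution(H):
    """exp(-iH) for a Hermitian matrix H, via its eigendecomposition."""
    w, v = np.linalg.eigh(H)
    return (v * np.exp(-1j * w)) @ v.conj().T

def channel(H, rho, rho_A):
    """Reduced dynamics Tr_A[exp(-iH) kron(rho, rho_A) exp(iH)]."""
    U = evolution(H)
    joint = U @ np.kron(rho, rho_A) @ U.conj().T
    dQ, dA = rho.shape[0], rho_A.shape[0]
    # Indices of the reshaped array are (q, a, q', a'); trace over a = a'.
    return np.einsum("iaja->ij", joint.reshape(dQ, dA, dQ, dA))

def is_covariant(H, S, rho, rho_A):
    """Check E(S rho S^dag) == S E(rho) S^dag for one input state rho."""
    lhs = channel(H, S @ rho @ S.conj().T, rho_A)
    rhs = S @ channel(H, rho, rho_A) @ S.conj().T
    return np.allclose(lhs, rhs)

rho_A = np.array([[1, 0], [0, 0]], dtype=complex)                 # ancilla in |0><0|
rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]], dtype=complex)

# Every term commutes with Z (x) 1, so the channel is covariant.
H_sym = 0.7 * np.kron(Z, Z) + 0.3 * np.kron(Z, X) + 0.5 * np.kron(I2, X)
# A local x field on Q does not commute with Z (x) 1.
H_bad = np.kron(X, I2)

print(is_covariant(H_sym, Z, rho, rho_A))  # True
print(is_covariant(H_bad, Z, rho, rho_A))  # False
```

Note that the commutation condition is sufficient but not necessary, which is why a single-state check like this can only confirm or refute covariance for the state tested.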
Bottom-up construction: Lie algebraic characterisation
All the numerical results presented in the main text are obtained using a top-down approach: after selecting the interaction types (e.g., XX, ZZ, Heisenberg and so on), the algorithm starts with a zero-bias fully connected configuration in which all the qubit pairs of the network interact via all the selected interaction types, each weighted with a different parameter, and are subject to different local fields. As a result of the training procedure, we found numerically that most of these parameters are indeed zero. However, for larger networks it is preferable to use a bottom-up approach, in which one starts with a minimal set of parameters and then adds further parameters until a solution is found.
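The fully connected starting configuration of the top-down approach can be sketched as follows. This is a minimal illustration assuming XX and ZZ couplings on every pair plus local x and z fields on every qubit; the term labels, the zero-based qubit indices, the parameter values and the pruning threshold are hypothetical and chosen only for the example.

```python
import itertools
import numpy as np

PAULI = {
    "x": np.array([[0, 1], [1, 0]], dtype=complex),
    "z": np.array([[1, 0], [0, -1]], dtype=complex),
}

def embed(local_ops, n):
    """Tensor product over n qubits, with the identity on sites not in local_ops."""
    out = np.eye(1, dtype=complex)
    for site in range(n):
        out = np.kron(out, local_ops.get(site, np.eye(2, dtype=complex)))
    return out

def fully_connected_terms(n, couplings=("x", "z"), fields=("x", "z")):
    """One operator per pair and coupling type, plus one per qubit and field type."""
    terms = []
    for i, j in itertools.combinations(range(n), 2):
        for a in couplings:
            terms.append((f"{a}{a}({i},{j})", embed({i: PAULI[a], j: PAULI[a]}, n)))
    for i in range(n):
        for a in fields:
            terms.append((f"{a}({i})", embed({i: PAULI[a]}, n)))
    return terms

def hamiltonian(params, terms):
    """H = sum_j lambda_j O_j, with one independent parameter per term."""
    return sum(lam * op for lam, (_, op) in zip(params, terms))

def surviving_terms(params, terms, tol=1e-6):
    """Labels of the terms whose trained parameter is not (numerically) zero."""
    return [label for lam, (label, _) in zip(params, terms) if abs(lam) > tol]

terms = fully_connected_terms(4)
params = np.zeros(len(terms))
params[0], params[-1] = 1.0, 0.5        # hypothetical values after training
H = hamiltonian(params, terms)
print(len(terms), H.shape, surviving_terms(params, terms))
```

For four qubits this gives 6 pairs × 2 coupling types + 4 qubits × 2 field types = 20 parameters, which shows how quickly the parameter count grows in the top-down approach.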
To construct a minimal set of parameters, one can use a Lie algebraic characterisation inspired by quantum control. We write the Hamiltonian as $\mathcal{H}={\sum}_{j}{\lambda}_{j}{\mathcal{O}}_{j}$, where the λ_{j} are independent parameters and the ${\mathcal{O}}_{j}$ are operators. If the parameters are time dependent, then there exist suitable pulses λ_{j}(t) such that the dynamics implements the target gate G only if log(G) is contained in the algebra generated by the repeated commutators $[{\mathcal{O}}_{j},[{\mathcal{O}}_{k},\dots ]]$. As our scheme is based on the particular choice where λ_{j}(t) is constant, the above characterisation still provides a necessary condition. As an example, we consider the Toffoli gate and the solution in equation (5), where ${\mathcal{O}}_{1}={\sigma}_{1}^{z}{\sigma}_{2}^{z}$, ${\mathcal{O}}_{2}={\sigma}_{1}^{z}{\sigma}_{3}^{z}+{\sigma}_{2}^{z}{\sigma}_{3}^{z}+2{\sigma}_{3}^{z}$, ${\mathcal{O}}_{3}={\sigma}_{1}^{z}{\sigma}_{4}^{z}+{\sigma}_{2}^{z}{\sigma}_{4}^{z}$, ${\mathcal{O}}_{4}={\sigma}_{3}^{x}{\sigma}_{4}^{x}$, ${\mathcal{O}}_{5}={\sigma}_{1}^{z}+{\sigma}_{2}^{z}$, ${\mathcal{O}}_{7}={\sigma}_{4}^{z}$, ${\mathcal{O}}_{8}={\sigma}_{3}^{x}$, ${\mathcal{O}}_{9}={\sigma}_{4}^{x}$. It is simple to check that log(G) (up to an irrelevant constant factor) is contained in the algebra generated by the operators ${\mathcal{O}}_{j}$, whereas this is not the case if the operator ${\mathcal{O}}_{8}$ is removed from the Hamiltonian. Therefore, no solution is possible if λ_{8}≡0.
Inspired by the above example, the bottom-up approach consists of the following steps: (i) based on the symmetries of the target gate and on the physically allowed interactions, one defines an initial set of operators; (ii) other operators are added to the set until the dynamical algebra contains log(G); (iii) one starts the numerical parameter training to check for convergence (different runs may be required). Until a solution is found, one can either add new operators or change the previous ones.
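The Lie-algebraic test and step (ii) above can be sketched numerically. The Python code below is a minimal illustration, not the procedure used in the paper: it builds an orthonormal basis of the algebra generated by a set of operators through repeated commutators, tests whether a target operator lies in its span, and adds candidate operators one at a time until the test passes. It is demonstrated on a single-qubit toy problem; the function names, the greedy ordering of the candidates and the tolerances are assumptions. Recall that membership is only a necessary condition, so the numerical training of step (iii) is still required.

```python
import numpy as np

def commutator(a, b):
    return a @ b - b @ a

def residual(basis, op):
    """Part of op orthogonal to an orthonormal basis (Hilbert-Schmidt inner product)."""
    for b in basis:
        op = op - np.vdot(b, op) * b
    return op

def lie_closure(generators, tol=1e-10):
    """Orthonormal basis of the complex span of all repeated commutators."""
    basis = []
    queue = [np.asarray(g, dtype=complex) for g in generators]
    while queue:
        op = residual(basis, queue.pop())
        norm = np.linalg.norm(op)
        if norm < tol:
            continue                       # already in the span
        op = op / norm
        queue.extend(commutator(op, b) for b in basis)
        basis.append(op)
    return basis

def in_algebra(generators, target, tol=1e-8):
    """True if target lies in the algebra generated by the given operators.

    The span is taken over the complex numbers; for Hermitian generators and a
    Hermitian target this gives the same answer as the real Lie algebra test.
    """
    target = np.asarray(target, dtype=complex)
    leftover = residual(lie_closure(generators), target)
    return np.linalg.norm(leftover) < tol * max(1.0, np.linalg.norm(target))

def bottom_up(initial, candidates, target):
    """Add candidate operators until the algebra contains the target (step ii)."""
    ops, pool = list(initial), list(candidates)
    while not in_algebra(ops, target):
        if not pool:
            return None                    # no candidate set passes the test
        ops.append(pool.pop(0))
    return ops

# Toy example: sigma^y is generated by {sigma^z, sigma^x} but not by sigma^z alone.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
selected = bottom_up(initial=[Z], candidates=[X], target=Y)
print(len(selected))  # 2
```

For larger networks the algebra dimension can approach 4^n, so the orthogonalisation may need tighter numerical control than this sketch provides.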
References
 1.
Nielsen, M. A. & Chuang, I. L. Quantum Computation and Quantum Information (Cambridge Univ. Press, 2000).
 2.
Barenco, A. et al. Elementary gates for quantum computation. Phys. Rev. A 52, 3457 (1995).
 3.
Aharonov, D. et al. Adiabatic quantum computation is equivalent to standard quantum computation. SIAM Rev. 50, 755–787 (2008).
 4.
Benjamin, S. C. & Bose, S. Quantum computing with an always-on Heisenberg interaction. Phys. Rev. Lett. 90, 247901 (2003).
 5.
Benjamin, S. C. & Bose, S. Quantum computing in arrays coupled by “always-on” interactions. Phys. Rev. A 70, 032314 (2004).
 6.
Satoh, T. et al. Scalable quantum computation architecture using always-on Ising interactions via quantum feedforward. Phys. Rev. A 91, 052329 (2015).
 7.
Li, Y., Browne, D. E., Kwek, L. C., Raussendorf, R. & Wei, T.-C. Thermal states as universal resources for quantum computation with always-on interactions. Phys. Rev. Lett. 107, 060501 (2011).
 8.
Burgarth, D. et al. Scalable quantum computation via local control of only two qubits. Phys. Rev. A 81, 040303 (2010).
 9.
Müller, M. et al. Optimizing entangling quantum gates for physical systems. Phys. Rev. A 84, 042315 (2011).
 10.
Banchi, L., Bayat, A., Verrucchi, P. & Bose, S. Nonperturbative entangling gates between distant qubits using uniform cold atom chains. Phys. Rev. Lett. 106, 140501 (2011).
 11.
Devoret, M. & Schoelkopf, R. Superconducting circuits for quantum information: an outlook. Science 339, 1169–1174 (2013).
 12.
Wallraff, A. et al. Strong coupling of a single photon to a superconducting qubit using circuit quantum electrodynamics. Nature 431, 162–167 (2004).
 13.
Barends, R. et al. Coherent Josephson qubit suitable for scalable quantum integrated circuits. Phys. Rev. Lett. 111, 080502 (2013).
 14.
Chen, Y. et al. Qubit architecture with high coherence and fast tunable coupling. Phys. Rev. Lett. 113, 220502 (2014).
 15.
Geller, M. R. et al. Tunable Coupler for Superconducting Xmon Qubits: Perturbative Nonlinear Model. Preprint at http://arxiv.org/abs/1405.1915 (2014).
 16.
Neeley, M. et al. Generation of three-qubit entangled states using superconducting phase qubits. Nature 467, 570–573 (2010).
 17.
Paik, H. et al. Observation of high coherence in Josephson junction qubits measured in a three-dimensional circuit QED architecture. Phys. Rev. Lett. 107, 240501 (2011).
 18.
Vedral, V., Barenco, A. & Ekert, A. Quantum networks for elementary arithmetic operations. Phys. Rev. A 54, 147 (1996).
 19.
Fedorov, A., Steffen, L., Baur, M., Da Silva, M. & Wallraff, A. Implementation of a Toffoli gate with superconducting circuits. Nature 481, 170–172 (2011).
 20.
Cory, D. G. et al. Experimental quantum error correction. Phys. Rev. Lett. 81, 2152 (1998).
 21.
Bravyi, S., DiVincenzo, D. P., Loss, D. & Terhal, B. M. Quantum simulation of many-body Hamiltonians using perturbation theory with bounded-strength interactions. Phys. Rev. Lett. 101, 070503 (2008).
 22.
Biamonte, J. D. & Love, P. J. Realizable Hamiltonians for universal adiabatic quantum computers. Phys. Rev. A 78, 012352 (2008).
 23.
Brif, C., Chakrabarti, R. & Rabitz, H. Control of quantum phenomena: past, present and future. New J. Phys. 12, 075008 (2010).
 24.
Bishop, C. M. Pattern Recognition and Machine Learning (Springer, 2006).
 25.
Wittek, P. Quantum Machine Learning: What Quantum Computing Means to Data Mining (Elsevier, 2014).
 26.
Rebentrost, P., Mohseni, M. & Lloyd, S. Quantum support vector machine for big data classification. Phys. Rev. Lett. 113, 130503 (2014).
 27.
Paparo, G. D., Dunjko, V., Makmal, A., Martin-Delgado, M. A. & Briegel, H. J. Quantum speedup for active learning agents. Phys. Rev. X 4, 031002 (2014).
 28.
Wiebe, N., Kapoor, A. & Svore, K. Quantum nearest-neighbor algorithms for machine learning. Quantum Inf. Comput. 15, 0318–0358 (2015).
 29.
Lloyd, S., Mohseni, M. & Rebentrost, P. Quantum principal component analysis. Nat. Phys. 10, 631–633 (2014).
 30.
Rotondo, P., Lagomarsino, M. C. & Viola, G. Dicke simulators with emergent collective quantum computational abilities. Phys. Rev. Lett. 114, 143601 (2015).
 31.
Nagaj, D. Universal two-body-Hamiltonian quantum computing. Phys. Rev. A 85, 032330 (2012).
 32.
Bang, J., Lim, J., Kim, M. & Lee, J. Quantum Learning Machine. Preprint at https://arxiv.org/abs/0803.2976 (2008).
 33.
Gammelmark, S. & Mølmer, K. Quantum learning by measurement and feedback. New J. Phys. 11, 033017 (2009).
 34.
Magesan, E., Blume-Kohout, R. & Emerson, J. Gate fidelity fluctuations and quantum process invariants. Phys. Rev. A 84, 012309 (2011).
 35.
Pedersen, L. H., Møller, N. M. & Mølmer, K. The distribution of quantum fidelities. Phys. Lett. A 372, 7028–7032 (2008).
 36.
Collins, B. & Śniady, P. Integration with respect to the Haar measure on unitary, orthogonal and symplectic group. Commun. Math. Phys. 264, 773–795 (2006).
 37.
Bottou, L. in Online Learning and Neural Networks (ed. Saad, D.) (Cambridge University Press, Cambridge, UK, 1998).
 38.
Johansson, J., Nation, P. & Nori, F. QuTiP 2: A Python framework for the dynamics of open quantum systems. Comput. Phys. Commun. 184, 1234–1240 (2013).
 39.
Shor, P. W. Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. Comput. 26, 1484–1509 (1997).
 40.
Dennis, E. Toward fault-tolerant quantum computation without concatenation. Phys. Rev. A 63, 052314 (2001).
 41.
Shi, Y. Both Toffoli and controlled-NOT need little help to do universal quantum computing. Quantum Inf. Comput. 3, 84–92 (2003).
 42.
Monz, T. et al. Realization of the quantum Toffoli gate with trapped ions. Phys. Rev. Lett. 102, 040501 (2009).
 43.
Reed, M. et al. Realization of three-qubit quantum error correction with superconducting circuits. Nature 482, 382–385 (2012).
 44.
Lanyon, B. P. et al. Simplifying quantum logic using higher-dimensional Hilbert spaces. Nat. Phys. 5, 134–140 (2009).
 45.
Stojanović, V. M., Fedorov, A., Wallraff, A. & Bruder, C. Quantum-control approach to realizing a Toffoli gate in circuit QED. Phys. Rev. B 85, 054504 (2012).
 46.
Zahedinejad, E., Ghosh, J. & Sanders, B. C. High-fidelity single-shot Toffoli gate via quantum control. Phys. Rev. Lett. 114, 200502 (2015).
 47.
Gaitan, F. Quantum Error Correction and Fault Tolerant Quantum Computing (CRC Press, 2008).
 48.
Yao, N. Y. et al. Quantum logic between remote quantum registers. Phys. Rev. A 87, 022306 (2013).
 49.
Banchi, L., Compagno, E. & Bose, S. Perfect wave-packet splitting and reconstruction in a one-dimensional lattice. Phys. Rev. A 91, 052323 (2015).
 50.
Yao, N. Y. et al. Scalable architecture for a room temperature solid-state quantum information processor. Nat. Commun. 3, 800 (2012).
 51.
Barends, R. et al. Superconducting quantum circuits at the surface code threshold for fault tolerance. Nature 508, 500–503 (2014).
 52.
Monz, T. et al. Realization of a scalable Shor algorithm. Science 351, 1068–1070 (2016).
 53.
Mariantoni, M. et al. Implementing the quantum von Neumann architecture with superconducting circuits. Science 334, 61–65 (2011).
 54.
Nikolopoulos, G. M. & Jex, I. Quantum State Transfer and Network Engineering (Springer, 2014).
 55.
Storn, R. & Price, K. Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. J. Global Opt. 11, 341–359 (1997).
 56.
Krizhevsky, A., Sutskever, I. & Hinton, G. E. in Advances in Neural Information Processing Systems Vol. 25 (eds Pereira, F., Burges, C. J. C., Bottou, L. & Weinberger, K. Q.) 1097–1105 (Curran Associates, Inc., 2012).
 57.
Bisio, A., Chiribella, G., D'Ariano, G. M., Facchini, S. & Perinotti, P. Optimal quantum learning of a unitary transformation. Phys. Rev. A 81, 032324 (2010).
 58.
Wiebe, N., Granade, C., Ferrie, C. & Cory, D. Hamiltonian learning and certification using quantum resources. Phys. Rev. Lett. 112, 190501 (2014).
 59.
Lu, D. et al. Experimental estimation of average fidelity of a Clifford gate on a 7-qubit quantum processor. Phys. Rev. Lett. 114, 140505 (2015).
 60.
Spall, J. C. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control vol. 65 (John Wiley & Sons, 2005).
Acknowledgements
L.B. and S.B. acknowledge the financial support by the ERC under Starting Grant 308253 PACOMANEDIA. We thank P. Wittek, A. Monras and J.I. Cirac for their valuable comments and suggestions.
Author information
Affiliations
Department of Physics and Astronomy, University College London, London, UK
 Leonardo Banchi
 & Sougato Bose
Theory Division, Max-Planck-Institut für Quantenoptik, Garching, Germany
 Nicola Pancotti
Dipartimento di Fisica, Sapienza Università di Roma, Roma, Italy
 Nicola Pancotti
Competing interests
The authors declare no conflict of interest.
Corresponding author
Correspondence to Leonardo Banchi.
Rights and permissions
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/